feat: scaffold avatar pipeline with ComfyUI driver, MLflow logging, and rclone promotion

- setup.sh: automated desktop env setup (ComfyUI, 3D-Pack, UniRig, Blender, Ray)
- ray-join.sh: join Ray cluster as external worker with 3d_gen resource label
- vrm_export.py: headless Blender GLB→VRM conversion script
- generate.py: ComfyUI API driver (submit workflow JSON, poll, download outputs)
- log_mlflow.py: REST-only MLflow experiment tracking (no SDK dependency)
- promote.py: rclone promotion of VRM files to gravenhollow S3
- CLI entry points: avatar-generate, avatar-promote
- workflows/ placeholder for ComfyUI exported workflow JSONs

Implements ADR-0063 (ComfyUI + TRELLIS + UniRig 3D avatar pipeline)
2026-02-24 05:44:04 -05:00
parent a0c24406bd
commit 202b4e1d61
11 changed files with 1138 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -1,2 +1,191 @@
 # avatar-pipeline

+ComfyUI-based image-to-VRM avatar generation pipeline using [TRELLIS](https://github.com/microsoft/TRELLIS) (image → 3D mesh) and [UniRig](https://github.com/VAST-AI-Research/UniRig) (automatic rigging), running on a desktop NVIDIA GPU as an on-demand [Ray](https://www.ray.io/) worker.
+
+See [ADR-0063](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0063-comfyui-3d-avatar-pipeline.md) for the full design decision.
+
+## Overview
+
+```
+Reference Image → TRELLIS (1.2B) → Textured GLB → UniRig → Rigged GLB → Blender CLI → VRM
+                                                                                        │
+                                                                               rclone → gravenhollow
+                                                                               MLflow → experiment tracking
+```
+
+**Hardware target:** Arch Linux desktop with Ryzen 9 7950X, 64 GB DDR5, NVIDIA RTX 4070 (12 GB VRAM).
+
+## Project Structure
+
+```
+avatar-pipeline/
+├── pyproject.toml             # uv project, CLI entry points
+├── renovate.json              # Renovate dependency updates
+├── scripts/
+│   ├── setup.sh               # Desktop environment setup (ComfyUI, 3D-Pack, UniRig, Blender, Ray)
+│   ├── ray-join.sh            # Join the Ray cluster as external worker
+│   └── vrm_export.py          # Blender headless GLB → VRM conversion
+├── workflows/
+│   └── (ComfyUI workflow JSONs — exported from UI)
+├── avatar_pipeline/
+│   ├── __init__.py
+│   ├── generate.py            # ComfyUI API driver: submit workflow → poll → download
+│   ├── log_mlflow.py          # Log params/metrics/artifacts to MLflow REST API
+│   └── promote.py             # rclone promote VRM files to gravenhollow storage
+├── LICENSE
+└── README.md
+```
+
+## Setup
+
+### Prerequisites
+
+- NVIDIA drivers + CUDA toolkit (`sudo pacman -S nvidia nvidia-utils cuda cudnn`)
+- [uv](https://astral.sh/uv) installed
+- [Blender](https://www.blender.org/) 4.x with [VRM Add-on](https://vrm-addon-for-blender.info/en/)
+- [rclone](https://rclone.org/) configured with `gravenhollow` remote
+
+### Install Everything
+
+```bash
+./scripts/setup.sh
+```
+
+This installs ComfyUI, ComfyUI-3D-Pack (includes TRELLIS nodes), UniRig, Ray, and the avatar-pipeline package.
+
+For partial installs:
+```bash
+./scripts/setup.sh --comfyui-only   # ComfyUI + 3D-Pack only
+./scripts/setup.sh --unirig-only    # UniRig only
+```
+
+### Join Ray Cluster
+
+```bash
+# Set the Ray head address (exposed via NodePort from the Talos cluster)
+./scripts/ray-join.sh 192.168.100.50:6379
+
+# Or via env var
+RAY_HEAD_ADDRESS=192.168.100.50:6379 ./scripts/ray-join.sh
+
+# Verify
+ray status
+```
+
+The desktop joins with resource labels `{"3d_gen": 1, "rtx4070": 1}` so only 3D generation workloads get scheduled here.
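+
+As a rough illustration of what joining involves, the address resolution and resource flags could be sketched in Python as follows. The helper `build_ray_start_command` is hypothetical, and `ray-join.sh` itself is a shell script whose details may differ; only `ray start --address` and `--resources` are standard Ray CLI options.
+
+```python
+import json
+import os
+
+
+def build_ray_start_command(argv, env=None):
+    """Return the `ray start` argv for joining as an external worker.
+
+    Hypothetical helper: the address comes from the first CLI argument,
+    falling back to the RAY_HEAD_ADDRESS environment variable.
+    """
+    env = os.environ if env is None else env
+    address = argv[0] if argv else env.get("RAY_HEAD_ADDRESS")
+    if not address:
+        raise SystemExit("usage: ray-join.sh HOST:PORT (or set RAY_HEAD_ADDRESS)")
+    # Tasks that request the 3d_gen resource can only be placed on
+    # nodes that advertise it.
+    resources = {"3d_gen": 1, "rtx4070": 1}
+    return [
+        "ray", "start",
+        f"--address={address}",
+        f"--resources={json.dumps(resources)}",
+    ]
+
+
+print(build_ray_start_command(["192.168.100.50:6379"]))
+```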
+
+## Usage
+
+### 1. Build a Workflow in ComfyUI
+
+```bash
+cd ComfyUI && source .venv/bin/activate
+python main.py  # Starts at http://localhost:8188
+```
+
+Build a node graph: Load Image → TRELLIS Image-to-3D → Mesh Simplify → UniRig Skeleton → UniRig Skinning → Save GLB.
+
+Export the workflow in API format and save it to `workflows/`.
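+
+An API-format export is a JSON object that maps node IDs to `{"class_type": ..., "inputs": {...}}` entries. The sketch below shows how a driver might patch the reference image and seed into such a file before submitting it. The function name `patch_workflow`, the `HypotheticalSampler` node, and the assumption that seeded nodes expose an input literally named `seed` are illustrative and not taken from `generate.py`.
+
+```python
+import copy
+import json
+
+
+def patch_workflow(workflow, image_name, seed):
+    """Return a copy of an API-format workflow with image and seed overridden."""
+    patched = copy.deepcopy(workflow)
+    for node in patched.values():
+        inputs = node.get("inputs", {})
+        # LoadImage is ComfyUI's built-in image loader node.
+        if node.get("class_type") == "LoadImage":
+            inputs["image"] = image_name
+        # Assumption: seeded nodes expose an integer input named "seed".
+        if isinstance(inputs.get("seed"), int):
+            inputs["seed"] = seed
+    return patched
+
+
+example = {
+    "1": {"class_type": "LoadImage", "inputs": {"image": "old.png"}},
+    "2": {"class_type": "HypotheticalSampler", "inputs": {"seed": 0, "image": ["1", 0]}},
+}
+print(json.dumps(patch_workflow(example, "reference.png", 42), indent=2))
+```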
+
+### 2. Generate via CLI
+
+```bash
+# Run a saved workflow
+avatar-generate \
+    --workflow workflows/image-to-vrm.json \
+    --image reference.png \
+    --seed 42 \
+    --output-dir exports/
+
+# Verbose output
+avatar-generate -v --workflow workflows/image-to-vrm.json --image photo.png
+```
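+
+The submit and poll steps behind `avatar-generate` map onto ComfyUI's HTTP endpoints `POST /prompt` and `GET /history/{prompt_id}`. The sketch below is not the code in `generate.py`: `submit_and_wait` and its injectable `http` callable are hypothetical, chosen so the logic can be exercised without a running server.
+
+```python
+import json
+import time
+import urllib.request
+
+
+def http_json(url, payload=None):
+    """GET (or POST when a payload is given) and decode a JSON response."""
+    data = None if payload is None else json.dumps(payload).encode()
+    req = urllib.request.Request(
+        url, data=data, headers={"Content-Type": "application/json"}
+    )
+    with urllib.request.urlopen(req) as resp:
+        return json.loads(resp.read())
+
+
+def submit_and_wait(workflow, base_url="http://localhost:8188",
+                    http=http_json, interval=1.0, timeout=600.0):
+    """Queue a workflow, poll its history entry, and return the output map."""
+    prompt_id = http(f"{base_url}/prompt", {"prompt": workflow})["prompt_id"]
+    deadline = time.monotonic() + timeout
+    while time.monotonic() < deadline:
+        history = http(f"{base_url}/history/{prompt_id}")
+        # The history entry appears once the prompt has finished executing.
+        if prompt_id in history:
+            return history[prompt_id].get("outputs", {})
+        time.sleep(interval)
+    raise TimeoutError(f"prompt {prompt_id} did not finish within {timeout}s")
+```
+
+Files listed in the returned output map can then be fetched through ComfyUI's `/view` endpoint. The exact shape of the output entries depends on the nodes in the workflow.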
+
+### 3. Convert to VRM
+
+```bash
+blender --background --python scripts/vrm_export.py -- \
+    --input exports/model.glb \
+    --output exports/Silver-Mage.vrm \
+    --name "Silver Mage"
+```
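+
+Blender stops parsing its own options at a bare `--` and leaves the rest in `sys.argv`, which is why the script's options follow the separator. A minimal sketch of how a script like `vrm_export.py` could pick them up. The function name `parse_script_args` is hypothetical, and the actual script may parse its arguments differently.
+
+```python
+import argparse
+import sys
+
+
+def parse_script_args(argv=None):
+    """Parse only the arguments that follow Blender's `--` separator."""
+    argv = sys.argv if argv is None else argv
+    # Without a `--` there are no script arguments at all.
+    script_args = argv[argv.index("--") + 1:] if "--" in argv else []
+    parser = argparse.ArgumentParser(prog="vrm_export.py")
+    parser.add_argument("--input", required=True, help="rigged GLB to import")
+    parser.add_argument("--output", required=True, help="VRM file to write")
+    parser.add_argument("--name", default=None, help="avatar display name")
+    return parser.parse_args(script_args)
+
+
+args = parse_script_args([
+    "blender", "--background", "--python", "scripts/vrm_export.py", "--",
+    "--input", "exports/model.glb", "--output", "exports/Silver-Mage.vrm",
+    "--name", "Silver Mage",
+])
+print(args.input, args.output, args.name)
+```
+
+Inside Blender, the parsed paths would then feed the glTF importer (`bpy.ops.import_scene.gltf`) and the VRM add-on's export operator.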
+
+### 4. Log to MLflow
+
+Generation parameters and metrics are logged to the cluster's MLflow instance. Set the tracking URI:
+
+```bash
+export MLFLOW_TRACKING_URI=http://mlflow.lab.daviestechlabs.io:5000
+```
+
+Logging is integrated into the generate workflow, or can be called directly:
+
+```python
+from avatar_pipeline.log_mlflow import log_generation
+
+log_generation(
+    avatar_name="Silver-Mage",
+    params={"trellis_seed": 42, "trellis_simplify": 0.95, "texture_size": 1024},
+    metrics={"vertex_count": 12345, "face_count": 8000, "duration_s": 92.5},
+)
+```
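+
+Because `log_mlflow.py` talks to MLflow over REST rather than through the SDK, a call like the one above ends up as JSON bodies for endpoints such as `/api/2.0/mlflow/runs/log-batch`. The sketch below only builds such a body. `build_log_batch` and the run ID are hypothetical and do not reflect the module's actual internals.
+
+```python
+import time
+
+
+def build_log_batch(run_id, params, metrics, timestamp_ms=None):
+    """Build the JSON body for MLflow's runs/log-batch REST endpoint."""
+    ts = int(time.time() * 1000) if timestamp_ms is None else timestamp_ms
+    return {
+        "run_id": run_id,
+        # MLflow stores parameter values as strings.
+        "params": [{"key": k, "value": str(v)} for k, v in params.items()],
+        "metrics": [
+            {"key": k, "value": float(v), "timestamp": ts, "step": 0}
+            for k, v in metrics.items()
+        ],
+    }
+
+
+body = build_log_batch(
+    "hypothetical-run-id",
+    params={"trellis_seed": 42, "texture_size": 1024},
+    metrics={"vertex_count": 12345, "duration_s": 92.5},
+    timestamp_ms=0,
+)
+print(body)
+```
+
+The body would be POSTed to `$MLFLOW_TRACKING_URI/api/2.0/mlflow/runs/log-batch` after a run has been created through `runs/create`.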
+
+### 5. Promote to Production
+
+```bash
+# Preview what would be copied
+avatar-promote --dry-run exports/Silver-Mage.vrm
+
+# Promote to gravenhollow
+avatar-promote exports/Silver-Mage.vrm
+
+# Promote all VRM files
+avatar-promote exports/*.vrm
+```
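+
+Promotion comes down to one `rclone copy` per file, with `--dry-run` for the preview. A sketch under stated assumptions: the destination path `gravenhollow:avatars` and the helper `build_rclone_command` are made up for illustration, and the real bucket layout is defined in `promote.py`.
+
+```python
+from pathlib import Path
+
+# Assumption: remote name from the prerequisites, bucket path invented here.
+DEFAULT_DEST = "gravenhollow:avatars"
+
+
+def build_rclone_command(vrm_path, dest=DEFAULT_DEST, dry_run=False):
+    """Return the rclone argv that copies one VRM file to the remote."""
+    path = Path(vrm_path)
+    if path.suffix.lower() != ".vrm":
+        raise ValueError(f"not a VRM file: {path}")
+    cmd = ["rclone", "copy", str(path), dest]
+    if dry_run:
+        cmd.append("--dry-run")
+    return cmd
+
+
+print(build_rclone_command("exports/Silver-Mage.vrm", dry_run=True))
+```
+
+The resulting list can be handed to `subprocess.run(cmd, check=True)` so that a failed copy raises instead of passing silently.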
+
+After promotion, register the model in companions-frontend by adding it to `AllowedAvatarModels` in both the Go and JS allowlists.
+
+## Key Parameters
+
+| Parameter | Default | Description |
+|-----------|---------|-------------|
+| `trellis_seed` | random | Reproducibility seed for TRELLIS generation |
+| `trellis_steps` | 12 | Sampling steps (sparse structure + SLAT) |
+| `trellis_cfg_strength` | 7.5 | Classifier-free guidance strength |
+| `trellis_simplify` | 0.95 | Fraction of triangles removed during mesh simplification (0.0 = no reduction) |
+| `texture_size` | 1024 | Output texture resolution |
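+
+One way to carry these defaults through code is a small dataclass. `GenerationParams` is a hypothetical container that mirrors the table, not a class from the package, and the seed range is an arbitrary choice for the example.
+
+```python
+import random
+from dataclasses import asdict, dataclass, field
+
+
+@dataclass(frozen=True)
+class GenerationParams:
+    """Hypothetical container mirroring the defaults in the table above."""
+
+    # A fresh random seed is drawn unless one is passed explicitly.
+    trellis_seed: int = field(default_factory=lambda: random.randrange(2**31))
+    trellis_steps: int = 12
+    trellis_cfg_strength: float = 7.5
+    trellis_simplify: float = 0.95
+    texture_size: int = 1024
+
+
+params = asdict(GenerationParams(trellis_seed=42))
+print(params)
+```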
+
+## VRAM Budget (RTX 4070 — 12 GB)
+
+Models run sequentially, not concurrently:
+
+| Step | VRAM | Time |
+|------|------|------|
+| TRELLIS image-large (1.2B, fp16) | ~10 GB | ~30s |
+| UniRig skeleton prediction | ~4 GB | ~30s |
+| UniRig skinning weights | ~4 GB | ~30s |
+| Blender CLI (VRM export) | CPU only | ~10s |
+
+Peak: ~10 GB during TRELLIS, leaving about 2 GB of headroom on the 12 GB card. The 64 GB of system RAM absorbs model-loading overhead.
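+
+The budget holds because sequential execution is bounded by the largest single step rather than the sum of all steps. A quick check using the approximate figures from the table (the step labels are shorthand chosen for this example):
+
+```python
+VRAM_LIMIT_GB = 12  # RTX 4070
+
+# Approximate per-step VRAM from the table; Blender runs on the CPU.
+steps_gb = {"trellis": 10, "unirig_skeleton": 4, "unirig_skinning": 4, "blender": 0}
+
+sequential_peak = max(steps_gb.values())   # one model resident at a time
+concurrent_total = sum(steps_gb.values())  # everything loaded at once
+
+print(f"sequential peak: {sequential_peak} GB, fits: {sequential_peak <= VRAM_LIMIT_GB}")
+print(f"concurrent total: {concurrent_total} GB, fits: {concurrent_total <= VRAM_LIMIT_GB}")
+```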
+
+## Development
+
+```bash
+# Install in dev mode
+uv pip install -e ".[dev]"
+
+# Lint
+uv run ruff check .
+uv run ruff format --check .
+
+# Auto-fix
+uv run ruff check --fix . && uv run ruff format .
+```
+
+## Related
+
+- [ADR-0063](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0063-comfyui-3d-avatar-pipeline.md) — Design decision
+- [ADR-0011](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0011-kuberay-unified-gpu-backend.md) — KubeRay GPU backend
+- [TRELLIS](https://github.com/microsoft/TRELLIS) — Image-to-3D generation (CVPR'25)
+- [UniRig](https://github.com/VAST-AI-Research/UniRig) — Automatic rigging (SIGGRAPH'25)
+- [ComfyUI-3D-Pack](https://github.com/MrForExample/ComfyUI-3D-Pack) — 3D nodes for ComfyUI