# avatar-pipeline

ComfyUI-based image-to-VRM avatar generation pipeline using [TRELLIS](https://github.com/microsoft/TRELLIS) (image → 3D mesh) and [UniRig](https://github.com/VAST-AI-Research/UniRig) (automatic rigging), running on a desktop NVIDIA GPU as an on-demand [Ray](https://www.ray.io/) worker.

See [ADR-0063](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0063-comfyui-3d-avatar-pipeline.md) for the full design decision.

## Overview

```
Reference Image → TRELLIS (1.2B) → Textured GLB → UniRig → Rigged GLB → Blender CLI → VRM
                                                                                       │
                                                                          rclone → gravenhollow
                                                                          MLflow → experiment tracking
```

**Hardware target:** Arch Linux desktop with Ryzen 9 7950X, 64 GB DDR5, NVIDIA RTX 4070 (12 GB VRAM).

## Project Structure

```
avatar-pipeline/
├── pyproject.toml              # uv project, CLI entry points
├── renovate.json               # Renovate dependency updates
├── scripts/
│   ├── setup.sh                # Desktop environment setup (ComfyUI, 3D-Pack, UniRig, Blender, Ray)
│   ├── ray-join.sh             # Join the Ray cluster as external worker
│   └── vrm_export.py           # Blender headless GLB → VRM conversion
├── workflows/
│   └── (ComfyUI workflow JSONs — exported from UI)
├── avatar_pipeline/
│   ├── __init__.py
│   ├── generate.py             # ComfyUI API driver: submit workflow → poll → download
│   ├── log_mlflow.py           # Log params/metrics/artifacts to MLflow REST API
│   └── promote.py              # rclone promote VRM files to gravenhollow storage
├── LICENSE
└── README.md
```

## Setup

### Prerequisites

- NVIDIA drivers + CUDA toolkit (`sudo pacman -S nvidia nvidia-utils cuda cudnn`)
- [uv](https://astral.sh/uv) installed
- [Blender](https://www.blender.org/) 4.x with [VRM Add-on](https://vrm-addon-for-blender.info/en/)
- [rclone](https://rclone.org/) configured with `gravenhollow` remote

### Install Everything

```bash
./scripts/setup.sh
```

This installs ComfyUI, ComfyUI-3D-Pack (includes TRELLIS nodes), UniRig, Ray, and the avatar-pipeline package.

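
If the setup script fails early, it can help to confirm that the prerequisite executables are actually on `PATH`. The following is a minimal sketch, not part of the package: the executable names (`nvidia-smi`, `uv`, `blender`, `rclone`) and the helper `find_missing` are assumptions for illustration. It does not verify that the Blender VRM add-on is installed or that the `gravenhollow` remote is configured.

```python
import shutil

# Executable names are assumptions; adjust them to match your system.
REQUIRED_TOOLS = ["nvidia-smi", "uv", "blender", "rclone"]


def find_missing(tools, which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All prerequisite executables found.")
```

The `which` argument exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` is enough.
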
For partial installs:

```bash
./scripts/setup.sh --comfyui-only   # ComfyUI + 3D-Pack only
./scripts/setup.sh --unirig-only    # UniRig only
```

### Join Ray Cluster

```bash
# Set the Ray head address (exposed via NodePort from the Talos cluster)
./scripts/ray-join.sh 192.168.100.50:6379

# Or via env var
RAY_HEAD_ADDRESS=192.168.100.50:6379 ./scripts/ray-join.sh

# Verify
ray status
```

The desktop joins with resource labels `{"3d_gen": 1, "rtx4070": 1}` so only 3D generation workloads get scheduled here.

## Usage

### 1. Build a Workflow in ComfyUI

```bash
cd ComfyUI && source .venv/bin/activate
python main.py
# Starts at http://localhost:8188
```

Build a node graph: Load Image → TRELLIS Image-to-3D → Mesh Simplify → UniRig Skeleton → UniRig Skinning → Save GLB.

Export the workflow in API format and save to `workflows/`.

### 2. Generate via CLI

```bash
# Run a saved workflow
avatar-generate \
  --workflow workflows/image-to-vrm.json \
  --image reference.png \
  --seed 42 \
  --output-dir exports/

# Verbose output
avatar-generate -v --workflow workflows/image-to-vrm.json --image photo.png
```

### 3. Convert to VRM

```bash
blender --background --python scripts/vrm_export.py -- \
  --input exports/model.glb \
  --output exports/Silver-Mage.vrm \
  --name "Silver Mage"
```

### 4. Log to MLflow

Generation parameters and metrics are logged to the cluster's MLflow instance. Set the tracking URI:

```bash
export MLFLOW_TRACKING_URI=http://mlflow.lab.daviestechlabs.io:5000
```

Logging is integrated into the generate workflow, or can be called directly:

```python
from avatar_pipeline.log_mlflow import log_generation

log_generation(
    avatar_name="Silver-Mage",
    params={"trellis_seed": 42, "trellis_simplify": 0.95, "texture_size": 1024},
    metrics={"vertex_count": 12345, "face_count": 8000, "duration_s": 92.5},
)
```

### 5. Promote to Production

```bash
# Preview what would be copied
avatar-promote --dry-run exports/Silver-Mage.vrm

# Promote to gravenhollow
avatar-promote exports/Silver-Mage.vrm

# Promote all VRM files
avatar-promote exports/*.vrm
```

After promotion, register the model in companions-frontend by adding it to `AllowedAvatarModels` in the Go + JS allowlists.

## Key Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `trellis_seed` | random | Reproducibility seed for TRELLIS generation |
| `trellis_steps` | 12 | Sampling steps (sparse structure + SLAT) |
| `trellis_cfg_strength` | 7.5 | Classifier-free guidance strength |
| `trellis_simplify` | 0.95 | Triangle reduction ratio (1.0 = no reduction) |
| `texture_size` | 1024 | Output texture resolution |

## VRAM Budget (RTX 4070 — 12 GB)

Models run sequentially, not concurrently:

| Step | VRAM | Time |
|------|------|------|
| TRELLIS image-large (1.2B, fp16) | ~10 GB | ~30s |
| UniRig skeleton prediction | ~4 GB | ~30s |
| UniRig skinning weights | ~4 GB | ~30s |
| Blender CLI (VRM export) | CPU only | ~10s |

Peak: ~10 GB during TRELLIS. 64 GB system RAM handles model loading overhead.

## Development

```bash
# Install in dev mode
uv pip install -e ".[dev]"

# Lint
uv run ruff check .
uv run ruff format --check .

# Auto-fix
uv run ruff check --fix . && uv run ruff format .
```

## Related

- [ADR-0063](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0063-comfyui-3d-avatar-pipeline.md) — Design decision
- [ADR-0011](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0011-kuberay-unified-gpu-backend.md) — KubeRay GPU backend
- [TRELLIS](https://github.com/microsoft/TRELLIS) — Image-to-3D generation (CVPR'25)
- [UniRig](https://github.com/VAST-AI-Research/UniRig) — Automatic rigging (SIGGRAPH'25)
- [ComfyUI-3D-Pack](https://github.com/MrForExample/ComfyUI-3D-Pack) — 3D nodes for ComfyUI
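
As a worked check of the VRAM budget table above: because the steps run one after another, the peak is the largest single step, not the sum of all steps. The sketch below uses the approximate figures from the table. The function names are illustrative and not part of the package, and it assumes each step releases its VRAM before the next one starts.

```python
# Approximate figures copied from the VRAM budget table: (step, VRAM in GB, seconds).
STEPS = [
    ("TRELLIS image-large", 10, 30),
    ("UniRig skeleton prediction", 4, 30),
    ("UniRig skinning weights", 4, 30),
    ("Blender CLI (VRM export)", 0, 10),  # CPU only, so no VRAM
]
GPU_VRAM_GB = 12  # RTX 4070


def peak_sequential(steps):
    """Peak VRAM when steps run one at a time: the largest single step."""
    return max(vram for _, vram, _ in steps)


def peak_concurrent(steps):
    """Hypothetical peak if every model were resident at once: the sum."""
    return sum(vram for _, vram, _ in steps)


def total_time(steps):
    """Rough end-to-end duration in seconds for a sequential run."""
    return sum(seconds for _, _, seconds in steps)


if __name__ == "__main__":
    seq, conc = peak_sequential(STEPS), peak_concurrent(STEPS)
    print(f"sequential peak: ~{seq} GB, fits in {GPU_VRAM_GB} GB: {seq <= GPU_VRAM_GB}")
    print(f"concurrent peak: ~{conc} GB, fits in {GPU_VRAM_GB} GB: {conc <= GPU_VRAM_GB}")
    print(f"total time: ~{total_time(STEPS)} s")
```

With these figures the sequential peak is about 10 GB, which fits on the 12 GB card, while holding all models at once would need about 18 GB and would not.
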