# avatar-pipeline

ComfyUI-based image-to-VRM avatar generation pipeline using [TRELLIS](https://github.com/microsoft/TRELLIS) (image → 3D mesh) and [UniRig](https://github.com/VAST-AI-Research/UniRig) (automatic rigging), running on a desktop NVIDIA GPU as an on-demand [Ray](https://www.ray.io/) worker.

See [ADR-0063](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0063-comfyui-3d-avatar-pipeline.md) for the full design decision.

## Overview

```
Reference Image → TRELLIS (1.2B) → Textured GLB → UniRig → Rigged GLB → Blender CLI → VRM
                                                                                       │
                                                                            rclone → gravenhollow
                                                                            MLflow → experiment tracking
```

**Hardware target:** Arch Linux desktop with Ryzen 9 7950X, 64 GB DDR5, NVIDIA RTX 4070 (12 GB VRAM).

## Project Structure

```
avatar-pipeline/
├── pyproject.toml            # uv project, CLI entry points
├── renovate.json             # Renovate dependency updates
├── scripts/
│   ├── setup.sh              # Desktop environment setup (ComfyUI, 3D-Pack, UniRig, Blender, Ray)
│   ├── ray-join.sh           # Join the Ray cluster as external worker
│   └── vrm_export.py         # Blender headless GLB → VRM conversion
├── workflows/
│   └── (ComfyUI workflow JSONs, exported from UI)
├── avatar_pipeline/
│   ├── __init__.py
│   ├── generate.py           # ComfyUI API driver: submit workflow → poll → download
│   ├── log_mlflow.py         # Log params/metrics/artifacts to MLflow REST API
│   └── promote.py            # rclone promote VRM files to gravenhollow storage
├── LICENSE
└── README.md
```

## Setup

### Prerequisites

- NVIDIA drivers + CUDA toolkit (`sudo pacman -S nvidia nvidia-utils cuda cudnn`)
- [uv](https://astral.sh/uv) installed
- [Blender](https://www.blender.org/) 4.x with [VRM Add-on](https://vrm-addon-for-blender.info/en/)
- [rclone](https://rclone.org/) configured with `gravenhollow` remote

### Install Everything

```bash
./scripts/setup.sh
```

This installs ComfyUI, ComfyUI-3D-Pack (includes TRELLIS nodes), UniRig, Ray, and the avatar-pipeline package.

For partial installs:

```bash
./scripts/setup.sh --comfyui-only   # ComfyUI + 3D-Pack only
./scripts/setup.sh --unirig-only    # UniRig only
```

### Join Ray Cluster

```bash
# Set the Ray head address (exposed via NodePort from the Talos cluster)
./scripts/ray-join.sh 192.168.100.50:6379

# Or via env var
RAY_HEAD_ADDRESS=192.168.100.50:6379 ./scripts/ray-join.sh

# Verify
ray status
```

The desktop joins with the custom resources `{"3d_gen": 1, "rtx4070": 1}`, so 3D generation workloads that request these resources are scheduled here.

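As a rough sketch of how a workload might target this node: Ray places a task only on a node that advertises every resource the task requests, so a task asking for `3d_gen` can only land on a worker that joined with it. The names `gen_task_options` and `run_on_desktop` are hypothetical and not part of this repository. The sketch assumes `ray` is installed and a cluster is already reachable from the calling machine.

```python
from typing import Any, Callable


def gen_task_options(amount: float = 1) -> dict[str, Any]:
    """Ray task options that request the custom 3d_gen resource."""
    return {"resources": {"3d_gen": amount}}


def run_on_desktop(fn: Callable[..., Any], *args: Any) -> Any:
    """Run fn as a Ray task on a node that advertises 3d_gen (hypothetical helper)."""
    import ray  # imported lazily so gen_task_options works without Ray installed

    # Connect to the already-running cluster instead of starting a local one.
    ray.init(address="auto", ignore_reinit_error=True)
    remote_fn = ray.remote(**gen_task_options())(fn)
    return ray.get(remote_fn.remote(*args))
```

A task that does not request `3d_gen` is not constrained by this resource, so whether other work can also land on the desktop depends on how `ray-join.sh` advertises CPUs and GPUs.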
## Usage

### 1. Build a Workflow in ComfyUI

```bash
cd ComfyUI && source .venv/bin/activate
python main.py  # Starts at http://localhost:8188
```

Build a node graph: Load Image → TRELLIS Image-to-3D → Mesh Simplify → UniRig Skeleton → UniRig Skinning → Save GLB.

Export the workflow in API format and save it to `workflows/`.

### 2. Generate via CLI

```bash
# Run a saved workflow
avatar-generate \
  --workflow workflows/image-to-vrm.json \
  --image reference.png \
  --seed 42 \
  --output-dir exports/

# Verbose output
avatar-generate -v --workflow workflows/image-to-vrm.json --image photo.png
```

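The submit → poll → download cycle behind `avatar-generate` can be sketched against ComfyUI's HTTP API (`/prompt`, `/history/<id>`, `/view`). This is an illustration under assumptions, not the code in `generate.py`: the function names are hypothetical, the reference image is assumed to already sit in ComfyUI's input directory, and the input names `image` and `seed` are guesses that depend on the nodes in the exported workflow.

```python
import copy
import json
import os
import time
import urllib.parse
import urllib.request
import uuid

COMFY_URL = "http://localhost:8188"  # default address from step 1


def patch_workflow(workflow: dict, image_name: str, seed: int) -> dict:
    """Return a copy of an API-format workflow with image and seed filled in."""
    patched = copy.deepcopy(workflow)
    for node in patched.values():
        inputs = node.get("inputs", {})
        if node.get("class_type") == "LoadImage":
            inputs["image"] = image_name
        # A list value is a link to another node's output; leave those alone.
        if "seed" in inputs and not isinstance(inputs["seed"], list):
            inputs["seed"] = seed
    return patched


def submit(workflow: dict) -> str:
    """POST the workflow to /prompt and return the prompt id."""
    body = json.dumps({"prompt": workflow, "client_id": uuid.uuid4().hex}).encode()
    req = urllib.request.Request(
        f"{COMFY_URL}/prompt", data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["prompt_id"]


def wait_for_history(prompt_id: str, interval_s: float = 2.0) -> dict:
    """Poll /history/<id> until the prompt shows up in the history."""
    while True:
        with urllib.request.urlopen(f"{COMFY_URL}/history/{prompt_id}") as resp:
            history = json.load(resp)
        if prompt_id in history:
            return history[prompt_id]
        time.sleep(interval_s)


def output_files(entry: dict) -> list[dict]:
    """Collect every output record that carries a filename."""
    found = []
    for node_output in entry.get("outputs", {}).values():
        for value in node_output.values():
            if isinstance(value, list):
                found += [v for v in value if isinstance(v, dict) and "filename" in v]
    return found


def download(record: dict, dest_dir: str) -> str:
    """Fetch one output file through /view and write it to dest_dir."""
    query = urllib.parse.urlencode({
        "filename": record["filename"],
        "subfolder": record.get("subfolder", ""),
        "type": record.get("type", "output"),
    })
    path = os.path.join(dest_dir, record["filename"])
    with urllib.request.urlopen(f"{COMFY_URL}/view?{query}") as resp, open(path, "wb") as fh:
        fh.write(resp.read())
    return path
```

A real driver would also want a timeout on the polling loop and a check of the history entry's status before downloading.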
### 3. Convert to VRM

```bash
blender --background --python scripts/vrm_export.py -- \
  --input exports/model.glb \
  --output exports/Silver-Mage.vrm \
  --name "Silver Mage"
```

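Blender stops reading its own options at the bare `--` and leaves everything after it in `sys.argv` for the script. A minimal sketch of how a script such as `vrm_export.py` could pick up those arguments is shown here. The function name `parse_blender_args` is hypothetical, and treating `--name` as optional is an assumption. The Blender import and export calls are deliberately left out.

```python
import argparse
import sys


def parse_blender_args(argv: list[str]) -> argparse.Namespace:
    """Parse only the arguments that follow the "--" separator."""
    script_args = argv[argv.index("--") + 1:] if "--" in argv else []
    parser = argparse.ArgumentParser(prog="vrm_export.py")
    parser.add_argument("--input", required=True, help="rigged GLB to convert")
    parser.add_argument("--output", required=True, help="destination .vrm path")
    parser.add_argument("--name", default=None, help="avatar display name")
    return parser.parse_args(script_args)


if __name__ == "__main__" and "--" in sys.argv:
    args = parse_blender_args(sys.argv)
    print(f"{args.input} -> {args.output} ({args.name})")
```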
### 4. Log to MLflow

Generation parameters and metrics are logged to the cluster's MLflow instance. Set the tracking URI:

```bash
export MLFLOW_TRACKING_URI=http://mlflow.lab.daviestechlabs.io:5000
```

Logging is integrated into the generate workflow, or can be called directly:

```python
from avatar_pipeline.log_mlflow import log_generation

log_generation(
    avatar_name="Silver-Mage",
    params={"trellis_seed": 42, "trellis_simplify": 0.95, "texture_size": 1024},
    metrics={"vertex_count": 12345, "face_count": 8000, "duration_s": 92.5},
)
```

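To illustrate what REST-only logging without the MLflow SDK can look like, the sketch below uses the `runs/create`, `runs/log-batch`, and `runs/update` endpoints of the MLflow REST API. It is not the implementation of `log_generation`. The helper names and the `experiment_id` argument are assumptions made for the example.

```python
import json
import time
import urllib.request


def log_batch_payload(run_id: str, params: dict, metrics: dict, timestamp_ms=None) -> dict:
    """Build the JSON body for runs/log-batch."""
    ts = int(time.time() * 1000) if timestamp_ms is None else timestamp_ms
    return {
        "run_id": run_id,
        # MLflow stores parameter values as strings.
        "params": [{"key": k, "value": str(v)} for k, v in params.items()],
        "metrics": [
            {"key": k, "value": float(v), "timestamp": ts, "step": 0}
            for k, v in metrics.items()
        ],
    }


def post(tracking_uri: str, endpoint: str, payload: dict) -> dict:
    """POST a JSON payload to an MLflow REST endpoint and decode the reply."""
    req = urllib.request.Request(
        f"{tracking_uri.rstrip('/')}/api/2.0/mlflow/{endpoint}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def log_run(tracking_uri: str, experiment_id: str, run_name: str,
            params: dict, metrics: dict) -> str:
    """Create a run, log params and metrics in one batch, then mark it finished."""
    start = int(time.time() * 1000)
    created = post(tracking_uri, "runs/create",
                   {"experiment_id": experiment_id, "run_name": run_name, "start_time": start})
    run_id = created["run"]["info"]["run_id"]
    post(tracking_uri, "runs/log-batch", log_batch_payload(run_id, params, metrics))
    post(tracking_uri, "runs/update",
         {"run_id": run_id, "status": "FINISHED", "end_time": int(time.time() * 1000)})
    return run_id
```

Sending params and metrics in a single `log-batch` call keeps the number of round trips per generation at three.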
### 5. Promote to Production

```bash
# Preview what would be copied
avatar-promote --dry-run exports/Silver-Mage.vrm

# Promote to gravenhollow
avatar-promote exports/Silver-Mage.vrm

# Promote all VRM files
avatar-promote exports/*.vrm
```

After promotion, register the model in companions-frontend by adding it to `AllowedAvatarModels` in both the Go and JS allowlists.

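The promotion step amounts to an `rclone copy` per file, with `--dry-run` passed through for previews. The following is a sketch only: `rclone_copy_cmd` and `promote` are hypothetical names, and the destination path `gravenhollow:avatars` is a placeholder, since the real bucket layout is not stated here.

```python
import subprocess


def rclone_copy_cmd(vrm_path: str, remote_dir: str = "gravenhollow:avatars",
                    dry_run: bool = False) -> list[str]:
    """Build the rclone command that copies one VRM file to the remote."""
    if not vrm_path.lower().endswith(".vrm"):
        raise ValueError(f"not a VRM file: {vrm_path}")
    cmd = ["rclone", "copy", vrm_path, remote_dir]
    if dry_run:
        cmd.append("--dry-run")  # rclone reports what it would copy, copies nothing
    return cmd


def promote(paths: list[str], dry_run: bool = False) -> None:
    """Copy each file in turn, stopping at the first rclone failure."""
    for path in paths:
        subprocess.run(rclone_copy_cmd(path, dry_run=dry_run), check=True)
```

Building the argument list separately from running it keeps the command easy to log and to check without touching the remote.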
## Key Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `trellis_seed` | random | Reproducibility seed for TRELLIS generation |
| `trellis_steps` | 12 | Sampling steps (sparse structure + SLAT) |
| `trellis_cfg_strength` | 7.5 | Classifier-free guidance strength |
| `trellis_simplify` | 0.95 | Triangle reduction ratio (1.0 = no reduction) |
| `texture_size` | 1024 | Output texture resolution |

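One way to carry these defaults around in code is a small dataclass whose dictionary form can be passed as `params` when logging. The class name `GenerationParams` is hypothetical, and the 32-bit range for the random seed is an assumption.

```python
import random
from dataclasses import asdict, dataclass, field


@dataclass
class GenerationParams:
    """Defaults from the table above; a fresh random seed is drawn when none is given."""

    trellis_seed: int = field(default_factory=lambda: random.randrange(2**32))
    trellis_steps: int = 12
    trellis_cfg_strength: float = 7.5
    trellis_simplify: float = 0.95
    texture_size: int = 1024


params = GenerationParams(trellis_seed=42)
print(asdict(params))
```

Recording the drawn seed along with the other parameters is what makes a run reproducible later.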
## VRAM Budget (RTX 4070, 12 GB)

Models run sequentially, not concurrently:

| Step | VRAM | Time |
|------|------|------|
| TRELLIS image-large (1.2B, fp16) | ~10 GB | ~30s |
| UniRig skeleton prediction | ~4 GB | ~30s |
| UniRig skinning weights | ~4 GB | ~30s |
| Blender CLI (VRM export) | CPU only | ~10s |

Peak: ~10 GB during TRELLIS. 64 GB system RAM handles model loading overhead.

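The arithmetic behind the budget can be checked in a few lines. With sequential execution the peak is the largest single step, while the total time is the sum of the steps. The figures are the approximate values from the table, and the function name `budget` is hypothetical.

```python
# (step, approximate VRAM in GB, approximate time in seconds), from the table above
STEPS = [
    ("TRELLIS image-large", 10, 30),
    ("UniRig skeleton prediction", 4, 30),
    ("UniRig skinning weights", 4, 30),
    ("Blender CLI (VRM export)", 0, 10),  # CPU only
]


def budget(steps):
    """Summarise peak VRAM for sequential vs concurrent runs, and total time."""
    return {
        "sequential_peak_gb": max(vram for _, vram, _ in steps),
        "concurrent_peak_gb": sum(vram for _, vram, _ in steps),
        "total_time_s": sum(seconds for _, _, seconds in steps),
    }


print(budget(STEPS))
# {'sequential_peak_gb': 10, 'concurrent_peak_gb': 18, 'total_time_s': 100}
```

Holding all models in memory at once would need roughly 18 GB, which exceeds the 12 GB card. That is why the steps run one after another, for an end-to-end time of about 100 seconds.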
## Development

```bash
# Install in dev mode
uv pip install -e ".[dev]"

# Lint
uv run ruff check .
uv run ruff format --check .

# Auto-fix
uv run ruff check --fix . && uv run ruff format .
```

## Related

- [ADR-0063](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0063-comfyui-3d-avatar-pipeline.md): Design decision
- [ADR-0011](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0011-kuberay-unified-gpu-backend.md): KubeRay GPU backend
- [TRELLIS](https://github.com/microsoft/TRELLIS): Image-to-3D generation (CVPR'25)
- [UniRig](https://github.com/VAST-AI-Research/UniRig): Automatic rigging (SIGGRAPH'25)
- [ComfyUI-3D-Pack](https://github.com/MrForExample/ComfyUI-3D-Pack): 3D nodes for ComfyUI