feat: scaffold avatar pipeline with ComfyUI driver, MLflow logging, and rclone promotion

- setup.sh: automated desktop env setup (ComfyUI, 3D-Pack, UniRig, Blender, Ray)
- ray-join.sh: join Ray cluster as external worker with 3d_gen resource label
- vrm_export.py: headless Blender GLB→VRM conversion script
- generate.py: ComfyUI API driver (submit workflow JSON, poll, download outputs)
- log_mlflow.py: REST-only MLflow experiment tracking (no SDK dependency)
- promote.py: rclone promotion of VRM files to gravenhollow S3
- CLI entry points: avatar-generate, avatar-promote
- workflows/ placeholder for ComfyUI exported workflow JSONs

Implements ADR-0063 (ComfyUI + TRELLIS + UniRig 3D avatar pipeline)

189
README.md
@@ -1,2 +1,191 @@
# avatar-pipeline

ComfyUI-based image-to-VRM avatar generation pipeline using [TRELLIS](https://github.com/microsoft/TRELLIS) (image → 3D mesh) and [UniRig](https://github.com/VAST-AI-Research/UniRig) (automatic rigging), running on a desktop NVIDIA GPU as an on-demand [Ray](https://www.ray.io/) worker.

See [ADR-0063](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0063-comfyui-3d-avatar-pipeline.md) for the full design decision.

## Overview

```
Reference Image → TRELLIS (1.2B) → Textured GLB → UniRig → Rigged GLB → Blender CLI → VRM
                                                                                       │
                                                                             rclone → gravenhollow
                                                                             MLflow → experiment tracking
```

**Hardware target:** Arch Linux desktop with Ryzen 9 7950X, 64 GB DDR5, NVIDIA RTX 4070 (12 GB VRAM).

## Project Structure

```
avatar-pipeline/
├── pyproject.toml          # uv project, CLI entry points
├── renovate.json           # Renovate dependency updates
├── scripts/
│   ├── setup.sh            # Desktop environment setup (ComfyUI, 3D-Pack, UniRig, Blender, Ray)
│   ├── ray-join.sh         # Join the Ray cluster as external worker
│   └── vrm_export.py       # Blender headless GLB → VRM conversion
├── workflows/
│   └── (ComfyUI workflow JSONs — exported from UI)
├── avatar_pipeline/
│   ├── __init__.py
│   ├── generate.py         # ComfyUI API driver: submit workflow → poll → download
│   ├── log_mlflow.py       # Log params/metrics/artifacts to MLflow REST API
│   └── promote.py          # rclone promote VRM files to gravenhollow storage
├── LICENSE
└── README.md
```

## Setup

### Prerequisites

- NVIDIA drivers + CUDA toolkit (`sudo pacman -S nvidia nvidia-utils cuda cudnn`)
- [uv](https://astral.sh/uv) installed
- [Blender](https://www.blender.org/) 4.x with [VRM Add-on](https://vrm-addon-for-blender.info/en/)
- [rclone](https://rclone.org/) configured with `gravenhollow` remote

### Install Everything

```bash
./scripts/setup.sh
```

This installs ComfyUI, ComfyUI-3D-Pack (includes TRELLIS nodes), UniRig, Ray, and the avatar-pipeline package.

For partial installs:
```bash
./scripts/setup.sh --comfyui-only   # ComfyUI + 3D-Pack only
./scripts/setup.sh --unirig-only    # UniRig only
```

### Join Ray Cluster

```bash
# Set the Ray head address (exposed via NodePort from the Talos cluster)
./scripts/ray-join.sh 192.168.100.50:6379

# Or via env var
RAY_HEAD_ADDRESS=192.168.100.50:6379 ./scripts/ray-join.sh

# Verify
ray status
```

The desktop joins with resource labels `{"3d_gen": 1, "rtx4070": 1}` so only 3D generation workloads get scheduled here.

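The contents of `ray-join.sh` are not shown here. As a rough sketch, joining an existing cluster with custom resources usually comes down to one `ray start` call, and the helper below only assembles that command line. The name `build_ray_start_cmd` is hypothetical; `--address` and `--resources` are standard `ray start` options.

```python
import json
import shlex


def build_ray_start_cmd(head_address: str, resources: dict[str, float]) -> list[str]:
    """Assemble a `ray start` command that joins an existing cluster.

    Hypothetical helper for illustration; the real ray-join.sh may differ.
    """
    return [
        "ray",
        "start",
        f"--address={head_address}",
        # Custom resources are passed to Ray as a JSON object.
        f"--resources={json.dumps(resources)}",
    ]


cmd = build_ray_start_cmd("192.168.100.50:6379", {"3d_gen": 1, "rtx4070": 1})
print(shlex.join(cmd))
```

A task opts in to this node by requesting the custom resource, for example with `@ray.remote(resources={"3d_gen": 1})`.
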
## Usage

### 1. Build a Workflow in ComfyUI

```bash
cd ComfyUI && source .venv/bin/activate
python main.py  # Starts at http://localhost:8188
```

Build a node graph: Load Image → TRELLIS Image-to-3D → Mesh Simplify → UniRig Skeleton → UniRig Skinning → Save GLB.

Export the workflow in API format and save to `workflows/`.

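An API-format export is a JSON object keyed by node ID, where each node carries a `class_type` and an `inputs` mapping. The sketch below shows one way a driver might override the reference image and seed before submitting. `patch_workflow` and the `HypotheticalTrellisNode` class name are illustrative assumptions; the actual node names depend on the exported graph.

```python
import copy


def patch_workflow(workflow: dict, image_name: str, seed: int) -> dict:
    """Return a copy of an API-format workflow with image and seed overridden."""
    patched = copy.deepcopy(workflow)  # leave the saved workflow untouched
    for node in patched.values():
        inputs = node.get("inputs", {})
        if node.get("class_type") == "LoadImage":
            inputs["image"] = image_name
        if "seed" in inputs:
            inputs["seed"] = seed
    return patched


# Tiny stand-in for an exported workflow (node "2" is hypothetical).
example = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "placeholder.png"}},
    "2": {"class_type": "HypotheticalTrellisNode", "inputs": {"seed": 0, "image": ["1", 0]}},
}
patched = patch_workflow(example, "reference.png", 42)
print(patched["1"]["inputs"]["image"], patched["2"]["inputs"]["seed"])
```
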
### 2. Generate via CLI

```bash
# Run a saved workflow
avatar-generate \
  --workflow workflows/image-to-vrm.json \
  --image reference.png \
  --seed 42 \
  --output-dir exports/

# Verbose output
avatar-generate -v --workflow workflows/image-to-vrm.json --image photo.png
```

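The driver is described as submit → poll → download. A minimal sketch of that loop against ComfyUI's HTTP endpoints (`POST /prompt`, `GET /history/{prompt_id}`, `GET /view`) could look like the following. The function names are hypothetical and this is not the code in `generate.py`. How a Save GLB node reports its file in the history entry is also an assumption, so the collector accepts any output list whose items carry a `filename`.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8188"


def submit(workflow: dict, base_url: str = BASE_URL) -> str:
    """Queue an API-format workflow and return its prompt ID."""
    body = json.dumps({"prompt": workflow}).encode()
    req = urllib.request.Request(
        f"{base_url}/prompt", data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["prompt_id"]


def wait_for_history(prompt_id: str, base_url: str = BASE_URL,
                     interval: float = 2.0, timeout: float = 600.0) -> dict:
    """Poll the history endpoint until the prompt appears or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with urllib.request.urlopen(f"{base_url}/history/{prompt_id}") as resp:
            history = json.load(resp)
        if prompt_id in history:
            return history[prompt_id]
        time.sleep(interval)
    raise TimeoutError(f"prompt {prompt_id} did not finish within {timeout}s")


def output_urls(entry: dict, base_url: str = BASE_URL) -> list[str]:
    """Build /view download URLs for every file listed in a history entry."""
    urls = []
    for node_output in entry.get("outputs", {}).values():
        for items in node_output.values():
            if not isinstance(items, list):
                continue
            for item in items:
                if isinstance(item, dict) and "filename" in item:
                    query = urllib.parse.urlencode({
                        "filename": item["filename"],
                        "subfolder": item.get("subfolder", ""),
                        "type": item.get("type", "output"),
                    })
                    urls.append(f"{base_url}/view?{query}")
    return urls
```

Each URL returned by `output_urls` can then be fetched with `urllib.request.urlretrieve` into the output directory.
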
### 3. Convert to VRM

```bash
blender --background --python scripts/vrm_export.py -- \
  --input exports/model.glb \
  --output exports/Silver-Mage.vrm \
  --name "Silver Mage"
```

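Blender stops parsing its own options at `--` and leaves everything after it in `sys.argv` for the script. The sketch below shows the usual way a script picks up that tail with `argparse`. It covers only argument handling, not the conversion itself, and `parse_script_args` is a hypothetical name.

```python
import argparse


def parse_script_args(argv: list[str]) -> argparse.Namespace:
    """Parse only the arguments that follow the `--` separator."""
    script_argv = argv[argv.index("--") + 1:] if "--" in argv else []
    parser = argparse.ArgumentParser(prog="vrm_export.py")
    parser.add_argument("--input", required=True, help="rigged GLB to convert")
    parser.add_argument("--output", required=True, help="destination VRM path")
    parser.add_argument("--name", default=None, help="display name for the avatar")
    return parser.parse_args(script_argv)


# Inside Blender this would be parse_script_args(sys.argv).
example_argv = [
    "blender", "--background", "--python", "scripts/vrm_export.py", "--",
    "--input", "exports/model.glb",
    "--output", "exports/Silver-Mage.vrm",
    "--name", "Silver Mage",
]
args = parse_script_args(example_argv)
print(args.input, args.output, args.name)
```
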
### 4. Log to MLflow

Generation parameters and metrics are logged to the cluster's MLflow instance. Set the tracking URI:

```bash
export MLFLOW_TRACKING_URI=http://mlflow.lab.daviestechlabs.io:5000
```

Logging is integrated into the generate workflow, or can be called directly:

```python
from avatar_pipeline.log_mlflow import log_generation

log_generation(
    avatar_name="Silver-Mage",
    params={"trellis_seed": 42, "trellis_simplify": 0.95, "texture_size": 1024},
    metrics={"vertex_count": 12345, "face_count": 8000, "duration_s": 92.5},
)
```

### 5. Promote to Production

```bash
# Preview what would be copied
avatar-promote --dry-run exports/Silver-Mage.vrm

# Promote to gravenhollow
avatar-promote exports/Silver-Mage.vrm

# Promote all VRM files
avatar-promote exports/*.vrm
```

After promotion, register the model in companions-frontend by adding it to `AllowedAvatarModels` in the Go + JS allowlists.

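Promotion is described as an rclone copy to the `gravenhollow` remote. The sketch below assembles the kind of `rclone copy` commands such a wrapper might run, with `--dry-run` mapped to rclone's own flag of the same name. The destination path `gravenhollow:avatars` is a placeholder assumption, as is the helper name; the real bucket and layout are not stated here.

```python
import shlex


def build_rclone_cmds(files: list[str], dest: str, dry_run: bool = False) -> list[list[str]]:
    """Build one `rclone copy` command per VRM file (hypothetical helper)."""
    cmds = []
    for path in files:
        cmd = ["rclone", "copy", path, dest]
        if dry_run:
            cmd.append("--dry-run")  # rclone reports what it would copy, without copying
        cmds.append(cmd)
    return cmds


# "gravenhollow:avatars" is an assumed destination, used for illustration only.
cmds = build_rclone_cmds(["exports/Silver-Mage.vrm"], "gravenhollow:avatars", dry_run=True)
for cmd in cmds:
    print(shlex.join(cmd))
```

Each command list can be passed to `subprocess.run(cmd, check=True)` so that a failed copy raises instead of passing silently.
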
## Key Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `trellis_seed` | random | Reproducibility seed for TRELLIS generation |
| `trellis_steps` | 12 | Sampling steps (sparse structure + SLAT) |
| `trellis_cfg_strength` | 7.5 | Classifier-free guidance strength |
| `trellis_simplify` | 0.95 | Triangle reduction ratio (1.0 = no reduction) |
| `texture_size` | 1024 | Output texture resolution |

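One way to carry these defaults around in code is a small dataclass, sketched below. `GenerationParams` is a hypothetical container, not something the package is known to define. The field names and defaults mirror the table, with the random seed drawn when none is given.

```python
import random
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class GenerationParams:
    """Defaults copied from the Key Parameters table (hypothetical container)."""
    trellis_seed: int = field(default_factory=lambda: random.randrange(2**32))
    trellis_steps: int = 12
    trellis_cfg_strength: float = 7.5
    trellis_simplify: float = 0.95
    texture_size: int = 1024


params = GenerationParams(trellis_seed=42)
print(asdict(params))
```

The resulting dictionary has the same keys used in the MLflow logging example, so it can be logged as run parameters directly.
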
## VRAM Budget (RTX 4070 — 12 GB)

Models run sequentially, not concurrently:

| Step | VRAM | Time |
|------|------|------|
| TRELLIS image-large (1.2B, fp16) | ~10 GB | ~30s |
| UniRig skeleton prediction | ~4 GB | ~30s |
| UniRig skinning weights | ~4 GB | ~30s |
| Blender CLI (VRM export) | CPU only | ~10s |

Peak: ~10 GB during TRELLIS. 64 GB system RAM handles model loading overhead.

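The budget arithmetic can be checked in a few lines. Because the steps run one after another, peak VRAM is the maximum across steps rather than the sum, while the times add up. The figures are the approximate values from the table.

```python
# (step, approx. VRAM in GB, approx. time in seconds), taken from the table above
steps = [
    ("TRELLIS image-large", 10.0, 30),
    ("UniRig skeleton prediction", 4.0, 30),
    ("UniRig skinning weights", 4.0, 30),
    ("Blender CLI (VRM export)", 0.0, 10),  # CPU only
]

GPU_VRAM_GB = 12.0

peak_vram_gb = max(vram for _, vram, _ in steps)    # sequential: max, not sum
concurrent_vram_gb = sum(vram for _, vram, _ in steps)
total_time_s = sum(seconds for _, _, seconds in steps)
headroom_gb = GPU_VRAM_GB - peak_vram_gb

print(f"peak {peak_vram_gb} GB, headroom {headroom_gb} GB, total ~{total_time_s} s")
print(f"if run concurrently: {concurrent_vram_gb} GB (exceeds {GPU_VRAM_GB} GB)")
```

Running all models at once would need roughly 18 GB, which is why the sequential order matters on a 12 GB card.
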
## Development

```bash
# Install in dev mode
uv pip install -e ".[dev]"

# Lint
uv run ruff check .
uv run ruff format --check .

# Auto-fix
uv run ruff check --fix . && uv run ruff format .
```

## Related

- [ADR-0063](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0063-comfyui-3d-avatar-pipeline.md) — Design decision
- [ADR-0011](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0011-kuberay-unified-gpu-backend.md) — KubeRay GPU backend
- [TRELLIS](https://github.com/microsoft/TRELLIS) — Image-to-3D generation (CVPR'25)
- [UniRig](https://github.com/VAST-AI-Research/UniRig) — Automatic rigging (SIGGRAPH'25)
- [ComfyUI-3D-Pack](https://github.com/MrForExample/ComfyUI-3D-Pack) — 3D nodes for ComfyUI