
Billy D. 202b4e1d61 feat: scaffold avatar pipeline with ComfyUI driver, MLflow logging, and rclone promotion

- setup.sh: automated desktop env setup (ComfyUI, 3D-Pack, UniRig, Blender, Ray)
- ray-join.sh: join Ray cluster as external worker with 3d_gen resource label
- vrm_export.py: headless Blender GLB→VRM conversion script
- generate.py: ComfyUI API driver (submit workflow JSON, poll, download outputs)
- log_mlflow.py: REST-only MLflow experiment tracking (no SDK dependency)
- promote.py: rclone promotion of VRM files to gravenhollow S3
- CLI entry points: avatar-generate, avatar-promote
- workflows/ placeholder for ComfyUI exported workflow JSONs

Implements ADR-0063 (ComfyUI + TRELLIS + UniRig 3D avatar pipeline)

2026-02-24 05:44:04 -05:00


avatar-pipeline

ComfyUI-based image-to-VRM avatar generation pipeline using TRELLIS (image → 3D mesh) and UniRig (automatic rigging), running on a desktop NVIDIA GPU as an on-demand Ray worker.

See ADR-0063 for the full design decision.

Overview

Reference Image → TRELLIS (1.2B) → Textured GLB → UniRig → Rigged GLB → Blender CLI → VRM
                                                                                        │
                                                                               rclone → gravenhollow
                                                                               MLflow → experiment tracking

Hardware target: Arch Linux desktop with Ryzen 9 7950X, 64 GB DDR5, NVIDIA RTX 4070 (12 GB VRAM).

Project Structure

avatar-pipeline/
├── pyproject.toml             # uv project, CLI entry points
├── renovate.json              # Renovate dependency updates
├── scripts/
│   ├── setup.sh               # Desktop environment setup (ComfyUI, 3D-Pack, UniRig, Blender, Ray)
│   ├── ray-join.sh            # Join the Ray cluster as external worker
│   └── vrm_export.py          # Blender headless GLB → VRM conversion
├── workflows/
│   └── (ComfyUI workflow JSONs — exported from UI)
├── avatar_pipeline/
│   ├── __init__.py
│   ├── generate.py            # ComfyUI API driver: submit workflow → poll → download
│   ├── log_mlflow.py          # Log params/metrics/artifacts to MLflow REST API
│   └── promote.py             # rclone promote VRM files to gravenhollow storage
├── LICENSE
└── README.md

Setup

Prerequisites

NVIDIA drivers + CUDA toolkit (sudo pacman -S nvidia nvidia-utils cuda cudnn)
uv installed
Blender 4.x with VRM Add-on
rclone configured with gravenhollow remote

Install Everything

./scripts/setup.sh

This installs ComfyUI, ComfyUI-3D-Pack (includes TRELLIS nodes), UniRig, Ray, and the avatar-pipeline package.

For partial installs:

./scripts/setup.sh --comfyui-only   # ComfyUI + 3D-Pack only
./scripts/setup.sh --unirig-only    # UniRig only

Join Ray Cluster

# Set the Ray head address (exposed via NodePort from the Talos cluster)
./scripts/ray-join.sh 192.168.100.50:6379

# Or via env var
RAY_HEAD_ADDRESS=192.168.100.50:6379 ./scripts/ray-join.sh

# Verify
ray status

The desktop joins the cluster advertising the custom resources {"3d_gen": 1, "rtx4070": 1}, so workloads that request them (3D generation) are scheduled here.
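As a rough illustration of the scheduling side, the sketch below shows how a task could ask for the `3d_gen` resource. It assumes the labels are registered as Ray custom resources. `generate_avatar` and `submit` are hypothetical names that do not exist in this repo.

```python
# Hypothetical sketch: route a task to the desktop worker via a Ray custom resource.
GEN_RESOURCES = {"3d_gen": 1}  # matches the resource the desktop advertises


def generate_avatar(image_path: str) -> str:
    """Placeholder task body; the real work would drive the pipeline."""
    return f"would generate an avatar from {image_path}"


def submit(image_path: str) -> str:
    """Run generate_avatar on a node that advertises the 3d_gen resource."""
    import ray  # imported lazily so this module loads without Ray installed

    # Connect to the already-running cluster instead of starting a local one.
    ray.init(address="auto", ignore_reinit_error=True)
    remote_fn = ray.remote(resources=GEN_RESOURCES)(generate_avatar)
    return ray.get(remote_fn.remote(image_path))
```

A task that requests `3d_gen` can only be placed on a node that advertises it, so it stays pending until the desktop has joined.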

Usage

1. Build a Workflow in ComfyUI

cd ComfyUI && source .venv/bin/activate
python main.py  # Starts at http://localhost:8188

Build a node graph: Load Image → TRELLIS Image-to-3D → Mesh Simplify → UniRig Skeleton → UniRig Skinning → Save GLB.

Export the workflow in API format and save to workflows/.

2. Generate via CLI

# Run a saved workflow
avatar-generate \
    --workflow workflows/image-to-vrm.json \
    --image reference.png \
    --seed 42 \
    --output-dir exports/

# Verbose output
avatar-generate -v --workflow workflows/image-to-vrm.json --image photo.png
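For orientation, here is a minimal sketch of the submit → poll → download loop against ComfyUI's HTTP API (`POST /prompt`, `GET /history/<prompt_id>`, `GET /view`). It is not the code in generate.py. The function names are hypothetical, and both the seed-patching rule and the use of `/view` for mesh outputs are assumptions.

```python
import json
import time
import urllib.parse
import urllib.request
import uuid

COMFYUI_URL = "http://localhost:8188"  # default address from step 1


def set_seed(workflow: dict, seed: int) -> dict:
    """Return a copy of an API-format workflow with every `seed` input replaced."""
    patched = json.loads(json.dumps(workflow))  # cheap deep copy of plain JSON data
    for node in patched.values():
        if "seed" in node.get("inputs", {}):
            node["inputs"]["seed"] = seed
    return patched


def build_payload(workflow: dict, client_id: str = "") -> bytes:
    """Encode the JSON body sent to POST /prompt."""
    body = {"prompt": workflow, "client_id": client_id or uuid.uuid4().hex}
    return json.dumps(body).encode("utf-8")


def submit(workflow: dict) -> str:
    """Queue the workflow and return the prompt_id ComfyUI assigns to it."""
    req = urllib.request.Request(
        f"{COMFYUI_URL}/prompt",
        data=build_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["prompt_id"]


def wait_for_outputs(prompt_id: str, interval_s: float = 2.0, timeout_s: float = 600.0) -> dict:
    """Poll the history endpoint until the run shows up, then return its outputs."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        with urllib.request.urlopen(f"{COMFYUI_URL}/history/{prompt_id}") as resp:
            history = json.load(resp)
        if prompt_id in history:
            return history[prompt_id]["outputs"]
        time.sleep(interval_s)
    raise TimeoutError(f"workflow {prompt_id} did not finish within {timeout_s}s")


def view_url(filename: str, subfolder: str = "", folder_type: str = "output") -> str:
    """Build the GET /view URL for downloading one output file."""
    query = urllib.parse.urlencode(
        {"filename": filename, "subfolder": subfolder, "type": folder_type}
    )
    return f"{COMFYUI_URL}/view?{query}"
```

Patching a copy of the workflow leaves the exported JSON in workflows/ untouched, so the same file can be reused with different seeds.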

3. Convert to VRM

blender --background --python scripts/vrm_export.py -- \
    --input exports/model.glb \
    --output exports/Silver-Mage.vrm \
    --name "Silver Mage"
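Blender hands everything after the bare `--` to the script unchanged, but `sys.argv` still contains Blender's own flags. The sketch below shows one common way a script can pick out its own arguments. It is not the actual contents of vrm_export.py. The flag names mirror the command above, and the import and export calls are omitted.

```python
import argparse


def parse_script_args(argv: list[str]) -> argparse.Namespace:
    """Parse only the arguments that follow Blender's `--` separator."""
    script_argv = argv[argv.index("--") + 1:] if "--" in argv else []
    parser = argparse.ArgumentParser(prog="vrm_export.py")
    parser.add_argument("--input", required=True, help="rigged GLB to import")
    parser.add_argument("--output", required=True, help="VRM file to write")
    parser.add_argument("--name", default="", help="avatar display name")
    return parser.parse_args(script_argv)


# Inside the Blender script this would be called as:
#   import sys
#   args = parse_script_args(sys.argv)
```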

4. Log to MLflow

Generation parameters and metrics are logged to the cluster's MLflow instance. Set the tracking URI:

export MLFLOW_TRACKING_URI=http://mlflow.lab.daviestechlabs.io:5000

Logging is integrated into the generate workflow, or can be called directly:

from avatar_pipeline.log_mlflow import log_generation

log_generation(
    avatar_name="Silver-Mage",
    params={"trellis_seed": 42, "trellis_simplify": 0.95, "texture_size": 1024},
    metrics={"vertex_count": 12345, "face_count": 8000, "duration_s": 92.5},
)
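To show what REST-only logging can look like, the sketch below builds a request body for MLflow's `runs/log-batch` endpoint and posts it with the standard library. It is an illustration and not the code in log_mlflow.py. `build_log_batch` and `post_json` are hypothetical helpers.

```python
import json
import time
import urllib.request
from typing import Optional


def build_log_batch(run_id: str, params: dict, metrics: dict,
                    timestamp_ms: Optional[int] = None) -> dict:
    """Build the JSON body for POST /api/2.0/mlflow/runs/log-batch."""
    ts = int(time.time() * 1000) if timestamp_ms is None else timestamp_ms
    return {
        "run_id": run_id,
        # MLflow stores param values as strings.
        "params": [{"key": k, "value": str(v)} for k, v in params.items()],
        "metrics": [
            {"key": k, "value": float(v), "timestamp": ts, "step": 0}
            for k, v in metrics.items()
        ],
    }


def post_json(tracking_uri: str, endpoint: str, body: dict) -> dict:
    """POST a JSON body to an MLflow REST endpoint and decode the JSON reply."""
    req = urllib.request.Request(
        f"{tracking_uri.rstrip('/')}/api/2.0/mlflow/{endpoint}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Typical sequence (the experiment_id value is an assumption):
#   uri = os.environ["MLFLOW_TRACKING_URI"]
#   run = post_json(uri, "runs/create", {"experiment_id": "0"})
#   run_id = run["run"]["info"]["run_id"]
#   post_json(uri, "runs/log-batch", build_log_batch(run_id, params, metrics))
```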

5. Promote to Production

# Preview what would be copied
avatar-promote --dry-run exports/Silver-Mage.vrm

# Promote to gravenhollow
avatar-promote exports/Silver-Mage.vrm

# Promote all VRM files
avatar-promote exports/*.vrm

After promotion, register the model in companions-frontend by adding it to AllowedAvatarModels in the Go + JS allowlists.
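The promotion step amounts to an `rclone copy` call with an optional `--dry-run`. The sketch below shows how such a command could be assembled. The destination path `gravenhollow:avatars` is a made-up placeholder (only the remote name comes from this README), and `rclone_copy_cmd` is a hypothetical helper, not the code in promote.py.

```python
from pathlib import Path

# Hypothetical destination: the bucket/path after the colon is an assumption.
REMOTE = "gravenhollow:avatars"


def rclone_copy_cmd(vrm_path: str, dry_run: bool = False) -> list[str]:
    """Build the rclone command that copies one VRM file to the remote."""
    if Path(vrm_path).suffix.lower() != ".vrm":
        raise ValueError(f"not a VRM file: {vrm_path}")
    cmd = ["rclone", "copy", vrm_path, REMOTE]
    if dry_run:
        cmd.append("--dry-run")  # report what would be copied without copying
    return cmd


# To execute: subprocess.run(rclone_copy_cmd(path), check=True)
```

Building the command as a list avoids shell quoting problems with avatar names that contain spaces.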

Key Parameters

Parameter	Default	Description
`trellis_seed`	random	Reproducibility seed for TRELLIS generation
`trellis_steps`	12	Sampling steps (sparse structure + SLAT)
`trellis_cfg_strength`	7.5	Classifier-free guidance strength
`trellis_simplify`	0.95	Fraction of triangles removed during mesh simplification, as in TRELLIS's `simplify` option (0.0 = no reduction)
`texture_size`	1024	Output texture resolution

VRAM Budget (RTX 4070 — 12 GB)

Models run sequentially, not concurrently:

Step	VRAM	Time
TRELLIS image-large (1.2B, fp16)	~10 GB	~30s
UniRig skeleton prediction	~4 GB	~30s
UniRig skinning weights	~4 GB	~30s
Blender CLI (VRM export)	CPU only	~10s

Peak: ~10 GB during the TRELLIS step, which leaves about 2 GB of headroom on the 12 GB card. The 64 GB of system RAM absorbs model loading overhead.
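The arithmetic behind the budget can be checked with a few lines of Python. The figures are the approximate values from the table. Model load and unload time between steps is not included.

```python
# (step, VRAM in GB, time in seconds), copied from the table above
STEPS = [
    ("TRELLIS image-large", 10, 30),
    ("UniRig skeleton prediction", 4, 30),
    ("UniRig skinning weights", 4, 30),
    ("Blender CLI (VRM export)", 0, 10),
]
GPU_VRAM_GB = 12

peak_sequential = max(vram for _, vram, _ in STEPS)  # one model resident at a time
peak_concurrent = sum(vram for _, vram, _ in STEPS)  # every model resident at once
total_time_s = sum(seconds for _, _, seconds in STEPS)

print(f"sequential peak: ~{peak_sequential} GB (fits: {peak_sequential <= GPU_VRAM_GB})")
print(f"concurrent peak: ~{peak_concurrent} GB (fits: {peak_concurrent <= GPU_VRAM_GB})")
print(f"end-to-end time: ~{total_time_s} s")
```

With these numbers the sequential peak is about 10 GB, while holding all three models in memory at once would need about 18 GB, which is why the steps are not run concurrently on a 12 GB card. The steps add up to roughly 100 s per avatar.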

Development

# Install in dev mode
uv pip install -e ".[dev]"

# Lint
uv run ruff check .
uv run ruff format --check .

# Auto-fix
uv run ruff check --fix . && uv run ruff format .

ADR-0063 — Design decision
ADR-0011 — KubeRay GPU backend
TRELLIS — Image-to-3D generation (CVPR'25)
UniRig — Automatic rigging (SIGGRAPH'25)
ComfyUI-3D-Pack — 3D nodes for ComfyUI
