# avatar-pipeline

ComfyUI-based image-to-VRM avatar generation pipeline using [TRELLIS](https://github.com/microsoft/TRELLIS) (image → 3D mesh) and [UniRig](https://github.com/VAST-AI-Research/UniRig) (automatic rigging), running on a desktop NVIDIA GPU as an on-demand [Ray](https://www.ray.io/) worker.

See [ADR-0063](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0063-comfyui-3d-avatar-pipeline.md) for the full design decision.

## Overview

```
Reference Image → TRELLIS (1.2B) → Textured GLB → UniRig → Rigged GLB → Blender CLI → VRM
                                                                                        │
                                                                               rclone → gravenhollow
                                                                               MLflow → experiment tracking
```

**Hardware target:** Arch Linux desktop with Ryzen 9 7950X, 64 GB DDR5, NVIDIA RTX 4070 (12 GB VRAM).

## Project Structure

```
avatar-pipeline/
├── pyproject.toml             # uv project, CLI entry points
├── renovate.json              # Renovate dependency updates
├── scripts/
│   ├── setup.sh               # Desktop environment setup (ComfyUI, 3D-Pack, UniRig, Blender, Ray)
│   ├── ray-join.sh            # Join the Ray cluster as external worker
│   └── vrm_export.py          # Blender headless GLB → VRM conversion
├── workflows/
│   └── (ComfyUI workflow JSONs — exported from UI)
├── avatar_pipeline/
│   ├── __init__.py
│   ├── generate.py            # ComfyUI API driver: submit workflow → poll → download
│   ├── log_mlflow.py          # Log params/metrics/artifacts to MLflow REST API
│   └── promote.py             # rclone promote VRM files to gravenhollow storage
├── LICENSE
└── README.md
```

## Setup

### Prerequisites

- NVIDIA drivers + CUDA toolkit (`sudo pacman -S nvidia nvidia-utils cuda cudnn`)
- [uv](https://astral.sh/uv) installed
- [Blender](https://www.blender.org/) 4.x with [VRM Add-on](https://vrm-addon-for-blender.info/en/)
- [rclone](https://rclone.org/) configured with `gravenhollow` remote

### Install Everything

```bash
./scripts/setup.sh
```

This installs ComfyUI, ComfyUI-3D-Pack (includes TRELLIS nodes), UniRig, Ray, and the avatar-pipeline package.

For partial installs:

```bash
./scripts/setup.sh --comfyui-only   # ComfyUI + 3D-Pack only
./scripts/setup.sh --unirig-only    # UniRig only
```

### Join Ray Cluster

```bash
# Set the Ray head address (exposed via NodePort from the Talos cluster)
./scripts/ray-join.sh 192.168.100.50:6379

# Or via env var
RAY_HEAD_ADDRESS=192.168.100.50:6379 ./scripts/ray-join.sh

# Verify
ray status
```

The desktop joins with the custom resources `{"3d_gen": 1, "rtx4070": 1}`, so 3D generation workloads that request them are scheduled here.
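
As an illustration of how a job might target this node, the sketch below requests the `3d_gen` resource when submitting a Ray task. The helper name `run_on_avatar_node` and the `num_gpus=1` request are assumptions for this example and are not part of this repository.

```python
from typing import Any, Callable

# Custom resource advertised by the desktop worker, as described above.
AVATAR_NODE_RESOURCES = {"3d_gen": 1}


def run_on_avatar_node(fn: Callable[..., Any], *args: Any) -> Any:
    """Run fn as a Ray task that can only be placed on a node offering 3d_gen.

    Hypothetical helper; assumes ray.init() has already connected to the cluster.
    """
    import ray  # imported lazily so this module loads without Ray installed

    # ray.remote(...) with keyword options returns a decorator for fn.
    remote_fn = ray.remote(num_gpus=1, resources=AVATAR_NODE_RESOURCES)(fn)
    return ray.get(remote_fn.remote(*args))
```

A task that does not request `3d_gen` is not pinned to this node by this mechanism alone; the resource request only guarantees placement for tasks that ask for it.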

## Usage

### 1. Build a Workflow in ComfyUI

```bash
cd ComfyUI && source .venv/bin/activate
python main.py  # Starts at http://localhost:8188
```

Build a node graph: Load Image → TRELLIS Image-to-3D → Mesh Simplify → UniRig Skeleton → UniRig Skinning → Save GLB.

Export the workflow in API format and save to `workflows/`.

### 2. Generate via CLI

```bash
# Run a saved workflow
avatar-generate \
    --workflow workflows/image-to-vrm.json \
    --image reference.png \
    --seed 42 \
    --output-dir exports/

# Verbose output
avatar-generate -v --workflow workflows/image-to-vrm.json --image photo.png
```
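
The submit and poll steps behind a driver like `generate.py` can be sketched against ComfyUI's HTTP API (`POST /prompt`, `GET /history/<prompt_id>`). This is a minimal sketch and not the actual implementation: the function names are hypothetical, it assumes the seed lives in a node input literally named `seed`, and downloading the resulting files is omitted.

```python
import json
import time
import urllib.request
import uuid

COMFYUI_URL = "http://localhost:8188"  # default ComfyUI address from step 1


def set_seed(workflow: dict, seed: int) -> dict:
    """Return a copy of an API-format workflow with literal `seed` inputs replaced."""
    patched = json.loads(json.dumps(workflow))  # cheap deep copy
    for node in patched.values():
        inputs = node.get("inputs", {})
        # Skip linked inputs (lists such as ["4", 0]); only patch literal ints.
        if isinstance(inputs.get("seed"), int):
            inputs["seed"] = seed
    return patched


def submit(workflow: dict) -> str:
    """POST the workflow to /prompt and return the prompt_id."""
    body = json.dumps({"prompt": workflow, "client_id": uuid.uuid4().hex}).encode()
    req = urllib.request.Request(
        f"{COMFYUI_URL}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["prompt_id"]


def wait_for_outputs(prompt_id: str, poll_s: float = 2.0, timeout_s: float = 600.0) -> dict:
    """Poll /history/<prompt_id> until the run appears, then return its outputs."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        with urllib.request.urlopen(f"{COMFYUI_URL}/history/{prompt_id}") as resp:
            history = json.load(resp)
        if prompt_id in history:
            return history[prompt_id]["outputs"]
        time.sleep(poll_s)
    raise TimeoutError(f"workflow {prompt_id} did not finish within {timeout_s}s")
```

With ComfyUI running, `wait_for_outputs(submit(set_seed(workflow, 42)))` would return the per-node output records for the run.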

### 3. Convert to VRM

```bash
blender --background --python scripts/vrm_export.py -- \
    --input exports/model.glb \
    --output exports/Silver-Mage.vrm \
    --name "Silver Mage"
```
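
Blender passes everything after the bare `--` through to the script untouched, so a script invoked this way has to slice `sys.argv` itself. The sketch below shows one way to do that with `argparse`; the function name and the empty default for `--name` are assumptions, not a description of `vrm_export.py`.

```python
import argparse


def parse_blender_args(argv: list[str]) -> argparse.Namespace:
    """Parse only the arguments that follow Blender's `--` separator."""
    # Everything before `--` belongs to Blender itself and is ignored here.
    script_args = argv[argv.index("--") + 1:] if "--" in argv else []
    parser = argparse.ArgumentParser(prog="vrm_export.py")
    parser.add_argument("--input", required=True, help="rigged GLB to convert")
    parser.add_argument("--output", required=True, help="destination .vrm path")
    parser.add_argument("--name", default="", help="avatar display name")
    return parser.parse_args(script_args)
```

Inside Blender the script would call `parse_blender_args(sys.argv)` before importing the GLB and exporting the VRM.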

### 4. Log to MLflow

Generation parameters and metrics are logged to the cluster's MLflow instance. Set the tracking URI:

```bash
export MLFLOW_TRACKING_URI=http://mlflow.lab.daviestechlabs.io:5000
```

Logging is integrated into the generate workflow, or can be called directly:

```python
from avatar_pipeline.log_mlflow import log_generation

log_generation(
    avatar_name="Silver-Mage",
    params={"trellis_seed": 42, "trellis_simplify": 0.95, "texture_size": 1024},
    metrics={"vertex_count": 12345, "face_count": 8000, "duration_s": 92.5},
)
```
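
For reference, params and metrics like those above map onto the request body of MLflow's `POST /api/2.0/mlflow/runs/log-batch` endpoint. The sketch below only builds that body; `build_log_batch` is a hypothetical name, the run is assumed to have been created already, and this is not necessarily how `log_mlflow.py` does it.

```python
import time
from typing import Optional


def build_log_batch(
    run_id: str,
    params: dict,
    metrics: dict,
    timestamp_ms: Optional[int] = None,
) -> dict:
    """Build the JSON body for POST /api/2.0/mlflow/runs/log-batch."""
    ts = int(time.time() * 1000) if timestamp_ms is None else timestamp_ms
    return {
        "run_id": run_id,
        # MLflow stores param values as strings.
        "params": [{"key": k, "value": str(v)} for k, v in params.items()],
        # Metrics are floats with a millisecond timestamp and a step index.
        "metrics": [
            {"key": k, "value": float(v), "timestamp": ts, "step": 0}
            for k, v in metrics.items()
        ],
    }
```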

### 5. Promote to Production

```bash
# Preview what would be copied
avatar-promote --dry-run exports/Silver-Mage.vrm

# Promote to gravenhollow
avatar-promote exports/Silver-Mage.vrm

# Promote all VRM files
avatar-promote exports/*.vrm
```

After promotion, register the model in companions-frontend by adding it to `AllowedAvatarModels` in both the Go and JS allowlists.
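
The promote commands above boil down to an `rclone copy` per file, with `--dry-run` for the preview. A minimal sketch of building that command follows; the destination `gravenhollow:avatars` and the function name are assumptions, since the real remote path used by `promote.py` is not documented here.

```python
from pathlib import PurePath

REMOTE_DEST = "gravenhollow:avatars"  # hypothetical destination on the remote


def build_rclone_cmd(vrm_path: str, dry_run: bool = False) -> list[str]:
    """Return the rclone argv that copies one VRM file to the remote."""
    if PurePath(vrm_path).suffix.lower() != ".vrm":
        raise ValueError(f"not a VRM file: {vrm_path}")
    cmd = ["rclone", "copy", vrm_path, REMOTE_DEST]
    if dry_run:
        cmd.append("--dry-run")  # rclone prints what it would copy and exits
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`.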

## Key Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `trellis_seed` | random | Reproducibility seed for TRELLIS generation |
| `trellis_steps` | 12 | Sampling steps (sparse structure + SLAT) |
| `trellis_cfg_strength` | 7.5 | Classifier-free guidance strength |
| `trellis_simplify` | 0.95 | Triangle reduction ratio (1.0 = no reduction) |
| `texture_size` | 1024 | Output texture resolution |
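
The defaults in the table can be captured in a small dataclass, which also gives a ready-made `params` dict for logging. This is an illustrative sketch; `GenerationParams` is a hypothetical name and the package may represent these settings differently.

```python
import random
from dataclasses import asdict, dataclass, field


@dataclass
class GenerationParams:
    """Generation settings with the defaults listed in the table above."""

    # A random seed is drawn per instance unless one is supplied.
    trellis_seed: int = field(default_factory=lambda: random.randrange(2**32))
    trellis_steps: int = 12
    trellis_cfg_strength: float = 7.5
    trellis_simplify: float = 0.95
    texture_size: int = 1024


# Pin the seed for a reproducible run; everything else keeps its default.
params = asdict(GenerationParams(trellis_seed=42))
```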

## VRAM Budget (RTX 4070 — 12 GB)

Models run sequentially, not concurrently:

| Step | VRAM | Time |
|------|------|------|
| TRELLIS image-large (1.2B, fp16) | ~10 GB | ~30s |
| UniRig skeleton prediction | ~4 GB | ~30s |
| UniRig skinning weights | ~4 GB | ~30s |
| Blender CLI (VRM export) | CPU only | ~10s |

Peak usage is ~10 GB during TRELLIS, leaving ~2 GB of headroom on the 12 GB card. The 64 GB of system RAM absorbs model loading overhead.
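
The arithmetic behind the budget is simple enough to check in a few lines. The figures below are the approximate values from the table, and the variable names are illustrative only.

```python
# (step, approx. VRAM in GB, approx. time in seconds), taken from the table above
STEPS = [
    ("TRELLIS image-large", 10, 30),
    ("UniRig skeleton", 4, 30),
    ("UniRig skinning", 4, 30),
    ("Blender VRM export", 0, 10),  # CPU only
]
VRAM_TOTAL_GB = 12  # RTX 4070

# Sequential execution: peak VRAM is the largest single step, not the sum.
peak_gb = max(vram for _, vram, _ in STEPS)
headroom_gb = VRAM_TOTAL_GB - peak_gb
concurrent_gb = sum(vram for _, vram, _ in STEPS)
total_s = sum(seconds for _, _, seconds in STEPS)

print(f"peak {peak_gb} GB, headroom {headroom_gb} GB, end-to-end ~{total_s}s")
print(f"running all steps at once would need ~{concurrent_gb} GB")
```

Running the steps one after another keeps the peak at ~10 GB, whereas loading all models at once would need ~18 GB and would not fit on the card.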

## Development

```bash
# Install in dev mode
uv pip install -e ".[dev]"

# Lint
uv run ruff check .
uv run ruff format --check .

# Auto-fix
uv run ruff check --fix . && uv run ruff format .
```

## Related

- [ADR-0063](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0063-comfyui-3d-avatar-pipeline.md) — Design decision
- [ADR-0011](https://git.daviestechlabs.io/daviestechlabs/homelab-design/src/branch/main/decisions/0011-kuberay-unified-gpu-backend.md) — KubeRay GPU backend
- [TRELLIS](https://github.com/microsoft/TRELLIS) — Image-to-3D generation (CVPR'25)
- [UniRig](https://github.com/VAST-AI-Research/UniRig) — Automatic rigging (SIGGRAPH'25)
- [ComfyUI-3D-Pack](https://github.com/MrForExample/ComfyUI-3D-Pack) — 3D nodes for ComfyUI