From 4affddf9b4b0a102d843da8a7e115ad6a71dcba6 Mon Sep 17 00:00:00 2001
From: "Billy D."
Date: Tue, 24 Feb 2026 05:34:24 -0500
Subject: [PATCH] replacing BlenderMCP with reproducible flow

---
 decisions/0059-mac-mini-ray-worker.md         |  13 +-
 .../0062-blender-mcp-3d-avatar-workflow.md    |   2 +-
 decisions/0063-comfyui-3d-avatar-pipeline.md  | 483 ++++++++++++++++++
 3 files changed, 491 insertions(+), 7 deletions(-)
 create mode 100644 decisions/0063-comfyui-3d-avatar-pipeline.md

diff --git a/decisions/0059-mac-mini-ray-worker.md b/decisions/0059-mac-mini-ray-worker.md
index ead643b..e1d8910 100644
--- a/decisions/0059-mac-mini-ray-worker.md
+++ b/decisions/0059-mac-mini-ray-worker.md
@@ -12,9 +12,9 @@
 - All Ray inference slots are already allocated and stable — adding a 5th GPU class (MPS) increases complexity without filling a gap
 - vLLM's MPS backend remains experimental — not production-ready for serving
-- The real unmet need is **3D avatar creation** for companions-frontend ([ADR-0062](0062-blender-mcp-3d-avatar-workflow.md))
+- The real unmet need is **3D avatar creation** for companions-frontend ([ADR-0063](0063-comfyui-3d-avatar-pipeline.md))
 
-[ADR-0062](0062-blender-mcp-3d-avatar-workflow.md) describes using BlenderMCP in a Kasm Blender workstation for AI-assisted avatar creation. While Kasm works, it runs Blender inside a DinD container with **no GPU acceleration** — rendering and viewport interaction are CPU-only, which is painfully slow for sculpting, material preview, and VRM export iteration.
+[ADR-0063](0063-comfyui-3d-avatar-pipeline.md) describes an automated ComfyUI + TRELLIS + UniRig pipeline for image-to-VRM avatar generation, running on a personal desktop as an on-demand Ray worker. This supersedes the manual BlenderMCP Kasm workflow from [ADR-0062](0062-blender-mcp-3d-avatar-workflow.md). waterdeep retains its role as an interactive Blender workstation for manual refinement of auto-generated models.
 waterdeep's M4 Pro has a 16-core GPU with hardware-accelerated Metal rendering and 48 GB of unified memory shared between CPU and GPU. Running Blender natively on waterdeep with BlenderMCP gives a dramatically better 3D creation experience than Kasm.
 
@@ -26,9 +26,9 @@ How should we use waterdeep to maximise the 3D avatar creation pipeline for comp
 * waterdeep has a 16-core Apple GPU with Metal support — Blender's Metal backend enables real-time viewport rendering, Cycles GPU rendering, and smooth sculpting
 * 48 GB unified memory means Blender, VS Code, and the MCP server can all run simultaneously without swapping
 * VS Code with Copilot agent mode and BlenderMCP server are installed on waterdeep — VS Code drives Blender via localhost:9876 with zero-latency socket communication
-* Exported VRM models must reach gravenhollow for production serving ([ADR-0062](0062-blender-mcp-3d-avatar-workflow.md))
+* Exported VRM models must reach gravenhollow for production serving ([ADR-0063](0063-comfyui-3d-avatar-pipeline.md))
 * **rclone** chosen for asset promotion to gravenhollow's RustFS S3 endpoint — simpler than NFS mounts on macOS, consistent with existing Kasm rclone patterns, and avoids autofs/NFS fstab complexity
-* The Kasm Blender workflow from ADR-0062 remains available as a fallback (browser-based, no local install required)
+* The automated ComfyUI pipeline from [ADR-0063](0063-comfyui-3d-avatar-pipeline.md) handles most avatar generation; waterdeep serves as the manual refinement station
 * ray cluster GPU fleet is fully allocated and stable — adding MPS complexity is not justified
 
 ## Considered Options
 
@@ -204,7 +204,7 @@ uvx blender-mcp --help
 
 ### 4. rclone for Asset Promotion
 
-Use rclone to promote finished VRM exports to gravenhollow's RustFS S3 endpoint. This is consistent with the Kasm rclone volume plugin pattern from [ADR-0062](0062-blender-mcp-3d-avatar-workflow.md) and avoids macOS NFS/autofs complexity.
+Use rclone to promote finished VRM exports to gravenhollow's RustFS S3 endpoint. This is consistent with the promotion pattern from [ADR-0063](0063-comfyui-3d-avatar-pipeline.md) and avoids macOS NFS/autofs complexity.
 
 ```bash
 # Install rclone
@@ -275,7 +275,8 @@ rclone sync ~/blender-avatars/exports/ gravenhollow:avatar-models/ --exclude "*.
 
 ## Links
 
-* Related: [ADR-0062](0062-blender-mcp-3d-avatar-workflow.md) — BlenderMCP 3D avatar workflow (Kasm + deployment architecture)
+* Related: [ADR-0063](0063-comfyui-3d-avatar-pipeline.md) — ComfyUI image-to-3D avatar pipeline (supersedes ADR-0062)
+* Related: [ADR-0062](0062-blender-mcp-3d-avatar-workflow.md) — BlenderMCP 3D avatar workflow (superseded)
 * Related: [ADR-0046](0046-companions-frontend-architecture.md) — Companions frontend architecture (Three.js + VRM avatars)
 * Related: [ADR-0026](0026-storage-strategy.md) — Storage strategy (gravenhollow NFS-fast)
 * Related: [ADR-0037](0037-node-naming-conventions.md) — Node naming conventions (waterdeep)
diff --git a/decisions/0062-blender-mcp-3d-avatar-workflow.md b/decisions/0062-blender-mcp-3d-avatar-workflow.md
index 954b7bc..e68045e 100644
--- a/decisions/0062-blender-mcp-3d-avatar-workflow.md
+++ b/decisions/0062-blender-mcp-3d-avatar-workflow.md
@@ -1,6 +1,6 @@
 # BlenderMCP for 3D Avatar Creation via Kasm Workstation
 
-* Status: proposed
+* Status: superseded by [ADR-0063](0063-comfyui-3d-avatar-pipeline.md)
 * Date: 2026-02-21
 * Deciders: Billy
 * Technical Story: Enable AI-assisted 3D avatar creation for companions-frontend using BlenderMCP in a Kasm Blender workstation with VS Code, storing assets in S3, serving locally from gravenhollow NFS and remotely via Cloudflare-cached RustFS
diff --git a/decisions/0063-comfyui-3d-avatar-pipeline.md b/decisions/0063-comfyui-3d-avatar-pipeline.md
new file mode 100644
index 0000000..c30b129
--- /dev/null
+++ b/decisions/0063-comfyui-3d-avatar-pipeline.md
@@ -0,0 +1,483 @@
+# ComfyUI Image-to-3D Avatar Pipeline with TRELLIS + UniRig
+
+* Status: proposed
+* Date: 2026-02-24
+* Deciders: Billy
+* Technical Story: Replace the manual BlenderMCP 3D avatar creation workflow with an automated, GPU-accelerated image-to-rigged-3D-model pipeline using ComfyUI, TRELLIS 2-4B, and UniRig — running on a personal desktop (NVIDIA RTX 4070) as an on-demand Ray worker, with direct MLflow logging and rclone asset promotion
+
+## Context and Problem Statement
+
+The companions-frontend serves VRM avatar models for Three.js-based 3D character rendering ([ADR-0046](0046-companions-frontend-architecture.md)). The previous approach ([ADR-0062](0062-blender-mcp-3d-avatar-workflow.md)) proposed using BlenderMCP in a Kasm workstation or on waterdeep ([ADR-0059](0059-mac-mini-ray-worker.md)) for AI-assisted avatar creation. While BlenderMCP bridges VS Code to Blender, the workflow is fundamentally **interactive and manual** — an operator must prompt the AI, review each sculpting step, and hand-tune rigging and VRM export. This is slow, non-reproducible, and doesn't scale.
+
+Meanwhile, the state of the art in image-to-3D generation has matured significantly:
+
+- **TRELLIS** (Microsoft, CVPR'25 Spotlight, 12k+ GitHub stars) generates high-quality textured 3D meshes from a single image in seconds using Structured 3D Latents (SLAT) — with models up to 2B parameters
+- **UniRig** (Tsinghua/Tripo, SIGGRAPH'25, 1.4k+ GitHub stars) automatically generates topologically valid skeletons and skinning weights for arbitrary 3D models using autoregressive transformers — the first model to rig humans, animals, and objects with a single unified framework
+- **ComfyUI-3D-Pack** (3.7k+ GitHub stars) provides battle-tested ComfyUI nodes for TRELLIS, 3D Gaussian Splatting, mesh processing, and GLB/VRM export — enabling node-graph-based automation without custom code
+
+Together, these tools enable a fully automated **image → 3D mesh → rigged model → VRM** pipeline that eliminates manual Blender work for the common case, produces reproducible results, and integrates with the existing MLflow + Ray stack.
+
+A personal desktop (Ryzen 9 7950X, 64 GB DDR5, NVIDIA RTX 4070 12 GB VRAM) running Arch Linux is available as an **on-demand external Ray worker** — it won't be a permanent cluster member (it's not running Talos), but can join the Ray cluster via `ray start` when 3D generation workloads need to run. This adds a 5th GPU to the fleet specifically for 3D generation, without disrupting the stable inference allocations.
+
+How do we build an automated, reproducible image-to-VRM pipeline that leverages the desktop's CUDA GPU and integrates with the existing AI/ML platform for experiment tracking and asset serving?
+
+## Decision Drivers
+
+* BlenderMCP workflow from ADR-0062 is interactive and non-reproducible — every avatar requires an operator in the loop
+* TRELLIS generates production-quality textured meshes from a single reference image in ~30 seconds on a 12 GB GPU
+* UniRig automatically rigs arbitrary 3D models with skeleton + skinning weights — no manual weight painting
+* ComfyUI-3D-Pack bundles TRELLIS, mesh processing, and GLB export as composable nodes — enabling visual pipeline authoring
+* The desktop's RTX 4070 (12 GB VRAM) sits below TRELLIS's recommended 16 GB but is workable with fp16/attention optimizations, and exceeds UniRig's 8 GB requirement
+* The desktop can join/leave the Ray cluster on demand — no permanent infrastructure commitment
+* MLflow tracks generation parameters, quality metrics, and output artifacts for reproducibility — the desktop logs directly to the cluster's MLflow service over HTTP
+* waterdeep (Mac Mini M4 Pro) remains available for interactive Blender touch-up on models that need manual refinement
+* VRM export, asset promotion to gravenhollow, and serving architecture from ADR-0062 remain valid and are reused
+
+## Considered Options
+
+1. **ComfyUI + TRELLIS + UniRig on desktop Ray worker, with direct MLflow logging and rclone promotion**
+2. **BlenderMCP interactive workflow** (ADR-0062, superseded)
+3. **Cloud-hosted 3D generation (Hyper3D Rodin, Meshy, etc.)**
+4. **Run TRELLIS + UniRig directly as Ray Serve deployments in-cluster**
+
+## Decision Outcome
+
+Chosen option: **Option 1 — ComfyUI + TRELLIS + UniRig on desktop Ray worker**, because it automates the entire image-to-rigged-model pipeline without operator interaction, leverages purpose-built state-of-the-art models (TRELLIS for generation, UniRig for rigging), and uses the desktop's RTX 4070 as on-demand GPU capacity without disrupting the stable inference cluster.
ComfyUI's visual node graph provides the pipeline orchestration directly on the desktop — no Kubernetes-side orchestrator needed since all compute is local to one machine.
+
+waterdeep retains its role as an interactive Blender workstation for manual refinement of auto-generated models when needed — but the expectation is that most avatars pass through the automated pipeline without manual touch-up.
+
+### Positive Consequences
+
+* **Fully automated pipeline** — image → textured mesh → rigged model → VRM with no operator in the loop
+* **Reproducible** — same image + seed produces identical output; parameters tracked in MLflow
+* **Fast** — TRELLIS generates a mesh in ~30s, UniRig rigs it in ~60s; end-to-end under 5 minutes including VRM export
+* **On-demand GPU** — desktop joins Ray cluster only when needed; no standing resource cost
+* **Composable** — ComfyUI node graph can be extended with additional 3D processing nodes (Hunyuan3D, TripoSG, Stable3DGen) without code changes
+* **Quality** — TRELLIS (CVPR'25) and UniRig (SIGGRAPH'25) represent current state of the art
+* **MLflow integration** — generation parameters, mesh quality metrics, and output artifacts are logged directly to the cluster's MLflow service over HTTP
+* **Simple orchestration** — ComfyUI node graph handles the pipeline; no Kubernetes-side orchestrator needed for a single-GPU linear workflow
+* **Reuses existing serving architecture** — gravenhollow NFS + RustFS CDN serving from ADR-0062 is unchanged
+* **waterdeep fallback** — interactive Blender + BlenderMCP on waterdeep for models needing hand-tuning
+
+### Negative Consequences
+
+* Desktop must be powered on and `ray start` must be run manually to participate in the pipeline
+* TRELLIS requires NVIDIA CUDA — cannot run on the existing AMD/Intel GPU fleet (khelben, drizzt, danilo)
+* ComfyUI adds a Python dependency stack (PyTorch, CUDA, spconv, flash-attn) to maintain on the desktop
+* RTX 4070 has 12 GB VRAM — large TRELLIS models (2B
params) may require fp16 + attention optimization; the 1.2B image-to-3D model fits comfortably
+* Auto-generated VRM models may still need manual expression/viseme morph targets for full companions-frontend lip-sync support
+* Desktop is not managed by GitOps/Kubernetes — Ansible or manual setup
+
+## Pros and Cons of the Options
+
+### Option 1 — ComfyUI + TRELLIS + UniRig on desktop Ray worker
+
+* Good, because fully automated image-to-VRM pipeline eliminates manual sculpting
+* Good, because TRELLIS (CVPR'25) and UniRig (SIGGRAPH'25) are state-of-the-art, MIT-licensed
+* Good, because ComfyUI-3D-Pack provides tested node implementations — no custom TRELLIS integration code
+* Good, because desktop GPU is free/idle capacity with no cluster impact
+* Good, because MLflow integration reuses existing experiment tracking infrastructure
+* Good, because ComfyUI can queue and batch-generate multiple avatars unattended
+* Bad, because desktop availability is not guaranteed (must be manually started)
+* Bad, because CUDA-only — doesn't leverage the existing ROCm/Intel fleet
+* Bad, because auto-rigging quality varies by model topology — some models may need manual refinement
+
+### Option 2 — BlenderMCP interactive workflow (ADR-0062)
+
+* Good, because maximum creative control via VS Code + Copilot
+* Good, because Kasm provides browser-based access from anywhere
+* Bad, because every avatar requires an operator in the loop — slow and non-reproducible
+* Bad, because Blender sculpting from scratch is time-intensive even with AI assistance
+* Bad, because Kasm runs Blender CPU-only (no GPU acceleration inside DinD)
+* Bad, because no MLflow tracking or reproducibility
+
+### Option 3 — Cloud-hosted 3D generation
+
+* Good, because no local GPU required
+* Good, because some services (Meshy, Hyper3D Rodin) offer API access
+* Bad, because vendor dependency for a core asset pipeline
+* Bad, because free tiers have daily limits; paid tiers add recurring cost
+* Bad,
because limited control over output quality, rigging, and VRM compliance
+* Bad, because data leaves the homelab network
+
+### Option 4 — TRELLIS + UniRig as in-cluster Ray Serve deployments
+
+* Good, because fully integrated with existing Ray cluster
+* Good, because no desktop dependency
+* Bad, because TRELLIS requires NVIDIA CUDA — no CUDA GPUs in-cluster have enough VRAM (elminster has 8 GB, needs 12–16 GB)
+* Bad, because would require purchasing new in-cluster NVIDIA hardware
+* Bad, because 3D generation is batch/occasional, not real-time serving — Ray Serve's always-on model is wasteful
+* Bad, because TRELLIS's CUDA dependencies (spconv, flash-attn, nvdiffrast, kaolin) conflict with existing Ray worker images
+
+## Architecture
+
+```
+3d_avatar_generation flow — orchestrated by the ComfyUI node graph on the
+desktop worker (Kubeflow orchestration remains a future option, see
+Future Considerations)
+
+  1. prepare_reference    Load/generate reference image from prompt
+     │                    (optional: use vLLM + Stable Diffusion)
+     ▼
+  2. generate_3d_mesh     ComfyUI: TRELLIS image-large (1.2B) → GLB mesh
+     │
+     ▼
+  3. auto_rig             UniRig skeleton + skinning → rigged FBX/GLB
+     │
+     ▼
+  4. convert_to_vrm       Blender CLI (headless) on desktop or cluster
+     │                    Import rigged GLB → configure VRM metadata
+     │                    → export .vrm
+     ▼
+  5. validate_vrm         Check humanoid rig, expressions, visemes
+     │
+     ▼
+  6. promote_to_storage   rclone copy → gravenhollow RustFS S3
+     │
+     ▼
+  7. log_to_mlflow        Parameters, metrics, artifacts → MLflow
+
+                │ executes on the desktop worker
+                ▼
+desktop (Arch Linux · Ryzen 9 7950X · 64 GB DDR5 · RTX 4070 12 GB)
+On-demand Ray worker (ray start --address=<ray-head-ip>:6379)
+
+  ComfyUI + custom nodes
+    ComfyUI-3D-Pack:
+      • TRELLIS image-large (1.2B) — image → textured GLB mesh
+      • Mesh processing nodes — simplify, UV unwrap, texture bake
+      • 3D preview — viewport render for quality check
+      • GLB/OBJ/PLY export
+    UniRig:
+      • Skeleton prediction — autoregressive bone hierarchy
+      • Skinning weights — bone-point cross-attention
+      • Merge — skeleton + skin + original mesh → rigged model
+      • Supports GLB, FBX, OBJ input/output
+    Blender 4.x (headless CLI):
+      • VRM Add-on for Blender — GLB → VRM conversion
+      • Humanoid rig mapping, expression morphs, viseme config
+      • Batch export via bpy scripting
+
+  GPU:     NVIDIA RTX 4070 12 GB (CUDA 12.x)
+  Ray:     worker node with resource labels {"3d_gen": 1, "rtx4070": 1}
+  Storage: ~/comfyui-3d/ (working dir), rclone → gravenhollow S3
+
+                │ rclone (S3)
+                ▼
+gravenhollow.lab.daviestechlabs.io
+(TrueNAS Scale · All-SSD · Dual 10GbE · 12.2 TB)
+
+  NFS: /mnt/gravenhollow/kubernetes/avatar-models/
+    ├── Seed-san.vrm          (default model)
+    ├── Generated-A-v1.vrm    (auto-generated via pipeline)
+    └── animations/           (shared animation clips)
+
+  S3 (RustFS): avatar-models bucket
+  (same data, served via Cloudflare Tunnel for remote users)
+
+          │                            │
+   NFS (nfs-fast PVC)          Cloudflare Tunnel
+          │                    (assets.daviestechlabs.io)
+          ▼                            ▼
+  companions-frontend          Remote users (CDN-cached
+  (Kubernetes pod, LAN users)  via Cloudflare edge)
+```
+
+### Ray Cluster Integration
+
+The desktop joins the existing KubeRay-managed cluster as an external worker. It is **not** a Talos node and not managed by Kubernetes — it connects to the Ray head node's GCS port directly:
+
+```
+Ray Cluster (KubeRay RayService)
+
+  Head: Ray head pod (in-cluster)
+        GCS port: 6379 (exposed via NodePort or LoadBalancer)
+
+  In-Cluster Workers (permanent, managed by KubeRay):
+    khelben     Strix Halo     ROCm     /llm
+    elminster   RTX 2070       CUDA     /whisper /tts
+    drizzt      Radeon 680     ROCm     /embeddings
+    danilo      Intel Arc      Intel    /reranker
+
+  External Worker (on-demand, self-managed):
+    desktop     RTX 4070 12 GB · CUDA
+                ComfyUI + TRELLIS + UniRig + Blender CLI
+                Resource labels: {"3d_gen": 1, "rtx4070": 1}
+                Joins via: ray start --address=<ray-head-ip>:6379
+```
+
+The existing inference deployments (`/llm`,
`/whisper`, `/tts`, `/embeddings`, `/reranker`) are unaffected — they are pinned to their respective in-cluster GPU nodes via Ray resource labels. The desktop's `3d_gen` resource label ensures only 3D generation RayJobs get scheduled there.
+
+### Ray Service Multiplexing
+
+The desktop's RTX 4070 can **time-share between inference overflow and 3D generation** when idle. When no 3D generation jobs are queued, the desktop can optionally serve as overflow capacity for inference workloads:
+
+| Mode | When | What runs on desktop |
+|------|------|----------------------|
+| **3D generation** | ComfyUI workflow triggered (manually or via API) | ComfyUI + TRELLIS → UniRig → Blender VRM export |
+| **Inference overflow** | Manually enabled, high-traffic periods | vLLM (secondary), Whisper, or TTS replica |
+| **Idle** | Desktop powered on, no jobs | Ray worker connected but idle (0 resource cost) |
+
+Mode switching is managed by Ray's resource scheduling — 3D jobs request `{"3d_gen": 1}` and inference jobs request their specific GPU labels. When the desktop is off, all workloads continue on the existing in-cluster fleet with no impact.
+
+## Implementation Plan
+
+### 1. Desktop Environment Setup
+
+```bash
+# Install NVIDIA drivers + CUDA toolkit (Arch Linux)
+sudo pacman -S nvidia nvidia-utils cuda cudnn
+
+# Install Python environment (uv per ADR-0012)
+curl -LsSf https://astral.sh/uv/install.sh | sh
+
+# Create project directory
+mkdir -p ~/comfyui-3d && cd ~/comfyui-3d
+
+# Install ComfyUI
+git clone https://github.com/comfyanonymous/ComfyUI.git
+cd ComfyUI
+uv venv --python 3.11
+source .venv/bin/activate
+uv pip install -r requirements.txt
+
+# Install ComfyUI-3D-Pack (includes TRELLIS nodes)
+cd custom_nodes
+git clone https://github.com/MrForExample/ComfyUI-3D-Pack.git
+cd ComfyUI-3D-Pack
+uv pip install -r requirements.txt
+python install.py
+
+# Install UniRig
+cd ~/comfyui-3d
+git clone https://github.com/VAST-AI-Research/UniRig.git
+cd UniRig
+uv pip install torch torchvision
+uv pip install -r requirements.txt
+uv pip install spconv-cu124  # Match CUDA version
+uv pip install flash-attn --no-build-isolation
+
+# Install Blender (headless CLI for VRM export)
+sudo pacman -S blender
+# Install the VRM add-on into Blender itself (bpy only exists inside Blender)
+blender --background --python-expr "import bpy, os; bpy.ops.preferences.addon_install(filepath=os.path.abspath('UniRig/blender/add-on-vrm-v2.20.77_modified.zip')); bpy.ops.wm.save_userpref()"
+
+# Install rclone for asset promotion
+sudo pacman -S rclone
+rclone config create gravenhollow s3 \
+  provider=Other \
+  endpoint=https://gravenhollow.lab.daviestechlabs.io:30292 \
+  access_key_id=<access-key> \
+  secret_access_key=<secret-key>
+
+# Install Ray for cluster joining
+uv pip install "ray[default]"
+```
+
+### 2. Ray Worker Configuration
+
+```bash
+# Join the Ray cluster on demand
+# Ray head GCS port must be exposed (NodePort 30637 or similar)
+ray start \
+  --address=<ray-head-ip>:6379 \
+  --num-cpus=16 \
+  --num-gpus=1 \
+  --resources='{"3d_gen": 1, "rtx4070": 1}' \
+  --node-name=desktop
+
+# Verify connection
+ray status  # Should show desktop as a connected worker
+```
+
+The Ray head's GCS port needs to be reachable from the desktop.
Options:
+- **NodePort**: Expose port 6379 as a NodePort (e.g., 30637) on a cluster node
+- **Tailscale/WireGuard**: If the desktop is on a different network segment
+- **Direct LAN**: If desktop and cluster are on the same 192.168.100.0/24 subnet
+
+### 3. ComfyUI Workflow (Node Graph)
+
+The ComfyUI workflow JSON defines the image-to-GLB pipeline:
+
+```
+[Load Image] → [TRELLIS Image-to-3D] → [Mesh Simplify] → [Texture Bake]
+                                                              │
+                                                              ▼
+                                                         [Save GLB]
+                                                              │
+                                                              ▼
+                                             [UniRig Skeleton Prediction]
+                                                              │
+                                                              ▼
+                                              [UniRig Skinning Weights]
+                                                              │
+                                                              ▼
+                                            [UniRig Merge (rigged model)]
+                                                              │
+                                                              ▼
+                                             [Blender VRM Export (CLI)]
+                                                              │
+                                                              ▼
+                                      [Save VRM → ~/comfyui-3d/exports/]
+```
+
+Key TRELLIS parameters exposed:
+- `sparse_structure_sampler_params.steps`: 12 (default)
+- `sparse_structure_sampler_params.cfg_strength`: 7.5
+- `slat_sampler_params.steps`: 12
+- `slat_sampler_params.cfg_strength`: 3.0
+- `simplify`: 0.95 (triangle reduction ratio)
+- `texture_size`: 1024
+
+### 4. MLflow Experiment Tracking
+
+The desktop logs directly to the cluster's MLflow service over HTTP. Set `MLFLOW_TRACKING_URI` in the ComfyUI environment or in a post-generation logging script:
+
+```bash
+export MLFLOW_TRACKING_URI=http://<mlflow-host>:5000
+```
+
+Each generation run logs to a dedicated MLflow experiment:
+
+| What | MLflow Concept | Content |
+|------|----------------|---------|
+| Reference image | Artifact | `reference.png` |
+| TRELLIS parameters | Params | seed, cfg_strength, steps, simplify, texture_size |
+| UniRig parameters | Params | skeleton_seed |
+| Raw mesh | Artifact | `{name}_raw.glb` (pre-rigging) |
+| Rigged model | Artifact | `{name}_rigged.glb` (post-rigging) |
+| Final VRM | Artifact | `{name}.vrm` |
+| Mesh quality | Metrics | vertex_count, face_count, texture_resolution |
+| Rig quality | Metrics | bone_count, skinning_weight_coverage |
+| Pipeline duration | Metrics | trellis_time_s, unirig_time_s, total_time_s |
+
+### 5. VRM Export Script (Blender CLI)
+
+```python
+#!/usr/bin/env python3
+"""vrm_export.py — Headless Blender script for GLB→VRM conversion."""
+import bpy
+import sys
+
+argv = sys.argv[sys.argv.index("--") + 1:]
+input_glb = argv[0]
+output_vrm = argv[1]
+avatar_name = argv[2] if len(argv) > 2 else "Generated Avatar"
+
+# Clear scene
+bpy.ops.wm.read_factory_settings(use_empty=True)
+
+# Import rigged GLB
+bpy.ops.import_scene.gltf(filepath=input_glb)
+
+# Select armature
+armature = next(obj for obj in bpy.data.objects if obj.type == 'ARMATURE')
+bpy.context.view_layer.objects.active = armature
+
+# Configure VRM metadata
+armature["vrm_addon_extension"] = {
+    "spec_version": "1.0",
+    "vrm0": {
+        "meta": {
+            "title": avatar_name,
+            "author": "DaviesTechLabs Pipeline",
+            "allowedUserName": "Everyone",
+        }
+    }
+}
+
+# Export VRM (operator registered by the VRM Add-on for Blender)
+bpy.ops.export_scene.vrm(filepath=output_vrm)
+print(f"Exported VRM: {output_vrm}")
+```
+
+Invoked via:
+```bash
+blender --background --python vrm_export.py -- input.glb output.vrm "Avatar Name"
+```
+
+### 6. Asset Promotion (Reuses ADR-0062 Architecture)
+
+The VRM serving architecture from ADR-0062 is preserved unchanged:
+
+| Stage | Action |
+|-------|--------|
+| **Generate** | Automated pipeline: image → TRELLIS → UniRig → VRM |
+| **Promote** | `rclone copy ~/comfyui-3d/exports/{name}.vrm gravenhollow:avatar-models/` |
+| **Register** | Add model path to `AllowedAvatarModels` in companions-frontend Go + JS allowlists |
+| **Deploy** | Flux rolls out config; model already on NFS PVC — no image rebuild |
+| **CDN** | Cloudflare Tunnel → RustFS → CDN cache at 300+ edge PoPs |
+
+## Model Requirements and VRAM Budget
+
+| Component | Model Size | VRAM Required | Notes |
+|-----------|------------|---------------|-------|
+| TRELLIS image-large | 1.2B params | ~10 GB (fp16) | Image-to-3D, best quality |
+| TRELLIS text-xlarge | 2.0B params | ~14 GB (fp16) | Text-to-3D, optional |
+| UniRig skeleton | ~350M params | ~4 GB | Autoregressive skeleton prediction |
+| UniRig skinning | ~350M params | ~4 GB | Bone-point cross-attention |
+| Blender CLI | N/A | CPU only | Headless VRM export |
+
+**RTX 4070 budget (12 GB):** Models are loaded sequentially (not concurrently) — TRELLIS runs first, output is saved to disk, then UniRig loads for rigging. Peak VRAM usage is ~10 GB during TRELLIS inference. The desktop's 64 GB system RAM provides ample buffer for model loading and mesh processing.
+
+## Security Considerations
+
+* **Ray GCS port exposure**: The Ray head's port 6379 must be reachable from the desktop. Use a NodePort with network policy restricting source IPs to the desktop's address, or use a WireGuard/Tailscale tunnel.
+* **No cluster credentials on desktop**: The desktop runs Ray worker processes and ComfyUI only — it has no `kubeconfig` or Kubernetes API access. Generation is triggered locally via ComfyUI's UI or API, not from the cluster.
+* **Model provenance**: TRELLIS and UniRig checkpoints are downloaded from Hugging Face (Microsoft and VAST-AI orgs respectively). Pin checkpoint hashes in the setup script.
+* **ComfyUI network**: ComfyUI's web UI (port 8188) should be bound to localhost only when not in use. It is not exposed to the cluster.
+* **rclone credentials**: gravenhollow RustFS write credentials stored in `~/.config/rclone/rclone.conf` with `600` permissions.
+* **Generated content**: Auto-generated 3D models inherit no licensing restrictions (TRELLIS and UniRig are both MIT-licensed).
+
+## Future Considerations
+
+* **Kubeflow pipeline for model refinement**: When iterating on existing models (re-rigging, parameter sweeps, A/B testing generation backends), a Kubeflow pipeline can orchestrate multi-step refinement workflows with artifact lineage, caching, and retries — submitting RayJobs to the desktop worker via the existing KFP + RayJob pattern from [ADR-0058](0058-training-strategy-cpu-dgx-spark.md)
+* **DGX Spark** ([ADR-0058](0058-training-strategy-cpu-dgx-spark.md)): When acquired, could run TRELLIS + UniRig in-cluster with dedicated GPU, eliminating desktop dependency
+* **Stable3DGen / Hunyuan3D alternatives**: ComfyUI-3D-Pack supports multiple generation backends — can A/B test quality via MLflow metrics
+* **VRM expression morphs**: Investigate automated viseme and expression blendshape generation for full lip-sync support without manual Blender work
+* **ComfyUI API mode**: ComfyUI supports headless API-only execution (`--listen 0.0.0.0 --port 8188`) — a script or future Kubeflow pipeline can submit workflows via HTTP POST to `/prompt`
+* **Text-to-3D**: Use the cluster's vLLM instance to generate a character description, then Stable Diffusion (on desktop) to create a reference image, feeding into TRELLIS — fully text-to-avatar pipeline
+* **Batch generation**: Schedule overnight batch runs via CronWorkflow to generate avatar libraries from curated reference images
+* **In-cluster migration**: If a 16+ GB NVIDIA GPU is added to the cluster (e.g., via DGX Spark or RTX 5070), migrate TRELLIS + UniRig to a dedicated Ray Serve deployment for always-available generation
+
+## Links
+
+* Supersedes: [ADR-0062](0062-blender-mcp-3d-avatar-workflow.md) — BlenderMCP for 3D avatar creation (interactive workflow)
+* Updates: [ADR-0059](0059-mac-mini-ray-worker.md) — waterdeep retains Blender role for manual refinement only
+* Related: [ADR-0046](0046-companions-frontend-architecture.md) — Companions frontend architecture (Three.js + VRM avatars)
+* Related: [ADR-0011](0011-kuberay-unified-gpu-backend.md) — KubeRay unified GPU backend
+* Related: [ADR-0005](0005-multi-gpu-strategy.md) — Multi-GPU heterogeneous strategy
+* Related: [ADR-0058](0058-training-strategy-cpu-dgx-spark.md) — Training strategy (Kubeflow + RayJob pattern for future pipeline work)
+* Related: [ADR-0047](0047-mlflow-experiment-tracking.md) — MLflow experiment tracking
+* Related: [ADR-0026](0026-storage-strategy.md) — Storage strategy (gravenhollow NFS-fast, RustFS S3)
+* [Microsoft TRELLIS](https://github.com/microsoft/TRELLIS) — Structured 3D Latents for Scalable 3D Generation (CVPR'25 Spotlight)
+* [VAST-AI UniRig](https://github.com/VAST-AI-Research/UniRig) — One Model to Rig Them All (SIGGRAPH'25)
+* [ComfyUI-3D-Pack](https://github.com/MrForExample/ComfyUI-3D-Pack) — Extensive 3D node suite for ComfyUI
+* [VRM Add-on for Blender](https://vrm-addon-for-blender.info/en/)
+* [@pixiv/three-vrm](https://github.com/pixiv/three-vrm) (runtime loader in companions-frontend)