# ComfyUI Image-to-3D Avatar Pipeline with TRELLIS + UniRig

* Status: proposed
* Date: 2026-02-24
* Deciders: Billy
* Technical Story: Replace the manual BlenderMCP 3D avatar creation workflow with an automated, GPU-accelerated image-to-rigged-3D-model pipeline using ComfyUI, TRELLIS 2-4B, and UniRig, running on a personal desktop (NVIDIA RTX 4070) as an on-demand Ray worker, with direct MLflow logging and rclone asset promotion

## Context and Problem Statement

The companions-frontend serves VRM avatar models for Three.js-based 3D character rendering ([ADR-0046](0046-companions-frontend-architecture.md)). The previous approach ([ADR-0062](0062-blender-mcp-3d-avatar-workflow.md)) proposed using BlenderMCP in a Kasm workstation or on waterdeep ([ADR-0059](0059-mac-mini-ray-worker.md)) for AI-assisted avatar creation. While BlenderMCP bridges VS Code to Blender, the workflow is fundamentally **interactive and manual**: an operator must prompt the AI, review each sculpting step, and hand-tune rigging and VRM export. This is slow, non-reproducible, and doesn't scale.

Meanwhile, the state of the art in image-to-3D generation has matured significantly:

- **TRELLIS** (Microsoft, CVPR'25 Spotlight, 12k+ GitHub stars) generates high-quality textured 3D meshes from a single image in seconds using Structured 3D Latents (SLAT), with models up to 2B parameters
- **UniRig** (Tsinghua/Tripo, SIGGRAPH'25, 1.4k+ GitHub stars) automatically generates topologically valid skeletons and skinning weights for arbitrary 3D models using autoregressive transformers; it is the first model to rig humans, animals, and objects with a single unified framework
- **ComfyUI-3D-Pack** (3.7k+ GitHub stars) provides battle-tested ComfyUI nodes for TRELLIS, 3D Gaussian Splatting, mesh processing, and GLB/VRM export, enabling node-graph-based automation without custom code

Together, these tools enable a fully automated **image → 3D mesh → rigged model → VRM** pipeline that eliminates manual Blender work for the common case, produces reproducible results, and integrates with the existing MLflow + Ray stack.

A personal desktop (Ryzen 9 7950X, 64 GB DDR5, NVIDIA RTX 4070 12 GB VRAM) running Arch Linux is available as an **on-demand external Ray worker**. It won't be a permanent cluster member (it's not running Talos), but it can join the Ray cluster via `ray start` when 3D generation workloads need to run. This adds a 5th GPU to the fleet specifically for 3D generation, without disrupting the stable inference allocations.

How do we build an automated, reproducible image-to-VRM pipeline that leverages the desktop's CUDA GPU and integrates with the existing AI/ML platform for experiment tracking and asset serving?

## Decision Drivers

* BlenderMCP workflow from ADR-0062 is interactive and non-reproducible: every avatar requires an operator in the loop
* TRELLIS generates production-quality textured meshes from a single reference image in ~30 seconds on a 12 GB GPU
* UniRig automatically rigs arbitrary 3D models with skeleton + skinning weights, with no manual weight painting
* ComfyUI-3D-Pack bundles TRELLIS, mesh processing, and GLB export as composable nodes, enabling visual pipeline authoring
* The desktop's RTX 4070 (12 GB VRAM) is below TRELLIS's stated 16 GB minimum, but the 1.2B image-to-3D model is expected to fit in ~10 GB with fp16/attention optimizations; it exceeds UniRig's 8 GB requirement
* The desktop can join/leave the Ray cluster on demand, with no permanent infrastructure commitment
* MLflow tracks generation parameters, quality metrics, and output artifacts for reproducibility; the desktop logs directly to the cluster's MLflow service over HTTP
* waterdeep (Mac Mini M4 Pro) remains available for interactive Blender touch-up on models that need manual refinement
* VRM export, asset promotion to gravenhollow, and serving architecture from ADR-0062 remain valid and are reused

## Considered Options

1. **ComfyUI + TRELLIS + UniRig on desktop Ray worker, with direct MLflow logging and rclone promotion**
2. **BlenderMCP interactive workflow** (ADR-0062, superseded)
3. **Cloud-hosted 3D generation (Hyper3D Rodin, Meshy, etc.)**
4. **Run TRELLIS + UniRig directly as Ray Serve deployments in-cluster**

## Decision Outcome

Chosen option: **Option 1, ComfyUI + TRELLIS + UniRig on desktop Ray worker**, because it automates the entire image-to-rigged-model pipeline without operator interaction, leverages purpose-built state-of-the-art models (TRELLIS for generation, UniRig for rigging), and uses the desktop's RTX 4070 as on-demand GPU capacity without disrupting the stable inference cluster.

ComfyUI's visual node graph provides the pipeline orchestration directly on the desktop; no Kubernetes-side orchestrator is needed since all compute is local to one machine. waterdeep retains its role as an interactive Blender workstation for manual refinement of auto-generated models when needed, but the expectation is that most avatars pass through the automated pipeline without manual touch-up.

### Positive Consequences

* **Fully automated pipeline**: image → textured mesh → rigged model → VRM with no operator in the loop
* **Reproducible**: the same image, seed, and parameters are expected to reproduce the same output (up to GPU nondeterminism); parameters are tracked in MLflow
* **Fast**: TRELLIS generates a mesh in ~30s, UniRig rigs it in ~60s; end-to-end under 5 minutes including VRM export
* **On-demand GPU**: desktop joins Ray cluster only when needed; no standing resource cost
* **Composable**: ComfyUI node graph can be extended with additional 3D processing nodes (Hunyuan3D, TripoSG, Stable3DGen) without code changes
* **Quality**: TRELLIS (CVPR'25) and UniRig (SIGGRAPH'25) represent current state of the art
* **MLflow integration**: generation parameters, mesh quality metrics, and output artifacts are logged directly to the cluster's MLflow service over HTTP
* **Simple orchestration**: ComfyUI node graph handles the pipeline; no Kubernetes-side orchestrator needed for a single-GPU linear workflow
* **Reuses existing serving architecture**: gravenhollow NFS + RustFS CDN serving from ADR-0062 is unchanged
* **waterdeep fallback**: interactive Blender + BlenderMCP on waterdeep for models needing hand-tuning

### Negative Consequences

* Desktop must be powered on and `ray start` must be run manually to participate in the pipeline
* TRELLIS requires NVIDIA CUDA, so it cannot run on the existing AMD/Intel GPU fleet (khelben, drizzt, danilo)
* ComfyUI adds a Python dependency stack (PyTorch, CUDA, spconv, flash-attn) to maintain on the desktop
* RTX 4070 has 12 GB VRAM, so large TRELLIS models (2B params) may require fp16 +
attention optimization; the 1.2B image-to-3D model fits comfortably
* Auto-generated VRM models may still need manual expression/viseme morph targets for full companions-frontend lip-sync support
* Desktop is not managed by GitOps/Kubernetes; setup is via Ansible or by hand

## Pros and Cons of the Options

### Option 1: ComfyUI + TRELLIS + UniRig on desktop Ray worker

* Good, because fully automated image-to-VRM pipeline eliminates manual sculpting
* Good, because TRELLIS (CVPR'25) and UniRig (SIGGRAPH'25) are state-of-the-art, MIT-licensed
* Good, because ComfyUI-3D-Pack provides tested node implementations, so no custom TRELLIS integration code is needed
* Good, because desktop GPU is free/idle capacity with no cluster impact
* Good, because MLflow integration reuses existing experiment tracking infrastructure
* Good, because ComfyUI can queue and batch-generate multiple avatars unattended
* Bad, because desktop availability is not guaranteed (must be manually started)
* Bad, because it is CUDA-only and doesn't leverage the existing ROCm/Intel fleet
* Bad, because auto-rigging quality varies by model topology, so some models may need manual refinement

### Option 2: BlenderMCP interactive workflow (ADR-0062)

* Good, because maximum creative control via VS Code + Copilot
* Good, because Kasm provides browser-based access from anywhere
* Bad, because every avatar requires an operator in the loop, which is slow and non-reproducible
* Bad, because Blender sculpting from scratch is time-intensive even with AI assistance
* Bad, because Kasm runs Blender CPU-only (no GPU acceleration inside DinD)
* Bad, because no MLflow tracking or reproducibility

### Option 3: Cloud-hosted 3D generation

* Good, because no local GPU required
* Good, because some services (Meshy, Hyper3D Rodin) offer API access
* Bad, because vendor dependency for a core asset pipeline
* Bad, because free tiers have daily limits; paid tiers add recurring cost
* Bad, because limited control over output quality, rigging, and VRM compliance
* Bad, because data leaves the homelab network

### Option 4: TRELLIS + UniRig as in-cluster Ray Serve deployments

* Good, because fully integrated with existing Ray cluster
* Good, because no desktop dependency
* Bad, because TRELLIS requires NVIDIA CUDA and no CUDA GPU in-cluster has enough VRAM (elminster has 8 GB, needs 12–16 GB)
* Bad, because would require purchasing new in-cluster NVIDIA hardware
* Bad, because 3D generation is batch/occasional, not real-time serving, so Ray Serve's always-on model is wasteful
* Bad, because TRELLIS's CUDA dependencies (spconv, flash-attn, nvdiffrast, kaolin) conflict with existing Ray worker images

## Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│  Kubeflow Pipelines (namespace: kubeflow)                                   │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │  3d_avatar_generation_pipeline                                        │  │
│  │                                                                       │  │
│  │  1. prepare_reference    Load/generate reference image from prompt    │  │
│  │         │                (optional: use vLLM + Stable Diffusion)      │  │
│  │         ▼                                                             │  │
│  │  2. generate_3d_mesh     Submit RayJob → desktop ComfyUI worker       │  │
│  │         │                TRELLIS image-large (1.2B) → GLB mesh        │  │
│  │         ▼                                                             │  │
│  │  3. auto_rig             Submit RayJob → desktop UniRig worker        │  │
│  │         │                UniRig skeleton + skinning → rigged FBX/GLB  │  │
│  │         ▼                                                             │  │
│  │  4. convert_to_vrm       Blender CLI (headless) on desktop or cluster │  │
│  │         │                Import rigged GLB → configure VRM metadata   │  │
│  │         ▼                → export .vrm                                │  │
│  │  5. validate_vrm         Check humanoid rig, expressions, visemes     │  │
│  │         │                                                             │  │
│  │         ▼                                                             │  │
│  │  6. promote_to_storage   rclone copy → gravenhollow RustFS S3         │  │
│  │         │                                                             │  │
│  │         ▼                                                             │  │
│  │  7. log_to_mlflow        Parameters, metrics, artifacts → MLflow      │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────┬──────────────────────────────────────┘
                                       │
                                       │  RayJob CR (ephemeral)
                                       ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│  desktop (Arch Linux · Ryzen 9 7950X · 64 GB DDR5 · RTX 4070 12 GB)         │
│  On-demand Ray worker (ray start --address=:6379)                           │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │  ComfyUI + Custom Nodes                                               │  │
│  │                                                                       │  │
│  │  ComfyUI-3D-Pack:                                                     │  │
│  │  • TRELLIS image-large (1.2B): image → textured GLB mesh              │  │
│  │  • Mesh processing nodes: simplify, UV unwrap, texture bake           │  │
│  │  • 3D preview: viewport render for quality check                      │  │
│  │  • GLB/OBJ/PLY export                                                 │  │
│  │                                                                       │  │
│  │  UniRig:                                                              │  │
│  │  • Skeleton prediction: autoregressive bone hierarchy                 │  │
│  │  • Skinning weights: bone-point cross-attention                       │  │
│  │  • Merge: skeleton + skin + original mesh → rigged model              │  │
│  │  • Supports GLB, FBX, OBJ input/output                                │  │
│  │                                                                       │  │
│  │  Blender 4.x (headless CLI):                                          │  │
│  │  • VRM Add-on for Blender: GLB → VRM conversion                       │  │
│  │  • Humanoid rig mapping, expression morphs, viseme config             │  │
│  │  • Batch export via bpy scripting                                     │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                                                             │
│  GPU: NVIDIA RTX 4070 12 GB (CUDA 12.x)                                     │
│  Ray: worker node with resources {"3d_gen": 1, "rtx4070": 1}                │
│  Storage: ~/comfyui-3d/ (working dir), rclone → gravenhollow S3             │
└──────────────────────────────────────┬──────────────────────────────────────┘
                                       │
                                       │  rclone (S3)
                                       ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│  gravenhollow.lab.daviestechlabs.io                                         │
│  (TrueNAS Scale · All-SSD · Dual 10GbE · 12.2 TB)                           │
│                                                                             │
│  NFS: /mnt/gravenhollow/kubernetes/avatar-models/                           │
│  ├── Seed-san.vrm         (default model)                                   │
│  ├── Generated-A-v1.vrm   (auto-generated via pipeline)                     │
│  └── animations/          (shared animation clips)                          │
│                                                                             │
│  S3 (RustFS): avatar-models bucket                                          │
│  (same data, served via Cloudflare Tunnel for remote users)                 │
└──────────────────────────┬──────────────────────────────────────────────────┘
                           │
              ┌────────────┴───────────────┐
              │                            │
     NFS (nfs-fast PVC)            Cloudflare Tunnel
              │               (assets.daviestechlabs.io)
              ▼                            │
┌──────────────────────────┐               ▼
│ companions-frontend      │  ┌──────────────────────────┐
│ (Kubernetes pod)         │  │ Remote users (CDN-cached │
│ LAN users                │  │ via Cloudflare edge)     │
└──────────────────────────┘  └──────────────────────────┘
```

### Ray Cluster Integration

The desktop joins the existing KubeRay-managed cluster as an external worker. It is **not** a Talos node and not managed by Kubernetes; it connects to the Ray head node's GCS port directly:

```
┌─────────────────────────────────────────────────────────────────────────────┐
│  Ray Cluster (KubeRay RayService)                                           │
│                                                                             │
│  Head: Ray head pod (in-cluster)                                            │
│  GCS port: 6379 (exposed via NodePort or LoadBalancer)                      │
│                                                                             │
│  In-Cluster Workers (permanent, managed by KubeRay):                        │
│  ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐                │
│  │ khelben    │ │ elminster  │ │ drizzt     │ │ danilo     │                │
│  │ Strix Halo │ │ RTX 2070   │ │ Radeon 680 │ │ Intel Arc  │                │
│  │ ROCm       │ │ CUDA       │ │ ROCm       │ │ Intel      │                │
│  │ /llm       │ │ /whisper   │ │ /embeddings│ │ /reranker  │                │
│  │            │ │ /tts       │ │            │ │            │                │
│  └────────────┘ └────────────┘ └────────────┘ └────────────┘                │
│                                                                             │
│  External Worker (on-demand, self-managed):                                 │
│  ┌──────────────────────────────────────────────────┐                       │
│  │ desktop (Arch Linux, external)                   │                       │
│  │ RTX 4070 12 GB · CUDA                            │                       │
│  │ ComfyUI + TRELLIS + UniRig + Blender CLI         │                       │
│  │ Resource labels: {"3d_gen": 1, "rtx4070": 1}     │                       │
│  │ Joins via: ray start --address=:6379             │                       │
│  └──────────────────────────────────────────────────┘                       │
└─────────────────────────────────────────────────────────────────────────────┘
```

The existing inference deployments (`/llm`, `/whisper`, `/tts`, `/embeddings`, `/reranker`) are unaffected: they are pinned to their respective in-cluster GPU
nodes via Ray resource labels. The desktop's `3d_gen` resource label ensures only 3D generation RayJobs get scheduled there.

### Ray Service Multiplexing

The desktop's RTX 4070 can **time-share between inference overflow and 3D generation**. When no 3D generation jobs are queued, the desktop can optionally serve as overflow capacity for inference workloads:

| Mode | When | What runs on desktop |
|------|------|---------------------|
| **3D generation** | ComfyUI workflow triggered (manually or via API) | ComfyUI + TRELLIS → UniRig → Blender VRM export |
| **Inference overflow** | Manually enabled, high-traffic periods | vLLM (secondary), Whisper, or TTS replica |
| **Idle** | Desktop powered on, no jobs | Ray worker connected but idle (0 resource cost) |

Mode switching is managed by Ray's resource scheduling: 3D jobs request `{"3d_gen": 1}` and inference jobs request their specific GPU labels. When the desktop is off, all workloads continue on the existing in-cluster fleet with no impact.

## Implementation Plan

### 1. Desktop Environment Setup

```bash
# Install NVIDIA drivers + CUDA toolkit (Arch Linux)
sudo pacman -S nvidia nvidia-utils cuda cudnn

# Install Python environment (uv per ADR-0012)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create project directory
mkdir -p ~/comfyui-3d && cd ~/comfyui-3d

# Install ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
uv venv --python 3.11
source .venv/bin/activate
uv pip install -r requirements.txt

# Install ComfyUI-3D-Pack (includes TRELLIS nodes)
cd custom_nodes
git clone https://github.com/MrForExample/ComfyUI-3D-Pack.git
cd ComfyUI-3D-Pack
uv pip install -r requirements.txt
python install.py

# Install UniRig
cd ~/comfyui-3d
git clone https://github.com/VAST-AI-Research/UniRig.git
cd UniRig
uv pip install torch torchvision
uv pip install -r requirements.txt
uv pip install spconv-cu124  # Match CUDA version
uv pip install flash-attn --no-build-isolation

# Install Blender (headless CLI for VRM export)
sudo pacman -S blender

# Install VRM Add-on
# The zip path is relative to ~/comfyui-3d, so return there first.
# Assumes UniRig's requirements installed the `bpy` module into the active venv.
cd ~/comfyui-3d
python -c "import bpy, os; bpy.ops.preferences.addon_install(filepath=os.path.abspath('UniRig/blender/add-on-vrm-v2.20.77_modified.zip'))"

# Install rclone for asset promotion
sudo pacman -S rclone
rclone config create gravenhollow s3 \
  provider=Other \
  endpoint=https://gravenhollow.lab.daviestechlabs.io:30292 \
  access_key_id= \
  secret_access_key=

# Install Ray for cluster joining
# Ray and Python versions must match those running on the Ray head
uv pip install "ray[default]"
```

### 2. Ray Worker Configuration

```bash
# Join the Ray cluster on demand
# Ray head GCS port must be exposed (NodePort 30637 or similar)
ray start \
  --address=:6379 \
  --num-cpus=16 \
  --num-gpus=1 \
  --resources='{"3d_gen": 1, "rtx4070": 1}' \
  --node-name=desktop

# Verify connection
ray status
# Should show desktop as a connected worker
```

The Ray head's GCS port needs to be reachable from the desktop.
Options:

- **NodePort**: Expose port 6379 as a NodePort (e.g., 30637) on a cluster node
- **Tailscale/WireGuard**: If the desktop is on a different network segment
- **Direct LAN**: If desktop and cluster are on the same 192.168.100.0/24 subnet

Whichever option is used, Ray nodes also exchange traffic on ports beyond 6379 (node manager, object manager, worker ports), so the desktop and the in-cluster Ray pods need to reach each other in both directions.

### 3. ComfyUI Workflow (Node Graph)

The ComfyUI workflow JSON defines the image-to-GLB pipeline:

```
[Load Image] → [TRELLIS Image-to-3D] → [Mesh Simplify] → [Texture Bake]
                                                               │
                                                               ▼
                                                          [Save GLB]
                                                               │
                                                               ▼
                                                 [UniRig Skeleton Prediction]
                                                               │
                                                               ▼
                                                   [UniRig Skinning Weights]
                                                               │
                                                               ▼
                                                 [UniRig Merge (rigged model)]
                                                               │
                                                               ▼
                                                  [Blender VRM Export (CLI)]
                                                               │
                                                               ▼
                                              [Save VRM → ~/comfyui-3d/exports/]
```

Key TRELLIS parameters exposed:

- `sparse_structure_sampler_params.steps`: 12 (default)
- `sparse_structure_sampler_params.cfg_strength`: 7.5
- `slat_sampler_params.steps`: 12
- `slat_sampler_params.cfg_strength`: 3.0
- `simplify`: 0.95 (triangle reduction ratio)
- `texture_size`: 1024

### 4. MLflow Experiment Tracking

The desktop logs directly to the cluster's MLflow service over HTTP. Set `MLFLOW_TRACKING_URI` in the ComfyUI environment or in a post-generation logging script:

```bash
export MLFLOW_TRACKING_URI=http://:5000
```

Each generation run logs to a dedicated MLflow experiment:

| What | MLflow Concept | Content |
|------|---------------|---------|
| Reference image | Artifact | `reference.png` |
| TRELLIS parameters | Params | seed, cfg_strength, steps, simplify, texture_size |
| UniRig parameters | Params | skeleton_seed |
| Raw mesh | Artifact | `{name}_raw.glb` (pre-rigging) |
| Rigged model | Artifact | `{name}_rigged.glb` (post-rigging) |
| Final VRM | Artifact | `{name}.vrm` |
| Mesh quality | Metrics | vertex_count, face_count, texture_resolution |
| Rig quality | Metrics | bone_count, skinning_weight_coverage |
| Pipeline duration | Metrics | trellis_time_s, unirig_time_s, total_time_s |

### 5. VRM Export Script (Blender CLI)

```python
#!/usr/bin/env python3
"""vrm_export.py: headless Blender script for GLB→VRM conversion."""
import sys

import bpy

# Blender passes everything after "--" through to the script untouched
argv = sys.argv[sys.argv.index("--") + 1:]
input_glb = argv[0]
output_vrm = argv[1]
avatar_name = argv[2] if len(argv) > 2 else "Generated Avatar"

# Clear the default scene. read_factory_settings() is avoided here because it
# also resets preferences, which would disable the user-installed VRM add-on.
for obj in list(bpy.data.objects):
    bpy.data.objects.remove(obj, do_unlink=True)

# Import rigged GLB
bpy.ops.import_scene.gltf(filepath=input_glb)

# Select armature
armature = next(obj for obj in bpy.data.objects if obj.type == 'ARMATURE')
bpy.context.view_layer.objects.active = armature

# Configure VRM metadata on the add-on's property group (armature data, not a
# custom property on the object). Property names assume the VRM Add-on for
# Blender scripting API; verify them against the installed add-on version.
ext = armature.data.vrm_addon_extension
ext.spec_version = "0.0"  # VRM 0.x, matching the vrm0 metadata set below
ext.vrm0.meta.title = avatar_name
ext.vrm0.meta.author = "DaviesTechLabs Pipeline"
ext.vrm0.meta.allowed_user_name = "Everyone"

# Export VRM (humanoid bone mapping is assumed to be assigned already)
bpy.ops.export_scene.vrm(filepath=output_vrm)
print(f"Exported VRM: {output_vrm}")
```

Invoked via:

```bash
blender --background --python vrm_export.py -- input.glb output.vrm "Avatar Name"
```

### 6. Asset Promotion (Reuses ADR-0062 Architecture)

The VRM serving architecture from ADR-0062 is preserved unchanged:

| Stage | Action |
|-------|--------|
| **Generate** | Automated pipeline: image → TRELLIS → UniRig → VRM |
| **Promote** | `rclone copy ~/comfyui-3d/exports/{name}.vrm gravenhollow:avatar-models/` |
| **Register** | Add model path to `AllowedAvatarModels` in companions-frontend Go + JS allowlists |
| **Deploy** | Flux rolls out config; model already on NFS PVC, so no image rebuild |
| **CDN** | Cloudflare Tunnel → RustFS → CDN cache at 300+ edge PoPs |

## Model Requirements and VRAM Budget

| Component | Model Size | VRAM Required | Notes |
|-----------|-----------|---------------|-------|
| TRELLIS image-large | 1.2B params | ~10 GB (fp16) | Image-to-3D, best quality |
| TRELLIS text-xlarge | 2.0B params | ~14 GB (fp16) | Text-to-3D, optional; exceeds the RTX 4070's 12 GB as-is |
| UniRig skeleton | ~350M params | ~4 GB | Autoregressive skeleton prediction |
| UniRig skinning | ~350M params | ~4 GB | Bone-point cross-attention |
| Blender CLI | N/A | CPU only | Headless VRM export |

**RTX 4070 budget (12 GB):** Models are loaded sequentially (not concurrently): TRELLIS runs first, its output is saved to disk, then UniRig loads for rigging. Peak VRAM usage is ~10 GB during TRELLIS inference. The desktop's 64 GB system RAM provides ample buffer for model loading and mesh processing.

## Security Considerations

* **Ray GCS port exposure**: The Ray head's port 6379 must be reachable from the desktop. Use a NodePort with network policy restricting source IPs to the desktop's address, or use a WireGuard/Tailscale tunnel.
* **No cluster credentials on desktop**: The desktop runs Ray worker processes and ComfyUI only; it has no `kubeconfig` or Kubernetes API access. Generation is triggered locally via ComfyUI's UI or API, not from the cluster.
* **Model provenance**: TRELLIS and UniRig checkpoints are downloaded from Hugging Face (Microsoft and VAST-AI orgs respectively). Pin checkpoint hashes in the setup script.
* **ComfyUI network**: ComfyUI's web UI (port 8188) should stay bound to localhost (its default) unless remote access is explicitly needed. It is not exposed to the cluster.
* **rclone credentials**: gravenhollow RustFS write credentials are stored in `~/.config/rclone/rclone.conf` with `600` permissions.
* **Generated content**: Auto-generated 3D models inherit no licensing restrictions from the tooling (TRELLIS and UniRig are both MIT-licensed).

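The "pin checkpoint hashes" step under Model provenance could be sketched roughly as follows. This is a minimal illustration, not part of the existing setup script: the `PINNED_SHA256` mapping and the function names are hypothetical, and the mapping is left empty because the real checkpoint digests are not recorded in this ADR.

```python
import hashlib
from pathlib import Path

# Hypothetical pin list: relative checkpoint path -> expected SHA-256 hex digest.
# Left empty on purpose; real digests would be recorded after a trusted download.
PINNED_SHA256: dict[str, str] = {}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so multi-GB checkpoints are never fully in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoints(root: Path, pinned: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose digest differs from the pin."""
    failures = []
    for rel_path, expected in pinned.items():
        candidate = root / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected.lower():
            failures.append(rel_path)
    return failures
```

A setup script could call `verify_checkpoints()` on the model directory after download and abort if the returned list is non-empty, so that a changed upstream checkpoint is noticed before it is loaded.
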
## Future Considerations

* **Kubeflow pipeline for model refinement**: When iterating on existing models (re-rigging, parameter sweeps, A/B testing generation backends), a Kubeflow pipeline can orchestrate multi-step refinement workflows with artifact lineage, caching, and retries, submitting RayJobs to the desktop worker via the existing KFP + RayJob pattern from [ADR-0058](0058-training-strategy-cpu-dgx-spark.md)
* **DGX Spark** ([ADR-0058](0058-training-strategy-cpu-dgx-spark.md)): When acquired, could run TRELLIS + UniRig in-cluster with dedicated GPU, eliminating desktop dependency
* **Stable3DGen / Hunyuan3D alternatives**: ComfyUI-3D-Pack supports multiple generation backends, so quality can be A/B tested via MLflow metrics
* **VRM expression morphs**: Investigate automated viseme and expression blendshape generation for full lip-sync support without manual Blender work
* **ComfyUI API mode**: ComfyUI supports headless API-only execution (`--listen 0.0.0.0 --port 8188`), so a script or future Kubeflow pipeline can submit workflows via HTTP POST to `/prompt`
* **Text-to-3D**: Use the cluster's vLLM instance to generate a character description, then Stable Diffusion (on desktop) to create a reference image, feeding into TRELLIS for a fully text-to-avatar pipeline
* **Batch generation**: Schedule overnight batch runs via CronWorkflow to generate avatar libraries from curated reference images
* **In-cluster migration**: If a 16+ GB NVIDIA GPU is added to the cluster (e.g., via DGX Spark or RTX 5070 Ti), migrate TRELLIS + UniRig to a dedicated Ray Serve deployment for always-available generation

## Links

* Supersedes: [ADR-0062](0062-blender-mcp-3d-avatar-workflow.md), BlenderMCP for 3D avatar creation (interactive workflow)
* Updates: [ADR-0059](0059-mac-mini-ray-worker.md), waterdeep retains Blender role for manual refinement only
* Related: [ADR-0046](0046-companions-frontend-architecture.md), Companions frontend architecture (Three.js + VRM avatars)
* Related: [ADR-0011](0011-kuberay-unified-gpu-backend.md), KubeRay unified GPU backend
* Related: [ADR-0005](0005-multi-gpu-strategy.md), Multi-GPU heterogeneous strategy
* Related: [ADR-0058](0058-training-strategy-cpu-dgx-spark.md), Training strategy (Kubeflow + RayJob pattern for future pipeline work)
* Related: [ADR-0047](0047-mlflow-experiment-tracking.md), MLflow experiment tracking
* Related: [ADR-0026](0026-storage-strategy.md), Storage strategy (gravenhollow NFS-fast, RustFS S3)
* [Microsoft TRELLIS](https://github.com/microsoft/TRELLIS): Structured 3D Latents for Scalable 3D Generation (CVPR'25 Spotlight)
* [VAST-AI UniRig](https://github.com/VAST-AI-Research/UniRig): One Model to Rig Them All (SIGGRAPH'25)
* [ComfyUI-3D-Pack](https://github.com/MrForExample/ComfyUI-3D-Pack): Extensive 3D node suite for ComfyUI
* [VRM Add-on for Blender](https://vrm-addon-for-blender.info/en/)
* [@pixiv/three-vrm](https://github.com/pixiv/three-vrm) (runtime loader in companions-frontend)