Mac Mini M4 Pro (waterdeep) as Local AI Agent for 3D Avatar Creation

Status: accepted
Date: 2026-02-16
Updated: 2026-02-23
Deciders: Billy
Technical Story: Use waterdeep as a dedicated local AI workstation for BlenderMCP-driven 3D avatar creation, replacing the previously proposed Ray worker role

Context and Problem Statement

waterdeep is a Mac Mini M4 Pro with 48 GB of unified memory that currently serves as a development workstation (see ADR-0037). The original proposal was to add it to the Ray cluster as an external inference/training worker, but:

All Ray inference slots are already allocated and stable — adding a 5th GPU class (MPS) increases complexity without filling a gap
vLLM's MPS backend remains experimental — not production-ready for serving
The real unmet need is 3D avatar creation for companions-frontend (ADR-0062)

ADR-0062 describes using BlenderMCP in a Kasm Blender workstation for AI-assisted avatar creation. While Kasm works, it runs Blender inside a Docker-in-Docker (DinD) container with no GPU acceleration: rendering and viewport interaction are CPU-only, which is painfully slow for sculpting, material preview, and VRM export iteration.

waterdeep's M4 Pro has a 16-core GPU with hardware-accelerated Metal rendering and 48 GB of unified memory shared between CPU and GPU. Running Blender natively on waterdeep with BlenderMCP gives a dramatically better 3D creation experience than Kasm.

How should we use waterdeep to best support the 3D avatar creation pipeline for companions-frontend?

Decision Drivers

Blender on Kasm is CPU-rendered inside DinD — no Metal/Vulkan/CUDA GPU access, poor viewport performance
waterdeep has a 16-core Apple GPU with Metal support — Blender's Metal backend enables real-time viewport rendering, Cycles GPU rendering, and smooth sculpting
48 GB unified memory means Blender, VS Code, and the MCP server can all run simultaneously without swapping
VS Code with Copilot agent mode and the BlenderMCP server are installed on waterdeep; VS Code drives Blender via localhost:9876 over a loopback socket with no network hop
Exported VRM models must reach gravenhollow for production serving (ADR-0062)
rclone chosen for asset promotion to gravenhollow's RustFS S3 endpoint — simpler than NFS mounts on macOS, consistent with existing Kasm rclone patterns, and avoids autofs/NFS fstab complexity
The Kasm Blender workflow from ADR-0062 remains available as a fallback (browser-based, no local install required)
The Ray cluster GPU fleet is fully allocated and stable, so adding MPS complexity is not justified

Considered Options

Local AI agent on waterdeep — Blender + BlenderMCP + VS Code natively on macOS, promoting assets to gravenhollow via rclone (S3)
External Ray worker on macOS (original proposal) — join the Ray cluster for inference and training
Keep Kasm-only workflow — rely entirely on the browser-based Kasm Blender workstation from ADR-0062

Decision Outcome

Chosen option: Option 1 — Local AI agent on waterdeep, because the Mac Mini's Metal GPU makes it dramatically better for 3D work than CPU-rendered Kasm, the Ray cluster doesn't need another worker, and the local workflow eliminates network latency between VS Code, the MCP server, and Blender.

Positive Consequences

Metal GPU acceleration — real-time Eevee viewport, GPU-accelerated Cycles rendering, smooth 60fps sculpting
Low-latency MCP: the BlenderMCP socket (localhost:9876) has no network hop, so commands reach Blender with sub-millisecond transport latency
48 GB unified memory — large Blender scenes, multiple VRM models open simultaneously, no swap pressure
VS Code + Copilot agent mode + BlenderMCP server installed natively — single editor drives both code and Blender commands
rclone for asset promotion — consistent with Kasm rclone patterns, avoids macOS NFS/autofs complexity
waterdeep remains a dev workstation: avatar creation is a creative dev workflow, not a server workload
Kasm Blender remains available as a browser-based fallback for remote/mobile access
Simpler than the Ray worker approach — no cluster integration, no GCS port exposure, no experimental MPS backend

Negative Consequences

Blender, VS Code, and add-ons must be installed and maintained locally on waterdeep via Homebrew
Assets created locally need explicit rclone copy to promote to gravenhollow (vs Kasm's automatic rclone to Quobyte S3)
waterdeep is a single machine — no redundancy for the 3D creation workflow
Not managed by Kubernetes or GitOps — relies on Homebrew-managed tooling

Pros and Cons of the Options

Option 1: Local AI agent on waterdeep

Good, because Metal GPU acceleration makes Blender usable for real 3D work (sculpting, rendering, material preview)
Good, because localhost MCP socket eliminates all network latency
Good, because 48 GB unified memory supports complex scenes without swapping
Good, because no experimental backends (MPS/vLLM) — using Blender's mature Metal renderer
Good, because waterdeep stays a dev workstation, aligning with its named role
Bad, because local-only — no browser-based remote access (use Kasm for that)
Bad, because manual tool installation (Blender, VRM add-on, BlenderMCP, VS Code)
Bad, because asset promotion to gravenhollow requires explicit rclone command

Option 2: External Ray worker on macOS (original proposal)

Good, because adds GPU compute to the Ray cluster
Good, because training jobs gain MPS acceleration
Bad, because vLLM MPS backend is experimental — not production-ready
Bad, because adds a 5th GPU class (MPS) to an already complex fleet
Bad, because Ray GCS port exposure adds security surface
Bad, because doesn't address the actual unmet need (3D avatar creation)
Bad, because waterdeep becomes a server, degrading its dev workstation role

Option 3: Kasm-only workflow

Good, because browser-based — usable from any device
Good, because no local installation required
Bad, because CPU-rendered Blender inside DinD — poor viewport performance
Bad, because network latency between VS Code and Blender socket
Bad, because limited memory inside Kasm container
Bad, because no GPU acceleration for rendering or sculpting

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│  waterdeep (Mac Mini M4 Pro · 48 GB unified · Metal GPU)                │
│                                                                         │
│  ┌──────────────────────────────────────────────────────┐              │
│  │         VS Code + GitHub Copilot (agent mode)        │              │
│  │                                                      │              │
│  │  BlenderMCP Server (uvx blender-mcp)                 │              │
│  │  DISABLE_TELEMETRY=true                              │              │
│  │         │                                            │              │
│  │         │ TCP localhost:9876 (zero latency)           │              │
│  │         ▼                                            │              │
│  └─────────┬────────────────────────────────────────────┘              │
│            │                                                            │
│  ┌─────────▼────────────────────────────────────────────┐              │
│  │              Blender 4.x (native macOS)              │              │
│  │                                                      │              │
│  │  Renderer: Metal (Eevee real-time + Cycles GPU)      │              │
│  │  Add-ons:                                            │              │
│  │   • BlenderMCP (addon.py) — socket server :9876      │              │
│  │   • VRM Add-on for Blender — import/export VRM       │              │
│  │                                                      │              │
│  │  Working files: ~/blender-avatars/                    │              │
│  │  ├── projects/          (.blend source files)        │              │
│  │  ├── exports/           (.vrm exported models)       │              │
│  │  └── textures/          (shared texture library)     │              │
│  └──────────────────────────────────────────────────────┘              │
│                          │                                              │
│                    rclone (S3 asset promotion)                           │
│                    gravenhollow RustFS :30292                            │
└──────────────────────────┼──────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────────────┐
│            gravenhollow.lab.daviestechlabs.io                            │
│            (TrueNAS Scale · All-SSD · Dual 10GbE · 12.2 TB)            │
│                                                                         │
│  NFS: /mnt/gravenhollow/kubernetes/avatar-models/                       │
│  ├── Seed-san.vrm          (default model)                              │
│  ├── Companion-A.vrm       (promoted from waterdeep)                    │
│  └── animations/           (shared animation clips)                     │
│                                                                         │
│  S3 (RustFS): avatar-models bucket                                      │
│  (same data, served via Cloudflare Tunnel for remote users)             │
└──────────────────────────┬──────────────────────────────────────────────┘
                           │
              ┌────────────┴───────────────┐
              │                            │
        NFS (nfs-fast PVC)          Cloudflare Tunnel
              │                     (assets.daviestechlabs.io)
              ▼                            │
┌──────────────────────────┐               ▼
│  companions-frontend     │   ┌──────────────────────────┐
│  (Kubernetes pod)        │   │  Remote users (CDN-cached │
│  LAN users               │   │  via Cloudflare edge)     │
└──────────────────────────┘   └──────────────────────────┘
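
The working-file layout in the diagram can be created in one step. A minimal sketch, assuming only the three directory names shown above; the function name `init_avatar_dirs` is hypothetical and not part of any tool:

```shell
# Create the projects/, exports/ and textures/ directories from the diagram.
# The base directory defaults to ~/blender-avatars; pass a path to override it.
init_avatar_dirs() {
    base="${1:-$HOME/blender-avatars}"
    mkdir -p "$base/projects" "$base/exports" "$base/textures"
}
```

Calling `init_avatar_dirs` with no argument creates the layout under `~/blender-avatars/`; running it again is harmless because `mkdir -p` ignores directories that already exist.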

Implementation Plan

1. Install Blender and Add-ons

# Install Blender via Homebrew
brew install --cask blender

# Download BlenderMCP add-on
curl -LO https://raw.githubusercontent.com/ahujasid/blender-mcp/main/addon.py

# Install in Blender:
# Edit > Preferences > Add-ons > Install... > select addon.py
# Enable "Interface: Blender MCP"

# Install VRM Add-on for Blender:
# Download from https://vrm-addon-for-blender.info/en/
# Edit > Preferences > Add-ons > Install... > select VRM add-on zip
# Enable "Import-Export: VRM"

2. VS Code MCP Configuration

// .vscode/mcp.json (in companions-frontend or global settings)
{
  "servers": {
    "blender": {
      "command": "uvx",
      "args": ["blender-mcp"],
      "env": {
        "BLENDER_HOST": "localhost",
        "BLENDER_PORT": "9876",
        "DISABLE_TELEMETRY": "true"
      }
    }
  }
}
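
Before VS Code launches the MCP server, it can help to confirm that Blender's add-on is actually listening. The sketch below is one way to probe the socket. `blender_socket_up` is a hypothetical helper name, and its defaults simply mirror `BLENDER_HOST` and `BLENDER_PORT` from the configuration above:

```shell
# Return 0 if something accepts TCP connections on host:port, non-zero otherwise.
# Uses `nc -z`, a connect-only probe that sends no data.
blender_socket_up() {
    host="${1:-localhost}"
    port="${2:-9876}"
    nc -z "$host" "$port" >/dev/null 2>&1
}
```

For example, `blender_socket_up || echo "Start the BlenderMCP add-on in Blender first"` prints the reminder only when nothing answers on port 9876. The probe shows that a listener exists, not that the listener is Blender.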

3. Python Environment for BlenderMCP

# Install uv (per ADR-0012)
curl -LsSf https://astral.sh/uv/install.sh | sh

# uvx handles the BlenderMCP server environment automatically
# Verify it works:
uvx blender-mcp --help
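
A small preflight check can catch a missing tool before a session starts. This is only a sketch: `missing_tools` is a hypothetical helper, and which binaries end up on `PATH` depends on how each tool was installed.

```shell
# Print the name of every listed tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}
```

For example, `missing_tools uv uvx` prints nothing and returns 0 once the installer above has put both binaries on `PATH`.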

4. rclone for Asset Promotion

Use rclone to promote finished VRM exports to gravenhollow's RustFS S3 endpoint. This is consistent with the Kasm rclone volume plugin pattern from ADR-0062 and avoids macOS NFS/autofs complexity.

# Install rclone
brew install rclone

# Configure gravenhollow RustFS endpoint
rclone config create gravenhollow s3 \
    provider=Other \
    endpoint=https://gravenhollow.lab.daviestechlabs.io:30292 \
    access_key_id=<key> \
    secret_access_key=<secret>

# Promote a finished VRM
rclone copy ~/blender-avatars/exports/Companion-A.vrm gravenhollow:avatar-models/

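# Copy all exports (idempotent). Unlike `rclone sync`, `rclone copy` never deletes
# remote objects that are absent locally, such as Seed-san.vrm or animations/.
rclone copy ~/blender-avatars/exports/ gravenhollow:avatar-models/ --exclude "*.blend"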

Why rclone over NFS? macOS autofs/NFS mounts are fragile across reboots and network changes. rclone is a single binary, works over HTTPS, and matches the promotion pattern already used in Kasm workflows. The explicit rclone copy command also serves as a deliberate promotion gate — only intentionally promoted models reach production.
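
The promotion gate described above could be wrapped in a small function that refuses anything other than a non-empty `.vrm` file. This is a sketch under stated assumptions: `promote_vrm` and the `RCLONE` and `REMOTE` override variables are hypothetical names introduced here, and the default remote is the `gravenhollow:avatar-models/` target used in the commands above.

```shell
# Promote one exported VRM to the avatar-models bucket.
# RCLONE and REMOTE can be overridden, e.g. RCLONE=echo to print the command only.
promote_vrm() {
    file="$1"
    rclone_bin="${RCLONE:-rclone}"
    remote="${REMOTE:-gravenhollow:avatar-models/}"

    # Gate 1: only .vrm files may be promoted (keeps .blend sources local).
    case "$file" in
        *.vrm) ;;
        *) echo "refusing to promote non-VRM file: $file" >&2; return 2 ;;
    esac

    # Gate 2: the file must exist and be non-empty.
    if [ ! -s "$file" ]; then
        echo "missing or empty file: $file" >&2
        return 3
    fi

    "$rclone_bin" copy "$file" "$remote"
}
```

Typical use would be `promote_vrm ~/blender-avatars/exports/Companion-A.vrm`. Setting `RCLONE=echo` first prints the rclone arguments instead of uploading, which is a cheap way to rehearse a promotion.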

5. Avatar Creation Workflow (waterdeep)

Open Blender on waterdeep (native Metal-accelerated)
Enable BlenderMCP → 3D View sidebar → "BlenderMCP" tab → click "Connect"
Open VS Code with Copilot agent mode — BlenderMCP server starts automatically
Create avatars using AI-assisted prompts:
- "Create an anime-style character with silver hair and a mage outfit"
- "Apply metallic blue material to the staff"
- "Rig this character for VRM export with standard humanoid bones"
- "Export as VRM to ~/blender-avatars/exports/Silver-Mage.vrm"
Preview in real-time — Metal GPU renders Eevee viewport at 60fps

Promote the finished VRM to gravenhollow via rclone:

rclone copy ~/blender-avatars/exports/Silver-Mage.vrm gravenhollow:avatar-models/

Register in companions-frontend — update AllowedAvatarModels in Go and JS allowlists, commit

6. Workflow Comparison: waterdeep vs Kasm

Aspect	waterdeep (local)	Kasm (browser)
GPU rendering	Metal 16-core GPU — Eevee real-time, Cycles GPU	CPU-only software rendering
Viewport FPS	60fps (Metal)	5–15fps (CPU rasterisation)
MCP latency	localhost socket — sub-millisecond	Network hop to Kasm container
Memory	48 GB unified, shared with GPU	Limited by Kasm container allocation
Sculpting	Smooth, hardware-accelerated	Laggy, CPU-bound
Asset promotion	rclone to gravenhollow RustFS S3	Auto rclone to Quobyte S3 → manual promote to gravenhollow
Access	Local only (waterdeep physical/VNC)	Any browser, anywhere
Setup	Homebrew + manual add-on install	Pre-baked in Kasm image
Use when	Primary creation workflow	Remote access, quick edits, mobile

Security Considerations

BlenderMCP's execute_blender_code runs arbitrary Python in Blender — review AI-generated code before execution, especially file I/O operations
Telemetry disabled via DISABLE_TELEMETRY=true in MCP server config
BlenderMCP socket (port 9876) bound to localhost — not exposed to the network
S3 traffic to gravenhollow (rclone over HTTPS) traverses the LAN; VRM files contain no sensitive data
waterdeep has no cluster access — compromise doesn't impact Kubernetes workloads
.blend source files stay local on waterdeep; only finished VRM exports are promoted to gravenhollow
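
The localhost-only claim for port 9876 can be spot-checked rather than assumed. The sketch below filters the output of `lsof -nP -iTCP:9876 -sTCP:LISTEN` and flags any listener not bound to a loopback address. `loopback_only` is a hypothetical helper, and it assumes the usual lsof column layout in which the ninth field is the `address:port` NAME column.

```shell
# Read `lsof -nP -iTCP:<port> -sTCP:LISTEN` output on stdin.
# Exit 0 if every listener is bound to loopback, 1 otherwise.
loopback_only() {
    awk '
        NR == 1 { next }                    # skip the lsof header row
        {
            addr = $9                       # NAME column, e.g. 127.0.0.1:9876
            sub(/:[0-9]+$/, "", addr)       # drop the trailing :port
            if (addr != "127.0.0.1" && addr != "[::1]") {
                print "non-loopback listener: " $9
                bad = 1
            }
        }
        END { exit bad + 0 }
    '
}
```

Usage on waterdeep would be `lsof -nP -iTCP:9876 -sTCP:LISTEN | loopback_only`. Empty input (no listener at all) also passes, so this check confirms only that nothing is exposed, not that Blender is running.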

Future Considerations

DGX Spark (ADR-0058): When acquired, DGX Spark handles training; waterdeep remains the 3D creation workstation
Blender + MLX: Apple's MLX framework could power local AI-generated textures or mesh deformation directly in Blender — worth evaluating as Blender add-ons mature
Automated promotion: A file watcher (fswatch/launchd) could auto-run rclone copy when a new VRM appears in ~/blender-avatars/exports/
VRM validation: Add a pre-promotion check script that validates VRM humanoid rig completeness, expression morphs, and viseme shapes before copying to gravenhollow
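
As a first step toward the pre-promotion check, a script could at least confirm that an export is a glTF binary container, since VRM files are glTF binaries whose first four bytes are the ASCII magic `glTF`. This sketch does not inspect the humanoid rig, expression morphs, or visemes; `is_glb` is a hypothetical helper name.

```shell
# Return 0 if the file starts with the 4-byte glTF binary magic "glTF".
# Missing or unreadable files fail the check.
is_glb() {
    # tr strips NUL bytes so the shell never sees them in the substitution.
    magic=$(head -c 4 "$1" 2>/dev/null | tr -d '\000')
    [ "$magic" = "glTF" ]
}
```

For example, `is_glb ~/blender-avatars/exports/Silver-Mage.vrm || echo "not a glTF binary"` would catch a truncated or mis-named export before it is copied to gravenhollow.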
