# Mac Mini M4 Pro (waterdeep) as Local AI Agent for 3D Avatar Creation
* Status: accepted
* Date: 2026-02-16
* Updated: 2026-02-23
* Deciders: Billy
* Technical Story: Use waterdeep as a dedicated local AI workstation for BlenderMCP-driven 3D avatar creation, replacing the previously proposed Ray worker role
## Context and Problem Statement
**waterdeep** is a Mac Mini M4 Pro with 48 GB of unified memory that currently serves as a development workstation (see [ADR-0037](0037-node-naming-conventions.md)). The original proposal was to add it to the Ray cluster as an external inference/training worker, but:
- All Ray inference slots are already allocated and stable — adding a 5th GPU class (MPS) increases complexity without filling a gap
- vLLM's MPS backend remains experimental — not production-ready for serving
- The real unmet need is **3D avatar creation** for companions-frontend ([ADR-0063](0063-comfyui-3d-avatar-pipeline.md))
[ADR-0063](0063-comfyui-3d-avatar-pipeline.md) describes an automated ComfyUI + TRELLIS + UniRig pipeline for image-to-VRM avatar generation, running on a personal desktop as an on-demand Ray worker. This supersedes the manual BlenderMCP Kasm workflow from [ADR-0062](0062-blender-mcp-3d-avatar-workflow.md). waterdeep retains its role as an interactive Blender workstation for manual refinement of auto-generated models.
waterdeep's M4 Pro has a 16-core GPU with hardware-accelerated Metal rendering and 48 GB of unified memory shared between CPU and GPU. Running Blender natively on waterdeep with BlenderMCP gives a dramatically better 3D creation experience than Kasm.
How should we use waterdeep to maximise the 3D avatar creation pipeline for companions-frontend?
## Decision Drivers
* Blender on Kasm is CPU-rendered inside DinD — no Metal/Vulkan/CUDA GPU access, poor viewport performance
* waterdeep has a 16-core Apple GPU with Metal support — Blender's Metal backend enables real-time viewport rendering, Cycles GPU rendering, and smooth sculpting
* 48 GB unified memory means Blender, VS Code, and the MCP server can all run simultaneously without swapping
* VS Code with Copilot agent mode and BlenderMCP server are installed on waterdeep — VS Code drives Blender via localhost:9876 with zero-latency socket communication
* Exported VRM models must reach gravenhollow for production serving ([ADR-0063](0063-comfyui-3d-avatar-pipeline.md))
* **rclone** chosen for asset promotion to gravenhollow's RustFS S3 endpoint — simpler than NFS mounts on macOS, consistent with existing Kasm rclone patterns, and avoids autofs/NFS fstab complexity
* The automated ComfyUI pipeline from [ADR-0063](0063-comfyui-3d-avatar-pipeline.md) handles most avatar generation; waterdeep serves as the manual refinement station
* Ray cluster GPU fleet is fully allocated and stable, so adding MPS complexity is not justified
## Considered Options
1. **Local AI agent on waterdeep** — Blender + BlenderMCP + VS Code natively on macOS, promoting assets to gravenhollow via rclone (S3)
2. **External Ray worker on macOS** (original proposal) — join the Ray cluster for inference and training
3. **Keep Kasm-only workflow** — rely entirely on the browser-based Kasm Blender workstation from ADR-0062
## Decision Outcome
Chosen option: **Option 1 — Local AI agent on waterdeep**, because the Mac Mini's Metal GPU makes it dramatically better for 3D work than CPU-rendered Kasm, the Ray cluster doesn't need another worker, and the local workflow eliminates network latency between VS Code, the MCP server, and Blender.
### Positive Consequences
* Metal GPU acceleration — real-time Eevee viewport, GPU-accelerated Cycles rendering, smooth 60fps sculpting
* Zero-latency MCP — BlenderMCP socket (localhost:9876) has no network hop, instant command execution
* 48 GB unified memory — large Blender scenes, multiple VRM models open simultaneously, no swap pressure
* VS Code + Copilot agent mode + BlenderMCP server installed natively — single editor drives both code and Blender commands
* rclone for asset promotion — consistent with Kasm rclone patterns, avoids macOS NFS/autofs complexity
* waterdeep remains a dev workstation: avatar creation is a creative dev workflow, not a server workload
* Kasm Blender remains available as a browser-based fallback for remote/mobile access
* Simpler than the Ray worker approach — no cluster integration, no GCS port exposure, no experimental MPS backend
### Negative Consequences
* Blender, VS Code, and add-ons must be installed and maintained locally on waterdeep via Homebrew
* Assets created locally need explicit `rclone copy` to promote to gravenhollow (vs Kasm's automatic rclone to Quobyte S3)
* waterdeep is a single machine — no redundancy for the 3D creation workflow
* Not managed by Kubernetes or GitOps — relies on Homebrew-managed tooling
## Pros and Cons of the Options
### Option 1: Local AI agent on waterdeep
* Good, because Metal GPU acceleration makes Blender usable for real 3D work (sculpting, rendering, material preview)
* Good, because localhost MCP socket eliminates all network latency
* Good, because 48 GB unified memory supports complex scenes without swapping
* Good, because no experimental backends (MPS/vLLM) — using Blender's mature Metal renderer
* Good, because waterdeep stays a dev workstation, aligning with its named role
* Bad, because local-only — no browser-based remote access (use Kasm for that)
* Bad, because manual tool installation (Blender, VRM add-on, BlenderMCP, VS Code)
* Bad, because asset promotion to gravenhollow requires explicit rclone command
### Option 2: External Ray worker on macOS (original proposal)
* Good, because adds GPU compute to the Ray cluster
* Good, because training jobs gain MPS acceleration
* Bad, because vLLM MPS backend is experimental — not production-ready
* Bad, because adds a 5th GPU class (MPS) to an already complex fleet
* Bad, because Ray GCS port exposure adds security surface
* Bad, because doesn't address the actual unmet need (3D avatar creation)
* Bad, because waterdeep becomes a server, degrading its dev workstation role
### Option 3: Kasm-only workflow
* Good, because browser-based — usable from any device
* Good, because no local installation required
* Bad, because CPU-rendered Blender inside DinD — poor viewport performance
* Bad, because network latency between VS Code and Blender socket
* Bad, because limited memory inside Kasm container
* Bad, because no GPU acceleration for rendering or sculpting
## Architecture
```
┌─────────────────────────────────────────────────────────────────────────┐
│  waterdeep (Mac Mini M4 Pro · 48 GB unified · Metal GPU)                │
│                                                                         │
│  ┌──────────────────────────────────────────────────────┐               │
│  │  VS Code + GitHub Copilot (agent mode)               │               │
│  │                                                      │               │
│  │  BlenderMCP Server (uvx blender-mcp)                 │               │
│  │  DISABLE_TELEMETRY=true                              │               │
│  │         │                                            │               │
│  │         │  TCP localhost:9876 (zero latency)         │               │
│  │         ▼                                            │               │
│  └─────────┬────────────────────────────────────────────┘               │
│            │                                                            │
│  ┌─────────▼────────────────────────────────────────────┐               │
│  │  Blender 4.x (native macOS)                          │               │
│  │                                                      │               │
│  │  Renderer: Metal (Eevee real-time + Cycles GPU)      │               │
│  │  Add-ons:                                            │               │
│  │    • BlenderMCP (addon.py) - socket server :9876     │               │
│  │    • VRM Add-on for Blender - import/export VRM      │               │
│  │                                                      │               │
│  │  Working files: ~/blender-avatars/                   │               │
│  │  ├── projects/ (.blend source files)                 │               │
│  │  ├── exports/  (.vrm exported models)                │               │
│  │  └── textures/ (shared texture library)              │               │
│  └──────────────────────────────────────────────────────┘               │
│                          │                                              │
│             rclone (S3 asset promotion)                                 │
│             gravenhollow RustFS :30292                                  │
└──────────────────────────┼──────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│  gravenhollow.lab.daviestechlabs.io                                     │
│  (TrueNAS Scale · All-SSD · Dual 10GbE · 12.2 TB)                       │
│                                                                         │
│  NFS: /mnt/gravenhollow/kubernetes/avatar-models/                       │
│  ├── Seed-san.vrm    (default model)                                    │
│  ├── Companion-A.vrm (promoted from waterdeep)                          │
│  └── animations/     (shared animation clips)                           │
│                                                                         │
│  S3 (RustFS): avatar-models bucket                                      │
│  (same data, served via Cloudflare Tunnel for remote users)             │
└──────────────────────────┬──────────────────────────────────────────────┘
              ┌────────────┴───────────────┐
              │                            │
     NFS (nfs-fast PVC)            Cloudflare Tunnel
              │               (assets.daviestechlabs.io)
              ▼                            │
┌──────────────────────────┐               ▼
│ companions-frontend      │  ┌──────────────────────────┐
│ (Kubernetes pod)         │  │ Remote users (CDN-cached │
│ LAN users                │  │ via Cloudflare edge)     │
└──────────────────────────┘  └──────────────────────────┘
```
## Implementation Plan
### 1. Install Blender and Add-ons
```bash
# Install Blender via Homebrew
brew install --cask blender
# Download BlenderMCP add-on
curl -LO https://raw.githubusercontent.com/ahujasid/blender-mcp/main/addon.py
# Install in Blender:
# Edit > Preferences > Add-ons > Install... > select addon.py
# Enable "Interface: Blender MCP"
# Install VRM Add-on for Blender:
# Download from https://vrm-addon-for-blender.info/en/
# Edit > Preferences > Add-ons > Install... > select VRM add-on zip
# Enable "Import-Export: VRM"
```
### 2. VS Code MCP Configuration
```json
// .vscode/mcp.json (in companions-frontend or global settings)
{
  "servers": {
    "blender": {
      "command": "uvx",
      "args": ["blender-mcp"],
      "env": {
        "BLENDER_HOST": "localhost",
        "BLENDER_PORT": "9876",
        "DISABLE_TELEMETRY": "true"
      }
    }
  }
}
```
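The MCP server can only reach Blender once the add-on's socket is listening on the host and port configured above. A minimal sketch of a reachability check follows, assuming bash (it relies on bash's `/dev/tcp` redirection); `port_open` and `blender_socket_status` are hypothetical helper names, not part of BlenderMCP.

```bash
# Hypothetical helpers: report whether anything accepts TCP connections on
# the BlenderMCP host/port. Defaults mirror the values in mcp.json.

port_open() {
  local host="$1" port="$2"
  # bash treats /dev/tcp/<host>/<port> as a TCP connect attempt; the
  # subshell closes the descriptor again as soon as it exits.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

blender_socket_status() {
  local host="${BLENDER_HOST:-localhost}" port="${BLENDER_PORT:-9876}"
  if port_open "$host" "$port"; then
    echo "listening on ${host}:${port}"
  else
    echo "nothing listening on ${host}:${port}"
    return 1
  fi
}
```

Running `blender_socket_status` before opening VS Code gives a quick signal as to whether the add-on still needs to be started from Blender's sidebar.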
### 3. Python Environment for BlenderMCP
```bash
# Install uv (per ADR-0012)
curl -LsSf https://astral.sh/uv/install.sh | sh
# uvx handles the BlenderMCP server environment automatically
# Verify it works:
uvx blender-mcp --help
```
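Because none of this tooling is managed by GitOps, a small preflight check can catch a missing binary before a session starts. The sketch below is an illustration only; `missing_tools` is a hypothetical helper name and the tool list passed to it is up to the caller.

```bash
# Hypothetical helper: print each named command that is not on PATH and
# return non-zero if any of them is missing.
missing_tools() {
  local tool rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      rc=1
    fi
  done
  return "$rc"
}

# Example call for the tools installed in this step:
#   missing_tools uv uvx
```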
### 4. rclone for Asset Promotion
Use rclone to promote finished VRM exports to gravenhollow's RustFS S3 endpoint. This is consistent with the promotion pattern from [ADR-0063](0063-comfyui-3d-avatar-pipeline.md) and avoids macOS NFS/autofs complexity.
```bash
# Install rclone
brew install rclone
# Configure gravenhollow RustFS endpoint
rclone config create gravenhollow s3 \
provider=Other \
endpoint=https://gravenhollow.lab.daviestechlabs.io:30292 \
access_key_id=<key> \
secret_access_key=<secret>
# Promote a finished VRM
rclone copy ~/blender-avatars/exports/Companion-A.vrm gravenhollow:avatar-models/
# Copy all exports (idempotent; unlike `rclone sync`, `rclone copy` never deletes objects already in the bucket)
rclone copy ~/blender-avatars/exports/ gravenhollow:avatar-models/ --exclude "*.blend"
```
> **Why rclone over NFS?** macOS autofs/NFS mounts are fragile across reboots and network changes. rclone is a single binary, works over HTTPS, and matches the promotion pattern already used in Kasm workflows. The explicit `rclone copy` command also serves as a deliberate promotion gate — only intentionally promoted models reach production.
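The promotion gate described above could be made a little stricter with a thin wrapper. This is a sketch under assumptions, not an existing script: `promote_vrm` and the `PROMOTE_CONFIRM` variable are hypothetical names, and the default remote is the `gravenhollow:avatar-models/` target used in this section. Without confirmation it only asks rclone for a dry run.

```bash
# Hypothetical wrapper around `rclone copy`. It refuses anything that is not
# an existing .vrm file, and uploads only when PROMOTE_CONFIRM=yes is set;
# otherwise rclone is invoked with --dry-run and nothing is transferred.
promote_vrm() {
  local file="$1" remote="${2:-gravenhollow:avatar-models/}"
  case "$file" in
    *.vrm) ;;
    *) echo "refusing to promote non-VRM file: $file" >&2; return 2 ;;
  esac
  if [ ! -f "$file" ]; then
    echo "no such file: $file" >&2
    return 1
  fi
  if [ "${PROMOTE_CONFIRM:-no}" = "yes" ]; then
    rclone copy "$file" "$remote"
  else
    rclone copy --dry-run "$file" "$remote"
  fi
}
```

For example, `promote_vrm ~/blender-avatars/exports/Companion-A.vrm` previews the transfer, and prefixing the call with `PROMOTE_CONFIRM=yes` performs it.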
### 5. Avatar Creation Workflow (waterdeep)
1. **Open Blender** on waterdeep (native Metal-accelerated)
2. **Enable BlenderMCP** → 3D View sidebar → "BlenderMCP" tab → click "Connect"
3. **Open VS Code** with Copilot agent mode — BlenderMCP server starts automatically
4. **Create avatars** using AI-assisted prompts:
- _"Create an anime-style character with silver hair and a mage outfit"_
- _"Apply metallic blue material to the staff"_
- _"Rig this character for VRM export with standard humanoid bones"_
- _"Export as VRM to ~/blender-avatars/exports/Silver-Mage.vrm"_
5. **Preview** in real-time — Metal GPU renders Eevee viewport at 60fps
6. **Promote** the finished VRM to gravenhollow via rclone:
```bash
rclone copy ~/blender-avatars/exports/Silver-Mage.vrm gravenhollow:avatar-models/
```
7. **Register** in companions-frontend — update `AllowedAvatarModels` in Go and JS allowlists, commit
### 6. Workflow Comparison: waterdeep vs Kasm
| Aspect | waterdeep (local) | Kasm (browser) |
|--------|-------------------|----------------|
| **GPU rendering** | Metal 16-core GPU — Eevee real-time, Cycles GPU | CPU-only software rendering |
| **Viewport FPS** | 60fps (Metal) | 5-15fps (CPU rasterisation) |
| **MCP latency** | localhost socket — sub-millisecond | Network hop to Kasm container |
| **Memory** | 48 GB unified, shared with GPU | Limited by Kasm container allocation |
| **Sculpting** | Smooth, hardware-accelerated | Laggy, CPU-bound |
| **Asset promotion** | rclone to gravenhollow RustFS S3 | Auto rclone to Quobyte S3 → manual promote to gravenhollow |
| **Access** | Local only (waterdeep physical/VNC) | Any browser, anywhere |
| **Setup** | Homebrew + manual add-on install | Pre-baked in Kasm image |
| **Use when** | Primary creation workflow | Remote access, quick edits, mobile |
## Security Considerations
* BlenderMCP's `execute_blender_code` runs arbitrary Python in Blender — review AI-generated code before execution, especially file I/O operations
* Telemetry disabled via `DISABLE_TELEMETRY=true` in MCP server config
* BlenderMCP socket (port 9876) bound to localhost — not exposed to the network
* rclone traffic to gravenhollow traverses the LAN over HTTPS, and VRM files contain no sensitive data
* waterdeep has no cluster access — compromise doesn't impact Kubernetes workloads
* `.blend` source files stay local on waterdeep; only finished VRM exports are promoted to gravenhollow
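
The localhost-only binding of port 9876 can be spot-checked rather than assumed. The sketch below parses the output of `lsof -nP -iTCP:9876 -sTCP:LISTEN`; `listeners_loopback_only` is a hypothetical helper name, and the column layout it relies on (address in the second-to-last field) is the usual lsof format rather than something this ADR specifies.

```bash
# Hypothetical helper: read `lsof -nP -iTCP:<port> -sTCP:LISTEN` output on
# stdin and succeed only if at least one listener exists and every listener
# is bound to a loopback address.
listeners_loopback_only() {
  awk '
    NR == 1 { next }                 # skip the lsof header row
    {
      addr = $(NF - 1)               # e.g. 127.0.0.1:9876 or *:9876
      seen = 1
      if (addr !~ /^(127\.[0-9.]+|\[::1\]):[0-9]+$/) exposed = 1
    }
    END { exit (seen && !exposed) ? 0 : 1 }
  '
}

# Example:
#   lsof -nP -iTCP:9876 -sTCP:LISTEN | listeners_loopback_only && echo "loopback only"
```

A non-zero exit means either that nothing is listening or that a listener is bound to a non-loopback address such as `*:9876`.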
## Future Considerations
* **DGX Spark** ([ADR-0058](0058-training-strategy-cpu-dgx-spark.md)): When acquired, DGX Spark handles training; waterdeep remains the 3D creation workstation
* **Blender + MLX**: Apple's MLX framework could power local AI-generated textures or mesh deformation directly in Blender — worth evaluating as Blender add-ons mature
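* **Automated promotion**: A file watcher (fswatch/launchd) could auto-run `rclone copy` when a new VRM appears in `~/blender-avatars/exports/`; `rclone sync` is unsuitable here because it deletes bucket objects that are missing locally, such as `Seed-san.vrm`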
* **VRM validation**: Add a pre-promotion check script that validates VRM humanoid rig completeness, expression morphs, and viseme shapes before copying to gravenhollow
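
As a starting point for the pre-promotion check, the sketch below covers only the cheapest part: confirming that a file is a binary glTF container that declares a VRM extension. It does not inspect the humanoid rig, expression morphs, or visemes, which would need a real glTF parser. `looks_like_vrm` is a hypothetical helper name.

```bash
# Hypothetical pre-promotion sanity check. It does NOT validate bones,
# expressions, or visemes; it only rejects files that are obviously not VRM.
looks_like_vrm() {
  local file="$1"
  [ -f "$file" ] || return 1
  # Binary glTF (GLB) files start with the 4-byte ASCII magic "glTF".
  [ "$(head -c 4 "$file")" = "glTF" ] || return 1
  # VRM 0.x declares a "VRM" extension; VRM 1.0 declares "VRMC_vrm".
  LC_ALL=C grep -a -q -e '"VRMC_vrm"' -e '"VRM"' "$file"
}

# Example:
#   looks_like_vrm ~/blender-avatars/exports/Companion-A.vrm || echo "not a VRM"
```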
## Links
* Related: [ADR-0063](0063-comfyui-3d-avatar-pipeline.md) — ComfyUI image-to-3D avatar pipeline (supersedes ADR-0062)
* Related: [ADR-0062](0062-blender-mcp-3d-avatar-workflow.md) — BlenderMCP 3D avatar workflow (superseded)
* Related: [ADR-0046](0046-companions-frontend-architecture.md) — Companions frontend architecture (Three.js + VRM avatars)
* Related: [ADR-0026](0026-storage-strategy.md) — Storage strategy (gravenhollow NFS-fast)
* Related: [ADR-0037](0037-node-naming-conventions.md) — Node naming conventions (waterdeep)
* Related: [ADR-0012](0012-use-uv-for-python-development.md) — uv for Python development
* [BlenderMCP GitHub](https://github.com/ahujasid/blender-mcp)
* [Blender Metal GPU Rendering](https://docs.blender.org/manual/en/latest/render/cycles/gpu_rendering.html)
* [VRM Add-on for Blender](https://vrm-addon-for-blender.info/en/)
* [@pixiv/three-vrm](https://github.com/pixiv/three-vrm)