# BlenderMCP for 3D Avatar Creation via Kasm Workstation
* Status: superseded by [ADR-0063](0063-comfyui-3d-avatar-pipeline.md)
* Date: 2026-02-21
* Deciders: Billy
* Technical Story: Enable AI-assisted 3D avatar creation for companions-frontend using BlenderMCP in a Kasm Blender workstation with VS Code, storing assets in S3, serving locally from gravenhollow NFS and remotely via Cloudflare-cached RustFS
## Context and Problem Statement
The companions-frontend serves VRM avatar models for its Three.js-based 3D character rendering (see [ADR-0046](0046-companions-frontend-architecture.md)). Today the avatar library is limited to three models (`Seed-san.vrm`, `Aka.vrm`, `Midori.vrm`) — only one of which actually ships in the repo — and every model must be sourced or hand-sculpted externally.
Creating custom VRM avatars is a manual, time-intensive process: open Blender, sculpt/rig a character, export to VRM, iterate. There is no integration between the AI coding workflow (VS Code / Copilot) and Blender, so context switching between the editor and the 3D tool is constant.
How do we streamline custom 3D avatar creation for companions-frontend with AI assistance, while keeping assets durable and accessible across workstations?
## Decision Drivers
* The existing avatar pipeline is manual and disconnected from the development workflow
* BlenderMCP (v1.5.5, 17k+ GitHub stars) bridges AI assistants to Blender via the Model Context Protocol — enabling prompt-driven 3D modelling, material control, scene manipulation, and code execution inside Blender
* Kasm Workspaces already run in the cluster (`productivity` namespace) and support Docker-in-Docker with volume plugins for persistent storage
* VS Code supports MCP servers natively (GitHub Copilot agent mode), meaning the same editor used for code can drive Blender scene creation
* Custom volume mounts in Kasm map `/s3` to S3-compatible storage via the rclone Docker volume plugin — providing durable, off-node persistence
* Quobyte S3-compatible endpoint with the `kasm` bucket is the existing Kasm storage backend
* VRM models must ultimately land in the companions-frontend `/assets/models/` path at build time or be served from an external URL
* Final production models and animations should live on gravenhollow (all-SSD TrueNAS, dual 10GbE) for fast local serving via NFS
* Remote users accessing companions-chat through Cloudflare Tunnel need a CDN-cached path for multi-MB VRM downloads
* Models are write-once/read-many — ideal for aggressive caching
* gravenhollow already runs RustFS (S3-compatible) — exposing it via Cloudflare Tunnel gives CDN caching without a separate storage tier
## Considered Options
1. **BlenderMCP in Kasm Blender workstation + VS Code MCP client, assets in Quobyte S3 (`kasm` bucket)**
2. **Local Blender + BlenderMCP on a developer laptop**
3. **Hyper3D / Rodin cloud generation only (no Blender)**
4. **Manual Blender workflow (status quo)**
## Decision Outcome
Chosen option: **Option 1 — BlenderMCP in Kasm Blender workstation + VS Code MCP client, assets in Quobyte S3**, because it integrates AI-assisted modelling directly into the existing Kasm + VS Code workflow, stores assets durably in S3, and requires no additional infrastructure beyond what is already deployed.
### Positive Consequences
* AI-assisted 3D modelling — prompt-driven creation, material application, and scene manipulation inside Blender via MCP
* Zero context switching — VS Code agent mode drives Blender commands through the same editor used for code
* Persistent storage — VRM exports written to `/s3` survive session teardown and are available from any Kasm session or CI pipeline
* Existing infrastructure — Kasm agent, DinD, rclone volume plugin, Quobyte S3, gravenhollow NFS, and Cloudflare are all already deployed
* No image rebuild for new models — VRM files live on gravenhollow NFS, mounted read-only into the pod; add a model and update the allowlist
* LAN performance — all-SSD NFS with dual 10GbE delivers VRM files in <100ms
* Remote performance — RustFS exposed through Cloudflare Tunnel with CDN caching at 300+ global PoPs; no separate storage tier needed
* Poly Haven / Hyper3D integration — BlenderMCP supports downloading Poly Haven assets and generating models via Hyper3D Rodin, expanding the asset library
* VRM ecosystem — Blender VRM add-on exports directly to VRM 0.x/1.0 format consumed by `@pixiv/three-vrm` in companions-frontend
* Reproducible — Kasm workspace images are versioned; Blender + add-ons are pre-baked
### Negative Consequences
* BlenderMCP `execute_blender_code` tool runs arbitrary Python in Blender — must trust AI-generated code or review before execution
* Socket-based communication (TCP 9876) between the MCP server and Blender add-on adds a failure mode
* VRM export quality depends on correct rigging/weight painting — AI can scaffold but manual touch-up may still be needed
* Kasm Blender image must be configured with both the BlenderMCP add-on and the VRM add-on pre-installed
* Telemetry is on by default in BlenderMCP — must disable via `DISABLE_TELEMETRY=true` for privacy
* Cache misses from remote users hit gravenhollow via the tunnel — negligible with immutable files and long TTLs
## Architecture
```
┌─────────────────────────────────────────────────────────────────────────┐
│  Developer Workstation                                                  │
│                                                                         │
│  ┌──────────────────────────────────┐                                   │
│  │  VS Code (local)                 │                                   │
│  │                                  │                                   │
│  │  GitHub Copilot (agent mode)     │                                   │
│  │         │                        │                                   │
│  │         ▼                        │                                   │
│  │  BlenderMCP Server (MCP)         │                                   │
│  │  (uvx blender-mcp)               │                                   │
│  │         │                        │                                   │
│  └─────────┼────────────────────────┘                                   │
│            │ TCP :9876 (JSON over socket)                               │
└────────────┼────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│  Kasm Blender Workstation (browser session)                             │
│  kasm.daviestechlabs.io                                                 │
│                                                                         │
│  ┌──────────────────────────────────────────────────────┐               │
│  │  Blender 4.x                                         │               │
│  │                                                      │               │
│  │  Add-ons:                                            │               │
│  │    • BlenderMCP (addon.py) - socket server :9876     │               │
│  │    • VRM Add-on for Blender - import/export VRM      │               │
│  │                                                      │               │
│  │  ┌────────────────────────────────────────────────┐  │               │
│  │  │  /s3/blender-avatars/                          │  │               │
│  │  │  ├── projects/  (.blend source files)          │  │               │
│  │  │  ├── exports/   (.vrm exported models)         │  │               │
│  │  │  └── textures/  (shared texture lib)           │  │               │
│  │  └────────────────────────────────────────────────┘  │               │
│  └──────────────────────────────────────────────────────┘               │
│                          │                                              │
│                    rclone volume                                        │
│                     plugin (S3)                                         │
└──────────────────────────┼──────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│  Quobyte S3 Endpoint                                                    │
│  Bucket: kasm                                                           │
│                                                                         │
│  kasm/blender-avatars/projects/Companion-A.blend                        │
│  kasm/blender-avatars/exports/Companion-A.vrm                           │
│  kasm/blender-avatars/textures/skin-tone-01.png                         │
└──────────────────────────┬──────────────────────────────────────────────┘
                rclone sync (promotion)
┌─────────────────────────────────────────────────────────────────────────┐
│  gravenhollow.lab.daviestechlabs.io                                     │
│  (TrueNAS Scale · All-SSD · Dual 10GbE · 12.2 TB)                       │
│                                                                         │
│  NFS: /mnt/gravenhollow/kubernetes/avatar-models/                       │
│  ├── Seed-san.vrm     (default model)                                   │
│  ├── Aka.vrm          (Legend tier)                                     │
│  ├── Midori.vrm       (Legend tier)                                     │
│  ├── Companion-A.vrm  (custom, promoted from Kasm S3)                   │
│  └── animations/      (shared animation clips)                          │
│                                                                         │
│  S3 (RustFS): avatar-models bucket                                      │
│  (same data as NFS dir, served via S3 API for Cloudflare Tunnel)        │
└──────────┬─────────────────────────────────┬────────────────────────────┘
           │                                 │
 NFS mount (nfs-fast)             S3 API (RustFS :30292)
    for pod volume                 via Cloudflare Tunnel
           │                                 │
           ▼                                 ▼
┌──────────────────────────┐   ┌──────────────────────────────────────────┐
│  companions-frontend     │   │  Cloudflare Tunnel + CDN                 │
│  (Kubernetes pod)        │   │                                          │
│                          │   │  assets.daviestechlabs.io                │
│  /models/ volume mount   │   │    → envoy-external                      │
│  (nfs-fast PVC, RO)      │   │    → avatar-assets-svc (in-cluster)      │
│                          │   │    → gravenhollow RustFS :30292          │
│  Go FileServer:          │   │                                          │
│    /assets/models/ →     │   │  Cloudflare CDN caches at 300+ PoPs      │
│    serves from PVC       │   │  Cache-Control: public, max-age=31536000 │
│                          │   │  (immutable, versioned filenames)        │
└──────────┬───────────────┘   └──────────────────────┬───────────────────┘
           │                                          │
      LAN clients                              Remote clients
companions-chat.lab...                       companions-chat via
(envoy-internal, direct)                      Cloudflare Tunnel
           │                                          │
           └──────────────────┬───────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│  Browser (Three.js)                                                     │
│  AvatarManager.loadModel('/assets/models/Companion-A.vrm')              │
│                                                                         │
│  LAN:    fetch from companions-frontend pod (NFS-backed, ~10GbE)        │
│  Remote: fetch from assets.daviestechlabs.io (Cloudflare CDN-cached)    │
└─────────────────────────────────────────────────────────────────────────┘
```
## Workflow
### 1. Kasm Workspace Setup
The Kasm Blender workspace image is configured with:
| Component | Version | Purpose |
|-----------|---------|---------|
| Blender | 4.x | 3D modelling and sculpting |
| BlenderMCP add-on (`addon.py`) | 1.5.5 | Socket server for MCP commands |
| VRM Add-on for Blender | latest | Import/export VRM format |
| Python | 3.10+ | Blender scripting runtime |
The Kasm storage mapping mounts `/s3` via the rclone Docker volume plugin to the Quobyte S3 endpoint (`kasm` bucket). The sub-path `blender-avatars/` is used for all 3D asset work.
### 2. VS Code MCP Configuration
Add BlenderMCP as an MCP server in VS Code (`.vscode/mcp.json` or user settings):
```json
{
  "servers": {
    "blender": {
      "command": "uvx",
      "args": ["blender-mcp"],
      "env": {
        "BLENDER_HOST": "localhost",
        "BLENDER_PORT": "9876",
        "DISABLE_TELEMETRY": "true"
      }
    }
  }
}
```
When the Kasm session is accessed remotely, set `BLENDER_HOST` to the Kasm workstation's reachable address.
### 3. Avatar Creation Workflow
1. **Launch** the Kasm Blender workspace via `kasm.daviestechlabs.io`
2. **Enable** the BlenderMCP add-on in Blender → 3D View sidebar → "BlenderMCP" tab → "Connect to Claude"
3. **Open VS Code** with Copilot agent mode and the BlenderMCP MCP server running
4. **Prompt** the AI to create or modify avatars:
- _"Create a humanoid character with anime-style proportions, blue hair, and a fantasy outfit"_
- _"Apply a metallic gold material to the armor pieces"_
- _"Set up the lighting for a character showcase render"_
- _"Rig this character for VRM export with standard humanoid bones"_
5. **Export** the finished model to VRM via the VRM add-on (or via BlenderMCP `execute_blender_code` calling the VRM export operator)
6. **Save** the `.vrm` to `/s3/blender-avatars/exports/` and the `.blend` source to `/s3/blender-avatars/projects/`
7. **Import** the VRM into companions-frontend — copy to `assets/models/`, update the allowlists in `internal/database/database.go` and `static/js/avatar.js`
### 4. Asset Pipeline (Kasm S3 → gravenhollow → production)
| Stage | Action |
|-------|--------|
| **Create** | AI-assisted modelling + VRM export in Kasm Blender → `/s3/blender-avatars/exports/*.vrm` |
| **Store** | rclone syncs `/s3` to Quobyte S3 `kasm` bucket automatically |
| **Promote** | `rclone copy quobyte:kasm/blender-avatars/exports/Model.vrm gravenhollow-nfs:/avatar-models/` (manual or CI) |
| **Register** | Add model path to `AllowedAvatarModels` in Go and JS allowlists, commit to repo |
| **Deploy** | Flux rolls out updated companions-frontend config; model already available on NFS PVC — no image rebuild needed |
| **CDN** | Model immediately available via `assets.daviestechlabs.io` — Cloudflare Tunnel proxies to RustFS, CDN caches at edge |
### 5. Deployment and Storage Architecture
#### Local Serving (LAN users)
Companions-frontend currently serves VRM models via `http.FileServer(http.Dir("assets"))` from the container filesystem. This bakes models into the image and requires a rebuild to add new avatars.
The new approach mounts avatar models from gravenhollow via an `nfs-fast` PVC:
```yaml
# PersistentVolumeClaim for avatar models
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: avatar-models
  namespace: ai-ml
spec:
  storageClassName: nfs-fast
  accessModes: [ReadOnlyMany]
  resources:
    requests:
      storage: 10Gi
```
The pod mounts this PVC at `/models` and the Go server serves it at `/assets/models/`:
```go
// Replace embedded assets with NFS-backed volume
mux.Handle("/assets/models/", http.StripPrefix("/assets/models/",
	http.FileServer(http.Dir("/models"))))
```
Benefits:
- **No image rebuild** to add/update models — write to gravenhollow NFS, pod sees it immediately (with `actimeo=600` cache, within 10 minutes)
- **All-SSD + dual 10GbE**: VRM files (typically 5-30 MB) load in <100ms on LAN
- **ReadOnlyMany** — multiple replicas can share the same PVC
- Source `.blend` files and textures remain on Quobyte S3 (Kasm bucket) for the creation workflow; only promoted VRM exports land on gravenhollow
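The LAN latency figure can be sanity-checked with simple arithmetic. The sketch below assumes a single 10 Gb/s link, decimal units, and two illustrative file sizes (5 MB and 30 MB); it ignores protocol overhead, NFS round trips, and disk latency, so it gives a lower bound rather than a measurement. `transferMillis` is a hypothetical helper.

```go
package main

import "fmt"

// transferMillis returns the ideal wire time in milliseconds for sizeMB
// megabytes over a link of linkGbps gigabits per second (decimal units).
// Protocol overhead, NFS round trips, and disk latency are ignored.
func transferMillis(sizeMB, linkGbps float64) float64 {
	bits := sizeMB * 8e6
	return bits / (linkGbps * 1e9) * 1000
}

func main() {
	// Illustrative sizes only; real VRM exports vary.
	for _, sizeMB := range []float64{5, 30} {
		fmt.Printf("%4.0f MB over 10 Gb/s: %.0f ms\n", sizeMB, transferMillis(sizeMB, 10))
	}
}
```

Under these assumptions the wire time is about 4 ms for 5 MB and 24 ms for 30 MB, which leaves headroom under 100 ms for real-world overhead.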
#### Remote Serving (Cloudflare-cached RustFS)
Companions-chat is accessed externally via Cloudflare Tunnel → `envoy-external`. Rather than duplicating assets to a separate storage tier (e.g., Cloudflare R2), gravenhollow's RustFS S3 endpoint is exposed through the same Cloudflare Tunnel under a dedicated hostname. Cloudflare's CDN caches responses at edge PoPs; because VRM files are immutable with year-long TTLs, nearly all requests should be served from cache once a PoP has fetched a given file.
| Aspect | Detail |
|---|---|
| **Origin** | gravenhollow RustFS `avatar-models` bucket (`:30292`, same data as NFS dir) |
| **Public hostname** | `assets.daviestechlabs.io` (Cloudflare DNS, orange-clouded) |
| **Tunnel routing** | Cloudflare Tunnel → `envoy-external` → `avatar-assets-svc` → gravenhollow RustFS |
| **CDN caching** | Cloudflare CDN caches at 300+ global PoPs; `Cache-Control: public, max-age=31536000, immutable` |
| **Egress** | Cloudflare-proxied traffic has no bandwidth surcharge |
| **Auth** | Public read (models are not sensitive); RustFS write credentials stay internal |
| **No sync needed** | Single source of truth — NFS and RustFS serve the same data from gravenhollow |
##### In-Cluster Proxy Service
An ExternalName or Endpoints service proxies cluster traffic to gravenhollow's RustFS endpoint so the HTTPRoute can reference it:
```yaml
# Service pointing to gravenhollow RustFS for avatar assets
apiVersion: v1
kind: Service
metadata:
  name: avatar-assets
  namespace: ai-ml
spec:
  type: ExternalName
  externalName: gravenhollow.lab.daviestechlabs.io
  ports:
    - port: 30292
      protocol: TCP
```
##### HTTPRoute (Cloudflare Tunnel → RustFS)
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: avatar-assets
  namespace: ai-ml
  annotations:
    external-dns.alpha.kubernetes.io/hostname: assets.daviestechlabs.io
spec:
  hostnames:
    - assets.daviestechlabs.io
  parentRefs:
    - name: envoy-external
      namespace: network
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /avatar-models/
      backendRefs:
        - name: avatar-assets
          port: 30292
      filters:
        - type: ResponseHeaderModifier
          responseHeaderModifier:
            set:
              - name: Cache-Control
                value: "public, max-age=31536000, immutable"
              - name: Access-Control-Allow-Origin
                value: "https://companions-chat.daviestechlabs.io"
```
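Cloudflare Tunnel picks up `assets.daviestechlabs.io` via the existing wildcard ingress rule (`*.daviestechlabs.io → envoy-external`). The CDN honours the `Cache-Control` header, so after the first request per PoP, subsequent loads are served from Cloudflare's edge. Note that Cloudflare's default caching is keyed on file extension and `.vrm` is not on the default list, so a Cache Rule marking this hostname (or path) as eligible for cache is likely required before the header takes effect.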
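When verifying the route, the header the origin returns is worth checking before relying on edge caching. The sketch below parses a `Cache-Control` value and reports whether it is public, immutable, and long-lived; `isLongLivedImmutable` is a hypothetical helper, and the one-year threshold simply mirrors the `max-age=31536000` set in the HTTPRoute.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isLongLivedImmutable reports whether a Cache-Control header value marks the
// response as public, immutable, and cacheable for at least minAge seconds.
func isLongLivedImmutable(header string, minAge int) bool {
	var public, immutable bool
	maxAge := -1
	for _, part := range strings.Split(header, ",") {
		d := strings.ToLower(strings.TrimSpace(part))
		switch {
		case d == "public":
			public = true
		case d == "immutable":
			immutable = true
		case strings.HasPrefix(d, "max-age="):
			if n, err := strconv.Atoi(strings.TrimPrefix(d, "max-age=")); err == nil {
				maxAge = n
			}
		}
	}
	return public && immutable && maxAge >= minAge
}

func main() {
	const oneYear = 31536000
	fmt.Println(isLongLivedImmutable("public, max-age=31536000, immutable", oneYear))
	fmt.Println(isLongLivedImmutable("public, max-age=600", oneYear))
}
```

In practice the input would be the `Cache-Control` header of a response fetched from `assets.daviestechlabs.io`.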
##### Client-Side Routing
The frontend detects whether the user is on LAN or remote and routes model fetches accordingly:
```javascript
// avatar.js — model URL resolution
function resolveModelURL(path) {
  // LAN users: serve from the Go server (NFS-backed, same origin)
  // Remote users: serve from Cloudflare-cached RustFS
  const isLAN = location.hostname.endsWith('.lab.daviestechlabs.io');
  if (isLAN) return path; // e.g. /assets/models/Companion-A.vrm
  return `https://assets.daviestechlabs.io/avatar-models/${path.split('/').pop()}`;
  // → https://assets.daviestechlabs.io/avatar-models/Companion-A.vrm
}
```
Alternatively, the Go server can set the model base URL via a template variable based on the `Host` header, keeping the logic server-side.
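A rough sketch of that server-side variant follows. It assumes the same `.lab.daviestechlabs.io` suffix check as the JavaScript version; `modelBaseURL` and the constant names are hypothetical, and how the result is passed to the template is left out.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

const (
	lanSuffix  = ".lab.daviestechlabs.io"
	localBase  = "/assets/models/"
	cdnBaseURL = "https://assets.daviestechlabs.io/avatar-models/"
)

// modelBaseURL picks the base URL for model fetches from a request's Host
// header value, mirroring the LAN check in resolveModelURL.
func modelBaseURL(host string) string {
	// Strip an optional port; SplitHostPort returns an error when none is present.
	if h, _, err := net.SplitHostPort(host); err == nil {
		host = h
	}
	if strings.HasSuffix(strings.ToLower(host), lanSuffix) {
		return localBase
	}
	return cdnBaseURL
}

func main() {
	fmt.Println(modelBaseURL("companions-chat.lab.daviestechlabs.io"))
	fmt.Println(modelBaseURL("companions-chat.daviestechlabs.io"))
}
```

A handler would call `modelBaseURL(r.Host)` and expose the result to the page, so the browser never needs to know which path it is on.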
#### Versioning Strategy
VRM files are immutable once promoted — updated models get a new filename (e.g., `Companion-A-v2.vrm`) rather than overwriting. This ensures:
- Cloudflare CDN cache never serves stale content
- Rollback is trivial — point the allowlist back to the previous version
- Browser `Cache-Control: immutable` works correctly
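The naming rule can be automated at promotion time. The sketch below assumes the convention implied by the example above: an unversioned file counts as the first revision, and later revisions carry a `-v<N>` suffix before `.vrm`. `nextVersionName` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// versionSuffix matches a trailing "-v<N>" on the base name (extension removed).
var versionSuffix = regexp.MustCompile(`^(.*)-v(\d+)$`)

// nextVersionName returns the filename for the next immutable revision.
// An unversioned name is treated as revision 1, so its successor is "-v2".
func nextVersionName(name string) string {
	base := strings.TrimSuffix(name, ".vrm")
	if m := versionSuffix.FindStringSubmatch(base); m != nil {
		if n, err := strconv.Atoi(m[2]); err == nil {
			return fmt.Sprintf("%s-v%d.vrm", m[1], n+1)
		}
	}
	return base + "-v2.vrm"
}

func main() {
	fmt.Println(nextVersionName("Companion-A.vrm"))
	fmt.Println(nextVersionName("Companion-A-v2.vrm"))
}
```

This prints `Companion-A-v2.vrm` and then `Companion-A-v3.vrm`; the previous file stays in place so rollback remains an allowlist change.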
#### Storage Tier Summary
| Location | Purpose | Tier | Access |
|----------|---------|------|--------|
| Quobyte S3 (`kasm` bucket) | Working files: `.blend`, textures, WIP exports | Kasm rclone volume | Kasm sessions only |
| gravenhollow NFS (`/avatar-models/`) | Production VRM models + animations | `nfs-fast` PVC (RO) | companions-frontend pod, LAN |
| gravenhollow RustFS S3 (`avatar-models`) | Same data as NFS, exposed to Cloudflare Tunnel for remote users | S3 API via HTTPRoute | Cloudflare CDN-cached, global |
## BlenderMCP Capabilities Used
| MCP Tool | Avatar Workflow Use |
|----------|-------------------|
| `get_scene_info` | Inspect current scene before modifications |
| `create_object` | Scaffold base meshes for characters |
| `modify_object` | Adjust proportions, positions, bone placement |
| `set_material` | Apply skin, hair, clothing materials |
| `execute_blender_code` | Run VRM export scripts, batch operations, custom rigging |
| `get_screenshot` | AI reviews viewport to understand current state |
| `poly_haven_download` | Fetch HDRIs, textures for environment/materials |
| `hyper3d_generate` | Generate base 3D models from text prompts via Hyper3D Rodin |
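Each tool call ends up as a JSON message on the TCP socket shown in the architecture diagram. The sketch below encodes one such message in Go. The `type`/`params` envelope is an assumption about the add-on's wire format and should be checked against the BlenderMCP source before use; `command` and `encodeCommand` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// command is the ASSUMED JSON envelope sent to the Blender add-on socket.
type command struct {
	Type   string         `json:"type"`
	Params map[string]any `json:"params"`
}

// encodeCommand serialises a command, substituting an empty object for nil params.
func encodeCommand(cmdType string, params map[string]any) ([]byte, error) {
	if params == nil {
		params = map[string]any{}
	}
	return json.Marshal(command{Type: cmdType, Params: params})
}

func main() {
	b, err := encodeCommand("get_scene_info", nil)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(b))
}
```

In normal use the MCP server builds these messages itself; a hand-rolled encoder is only useful for debugging the socket.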
## Security Considerations
* **Code execution:** BlenderMCP's `execute_blender_code` runs arbitrary Python in Blender. The Kasm session is sandboxed (DinD container with no cluster access), limiting blast radius. Always save before executing AI-generated code.
* **Telemetry:** BlenderMCP collects anonymous telemetry by default. Disabled via `DISABLE_TELEMETRY=true` in the MCP server config.
* **Network:** The TCP socket (port 9876) between the MCP server and Blender add-on is local to the session. If accessed remotely, ensure the connection is tunnelled or restricted.
* **S3 credentials:** rclone volume plugin credentials are managed via Kasm storage mappings and the existing `kasm-agent` ExternalSecret — no new secrets required.
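* **RustFS exposure:** The `avatar-models` RustFS bucket is exposed read-only through Cloudflare Tunnel, and RustFS write credentials remain internal. The HTTPRoute only routes the `/avatar-models/` path prefix to the bucket; as written it does not match on HTTP method, so blocking external writes depends on the bucket's anonymous policy being read-only (adding a `method: GET` match to the route would also enforce this at the gateway).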
* **Public assets:** Avatar models are public assets (served to any authenticated companions-chat user). No sensitive data in VRM files. CORS restricts to `companions-chat.daviestechlabs.io` origin.
* **Model allowlist:** Even though models are served from NFS/RustFS, the server-side and client-side allowlists in companions-frontend gate which models users can actually select. Uploading a VRM to gravenhollow does not make it available without a code change.
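The allowlist gate can be as small as an exact-match lookup. The sketch below is illustrative only: `allowedAvatarModels` stands in for the real `AllowedAvatarModels` list in `internal/database/database.go`, whose actual type and contents are not shown in this ADR, and `isAllowedModel` is a hypothetical helper.

```go
package main

import "fmt"

// allowedAvatarModels is an illustrative stand-in for the real allowlist.
var allowedAvatarModels = map[string]bool{
	"/assets/models/Seed-san.vrm":    true,
	"/assets/models/Aka.vrm":         true,
	"/assets/models/Midori.vrm":      true,
	"/assets/models/Companion-A.vrm": true,
}

// isAllowedModel accepts only exact matches. Paths are deliberately not
// normalised, so traversal variants such as "../" fail the lookup.
func isAllowedModel(path string) bool {
	return allowedAvatarModels[path]
}

func main() {
	fmt.Println(isAllowedModel("/assets/models/Companion-A.vrm"))
	fmt.Println(isAllowedModel("/assets/models/Unlisted.vrm"))
}
```

Exact matching keeps the gate easy to audit: a model file that exists on gravenhollow but is missing from the map is rejected.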
## Pros and Cons of the Options
### Option 1 — BlenderMCP in Kasm + VS Code + Quobyte S3 + gravenhollow (NFS + RustFS via Cloudflare)
* Good, because AI-assisted modelling reduces manual effort for avatar creation
* Good, because assets persist in S3 across sessions and are accessible from CI
* Good, because no new infrastructure — Kasm, rclone, Quobyte, gravenhollow, Cloudflare Tunnel are all already deployed
* Good, because VS Code MCP integration means one editor for code and 3D work
* Good, because Kasm sandboxes Blender execution away from the cluster
* Good, because NFS-fast serving decouples model assets from container images (no rebuild to add models)
* Good, because RustFS through Cloudflare Tunnel provides CDN caching with zero additional storage tiers — no R2 bucket, no sync CronJob, no extra credentials
* Good, because single source of truth — gravenhollow serves both LAN (NFS) and remote (RustFS → Cloudflare CDN) from the same data
* Good, because immutable versioned filenames enable aggressive caching and trivial rollback
* Good, because models are available to remote users immediately after promotion (no sync delay)
* Bad, because BlenderMCP is a third-party tool with arbitrary code execution
* Bad, because socket communication adds latency for remote Kasm sessions
* Bad, because VRM rigging quality may require manual adjustment after AI scaffolding
* Bad, because cache misses hit gravenhollow via the tunnel (negligible with immutable files + long TTLs)
### Option 2 — Local Blender + BlenderMCP on developer laptop
* Good, because lowest latency (everything local)
* Good, because no Kasm dependency
* Bad, because assets are local — no durable S3 storage without manual sync
* Bad, because Blender + add-ons must be installed on every dev machine
* Bad, because not reproducible across machines
### Option 3 — Hyper3D / Rodin cloud generation only
* Good, because no Blender installation needed
* Good, because fully prompt-driven model generation
* Bad, because limited control over output — no fine-tuning materials, rigging, or proportions
* Bad, because Hyper3D free tier has daily generation limits
* Bad, because generated models require post-processing for VRM compliance (humanoid rig, expressions, visemes)
* Bad, because vendor dependency for a core asset pipeline
### Option 4 — Manual Blender workflow (status quo)
* Good, because full manual control
* Good, because no new tooling
* Bad, because slow — no AI assistance for repetitive modelling tasks
* Bad, because no integration with the development workflow
* Bad, because assets stored ad-hoc with no structured pipeline to companions-frontend
## Links
* Related to [ADR-0046](0046-companions-frontend-architecture.md) (companions-frontend architecture — Three.js + VRM avatars)
* Related to [ADR-0026](0026-storage-strategy.md) (storage strategy — gravenhollow NFS-fast, Quobyte S3, rclone)
* Related to [ADR-0044](0044-dns-and-external-access.md) (DNS and external access — Cloudflare Tunnel, split-horizon)
* Related to [ADR-0049](0049-self-hosted-productivity-suite.md) (Kasm Workspaces)
* Related to [ADR-0059](0059-mac-mini-ray-worker.md) (waterdeep as local AI agent — primary 3D creation workstation with Metal GPU)
* [BlenderMCP GitHub](https://github.com/ahujasid/blender-mcp)
* [VRM Add-on for Blender](https://vrm-addon-for-blender.info/en/)
* [VRM Specification](https://vrm.dev/en/)
* [@pixiv/three-vrm](https://github.com/pixiv/three-vrm) (runtime loader used in companions-frontend)
* [Poly Haven](https://polyhaven.com/) (free 3D assets, HDRIs, textures)
* [Hyper3D Rodin](https://hyper3d.ai/) (AI 3D model generation)
* [Cloudflare Tunnel Docs](https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/)
* [Cloudflare CDN Cache Rules](https://developers.cloudflare.com/cache/)