Companions Frontend Architecture
- Status: accepted
- Date: 2026-02-09
- Deciders: Billy
- Technical Story: Design the primary user interface for the AI/ML platform, supporting real-time chat, voice, and 3D avatar interactions
Context and Problem Statement
The homelab AI platform needs a web interface for users to interact with chat (RAG + LLM), voice (STT → LLM → TTS), and embedding services. The interface must support real-time streaming responses, WebSocket connections for NATS message bus integration, and an engaging visual experience.
How do we build a performant, maintainable frontend that integrates with the NATS-based backend without a heavy JavaScript framework build step?
Decision Drivers
- Real-time streaming for chat and voice (WebSocket required)
- Direct integration with NATS JetStream (binary MessagePack protocol)
- Minimal client-side JavaScript (~20KB gzipped target)
- No frontend build step (no webpack/vite/node required)
- 3D avatar rendering for immersive experience
- OAuth integration with multiple providers
- Single binary deployment (Go)
Considered Options
- Go + HTMX + Alpine.js + Three.js — Server-rendered with minimal JS
- Next.js / React SPA — Full JavaScript framework
- SvelteKit — Compiled JS framework
- Go + Templ + raw WebSocket — Pure Go templates, no JS framework
Decision Outcome
Chosen option: Option 1 - Go + HTMX + Alpine.js + Three.js, because it provides a zero-build-step frontend with server-rendered HTML, minimal JavaScript, and rich 3D avatar support, all served from a single Go binary.
Positive Consequences
- Single binary deployment — Go server serves everything
- ~20KB gzipped core JS payload target (CDN-served HTMX + Alpine.js); Three.js loads on top of that and is substantially larger
- No npm, no webpack, no build step — assets served directly
- Server-side rendering via Go templates
- WebSocket handled natively in Go (gorilla/websocket)
- NATS integration with MessagePack in the same binary
- Distroless container image for minimal attack surface
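To make the server-rendered, no-build-step pattern concrete, here is a minimal sketch of a Go handler that returns a full page on normal navigation and only an HTML fragment when HTMX asks for one. HTMX marks its requests with the `HX-Request: true` header. The template names, the `/messages` path, the `render` helper, and the placeholder messages are assumptions made for illustration, not the project's actual code.

```go
package main

import (
	"fmt"
	"html/template"
	"net/http"
	"net/http/httptest"
)

// tmpl holds two named templates: "messages" is the fragment HTMX swaps in,
// "page" wraps the same fragment in a full document. Names are illustrative.
var tmpl = template.Must(template.New("root").Parse(`
{{define "messages"}}<ul id="messages">{{range .}}<li>{{.}}</li>{{end}}</ul>{{end}}
{{define "page"}}<!doctype html><html><body>{{template "messages" .}}</body></html>{{end}}
`))

// messagesHandler renders the fragment for HTMX requests and the full page otherwise.
func messagesHandler(w http.ResponseWriter, r *http.Request) {
	msgs := []string{"hello", "<b>escaped</b>"} // placeholder data
	name := "page"
	if r.Header.Get("HX-Request") == "true" {
		name = "messages"
	}
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	if err := tmpl.ExecuteTemplate(w, name, msgs); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

// render runs the handler in-process so the sketch needs no open port.
func render(htmx bool) string {
	req := httptest.NewRequest(http.MethodGet, "/messages", nil)
	if htmx {
		req.Header.Set("HX-Request", "true")
	}
	rec := httptest.NewRecorder()
	messagesHandler(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println("fragment:", render(true))
	fmt.Println("page:    ", render(false))
}
```

Because `html/template` escapes values by context, user-supplied chat text such as `<b>escaped</b>` is emitted as inert text rather than markup, which is one reason server-side rendering suits this design.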
Negative Consequences
- Three.js adds complexity for 3D avatar rendering
- HTMX pattern less familiar to developers expecting React/Vue
- Limited client-side state management (by design)
Technology Stack
| Layer | Technology | Purpose |
|---|---|---|
| Server | Go 1.25 | HTTP server, WebSocket, NATS client, OAuth |
| Templates | Go html/template | Server-side HTML rendering |
| Interactivity | HTMX 2.0 | AJAX, WebSocket, server-sent events |
| Client state | Alpine.js 3 | Lightweight reactive UI for local state |
| 3D Avatars | Three.js + VRM | 3D character rendering with lip-sync |
| Styling | Tailwind CSS 4 + DaisyUI | Utility-first CSS with component library |
| Messaging | NATS JetStream | Real-time pub/sub with MessagePack encoding |
| Auth | golang-jwt/jwt/v5 | JWT token handling for OAuth flows |
| Database | PostgreSQL (lib/pq) + SQLite | Persistent + local session storage |
| Observability | OpenTelemetry SDK | Traces, metrics via OTLP gRPC |
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Browser │
│ │
│ HTMX (server-rendered HTML) ←→ Go Server (WebSocket) │
│ Alpine.js (local UI state) │
│ Three.js (VRM 3D avatars with lip-sync) │
└───────────────────────┬─────────────────────────────────────────┘
│ HTTP/WebSocket
▼
┌─────────────────────────────────────────────────────────────────┐
│ Go Server (single binary) │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Routes │ │ OAuth │ │WebSocket │ │ OTEL │ │
│ │ (HTTP) │ │ Handlers │ │ Hub │ │ Tracing │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ │ │ │ │
│ └──────────────┴────────────┘ │
│ │ │
│ ┌─────────┴─────────┐ │
│ │ NATS Client │ │
│ │ (JetStream + │ │
│ │ MessagePack) │ │
│ └─────────┬─────────┘ │
└────────────────────────┼────────────────────────────────────────┘
│
┌────────────┴────────────────┐
▼ ▼
┌──────────────────┐ ┌──────────────────┐
│ NATS JetStream │ │ Ray Serve │
│ ai.chat.* │ │ (STT, TTS, LLM, │
│ ai.voice.* │ │ Embeddings) │
└──────────────────┘ └──────────────────┘
Key Features
| Feature | Implementation |
|---|---|
| Real-time chat | WebSocket → NATS pub/sub per-user channels |
| Voice assistant | Streaming STT → LLM → TTS via Ray Serve endpoints |
| 3D avatars | VRM models rendered in Three.js with audio-driven lip-sync |
| OAuth login | Google, Discord, GitHub, Twitch + Authentik OIDC |
| RAG search | Milvus vector search for premium users |
| Session state | PostgreSQL (CNPG) for persistent data, SQLite for local cache |
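The per-user channels in the first row rely on NATS subject hierarchies and wildcards. As a rough illustration, the sketch below builds a per-user subject under `ai.chat` and checks it against wildcard patterns using NATS matching rules: `*` matches exactly one token and `>` matches one or more trailing tokens. The `ai.chat.<userID>` layout and both function names are assumptions; the ADR only shows the `ai.chat.*` and `ai.voice.*` prefixes.

```go
package main

import (
	"fmt"
	"strings"
)

// chatSubject builds a per-user chat subject. The layout is assumed.
func chatSubject(userID string) string {
	return "ai.chat." + userID
}

// subjectMatches reports whether subject matches pattern under NATS
// wildcard rules. It does not validate that the subject itself is legal.
func subjectMatches(pattern, subject string) bool {
	p := strings.Split(pattern, ".")
	s := strings.Split(subject, ".")
	for i, tok := range p {
		if tok == ">" {
			// ">" must be the last pattern token and cover at least one subject token.
			return i == len(p)-1 && len(s) > i
		}
		if i >= len(s) || (tok != "*" && tok != s[i]) {
			return false
		}
	}
	return len(p) == len(s)
}

func main() {
	subj := chatSubject("alice")
	for _, pattern := range []string{"ai.chat.*", "ai.>", "ai.voice.*"} {
		fmt.Printf("%-12s matches %s: %v\n", pattern, subj, subjectMatches(pattern, subj))
	}
}
```

One consequence of token-based matching is that a user ID containing a dot would span several tokens and fall outside `ai.chat.*`, so IDs would need to be restricted or encoded before being placed in a subject.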
Kubernetes Deployment
| Setting | Value |
|---|---|
| Namespace | ai-ml |
| Replicas | 1 |
| Image | ghcr.io/billy-davies-2/companions-frontend (distroless) |
| Resources | 50m/128Mi request → 500m/512Mi limit |
OTEL sidecar: otel/opentelemetry-collector-contrib:0.145.0 exports traces to ClickStack.
Backend routing: All AI inference requests (STT, TTS, LLM, embeddings, reranking) route to Ray Serve at ai-inference-serve-svc.ai-ml.svc.cluster.local:8000. Auxiliary HTTPRoutes in the auxiliary kustomization provide direct model endpoint access at embeddings.lab, whisper.lab, tts.lab, llm.lab, reranker.lab.
Access: companions-chat.lab.daviestechlabs.io via envoy-internal with Authentik OIDC proxy auth.
Links
- Related to ADR-0003 (NATS messaging)
- Related to ADR-0004 (MessagePack encoding)
- Related to ADR-0011 (Ray Serve backend)
- Related to ADR-0028 (OAuth/OIDC)
- HTMX Documentation
- VRM Specification