# Companions Frontend Architecture

* Status: accepted
* Date: 2026-02-09
* Deciders: Billy
* Technical Story: Design the primary user interface for the AI/ML platform, supporting real-time chat, voice, and 3D avatar interactions

## Context and Problem Statement

The homelab AI platform needs a web interface for users to interact with chat (RAG + LLM), voice (STT → LLM → TTS), and embedding services. The interface must support real-time streaming responses, WebSocket connections for NATS message bus integration, and an engaging visual experience.

How do we build a performant, maintainable frontend that integrates with the NATS-based backend without a heavy JavaScript framework build step?

## Decision Drivers

* Real-time streaming for chat and voice (WebSocket required)
* Direct integration with NATS JetStream (Protocol Buffers encoding, see [ADR-0061](0061-go-handler-refactor.md))
* Minimal client-side JavaScript (~20KB gzipped target)
* No frontend build step (no webpack/vite/node required)
* 3D avatar rendering for immersive experience
* OAuth integration with multiple providers
* Single binary deployment (Go)

## Considered Options

1. **Go + HTMX + Alpine.js + Three.js**: Server-rendered with minimal JS
2. **Next.js / React SPA**: Full JavaScript framework
3. **SvelteKit**: Compiled JS framework
4. **Go + Templ + raw WebSocket**: Pure Go templates, no JS framework

## Decision Outcome

Chosen option: **Option 1 - Go + HTMX + Alpine.js + Three.js**, because it provides a zero-build-step frontend with server-rendered HTML, minimal JavaScript, and rich 3D avatar support, all served from a single Go binary.
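To make "server-rendered HTML with no build step" concrete, here is a minimal sketch of a Go handler-side helper that renders one chat message as an HTML fragment, the kind of response HTMX can swap into the page. The `Message` type, the `renderMessage` function, and the markup are hypothetical names chosen for illustration; they are not taken from the companions-frontend codebase. Only the standard library `html/template` package is used.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// Message is a hypothetical view model for a single chat line.
type Message struct {
	ID     int
	Author string
	Body   string
}

// messageTmpl is an illustrative fragment; the markup is an assumption,
// not the project's real template.
var messageTmpl = template.Must(template.New("message").Parse(
	`<div class="chat-message" id="msg-{{.ID}}"><strong>{{.Author}}</strong>: {{.Body}}</div>`))

// renderMessage executes the template into a string. html/template
// escapes each value for the HTML context in which it appears, so
// user-supplied text cannot inject markup.
func renderMessage(m Message) (string, error) {
	var buf bytes.Buffer
	if err := messageTmpl.Execute(&buf, m); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderMessage(Message{ID: 7, Author: "billy", Body: "<b>hi</b>"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because the fragment is produced on the server, the browser needs no templating code of its own, which is what keeps the client-side JavaScript small under this option.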

### Positive Consequences

* Single binary deployment: the Go server serves everything
* ~20KB gzipped total JS payload (CDN-served HTMX + Alpine + Three.js)
* No npm, no webpack, no build step; assets are served directly
* Server-side rendering via Go templates
* WebSocket handled natively in Go (gorilla/websocket)
* NATS integration with Protocol Buffers in the same binary
* Distroless container image for minimal attack surface
* Type-safe NATS messages via handler-base shared Go module (protobuf stubs)

### Negative Consequences

* Three.js adds complexity for 3D avatar rendering
* HTMX pattern less familiar to developers expecting React/Vue
* Limited client-side state management (by design)

## Technology Stack

| Layer | Technology | Purpose |
|-------|-----------|---------|
| Server | Go 1.25 | HTTP server, WebSocket, NATS client, OAuth |
| Templates | Go `html/template` | Server-side HTML rendering |
| Interactivity | HTMX 2.0 | AJAX, WebSocket, server-sent events |
| Client state | Alpine.js 3 | Lightweight reactive UI for local state |
| 3D Avatars | Three.js + VRM | 3D character rendering with lip-sync |
| Styling | Tailwind CSS 4 + DaisyUI | Utility-first CSS with component library |
| Messaging | NATS JetStream | Real-time pub/sub with Protocol Buffers encoding |
| Auth | golang-jwt/jwt/v5 | JWT token handling for OAuth flows |
| Shared lib | handler-base (Go module) | NATS client, protobuf messages, health, OTel, HTTP clients |
| Database | PostgreSQL (lib/pq) + SQLite | Persistent + local session storage |
| Observability | OpenTelemetry SDK | Traces, metrics via OTLP gRPC |

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                             Browser                             │
│                                                                 │
│  HTMX (server-rendered HTML) ←→ Go Server (WebSocket)           │
│  Alpine.js (local UI state)                                     │
│  Three.js (VRM 3D avatars with lip-sync)                        │
└───────────────────────┬─────────────────────────────────────────┘
                        │ HTTP/WebSocket
                        ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Go Server (single binary)                    │
│                                                                 │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐         │
│  │  Routes  │  │  OAuth   │  │WebSocket │  │   OTEL   │         │
│  │  (HTTP)  │  │ Handlers │  │   Hub    │  │ Tracing  │         │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘         │
│         │              │            │                           │
│         └──────────────┴────────────┘                           │
│                        │                                        │
│              ┌─────────┴─────────┐                              │
│              │   NATS Client     │                              │
│              │   (JetStream +    │                              │
│              │    Protobuf)      │                              │
│              └─────────┬─────────┘                              │
└────────────────────────┼────────────────────────────────────────┘
                         │
            ┌────────────┴────────────────┐
            ▼                             ▼
   ┌──────────────────┐          ┌──────────────────┐
   │  NATS JetStream  │          │    Ray Serve     │
   │  ai.chat.*       │          │  (STT, TTS, LLM, │
   │  ai.voice.*      │          │   Embeddings)    │
   └──────────────────┘          └──────────────────┘
```

## Key Features

| Feature | Implementation |
|---------|---------------|
| Real-time chat | WebSocket → NATS pub/sub per-user channels |
| Voice assistant | Streaming STT → LLM → TTS via Ray Serve endpoints |
| 3D avatars | VRM models rendered in Three.js with audio-driven lip-sync |
| OAuth login | Google, Discord, GitHub, Twitch + Authentik OIDC |
| RAG search | Milvus vector search for premium users |
| Session state | PostgreSQL (CNPG) for persistent data, SQLite for local cache |

## Kubernetes Deployment

| | |
|---|---|
| **Namespace** | `ai-ml` |
| **Replicas** | 1 |
| **Image** | `ghcr.io/billy-davies-2/companions-frontend` (distroless) |
| **Resources** | 50m/128Mi request → 500m/512Mi limit |

**OTEL sidecar:** `otel/opentelemetry-collector-contrib:0.145.0` exports traces to ClickStack.

**Backend routing:** All AI inference requests (STT, TTS, LLM, embeddings, reranking) route to Ray Serve at `ai-inference-serve-svc.ai-ml.svc.cluster.local:8000`. Auxiliary HTTPRoutes in the `auxiliary` kustomization provide direct model endpoint access at `embeddings.lab`, `whisper.lab`, `tts.lab`, `llm.lab`, `reranker.lab`.

**Access:** `companions-chat.lab.daviestechlabs.io` via envoy-internal with Authentik OIDC proxy auth.
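
The "WebSocket → NATS pub/sub per-user channels" routing listed under Key Features can be sketched with the standard library alone. In this illustration, buffered Go channels stand in for both the gorilla/websocket connections and the NATS subscriptions, so it shows only the fan-out logic, not the real transport. The `chatSubject` helper, the `Hub` type, and the `ai.chat.<userID>` subject layout are assumptions; the ADR only states that subjects live under `ai.chat.*`.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// chatSubject builds a per-user subject under the ai.chat.* namespace.
// The token layout is an assumption. IDs containing NATS separators,
// wildcards, or whitespace are rejected so one user cannot address
// another user's subject.
func chatSubject(userID string) (string, error) {
	if userID == "" || strings.ContainsAny(userID, ".*> \t\r\n") {
		return "", fmt.Errorf("invalid user id %q", userID)
	}
	return "ai.chat." + userID, nil
}

// Hub is a hypothetical in-process stand-in for the WebSocket hub:
// it fans payloads out to every channel registered for a subject.
type Hub struct {
	mu   sync.RWMutex
	subs map[string][]chan []byte
}

// NewHub returns an empty hub.
func NewHub() *Hub {
	return &Hub{subs: make(map[string][]chan []byte)}
}

// Subscribe registers a buffered channel for subject and returns it.
func (h *Hub) Subscribe(subject string, buf int) <-chan []byte {
	ch := make(chan []byte, buf)
	h.mu.Lock()
	h.subs[subject] = append(h.subs[subject], ch)
	h.mu.Unlock()
	return ch
}

// Publish delivers payload to each subscriber of subject and returns
// the number of deliveries. Subscribers with a full buffer are skipped
// so one slow client cannot block the others.
func (h *Hub) Publish(subject string, payload []byte) int {
	h.mu.RLock()
	defer h.mu.RUnlock()
	delivered := 0
	for _, ch := range h.subs[subject] {
		select {
		case ch <- payload:
			delivered++
		default:
		}
	}
	return delivered
}

func main() {
	hub := NewHub()
	subject, err := chatSubject("alice")
	if err != nil {
		panic(err)
	}
	inbox := hub.Subscribe(subject, 1)
	hub.Publish(subject, []byte("hello"))
	fmt.Printf("%s: %s\n", subject, <-inbox)
}
```

Dropping messages for a full buffer is one possible policy, chosen here to keep the sketch non-blocking; a real hub might instead disconnect the slow client or rely on JetStream redelivery.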

## Links

* Related to [ADR-0003](0003-use-nats-for-messaging.md) (NATS messaging)
* Related to [ADR-0004](0004-use-messagepack-for-nats.md) (MessagePack encoding; superseded by Protocol Buffers, see [ADR-0061](0061-go-handler-refactor.md))
* Related to [ADR-0011](0011-kuberay-unified-gpu-backend.md) (Ray Serve backend)
* Related to [ADR-0028](0028-authentik-sso-strategy.md) (OAuth/OIDC)
* Related to [ADR-0061](0061-go-handler-refactor.md) (Go handler refactor: handler-base shared module, protobuf wire format)
* [HTMX Documentation](https://htmx.org/docs/)
* [VRM Specification](https://vrm.dev/en/)