# Companions Frontend Architecture

* Status: accepted
* Date: 2026-02-09
* Deciders: Billy
* Technical Story: Design the primary user interface for the AI/ML platform, supporting real-time chat, voice, and 3D avatar interactions

## Context and Problem Statement

The homelab AI platform needs a web interface for users to interact with chat (RAG + LLM), voice (STT → LLM → TTS), and embedding services. The interface must support real-time streaming responses, WebSocket connections for NATS message bus integration, and an engaging visual experience.

How do we build a performant, maintainable frontend that integrates with the NATS-based backend without a heavy JavaScript framework build step?

## Decision Drivers

* Real-time streaming for chat and voice (WebSocket required)
* Direct integration with NATS JetStream (MessagePack-encoded binary payloads)
* Minimal client-side JavaScript (~20KB gzipped target)
* No frontend build step (no webpack, Vite, or Node.js required)
* 3D avatar rendering for immersive experience
* OAuth integration with multiple providers
* Single binary deployment (Go)

## Considered Options

1. **Go + HTMX + Alpine.js + Three.js** — Server-rendered with minimal JS
2. **Next.js / React SPA** — Full JavaScript framework
3. **SvelteKit** — Compiled JS framework
4. **Go + Templ + raw WebSocket** — Pure Go templates, no JS framework

## Decision Outcome

Chosen option: **Option 1 - Go + HTMX + Alpine.js + Three.js**, because it provides a zero-build-step frontend with server-rendered HTML, minimal JavaScript, and rich 3D avatar support, all served from a single Go binary.

### Positive Consequences

* Single binary deployment — Go server serves everything
* Small JS payload: ~20KB gzipped target for CDN-served HTMX + Alpine.js (Three.js is substantially larger than this on its own)
* No npm, no webpack, no build step — assets served directly
* Server-side rendering via Go templates
* WebSocket connections handled inside the Go server (gorilla/websocket)
* NATS integration with MessagePack in the same binary
* Distroless container image for minimal attack surface
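
As a rough illustration of the server-rendered approach, the sketch below renders a chat message as an HTML fragment that HTMX could swap into the page. The template markup, the `message` type, and the `/chat/send` route mentioned in the comment are hypothetical names chosen for this example, not taken from the actual codebase.

```go
package main

import (
	"fmt"
	"html/template"
	"log"
	"net/http"
	"strings"
)

// messageTmpl is a hypothetical chat-message fragment. html/template escapes
// interpolated values, so user text cannot inject markup.
var messageTmpl = template.Must(template.New("message").Parse(
	`<div class="chat-message"><strong>{{.Author}}</strong>: {{.Text}}</div>`))

// message is an illustrative view model, not the project's real type.
type message struct {
	Author string
	Text   string
}

// renderMessage renders one message to an HTML fragment string.
func renderMessage(m message) (string, error) {
	var sb strings.Builder
	if err := messageTmpl.Execute(&sb, m); err != nil {
		return "", err
	}
	return sb.String(), nil
}

// handleSend is a hypothetical handler for a form using hx-post="/chat/send".
// It answers with a fragment rather than a full page.
func handleSend(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	fragment, err := renderMessage(message{Author: "you", Text: r.FormValue("text")})
	if err != nil {
		http.Error(w, "render failed", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	fmt.Fprint(w, fragment)
}

func main() {
	// Render once without starting a server, to show the escaping behaviour.
	out, err := renderMessage(message{Author: "you", Text: "<b>hi</b>"})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out)
}
```

In a real server, `handleSend` would be registered on an `http.ServeMux`; because the response is plain HTML, no client-side templating or build tooling is involved.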

### Negative Consequences

* Three.js adds complexity for 3D avatar rendering
* HTMX pattern less familiar to developers expecting React/Vue
* Limited client-side state management (by design)

## Technology Stack

| Layer | Technology | Purpose |
|-------|-----------|---------|
| Server | Go 1.25 | HTTP server, WebSocket, NATS client, OAuth |
| Templates | Go `html/template` | Server-side HTML rendering |
| Interactivity | HTMX 2.0 | AJAX-driven HTML swaps; WebSocket and server-sent events via the HTMX `ws` and `sse` extensions |
| Client state | Alpine.js 3 | Lightweight reactive UI for local state |
| 3D Avatars | Three.js + VRM | 3D character rendering with lip-sync |
| Styling | Tailwind CSS 4 + DaisyUI | Utility-first CSS with component library |
| Messaging | NATS JetStream | Real-time pub/sub with MessagePack encoding |
| Auth | golang-jwt/jwt/v5 | JWT handling for OAuth flows |
| Database | PostgreSQL (lib/pq) + SQLite | Persistent + local session storage |
| Observability | OpenTelemetry SDK | Traces, metrics via OTLP gRPC |
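
To make the MessagePack encoding in the messaging row concrete, here is a minimal sketch that hand-encodes a small chat request into MessagePack bytes using only the standard library. The `chatRequest` type and its `user_id` and `text` keys are hypothetical; the real service would more likely use a MessagePack library, which this ADR does not name.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"log"
)

// chatRequest is a hypothetical payload shape used only for illustration.
type chatRequest struct {
	UserID string
	Text   string
}

// appendStr writes s in MessagePack string format. Only fixstr (up to 31
// bytes) and str 8 (up to 255 bytes) are handled in this sketch.
func appendStr(buf *bytes.Buffer, s string) error {
	n := len(s)
	switch {
	case n <= 31:
		buf.WriteByte(0xa0 | byte(n)) // fixstr: 101xxxxx
	case n <= 255:
		buf.WriteByte(0xd9) // str 8 marker, followed by a 1-byte length
		buf.WriteByte(byte(n))
	default:
		return errors.New("string too long for this sketch")
	}
	buf.WriteString(s)
	return nil
}

// encodeChatRequest encodes r as a two-entry MessagePack map.
func encodeChatRequest(r chatRequest) ([]byte, error) {
	var buf bytes.Buffer
	buf.WriteByte(0x80 | 2) // fixmap with 2 key/value pairs
	pairs := [][2]string{{"user_id", r.UserID}, {"text", r.Text}}
	for _, kv := range pairs {
		if err := appendStr(&buf, kv[0]); err != nil {
			return nil, err
		}
		if err := appendStr(&buf, kv[1]); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

func main() {
	b, err := encodeChatRequest(chatRequest{UserID: "u1", Text: "hi"})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%d bytes: % x\n", len(b), b)
}
```

The resulting byte slice is what would be published as the NATS message body; the equivalent JSON object would be noticeably longer because of quoting and punctuation.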

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                             Browser                             │
│                                                                 │
│  HTMX (server-rendered HTML) ←→ Go Server (WebSocket)           │
│  Alpine.js (local UI state)                                     │
│  Three.js (VRM 3D avatars with lip-sync)                        │
└───────────────────────┬─────────────────────────────────────────┘
                        │ HTTP/WebSocket
                        ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Go Server (single binary)                    │
│                                                                 │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐         │
│  │  Routes  │  │  OAuth   │  │WebSocket │  │  OTEL    │         │
│  │ (HTTP)   │  │ Handlers │  │ Hub      │  │ Tracing  │         │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘         │
│        │               │           │                            │
│        └───────────────┼───────────┘                            │
│                        │                                        │
│              ┌─────────┴─────────┐                              │
│              │  NATS Client      │                              │
│              │  (JetStream +     │                              │
│              │   MessagePack)    │                              │
│              └─────────┬─────────┘                              │
└────────────────────────┼────────────────────────────────────────┘
                         │
            ┌────────────┴────────────────┐
            ▼                             ▼
┌──────────────────┐           ┌──────────────────┐
│  NATS JetStream  │           │  Ray Serve       │
│  ai.chat.*       │           │  (STT, TTS, LLM, │
│  ai.voice.*      │           │   Embeddings)    │
└──────────────────┘           └──────────────────┘
```
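
The WebSocket Hub in the diagram can be pictured as a fan-out table from user to open connections. The sketch below is one possible shape, assuming that each user's replies arrive on a subject of the form `ai.chat.<userID>`; that naming is a guess based on the `ai.chat.*` pattern above, and the `Hub` type and its methods are hypothetical. The NATS subscription and the gorilla/websocket write loop are left out so the example stays within the standard library.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// subjectPrefix is an assumed naming scheme; the ADR only specifies ai.chat.*.
const subjectPrefix = "ai.chat."

// Hub maps a user ID to the outbound queues of that user's open connections.
type Hub struct {
	mu      sync.RWMutex
	clients map[string]map[chan []byte]struct{}
}

// NewHub returns an empty hub.
func NewHub() *Hub {
	return &Hub{clients: make(map[string]map[chan []byte]struct{})}
}

// Register adds a connection for userID. It returns the channel a WebSocket
// writer goroutine would drain, plus a function that removes the connection.
func (h *Hub) Register(userID string) (<-chan []byte, func()) {
	ch := make(chan []byte, 16)
	h.mu.Lock()
	if h.clients[userID] == nil {
		h.clients[userID] = make(map[chan []byte]struct{})
	}
	h.clients[userID][ch] = struct{}{}
	h.mu.Unlock()

	unregister := func() {
		h.mu.Lock()
		defer h.mu.Unlock()
		if _, ok := h.clients[userID][ch]; !ok {
			return // already removed
		}
		delete(h.clients[userID], ch)
		if len(h.clients[userID]) == 0 {
			delete(h.clients, userID)
		}
		close(ch)
	}
	return ch, unregister
}

// Deliver routes a payload received on subject to the matching user's
// connections and reports how many queues accepted it.
func (h *Hub) Deliver(subject string, payload []byte) int {
	userID, ok := strings.CutPrefix(subject, subjectPrefix)
	if !ok || userID == "" {
		return 0
	}
	h.mu.RLock()
	defer h.mu.RUnlock()
	delivered := 0
	for ch := range h.clients[userID] {
		select {
		case ch <- payload:
			delivered++
		default: // queue full: drop instead of blocking the message handler
		}
	}
	return delivered
}

func main() {
	hub := NewHub()
	msgs, unregister := hub.Register("alice")
	defer unregister()

	n := hub.Deliver(subjectPrefix+"alice", []byte("hello"))
	fmt.Println(n, string(<-msgs))
}
```

Dropping messages for a slow client is a deliberate simplification here; a production hub might instead disconnect the client or apply backpressure.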

## Key Features

| Feature | Implementation |
|---------|---------------|
| Real-time chat | WebSocket bridged to NATS pub/sub on per-user subjects |
| Voice assistant | Streaming STT → LLM → TTS via Ray Serve endpoints |
| 3D avatars | VRM models rendered in Three.js with audio-driven lip-sync |
| OAuth login | Google, Discord, GitHub, Twitch + Authentik OIDC |
| RAG search | Milvus vector search for premium users |
| Session state | PostgreSQL (CNPG) for persistent data, SQLite for local cache |

## Kubernetes Deployment

| Setting | Value |
|---|---|
| **Namespace** | `ai-ml` |
| **Replicas** | 1 |
| **Image** | `ghcr.io/billy-davies-2/companions-frontend` (distroless) |
| **Resources** | Requests: 50m CPU, 128Mi memory; limits: 500m CPU, 512Mi memory |

**OTEL sidecar:** `otel/opentelemetry-collector-contrib:0.145.0` exports traces to ClickStack.

**Backend routing:** All AI inference requests (STT, TTS, LLM, embeddings, reranking) route to Ray Serve at `ai-inference-serve-svc.ai-ml.svc.cluster.local:8000`. Auxiliary HTTPRoutes in the `auxiliary` kustomization provide direct model endpoint access at `embeddings.lab`, `whisper.lab`, `tts.lab`, `llm.lab`, `reranker.lab`.
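
A small client for the backend routing described above might look like the following sketch. The service address comes from this ADR, but the `http` scheme, the `/llm` path, the JSON request and response shapes, and the `InferenceClient` type are all assumptions made for illustration. The demo talks to a local stub rather than the cluster.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// defaultBaseURL is the in-cluster Ray Serve address named in this ADR.
// The http scheme is an assumption.
const defaultBaseURL = "http://ai-inference-serve-svc.ai-ml.svc.cluster.local:8000"

// InferenceClient is a hypothetical thin wrapper around the Ray Serve base URL.
type InferenceClient struct {
	BaseURL string
	HTTP    *http.Client
}

// Post sends body as JSON to BaseURL+path and decodes the JSON reply into out.
func (c InferenceClient) Post(ctx context.Context, path string, body, out any) error {
	payload, err := json.Marshal(body)
	if err != nil {
		return fmt.Errorf("encode request: %w", err)
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, c.BaseURL+path, bytes.NewReader(payload))
	if err != nil {
		return fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := c.HTTP.Do(req)
	if err != nil {
		return fmt.Errorf("call %s: %w", path, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		msg, _ := io.ReadAll(io.LimitReader(resp.Body, 512))
		return fmt.Errorf("call %s: status %d: %s", path, resp.StatusCode, msg)
	}
	return json.NewDecoder(resp.Body).Decode(out)
}

// demo runs the client against a local stub standing in for Ray Serve.
// The "/llm" path and the prompt/text fields are hypothetical.
func demo() (string, error) {
	stub := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var in map[string]string
		if err := json.NewDecoder(r.Body).Decode(&in); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		json.NewEncoder(w).Encode(map[string]string{"text": "echo: " + in["prompt"]})
	}))
	defer stub.Close()

	client := InferenceClient{BaseURL: stub.URL, HTTP: &http.Client{Timeout: 5 * time.Second}}
	var out map[string]string
	if err := client.Post(context.Background(), "/llm", map[string]string{"prompt": "hi"}, &out); err != nil {
		return "", err
	}
	return out["text"], nil
}

func main() {
	got, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
	fmt.Println("in-cluster base URL would be:", defaultBaseURL)
}
```

Inside the cluster, `BaseURL` would be set to `defaultBaseURL`; keeping it a field rather than a constant lets tests and local development point the client elsewhere.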

**Access:** `companions-chat.lab.daviestechlabs.io` via envoy-internal with Authentik OIDC proxy auth.

## Links

* Related to [ADR-0003](0003-use-nats-for-messaging.md) (NATS messaging)
* Related to [ADR-0004](0004-use-messagepack-for-nats.md) (MessagePack encoding)
* Related to [ADR-0011](0011-kuberay-unified-gpu-backend.md) (Ray Serve backend)
* Related to [ADR-0028](0028-authentik-sso-strategy.md) (OAuth/OIDC)
* [HTMX Documentation](https://htmx.org/docs/)
* [VRM Specification](https://vrm.dev/en/)