# Companions Frontend Architecture

* Status: accepted
* Date: 2026-02-09
* Deciders: Billy
* Technical Story: Design the primary user interface for the AI/ML platform, supporting real-time chat, voice, and 3D avatar interactions

## Context and Problem Statement

The homelab AI platform needs a web interface for users to interact with chat (RAG + LLM), voice (STT → LLM → TTS), and embedding services. The interface must support real-time streaming responses, WebSocket connections for NATS message bus integration, and an engaging visual experience.

How do we build a performant, maintainable frontend that integrates with the NATS-based backend without a heavy JavaScript framework build step?

## Decision Drivers

* Real-time streaming for chat and voice (WebSocket required)
* Direct integration with NATS JetStream (binary MessagePack protocol)
* Minimal client-side JavaScript (~20KB gzipped target)
* No frontend build step (no webpack/vite/node required)
* 3D avatar rendering for immersive experience
* OAuth integration with multiple providers
* Single binary deployment (Go)

## Considered Options

1. **Go + HTMX + Alpine.js + Three.js**: Server-rendered with minimal JS
2. **Next.js / React SPA**: Full JavaScript framework
3. **SvelteKit**: Compiled JS framework
4. **Go + Templ + raw WebSocket**: Pure Go templates, no JS framework

## Decision Outcome

Chosen option: **Option 1 - Go + HTMX + Alpine.js + Three.js**, because it provides a zero-build-step frontend with server-rendered HTML, minimal JavaScript, and rich 3D avatar support, all served from a single Go binary.

### Positive Consequences

* Single binary deployment: the Go server serves everything
* ~20KB gzipped total JS payload (CDN-served HTMX + Alpine + Three.js)
* No npm, no webpack, no build step: assets are served directly
* Server-side rendering via Go templates
* WebSocket handled natively in Go (gorilla/websocket)
* NATS integration with MessagePack in the same binary
* Distroless container image for minimal attack surface

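To make the MessagePack point concrete, here is a minimal sketch of how a small chat payload looks on the wire. It hand-encodes a `map[string]string` using only the `fixmap`, `fixstr`, and `str 8` formats from the MessagePack specification. The `encodeStringMap` helper and the `{"text": "hi"}` payload are hypothetical; a real service would more likely use a MessagePack library, and the actual message schema is not shown in this ADR.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"sort"
)

// writeStr appends s as a MessagePack string: fixstr (0xa0|len) for up to
// 31 bytes, str 8 (0xd9, len) for up to 255 bytes. Longer strings are out
// of scope for this sketch.
func writeStr(buf *bytes.Buffer, s string) error {
	switch n := len(s); {
	case n <= 31:
		buf.WriteByte(0xa0 | byte(n))
	case n <= 255:
		buf.WriteByte(0xd9)
		buf.WriteByte(byte(n))
	default:
		return errors.New("string longer than 255 bytes is not supported in this sketch")
	}
	buf.WriteString(s)
	return nil
}

// encodeStringMap is a hypothetical helper that encodes a small map as a
// MessagePack fixmap (0x80|count, at most 15 entries).
func encodeStringMap(m map[string]string) ([]byte, error) {
	if len(m) > 15 {
		return nil, errors.New("fixmap holds at most 15 entries")
	}
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	// Sorting only makes the output deterministic; MessagePack does not
	// require any key order.
	sort.Strings(keys)

	var buf bytes.Buffer
	buf.WriteByte(0x80 | byte(len(m)))
	for _, k := range keys {
		if err := writeStr(&buf, k); err != nil {
			return nil, err
		}
		if err := writeStr(&buf, m[k]); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

func main() {
	b, err := encodeStringMap(map[string]string{"text": "hi"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", b) // 81 a4 74 65 78 74 a2 68 69
}
```

The nine bytes above compare with fifteen for the equivalent JSON `{"text":"hi"}`, which is the kind of saving the binary protocol is chosen for.
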
### Negative Consequences

* Three.js adds complexity for 3D avatar rendering
* HTMX pattern less familiar to developers expecting React/Vue
* Limited client-side state management (by design)

## Technology Stack

| Layer | Technology | Purpose |
|-------|-----------|---------|
| Server | Go 1.25 | HTTP server, WebSocket, NATS client, OAuth |
| Templates | Go `html/template` | Server-side HTML rendering |
| Interactivity | HTMX 2.0 | AJAX, WebSocket, server-sent events |
| Client state | Alpine.js 3 | Lightweight reactive UI for local state |
| 3D Avatars | Three.js + VRM | 3D character rendering with lip-sync |
| Styling | Tailwind CSS 4 + DaisyUI | Utility-first CSS with component library |
| Messaging | NATS JetStream | Real-time pub/sub with MessagePack encoding |
| Auth | golang-jwt/jwt/v5 | JWT token handling for OAuth flows |
| Database | PostgreSQL (lib/pq) + SQLite | Persistent + local session storage |
| Observability | OpenTelemetry SDK | Traces, metrics via OTLP gRPC |

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                             Browser                             │
│                                                                 │
│  HTMX (server-rendered HTML)  ←→  Go Server (WebSocket)         │
│  Alpine.js (local UI state)                                     │
│  Three.js (VRM 3D avatars with lip-sync)                        │
└───────────────────────┬─────────────────────────────────────────┘
                        │ HTTP/WebSocket
                        ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Go Server (single binary)                    │
│                                                                 │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐         │
│  │  Routes  │  │  OAuth   │  │WebSocket │  │   OTEL   │         │
│  │  (HTTP)  │  │ Handlers │  │   Hub    │  │ Tracing  │         │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘         │
│       │             │             │                             │
│       └─────────────┴─────────────┘                             │
│                     │                                           │
│           ┌─────────┴─────────┐                                 │
│           │    NATS Client    │                                 │
│           │   (JetStream +    │                                 │
│           │    MessagePack)   │                                 │
│           └─────────┬─────────┘                                 │
└─────────────────────┼───────────────────────────────────────────┘
                      │
         ┌────────────┴────────────────────┐
         ▼                                 ▼
┌──────────────────┐              ┌──────────────────┐
│  NATS JetStream  │              │    Ray Serve     │
│  ai.chat.*       │              │ (STT, TTS, LLM,  │
│  ai.voice.*      │              │   Embeddings)    │
└──────────────────┘              └──────────────────┘
```

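The WebSocket Hub box in the diagram implies some per-connection fan-out between NATS callbacks and browser sockets. One common shape for that is sketched here with plain channels instead of gorilla/websocket, so it stays stdlib-only. The `Hub` type, its method names, and the queue size are hypothetical and not taken from the actual server.

```go
package main

import (
	"fmt"
	"sync"
)

// Hub keeps one buffered outbound queue per connected user. In a real
// server each queue would be drained by that user's WebSocket write loop.
type Hub struct {
	mu    sync.RWMutex
	conns map[string]chan []byte
}

// NewHub returns an empty hub.
func NewHub() *Hub {
	return &Hub{conns: make(map[string]chan []byte)}
}

// Register creates the outbound queue for a user and returns its read side.
// This sketch assumes one connection per user; registering twice replaces
// the earlier queue without closing it.
func (h *Hub) Register(userID string) <-chan []byte {
	ch := make(chan []byte, 16) // queue size is an arbitrary choice
	h.mu.Lock()
	h.conns[userID] = ch
	h.mu.Unlock()
	return ch
}

// Unregister removes the user and closes the queue so the write loop exits.
func (h *Hub) Unregister(userID string) {
	h.mu.Lock()
	if ch, ok := h.conns[userID]; ok {
		delete(h.conns, userID)
		close(ch)
	}
	h.mu.Unlock()
}

// Deliver queues msg for one user without blocking. It reports false when
// the user is not connected or the queue is full, so a slow browser cannot
// stall the goroutine that handles incoming NATS messages.
func (h *Hub) Deliver(userID string, msg []byte) bool {
	h.mu.RLock()
	defer h.mu.RUnlock()
	ch, ok := h.conns[userID]
	if !ok {
		return false
	}
	select {
	case ch <- msg:
		return true
	default:
		return false
	}
}

func main() {
	hub := NewHub()
	out := hub.Register("alice")

	fmt.Println(hub.Deliver("alice", []byte("token-1"))) // true
	fmt.Println(hub.Deliver("bob", []byte("token-1")))   // false
	fmt.Println(string(<-out))                           // token-1

	hub.Unregister("alice")
}
```

Holding the read lock during the non-blocking send means `Unregister` can never close a channel while a send to it is in flight.
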
## Key Features

| Feature | Implementation |
|---------|---------------|
| Real-time chat | WebSocket → NATS pub/sub per-user channels |
| Voice assistant | Streaming STT → LLM → TTS via Ray Serve endpoints |
| 3D avatars | VRM models rendered in Three.js with audio-driven lip-sync |
| OAuth login | Google, Discord, GitHub, Twitch + Authentik OIDC |
| RAG search | Milvus vector search for premium users |
| Session state | PostgreSQL (CNPG) for persistent data, SQLite for local cache |

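The "per-user channels" in the Real-time chat row suggest one NATS subject per user under the `ai.chat.*` space. A minimal sketch of deriving such a subject safely is below. The `ai.chat.<userID>` layout and the `chatSubject` helper are assumptions for illustration; the actual subject scheme is not specified here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errBadToken = errors.New("user ID is not a valid NATS subject token")

// chatSubject builds a per-user subject. NATS splits subjects into tokens
// on '.', and treats '*' and '>' as wildcards, so a user ID containing any
// of those (or whitespace) is rejected rather than silently widening or
// breaking the subject.
func chatSubject(userID string) (string, error) {
	if userID == "" || strings.ContainsAny(userID, ".*> \t\r\n") {
		return "", errBadToken
	}
	return "ai.chat." + userID, nil // assumed layout: ai.chat.<userID>
}

func main() {
	subject, err := chatSubject("alice")
	fmt.Println(subject, err) // ai.chat.alice <nil>

	_, err = chatSubject("alice.*")
	fmt.Println(err) // user ID is not a valid NATS subject token
}
```

Validating the token matters most when the ID comes from an OAuth provider, since a crafted ID containing `>` could otherwise subscribe to every user's messages.
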
## Kubernetes Deployment

| Setting | Value |
|---|---|
| **Namespace** | `ai-ml` |
| **Replicas** | 1 |
| **Image** | `ghcr.io/billy-davies-2/companions-frontend` (distroless) |
| **Resources** | Requests: 50m CPU / 128Mi memory; limits: 500m CPU / 512Mi memory |

**OTEL sidecar:** `otel/opentelemetry-collector-contrib:0.145.0` exports traces to ClickStack.

**Backend routing:** All AI inference requests (STT, TTS, LLM, embeddings, reranking) route to Ray Serve at `ai-inference-serve-svc.ai-ml.svc.cluster.local:8000`. Auxiliary HTTPRoutes in the `auxiliary` kustomization provide direct model endpoint access at `embeddings.lab`, `whisper.lab`, `tts.lab`, `llm.lab`, `reranker.lab`.

**Access:** `companions-chat.lab.daviestechlabs.io` via envoy-internal with Authentik OIDC proxy auth.

## Links

* Related to [ADR-0003](0003-use-nats-for-messaging.md) (NATS messaging)
* Related to [ADR-0004](0004-use-messagepack-for-nats.md) (MessagePack encoding)
* Related to [ADR-0011](0011-kuberay-unified-gpu-backend.md) (Ray Serve backend)
* Related to [ADR-0028](0028-authentik-sso-strategy.md) (OAuth/OIDC)
* [HTMX Documentation](https://htmx.org/docs/)
* [VRM Specification](https://vrm.dev/en/)