feat: add comprehensive architecture documentation

- Add AGENT-ONBOARDING.md for AI agents
- Add ARCHITECTURE.md with full system overview
- Add TECH-STACK.md with complete technology inventory
- Add DOMAIN-MODEL.md with entities and bounded contexts
- Add CODING-CONVENTIONS.md with patterns and practices
- Add GLOSSARY.md with terminology reference
- Add C4 diagrams (Context and Container levels)
- Add 10 ADRs documenting key decisions:
  - Talos Linux, NATS, MessagePack, Multi-GPU strategy
  - GitOps with Flux, KServe, Milvus, Dual workflow engines
  - Envoy Gateway
- Add specs directory with JetStream configuration
- Add diagrams for GPU allocation and data flows

Based on analysis of homelab-k8s2 and llm-workflows repositories
and kubectl cluster-info dump data.
2026-02-01 14:30:05 -05:00
parent 4d4f6f464c
commit 832cda34bd
26 changed files with 3805 additions and 2 deletions

DOMAIN-MODEL.md
# 📊 Domain Model
> **Core entities, bounded contexts, and relationships in the DaviesTechLabs homelab**
## Bounded Contexts
```
┌─────────────────────────────────────────────────────────────────────┐
│                          BOUNDED CONTEXTS                           │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│ ┌───────────────────┐  ┌───────────────────┐  ┌───────────────────┐ │
│ │   CHAT CONTEXT    │  │   VOICE CONTEXT   │  │ WORKFLOW CONTEXT  │ │
│ ├───────────────────┤  ├───────────────────┤  ├───────────────────┤ │
│ │ • ChatSession     │  │ • VoiceSession    │  │ • Pipeline        │ │
│ │ • ChatMessage     │  │ • AudioChunk      │  │ • PipelineRun     │ │
│ │ • Conversation    │  │ • Transcription   │  │ • Artifact        │ │
│ │ • User            │  │ • SynthesizedAudio│  │ • Experiment      │ │
│ └─────────┬─────────┘  └─────────┬─────────┘  └─────────┬─────────┘ │
│           │                      │                      │           │
│           └──────────────────────┼──────────────────────┘           │
│                                  │                                  │
│                                  ▼                                  │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │                       INFERENCE CONTEXT                         │ │
│ ├─────────────────────────────────────────────────────────────────┤ │
│ │ • InferenceRequest  • Model  • Embedding  • Document  • Chunk   │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
```
---
## Core Entities
### User Context
```yaml
User:
  id: string (UUID)
  username: string
  premium: boolean
  preferences:
    voice_id: string
    model_preference: string
    enable_rag: boolean
  created_at: timestamp

Session:
  id: string (UUID)
  user_id: string
  type: "chat" | "voice"
  started_at: timestamp
  last_activity: timestamp
  metadata: object
```
### Chat Context
```yaml
ChatMessage:
  id: string (UUID)
  session_id: string
  user_id: string
  role: "user" | "assistant" | "system"
  content: string
  created_at: timestamp
  metadata:
    tokens_used: integer
    latency_ms: float
    rag_sources: string[]
    model_used: string

Conversation:
  id: string (UUID)
  user_id: string
  messages: ChatMessage[]
  title: string (auto-generated)
  created_at: timestamp
  updated_at: timestamp
```
### Voice Context
```yaml
VoiceRequest:
  id: string (UUID)
  user_id: string
  audio_b64: string (base64)
  format: "wav" | "webm" | "mp3"
  language: string
  premium: boolean
  enable_rag: boolean

VoiceResponse:
  id: string (UUID)
  request_id: string
  transcription: string
  response_text: string
  audio_b64: string (base64)
  audio_format: string
  latency_ms: float
  rag_docs_used: integer
```
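The `audio_b64` fields carry raw audio as base64 text so a request can travel over JSON or MessagePack transports. A minimal sketch of the round trip (field names from the schema above; the audio bytes are placeholders):

```python
import base64

# Client side: encode raw audio bytes into the audio_b64 field
raw_audio = b"RIFF....WAVEfmt "  # placeholder, not a real WAV file
request = {
    "audio_b64": base64.b64encode(raw_audio).decode("ascii"),
    "format": "wav",
    "language": "en",
}

# STT side: decode back to bytes before transcription
decoded = base64.b64decode(request["audio_b64"])
assert decoded == raw_audio
```

The base64 overhead (~33%) is why the retention table below keeps audio for only an hour while text persists.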
### Inference Context
```yaml
InferenceRequest:
  id: string (UUID)
  service: "llm" | "stt" | "tts" | "embeddings" | "reranker"
  input: string | bytes
  parameters: object
  priority: "standard" | "premium"

InferenceResponse:
  id: string (UUID)
  request_id: string
  output: string | bytes | float[]
  metadata:
    model: string
    latency_ms: float
    tokens: integer (if applicable)
```
### RAG Context
```yaml
Document:
  id: string (UUID)
  collection: string
  title: string
  content: string
  source_url: string
  ingested_at: timestamp

Chunk:
  id: string (UUID)
  document_id: string
  content: string
  embedding: float[1024]  # BGE-large dimensions
  metadata:
    position: integer
    page: integer

RAGQuery:
  query: string
  collection: string
  top_k: integer (default: 5)
  rerank: boolean (default: true)
  rerank_top_k: integer (default: 3)

RAGResult:
  chunks: Chunk[]
  scores: float[]
  reranked: boolean
```
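The RAGQuery defaults imply a two-stage flow: fetch `top_k` candidates by vector similarity, then rerank and keep `rerank_top_k`. A sketch with pluggable scoring functions (these stand in for the Milvus search and reranker calls, which are not shown here):

```python
def rag_query(chunks, similarity, rerank_score,
              top_k=5, rerank=True, rerank_top_k=3):
    """Two-stage retrieval: vector search, then optional reranking."""
    # Stage 1: take the top_k candidates by vector similarity
    candidates = sorted(chunks, key=similarity, reverse=True)[:top_k]
    if not rerank:
        return candidates
    # Stage 2: re-score the candidates and keep only rerank_top_k
    return sorted(candidates, key=rerank_score, reverse=True)[:rerank_top_k]
```

The defaults (5 → 3) mean the reranker only ever sees a handful of chunks, which keeps reranking latency bounded regardless of collection size.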
### Workflow Context
```yaml
Pipeline:
  id: string
  name: string
  version: string
  engine: "kubeflow" | "argo"
  definition: object (YAML)

PipelineRun:
  id: string (UUID)
  pipeline_id: string
  status: "pending" | "running" | "succeeded" | "failed"
  started_at: timestamp
  completed_at: timestamp
  parameters: object
  artifacts: Artifact[]

Artifact:
  id: string (UUID)
  run_id: string
  name: string
  type: "model" | "dataset" | "metrics" | "logs"
  uri: string (s3://)
  metadata: object

Experiment:
  id: string (UUID)
  name: string
  runs: PipelineRun[]
  metrics: object
  created_at: timestamp
```
---
## Entity Relationships
```mermaid
erDiagram
    USER ||--o{ SESSION : has
    USER ||--o{ CONVERSATION : owns
    SESSION ||--o{ CHAT_MESSAGE : contains
    CONVERSATION ||--o{ CHAT_MESSAGE : contains
    USER ||--o{ VOICE_REQUEST : makes
    VOICE_REQUEST ||--|| VOICE_RESPONSE : produces
    DOCUMENT ||--o{ CHUNK : contains
    CHUNK ||--|| EMBEDDING : has
    PIPELINE ||--o{ PIPELINE_RUN : executed_as
    PIPELINE_RUN ||--o{ ARTIFACT : produces
    EXPERIMENT ||--o{ PIPELINE_RUN : tracks
    INFERENCE_REQUEST ||--|| INFERENCE_RESPONSE : produces
```
---
## Aggregate Roots
| Aggregate | Root Entity | Child Entities |
|-----------|-------------|----------------|
| Chat | Conversation | ChatMessage |
| Voice | VoiceRequest | VoiceResponse |
| RAG | Document | Chunk, Embedding |
| Workflow | PipelineRun | Artifact |
| User | User | Session, Preferences |
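An aggregate root is the only entry point for mutating its children. A sketch of the Chat aggregate enforcing its invariants (creation requires a first message, messages are append-only and immutable; class and method names are illustrative, not taken from the repositories):

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChatMessage:
    id: str
    role: str  # "user" | "assistant" | "system"
    content: str


@dataclass
class Conversation:
    """Aggregate root: all ChatMessage changes go through here."""
    id: str
    user_id: str
    _messages: list = field(default_factory=list)

    def __post_init__(self):
        # Invariant: a Conversation must have at least one ChatMessage
        if not self._messages:
            raise ValueError("Conversation requires at least one message")

    def append(self, message: ChatMessage) -> None:
        # Messages are immutable (frozen) and append-only once created
        self._messages.append(message)

    @property
    def messages(self) -> tuple:
        return tuple(self._messages)  # read-only view, no external mutation
```

Exposing messages as a tuple keeps the append-only rule enforceable: callers can read history but cannot reorder or delete entries behind the root's back.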
---
## Event Flow
### Chat Event Stream
```
UserLogin
└─► SessionCreated
    └─► MessageReceived
        ├─► RAGQueryExecuted (optional)
        ├─► InferenceRequested
        └─► ResponseGenerated
            └─► MessageStored
```
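The chain above can be sketched as one handler that emits events in order, with the RAG step gated by the user's `enable_rag` preference. A sketch using the event names from the diagram; `publish` is a stand-in for a NATS JetStream publisher, not the real client call:

```python
def handle_message(content: str, enable_rag: bool, publish) -> list:
    """Emit the chat event chain for one incoming message."""
    events = ["MessageReceived"]
    if enable_rag:
        events.append("RAGQueryExecuted")  # the optional branch
    events += ["InferenceRequested", "ResponseGenerated", "MessageStored"]
    for event in events:
        publish(event)  # stand-in for publishing to a JetStream subject
    return events
```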
### Voice Event Stream
```
VoiceRequestReceived
└─► TranscriptionStarted
    └─► TranscriptionCompleted
        └─► RAGQueryExecuted (optional)
            └─► LLMInferenceStarted
                └─► LLMResponseGenerated
                    └─► TTSSynthesisStarted
                        └─► AudioResponseReady
```
### Workflow Event Stream
```
PipelineTriggerReceived
└─► PipelineRunCreated
    └─► StepStarted (repeated)
        └─► StepCompleted (repeated)
            └─► ArtifactProduced (repeated)
                └─► PipelineRunCompleted
```
---
## Data Retention
| Entity | Retention | Storage |
|--------|-----------|---------|
| ChatMessage | 30 days | JetStream → PostgreSQL |
| VoiceRequest/Response | 1 hour (audio), 30 days (text) | JetStream → PostgreSQL |
| Chunk/Embedding | Permanent | Milvus |
| PipelineRun | Permanent | PostgreSQL |
| Artifact | Permanent | MinIO |
| Session | 7 days | Valkey |
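The table maps directly to a per-entity TTL check; note that the audio and text halves of a voice exchange expire on different clocks. A sketch (the timedelta values are copied from the table; `None` means retained permanently; entity keys are illustrative):

```python
from datetime import datetime, timedelta, timezone

# Retention periods from the table above; None = permanent
RETENTION = {
    "chat_message": timedelta(days=30),
    "voice_audio": timedelta(hours=1),
    "voice_text": timedelta(days=30),
    "chunk": None,
    "pipeline_run": None,
    "artifact": None,
    "session": timedelta(days=7),
}


def is_expired(entity, created_at, now=None):
    """True when the entity has outlived its retention window."""
    ttl = RETENTION[entity]
    if ttl is None:
        return False  # permanent entities never expire
    now = now or datetime.now(timezone.utc)
    return now - created_at > ttl
```

In practice JetStream's `max_age` and Valkey TTLs enforce these windows at the storage layer; a helper like this is only needed for the PostgreSQL side.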
---
## Invariants
### Chat Context
- A ChatMessage must belong to exactly one Conversation
- A Conversation must have at least one ChatMessage
- Messages are immutable once created
### Voice Context
- A VoiceResponse must have a corresponding VoiceRequest
- Audio format must be one of: wav, webm, mp3
- Transcription cannot be empty for valid audio
### RAG Context
- A Chunk must belong to exactly one Document
- Embedding dimensions must match the model (1024 for BGE-large)
- A Document must have at least one Chunk
### Workflow Context
- A PipelineRun must reference a valid Pipeline
- Artifacts must have valid S3 URIs
- Run status transitions: pending → running → (succeeded|failed)
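The status transition rule reduces to a small table of legal moves, with `succeeded` and `failed` terminal. A sketch (the workflow engines manage their own state; this only illustrates the invariant):

```python
# Legal PipelineRun status transitions: pending → running → (succeeded|failed)
TRANSITIONS = {
    "pending": {"running"},
    "running": {"succeeded", "failed"},
    "succeeded": set(),  # terminal
    "failed": set(),     # terminal
}


def transition(current: str, target: str) -> str:
    """Return the new status, rejecting any move the table forbids."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    return target
```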
---
## Value Objects
```python
# Immutable value objects
from dataclasses import dataclass


@dataclass(frozen=True)
class MessageContent:
    text: str
    tokens: int


@dataclass(frozen=True)
class AudioData:
    data: bytes
    format: str
    duration_ms: int
    sample_rate: int


@dataclass(frozen=True)
class EmbeddingVector:
    values: tuple[float, ...]
    model: str
    dimensions: int


@dataclass(frozen=True)
class RAGContext:
    chunks: tuple[str, ...]
    scores: tuple[float, ...]
    query: str
```
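One payoff of `frozen=True`: a value object can validate its invariants once at construction and then be shared freely, since mutation raises `FrozenInstanceError`. A sketch extending EmbeddingVector with the dimension check from the RAG invariants (the `__post_init__` validation is an addition for illustration, not code from the repositories):

```python
import dataclasses
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingVector:
    values: tuple[float, ...]
    model: str
    dimensions: int

    def __post_init__(self):
        # Invariant: embedding dimensions must match the model
        if len(self.values) != self.dimensions:
            raise ValueError("embedding length does not match dimensions")


vec = EmbeddingVector(values=(0.1, 0.2), model="bge-large", dimensions=2)
try:
    vec.model = "other"  # frozen: any mutation attempt is rejected
except dataclasses.FrozenInstanceError:
    pass
```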
---
## Related Documents
- [ARCHITECTURE.md](ARCHITECTURE.md) - System architecture
- [GLOSSARY.md](GLOSSARY.md) - Term definitions
- [decisions/0004-use-messagepack-for-nats.md](decisions/0004-use-messagepack-for-nats.md) - Message format decision