📊 Domain Model

Core entities, bounded contexts, and relationships in the DaviesTechLabs homelab

Bounded Contexts

┌─────────────────────────────────────────────────────────────────────────────┐
│                           BOUNDED CONTEXTS                                   │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌───────────────────┐   ┌───────────────────┐   ┌───────────────────┐     │
│  │    CHAT CONTEXT   │   │   VOICE CONTEXT   │   │ WORKFLOW CONTEXT  │     │
│  ├───────────────────┤   ├───────────────────┤   ├───────────────────┤     │
│  │ • ChatSession     │   │ • VoiceSession    │   │ • Pipeline        │     │
│  │ • ChatMessage     │   │ • AudioChunk      │   │ • PipelineRun     │     │
│  │ • Conversation    │   │ • Transcription   │   │ • Artifact        │     │
│  │ • User            │   │ • SynthesizedAudio│   │ • Experiment      │     │
│  └─────────┬─────────┘   └─────────┬─────────┘   └─────────┬─────────┘     │
│            │                       │                       │                │
│            └───────────────────────┼───────────────────────┘                │
│                                    │                                        │
│                                    ▼                                        │
│  ┌───────────────────────────────────────────────────────────────────┐     │
│  │                    INFERENCE CONTEXT                               │     │
│  ├───────────────────────────────────────────────────────────────────┤     │
│  │ • InferenceRequest  • Model  • Embedding  • Document  • Chunk     │     │
│  └───────────────────────────────────────────────────────────────────┘     │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Core Entities

User Context

User:
  id: string (UUID)
  username: string
  premium: boolean
  preferences:
    voice_id: string
    model_preference: string
    enable_rag: boolean
  created_at: timestamp
  
Session:
  id: string (UUID)
  user_id: string
  type: "chat" | "voice"
  started_at: timestamp
  last_activity: timestamp
  metadata: object
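The Data Retention section below gives sessions a 7-day lifetime, keyed off last_activity. A minimal expiry check might look like this sketch (the helper name and TTL constant are illustrative, not from the repo):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Illustrative constant: the 7-day TTL mirrors the Session row
# in the Data Retention table below.
SESSION_TTL = timedelta(days=7)

def is_session_expired(last_activity: datetime, now: Optional[datetime] = None) -> bool:
    """True once last_activity is older than the retention TTL."""
    now = now or datetime.now(timezone.utc)
    return now - last_activity > SESSION_TTL
```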

Chat Context

ChatMessage:
  id: string (UUID)
  session_id: string
  user_id: string
  role: "user" | "assistant" | "system"
  content: string
  created_at: timestamp
  metadata:
    tokens_used: integer
    latency_ms: float
    rag_sources: string[]
    model_used: string

Conversation:
  id: string (UUID)
  user_id: string
  messages: ChatMessage[]
  title: string (auto-generated)
  created_at: timestamp
  updated_at: timestamp

Voice Context

VoiceRequest:
  id: string (UUID)
  user_id: string
  audio_b64: string (base64)
  format: "wav" | "webm" | "mp3"
  language: string
  premium: boolean
  enable_rag: boolean

VoiceResponse:
  id: string (UUID)
  request_id: string
  transcription: string
  response_text: string
  audio_b64: string (base64)
  audio_format: string
  latency_ms: float
  rag_docs_used: integer

Inference Context

InferenceRequest:
  id: string (UUID)
  service: "llm" | "stt" | "tts" | "embeddings" | "reranker"
  input: string | bytes
  parameters: object
  priority: "standard" | "premium"

InferenceResponse:
  id: string (UUID)
  request_id: string
  output: string | bytes | float[]
  metadata:
    model: string
    latency_ms: float
    tokens: integer (if applicable)
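As a sketch only, the InferenceRequest schema can be carried as a dataclass with a simple envelope codec. JSON is used here purely for illustration; the platform's actual wire encoding may differ, and the `encode`/`decode` helpers are hypothetical:

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from typing import Any

# Illustrative dataclass mirroring the InferenceRequest schema above.
@dataclass
class InferenceRequest:
    service: str          # "llm" | "stt" | "tts" | "embeddings" | "reranker"
    input: str
    parameters: dict[str, Any] = field(default_factory=dict)
    priority: str = "standard"   # "standard" | "premium"
    id: str = field(default_factory=lambda: str(uuid.uuid4()))

def encode(req: InferenceRequest) -> bytes:
    return json.dumps(asdict(req)).encode()

def decode(raw: bytes) -> InferenceRequest:
    return InferenceRequest(**json.loads(raw))
```

A round trip (`decode(encode(req))`) reproduces the original request, which is the property any real codec for these envelopes would need.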

RAG Context

Document:
  id: string (UUID)
  collection: string
  title: string
  content: string
  source_url: string
  ingested_at: timestamp

Chunk:
  id: string (UUID)
  document_id: string
  content: string
  embedding: float[1024]  # BGE-large dimensions
  metadata:
    position: integer
    page: integer

RAGQuery:
  query: string
  collection: string
  top_k: integer (default: 5)
  rerank: boolean (default: true)
  rerank_top_k: integer (default: 3)

RAGResult:
  chunks: Chunk[]
  scores: float[]
  reranked: boolean
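The RAGQuery defaults above imply a two-stage retrieval: fetch `top_k` candidates by similarity, then rerank and keep `rerank_top_k`. A sketch of that flow, with `search` and `rerank_fn` standing in for the vector-store search and the reranker service (both are assumptions, not real APIs):

```python
from dataclasses import dataclass

@dataclass
class ScoredChunk:
    content: str
    score: float

def rag_query(query: str, search, rerank_fn, top_k: int = 5,
              rerank: bool = True, rerank_top_k: int = 3) -> list:
    candidates = search(query, limit=top_k)           # stage 1: similarity search
    if not rerank:
        return candidates
    rescored = [ScoredChunk(c.content, rerank_fn(query, c.content))
                for c in candidates]                  # stage 2: rescore with reranker
    rescored.sort(key=lambda c: c.score, reverse=True)
    return rescored[:rerank_top_k]
```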

Workflow Context

Pipeline:
  id: string
  name: string
  version: string
  engine: "kubeflow" | "argo"
  definition: object (YAML)
  
PipelineRun:
  id: string (UUID)
  pipeline_id: string
  status: "pending" | "running" | "succeeded" | "failed"
  started_at: timestamp
  completed_at: timestamp
  parameters: object
  artifacts: Artifact[]

Artifact:
  id: string (UUID)
  run_id: string
  name: string
  type: "model" | "dataset" | "metrics" | "logs"
  uri: string (s3://)
  metadata: object

Experiment:
  id: string (UUID)
  name: string
  runs: PipelineRun[]
  metrics: object
  created_at: timestamp

Entity Relationships

erDiagram
    USER ||--o{ SESSION : has
    USER ||--o{ CONVERSATION : owns
    SESSION ||--o{ CHAT_MESSAGE : contains
    CONVERSATION ||--o{ CHAT_MESSAGE : contains
    
    USER ||--o{ VOICE_REQUEST : makes
    VOICE_REQUEST ||--|| VOICE_RESPONSE : produces
    
    DOCUMENT ||--o{ CHUNK : contains
    CHUNK ||--|| EMBEDDING : has
    
    PIPELINE ||--o{ PIPELINE_RUN : executed_as
    PIPELINE_RUN ||--o{ ARTIFACT : produces
    EXPERIMENT ||--o{ PIPELINE_RUN : tracks
    
    INFERENCE_REQUEST ||--|| INFERENCE_RESPONSE : produces

Aggregate Roots

| Aggregate | Root Entity  | Child Entities       |
|-----------|--------------|----------------------|
| Chat      | Conversation | ChatMessage          |
| Voice     | VoiceRequest | VoiceResponse        |
| RAG       | Document     | Chunk, Embedding     |
| Workflow  | PipelineRun  | Artifact             |
| User      | User         | Session, Preferences |
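An aggregate root mediates all changes to its children. A minimal sketch for the Chat aggregate, where ChatMessage objects are only created through the Conversation root (class shapes are illustrative, not the real models):

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ChatMessage:
    role: str
    content: str

@dataclass
class Conversation:
    id: str
    messages: list = field(default_factory=list)

    def add_message(self, role: str, content: str) -> ChatMessage:
        # The root enforces the role constraint before any child is created.
        if role not in ("user", "assistant", "system"):
            raise ValueError(f"invalid role: {role!r}")
        msg = ChatMessage(role, content)  # frozen => immutable once created
        self.messages.append(msg)
        return msg
```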

Event Flow

Chat Event Stream

UserLogin
  └─► SessionCreated
        └─► MessageReceived
              ├─► RAGQueryExecuted (optional)
              ├─► InferenceRequested
              └─► ResponseGenerated
                    └─► MessageStored

Voice Event Stream

VoiceRequestReceived
  └─► TranscriptionStarted
        └─► TranscriptionCompleted
              └─► RAGQueryExecuted (optional)
                    └─► LLMInferenceStarted
                          └─► LLMResponseGenerated
                                └─► TTSSynthesisStarted
                                      └─► AudioResponseReady

Workflow Event Stream

PipelineTriggerReceived
  └─► PipelineRunCreated
        └─► StepStarted (repeated)
              └─► StepCompleted (repeated)
                    └─► ArtifactProduced (repeated)
                          └─► PipelineRunCompleted

Data Retention

| Entity                | Retention                      | Storage                |
|-----------------------|--------------------------------|------------------------|
| ChatMessage           | 30 days                        | JetStream → PostgreSQL |
| VoiceRequest/Response | 1 hour (audio), 30 days (text) | JetStream → PostgreSQL |
| Chunk/Embedding       | Permanent                      | Milvus                 |
| PipelineRun           | Permanent                      | PostgreSQL             |
| Artifact              | Permanent                      | MinIO                  |
| Session               | 7 days                         | Valkey                 |

Invariants

Chat Context

  • A ChatMessage must belong to exactly one Conversation
  • A Conversation must have at least one ChatMessage
  • Messages are immutable once created

Voice Context

  • A VoiceResponse must have a corresponding VoiceRequest
  • Audio format must be one of: wav, webm, mp3
  • Transcription cannot be empty for valid audio

RAG Context

  • A Chunk must belong to exactly one Document
  • Embedding dimensions must match the embedding model (1024 for BGE-large)
  • A Document must have at least one Chunk
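The dimension invariant is cheap to enforce at write time. A sketch, where the constant and function names are illustrative, not from the codebase; 1024 matches the BGE-large size given in the Chunk schema:

```python
# Illustrative guard for the embedding-dimension invariant.
BGE_LARGE_DIM = 1024

def validate_embedding(values, expected_dim: int = BGE_LARGE_DIM) -> None:
    """Raise if the vector does not match the expected model dimensions."""
    if len(values) != expected_dim:
        raise ValueError(
            f"embedding has {len(values)} dimensions, expected {expected_dim}"
        )
```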

Workflow Context

  • A PipelineRun must reference a valid Pipeline
  • Artifacts must have valid S3 URIs
  • Run status transitions: pending → running → (succeeded|failed)
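The status-transition invariant can be expressed as a small state machine. This transition table is a sketch of the invariant above, not the workflow engines' actual logic:

```python
# Legal run-status transitions: pending -> running -> (succeeded | failed).
TRANSITIONS = {
    "pending": {"running"},
    "running": {"succeeded", "failed"},
    "succeeded": set(),  # terminal
    "failed": set(),     # terminal
}

def transition(current: str, new: str) -> str:
    """Return the new status, or raise if the transition is not allowed."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new
```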

Value Objects

# Immutable value objects
from dataclasses import dataclass

@dataclass(frozen=True)
class MessageContent:
    text: str
    tokens: int

@dataclass(frozen=True)  
class AudioData:
    data: bytes
    format: str
    duration_ms: int
    sample_rate: int

@dataclass(frozen=True)
class EmbeddingVector:
    values: tuple[float, ...]
    model: str
    dimensions: int

@dataclass(frozen=True)
class RAGContext:
    chunks: tuple[str, ...]
    scores: tuple[float, ...]
    query: str
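frozen=True is what makes these value objects immutable: attribute assignment after construction raises. A quick demonstration, with EmbeddingVector restated so the snippet is self-contained:

```python
import dataclasses
from dataclasses import dataclass

# Restated from the value objects above for a self-contained example.
@dataclass(frozen=True)
class EmbeddingVector:
    values: tuple
    model: str
    dimensions: int

vec = EmbeddingVector(values=(0.1, 0.2), model="bge-large", dimensions=2)
try:
    vec.model = "other"  # frozen=True turns attribute assignment into an error
    mutation_blocked = False
except dataclasses.FrozenInstanceError:
    mutation_blocked = True
```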