# πŸ“Š Domain Model

> **Core entities, bounded contexts, and relationships in the DaviesTechLabs homelab**

## Bounded Contexts

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                              BOUNDED CONTEXTS                              β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                            β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”       β”‚
β”‚  β”‚   CHAT CONTEXT    β”‚  β”‚   VOICE CONTEXT   β”‚  β”‚ WORKFLOW CONTEXT  β”‚       β”‚
β”‚  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€       β”‚
β”‚  β”‚ β€’ ChatSession     β”‚  β”‚ β€’ VoiceSession    β”‚  β”‚ β€’ Pipeline        β”‚       β”‚
β”‚  β”‚ β€’ ChatMessage     β”‚  β”‚ β€’ AudioChunk      β”‚  β”‚ β€’ PipelineRun     β”‚       β”‚
β”‚  β”‚ β€’ Conversation    β”‚  β”‚ β€’ Transcription   β”‚  β”‚ β€’ Artifact        β”‚       β”‚
β”‚  β”‚ β€’ User            β”‚  β”‚ β€’ SynthesizedAudioβ”‚  β”‚ β€’ Experiment      β”‚       β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜       β”‚
β”‚            β”‚                      β”‚                      β”‚                 β”‚
β”‚            β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                 β”‚
β”‚                                   β–Ό                                        β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”       β”‚
β”‚  β”‚                        INFERENCE CONTEXT                        β”‚       β”‚
β”‚  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€       β”‚
β”‚  β”‚ β€’ InferenceRequest  β€’ Model  β€’ Embedding  β€’ Document  β€’ Chunk  β”‚       β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜       β”‚
β”‚                                                                            β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

---

## Core Entities

### User Context

```yaml
User:
  id: string (UUID)
  username: string
  premium: boolean
  preferences:
    voice_id: string
    model_preference: string
    enable_rag: boolean
  created_at: timestamp

Session:
  id: string (UUID)
  user_id: string
  type: "chat" | "voice"
  started_at: timestamp
  last_activity: timestamp
  metadata: object
```

### Chat Context

```yaml
ChatMessage:
  id: string (UUID)
  session_id: string
  user_id: string
  role: "user" | "assistant" | "system"
  content: string
  created_at: timestamp
  metadata:
    tokens_used: integer
    latency_ms: float
    rag_sources: string[]
    model_used: string

Conversation:
  id: string (UUID)
  user_id: string
  messages: ChatMessage[]
  title: string (auto-generated)
  created_at: timestamp
  updated_at: timestamp
```

### Voice Context

```yaml
VoiceRequest:
  id: string (UUID)
  user_id: string
  audio_b64: string (base64)
  format: "wav" | "webm" | "mp3"
  language: string
  premium: boolean
  enable_rag: boolean

VoiceResponse:
  id: string (UUID)
  request_id: string
  transcription: string
  response_text: string
  audio_b64: string (base64)
  audio_format: string
  latency_ms: float
  rag_docs_used: integer
```

### Inference Context

```yaml
InferenceRequest:
  id: string (UUID)
  service: "llm" | "stt" | "tts" | "embeddings" | "reranker"
  input: string | bytes
  parameters: object
  priority: "standard" | "premium"

InferenceResponse:
  id: string (UUID)
  request_id: string
  output: string | bytes | float[]
  metadata:
    model: string
    latency_ms: float
    tokens: integer (if applicable)
```

### RAG Context

```yaml
Document:
  id: string (UUID)
  collection: string
  title: string
  content: string
  source_url: string
  ingested_at: timestamp

Chunk:
  id: string (UUID)
  document_id: string
  content: string
  embedding: float[1024]  # BGE-large dimensions
  metadata:
    position: integer
    page: integer

RAGQuery:
  query: string
  collection: string
  top_k: integer (default: 5)
  rerank: boolean (default: true)
  rerank_top_k: integer (default: 3)

RAGResult:
  chunks: Chunk[]
  scores: float[]
  reranked: boolean
```

### Workflow Context

```yaml
Pipeline:
  id: string
  name: string
  version: string
  engine: "kubeflow" | "argo"
  definition: object (YAML)

PipelineRun:
  id: string (UUID)
  pipeline_id: string
  status: "pending" | "running" | "succeeded" | "failed"
  started_at: timestamp
  completed_at: timestamp
  parameters: object
  artifacts: Artifact[]

Artifact:
  id: string (UUID)
  run_id: string
  name: string
  type: "model" | "dataset" | "metrics" | "logs"
  uri: string (s3://)
  metadata: object

Experiment:
  id: string (UUID)
  name: string
  runs: PipelineRun[]
  metrics: object
  created_at: timestamp
```

---

## Entity Relationships

```mermaid
erDiagram
    USER ||--o{ SESSION : has
    USER ||--o{ CONVERSATION : owns
    SESSION ||--o{ CHAT_MESSAGE : contains
    CONVERSATION ||--o{ CHAT_MESSAGE : contains
    USER ||--o{ VOICE_REQUEST : makes
    VOICE_REQUEST ||--|| VOICE_RESPONSE : produces
    DOCUMENT ||--o{ CHUNK : contains
    CHUNK ||--|| EMBEDDING : has
    PIPELINE ||--o{ PIPELINE_RUN : executed_as
    PIPELINE_RUN ||--o{ ARTIFACT : produces
    EXPERIMENT ||--o{ PIPELINE_RUN : tracks
    INFERENCE_REQUEST ||--|| INFERENCE_RESPONSE : produces
```

---

## Aggregate Roots

| Aggregate | Root Entity | Child Entities |
|-----------|-------------|----------------|
| Chat | Conversation | ChatMessage |
| Voice | VoiceRequest | VoiceResponse |
| RAG | Document | Chunk, Embedding |
| Workflow | PipelineRun | Artifact |
| User | User | Session, Preferences |

---

## Event Flow

### Chat Event Stream

```
UserLogin
└─► SessionCreated
    └─► MessageReceived
        β”œβ”€β–Ί RAGQueryExecuted (optional)
        β”œβ”€β–Ί InferenceRequested
        └─► ResponseGenerated
            └─► MessageStored
```

### Voice Event Stream

```
VoiceRequestReceived
└─► TranscriptionStarted
    └─► TranscriptionCompleted
        └─► RAGQueryExecuted (optional)
            └─► LLMInferenceStarted
                └─► LLMResponseGenerated
                    └─► TTSSynthesisStarted
                        └─► AudioResponseReady
```

### Workflow Event Stream

```
PipelineTriggerReceived
└─► PipelineRunCreated
    └─► StepStarted (repeated)
        └─► StepCompleted (repeated)
            └─► ArtifactProduced (repeated)
                └─► PipelineRunCompleted
```

---

## Data Retention

| Entity | Retention | Storage |
|--------|-----------|---------|
| ChatMessage | 30 days | JetStream β†’ PostgreSQL |
| VoiceRequest/Response | 1 hour (audio), 30 days (text) | JetStream β†’ PostgreSQL |
| Chunk/Embedding | Permanent | Milvus |
| PipelineRun | Permanent | PostgreSQL |
| Artifact | Permanent | MinIO |
| Session | 7 days | Valkey |

---

## Invariants

### Chat Context

- A ChatMessage must belong to exactly one Conversation
- A Conversation must have at least one ChatMessage
- Messages are immutable once created

### Voice Context

- A VoiceResponse must have a corresponding VoiceRequest
- Audio format must be one of: wav, webm, mp3
- Transcription cannot be empty for valid audio

### RAG Context

- A Chunk must belong to exactly one Document
- Embedding dimensions must match the model (1024 for BGE-large)
- A Document must have at least one Chunk

### Workflow Context

- A PipelineRun must reference a valid Pipeline
- Artifacts must have valid S3 URIs
- Run status transitions: pending β†’ running β†’ (succeeded | failed)

---

## Value Objects

```python
# Immutable value objects
from dataclasses import dataclass


@dataclass(frozen=True)
class MessageContent:
    text: str
    tokens: int


@dataclass(frozen=True)
class AudioData:
    data: bytes
    format: str
    duration_ms: int
    sample_rate: int


@dataclass(frozen=True)
class EmbeddingVector:
    values: tuple[float, ...]
    model: str
    dimensions: int


@dataclass(frozen=True)
class RAGContext:
    chunks: tuple[str, ...]
    scores: tuple[float, ...]
    query: str
```

---

## Related Documents

- [ARCHITECTURE.md](ARCHITECTURE.md) - System architecture
- [GLOSSARY.md](GLOSSARY.md) - Term definitions
- [decisions/0004-use-messagepack-for-nats.md](decisions/0004-use-messagepack-for-nats.md) - Message format decision
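The Workflow Context invariant on run status (pending β†’ running β†’ succeeded | failed) can be sketched as a small transition table. This is an illustrative sketch only: the `RunStatus` enum and `advance` helper are hypothetical names, not part of the actual codebase.

```python
from enum import Enum


class RunStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


# Legal transitions per the Workflow Context invariants:
# pending -> running -> (succeeded | failed); terminal states have no exits.
_TRANSITIONS = {
    RunStatus.PENDING: {RunStatus.RUNNING},
    RunStatus.RUNNING: {RunStatus.SUCCEEDED, RunStatus.FAILED},
    RunStatus.SUCCEEDED: set(),
    RunStatus.FAILED: set(),
}


def advance(current: RunStatus, target: RunStatus) -> RunStatus:
    """Return the new status, raising if the transition is illegal."""
    if target not in _TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Encoding the invariant as data rather than scattered `if` checks keeps the allowed state graph in one place, so a status update like `advance(run.status, RunStatus.RUNNING)` either succeeds or fails loudly.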
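The RAG Context invariant that embedding dimensions must match the model (1024 for BGE-large) could be enforced directly on the `EmbeddingVector` value object. A minimal sketch, assuming a hypothetical `EXPECTED_DIMS` lookup table and a `__post_init__` check that are not taken from the actual codebase:

```python
from dataclasses import dataclass

# Assumed model -> dimension table; only BGE-large (1024) is stated in this doc.
EXPECTED_DIMS = {"bge-large": 1024}


@dataclass(frozen=True)
class EmbeddingVector:
    values: tuple[float, ...]
    model: str
    dimensions: int

    def __post_init__(self) -> None:
        # Invariant: the declared dimensions must match the actual vector
        # length, and the model's expected dimensionality when known.
        if len(self.values) != self.dimensions:
            raise ValueError("vector length does not match declared dimensions")
        expected = EXPECTED_DIMS.get(self.model)
        if expected is not None and self.dimensions != expected:
            raise ValueError(f"{self.model} expects {expected} dimensions")
```

Because the dataclass is frozen, a vector that passes construction can never drift out of spec later, which keeps the Milvus-bound data valid by construction.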