feat: add comprehensive architecture documentation

- Add AGENT-ONBOARDING.md for AI agents
- Add ARCHITECTURE.md with full system overview
- Add TECH-STACK.md with complete technology inventory
- Add DOMAIN-MODEL.md with entities and bounded contexts
- Add CODING-CONVENTIONS.md with patterns and practices
- Add GLOSSARY.md with terminology reference
- Add C4 diagrams (Context and Container levels)
- Add 10 ADRs documenting key decisions:
  - Talos Linux, NATS, MessagePack, Multi-GPU strategy
  - GitOps with Flux, KServe, Milvus, Dual workflow engines
  - Envoy Gateway
- Add specs directory with JetStream configuration
- Add diagrams for GPU allocation and data flows

Based on analysis of homelab-k8s2 and llm-workflows repositories
and kubectl cluster-info dump data.
2026-02-01 14:30:05 -05:00
parent 4d4f6f464c
commit 832cda34bd
26 changed files with 3805 additions and 2 deletions

@@ -0,0 +1,47 @@
%% GPU Allocation Diagram
%% Shows how AI workloads are distributed across GPU nodes

flowchart TB
    subgraph khelben["🖥️ khelben (AMD Strix Halo 64GB)"]
        direction TB
        vllm["🧠 vLLM<br/>LLM Inference<br/>100% GPU"]
    end

    subgraph elminster["🖥️ elminster (NVIDIA RTX 2070 8GB)"]
        direction TB
        whisper["🎤 Whisper<br/>STT<br/>~50% GPU"]
        xtts["🔊 XTTS<br/>TTS<br/>~50% GPU"]
    end

    subgraph drizzt["🖥️ drizzt (AMD Radeon 680M 12GB)"]
        direction TB
        embeddings["📊 BGE Embeddings<br/>Vector Encoding<br/>~80% GPU"]
    end

    subgraph danilo["🖥️ danilo (Intel Arc)"]
        direction TB
        reranker["📋 BGE Reranker<br/>Document Ranking<br/>~80% GPU"]
    end

    subgraph workloads["Workload Routing"]
        chat["💬 Chat Request"]
        voice["🎤 Voice Request"]
    end

    chat --> embeddings
    chat --> reranker
    chat --> vllm

    voice --> whisper
    voice --> embeddings
    voice --> reranker
    voice --> vllm
    voice --> xtts

    classDef nvidia fill:#76B900,color:white
    classDef amd fill:#ED1C24,color:white
    classDef intel fill:#0071C5,color:white

    class whisper,xtts nvidia
    class vllm,embeddings amd
    class reranker intel
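
The routing edges and per-node GPU shares in the diagram can be read as a small lookup table. The Python sketch below is illustrative only and is not code from the repository: `Workload`, `WORKLOADS`, `ROUTES`, `nodes_for`, and `gpu_load_by_node` are hypothetical names. It assumes the edge order is simply the order in which the diagram declares the arrows (the diagram does not state an execution order), and it takes the GPU shares from the approximate figures in the node labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    node: str         # GPU node hosting the workload (subgraph id in the diagram)
    vendor: str       # colour class applied in the diagram: nvidia, amd, or intel
    gpu_share: float  # approximate fraction of the node's GPU, from the node label


# Hypothetical table transcribed from the diagram's subgraphs and class lines.
WORKLOADS = {
    "vllm": Workload("khelben", "amd", 1.00),
    "whisper": Workload("elminster", "nvidia", 0.50),
    "xtts": Workload("elminster", "nvidia", 0.50),
    "embeddings": Workload("drizzt", "amd", 0.80),
    "reranker": Workload("danilo", "intel", 0.80),
}

# Edges of the "Workload Routing" subgraph, in declaration order.
# Assumption: declaration order is used only for display, not as a pipeline order.
ROUTES = {
    "chat": ["embeddings", "reranker", "vllm"],
    "voice": ["whisper", "embeddings", "reranker", "vllm", "xtts"],
}


def nodes_for(request_type):
    """Return the GPU nodes a request type touches, de-duplicated, in edge order."""
    seen = []
    for name in ROUTES[request_type]:
        node = WORKLOADS[name].node
        if node not in seen:
            seen.append(node)
    return seen


def gpu_load_by_node():
    """Sum the approximate GPU shares of all workloads placed on each node."""
    totals = {}
    for workload in WORKLOADS.values():
        totals[workload.node] = totals.get(workload.node, 0.0) + workload.gpu_share
    return totals


if __name__ == "__main__":
    for request_type in ROUTES:
        print(request_type, "->", nodes_for(request_type))
    print(gpu_load_by_node())
```

Under these approximate figures, a voice request touches all four GPU nodes, and elminster is the only node shared by two workloads (Whisper and XTTS), which together account for roughly its whole GPU.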