feat: add comprehensive architecture documentation

- Add AGENT-ONBOARDING.md for AI agents
- Add ARCHITECTURE.md with full system overview
- Add TECH-STACK.md with complete technology inventory
- Add DOMAIN-MODEL.md with entities and bounded contexts
- Add CODING-CONVENTIONS.md with patterns and practices
- Add GLOSSARY.md with terminology reference
- Add C4 diagrams (Context and Container levels)
- Add 10 ADRs documenting key decisions:
  - Talos Linux, NATS, MessagePack, Multi-GPU strategy
  - GitOps with Flux, KServe, Milvus, Dual workflow engines
  - Envoy Gateway
- Add specs directory with JetStream configuration
- Add diagrams for GPU allocation and data flows

Based on analysis of homelab-k8s2 and llm-workflows repositories and kubectl cluster-info dump data.
2026-02-01 14:30:05 -05:00
parent 4d4f6f464c
commit 832cda34bd
26 changed files with 3805 additions and 2 deletions
--- a/decisions/0008-use-milvus-for-vectors.md
+++ b/decisions/0008-use-milvus-for-vectors.md
@@ -0,0 +1,107 @@
+# Use Milvus for Vector Storage
+
+* Status: accepted
+* Date: 2025-12-15
+* Deciders: Billy Davies
+* Technical Story: Selecting vector database for RAG system
+
+## Context and Problem Statement
+
+The RAG (Retrieval-Augmented Generation) system requires a vector database to store document embeddings and perform similarity search. We need to store millions of embeddings and query them with low latency.
+
+## Decision Drivers
+
+* Query performance (< 100ms for top-k search)
+* Scalability to millions of vectors
+* Kubernetes-native deployment
+* Active development and community
+* Support for metadata filtering
+* Backup and restore capabilities
+
+## Considered Options
+
+* Milvus
+* Pinecone (managed)
+* Qdrant
+* Weaviate
+* pgvector (PostgreSQL extension)
+* Chroma
+
+## Decision Outcome
+
+Chosen option: "Milvus", because it provides production-grade vector search with excellent Kubernetes support, scalability, and active development.
+
+### Positive Consequences
+
+* High-performance similarity search
+* Horizontal scalability
+* Rich filtering and hybrid search
+* Helm chart for Kubernetes
+* Active LF AI & Data Foundation graduated project
+* GPU acceleration available
+
+### Negative Consequences
+
+* Complex architecture (multiple components)
+* Higher resource usage than simpler alternatives
+* Requires object storage (MinIO)
+* Learning curve for optimization
+
+## Pros and Cons of the Options
+
+### Milvus
+
+* Good, because production-proven at scale
+* Good, because rich query API
+* Good, because Kubernetes-native
+* Good, because hybrid search (vector + scalar)
+* Good, because LF AI & Data Foundation graduated project
+* Bad, because complex architecture
+* Bad, because higher resource usage
+
+### Pinecone
+
+* Good, because fully managed
+* Good, because simple API
+* Good, because reliable
+* Bad, because external dependency
+* Bad, because cost at scale
+* Bad, because data sovereignty concerns
+
+### Qdrant
+
+* Good, because simpler than Milvus
+* Good, because implemented in Rust with strong performance
+* Good, because good filtering
+* Bad, because smaller community
+* Bad, because fewer enterprise features
+
+### Weaviate
+
+* Good, because built-in vectorization
+* Good, because GraphQL API
+* Good, because modules system
+* Bad, because more opinionated
+* Bad, because schema requirements
+
+### pgvector
+
+* Good, because familiar PostgreSQL
+* Good, because simple deployment
+* Good, because ACID transactions
+* Bad, because limited scale
+* Bad, because slower for large datasets
+* Bad, because no specialized optimizations
+
+### Chroma
+
+* Good, because simple
+* Good, because embedded option
+* Bad, because not production-ready at scale
+* Bad, because limited features
+
+## Links
+
+* [Milvus](https://milvus.io)
+* [Milvus Helm Chart](https://github.com/milvus-io/milvus-helm)
+* Related: [DOMAIN-MODEL.md](../DOMAIN-MODEL.md) - Chunk/Embedding entities