feat: add comprehensive architecture documentation

- Add AGENT-ONBOARDING.md for AI agents
- Add ARCHITECTURE.md with full system overview
- Add TECH-STACK.md with complete technology inventory
- Add DOMAIN-MODEL.md with entities and bounded contexts
- Add CODING-CONVENTIONS.md with patterns and practices
- Add GLOSSARY.md with terminology reference
- Add C4 diagrams (Context and Container levels)
- Add 10 ADRs documenting key decisions:
  - Talos Linux, NATS, MessagePack, Multi-GPU strategy
  - GitOps with Flux, KServe, Milvus, Dual workflow engines
  - Envoy Gateway
- Add specs directory with JetStream configuration
- Add diagrams for GPU allocation and data flows

Based on analysis of homelab-k8s2 and llm-workflows repositories and
kubectl cluster-info dump data.
decisions/0008-use-milvus-for-vectors.md
@@ -0,0 +1,107 @@
# Use Milvus for Vector Storage

* Status: accepted
* Date: 2025-12-15
* Deciders: Billy Davies
* Technical Story: Selecting vector database for RAG system

## Context and Problem Statement

The RAG (Retrieval-Augmented Generation) system requires a vector database to store document embeddings and perform similarity search. We need to store millions of embeddings and query them with low latency.
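
As a rough sizing aid for the "millions of embeddings" requirement, the sketch below estimates the raw storage needed for dense float32 vectors. The vector counts and the 1024-dimension figure are illustrative assumptions, not values taken from this system, and the estimate ignores index structures and metadata.

```python
def raw_embedding_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw size of dense embeddings, excluding index structures and metadata.

    bytes_per_value defaults to 4, the size of one float32 value.
    """
    return num_vectors * dim * bytes_per_value


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical scenarios: both the counts and the dimension are assumptions.
    for n in (1_000_000, 10_000_000):
        size = raw_embedding_bytes(n, dim=1024)
        print(f"{n:>12,} vectors x 1024 dims: {to_gib(size):.1f} GiB raw")
```

Actual memory and disk usage will be higher once an index and any replicas are added, so this is best read as a lower bound.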

## Decision Drivers

* Query performance (< 100ms for top-k search)
* Scalability to millions of vectors
* Kubernetes-native deployment
* Active development and community
* Support for metadata filtering
* Backup and restore capabilities
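
To make the query-related drivers concrete, here is a minimal brute-force sketch of a top-k cosine-similarity search with a metadata filter, plus a timer for checking a latency budget. It uses only the Python standard library and is not Milvus client code. The `Chunk` record, its `source` field, and the sample data are hypothetical.

```python
import heapq
import math
import time
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Chunk:
    """Hypothetical record: an embedding plus one scalar metadata field."""
    chunk_id: int
    source: str
    embedding: List[float]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: List[float],
    chunks: List[Chunk],
    k: int,
    source: Optional[str] = None,
) -> List[Tuple[float, int]]:
    """Return the k most similar chunks as (score, chunk_id), best first.

    When `source` is given, only chunks with that metadata value are scored.
    """
    candidates = (c for c in chunks if source is None or c.source == source)
    scored = ((cosine(query, c.embedding), c.chunk_id) for c in candidates)
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    chunks = [
        Chunk(1, "wiki", [1.0, 0.0]),
        Chunk(2, "wiki", [0.6, 0.8]),
        Chunk(3, "notes", [0.9, 0.1]),
    ]
    start = time.perf_counter()
    hits = top_k([1.0, 0.0], chunks, k=2, source="wiki")
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(hits, f"{elapsed_ms:.3f} ms")
```

A linear scan like this costs time proportional to the number of stored vectors. Vector databases typically use approximate nearest-neighbor indexes instead, which is what makes a sub-100ms target plausible at millions of vectors.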

## Considered Options

* Milvus
* Pinecone (managed)
* Qdrant
* Weaviate
* pgvector (PostgreSQL extension)
* Chroma

## Decision Outcome

Chosen option: "Milvus", because it provides production-grade vector search with excellent Kubernetes support, scalability, and active development.

### Positive Consequences

* High-performance similarity search
* Horizontal scalability
* Rich filtering and hybrid search
* Helm chart for Kubernetes
* Active open-source project (graduated LF AI & Data Foundation project)
* GPU acceleration available

### Negative Consequences

* Complex architecture (multiple components)
* Higher resource usage than simpler alternatives
* Requires object storage (MinIO)
* Learning curve for optimization

## Pros and Cons of the Options

### Milvus

* Good, because production-proven at scale
* Good, because rich query API
* Good, because Kubernetes-native
* Good, because hybrid search (vector + scalar)
* Good, because graduated LF AI & Data Foundation project
* Bad, because complex architecture
* Bad, because higher resource usage

### Pinecone

* Good, because fully managed
* Good, because simple API
* Good, because reliable
* Bad, because external dependency
* Bad, because cost at scale
* Bad, because data sovereignty concerns

### Qdrant

* Good, because simpler than Milvus
* Good, because Rust performance
* Good, because good filtering
* Bad, because smaller community
* Bad, because fewer enterprise features

### Weaviate

* Good, because built-in vectorization
* Good, because GraphQL API
* Good, because modules system
* Bad, because more opinionated
* Bad, because schema requirements

### pgvector

* Good, because familiar PostgreSQL
* Good, because simple deployment
* Good, because ACID transactions
* Bad, because limited scale
* Bad, because slower for large datasets
* Bad, because no specialized optimizations

### Chroma

* Good, because simple
* Good, because embedded option
* Bad, because not production-ready at scale
* Bad, because limited features

## Links

* [Milvus](https://milvus.io)
* [Milvus Helm Chart](https://github.com/milvus-io/milvus-helm)
* Related: [DOMAIN-MODEL.md](../DOMAIN-MODEL.md) - Chunk/Embedding entities