# Use Milvus for Vector Storage

* Status: accepted
* Date: 2025-12-15
* Deciders: Billy Davies
* Technical Story: Selecting a vector database for the RAG system

## Context and Problem Statement

The RAG (Retrieval-Augmented Generation) system requires a vector database to store document embeddings and perform similarity search. We need to store millions of embeddings and query them with low latency.

## Decision Drivers

* Query performance (< 100 ms for top-k search)
* Scalability to millions of vectors
* Kubernetes-native deployment
* Active development and community
* Support for metadata filtering
* Backup and restore capabilities

## Considered Options

* Milvus
* Pinecone (managed)
* Qdrant
* Weaviate
* pgvector (PostgreSQL extension)
* Chroma

## Decision Outcome

Chosen option: "Milvus", because it provides production-grade vector search with excellent Kubernetes support, scalability, and active development.

### Positive Consequences

* High-performance similarity search
* Horizontal scalability
* Rich filtering and hybrid search
* Helm chart for Kubernetes
* Graduated LF AI & Data Foundation project
* GPU acceleration available

### Negative Consequences

* Complex architecture (multiple components)
* Higher resource usage than simpler alternatives
* Requires object storage (e.g. MinIO)
* Learning curve for optimization

## Pros and Cons of the Options

### Milvus

* Good, because production-proven at scale
* Good, because rich query API
* Good, because Kubernetes-native
* Good, because hybrid search (vector + scalar)
* Good, because LF AI & Data Foundation project
* Bad, because complex architecture
* Bad, because higher resource usage

### Pinecone

* Good, because fully managed
* Good, because simple API
* Good, because reliable
* Bad, because external dependency
* Bad, because cost at scale
* Bad, because data sovereignty concerns

### Qdrant

* Good, because simpler than Milvus
* Good, because Rust performance
* Good, because good filtering
* Bad, because smaller community
* Bad, because fewer enterprise features

### Weaviate

* Good, because built-in vectorization
* Good, because GraphQL API
* Good, because modules system
* Bad, because more opinionated
* Bad, because schema requirements

### pgvector

* Good, because familiar PostgreSQL
* Good, because simple deployment
* Good, because ACID transactions
* Bad, because limited scale
* Bad, because slower for large datasets
* Bad, because no specialized optimizations

### Chroma

* Good, because simple
* Good, because embedded option
* Bad, because not production-ready at scale
* Bad, because limited features

## Links

* [Milvus](https://milvus.io)
* [Milvus Helm Chart](https://github.com/milvus-io/milvus-helm)
* Related: [DOMAIN-MODEL.md](../DOMAIN-MODEL.md) - Chunk/Embedding entities
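
The "top-k search" and "hybrid search (vector + scalar)" drivers named in this record can be illustrated with a minimal brute-force sketch in plain Python. It is not the Milvus API, and it is not how Milvus executes queries (Milvus relies on approximate nearest-neighbor indexes rather than a full scan). The `Chunk` record, the `filtered_top_k` function, and the `source` metadata field are hypothetical names chosen for illustration only.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical record: an embedding plus scalar metadata."""
    chunk_id: str
    embedding: list
    metadata: dict = field(default_factory=dict)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filtered_top_k(chunks, query, k, predicate):
    """Apply the scalar filter first, then rank the survivors by similarity."""
    candidates = [c for c in chunks if predicate(c.metadata)]
    scored = [(cosine_similarity(c.embedding, query), c.chunk_id) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data: two-dimensional embeddings, assumed purely for readability.
chunks = [
    Chunk("a", [1.0, 0.0], {"source": "handbook"}),
    Chunk("b", [0.8, 0.6], {"source": "handbook"}),
    Chunk("c", [1.0, 0.0], {"source": "wiki"}),
    Chunk("d", [0.0, 1.0], {"source": "handbook"}),
]

results = filtered_top_k(
    chunks,
    query=[1.0, 0.0],
    k=2,
    predicate=lambda meta: meta.get("source") == "handbook",
)
print(results)
```

Chunk `c` matches the query exactly but is excluded by the metadata filter, so the two best handbook chunks are returned instead. A full scan like this costs time proportional to the number of stored vectors, which is why the latency and scale drivers point toward a dedicated vector database.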