
# Use Milvus for Vector Storage

* Status: accepted
* Date: 2025-12-15
* Deciders: Billy Davies
* Technical Story: Selecting vector database for RAG system

## Context and Problem Statement

The RAG (Retrieval-Augmented Generation) system requires a vector database to store document embeddings and perform similarity search. We need to store millions of embeddings and query them with low latency.
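To make "similarity search" concrete, the following is a minimal brute-force sketch of top-k retrieval by cosine similarity. It is an illustration only, not how Milvus works internally: the function names and the toy 3-dimensional embeddings are invented for this example, and a real vector database replaces the linear scan with an approximate nearest-neighbour index.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, embeddings, k):
    """Return the k (doc_id, score) pairs most similar to query, best first.

    embeddings maps a document id to its vector. Every vector is scanned,
    so the cost is O(n) per query.
    """
    scored = (
        (doc_id, cosine_similarity(query, vec))
        for doc_id, vec in embeddings.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy embeddings, made up for illustration.
embeddings = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
results = top_k([1.0, 0.0, 0.0], embeddings, k=2)
print(results)
```

The linear scan is what stops scaling at millions of vectors, which is the gap a dedicated vector database is meant to close.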

## Decision Drivers

* Query performance (< 100 ms for top-k search)
* Scalability to millions of vectors
* Kubernetes-native deployment
* Active development and community
* Support for metadata filtering
* Backup and restore capabilities
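
The latency driver above can be checked with a small harness along these lines. This is a sketch under assumptions: the ADR does not say which percentile the 100 ms target applies to, so p95 is used here as one reasonable reading, and `search_fn`, `fake_search`, and the query list are hypothetical placeholders for whatever client call is eventually benchmarked.

```python
import math
import time


def p95_latency_ms(search_fn, queries, k=10):
    """Call search_fn(query, k) once per query and return the p95 latency in ms."""
    if not queries:
        raise ValueError("at least one query is required")
    samples = []
    for query in queries:
        start = time.perf_counter()
        search_fn(query, k)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank percentile: the smallest sample that has at least
    # 95% of all samples at or below it.
    rank = math.ceil(0.95 * len(samples))
    return samples[rank - 1]


def meets_budget(search_fn, queries, budget_ms=100.0, k=10):
    """True if the measured p95 latency is under budget_ms."""
    return p95_latency_ms(search_fn, queries, k) < budget_ms


def fake_search(query, k):
    # Hypothetical stand-in for a real vector database query.
    return []


queries = [[0.0, 0.0, 0.0]] * 50
print(meets_budget(fake_search, queries))
```

In practice the measurement would run against a collection of realistic size, since latency on an empty or tiny collection says little about the millions-of-vectors case.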

## Considered Options

* Milvus
* Pinecone (managed)
* Qdrant
* Weaviate
* pgvector (PostgreSQL extension)
* Chroma

## Decision Outcome

Chosen option: "Milvus", because it provides production-grade vector search with excellent Kubernetes support, scalability, and active development.

### Positive Consequences

* High-performance similarity search
* Horizontal scalability
* Rich filtering and hybrid search
* Helm chart for Kubernetes
* Actively developed LF AI & Data Foundation graduated project
* GPU acceleration available

### Negative Consequences

* Complex architecture (multiple components)
* Higher resource usage than simpler alternatives
* Requires S3-compatible object storage (e.g., MinIO)
* Learning curve for optimization

## Pros and Cons of the Options

### Milvus

* Good, because production-proven at scale
* Good, because rich query API
* Good, because Kubernetes-native
* Good, because hybrid search (vector + scalar)
* Good, because LF AI & Data Foundation graduated project
* Bad, because complex architecture
* Bad, because higher resource usage
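
As a rough illustration of what "hybrid search (vector + scalar)" means, the sketch below filters records on a scalar metadata field and then ranks the survivors by vector distance. The record layout, the `source` field, and the function names are assumptions made for this example; Milvus exposes this through its own filter expressions and index-aware execution rather than a Python-side scan.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def filtered_search(query, records, predicate, k):
    """Return ids of the k nearest records whose metadata passes predicate."""
    # Scalar step: keep only records whose metadata matches.
    candidates = [r for r in records if predicate(r["metadata"])]
    # Vector step: rank the remaining records by distance to the query.
    candidates.sort(key=lambda r: l2_distance(query, r["vector"]))
    return [r["id"] for r in candidates[:k]]


# Toy records, made up for illustration.
records = [
    {"id": 1, "vector": [0.0, 0.0], "metadata": {"source": "wiki"}},
    {"id": 2, "vector": [0.1, 0.0], "metadata": {"source": "notes"}},
    {"id": 3, "vector": [0.5, 0.5], "metadata": {"source": "notes"}},
]
hits = filtered_search(
    [0.0, 0.0], records, lambda m: m["source"] == "notes", k=2
)
print(hits)  # [2, 3]
```

Record 1 is the closest vector but is excluded by the metadata filter, which is the behaviour the RAG system relies on when restricting retrieval to a subset of documents.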

### Pinecone

* Good, because fully managed
* Good, because simple API
* Good, because reliable
* Bad, because external dependency
* Bad, because cost at scale
* Bad, because data sovereignty concerns

### Qdrant

* Good, because simpler than Milvus
* Good, because Rust performance
* Good, because good filtering
* Bad, because smaller community
* Bad, because fewer enterprise features

### Weaviate

* Good, because built-in vectorization
* Good, because GraphQL API
* Good, because modules system
* Bad, because more opinionated
* Bad, because schema requirements

### pgvector

* Good, because familiar PostgreSQL
* Good, because simple deployment
* Good, because ACID transactions
* Bad, because limited scale
* Bad, because slower for large datasets
* Bad, because fewer vector-specific optimizations than a dedicated vector database

### Chroma

* Good, because simple
* Good, because embedded option
* Bad, because not production-ready at scale
* Bad, because limited features

## Links

* [Milvus](https://milvus.io)
* [Milvus Helm Chart](https://github.com/milvus-io/milvus-helm)
* Related: [DOMAIN-MODEL.md](../DOMAIN-MODEL.md) - Chunk/Embedding entities