Key Takeaway
Vector database selection should be driven by your query pattern (real-time vs. batch, single-vector vs. multi-vector) rather than benchmark leaderboards that may not reflect your workload.
When to Use This Template
Use this ADR when your team is selecting a vector database for RAG pipelines, semantic search, recommendation systems, or any application that requires similarity search over embeddings. This template is appropriate whether you are evaluating managed services (Pinecone, Weaviate Cloud, Qdrant Cloud), self-hosted options (Milvus, Qdrant, Chroma), or vector extensions on existing databases (pgvector, Elasticsearch kNN). It is especially valuable when the team is transitioning from a prototype vector store to a production-grade solution.
ADR Template
# ADR: Vector Database Selection
## Status
[Proposed | Accepted | Deprecated | Superseded by ADR-XXX]
## Date
YYYY-MM-DD
## Decision Makers
- [Name, Role]
## Context
### Use Case
- Application: [e.g., "RAG for customer support knowledge base"]
- Query type: [single-vector similarity | multi-vector | hybrid keyword+vector]
- Write pattern: [batch ingestion | real-time upserts | append-only]
### Scale Requirements
- Current vector count: [e.g., "2M vectors"]
- Projected vector count (12 months): [e.g., "20M vectors"]
- Embedding dimensions: [e.g., "1536 (OpenAI) or 1024 (Cohere)"]
- Query volume: [e.g., "500 queries/second peak"]
- Latency SLA: [e.g., "p99 < 100ms for top-10 retrieval"]
### Constraints
- Data residency: [region requirements]
- Existing infrastructure: [cloud provider, Kubernetes, managed services]
- Team expertise: [relevant operational experience]
- Budget: [monthly ceiling]
## Options Considered
| Criterion | Option A | Option B | Option C |
|-----------|----------|----------|----------|
| Index algorithm (HNSW, IVF, etc.) | | | |
| Max vector dimensions | | | |
| Filtering capability | | | |
| Hybrid search support | | | |
| Scaling model (vertical/horizontal) | | | |
| Managed vs. self-hosted | | | |
| Multi-tenancy support | | | |
| Backup and recovery | | | |
| Pricing model | | | |
| SDK ecosystem | | | |
## Evaluation Criteria (Weighted)
1. **Query performance at target scale** (30%) — Latency and throughput at projected data volume
2. **Operational complexity** (25%) — Deployment, scaling, backup, upgrades
3. **Cost efficiency** (20%) — Total cost including compute, storage, and operations
4. **Integration fit** (15%) — SDK quality, existing stack compatibility
5. **Feature completeness** (10%) — Filtering, hybrid search, metadata handling
## Decision
We will use [database] because [rationale].
## Consequences
### Accepted Trade-offs
- [e.g., "Higher operational burden in exchange for cost control"]
### Migration Plan
- [Steps to migrate from prototype to selected database]
### Action Items
- [ ] Set up staging environment with representative data volume
- [ ] Run load test at 2x projected query volume
- [ ] Implement backup and recovery procedures
- [ ] Configure monitoring dashboards
## Review Trigger
- [ ] Vector count exceeds [threshold]
- [ ] Query latency p99 exceeds [threshold] for more than [duration]
- [ ] Monthly cost exceeds [threshold]
- [ ] New index algorithm support needed
Section-by-Section Guidance
Scale Requirements
Be honest about your current scale and realistic about projections. Many teams over-provision vector databases based on optimistic growth projections, leading to unnecessary cost. Document both the current state and the 12-month projection, and use the Review Trigger section to define when to re-evaluate, rather than pre-optimizing for scale you may never reach.
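When sanity-checking projections, a rough memory-footprint estimate often settles the "how big is this really?" question. The sketch below multiplies raw float32 storage by an index-overhead factor; the 1.5x overhead is an assumption for illustration (graph-based indexes such as HNSW commonly add somewhere in that range), not a measured constant for any particular database.

```python
def vector_memory_gib(n_vectors, dim, bytes_per_component=4, overhead=1.5):
    """Rough index memory footprint in GiB.

    Raw storage assumes float32 components (4 bytes each). The `overhead`
    multiplier approximates index structures on top of raw vectors; 1.5 is
    an illustrative assumption, not a vendor-specific figure.
    """
    raw_bytes = n_vectors * dim * bytes_per_component
    return raw_bytes * overhead / 2**30

# Example: a 12-month projection of 10M vectors at 1536 dimensions.
print(f"{vector_memory_gib(10_000_000, 1536):.1f} GiB")  # roughly 86 GiB
```

An estimate like this makes it concrete whether the projected corpus fits in memory on your existing database hardware or genuinely demands a horizontally scaled dedicated store.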
Options Comparison
Always include the option of using a vector extension on your existing database (pgvector, Elasticsearch kNN) alongside dedicated vector databases. For many workloads under 10 million vectors, a well-tuned pgvector extension eliminates an entire infrastructure component and the operational burden that comes with it. Only choose a dedicated vector database when you have a clear scaling or performance requirement that your existing database cannot meet.
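To make the pgvector option concrete, here is a sketch of the kind of similarity query it supports, expressed as a parameterized SQL string (as you might pass to a Postgres driver). The table and column names (`documents`, `embedding`, `content`) are assumptions for illustration; `<=>` is pgvector's cosine-distance operator.

```python
# Hedged sketch: a top-10 cosine-similarity query against a pgvector column.
# Table/column names are hypothetical; `<=>` is pgvector's cosine-distance
# operator (`<->` is L2 distance, `<#>` is negative inner product).
QUERY = """
SELECT id, content
FROM documents
ORDER BY embedding <=> %(query_vec)s::vector
LIMIT 10;
"""
```

If a query of this shape, with an appropriate index, meets your latency SLA at projected scale, the comparison table above should weigh that heavily against adding a new infrastructure component.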
Test with your actual embedding dimensions and data distribution. A database that performs well with 384-dimensional vectors may behave very differently with 1536-dimensional vectors. Build a representative test dataset that matches your production characteristics before running benchmarks.
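A minimal sketch of building such a dataset and measuring exact top-k retrieval at different dimensions, using only NumPy. This is a brute-force baseline for benchmark sanity-checking, not a stand-in for any database's index; corpus size and dimensions here are illustrative.

```python
import time
import numpy as np

def build_dataset(n, dim, seed=0):
    """Generate n unit-normalized synthetic embeddings of a given dimension."""
    rng = np.random.default_rng(seed)
    vecs = rng.normal(size=(n, dim)).astype(np.float32)
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs

def top_k(corpus, query, k=10):
    """Exact cosine top-k (unit vectors, so dot product equals cosine)."""
    scores = corpus @ query
    idx = np.argpartition(-scores, k)[:k]          # unordered top-k
    return idx[np.argsort(-scores[idx])]           # sorted best-first

# Compare query cost at two common embedding dimensions.
for dim in (384, 1536):
    corpus = build_dataset(10_000, dim)
    query = build_dataset(1, dim, seed=1)[0]
    t0 = time.perf_counter()
    hits = top_k(corpus, query)
    ms = (time.perf_counter() - t0) * 1000
    print(f"dim={dim}: top-10 in {ms:.2f} ms")
```

Swapping in your production embedding model's output (instead of random vectors) and your real corpus size turns this into a ground-truth generator for measuring recall of whichever approximate index you benchmark.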
Beware of benchmarks that test empty or cold databases. Production vector databases accumulate data, indexes fragment over time, and concurrent write/read patterns affect query latency. Always run sustained load tests that simulate your actual read/write mix.
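The mixed read/write loop described above can be sketched as follows. The `InMemoryStore` is a deliberately trivial stand-in with a hypothetical `upsert`/`query` interface; in a real test you would substitute your candidate database's client and a realistic `write_ratio`.

```python
import random
import time

class InMemoryStore:
    """Trivial stand-in for a vector DB client (hypothetical interface)."""
    def __init__(self):
        self.vectors = {}
    def upsert(self, key, vec):
        self.vectors[key] = vec
    def query(self, vec, k=10):
        # Placeholder: a real client would run an ANN search here.
        return list(self.vectors)[:k]

def run_mixed_load(store, duration_s=1.0, write_ratio=0.2, dim=8, seed=0):
    """Issue a sustained read/write mix, recording per-operation latency."""
    rng = random.Random(seed)
    latencies = {"read": [], "write": []}
    deadline = time.perf_counter() + duration_s
    i = 0
    while time.perf_counter() < deadline:
        vec = [rng.random() for _ in range(dim)]
        op = "write" if rng.random() < write_ratio else "read"
        t0 = time.perf_counter()
        if op == "write":
            store.upsert(f"doc-{i}", vec)
        else:
            store.query(vec)
        latencies[op].append(time.perf_counter() - t0)
        i += 1
    return latencies

def p99(samples):
    """99th-percentile latency from a list of samples."""
    ordered = sorted(samples)
    return ordered[int(0.99 * (len(ordered) - 1))]

lat = run_mixed_load(InMemoryStore(), duration_s=0.2)
for op, samples in lat.items():
    if samples:
        print(f"{op}: p99 {p99(samples) * 1e6:.0f} us over {len(samples)} ops")
```

The key property to preserve when adapting this is sustained concurrent writes during the read measurement window, since that is exactly what cold-database benchmarks omit.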
Version History
1.0.0 · 2026-03-01
- Initial ADR template for vector database selection