Milvus
Open-source vector database purpose-built for billion-scale embedding similarity search, supporting ANN indexes (HNSW, IVF, DiskANN) with gRPC and REST APIs.
Score Breakdown
⚙ Agent Friendliness
🔒 Security
Authentication is disabled by default in standalone mode, which is a common source of misconfiguration. Mutual TLS is supported. RBAC is available for collection-level access control in enterprise deployments.
⚡ Reliability
Best When
Running billion-scale ANN search workloads where self-hosted infrastructure control, index flexibility, and high QPS are all required.
Avoid When
Your embedding dataset is small (<1M vectors) and you need a quick setup — Milvus's distributed architecture adds unnecessary operational complexity.
Use Cases
- Store and query agent memory embeddings at scale: retrieve the top-k most semantically relevant memories for a given task context
- Build RAG pipelines where agents embed documents, store vectors in Milvus collections, and retrieve relevant chunks at query time
- Deduplicate agent-generated content by computing cosine similarity against an existing vector index before storing new outputs
- Power multi-modal agent search by storing image, text, and audio embeddings in separate Milvus fields within one collection
- Implement agent-facing product recommendation by indexing item embeddings and querying with user preference vectors in real time
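The deduplication use case above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: it assumes a client object exposing `search` and `insert` with the keyword arguments used by pymilvus's `MilvusClient`, a collection indexed with the COSINE metric (so a hit's "distance" is a similarity where larger means closer), and a hypothetical threshold of 0.95. `InMemoryClient` is an invented stand-in so the sketch runs without a server.

```python
import math
from typing import Any, Dict, List, Sequence


def store_if_new(client: Any, collection_name: str, row: Dict[str, Any],
                 vector_field: str = "vector", threshold: float = 0.95) -> bool:
    """Insert `row` unless its nearest stored neighbour is too similar.

    Assumes the collection uses the COSINE metric, so a hit's "distance"
    is a similarity score where larger means more alike.
    """
    hits = client.search(collection_name=collection_name,
                         data=[row[vector_field]], limit=1)
    nearest = hits[0] if hits else []
    if nearest and nearest[0]["distance"] >= threshold:
        return False  # a near-duplicate is already stored; skip the insert
    client.insert(collection_name=collection_name, data=[row])
    return True


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; zero vectors are treated as similar to nothing."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryClient:
    """Hypothetical stand-in mimicking search/insert for a dry run."""

    def __init__(self) -> None:
        self.rows: List[Dict[str, Any]] = []

    def insert(self, collection_name: str, data: List[Dict[str, Any]]) -> None:
        self.rows.extend(data)

    def search(self, collection_name: str, data: List[Sequence[float]],
               limit: int = 10) -> List[List[Dict[str, Any]]]:
        # Brute-force scan, returned in the nested list-of-hits shape.
        scored = [{"id": r["id"], "distance": _cosine(data[0], r["vector"])}
                  for r in self.rows]
        scored.sort(key=lambda hit: hit["distance"], reverse=True)
        return [scored[:limit]]
```

The check-then-insert sequence is not atomic, so two agents writing the same content concurrently could both pass the check; a deterministic primary key is one way to narrow that window.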
Not For
- Small-scale projects with fewer than 100K vectors where Chroma or pgvector offer simpler setup with sufficient performance
- Teams needing a fully managed cloud service without ops overhead (Zilliz Cloud is the managed option but adds vendor dependency)
- Use cases requiring full ACID transactional semantics across vector and relational data in a single query
Interface
Authentication
Milvus supports username/password authentication and mutual TLS. API key authentication is available in Zilliz Cloud. The default standalone deployment ships with authentication disabled; it must be explicitly enabled.
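A minimal sketch of how an agent might assemble its connection settings so that a missing credential fails loudly instead of silently connecting to an unauthenticated instance. The environment variable names (`MILVUS_URI`, `MILVUS_USER`, `MILVUS_PASSWORD`, `MILVUS_API_KEY`) and the `require_auth` flag are hypothetical. The sketch assumes pymilvus's `MilvusClient` accepts `uri` and `token`, with `token` being either an API key or a `user:password` string.

```python
import os
from typing import Any, Dict, Mapping, Optional


def milvus_client_kwargs(env: Optional[Mapping[str, str]] = None,
                         require_auth: bool = True) -> Dict[str, str]:
    """Build keyword arguments for MilvusClient from environment settings."""
    env = os.environ if env is None else env
    kwargs = {"uri": env.get("MILVUS_URI", "http://localhost:19530")}

    api_key = env.get("MILVUS_API_KEY")
    user, password = env.get("MILVUS_USER"), env.get("MILVUS_PASSWORD")
    if api_key:
        kwargs["token"] = api_key               # e.g. a Zilliz Cloud API key
    elif user and password:
        kwargs["token"] = f"{user}:{password}"  # username/password form
    elif require_auth:
        raise ValueError("no Milvus credentials configured")
    return kwargs


def connect(env: Optional[Mapping[str, str]] = None) -> Any:
    """Open a client; pymilvus is imported lazily so the helper above
    stays usable where the package is not installed."""
    from pymilvus import MilvusClient
    return MilvusClient(**milvus_client_kwargs(env))
```

Passing `require_auth=False` keeps the helper usable against a local instance that still runs with authentication disabled.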
Pricing
Milvus is Apache 2.0 open source. Zilliz Cloud is the fully managed version with SLAs and enterprise support.
Agent Metadata
Known Gotchas
- ⚠ Collections must be loaded into memory before search — agents that query unloaded collections receive an error requiring a separate load() call first
- ⚠ Default consistency level is 'Bounded' (bounded staleness, which does not guarantee read-after-write): agents requiring read-after-write consistency must explicitly set consistency_level='Strong', either per query or at collection creation
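- ⚠ Index building is asynchronous: freshly inserted vectors are served from not-yet-indexed segments until those segments are sealed and indexed, so agents that insert and immediately query may see slower searches and, under the default 'Bounded' consistency, may miss the most recent inserts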
- ⚠ Primary key must be set at collection creation and cannot be changed — agents generating primary keys must use deterministic IDs or accept auto-generated int64s
- ⚠ Partition pruning only works when the partition key field is included in the search filter — missing it causes full-collection scans at billion scale
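Several of the gotchas above can be handled in one small wrapper. The sketch below is illustrative only: `deterministic_id`, `strong_search_kwargs`, `search_for_tenant`, and the `tenant_id` partition key field are hypothetical names. It assumes a client with `load_collection` and `search` methods shaped like pymilvus's `MilvusClient`, and that `search` accepts `consistency_level` as a keyword argument.

```python
import hashlib
from typing import Any, Dict, List, Sequence


def deterministic_id(text: str) -> int:
    """Derive a stable, non-negative int64 primary key from content.

    Only 63 bits of the hash are kept, so collisions are unlikely
    but not impossible.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") & 0x7FFF_FFFF_FFFF_FFFF


def strong_search_kwargs(collection_name: str, query_vector: Sequence[float],
                         tenant_id: str, limit: int = 5,
                         partition_key_field: str = "tenant_id") -> Dict[str, Any]:
    """Build search arguments that always filter on the partition key and
    request Strong consistency for read-after-write behaviour."""
    if '"' in tenant_id or "\\" in tenant_id:
        raise ValueError("tenant_id must not contain quotes or backslashes")
    return {
        "collection_name": collection_name,
        "data": [list(query_vector)],
        "limit": limit,
        # Including the partition key lets the server prune partitions.
        "filter": f'{partition_key_field} == "{tenant_id}"',
        # Assumption: the client forwards this keyword to the server.
        "consistency_level": "Strong",
    }


def search_for_tenant(client: Any, collection_name: str,
                      query_vector: Sequence[float], tenant_id: str,
                      limit: int = 5) -> List[Dict[str, Any]]:
    """Load the collection, then search with the arguments built above."""
    kwargs = strong_search_kwargs(collection_name, query_vector, tenant_id, limit)
    # Assumption: loading an already-loaded collection is cheap.
    client.load_collection(collection_name=collection_name)
    return client.search(**kwargs)[0]
```

Strong consistency typically costs extra latency, so an agent that does not need read-after-write may prefer to keep the default level for most queries.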
Alternatives
Scores are editorial opinions as of 2026-03-06.