Haiku RAG

An opinionated local-first RAG system built on LanceDB, Pydantic AI, and Docling that provides hybrid vector/full-text search, citation-aware Q&A, multi-agent research workflows, and an MCP server for integration with AI assistants like Claude Desktop.

Evaluated Mar 07, 2026. Version: latest
Category: Other
Tags: rag, lancedb, pydantic-ai, docling, mcp-server, hybrid-search, reranking, python, local-first, multi-agent
⚙ Agent Friendliness: 79 / 100 (Can an agent use this?)
🔒 Security: 80 / 100 (Is it safe for agents?)
⚡ Reliability: 74 / 100 (Does it work consistently?)

Score Breakdown

⚙ Agent Friendliness

MCP Quality: 80
Documentation: 82
Error Messages: 65
Auth Simplicity: 80
Rate Limits: 72

🔒 Security

TLS Enforcement: 92
Auth Strength: 78
Scope Granularity: 72
Dep. Hygiene: 80
Secret Handling: 78

RAG pipeline exposed through an MCP server. Security is inherited from the vector store and the configured LLM provider. Embeddings of sensitive data need the same protection as the source data.

⚡ Reliability

Uptime/SLA: 75
Version Stability: 75
Breaking Changes: 72
Error Recovery: 72

Best When

You want a production-quality, local-first RAG system with strong document structure awareness, citations, and MCP integration for AI assistant workflows.

Avoid When

You need a managed RAG-as-a-service solution or your documents are primarily unstructured web content rather than PDFs and structured documents.

Use Cases

  • Indexing and querying a personal or enterprise document library with citation-backed answers (page numbers, section headings)
  • Running multi-agent research workflows that plan, search, evaluate, and synthesize across a document corpus
  • Integrating a local document knowledge base into Claude Desktop or other MCP clients via the built-in MCP server

Not For

  • Cloud-first teams that need managed vector database infrastructure without self-hosting
  • Simple keyword search use cases where a full RAG pipeline adds unnecessary complexity
  • Non-Python shops or teams without Python 3.12+ capability

Interface

REST API: No
GraphQL: No
gRPC: No
MCP Server: Yes
SDK: Yes
Webhooks: No

Authentication

Methods: none
OAuth: No
Scopes: No

No authentication on the MCP server or local API. External embedding and model providers require their own API keys configured via environment variables.
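Since the server itself has no authentication, two pre-flight checks are worth running before launch: that the provider keys are present in the environment, and that the bind address is loopback-only. The following is a minimal sketch using only the Python standard library. The variable names in `REQUIRED_KEYS` and both helper functions are hypothetical illustrations, not part of Haiku RAG; substitute the names your chosen providers expect.

```python
import ipaddress
import os

# Hypothetical key names: replace with whatever your providers actually require.
REQUIRED_KEYS = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]


def missing_provider_keys(required, environ=None):
    """Return the names in `required` that are unset or blank in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


def is_loopback_host(host):
    """True if `host` is 'localhost' or a loopback IP literal (127.0.0.0/8, ::1)."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a DNS name): treat as not provably local.
        return False


if __name__ == "__main__":
    missing = missing_provider_keys(REQUIRED_KEYS)
    if missing:
        print("Missing provider keys:", ", ".join(missing))
    if not is_loopback_host("127.0.0.1"):
        print("Refusing to start: unauthenticated server on a non-loopback address")
```

Treating unknown hostnames as non-local is a deliberately conservative choice: an unauthenticated server should only start when the address is clearly loopback.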

Pricing

Model: open_source
Free tier: Yes
Requires CC: No

MIT licensed. LanceDB is embedded (free). Costs are pass-through to embedding and LLM providers. Optional cloud storage (S3, GCS, Azure) incurs cloud provider costs.

Agent Metadata

Pagination: offset
Idempotent: Partial
Retry Guidance: Not documented
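Because retry guidance is not documented, a client has to bring its own policy. The sketch below shows one conservative approach for read-only, offset-paginated calls: exponential backoff around each page fetch. `fetch_page(offset, limit)` is a hypothetical callable standing in for whatever search or list call you use; it is not a Haiku RAG function. Given the "Partial" idempotency rating, it is probably safest to apply automatic retries to reads only.

```python
import time


def with_retries(call, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `call()` until it succeeds, backing off exponentially between failures."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))


def iter_offset_pages(fetch_page, limit=10, sleep=time.sleep):
    """Yield items from an offset-paginated source until a short page is returned."""
    offset = 0
    while True:
        # fetch_page is a hypothetical read-only call taking (offset, limit).
        page = with_retries(lambda: fetch_page(offset, limit), sleep=sleep)
        yield from page
        if len(page) < limit:
            break  # a short (or empty) page means there is nothing further
        offset += limit
```

The `sleep` parameter is injectable so the backoff can be disabled in tests.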

Known Gotchas

  • Python 3.12+ required — older environments need upgrade before use
  • No MCP server authentication — local deployment assumed; exposing over network is unsafe without adding auth
  • Docling document processing can be slow for large PDFs; agents may time out on initial indexing requests
  • Two packages (haiku.rag vs haiku.rag-slim) with different dependency profiles can cause confusion
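Two of the gotchas above lend themselves to simple guards: checking the interpreter version up front, and running slow indexing in the background so a caller can poll instead of timing out. This is a rough sketch under stated assumptions. `fake_index` is a hypothetical stand-in for a slow document-processing call, and neither helper is part of Haiku RAG.

```python
import sys
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def python_is_supported(version_info=sys.version_info, minimum=(3, 12)):
    """True if the (major, minor) version meets the documented 3.12+ floor."""
    return tuple(version_info[:2]) >= minimum


def poll_result(future, wait_s):
    """Wait up to `wait_s` seconds; return (done, result) without cancelling the job."""
    try:
        return True, future.result(timeout=wait_s)
    except FutureTimeout:
        return False, None


def fake_index(path, seconds=0.2):
    """Hypothetical slow indexing call; replace with your real ingestion step."""
    time.sleep(seconds)
    return f"indexed {path}"


if __name__ == "__main__":
    if not python_is_supported():
        raise SystemExit("Python 3.12+ is required")
    executor = ThreadPoolExecutor(max_workers=1)
    job = executor.submit(fake_index, "report.pdf")
    print(poll_result(job, 0.01))  # likely still running
    print(poll_result(job, 5))     # waits for completion
    executor.shutdown()
```

Polling keeps the job alive after a timeout, so an agent can report "still indexing" and check back rather than restarting the work.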

Scores are editorial opinions as of 2026-03-07.
