chDB

Embedded OLAP database engine that runs ClickHouse entirely in-process as a Python library (or Go/Node.js). No server, no Docker, no network — just import chdb and run ClickHouse SQL queries on local files (Parquet, CSV, JSON) or in-memory data at ClickHouse speed. Ideal for agents that need powerful analytical SQL without managing database infrastructure. Think DuckDB's architecture applied to ClickHouse's query engine.
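
The workflow described above can be sketched in a few lines. This is a minimal illustration, not the project's reference usage: it assumes the chdb package is installed (pip install chdb) and that chdb.query(sql, output_format) is the entry point. The helper names and the file name events.parquet are hypothetical.

```python
def parquet_count_sql(path: str) -> str:
    """Build a ClickHouse SQL query that counts rows in a local Parquet file."""
    # file() is ClickHouse's table function for reading local files.
    escaped = path.replace("\\", "\\\\").replace("'", "\\'")
    return f"SELECT count() AS row_count FROM file('{escaped}', Parquet)"


def count_rows(path: str) -> str:
    """Run the count in-process and return the result as CSV text."""
    # Imported lazily so the SQL builder can be used without chdb installed.
    import chdb

    # Assumption: the result object renders to text via str().
    result = chdb.query(parquet_count_sql(path), "CSV")
    return str(result).strip()


# Example call (hypothetical file name):
#   print(count_rows("events.parquet"))
```

No connection string or server address appears anywhere: the query runs inside the calling Python process.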

Evaluated: Mar 06, 2026
Version: v2.x
Category: Other
Tags: clickhouse, embedded, olap, python, in-process, analytics, columnar, parquet
⚙ Agent Friendliness: 65 / 100 (Can an agent use this?)
🔒 Security: 87 / 100 (Is it safe for agents?)
⚡ Reliability: 74 / 100 (Does it work consistently?)

Score Breakdown

⚙ Agent Friendliness

MCP Quality: --
Documentation: 80
Error Messages: 78
Auth Simplicity: 100
Rate Limits: 95

🔒 Security

TLS Enforcement: 100
Auth Strength: 85
Scope Granularity: 80
Dep. Hygiene: 80
Secret Handling: 90

No network attack surface — pure in-process library. Data never leaves the process. OS-level file permissions govern data access. No credentials to leak. Apache 2.0 source available for audit. C++ core inherits ClickHouse's security track record.

⚡ Reliability

Uptime/SLA: 78
Version Stability: 72
Breaking Changes: 70
Error Recovery: 75

Best When

You need ClickHouse's analytical performance for local file processing or single-node analytics without the complexity of running a ClickHouse server.

Avoid When

You need a shared analytics database for multiple users or services — use full ClickHouse server, DuckDB with read-only access, or a managed service.

Use Cases

  • Run ClickHouse-speed analytical queries on Parquet/CSV files directly from Python agents without spinning up a ClickHouse server or cluster
  • Process large local datasets (100GB+) with ClickHouse's columnar query engine in agent data pipelines without network overhead
  • Perform complex aggregations, window functions, and JOIN operations on agent-collected data using familiar SQL without database management
  • Build serverless analytics in Lambda/Cloud Run functions where spinning up a ClickHouse instance is impractical — embed chDB as a library
  • Query files in cloud storage (for example S3 objects, through ClickHouse's s3 or url table functions) directly from in-process chDB, without first loading the data into a database server
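
The use cases above (window functions over local files, and the same query against a remote file) can be sketched as one small query builder. This is an illustration under stated assumptions: the helper names, column names, and example locations are hypothetical, column names are interpolated as trusted input, and the "DataFrame" output format is assumed to require pandas and pyarrow.

```python
def source_expr(location: str, fmt: str = "Parquet") -> str:
    """Return a ClickHouse table-function expression for a local path or HTTP(S) URL."""
    escaped = location.replace("\\", "\\\\").replace("'", "\\'")
    if location.startswith(("http://", "https://")):
        # url() reads a remote file over HTTP(S).
        return f"url('{escaped}', '{fmt}')"
    # file() reads a file from the local filesystem.
    return f"file('{escaped}', '{fmt}')"


def top_per_group_sql(location: str, group_col: str, value_col: str, n: int = 3) -> str:
    """Build a query returning the top-n rows per group using a window function."""
    inner = (
        f"SELECT {group_col}, {value_col}, "
        f"row_number() OVER (PARTITION BY {group_col} ORDER BY {value_col} DESC) AS rn "
        f"FROM {source_expr(location)}"
    )
    # The window result is filtered in an outer query.
    return f"SELECT {group_col}, {value_col} FROM ({inner}) WHERE rn <= {int(n)}"


def top_per_group(location: str, group_col: str, value_col: str, n: int = 3):
    """Run the query in-process and return a pandas DataFrame."""
    import chdb  # lazy import: only needed when the query is executed

    return chdb.query(top_per_group_sql(location, group_col, value_col, n), "DataFrame")


# Example calls (hypothetical data):
#   top_per_group("sales.parquet", "region", "revenue", n=2)
#   top_per_group("https://example.com/sales.parquet", "region", "revenue")
```

Only the table function changes between the local and remote cases. The remote file is still downloaded into the calling process, so network cost applies there.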

Not For

  • Multi-user concurrent write workloads — chDB is a single-process embedded engine, not a shared database server
  • Transactional (OLTP) workloads — chDB is OLAP-focused; use PostgreSQL or SQLite for transactional use cases
  • Production-scale distributed queries requiring ClickHouse cluster features — use full ClickHouse server for distributed query execution across multiple nodes

Interface

REST API: No
GraphQL: No
gRPC: No
MCP Server: No
SDK: Yes
Webhooks: No

Authentication

Methods: none
OAuth: No
Scopes: No

No authentication — chDB runs in-process with the calling application's privileges. Access control is at the OS file system level. No network server means no authentication surface.

Pricing

Model: open_source
Free tier: Yes
Requires CC: No

Apache 2.0 open source. Community project backed by ClickHouse Inc. No paid tiers for the embedded library. Completely free for commercial use.

Agent Metadata

Pagination: none
Idempotent: Full
Retry Guidance: Not documented

Known Gotchas

  • chDB uses ClickHouse SQL dialect, not standard SQL — functions like arrayJoin, groupArray, and ARRAY JOIN syntax differ from PostgreSQL/DuckDB; LLM-generated SQL may need adaptation
  • Memory usage can be very high for large aggregations, because intermediate state such as GROUP BY and JOIN hash tables is held in the calling process's memory; agents must account for memory limits in constrained environments, for example by setting ClickHouse's max_memory_usage
  • chDB is a Python library, but the underlying engine is C++: each chdb release bundles a specific ClickHouse engine build, so SQL behavior and available functions can change between chdb versions; always pin versions
  • Query results are returned as Arrow, bytes, or DataFrame depending on output format setting — agents must specify the correct output format for downstream processing
  • Parquet file reading is very fast but CSV reading can be slow for large files — agents should prefer Parquet or ORC formats for large datasets
  • No persistent storage by default: chDB state is ephemeral. To persist query results, write them out explicitly (for example with INTO OUTFILE and FORMAT Parquet, as in clickhouse-local) or use a session backed by a writable directory
  • chDB is relatively new (2023) — some ClickHouse features available in server mode may not yet be available in the embedded version; check feature parity for advanced use cases
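
Several of the gotchas above (dialect differences, memory limits, explicit output formats, ephemeral state) can be handled with small helpers. The following is a sketch under assumptions, not documented chDB practice: the helper names are hypothetical, max_memory_usage is the standard ClickHouse per-query setting, and INTO OUTFILE is assumed to behave in the embedded engine as it does in clickhouse-local.

```python
# Dialect reminder: ClickHouse uses arrayJoin where PostgreSQL/DuckDB use unnest,
# and groupArray where they use array_agg.
UNNEST_SQL = "SELECT arrayJoin([1, 2, 3]) AS x"


def with_memory_cap(sql: str, max_bytes: int) -> str:
    """Append a per-query memory limit via ClickHouse's max_memory_usage setting."""
    base = sql.rstrip().rstrip(";")
    return f"{base} SETTINGS max_memory_usage = {int(max_bytes)}"


def export_to_parquet_sql(sql: str, out_path: str) -> str:
    """Wrap a SELECT so its result is written to a Parquet file.

    In ClickHouse, INTO OUTFILE normally fails if the target file already exists.
    """
    base = sql.rstrip().rstrip(";")
    escaped = out_path.replace("\\", "\\\\").replace("'", "\\'")
    return f"{base} INTO OUTFILE '{escaped}' FORMAT Parquet"


def run(sql: str, output_format: str = "CSV"):
    """Execute a query with an explicitly chosen output format."""
    import chdb  # lazy import: the builders above work without chdb installed

    # Passing the format explicitly avoids guessing what type comes back.
    return chdb.query(sql, output_format)


# Example (hypothetical paths): cap memory at 2 GiB and persist the result.
#   capped = with_memory_cap(
#       "SELECT user_id, count() FROM file('events.parquet', Parquet) GROUP BY user_id",
#       2 * 1024**3,
#   )
#   run(export_to_parquet_sql(capped, "counts.parquet"))
```

The memory cap is applied before the export wrapper so that SETTINGS precedes INTO OUTFILE, which matches the clause order in the ClickHouse SELECT grammar.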

Scores are editorial opinions as of 2026-03-06.
