chDB

Embedded OLAP database engine that runs ClickHouse entirely in-process as a Python library (Go and Node.js bindings also exist). No server, no Docker, no network service: import chdb and run ClickHouse SQL queries on local files (Parquet, CSV, JSON) or in-memory data using ClickHouse's columnar execution engine. Ideal for agents that need powerful analytical SQL without managing database infrastructure. Think of it as DuckDB's embedded architecture applied to ClickHouse's query engine.
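A minimal sketch of the in-process workflow described above, assuming the chdb package is installed (pip install chdb) and using its query(sql, output_format) call with ClickHouse's file() table function. The helper names count_rows_sql and count_rows and the file name events.parquet are hypothetical, chosen only for illustration.

```python
try:
    import chdb  # embedded ClickHouse engine; no server process is started
except ImportError:  # keeps the sketch importable where chdb is not installed
    chdb = None


def count_rows_sql(path: str, fmt: str = "Parquet") -> str:
    """Build a ClickHouse query that counts the rows of a local file.

    Both arguments are interpolated directly, so pass only trusted values.
    """
    return f"SELECT count() AS n FROM file('{path}', '{fmt}')"


def count_rows(path: str, fmt: str = "Parquet") -> str:
    """Run the count query in-process and return the result as CSV text."""
    if chdb is None:
        raise RuntimeError("chdb is not installed")
    result = chdb.query(count_rows_sql(path, fmt), "CSV")
    return str(result).strip()


# Example call (events.parquet is a placeholder for a real file):
#   count_rows("events.parquet")
```

The SQL is built in a separate function so that an agent can log or inspect the exact ClickHouse statement before it is executed.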

Evaluated Mar 06, 2026. Version: v2.x

Category: Other

Tags: clickhouse, embedded, olap, python, in-process, analytics, columnar, parquet

Score Breakdown

⚙ Agent Friendliness

MCP Quality

Documentation

Error Messages

Auth Simplicity

100

Rate Limits

🔒 Security

TLS Enforcement

100

Auth Strength

Scope Granularity

Dep. Hygiene

Secret Handling

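No inbound network attack surface: chDB is a pure in-process library with no listening port. Data stays in the process unless a query explicitly reads from or writes to an external location such as a remote URL. OS-level file permissions govern data access, and there are no database credentials to leak. The source is available for audit under Apache 2.0. The C++ core inherits ClickHouse's security track record.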

⚡ Reliability

Uptime/SLA

Version Stability

Breaking Changes

Error Recovery

Best When

You need ClickHouse's analytical performance for local file processing or single-node analytics without the complexity of running a ClickHouse server.

Avoid When

You need a shared analytics database for multiple users or services — use full ClickHouse server, DuckDB with read-only access, or a managed service.

Use Cases

• Run ClickHouse-speed analytical queries on Parquet/CSV files directly from Python agents without spinning up a ClickHouse server or cluster
• Process large local datasets (100GB+) with ClickHouse's columnar query engine in agent data pipelines without network overhead
• Perform complex aggregations, window functions, and JOIN operations on agent-collected data using familiar SQL without database management
• Build serverless analytics in Lambda/Cloud Run functions where spinning up a ClickHouse instance is impractical — embed chDB as a library
• Query files in cloud storage (for example, S3 objects read through ClickHouse's url or s3 table functions) directly from in-process chDB without first moving the data to a database server
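The remote-file and aggregation use cases above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: it assumes chdb is installed, that the file is a publicly readable Parquet object, and that ClickHouse's url(URL, format) table function is available in the embedded build. The helper names and the example URL are hypothetical.

```python
try:
    import chdb
except ImportError:  # lets the SQL builder be used without chdb installed
    chdb = None


def top_values_sql(file_url: str, column: str, limit: int = 10) -> str:
    """Build a top-N aggregation over a remote Parquet file.

    Values are interpolated directly, so pass only trusted input.
    """
    return (
        f"SELECT {column}, count() AS hits "
        f"FROM url('{file_url}', 'Parquet') "
        f"GROUP BY {column} ORDER BY hits DESC LIMIT {int(limit)}"
    )


def top_values(file_url: str, column: str, limit: int = 10) -> str:
    """Execute the aggregation in-process and return CSV text with a header."""
    if chdb is None:
        raise RuntimeError("chdb is not installed")
    sql = top_values_sql(file_url, column, limit)
    return str(chdb.query(sql, "CSVWithNames"))


# Example call (the URL is a placeholder):
#   top_values("https://example.com/data.parquet", "country", 5)
```

Private buckets need credentials passed to the table function, which this sketch deliberately leaves out.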

Not For

• Multi-user concurrent write workloads — chDB is a single-process embedded engine, not a shared database server
• Transactional (OLTP) workloads — chDB is OLAP-focused; use PostgreSQL or SQLite for transactional use cases
• Production-scale distributed queries requiring ClickHouse cluster features — use full ClickHouse server for distributed query execution across multiple nodes

Interface

REST API

GraphQL

gRPC

MCP Server

SDK

Yes

Webhooks

Authentication

Methods: none

OAuth: No

Scopes: No

No authentication — chDB runs in-process with the calling application's privileges. Access control is at the OS file system level. No network server means no authentication surface.

Pricing

Model: open_source

Free tier: Yes

Requires CC: No

Apache 2.0 open source. Community project backed by ClickHouse Inc. No paid tiers for the embedded library. Completely free for commercial use.

Agent Metadata

Pagination: none

Idempotent: Full

Retry Guidance: Not documented

Known Gotchas

⚠ chDB uses the ClickHouse SQL dialect, not standard SQL; functions such as arrayJoin and groupArray and the ARRAY JOIN clause have no identically named equivalents in PostgreSQL or DuckDB, so LLM-generated SQL may need adaptation
⚠ Memory usage can be very high for large aggregations — chDB loads data into columnar format in memory; agents must account for memory limits in constrained environments
⚠ chDB is installed as a Python package, but each release bundles a specific build of the C++ ClickHouse engine, so SQL behavior and available functions can change between package versions; always pin the chdb version
⚠ Query results are returned as Arrow, bytes, or DataFrame depending on output format setting — agents must specify the correct output format for downstream processing
⚠ Parquet file reading is very fast but CSV reading can be slow for large files — agents should prefer Parquet or ORC formats for large datasets
⚠ No persistent storage by default — chDB state is ephemeral; to persist query results, explicitly write to Parquet or use clickhouse-local syntax with writable paths
⚠ chDB is relatively new (2023) — some ClickHouse features available in server mode may not yet be available in the embedded version; check feature parity for advanced use cases
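Three of the gotchas above (memory limits, explicit output formats, and ephemeral state) can be handled with a few small helpers. The following is a hedged sketch, assuming chdb is installed, that "DataFrame" and "Parquet" are accepted output format names, and that the query result exposes a bytes() accessor. max_memory_usage is a standard ClickHouse setting. All helper names are hypothetical.

```python
try:
    import chdb
except ImportError:  # lets the pure-Python helper be used without chdb
    chdb = None


def with_memory_cap(sql: str, max_bytes: int) -> str:
    """Append a per-query memory limit via ClickHouse's SETTINGS clause."""
    base = sql.strip().rstrip(";")
    return f"{base} SETTINGS max_memory_usage = {int(max_bytes)}"


def query_dataframe(sql: str, max_bytes: int = 2_000_000_000):
    """Run a memory-capped query and request a pandas DataFrame explicitly.

    Assumes pandas and pyarrow are installed alongside chdb.
    """
    if chdb is None:
        raise RuntimeError("chdb is not installed")
    return chdb.query(with_memory_cap(sql, max_bytes), "DataFrame")


def export_parquet(sql: str, out_path: str) -> int:
    """Persist results by requesting Parquet output and writing it to disk.

    Returns the number of bytes written.
    """
    if chdb is None:
        raise RuntimeError("chdb is not installed")
    data = chdb.query(sql, "Parquet").bytes()  # bytes() is an assumed accessor
    with open(out_path, "wb") as fh:
        fh.write(data)
    return len(data)
```

Naming the output format at every call site keeps downstream code from having to guess whether it received text, bytes, or a DataFrame.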

Alternatives

duckdb-api, polars-api, apache-datafusion-api

Scores are editorial opinions as of 2026-03-06.