kreuzberg

A polyglot document intelligence framework with a Rust core. Extract text, metadata, and structured information from PDFs, Office documents, images, and 76+ formats. Available for Rust, Python, Ruby, Java, Go, PHP, Elixir, C#, R, C, and TypeScript (Node/Bun/Wasm/Deno), or use it via the CLI, REST API, or MCP server.

AI & Machine Learning

bun csharp document-intelligence elixir ffi golang java metadata-extraction node pdf-extraction pdfium php python rag ruby rust table-extraction tesseract text-extraction wasm

⚙ Agent Friendliness: N/A (not evaluated). Can an agent use this?

🔒 Security: N/A (not evaluated). Is it safe for agents?

⚡ Reliability: N/A (not evaluated). Does it work consistently?

Scores are editorial opinions; the evaluation date is not given.