docTR

End-to-end OCR library from Mindee that chains a text detection model (DBNet by default) with a text recognition model (CRNN by default) to extract structured text from images and PDFs.

Evaluated Mar 06, 2026 · v0.9.x

Category: AI & Machine Learning

Tags: python, ocr, deep-learning, pytorch, tensorflow, document-understanding, text-detection, text-recognition

Score Breakdown

⚙ Agent Friendliness

MCP Quality

Documentation

Error Messages

Auth Simplicity: 100

Rate Limits: 100

🔒 Security

TLS Enforcement

Auth Strength

Scope Granularity

Dep. Hygiene

Secret Handling

No network layer after the initial model download. Fetch model weights only from official sources (the default Mindee-hosted URLs or trusted HuggingFace Hub repositories), and process only trusted document inputs.

⚡ Reliability

Uptime/SLA

Version Stability

Breaking Changes

Error Recovery

Best When

You need high-accuracy OCR on scanned or photographed documents and have access to a machine with sufficient CPU/GPU resources.

Avoid When

The PDF already contains selectable text, or your pipeline cannot tolerate model download size and inference latency.

Use Cases

• Extracting text from scanned documents and images where native PDF text is absent
• Building document digitization pipelines for invoices, receipts, and forms
• Running OCR on low-quality or photographed documents where traditional tools fail
• Detecting and localizing text regions with bounding boxes for downstream processing
• Processing multi-page document images and returning structured word/line/block hierarchy
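A minimal sketch of the flow behind these use cases, assuming `python-doctr` is installed with one backend: build a predictor, run it on a file, then flatten the exported page/block/line/word hierarchy into words with pixel-space boxes. The helper names `run_ocr` and `words_with_pixel_boxes` are illustrative and not part of docTR. The architecture names passed to `ocr_predictor` are one common pairing, and the export layout assumed here (relative `((xmin, ymin), (xmax, ymax))` geometry, page `dimensions` as `(height, width)`) holds for straight, non-rotated boxes.

```python
from typing import Any, Dict, List


def run_ocr(path: str) -> Dict[str, Any]:
    """Run docTR on a PDF or image file and return the exported nested dict."""
    # Imported lazily so the pure helper below also works without docTR installed.
    from doctr.io import DocumentFile
    from doctr.models import ocr_predictor

    if path.lower().endswith(".pdf"):
        doc = DocumentFile.from_pdf(path)
    else:
        doc = DocumentFile.from_images([path])
    # Assumed model pairing; the first call triggers the weight download.
    predictor = ocr_predictor(
        det_arch="db_resnet50", reco_arch="crnn_vgg16_bn", pretrained=True
    )
    return predictor(doc).export()


def words_with_pixel_boxes(export: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten pages -> blocks -> lines -> words, scaling boxes to pixels."""
    words = []
    for page in export["pages"]:
        height, width = page["dimensions"]  # assumed order: (height, width)
        for block in page["blocks"]:
            for line in block["lines"]:
                for word in line["words"]:
                    (x0, y0), (x1, y1) = word["geometry"]  # relative coords
                    words.append({
                        "page": page["page_idx"],
                        "text": word["value"],
                        "confidence": word["confidence"],
                        "box": (
                            round(x0 * width), round(y0 * height),
                            round(x1 * width), round(y1 * height),
                        ),
                    })
    return words
```

Typical use would be `words_with_pixel_boxes(run_ocr("scan.pdf"))`, where `scan.pdf` is a placeholder path.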

Not For

• Native digital PDFs with embedded text (use PyMuPDF or pdfminer for zero-cost extraction)
• Resource-constrained environments: models require a ~200MB download plus non-trivial CPU/GPU inference time
• Real-time edge inference without model quantization or a capable local GPU
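One way to honor the first exclusion is to check for an embedded text layer before invoking OCR at all. The sketch below assumes PyMuPDF is available for the extraction step. The function names and the `min_chars_per_page` threshold of 20 are hypothetical choices for illustration, not values taken from docTR or PyMuPDF.

```python
from typing import List


def extract_page_texts(pdf_path: str) -> List[str]:
    """Return the embedded text of each page (empty strings for scanned pages)."""
    import fitz  # PyMuPDF, imported lazily so the check below has no dependency

    with fitz.open(pdf_path) as doc:
        return [page.get_text() for page in doc]


def has_text_layer(page_texts: List[str], min_chars_per_page: int = 20) -> bool:
    """True only if every page has at least `min_chars_per_page` non-space chars.

    The threshold is an assumed heuristic; tune it for your documents.
    """
    if not page_texts:
        return False
    return all(
        len("".join(text.split())) >= min_chars_per_page for text in page_texts
    )
```

With this rule a mixed PDF (some native pages, some scanned) is routed to OCR as a whole; a per-page decision would avoid re-processing pages that already carry text.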

Interface

REST API

GraphQL

gRPC

MCP Server

SDK: Yes

Webhooks

Authentication

Methods: none

OAuth: No Scopes: No

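Local Python library; no authentication required. Default pretrained weights are downloaded automatically on first use from Mindee-hosted storage, and models published on the HuggingFace Hub can be loaded as an opt-in alternative.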

Pricing

Model: open_source

Free tier: Yes

Requires CC: No

Apache 2.0 license. Pre-trained model weights are freely available. Mindee also offers a hosted API (docTR Cloud) with separate pricing.

Agent Metadata

Pagination: none

Idempotent: Full

Retry Guidance: Not documented

Known Gotchas

⚠ First run downloads ~200MB of model weights; agent pipelines must tolerate the download delay or pre-warm the models
⚠ Install is backend-specific: `pip install "python-doctr[torch]"` or `pip install "python-doctr[tf]"` (quoted so the shell does not expand the brackets); a missing extra causes ImportError
⚠ GPU memory can be exhausted with large batches; agents should process documents in small batches
⚠ Output confidence scores are per-word, not per-document; aggregation logic must be implemented by the caller
⚠ PDF inputs are rasterized internally, so processing time scales with page count and DPI, not file size
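Two of the gotchas above (small batches and caller-side confidence aggregation) can be handled with plain helpers over the exported result dict. This is a sketch under assumptions: the function names are illustrative, the export layout is the nested pages/blocks/lines/words dict with a per-word `confidence` field, and reporting both mean and minimum is just one possible aggregation.

```python
from statistics import mean
from typing import Any, Dict, Iterator, List, Optional, Sequence


def batched(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def _word_confidences(page: Dict[str, Any]) -> List[float]:
    """Collect the confidence of every word on one exported page."""
    return [
        word["confidence"]
        for block in page["blocks"]
        for line in block["lines"]
        for word in line["words"]
    ]


def page_confidences(export: Dict[str, Any]) -> List[Optional[float]]:
    """Mean word confidence per page; None for pages with no detected words."""
    scores: List[Optional[float]] = []
    for page in export["pages"]:
        confs = _word_confidences(page)
        scores.append(mean(confs) if confs else None)
    return scores


def document_confidence(export: Dict[str, Any]) -> Optional[Dict[str, float]]:
    """Document-level summary over all words, or None if nothing was detected."""
    confs = [c for page in export["pages"] for c in _word_confidences(page)]
    if not confs:
        return None
    return {"mean": mean(confs), "min": min(confs), "words": len(confs)}
```

Each slice from `batched(paths, 4)` would be passed to the predictor separately, and the minimum confidence is often a better trigger for human review than the mean because a single misread field can hide in an otherwise clean page.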

Alternatives

tesseract-api, paddleocr-api, azure-document-intelligence-api

Scores are editorial opinions as of 2026-03-06.