docTR

End-to-end OCR library from Mindee that combines DBNet text detection and CRNN text recognition to extract structured text from images and PDFs.

Evaluated Mar 06, 2026 (v0.9.x)
Category: AI & Machine Learning
Tags: python, ocr, deep-learning, pytorch, tensorflow, document-understanding, text-detection, text-recognition
⚙ Agent Friendliness: 66 / 100 (Can an agent use this?)
🔒 Security: 88 / 100 (Is it safe for agents?)
⚡ Reliability: 78 / 100 (Does it work consistently?)

Score Breakdown

⚙ Agent Friendliness

MCP Quality: --
Documentation: 83
Error Messages: 75
Auth Simplicity: 100
Rate Limits: 100

🔒 Security

TLS Enforcement: 90
Auth Strength: 90
Scope Granularity: 85
Dep. Hygiene: 80
Secret Handling: 90

No network layer is involved after the model download. Ensure model weights are fetched from the official HuggingFace Hub, and process only trusted document inputs.

⚡ Reliability

Uptime/SLA: 80
Version Stability: 80
Breaking Changes: 75
Error Recovery: 75

Best When

You need high-accuracy OCR on scanned or photographed documents and have access to a machine with sufficient CPU/GPU resources.

Avoid When

The PDF already contains selectable text, or your pipeline cannot tolerate model download size and inference latency.

Use Cases

  • Extracting text from scanned documents and images where native PDF text is absent
  • Building document digitization pipelines for invoices, receipts, and forms
  • Running OCR on low-quality or photographed documents where traditional tools fail
  • Detecting and localizing text regions with bounding boxes for downstream processing
  • Processing multi-page document images and returning structured word/line/block hierarchy
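The word/line/block hierarchy mentioned in the last use case can be walked with a few nested loops. The following is a minimal sketch, assuming the `[torch]` or `[tf]` extra is installed and that the dictionary returned by `result.export()` nests pages, blocks, lines, and words as shown. `run_ocr` and `iter_words` are hypothetical helper names, not part of docTR, and the architecture names are only one possible choice.

```python
from typing import Any, Dict, Iterator, Tuple


def run_ocr(image_path: str) -> Dict[str, Any]:
    """Run docTR on one image and return the exported result dictionary."""
    # Imported lazily so iter_words() stays usable without docTR installed.
    from doctr.io import DocumentFile
    from doctr.models import ocr_predictor

    # Illustrative architecture choice; pretrained weights download on first use.
    model = ocr_predictor(
        det_arch="db_resnet50", reco_arch="crnn_vgg16_bn", pretrained=True
    )
    pages = DocumentFile.from_images(image_path)
    return model(pages).export()


def iter_words(export: Dict[str, Any]) -> Iterator[Tuple[int, str, float, Any]]:
    """Yield (page_index, text, confidence, geometry) for every word."""
    # Assumed shape: pages -> blocks -> lines -> words.
    for page_index, page in enumerate(export.get("pages", [])):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                for word in line.get("words", []):
                    yield (
                        page_index,
                        word["value"],
                        word["confidence"],
                        word["geometry"],
                    )
```

A caller would typically pass the output of `run_ocr("page.jpg")` to `iter_words` and filter or store the resulting tuples; the file name here is a placeholder.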

Not For

  • Native digital PDFs with embedded text (use PyMuPDF or pdfminer for zero-cost extraction)
  • Resource-constrained environments — models require ~200MB download and GPU/CPU inference time
  • Real-time edge inference without model quantization or a capable local GPU

Interface

REST API: No
GraphQL: No
gRPC: No
MCP Server: No
SDK: Yes
Webhooks: No

Authentication

Methods: none
OAuth: No
Scopes: No

Local Python library — no authentication required; model weights downloaded automatically from HuggingFace Hub on first use

Pricing

Model: open_source
Free tier: Yes
Requires CC: No

Apache 2.0 license. Pre-trained model weights are freely available. Mindee also offers a hosted API (docTR Cloud) with separate pricing.

Agent Metadata

Pagination: none
Idempotent: Full
Retry Guidance: Not documented

Known Gotchas

  • First run downloads ~200MB of model weights — agent pipelines must handle the download delay or pre-warm models
  • Install is backend-specific: `pip install "python-doctr[torch]"` or `pip install "python-doctr[tf]"` (the quotes keep shells such as zsh from expanding the brackets); a missing extra causes an ImportError
  • GPU memory can be exhausted with large batches; agents should process documents in small batches
  • Output confidence scores are per-word, not per-document; aggregation logic must be implemented by the caller
  • PDF inputs are rasterized internally — processing speed scales with page count and DPI, not file size
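Two of the gotchas above, small batches and caller-side confidence aggregation, can be sketched in plain Python. This is a minimal sketch under stated assumptions: `chunked` and `document_confidence` are hypothetical helpers, the input is assumed to be the dictionary produced by `result.export()`, and an unweighted mean is only one possible way to aggregate per-word scores.

```python
from statistics import mean
from typing import Any, Dict, Iterator, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def document_confidence(export: Dict[str, Any]) -> Optional[float]:
    """Return the mean per-word confidence, or None if no words were found."""
    # Assumed shape: pages -> blocks -> lines -> words, each word with a
    # "confidence" float.
    scores: List[float] = [
        word["confidence"]
        for page in export.get("pages", [])
        for block in page.get("blocks", [])
        for line in block.get("lines", [])
        for word in line.get("words", [])
    ]
    return mean(scores) if scores else None
```

Feeding each chunk of pages to the predictor separately keeps peak memory bounded by the chunk size rather than by the document length. Depending on the application, a minimum or a length-weighted mean may be a better document-level score than the plain mean used here.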

Scores are editorial opinions as of 2026-03-06.
