Deepgram API

Provides real-time and batch speech-to-text transcription via REST and WebSocket, supporting 200+ languages with the Nova-3 model.

Evaluated Mar 06, 2026 (version: current)
Category: AI & Machine Learning
Tags: stt, transcription, voice, streaming, websocket, ai, real-time
⚙ Agent Friendliness: 65 / 100 (Can an agent use this?)
🔒 Security: 84 / 100 (Is it safe for agents?)
⚡ Reliability: 84 / 100 (Does it work consistently?)

Score Breakdown

⚙ Agent Friendliness

MCP Quality: --
Documentation: 90
Error Messages: 85
Auth Simplicity: 88
Rate Limits: 84

🔒 Security

TLS Enforcement: 100
Auth Strength: 82
Scope Granularity: 70
Dependency Hygiene: 85
Secret Handling: 82

API keys are project-scoped, which is good practice. A HIPAA BAA is available for healthcare use cases on qualifying plans.

⚡ Reliability

Uptime/SLA: 88
Version Stability: 84
Breaking Changes: 82
Error Recovery: 84

Best When

You need low-latency, production-grade streaming transcription or batch processing of audio at scale with strong SDK support.

Avoid When

Your budget is near zero and latency tolerance is high enough to run open-source Whisper locally.

Use Cases

  • Transcribe live phone calls or voice sessions in real time for agent decision-making
  • Convert recorded meeting audio to searchable text for knowledge extraction pipelines
  • Power voice-command interfaces where an agent listens and acts on spoken instructions
  • Generate accurate captions or subtitles for audio/video content at scale
  • Extract structured data (names, dates, intents) from spoken customer interactions via post-transcription NLP

Not For

  • Applications that need on-device, fully offline speech recognition without any cloud dependency
  • Use cases requiring speaker diarization with more than ~10 speakers at very high accuracy
  • Scenarios where per-minute transcription cost must approach zero (self-hosted Whisper may be preferable)

Interface

REST API: Yes
GraphQL: No
gRPC: No
MCP Server: No
SDK: Yes
Webhooks: No

Authentication

Methods: api_key
OAuth: No
Scopes: No

Pass the API key in the Authorization: Token <key> header. Keys are project-scoped and are created per project in the console.
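
As a minimal sketch of the header format described above, the following Python builds (but does not send) a transcription request using only the standard library. The endpoint URL, the model query parameter, and the JSON body with a url field are assumptions not stated in this evaluation, and build_transcription_request is a hypothetical helper name; check the provider's API reference before relying on them.

```python
import json
import urllib.request

# Assumption: endpoint for pre-recorded audio; not stated in this evaluation.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_transcription_request(
    api_key: str, audio_url: str, model: str = "nova-3"
) -> urllib.request.Request:
    """Build, without sending, a POST request that asks for a transcript
    of remotely hosted audio. Hypothetical helper for illustration."""
    # Assumption: the request body is JSON with a "url" field.
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{DEEPGRAM_LISTEN_URL}?model={model}",
        data=body,
        method="POST",
        headers={
            # Header format taken from the Authentication section above.
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Sending the request is then a matter of passing it to urllib.request.urlopen inside a try/except block. Reading the key from an environment variable rather than hard-coding it keeps it out of source control.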

Pricing

Model: usage_based
Free tier: Yes
Requires CC: No

The $200 free credit requires no credit card. Once the credit is exhausted, a credit card is required to continue using the service.

Agent Metadata

Pagination: cursor
Idempotent: No
Retry Guidance: Documented
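
Because requests are not idempotent, a retried call may be processed (and billed) twice, so an agent should retry selectively. The sketch below shows one common pattern, exponential backoff with full jitter. The set of retryable status codes and all names here are assumptions for illustration, not the provider's documented policy, which should take precedence.

```python
import random
from typing import Callable, Iterator

# Assumption: statuses treated as transient. 402 (billing) is deliberately
# excluded because waiting will not fix a missing payment method.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def should_retry(status: int) -> bool:
    """Return True only for statuses assumed to be transient."""
    return status in RETRYABLE_STATUSES


def backoff_delays(
    max_attempts: int = 4,
    base: float = 0.5,
    cap: float = 8.0,
    rng: Callable[[], float] = random.random,
) -> Iterator[float]:
    """Yield the sleep time before each retry (max_attempts - 1 values).

    Each delay is a random fraction of min(cap, base * 2**attempt),
    the "full jitter" variant of exponential backoff.
    """
    for attempt in range(max_attempts - 1):
        yield min(cap, base * 2 ** attempt) * rng()
```

A caller would loop over backoff_delays(), sleeping for each yielded value and stopping as soon as a response succeeds or should_retry returns False.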

Known Gotchas

  • WebSocket clients must send a CloseStream message to flush the final transcript; omitting it can leave the final transcript incomplete
  • The is_final vs speech_final distinction in streaming results is critical: agents that act on interim results risk acting on incorrect partial text
  • Audio format should be declared explicitly (encoding, sample_rate, channels); relying on auto-detection may degrade accuracy
  • The free $200 credit expires; agents that reach production without a billing method on file receive 402 errors with no grace period
  • The Nova-3 model ID varies by use case (for example nova-3 vs nova-3-medical); choosing the wrong model tier affects both accuracy and cost
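
The streaming gotchas above can be sketched in Python without opening a connection. This is a hedged illustration: the wss:// endpoint, the linear16 default, the CloseStream JSON shape, and the result message layout (type, is_final, speech_final, channel.alternatives[0].transcript) are assumptions not stated in this evaluation, and every function and class name is hypothetical.

```python
import json
from typing import List, Optional
from urllib.parse import urlencode

# Assumption: streaming endpoint; not stated in this evaluation.
STREAM_URL = "wss://api.deepgram.com/v1/listen"


def build_stream_url(model: str = "nova-3", encoding: str = "linear16",
                     sample_rate: int = 16000, channels: int = 1) -> str:
    """Declare the audio format explicitly instead of relying on detection."""
    params = {"model": model, "encoding": encoding,
              "sample_rate": sample_rate, "channels": channels}
    return f"{STREAM_URL}?{urlencode(params)}"


def close_stream_message() -> str:
    """Text frame to send once all audio is sent (assumed JSON shape)."""
    return json.dumps({"type": "CloseStream"})


class TranscriptAssembler:
    """Collect finalized segments and emit one utterance per speech_final."""

    def __init__(self) -> None:
        self._segments: List[str] = []

    def feed(self, raw_message: str) -> Optional[str]:
        msg = json.loads(raw_message)
        # Ignore non-result messages and interim (not yet final) results.
        if msg.get("type") != "Results" or not msg.get("is_final"):
            return None
        alternatives = msg.get("channel", {}).get("alternatives", [])
        text = alternatives[0].get("transcript", "") if alternatives else ""
        if text:
            self._segments.append(text)
        # Only hand text to the agent once the speaker has finished.
        if msg.get("speech_final") and self._segments:
            utterance = " ".join(self._segments)
            self._segments = []
            return utterance
        return None
```

In use, an agent would connect to build_stream_url() with a WebSocket client of its choice, pass each received text frame to feed(), act only on non-None return values, and send close_stream_message() before closing the socket.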


Scores are editorial opinions as of 2026-03-06.