Hatchet

Durable workflow orchestration engine designed specifically for AI agent workloads. Hatchet provides background job execution with built-in durability, retries, concurrency control, and real-time streaming. Supports long-running AI workflows (multi-step LLM chains, agent loops) without timeouts. Built-in fan-out/fan-in patterns, priority queues, and rate limiting for agent orchestration. Self-hostable or managed cloud (Hatchet Cloud). Designed as a modern alternative to Celery/BullMQ for AI-native Python/TypeScript applications.

Evaluated Mar 06, 2026 (0d ago) v0.x (preview)
Homepage ↗ · Repo ↗
Tags: Developer Tools, workflow, background-jobs, orchestration, ai, python, typescript, open-source, durable-execution
⚙ Agent Friendliness: 58/100 (Can an agent use this?)
🔒 Security: 81/100 (Is it safe for agents?)
⚡ Reliability: 69/100 (Does it work consistently?)

Score Breakdown

⚙ Agent Friendliness

MCP Quality: --
Documentation: 78
Error Messages: 75
Auth Simplicity: 80
Rate Limits: 75

🔒 Security

TLS Enforcement: 95
Auth Strength: 78
Scope Granularity: 75
Dep. Hygiene: 80
Secret Handling: 78

MIT open source for auditability. TLS for gRPC worker connections; token-based auth for API access. Self-hosted deployments keep all data under your control. Secrets management is the application's responsibility: integrate with Vault or environment-based secrets.

⚡ Reliability

Uptime/SLA: 70
Version Stability: 65
Breaking Changes: 62
Error Recovery: 80

Best When

You're building AI agent applications that need durable multi-step workflows with streaming, retries, and concurrency control — and Celery's task model is too primitive.

Avoid When

Your background jobs are simple, stateless tasks with short execution times — a simpler queue (BullMQ, RQ) is sufficient and easier to operate.

Use Cases

  • Orchestrate multi-step AI agent workflows with durable execution — if a step fails, Hatchet retries from the last successful checkpoint rather than restarting the entire workflow
  • Run long-running LLM inference tasks as background jobs with real-time streaming output back to the calling service without HTTP timeout constraints
  • Implement fan-out agent patterns — spawn N parallel agent tasks and collect results with built-in fan-in using Hatchet's child workflow spawning
  • Rate-limit and queue agent API calls (OpenAI, Anthropic) to stay within provider rate limits using Hatchet's built-in concurrency and rate limiting primitives
  • Build human-in-the-loop agent workflows with pause/resume — Hatchet workflows can pause waiting for human approval and resume on signal without losing execution context
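The fan-out/fan-in use case above can be sketched with only the standard library. This is a conceptual illustration of the shape (parent spawns N parallel tasks, then collects results), not Hatchet's actual SDK API; in Hatchet the same pattern maps to a parent workflow spawning child workflows, and the function names here are illustrative.

```python
# Conceptual fan-out/fan-in sketch using only the standard library.
# In Hatchet, this shape corresponds to a parent workflow spawning child
# workflows and awaiting their results; names here are illustrative.
from concurrent.futures import ThreadPoolExecutor

def run_agent_task(prompt: str) -> dict:
    # Stand-in for one child agent task (e.g. a single LLM call).
    return {"prompt": prompt, "result": f"answer for {prompt!r}"}

def fan_out_fan_in(prompts: list[str]) -> list[dict]:
    # Fan out: start N tasks in parallel. Fan in: collect all results
    # in input order once every task has finished.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(run_agent_task, prompts))

results = fan_out_fan_in(["a", "b", "c"])
print(len(results))  # 3
```

In a real deployment, each child task would be a durable workflow run with its own retries, rather than an in-process thread.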

Not For

  • Simple fire-and-forget background jobs — Celery, BullMQ, or RQ are simpler for straightforward queue-and-execute patterns without complex orchestration needs
  • Data pipeline orchestration at scale — Prefect, Airflow, or Kestra are more mature for data pipeline-specific orchestration with richer scheduling
  • High-throughput event streaming — Hatchet handles workflow orchestration, not high-volume event streaming; use Kafka for millions of events/second

Interface

REST API: Yes
GraphQL: No
gRPC: Yes
MCP Server: No
SDK: Yes
Webhooks: Yes

Authentication

Methods: api_key, bearer_token
OAuth: No · Scopes: Yes

Hatchet uses API tokens for worker and service authentication; tokens are scoped to a tenant. Hatchet Cloud adds SSO (Okta, Google), while self-hosted deployments use configurable auth. Worker connections use gRPC with token-based auth.
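A minimal sketch of the environment-based token pattern described above. `HATCHET_CLIENT_TOKEN` is the variable the Hatchet SDKs conventionally read, but treat the name as an assumption and verify against your SDK version; the helper function is illustrative, not part of any SDK.

```python
# Sketch: read the tenant-scoped API token from the environment and
# attach it as a bearer header. The env var name HATCHET_CLIENT_TOKEN
# is an assumption; confirm it in your SDK version's docs.
import os

def auth_headers() -> dict:
    token = os.environ.get("HATCHET_CLIENT_TOKEN")
    if not token:
        raise RuntimeError("HATCHET_CLIENT_TOKEN is not set")
    # Attach the token per-request; never hard-code it in source.
    return {"Authorization": f"Bearer {token}"}

os.environ["HATCHET_CLIENT_TOKEN"] = "example-token"  # demo only
print(auth_headers()["Authorization"])  # Bearer example-token
```

Keeping the token in the environment (or a secrets manager) keeps it out of code and logs, which matters since tokens grant tenant-wide access.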

Pricing

Model: open_source
Free tier: Yes
Requires CC: No

MIT-licensed open source. Self-host on any infrastructure, or use Hatchet Cloud for managed hosting. Early-stage product (0.x); expect API changes. Growing quickly in the AI agent community.

Agent Metadata

Pagination: cursor
Idempotent: Full
Retry Guidance: Documented

Known Gotchas

  • Hatchet is 0.x software — API surface is evolving rapidly; pin SDK version and test on upgrades; breaking changes between minor versions are possible
  • Workers are long-running Python/TypeScript processes, not stateless functions — agent deployment model requires running Hatchet worker processes alongside application services
  • Workflow step context is serialized via JSON — complex Python objects (non-serializable types, closures) cannot be passed between steps; all step inputs/outputs must be JSON-serializable
  • Streaming output requires Hatchet's streaming-specific API patterns — standard step return values are not streamed; agents must use the streaming SDK primitives explicitly
  • Concurrency slots and rate limits are configured at workflow/step level in code — misconfigured concurrency allows unbounded parallel execution that can overwhelm downstream APIs
  • Self-hosting requires PostgreSQL + Redis + Hatchet server — more infrastructure than simpler queue systems; evaluate whether Hatchet Cloud simplifies operations
  • Human-in-the-loop pause/resume uses signal events — agents must implement event sending to resume paused workflows; timed-out waits require explicit timeout handling in step logic
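The JSON-serialization gotcha above can be made concrete: anything crossing a step boundary must round-trip through JSON intact. A small pre-flight check like the following (the helper name is illustrative, not a Hatchet API) catches closures, sets, and other non-serializable values before a workflow run fails at runtime.

```python
# Sketch: verify a payload can cross a Hatchet step boundary, i.e. it
# survives a JSON round-trip unchanged. Helper name is illustrative.
import json

def is_step_safe(payload) -> bool:
    """Return True if payload round-trips through JSON intact."""
    try:
        return json.loads(json.dumps(payload)) == payload
    except TypeError:
        # Closures, sets, custom objects, etc. are not serializable.
        return False

print(is_step_safe({"messages": [{"role": "user", "content": "hi"}]}))  # True
print(is_step_safe({"callback": lambda x: x}))  # False
```

Note the equality check is stricter than mere serializability: tuples, for example, come back as lists, which mirrors what a downstream step would actually receive.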



Scores are editorial opinions as of 2026-03-06.
