Hatchet
Durable workflow orchestration engine built for AI agent workloads. Hatchet provides background job execution with built-in durability, retries, concurrency control, and real-time streaming. It supports long-running AI workflows (multi-step LLM chains, agent loops) without timeouts, and offers built-in fan-out/fan-in patterns, priority queues, and rate limiting for agent orchestration. Self-hostable, or available as a managed cloud (Hatchet Cloud). Positioned as a modern alternative to Celery/BullMQ for AI-native Python/TypeScript applications.
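To make "durable execution" concrete, here is a minimal plain-Python sketch of the underlying idea, not Hatchet's actual SDK: each step's result is checkpointed as JSON, so a retry replays completed steps from their checkpoints instead of re-running them. The `DurableRun` class and `flaky` function are illustrative inventions.

```python
import json

class DurableRun:
    """Toy illustration of durable, checkpointed execution: each step's
    result is persisted, so a retry skips already-completed steps."""

    def __init__(self):
        self.checkpoints = {}  # step name -> JSON-serialized result

    def step(self, name, fn, *args):
        if name in self.checkpoints:                 # already done: replay result
            return json.loads(self.checkpoints[name])
        result = fn(*args)                           # actually run the step
        self.checkpoints[name] = json.dumps(result)  # persist before moving on
        return result

calls = []

def flaky(x):
    # Simulated step that fails transiently on its second invocation.
    calls.append(x)
    if len(calls) == 2:
        raise RuntimeError("transient failure")
    return x * 2

run = DurableRun()
try:
    a = run.step("step_a", flaky, 1)   # succeeds and is checkpointed
    b = run.step("step_b", flaky, a)   # fails on the first attempt
except RuntimeError:
    pass

# Retry the whole workflow: step_a replays from its checkpoint,
# step_b actually executes this time.
a = run.step("step_a", flaky, 1)
b = run.step("step_b", flaky, a)
print(b)  # 4
```

A real engine persists checkpoints to a database rather than memory; the replay-completed-steps behavior is the part that lets a crashed workflow resume mid-flight.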
Score Breakdown
⚙ Agent Friendliness
🔒 Security
MIT open source for auditability. TLS secures gRPC worker connections, and token-based auth protects API access. Self-hosted deployments keep all data under your control. Secrets management is the application's responsibility; integrate with Vault or environment-based secrets.
⚡ Reliability
Best When
You're building AI agent applications that need durable multi-step workflows with streaming, retries, and concurrency control — and Celery's task model is too primitive.
Avoid When
Your background jobs are simple, stateless tasks with short execution times — a simpler queue (BullMQ, RQ) is sufficient and easier to operate.
Use Cases
- Orchestrate multi-step AI agent workflows with durable execution: if a step fails, Hatchet retries from the last successful checkpoint rather than restarting the entire workflow
- Run long-running LLM inference tasks as background jobs, streaming output back to the calling service in real time without HTTP timeout constraints
- Implement fan-out agent patterns: spawn N parallel agent tasks and collect results with built-in fan-in using Hatchet's child workflow spawning
- Rate-limit and queue agent API calls (OpenAI, Anthropic) to stay within provider limits using Hatchet's built-in concurrency and rate-limiting primitives
- Build human-in-the-loop agent workflows with pause/resume: a workflow can pause while awaiting human approval and resume on a signal without losing execution context
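The fan-out/fan-in and bounded-concurrency use cases above can be sketched with stdlib `asyncio` alone. This is not Hatchet's API; it shows the pattern that Hatchet's child workflow spawning and concurrency slots give you as managed, durable primitives. The `agent_task` and `fan_out_fan_in` names are illustrative.

```python
import asyncio

async def agent_task(prompt, sem):
    # Stand-in for one spawned agent/LLM call; the semaphore caps
    # in-flight calls the way a concurrency slot or rate limit would.
    async with sem:
        await asyncio.sleep(0)          # placeholder for the real API call
        return f"result:{prompt}"

async def fan_out_fan_in(prompts, max_concurrency=2):
    sem = asyncio.Semaphore(max_concurrency)
    # Fan out: spawn one child task per prompt. Fan in: gather all results,
    # preserving input order.
    return await asyncio.gather(*(agent_task(p, sem) for p in prompts))

results = asyncio.run(fan_out_fan_in(["a", "b", "c"]))
print(results)  # ['result:a', 'result:b', 'result:c']
```

The difference in a workflow engine is that each child task is durable: a crashed child is retried independently without re-running its completed siblings.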
Not For
- Simple fire-and-forget background jobs: Celery, BullMQ, or RQ are simpler for straightforward queue-and-execute patterns without complex orchestration needs
- Data pipeline orchestration at scale: Prefect, Airflow, or Kestra are more mature for data-pipeline orchestration, with richer scheduling
- High-throughput event streaming: Hatchet handles workflow orchestration, not high-volume event delivery; use Kafka for millions of events per second
Interface
Authentication
Hatchet uses API tokens for worker and service authentication; tokens are scoped to a tenant. Hatchet Cloud adds SSO (Okta, Google), while self-hosted deployments use configurable auth. Worker connections run over gRPC with token auth.
Pricing
MIT-licensed open source; self-host on any infrastructure, or use Hatchet Cloud for managed hosting. Early-stage product (0.x): expect API changes. Growing quickly in the AI agent community.
Agent Metadata
Known Gotchas
- ⚠ Hatchet is 0.x software: the API surface is evolving rapidly. Pin the SDK version, test on upgrades, and expect possible breaking changes between minor versions
- ⚠ Workers are long-running Python/TypeScript processes, not stateless functions: deployment requires running Hatchet worker processes alongside application services
- ⚠ Workflow step context is serialized as JSON: complex Python objects (non-serializable types, closures) cannot be passed between steps; all step inputs and outputs must be JSON-serializable
- ⚠ Streaming output requires Hatchet's streaming-specific API patterns: standard step return values are not streamed, so agents must use the streaming SDK primitives explicitly
- ⚠ Concurrency slots and rate limits are configured per workflow/step in code: misconfigured concurrency allows unbounded parallelism that can overwhelm downstream APIs
- ⚠ Self-hosting requires PostgreSQL + Redis + the Hatchet server: more infrastructure than simpler queue systems, so evaluate whether Hatchet Cloud simplifies operations
- ⚠ Human-in-the-loop pause/resume uses signal events: agents must send an event to resume a paused workflow, and timed-out waits need explicit timeout handling in step logic
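The JSON-serialization gotcha is cheap to catch before a step ever reaches the engine. The hypothetical `assert_step_payload` helper below is not part of Hatchet's SDK; it is a stdlib-only pre-flight check you could run on step inputs and outputs in your own code.

```python
import json

def assert_step_payload(payload):
    """Fail fast if a step input/output is not JSON-serializable.
    Hypothetical pre-flight check, since step context crosses the
    wire as JSON."""
    try:
        json.dumps(payload)
    except (TypeError, ValueError) as exc:
        raise TypeError(f"step payload is not JSON-serializable: {exc}") from exc
    return payload

# Plain dicts/lists/strings/numbers pass through unchanged.
assert_step_payload({"messages": [{"role": "user", "content": "hi"}]})

# Closures, open files, and custom objects do not survive serialization.
try:
    assert_step_payload({"callback": lambda x: x})
    caught = False
except TypeError:
    caught = True
print(caught)  # True
```

Running this check in tests (or at step boundaries in development) surfaces a non-serializable payload as a clear error instead of a runtime failure inside the orchestrator.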
Alternatives
Full Evaluation Report
Detailed scoring breakdown, competitive positioning, security analysis, and improvement recommendations for Hatchet.
Scores are editorial opinions as of 2026-03-06.