pilot-shell

Pilot Shell is an installer and CLI wrapper around Claude Code. It provides a structured “spec-driven development” workflow (plan, approve, implement, verify), enforced quality hooks (lint, format, typecheck), git worktree-based task isolation, and a local dashboard that shows session, spec, review, and status information.
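To make the plan/approve/implement/verify sequence concrete, the following is a minimal Python sketch of a gated workflow modeled as a small state machine. It is illustrative only: the `Stage` enum, the `SpecTask` class, and the rule that a rejected approval returns to planning are hypothetical and are not taken from Pilot Shell's code or README.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    PLAN = "plan"
    APPROVE = "approve"
    IMPLEMENT = "implement"
    VERIFY = "verify"
    DONE = "done"


# Forward transitions when a stage completes successfully.
NEXT_STAGE = {
    Stage.PLAN: Stage.APPROVE,
    Stage.APPROVE: Stage.IMPLEMENT,
    Stage.IMPLEMENT: Stage.VERIFY,
    Stage.VERIFY: Stage.DONE,
}


@dataclass
class SpecTask:
    """Hypothetical task record; not Pilot Shell's actual data model."""

    title: str
    stage: Stage = Stage.PLAN
    history: list = field(default_factory=list)

    def advance(self, passed: bool = True) -> Stage:
        """Move to the next stage.

        `passed` is only consulted at the two gates: a rejected approval
        returns to PLAN, and a failed verification returns to IMPLEMENT.
        """
        if self.stage is Stage.DONE:
            raise ValueError("task is already done")
        self.history.append(self.stage)
        if self.stage is Stage.APPROVE and not passed:
            self.stage = Stage.PLAN
        elif self.stage is Stage.VERIFY and not passed:
            self.stage = Stage.IMPLEMENT
        else:
            self.stage = NEXT_STAGE[self.stage]
        return self.stage
```

In this sketch, implementation cannot start until the approval gate passes, and a task only reaches `DONE` through a successful verification.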

Evaluated Mar 30, 2026 (111d ago)

Category and tags: DevTools, ai-agents, ai-coding-tools, claude-code, spec-driven-development, tdd, developer-tools, cli

⚙ Agent Friendliness

/ 100

Can an agent use this?

🔒 Security

/ 100

Is it safe for agents?

⚡ Reliability

/ 100

Does it work consistently?

Score Breakdown

⚙ Agent Friendliness

MCP Quality

Documentation

Error Messages

Auth Simplicity

Rate Limits

🔒 Security

TLS Enforcement

Auth Strength

Scope Granularity

Dep. Hygiene

Secret Handling

Pilot Shell downloads and runs installer scripts (the curl | bash pattern) and installs multiple tools (language servers, browser automation, dependencies). The README gives no detailed guidance on secret storage or redaction, least-privilege scopes, or threat modeling for local execution. Rather than defining its own auth mechanism, it relies on Claude/Claude Code authentication and the external subscription status.
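One common mitigation for the curl | bash pattern is to download the installer to a file, read it, and compare its hash with a checksum obtained through a separate trusted channel before running it. The Python sketch below shows only the comparison step. It assumes such a checksum is available, which this evaluation does not confirm for Pilot Shell; the function names are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def installer_is_trusted(path: Path, expected_sha256: str) -> bool:
    """Compare a downloaded script against a known-good checksum.

    `expected_sha256` is assumed to come from a source you trust
    independently of the download itself.
    """
    actual = sha256_of(path)
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

Only if `installer_is_trusted(...)` returns `True` would the saved script then be executed. The check detects tampering or truncation in transit, not a malicious script that was published along with a matching checksum.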

⚡ Reliability

Uptime/SLA

Version Stability

Breaking Changes

Error Recovery

Best When

You are already using Claude Code and want stronger engineering discipline (TDD, verification, review gates) with a consistent workflow across sessions and projects.

Avoid When

You need a simple chat interface only, or you cannot accommodate local installation, hooks, and test execution as part of the agent loop.

Use Cases

• Coding tasks where you want spec-driven planning and verification instead of free-form agent chatting
• Reducing regressions by enforcing TDD and running full test suites during agent work
• Review-gated changes with interactive annotations (spec approval and inline diff review)
• Headless/CI automation of Claude Code workflows using a non-interactive flag
• Project bootstrapping of rules/hooks/MCP servers via /setup-rules

Not For

• Teams that cannot run local CLI tooling or that require a purely remote SaaS workflow
• Workflows that do not use git or where worktree-based isolation is not feasible
• Environments that cannot install browser automation components or required language servers
• Use as a general-purpose external API service (it is primarily a local dev tool)
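For readers unfamiliar with the worktree-based isolation mentioned in the list above, the sketch below builds a standard `git worktree add` command that checks out a new branch into a sibling directory. The `task/<name>` branch naming and the `<repo>-<name>` directory layout are assumptions made for illustration; how Pilot Shell actually names or places its worktrees is not documented here.

```python
import subprocess
from pathlib import Path


def worktree_add_command(repo: Path, task: str, base_ref: str = "HEAD") -> list:
    """Build the argv for `git worktree add -b <branch> <path> <ref>`.

    Branch and directory naming are hypothetical conventions.
    """
    branch = f"task/{task}"
    target = repo.parent / f"{repo.name}-{task}"
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(target), base_ref]


def create_worktree(repo: Path, task: str) -> None:
    """Run the command; raises CalledProcessError if git refuses."""
    subprocess.run(worktree_add_command(repo, task), check=True)
```

Each task then gets its own working directory and branch, so an agent's changes do not touch the main checkout until they are merged. This requires the project to be a git repository, which is why non-git workflows are listed as a poor fit.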

Interface

REST API

GraphQL

gRPC

MCP Server

Yes

SDK

Webhooks

Authentication

Methods: Claude subscription/auth via claude auth status (as referenced in README for tier detection)

OAuth: No

Scopes: No

Pilot Shell appears to rely on Claude Code’s existing authentication model; README does not specify OAuth flows or fine-grained scope management for Pilot itself.
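An agent or CI script could check authentication up front by running the command named above and inspecting its exit status. The sketch below does this in Python. It assumes that a zero exit code means "authenticated"; the actual exit codes and output format of `claude auth status` are not described in this evaluation, so treat that mapping as unverified. The function name is hypothetical.

```python
import subprocess


def auth_status_ok(command=("claude", "auth", "status"), timeout=15.0) -> bool:
    """Return True if the status command exits with code 0.

    Assumption: exit code 0 indicates a valid login. A missing binary
    or a hung command is treated as "not authenticated".
    """
    try:
        result = subprocess.run(
            list(command), capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

The command is a parameter so the check can be pointed at a stub in tests or adjusted if the real CLI behaves differently.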

Pricing

Free tier: No

Requires CC: No

README discusses Claude subscription tiers for usage; it does not state a separate Pilot Shell pricing model.

Agent Metadata

Pagination: none

Idempotent: True

Retry Guidance: Not documented

Known Gotchas

⚠ Requires local environment setup (dependencies, language servers, hooks); failures may occur if prerequisites can’t be installed/initialized.
⚠ Worktree-based implementation and browser automation for UI E2E may be slow/fragile in constrained environments.
⚠ Headless mode changes interaction style; agent assumptions about interactive approval may not hold without proper flags.
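The first gotcha can be caught early with a preflight check that confirms the required executables are on the PATH before an agent session starts. The sketch below is a minimal Python version. The default tool list (`git` and `claude`) is inferred from this page; the full set of prerequisites Pilot Shell needs is not listed here, so extend it as appropriate. The function name is hypothetical.

```python
import shutil


def missing_tools(required=("git", "claude")) -> list:
    """Return the names from `required` that are not found on PATH."""
    return [name for name in required if shutil.which(name) is None]
```

A caller would abort with a clear message if the returned list is non-empty, rather than letting the workflow fail partway through.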

Alternatives

• Claude Code without Pilot Shell (built-in plan/commands/hooks only)
• Other agent frameworks with custom scaffolding and orchestration (varies by vendor/tooling)
• Local dev tooling centered on CI/test enforcement (e.g., pre-commit + CI gates) combined with any coding agent

Scores are editorial opinions as of 2026-03-30.