skill-forge

skill-forge is a local Python CLI/repo scaffolding tool that designs, builds, reviews, evolves, publishes, evaluates, benchmarks, and converts “Claude Code” Agent Skills (per the Agent Skills open standard). It generates skill file trees, scripts, templates, orchestrator/sub-skill structures, and supports conversion to other tooling ecosystems.

Evaluated Mar 30, 2026 (67d ago)

Repo ↗ DevTools devtools ai-ml agent-skills claude-code cli scaffolding evaluation benchmarking python

⚙ Agent Friendliness

/ 100

Can an agent use this?

🔒 Security

/ 100

Is it safe for agents?

⚡ Reliability

/ 100

Does it work consistently?

Score Breakdown

⚙ Agent Friendliness

MCP Quality

Documentation

Error Messages

Auth Simplicity

100

Rate Limits

🔒 Security

TLS Enforcement

Auth Strength

Scope Granularity

Dep. Hygiene

Secret Handling

No network transport, TLS, or auth model described. The README claims “stdlib only” dependencies, which is a positive sign for dependency hygiene, but the content does not discuss secret handling, logging policies, or how generated code/artifacts are validated/sandboxed when executed.

⚡ Reliability

Uptime/SLA

Version Stability

Breaking Changes

Error Recovery

Best When

You want a developer workflow tool to generate and audit agent skills locally (Python 3.10+) and then test/benchmark/convert the resulting skill artifacts.

Avoid When

You need programmatic access via a stable network API contract, strict rate-limit semantics, or you cannot control the local execution environment (e.g., sandboxing is not available).

Use Cases

• Scaffolding production-grade Claude Code skills following an agent-skill standard
• Planning agent skill architectures with decomposition into sub-skills
• Reviewing existing skill content with a structured quality/health score
• Iterating on skills based on feedback (evolve)
• Packaging and preparing skills for distribution
• Running evaluation/benchmark pipelines for skill quality and performance
• Converting skills to other platforms (Codex/Gemini/Cursor/Antigravity per docs)

Not For

• Calling this as a remote service/API from applications (it appears to be a local CLI tool)
• Environments requiring enterprise-grade authentication/authorization for API access
• Use cases needing a documented HTTP/GraphQL/gRPC interface
• Security-sensitive workflows where provenance/validation of generated content must be independently audited

Interface

REST API

GraphQL

gRPC

MCP Server

SDK

Webhooks

Authentication

OAuth: No Scopes: No

No authentication scheme is described because this appears to be a local CLI that scaffolds and runs local scripts.

Pricing

Free tier: No

Requires CC: No

No pricing information provided; repository suggests MIT-licensed code that runs locally.

Agent Metadata

Pagination

none

Idempotent

False

Retry Guidance

Not documented

Known Gotchas

⚠ This is a CLI/tooling repo rather than a network API; agent integrations must operate via local command execution and file system artifacts.
⚠ Generated skills may include instructions and logic that should be reviewed before use; the repo describes generation but not safety guarantees.
⚠ Since no API contract/rate-limit semantics exist, agents should not assume retry/backoff patterns typical of HTTP services.

Alternatives

Anthropic/Claude Code skill templates and manual authoring General-purpose project scaffolding generators (custom scripts) Other agent-skill generators/templates aligned with Agent Skills standard Custom internal scaffolding + evaluation harnesses

Full Evaluation Report

Comprehensive deep-dive: security analysis, reliability audit, agent experience review, cost modeling, competitive positioning, and improvement roadmap for skill-forge.

AI-powered analysis · PDF + markdown · Delivered within 30 minutes

$99

Package Brief

Quick verdict, integration guide, cost projections, gotchas with workarounds, and alternatives comparison.

Delivered within 10 minutes

Score Monitoring

Get alerted when this package's AF, security, or reliability scores change significantly. Stay ahead of regressions.

Continuous monitoring

$3/mo

API endpoint ↗ Agent guide ↗ Report inaccuracy

Scores are editorial opinions as of 2026-03-30.