paperbanana-skill

Provides Claude Code skill definitions (and an underlying Python package) that generate publication-quality academic diagrams, statistical plots, and slide decks from text or structured data. A multi-agent pipeline adds evaluation/self-critique and provider fallback across multiple LLM/VLM/image providers.

Evaluated Mar 30, 2026
Homepage ↗ · Repo ↗
Tags: ai-ml, claude-code, skill, academic-diagrams, plotting, slides, multi-agent, evaluation, llm-providers, python
⚙ Agent Friendliness: 65/100 (Can an agent use this?)
🔒 Security: 52/100 (Is it safe for agents?)
⚡ Reliability: 40/100 (Does it work consistently?)

Score Breakdown

⚙ Agent Friendliness

MCP Quality: 0
Documentation: 65
Error Messages: --
Auth Simplicity: 80
Rate Limits: 60

🔒 Security

TLS Enforcement: 80
Auth Strength: 55
Scope Granularity: 20
Dep. Hygiene: 45
Secret Handling: 55

Security posture inferred from the README: the tool uses provider API keys with an interactive setup wizard, and mitigates plot-code injection with an AST-based import blocklist (os, subprocess, and socket are blocked). However, the README does not detail secret logging/redaction, dependency scanning, or fine-grained scopes. Network calls to multiple external providers carry data-exposure considerations, and there are no explicit data retention or residency statements.
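
The README names the AST-based import blocklist but not its implementation. A minimal sketch of how such a check could work, assuming the blocklist from the README (`BLOCKED_MODULES` and `check_plot_code` are illustrative names, not the package's actual API):

```python
import ast

# Hypothetical blocklist mirroring the modules the README says are blocked.
BLOCKED_MODULES = {"os", "subprocess", "socket"}

def check_plot_code(source: str) -> list[str]:
    """Return blocked imports found in generated plot code (illustrative sketch)."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import os`, `import os.path as p`, ...
            violations += [a.name for a in node.names
                           if a.name.split(".")[0] in BLOCKED_MODULES]
        elif isinstance(node, ast.ImportFrom):
            # `from os.path import join`, ...
            if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
                violations.append(node.module)
    return violations
```

Because the check runs on the syntax tree rather than on strings, aliasing (`import subprocess as sp`) is still caught; dynamic tricks such as `__import__("os")` would need additional handling.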

⚡ Reliability

Uptime/SLA: 0
Version Stability: 45
Breaking Changes: 40
Error Recovery: 75

Best When

You want consistent academic visuals quickly and can supply prompts/descriptions (and optionally data or a PDF), letting iterative evaluation and provider fallback improve quality.

Avoid When

You need deterministic outputs, formal SLAs, or strict governance that disallows automated self-critique loops and multi-provider network calls.

Use Cases

  • Text-to-figure for academic diagrams (methodology, pipeline, architecture)
  • CSV/JSON-to-academic statistical plots with auto-styling
  • Markdown/text-to-presentation slide decks with selectable style presets
  • Venue-specific (NeurIPS/ICML/ACL/IEEE) diagram styling
  • Iterative refinement loops (auto/continue with feedback)
  • Generating from PDF page ranges for diagram prompts

Not For

  • Producing medical/legal imagery that requires strict clinical/regulated validation
  • Fully offline/no-external-API environments (relies on external providers)
  • Use cases requiring a stable, formal REST API contract for programmatic integration beyond the CLI/skill workflow
  • High-assurance content pipelines where autonomous generation must be strictly audited

Interface

REST API: No
GraphQL: No
gRPC: No
MCP Server: No
SDK: No
Webhooks: No

Authentication

Methods: provider API keys via environment variables (GOOGLE_API_KEY, ANTHROPIC_API_KEY, OPENAI_API_KEY, AWS credentials, OPENROUTER_API_KEY); an interactive setup wizard helps configure them.
OAuth: No · Scopes: No

Authentication is provider-key based; the skill itself is described as operating through the Claude Code plugin/skill interface and the underlying Python CLI setup.

Pricing

Free tier: Yes
Requires credit card (CC): No

Costs depend on selected providers and usage; README does not quantify spend or token/image limits.

Agent Metadata

Pagination: none
Idempotent: No
Retry Guidance: Documented

Known Gotchas

  • Outputs may be marked UNREVIEWED when the critic cannot parse or evaluate the result; manual review may then be needed.
  • Provider capabilities differ (e.g., per the README, Claude's VLM does not support image generation).
  • Multi-provider fallback chains and iterative loops make outputs non-deterministic.
  • Runs can be long depending on iteration/auto/refinement settings; tuning may be required.
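
The fallback behavior noted above can be sketched as a simple chain: try each provider in order and return the first success. This is an illustrative sketch, not the package's actual fallback logic; the provider names and `generate` callables are placeholders:

```python
def generate_with_fallback(prompt, providers):
    """Try each (name, generate_fn) pair in order; return the first success.

    Illustrative sketch of a provider fallback chain, not the package's API.
    """
    errors = {}
    for name, generate_fn in providers:
        try:
            return name, generate_fn(prompt)
        except Exception as exc:  # real code would catch provider-specific errors
            errors[name] = exc
    raise RuntimeError(f"All providers failed: {errors}")
```

The non-determinism gotcha follows directly: which provider actually answers depends on transient failures earlier in the chain, so identical prompts can yield different outputs across runs.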

Alternatives

Full Evaluation Report

Comprehensive deep-dive: security analysis, reliability audit, agent experience review, cost modeling, competitive positioning, and improvement roadmap for paperbanana-skill.

AI-powered analysis · PDF + markdown · Delivered within 30 minutes

$99

Package Brief

Quick verdict, integration guide, cost projections, gotchas with workarounds, and alternatives comparison.

Delivered within 10 minutes

$3

Score Monitoring

Get alerted when this package's AF, security, or reliability scores change significantly. Stay ahead of regressions.

Continuous monitoring

$3/mo

Scores are editorial opinions as of 2026-03-30.
