verify

A Claude Code plugin (“opslane-verify”) for spec-driven verification. It reads an acceptance-spec document, extracts the acceptance criteria, and runs one Playwright browser-automation agent per criterion against a local dev server. A judge step then reviews the screenshots and traces and produces pass/fail results with evidence (screenshots and session recordings).
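
The shape of that workflow (extract criteria, run one agent per criterion, judge the evidence) can be sketched roughly as follows. This is an illustrative outline only, not the plugin's code: the `Evidence` and `Verdict` types, the `run_agent` and `judge` callables, and the "one criterion per `- ` line" spec format are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Evidence:
    """Artifacts a browser run might leave behind (hypothetical shape)."""
    screenshots: List[str] = field(default_factory=list)
    notes: str = ""


@dataclass
class Verdict:
    criterion: str
    passed: bool
    evidence: Evidence


def extract_criteria(spec_text: str) -> List[str]:
    """Assumed spec format: each criterion is a line starting with '- '."""
    return [
        line.strip()[2:].strip()
        for line in spec_text.splitlines()
        if line.strip().startswith("- ")
    ]


def verify(
    spec_text: str,
    run_agent: Callable[[str], Evidence],
    judge: Callable[[str, Evidence], bool],
) -> List[Verdict]:
    """Run one agent per criterion, then let the judge decide pass/fail."""
    verdicts = []
    for criterion in extract_criteria(spec_text):
        evidence = run_agent(criterion)  # stand-in for a browser agent run
        verdicts.append(Verdict(criterion, judge(criterion, evidence), evidence))
    return verdicts
```

Keeping the agent and the judge as separate callables mirrors the two-step description above: the agent only collects evidence, and the pass/fail decision is made afterwards from that evidence.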

Evaluated Mar 30, 2026
Category: Testing
Tags: claude, claude-code, claude-code-plugins, claude-code-skill, verification, playwright, browser-automation, spec-testing, qa, evidence
⚙ Agent Friendliness: 35 / 100 (Can an agent use this?)
🔒 Security: 48 / 100 (Is it safe for agents?)
⚡ Reliability: 6 / 100 (Does it work consistently?)

Score Breakdown

⚙ Agent Friendliness

MCP Quality: 35
Documentation: 60
Error Messages: 0
Auth Simplicity: 70
Rate Limits: 0

🔒 Security

TLS Enforcement: 60
Auth Strength: 60
Scope Granularity: 20
Dep. Hygiene: 45
Secret Handling: 50

The provided content does not describe TLS usage, secret handling, or dependency posture for the plugin itself; it references only Playwright and the fact that Claude Code uses OAuth login. Because the workflow runs browser agents against a local dev server, those agents may see any application data visible in the UI, so users should confirm that the test environment and the data it exposes are appropriate.
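
One way to act on that caution is to check, before any browser agent starts, that the target URL really points at the local machine. The helper below is a minimal sketch of such a guard; `is_loopback_url` is a hypothetical name, and nothing in the provided content indicates that the plugin performs this check itself.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_url(url: str) -> bool:
    """Return True only if the URL's host is localhost or a loopback IP."""
    host = urlparse(url).hostname
    if host is None:
        # No parseable host (for example, a URL written without a scheme).
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as non-local.
        return False
```

A wrapper script could refuse to start a verification run when this returns False, which keeps agents away from staging or production hosts that hold real data.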

⚡ Reliability

Uptime/SLA: 0
Version Stability: 0
Breaking Changes: 0
Error Recovery: 25

Best When

You have a local dev server and a stable, testable acceptance-spec format, and you want quick feedback with visual evidence for each acceptance criterion.

Avoid When

You need strongly controlled, deterministic testing at scale (e.g., large suites in shared CI environments), or you cannot safely run browser agents that interact with the app under test.

Use Cases

  • Validating web-app behavior against a written set of acceptance criteria before pushing changes
  • Generating automated evidence (screenshots/video/Playwright traces) for failed acceptance criteria
  • Reducing manual QA effort for regression checks driven by spec documents

Not For

  • Use as a general-purpose CI/CD system (README explicitly states “No CI. No infrastructure.”)
  • High-assurance security testing or compliance validation without human review (it’s an automated verification workflow, not a formal verifier)
  • Environments where you cannot run browser automation against a local dev server

Interface

REST API: No
GraphQL: No
gRPC: No
MCP Server: Yes
SDK: No
Webhooks: No

Authentication

Methods: claude login (OAuth login for Claude Code)
OAuth: Yes
Scopes: No

The README mentions Claude Code with OAuth login via `claude login`, but does not describe scopes or plugin-specific authorization beyond requiring that login.

Pricing

Free tier: No
Requires CC: No

No pricing information found in the provided content.

Agent Metadata

Pagination: none
Idempotent: False
Retry Guidance: Not documented

Known Gotchas

  • Spec interpretation/planning may require the user to provide a testable spec and answer clarifying questions before execution.
  • Browser-agent runs can be non-deterministic if the app under test is flaky or uses timing-dependent UI behavior.
  • Parallel execution per acceptance criterion (if implemented) may stress shared dev-server state; without explicit isolation, runs may affect each other.
  • The workflow is evidence-heavy (screenshots/video/trace), so large sessions or environment limitations may reduce how much evidence can be captured or stored.
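
Since retry guidance is not documented and browser runs can be non-deterministic, one workaround is to repeat a check a few times and separate consistent results from flaky ones. The sketch below assumes a hypothetical zero-argument `check` callable that returns True on pass; it illustrates the idea and is not part of the plugin.

```python
from collections import Counter
from typing import Callable


def classify_runs(check: Callable[[], bool], attempts: int = 3) -> str:
    """Run a check several times and report 'pass', 'fail', or 'flaky'."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    results = [bool(check()) for _ in range(attempts)]
    counts = Counter(results)
    if len(counts) == 1:
        # Every attempt agreed, so the outcome is treated as stable.
        return "pass" if results[0] else "fail"
    # Mixed outcomes suggest timing-dependent UI or shared-state interference.
    return "flaky"
```

Running the attempts one after another, rather than in parallel, also avoids the shared dev-server state problem noted above, at the cost of a longer run.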

Scores are editorial opinions as of 2026-03-30.
