Web Eval Agent
A now-sunsetted MCP server that autonomously evaluated web applications by driving a browser agent through user-specified tasks, capturing screenshots, console logs, and network traffic, then returning a rich UX report to the calling AI agent.
Score Breakdown
⚙ Agent Friendliness
🔒 Security
Community-maintained, specialized tool. Apply the standard security practices for this category, and review the project documentation for tool-specific security requirements.
⚡ Reliability
Best When
Historically best for quick autonomous UX evaluation loops within AI coding editors — but the project is discontinued. Evaluate alternatives instead.
Avoid When
Starting new integrations — the project is sunsetted. Use vibetest-use or a maintained browser testing MCP instead.
Use Cases
- Autonomous end-to-end testing of web apps from within Cursor, Cline, or Windsurf
- Letting coding agents self-test their own implementations before committing
- Capturing network traffic and console errors during automated UI walkthroughs
- Browser session state setup (login/auth) for subsequent automated test runs
Not For
- New projects: this tool is sunsetted and no longer maintained
- Production CI/CD pipelines requiring long-term stability
- Teams needing enterprise support or SLA guarantees
Interface
Authentication
Requires a free API key from operative.sh/mcp. Because the project is sunsetted, key availability is uncertain.
Pricing
Apache-2.0 licensed. A free API key from operative.sh is required; its availability after the sunset is unclear.
Agent Metadata
Known Gotchas
- ⚠ PROJECT IS SUNSETTED — team is building something new at withrefresh.com
- ⚠ Requires an API key from operative.sh, which may not be available post-sunset
- ⚠ Depends on the BrowserUse framework, which itself evolves rapidly
- ⚠ Playwright must be installed separately before use
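Two of the gotchas above (the API key and the separate Playwright install) can be checked before launching the server. The following is a minimal sketch, not part of the project: the helper name `preflight` is hypothetical, the environment variable name `OPERATIVE_API_KEY` is an assumption that should be verified against the project's README, and the check assumes the Python `playwright` package is the one in use. It only confirms the package is importable, not that browser binaries have been downloaded.

```python
import importlib.util
import os


def preflight(env=None, module_name="playwright", key_name="OPERATIVE_API_KEY"):
    """Return a list of problems; an empty list means both checks passed.

    key_name is an ASSUMED environment variable name, not confirmed by the
    listing. module_name is the Python package expected to be installed.
    """
    env = os.environ if env is None else env
    problems = []

    # Gotcha: an API key is required and may be unavailable post-sunset.
    if not env.get(key_name, "").strip():
        problems.append(f"{key_name} is not set")

    # Gotcha: Playwright must be installed separately. This only checks that
    # the Python package can be located, not that browsers are installed.
    if importlib.util.find_spec(module_name) is None:
        problems.append(f"Python package '{module_name}' is not installed")

    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```

Returning a list rather than raising on the first failure lets a caller report every missing prerequisite in one pass.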
Alternatives
Scores are editorial opinions as of 2026-03-06.