cortex-scout

CortexScout is a self-hostable Rust engine for web search and web extraction for AI agents. It exposes tools over an MCP stdio server (and optionally an HTTP server) and includes stateful browser automation (CDP) plus optional Human-in-the-Loop (HITL) flows for handling bot challenges and logins.

Evaluated Mar 30, 2026

DevTools · ai-agents · web-scraping · web-extraction · search · mcp · browser-automation · hitl · rust · stateful-automation · cdp

Score Breakdown

⚙ Agent Friendliness

MCP Quality · Documentation · Error Messages · Auth Simplicity · Rate Limits

🔒 Security

TLS Enforcement · Auth Strength · Scope Granularity · Dep. Hygiene · Secret Handling

The README does not specify TLS enforcement for the optional HTTP server; it shows only a plain-HTTP health check. It also does not describe service-to-service auth or scopes for the local MCP/HTTP server. LLM API keys are supplied through environment variables, which is a reasonable pattern, but nothing is said about logging or redaction of secrets. The project explicitly supports handling bot challenges and persists browser session state, which raises operational risk and calls for careful review of legal, ethical, and access-control policies for target sites.
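Since the evaluation found no documented redaction of secrets, an operator who captures the server's stderr may want to scrub key values before storing logs. The following is a minimal sketch of that idea, not part of CortexScout: `redact_secrets` and `SECRET_ENV_VARS` are hypothetical names, and only `OPENAI_API_KEY` is taken from the text above.

```python
import os

# Environment variables treated as secrets. OPENAI_API_KEY is the only name
# mentioned in the evaluation; extend the tuple for other keys you configure.
SECRET_ENV_VARS = ("OPENAI_API_KEY",)


def redact_secrets(line, env=None, names=SECRET_ENV_VARS):
    """Replace any configured secret value found in `line` with a placeholder."""
    env = os.environ if env is None else env
    for name in names:
        value = env.get(name)
        # Skip unset or very short values so ordinary text is not masked.
        if value and len(value) >= 8:
            line = line.replace(value, f"[REDACTED:{name}]")
    return line
```

A wrapper process could pass each captured log line through `redact_secrets` before writing it to disk. This only catches verbatim key values; it does not replace binding the HTTP server to loopback or placing it behind a TLS-terminating proxy.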

⚡ Reliability

Uptime/SLA · Version Stability · Breaking Changes · Error Recovery

Best When

You need an agent-facing, self-hosted MCP toolset for search + extraction, including stateful automation and optional HITL to handle hard-to-fetch JS-heavy and auth-gated pages.

Avoid When

You need a simple, stable public API with no browser automation or you must avoid interactions that resemble bypass/automation on protected targets; also avoid in environments where outbound scraping/automation is prohibited.

Use Cases

• AI agent web search and result dedup/scoring
• Token-efficient page fetching and clean structured extraction
• Schema-driven extraction (extract_fields, fetch_then_extract)
• Crawl/bounded discovery for documentation or sub-pages
• Stateful browser automation for agent workflows and E2E-style interactions (scout_browser_automate)
• Human-in-the-loop assistance for CAPTCHA/complex auth (scout_agent_profile_auth / hitl_web_fetch)
• Deep research pipelines (multi-hop search + scrape + LLM synthesis)
• Semantic memory for research history via LanceDB (memory_search)

Not For

• Use as a hosted SaaS with managed credentials (it is self-hosted tooling)
• Use cases requiring strict compliance guarantees without review (bot-handling/automation may conflict with target site policies)
• Highly regulated environments where browser automation and third-party outbound requests are disallowed
• No-auth, read-only environments where even optional LLM-key configuration cannot be supplied

Interface

REST API: Yes

MCP Server: Yes

GraphQL, gRPC, SDK, Webhooks: no value shown

Authentication

Methods: MCP stdio (local process) usage; environment-variable configuration for optional LLM API key and optional HTTP server runtime

OAuth: No

Scopes: No

The README describes local/server runtime configuration via environment variables (e.g., OPENAI_API_KEY for synthesis). It does not describe user-level auth for the MCP interface or HTTP server. For accessing protected target sites, the project provides HITL-assisted login/cookie persistence, but that is not described as a standards-based OAuth flow for the CortexScout service itself.

Pricing

Free tier: No

Requires CC: No

Open-source MIT license; self-hosted binaries.

Agent Metadata

Pagination: none

Idempotent: False

Retry Guidance: Documented
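Because the tools are listed as not idempotent, a client-side retry policy probably should distinguish calls that are safe to repeat (for example a plain fetch) from calls that change browser state. The sketch below is a generic pattern under that assumption; it does not reproduce the project's own retry guidance, and `call_with_retry` and `safe_to_repeat` are hypothetical names.

```python
import time


def call_with_retry(call, *, safe_to_repeat, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run `call`, retrying with exponential backoff only when repeating is safe."""
    # Calls that are not safe to repeat get exactly one attempt.
    max_attempts = attempts if safe_to_repeat else 1
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                raise
            # Delay doubles on each failed attempt: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The caller decides what counts as safe to repeat; a stateful automation step that may already have clicked or submitted something would normally be passed with `safe_to_repeat=False`.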

Known Gotchas

⚠ HITL/profile flows may cause Chromium profile lock errors if run concurrently or if the same profile is reused while Chrome/Brave is open.
⚠ Setting RUST_LOG too verbosely (e.g., info) can flood stderr and confuse MCP clients; RUST_LOG=warn is recommended.
⚠ Proxy support is optional and enabled only when IP_LIST_PATH is set; leaving it unset disables proxy tools.
⚠ HITL/browser steps that require visible interaction need a desktop session and an --all-features build.

Alternatives

• Playwright/Puppeteer + custom scraping/extraction
• ZenRows or similar managed scraping APIs
• Apify (platform + actor marketplace)
• Browserless (SaaS browser automation) or self-hosted browser automation
• Readability/DOM extraction libraries plus simple HTTP fetchers
• LLM-assisted RAG pipelines using a search API + your own fetch/extract layer

Scores are editorial opinions as of 2026-03-30.