searxNcrawl
searxNcrawl provides a minimal MCP server (STDIO and HTTP transports) plus CLI tools for searching the web via SearXNG and crawling web pages or entire sites. It extracts readable Markdown using Crawl4AI, with configurable deduplication and optional authenticated crawling via a Playwright storage_state file.
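As a quick sketch of the SearXNG side of this flow: a SearXNG instance exposes a `/search` endpoint that returns JSON results when called with `format=json`. The instance URL and helper function below are illustrative, not taken from the project:

```python
# Hypothetical sketch: building a SearXNG JSON-API search URL.
# The /search endpoint and format=json parameter are standard SearXNG;
# the localhost instance URL is a placeholder, not from the project docs.
from urllib.parse import urlencode

def build_search_url(base_url: str, query: str, *, page: int = 1) -> str:
    """Return a SearXNG JSON-API search URL for the given query."""
    params = urlencode({"q": query, "format": "json", "pageno": page})
    return f"{base_url.rstrip('/')}/search?{params}"

url = build_search_url("http://localhost:8888", "crawl4ai markdown")
# → http://localhost:8888/search?q=crawl4ai+markdown&format=json&pageno=1
```

An agent or CLI wrapper would GET this URL and parse the returned JSON result list.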
Score Breakdown
⚙ Agent Friendliness
🔒 Security
The README documents optional basic auth credentials for SearXNG via environment variables and authenticated crawling via Playwright storage_state. It does not describe server-side access controls for the MCP HTTP transport, nor does it document secrets handling/logging or operational mitigations (e.g., SSRF protections, allowlists, crawl throttling, robots.txt behavior). TLS is presumably supported via HTTP(S) URLs, but enforcement and configuration are not explicitly stated.
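For illustration of the basic-auth option, credentials read from environment variables can be turned into an HTTP `Authorization` header. The variable names `SEARXNG_USER`/`SEARXNG_PASS` below are hypothetical; check the project README for the names it actually documents:

```python
import base64
import os

def searxng_auth_header(user_var: str = "SEARXNG_USER",
                        pass_var: str = "SEARXNG_PASS") -> dict:
    """Build an HTTP Basic auth header from env vars, if both are set.

    The variable names are illustrative placeholders, not the ones
    documented by the project."""
    user = os.environ.get(user_var)
    password = os.environ.get(pass_var)
    if not (user and password):
        return {}  # no credentials configured: send no auth header
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}
```

Keeping credentials in the environment rather than on the command line avoids leaking them into shell history and process listings.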
⚡ Reliability
Best When
You need local or agent-harness crawling with Markdown extraction and can run an MCP server you control, optionally pointed at your own SearXNG instance.
Avoid When
You need strict compliance guarantees or guaranteed-safe crawling behavior, or you cannot run Playwright (Chromium) and its required dependencies locally.
Use Cases
- Generating clean Markdown sources from documentation-heavy websites
- Batch crawling a list of URLs with concurrency control and deduplication
- Breadth-first site crawling with max depth/page limits
- Web search over a SearXNG instance for finding documentation pages
- Integrating crawling/search as tools for MCP-capable agent harnesses
Not For
- A production-grade public web crawling service without safeguards
- High-assurance authentication and authorization workflows (the auth flow is described as WIP)
- Stable, documented REST/SDK-based third-party integrations (not provided as a first-class interface)
Interface
Authentication
searxNcrawl does not enforce or scope authentication itself. Instead it relies on (a) optional basic auth credentials when calling a SearXNG instance and (b) a user-supplied Playwright storage_state file for logged-in browsing during crawls.
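For illustration, a Playwright storage_state file is JSON with top-level `cookies` and `origins` keys. It is typically produced by Playwright itself (e.g. `context.storage_state(path=...)` after a manual login); how searxNcrawl consumes the file is defined by its README. A minimal sketch of the file's shape:

```python
import json
import tempfile

# Minimal Playwright storage_state shape: top-level "cookies" and
# "origins" keys. Cookie values here are placeholders; a real file
# comes from Playwright after logging in to the target site.
state = {
    "cookies": [{
        "name": "session", "value": "abc123", "domain": "example.com",
        "path": "/", "expires": -1, "httpOnly": True, "secure": True,
        "sameSite": "Lax",
    }],
    "origins": [],
}

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(state, f)
    path = f.name

# Re-load and sanity-check the shape before handing it to a crawler.
with open(path) as f:
    loaded = json.load(f)
assert set(loaded) == {"cookies", "origins"}
```

Treat this file as a secret: it contains live session cookies for the logged-in site.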
Pricing
Repo metadata indicates MIT license; no hosted pricing is described.
Agent Metadata
Known Gotchas
- ⚠ Authenticated crawling is WIP; UX/flow may change and may not be fully reliable.
- ⚠ Crawling can be expensive (browser automation via Playwright/Chromium) and slow, depending on target pages and concurrency settings.
- ⚠ Site crawling uses BFS with max depth/page limits; agents should set tight limits to avoid unexpectedly large crawls.
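The BFS-with-limits behavior can be sketched as follows. Here `get_links` is a stand-in for the real fetch-and-extract step (which the project performs with Crawl4AI/Playwright), and the default caps are illustrative, not the project's:

```python
from collections import deque
from urllib.parse import urldefrag

def bfs_crawl(start_url, get_links, *, max_depth=2, max_pages=50):
    """Breadth-first crawl with depth/page caps and URL deduplication.

    `get_links(url)` is a stand-in for fetching a page and extracting
    its links; the caps mirror the max depth/page limits described
    above but their defaults here are illustrative."""
    start_url = urldefrag(start_url)[0]     # drop #fragment for dedup
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue                        # don't expand past max depth
        for link in get_links(url):
            link = urldefrag(link)[0]
            if link not in seen:            # dedup: enqueue each URL once
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

Setting both `max_depth` and `max_pages` tightly is the practical way to keep a site crawl from fanning out unexpectedly.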
Alternatives
Scores are editorial opinions as of 2026-03-30.