{"id":"pyx-corp-spectrawl","name":"spectrawl","af_score":60.2,"security_score":47.8,"reliability_score":28.8,"what_it_does":"Spectrawl is a self-hosted Node.js “web layer” for AI agents that unifies web search, stealth browsing, crawling, page extraction (schema/LLM-based), natural-language browser actions, screenshot capture, and optional network request capturing. It also advertises auth/cookie management and adapters/fallbacks for multiple platforms.","best_when":"You want a single self-hosted Node package to power agent web research/extraction and you can provide Gemini/Tavily/Brave keys as needed, accepting that third-party sites may block automated access and that Spectrawl includes anti-detect/captcha handling.","avoid_when":"You need strict transparency/certifiable compliance for interacting with authenticated/protected services, or you cannot use HTTPS to external services (search/LLM), or you must avoid any stealth/anti-bot approaches.","last_evaluated":"2026-03-30T15:32:15.064883+00:00","has_mcp":false,"has_api":true,"auth_methods":["API keys for upstream services (GEMINI_API_KEY, TAVILY_API_KEY, BRAVE_API_KEY, etc.)","Stored platform auth cookies (e.g., auth: 'reddit')","Residential proxy credentials via spectrawl config for certain platforms (e.g., LinkedIn)"],"has_free_tier":true,"known_gotchas":["Stealth browsing/anti-bot escalation may behave differently across sites; responses include blocked/blockInfo but agent logic may need to react to blocked:true.","Rate limits may apply depending on which search engine is used (DDG mentions datacenter IP rate-limiting; Gemini Grounded has a monthly quota).","LLM-based extraction/summarization is non-deterministic; use provided schemas and citations/sources where possible.","CAPTCHA solver explicitly does not support some challenge types (per README), so agents should handle failure paths.","Caching and screenshot behaviors differ (screenshots bypass cache per README), which can affect idempotency and cost."],"error_quality":0.0}