Cheerio

Fast, flexible jQuery-like HTML/XML parser for Node.js. Implements a subset of jQuery's core API for traversing and manipulating parsed HTML/XML documents without a browser. Used for server-side HTML scraping, parsing, and manipulation. Parses HTML strings into a traversable DOM using htmlparser2 or parse5, then exposes jQuery-style selectors and traversal methods ($('.class'), $('a').attr('href'), etc.).

Evaluated: Mar 06, 2026
Version: v1.x
Category: Developer Tools
Tags: html-parsing, web-scraping, jquery, dom, selector, node, scraping
⚙ Agent Friendliness: 68 / 100 (Can an agent use this?)
🔒 Security: 98 / 100 (Is it safe for agents?)
⚡ Reliability: 88 / 100 (Does it work consistently?)

Score Breakdown

⚙ Agent Friendliness

MCP Quality: --
Documentation: 88
Error Messages: 78
Auth Simplicity: 100
Rate Limits: 100

🔒 Security

TLS Enforcement: 100
Auth Strength: 100
Scope Granularity: 100
Dep. Hygiene: 85
Secret Handling: 100

MIT licensed. Local HTML parsing — no network calls from Cheerio itself. XSS risk if parsed HTML is rendered back to browser without sanitization.

⚡ Reliability

Uptime/SLA: 100
Version Stability: 85
Breaking Changes: 80
Error Recovery: 88

Best When

You have static HTML (from HTTP responses or file reads) and need to extract data using familiar jQuery-like CSS selectors in Node.js.

Avoid When

The page requires JavaScript execution to render content — use Playwright, Puppeteer, or a headless browser for dynamic pages.

Use Cases

  • Scrape structured data from HTML pages in agent web data collection pipelines using CSS selectors
  • Parse and extract specific content from HTML documents returned by agent HTTP requests
  • Transform HTML documents server-side — add/remove/modify elements before rendering or forwarding
  • Extract links, tables, or structured content from web pages for agent knowledge base population
  • Parse HTML email bodies to extract specific information in agent email processing workflows

Not For

  • JavaScript-rendered pages requiring actual browser execution — use Playwright or Puppeteer for dynamic content
  • CSS styling or layout calculations — Cheerio has no rendering engine; purely structural DOM traversal
  • High-performance streaming HTML processing — use htmlparser2 directly for streaming; Cheerio loads full document into memory

Interface

REST API: No
GraphQL: No
gRPC: No
MCP Server: No
SDK: Yes
Webhooks: No

Authentication

Methods: none
OAuth: No
Scopes: No

Local library — no authentication required. MIT licensed.

Pricing

Model: open_source
Free tier: Yes
Requires CC: No

MIT licensed. Zero cost.

Agent Metadata

Pagination: none
Idempotent: Full
Retry Guidance: Not documented

Known Gotchas

  • Cheerio only parses STATIC HTML — JavaScript-rendered content is invisible; use Playwright/Puppeteer for SPAs or React/Vue apps
  • A selector with no match returns an empty Cheerio selection (length 0), not null; always check .length: if ($('.item').length > 0) { /* exists */ }
  • load() creates a new Cheerio scope: const $ = cheerio.load(html) — must create $ before using selectors; the module export is not directly a $ function
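  • The 1.0 release line made parse5 the default HTML parser in place of htmlparser2. parse5 is more spec-compliant but treats malformed HTML differently (for example, it adds missing <html>, <head>, and <body> elements). To keep parsing HTML with htmlparser2, use load(html, { xml: { xmlMode: false } })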
  • Text extraction: $('.selector').text() returns the combined text of every matched element and its descendants; .html() returns the inner HTML of the first matched element only. Choose based on whether you need nested tags
  • Modification doesn't affect the original HTML string — Cheerio creates an in-memory DOM; use $.html() or $.xml() to serialize back to string

Scores are editorial opinions as of 2026-03-06.
