Cheerio

Fast, flexible jQuery-like HTML/XML parser for Node.js. Implements a subset of jQuery's core API for traversing and manipulating parsed HTML/XML documents without a browser. Used for server-side HTML scraping, parsing, and manipulation. Parses HTML strings into a traversable DOM using htmlparser2 or parse5, then exposes jQuery-style selectors and traversal methods ($('.class'), $('a').attr('href'), etc.).

Evaluated Mar 06, 2026 v1.x

Developer Tools html-parsing web-scraping jquery dom selector node scraping

⚙ Agent Friendliness

/ 100

Can an agent use this?

🔒 Security

/ 100

Is it safe for agents?

⚡ Reliability

/ 100

Does it work consistently?

Score Breakdown

⚙ Agent Friendliness

MCP Quality

Documentation

Error Messages

Auth Simplicity

100

Rate Limits

100

🔒 Security

TLS Enforcement

100

Auth Strength

100

Scope Granularity

100

Dep. Hygiene

Secret Handling

100

MIT licensed. Cheerio parses HTML locally and makes no network calls itself. It does not sanitize input, so rendering parsed HTML back to a browser without sanitization carries an XSS risk.

⚡ Reliability

Uptime/SLA

100

Version Stability

Breaking Changes

Error Recovery

Best When

You have static HTML (from HTTP responses or file reads) and need to extract data using familiar jQuery-like CSS selectors in Node.js.

Avoid When

The page requires JavaScript execution to render content — use Playwright, Puppeteer, or a headless browser for dynamic pages.

Use Cases

• Scrape structured data from HTML pages in agent web data collection pipelines using CSS selectors
• Parse and extract specific content from HTML documents returned by agent HTTP requests
• Transform HTML documents server-side — add/remove/modify elements before rendering or forwarding
• Extract links, tables, or structured content from web pages for agent knowledge base population
• Parse HTML email bodies to extract specific information in agent email processing workflows

Not For

• JavaScript-rendered pages requiring actual browser execution — use Playwright or Puppeteer for dynamic content
• CSS styling or layout calculations — Cheerio has no rendering engine; purely structural DOM traversal
• High-performance streaming HTML processing — use htmlparser2 directly for streaming; Cheerio loads full document into memory

Interface

REST API

GraphQL

gRPC

MCP Server

SDK

Yes

Webhooks

Authentication

Methods: none

OAuth: No Scopes: No

Local library — no authentication required. MIT licensed.

Pricing

Model: open_source

Free tier: Yes

Requires CC: No

MIT licensed. Zero cost.

Agent Metadata

Pagination

none

Idempotent

Full

Retry Guidance

Not documented

Known Gotchas

⚠ Cheerio only parses STATIC HTML — JavaScript-rendered content is invisible; use Playwright/Puppeteer for SPAs or React/Vue apps
⚠ A selector with no match returns an empty Cheerio object (not null), so always check .length: if ($('.item').length > 0) { /* exists */ }
⚠ load() returns a query function bound to the parsed document: const $ = cheerio.load(html). Create $ before using selectors; the module export itself is not a $ function
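⚠ v1 switched the default HTML parser from htmlparser2 to parse5, which is more spec-compliant but treats malformed HTML differently (for example, it adds missing html, head, and body elements). Passing {xmlMode: false} at the top level does not restore the old behavior, since that is already the default; to parse HTML with htmlparser2, use the xml option, e.g. load(html, {xml: {xmlMode: false}})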
⚠ Text extraction: $('.selector').text() concatenates the text of every matched element and its descendants; .html() returns the inner HTML of the first matched element only. Choose based on whether you need the nested tags
⚠ Modifications do not affect the original HTML string. Cheerio works on an in-memory DOM, so call $.html() or $.xml() to serialize it back to a string

Alternatives

playwright-api puppeteer-api jsdom-api


Scores are editorial opinions as of 2026-03-06.