Criterion.rs
Statistics-driven micro-benchmarking library for Rust. Criterion.rs runs each benchmark many times, applies statistical analysis to detect performance regressions and improvements, generates HTML reports with interactive charts, and integrates with `cargo bench`. A bootstrapped two-sample t-test determines whether a performance change is statistically significant, reducing false positives from benchmark noise. The de facto standard benchmarking tool for Rust.
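A minimal benchmark sketch, following the shape of Criterion's standard setup. The `fibonacci` function is a placeholder workload, and the crate version in the comment is illustrative; the file lives under `benches/` and is registered in `Cargo.toml` with `harness = false` so Criterion's own harness runs instead of libtest's.

```rust
// benches/fib.rs — registered in Cargo.toml roughly as:
//   [dev-dependencies]
//   criterion = "0.5"          // version is illustrative
//   [[bench]]
//   name = "fib"
//   harness = false
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Placeholder workload — substitute the agent code you want to measure.
fn fibonacci(n: u64) -> u64 {
    match n {
        0 | 1 => 1,
        n => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

fn bench_fib(c: &mut Criterion) {
    // black_box keeps the optimizer from const-folding the call away.
    c.bench_function("fib 20", |b| b.iter(|| fibonacci(black_box(20))));
}

criterion_group!(benches, bench_fib);
criterion_main!(benches);
```

Run with `cargo bench`; Criterion samples the closure repeatedly and reports a confidence interval rather than a single timing.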
Score Breakdown
⚙ Agent Friendliness
🔒 Security
Local-only benchmarking tool — no network calls. No security concerns for the library itself.
⚡ Reliability
Best When
You need statistically valid micro-benchmarks for Rust agent code — Criterion's noise reduction and regression detection separate real performance changes from benchmark noise.
Avoid When
You need load testing or end-to-end performance measurement — use k6, wrk, or Locust for service-level performance testing.
Use Cases
- Benchmark agent algorithm performance in Rust — measure serialization, parsing, or computation hotspots with statistically valid results
- Detect performance regressions in Rust agent code with Criterion's baseline comparison — compare current performance against a saved baseline
- Profile different implementation strategies (SIMD vs scalar, cache-friendly vs not) with Criterion's parametric benchmarks
- Generate shareable HTML performance reports for Rust agent library releases to communicate performance characteristics
- Benchmark against multiple input sizes with BenchmarkGroup to understand agent algorithm scaling behavior
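The input-size scaling case above can be sketched with `BenchmarkGroup`. The `process` function and the chosen sizes are placeholder assumptions standing in for real agent code; `Throughput::Bytes` makes the report show bytes/second alongside raw timings.

```rust
use criterion::{
    black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput,
};

// Hypothetical workload standing in for agent serialization/parsing code.
fn process(data: &[u8]) -> u64 {
    data.iter().map(|&b| b as u64).sum()
}

fn bench_scaling(c: &mut Criterion) {
    let mut group = c.benchmark_group("process_scaling");
    // Illustrative sizes — pick the range your agent workload actually sees.
    for size in [1_024usize, 16_384, 262_144] {
        let data = vec![0xAB_u8; size];
        // Report throughput so scaling behavior is visible in the HTML report.
        group.throughput(Throughput::Bytes(size as u64));
        group.bench_with_input(BenchmarkId::from_parameter(size), &data, |b, d| {
            b.iter(|| process(black_box(d)))
        });
    }
    group.finish();
}

criterion_group!(benches, bench_scaling);
criterion_main!(benches);
```

Each size becomes a separate benchmark line in the report, which makes sub-linear or super-linear scaling easy to spot.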
Not For
- End-to-end integration benchmarking — Criterion targets micro-benchmarks of specific functions, not full-system load testing
- Profiling to find hotspots — use perf, flamegraph, or cargo-flamegraph to locate them; Criterion measures already-identified bottlenecks
- Go, Python, or other non-Rust code — Criterion.rs is Rust-only
Interface
Authentication
Local benchmarking library — no external auth or network calls.
Pricing
Apache 2.0 / MIT dual-licensed open source Rust crate.
Agent Metadata
Known Gotchas
- ⚠ Benchmarked code must pass its inputs and outputs through black_box() so the optimizer cannot eliminate the computation under measurement — a forgotten black_box() yields meaningless near-zero timings
- ⚠ cargo bench builds with the optimized bench profile by default, but overriding [profile.bench] (e.g. opt-level = 0) or running the benchmark binary from a debug build produces unoptimized, invalid measurements
- ⚠ Baseline comparison requires saving a baseline first (cargo bench -- --save-baseline main) — without a saved baseline, regression detection has nothing to compare against
- ⚠ Async benchmarks need Criterion's async support (an executor feature such as async_tokio plus Bencher::to_async) — sync bench closures cannot await async agent code directly
- ⚠ CI benchmark variance is high on shared infrastructure — treat CI results as trend indicators, not absolute numbers; run on dedicated hardware for precise measurements
- ⚠ Chart-quality HTML reports prefer gnuplot — recent Criterion versions fall back to the plotters backend when gnuplot is missing, and cargo-criterion offers reports without the gnuplot dependency
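The baseline workflow from the gotcha above can be sketched as a shell sequence. Branch names (`main`, `my-feature`) and the baseline name are placeholders; the `--save-baseline` and `--baseline` flags are Criterion's own CLI options, passed through `cargo bench` after `--`.

```shell
# Save a baseline from the revision you consider "known good"
git checkout main
cargo bench -- --save-baseline main

# Switch to your change and compare against the saved baseline
git checkout my-feature
cargo bench -- --baseline main
```

Criterion then reports each benchmark as improved, regressed, or within noise relative to the saved baseline.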
Full Evaluation Report
Detailed scoring breakdown, competitive positioning, security analysis, and improvement recommendations for Criterion.rs.
Scores are editorial opinions as of 2026-03-06.