Lunary

Open-source LLM observability and analytics platform with full tracing, cost tracking, user analytics, and evaluation capabilities. Lunary captures every LLM call with inputs, outputs, tokens, costs, and latency — and provides a UI for analyzing agent behavior, debugging failures, and running evals. MIT licensed with self-host option. Built for production LLM apps: supports multi-step agent traces, user tracking, and A/B testing of prompts.

Evaluated Mar 06, 2026 (v1)

Category: AI & Machine Learning

Tags: llm, observability, tracing, open-source, agent, analytics, cost-tracking, evaluation

Score Breakdown

⚙ Agent Friendliness

MCP Quality

Documentation

Error Messages

Auth Simplicity

Rate Limits

🔒 Security

TLS Enforcement

100

Auth Strength

Scope Granularity

Dep. Hygiene

Secret Handling

MIT open source with a self-host option for data sovereignty. HTTPS is enforced for the managed cloud. Keep API keys server-side: a key exposed in client-side code lets anyone who extracts it inject forged traces into the project. Self-hosting is recommended when LLM inputs are sensitive (PII, proprietary content).

⚡ Reliability

Uptime/SLA

Version Stability

Breaking Changes

Error Recovery

Best When

Building production LLM agent applications where you need full trace visibility, cost accounting per user/feature, and continuous evaluation of LLM output quality.

Avoid When

Simple LLM prototyping where tracing overhead isn't justified — add Lunary when moving to production, not during development.

Use Cases

• Trace multi-step agent execution with parent-child span relationships — see exactly which LLM calls happen in each agent run and how long each takes
• Track LLM cost per user, per feature, and per agent run to identify expensive patterns and optimize prompts for cost
• Run automated evaluations on production LLM traces — score outputs for quality, relevance, and safety using LLM-as-judge
• Identify agent failure patterns by querying logged traces for error patterns, token limit hits, and low-quality outputs
• A/B test prompt variations in production by routing a percentage of traffic to different prompt templates and comparing outcomes
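The percentage-based routing in the last use case can be sketched without any Lunary-specific calls. This is a minimal illustration, assuming a hypothetical `pick_variant` helper and made-up variant names and traffic shares; it only shows how a stable hash keeps each user on the same prompt template so outcomes can be compared later.

```python
import hashlib

# Hypothetical variant names and traffic shares (not part of Lunary's API).
VARIANTS = {"control": 0.9, "candidate": 0.1}


def pick_variant(user_id, experiment, variants=VARIANTS):
    """Deterministically map a user to a prompt variant.

    Hashing (experiment, user_id) gives a stable point in [0, 1], so the
    same user always sees the same variant within one experiment.
    """
    names = list(variants)
    if not names:
        raise ValueError("at least one variant is required")
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64
    cumulative = 0.0
    for name in names:
        cumulative += variants[name]
        if point < cumulative:
            return name
    # Guard against floating-point rounding in the shares.
    return names[-1]
```

Tagging each logged trace with the chosen variant name is what makes the later comparison possible; how tags are attached depends on the SDK and is left out here.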

Not For

• Non-LLM application monitoring — Lunary is purpose-built for LLM observability; use Datadog or New Relic for general application monitoring
• Teams needing enterprise compliance features without self-hosting — managed tier is early stage; self-host for full data control
• ML model (non-LLM) monitoring — whylogs or Evidently are better for traditional ML drift detection

Interface

REST API

Yes

GraphQL

gRPC

MCP Server

SDK

Yes

Webhooks

Authentication

Methods: api_key

OAuth: No

Scopes: No

Trace ingestion authenticates with a project API key, passed through the LUNARY_PUBLIC_KEY environment variable. Use a separate project key for each environment (dev/prod). Traces are viewed in the dashboard at lunary.ai.
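A minimal sketch of the key handling described above, assuming hypothetical helper names (`resolve_lunary_key`, `redact`) that are not part of the Lunary SDK. It reads LUNARY_PUBLIC_KEY from the environment, fails early when the key is missing, and shortens the key before it reaches any log output.

```python
import os


def resolve_lunary_key(env=None):
    """Return the project key from LUNARY_PUBLIC_KEY, or raise early.

    `env` defaults to os.environ; passing a dict makes the helper testable.
    """
    env = os.environ if env is None else env
    key = env.get("LUNARY_PUBLIC_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "LUNARY_PUBLIC_KEY is not set; configure one key per environment"
        )
    return key


def redact(key):
    """Shorten a key for log output so the full value never reaches logs."""
    return key[:4] + "..." if len(key) > 8 else "***"
```

Each deployment (dev, prod) would set its own LUNARY_PUBLIC_KEY value, so the code path stays identical across environments.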

Pricing

Model: freemium

Free tier: Yes

Requires CC: No

MIT licensed: self-hosting is free and unlimited. The managed cloud free tier suits small projects. Production scale typically requires a paid plan or a self-hosted deployment.

Agent Metadata

Pagination: cursor

Idempotent: Partial

Retry Guidance: Not documented
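Since retry guidance is not documented, a client has to choose its own policy. The sketch below is one common choice, capped exponential backoff with full jitter. `TransientError`, `call_with_backoff`, and all default values are assumptions for illustration, not Lunary behavior. Because idempotency is only partial, retrying a write may duplicate an event, so this pattern is safest around reads.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 429, 5xx)."""


def call_with_backoff(fn, *, attempts=4, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying on TransientError with capped exponential backoff.

    The wait before retry k is rng() * min(max_delay, base_delay * 2**k).
    The final failure is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * rng())
```

The `sleep` and `rng` parameters exist only so the policy can be exercised in tests without real waiting.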

Known Gotchas

⚠ Lunary SDK uses monkey-patching to intercept LLM calls — this can conflict with other SDK wrappers (LangChain callbacks, OpenTelemetry) if both are active
⚠ Trace parent-child relationships must be established explicitly with run_manager context — without proper context, all calls appear as independent top-level traces
⚠ Event batching is async: trace data appears in the dashboard after a 5-30 second delay, so agents that check for their own traces immediately after execution may not see them
⚠ Token count tracking requires model-specific tokenizer configuration — incorrect tokenizer causes wrong cost estimates
⚠ Self-hosted deployment requires Docker and PostgreSQL — not a simple single-binary install
⚠ The free tier's 30-day retention means historical analysis of agent behavior requires a paid plan or self-hosting with custom retention
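The batching delay noted above suggests polling rather than a single immediate lookup. A minimal sketch, assuming a caller-supplied `lookup(trace_id)` function that returns `None` until the trace is visible. The helper name and the 45-second default timeout are assumptions, chosen only to exceed the 30-second upper bound quoted in the list.

```python
import time


def wait_for_trace(lookup, trace_id, *, timeout=45.0, interval=5.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll lookup(trace_id) until it returns a value or the timeout passes.

    Returns the trace, or None if it did not appear in time. `lookup` is a
    hypothetical callable supplied by the caller (for example, a wrapper
    around whatever read API is in use).
    """
    deadline = clock() + timeout
    while True:
        trace = lookup(trace_id)
        if trace is not None:
            return trace
        # Stop if another full interval would overshoot the deadline.
        if clock() + interval > deadline:
            return None
        sleep(interval)
```

An agent that treats a `None` result as "not yet visible" instead of "not recorded" avoids re-sending a trace that is still sitting in the batch queue.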

Alternatives

• langfuse-api
• traceloop-api
• honeyhive-api
• arize-api
• agentops-api

Scores are editorial opinions as of 2026-03-06.