Lunary
Open-source LLM observability and analytics platform with full tracing, cost tracking, user analytics, and evaluation capabilities. Lunary captures every LLM call with inputs, outputs, tokens, costs, and latency — and provides a UI for analyzing agent behavior, debugging failures, and running evals. MIT licensed with self-host option. Built for production LLM apps: supports multi-step agent traces, user tracking, and A/B testing of prompts.
Score Breakdown
⚙ Agent Friendliness
🔒 Security
MIT open source with self-host option for data sovereignty. HTTPS enforced for managed cloud. API keys should be kept server-side — client-side exposure risks trace injection. Self-hosting recommended for sensitive LLM inputs (PII, proprietary content).
⚡ Reliability
Best When
Building production LLM agent applications where you need full trace visibility, cost accounting per user/feature, and continuous evaluation of LLM output quality.
Avoid When
Simple LLM prototyping where tracing overhead isn't justified — add Lunary when moving to production, not during development.
Use Cases
- Trace multi-step agent execution with parent-child span relationships — see exactly which LLM calls happen in each agent run and how long each takes
- Track LLM cost per user, per feature, and per agent run to identify expensive patterns and optimize prompts for cost
- Run automated evaluations on production LLM traces — score outputs for quality, relevance, and safety using LLM-as-judge
- Identify agent failure patterns by querying logged traces for error patterns, token limit hits, and low-quality outputs
- A/B test prompt variations in production by routing a percentage of traffic to different prompt templates and comparing outcomes
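The parent-child span structure in the first use case can be sketched as a minimal data model. This is an illustrative sketch only — the class and field names are hypothetical, not Lunary's actual schema or SDK:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Span:
    """One step in an agent run (hypothetical model, not Lunary's schema)."""
    id: str
    name: str
    start_ms: float
    end_ms: float
    parent_id: Optional[str] = None  # None marks a top-level run

    @property
    def latency_ms(self) -> float:
        return self.end_ms - self.start_ms

def children_of(spans: list, parent_id: str) -> list:
    """Resolve which LLM calls belong to a given agent run."""
    return [s for s in spans if s.parent_id == parent_id]

# An agent run containing two nested LLM calls:
run = [
    Span("run1", "agent_run", 0, 900),
    Span("llm1", "plan", 10, 400, parent_id="run1"),
    Span("llm2", "answer", 410, 880, parent_id="run1"),
]
print([(s.name, s.latency_ms) for s in children_of(run, "run1")])
# → [('plan', 390), ('answer', 470)]
```

Walking such a tree is how "which calls happen in each run and how long each takes" is answered from trace data.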
Not For
- Non-LLM application monitoring — Lunary is purpose-built for LLM observability; use Datadog or New Relic for general application monitoring
- Teams needing enterprise compliance features without self-hosting — managed tier is early stage; self-host for full data control
- ML model (non-LLM) monitoring — whylogs or Evidently are better for traditional ML drift detection
Interface
Authentication
API key for ingesting trace data, passed via the LUNARY_PUBLIC_KEY environment variable. Use separate project keys per environment (dev/prod). Traces are viewed in the dashboard at lunary.ai.
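A minimal server-side pattern for handling the key: read it from the environment at startup and fail loudly if it is missing, so it never needs to appear in client-side code. The env-var name comes from the docs above; the helper itself is a hypothetical sketch:

```python
import os

def get_lunary_key() -> str:
    """Read the project key server-side; never ship it to browser code."""
    key = os.environ.get("LUNARY_PUBLIC_KEY")
    if not key:
        raise RuntimeError("LUNARY_PUBLIC_KEY is not set for this environment")
    return key

# Demo with a placeholder dev key (use your real per-environment key in practice):
os.environ.setdefault("LUNARY_PUBLIC_KEY", "pk-dev-placeholder")
print(get_lunary_key())
```

Keeping one key per environment (dev/prod) means a leaked dev key cannot pollute production traces.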
Pricing
MIT licensed — self-hosting is completely free and unlimited. The managed cloud free tier is suitable for small projects; production scale typically requires a paid plan or a self-hosted deployment.
Agent Metadata
Known Gotchas
- ⚠ Lunary SDK uses monkey-patching to intercept LLM calls — this can conflict with other SDK wrappers (LangChain callbacks, OpenTelemetry) if both are active
- ⚠ Trace parent-child relationships must be established explicitly with run_manager context — without proper context, all calls appear as independent top-level traces
- ⚠ Event batching is async — trace data appears in dashboard with 5-30 second delay; agents checking for their own traces immediately after execution may not see them
- ⚠ Token count tracking requires model-specific tokenizer configuration — incorrect tokenizer causes wrong cost estimates
- ⚠ Self-hosted deployment requires Docker and PostgreSQL — not a simple single-binary install
- ⚠ Free tier's 30-day retention means historical analysis of agent behavior requires paid plan or self-hosting with custom retention
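The tokenizer gotcha above comes down to simple arithmetic: cost = tokens × per-token price, so a miscounted token total scales the cost estimate directly. A sketch of the calculation with made-up per-1K-token prices — real prices vary by model and provider and change over time:

```python
# Hypothetical per-1K-token prices in USD; real values differ and change.
PRICES = {
    "gpt-4o": {"input": 0.0025, "output": 0.0100},
    "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Per-call cost estimate; wrong token counts skew this linearly."""
    p = PRICES[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

cost = estimate_cost("gpt-4o", 1200, 300)
print(f"${cost:.4f}")  # 1.2 * 0.0025 + 0.3 * 0.0100 = 0.006
```

If the wrong tokenizer over-counts input tokens by 20%, every cost figure in the dashboard inherits that same 20% error.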
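Because ingestion batching is async (5–30 second delay), an agent that wants to verify its own trace should poll with a timeout rather than fail on the first miss. A generic polling sketch — `fetch_trace` is a stand-in for whatever lookup the API offers, not a real Lunary function:

```python
import time

def wait_for_trace(fetch_trace, run_id: str,
                   timeout_s: float = 60.0, interval_s: float = 5.0):
    """Poll until the trace appears in the backend or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        trace = fetch_trace(run_id)
        if trace is not None:
            return trace
        time.sleep(interval_s)
    raise TimeoutError(f"trace {run_id} not visible after {timeout_s}s")

# Demo with a fake backend that 'ingests' the trace on the third poll:
calls = {"n": 0}
def fake_fetch(run_id):
    calls["n"] += 1
    return {"id": run_id} if calls["n"] >= 3 else None

print(wait_for_trace(fake_fetch, "run1", timeout_s=5, interval_s=0.01))
# → {'id': 'run1'}
```

A 60-second timeout comfortably covers the stated 5–30 second ingestion window.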
Alternatives
Full Evaluation Report
Detailed scoring breakdown, competitive positioning, security analysis, and improvement recommendations for Lunary.
Scores are editorial opinions as of 2026-03-06.