LiteLLM
Universal LLM routing library and proxy server that provides a single OpenAI-compatible interface to 100+ LLM providers with cost tracking, fallbacks, and load balancing.
Score Breakdown
⚙ Agent Friendliness
🔒 Security
API keys for upstream providers are passed as environment variables or YAML config; ensure config files are not committed to version control. The proxy master key provides only coarse access control, and there is no built-in secret rotation.
⚡ Reliability
Best When
You need a single call site that can target any of 100+ LLM providers and want automatic fallback, retries, and cost logging without rewriting agent code per provider.
Avoid When
You rely on provider-specific streaming events, function-call schemas, or response fields that LiteLLM's translation layer does not yet map correctly.
Use Cases
- Route agent LLM calls across multiple providers (OpenAI, Anthropic, Bedrock, Vertex) with a single unified API surface
- Implement automatic fallback chains so agents continue operating when a primary LLM provider is unavailable
- Track per-agent token usage and cost across providers without instrumenting each provider SDK separately
- Deploy a shared LiteLLM proxy server so multiple agents share rate limit budgets and a single set of API keys
- Load-balance inference across multiple deployments of the same model to increase throughput for high-QPS agents
Not For
- Fine-tuning or training models — LiteLLM is inference routing only
- Storing or retrieving conversation history — no built-in memory or persistence layer
- Applications requiring vendor-specific features not exposed through the OpenAI-compatible interface
Interface
Authentication
No auth required when used as a local Python library. Proxy server mode supports a LITELLM_MASTER_KEY or per-virtual-key auth for team use. Upstream provider keys are passed via environment variables or config.
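A minimal proxy config sketch showing keys referenced from the environment rather than inlined (model name is illustrative):

```yaml
# config.yaml — "os.environ/VAR" tells the proxy to read the value
# from the environment at startup, keeping secrets out of the file.
model_list:
  - model_name: gpt-4o-mini
    litellm_params:
      model: gpt-4o-mini
      api_key: os.environ/OPENAI_API_KEY

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
```

The proxy is then started with `litellm --config config.yaml`, and clients authenticate against it using the master key or per-virtual keys.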
Pricing
The core library and proxy are free and open source. An optional enterprise tier is available for large team deployments.
Agent Metadata
Known Gotchas
- ⚠ Provider-specific parameters (e.g. Anthropic top_k, Bedrock guardrail IDs) must be passed via extra_body or provider-prefixed kwargs; they are silently ignored otherwise
- ⚠ Streaming response chunks differ subtly between providers even through the adapter — agents parsing raw chunks may break on provider switch
- ⚠ Router fallback order is defined in config, not inferred; if no fallback list is set, a provider outage raises immediately rather than trying alternatives
- ⚠ Cost tracking requires setting up a cost database (Redis or Postgres); without it, usage() returns None silently
- ⚠ Virtual key rate limits in proxy mode are enforced in-memory per pod — horizontal scaling without shared Redis will allow each pod its own full quota
Alternatives
Full Evaluation Report
Detailed scoring breakdown, competitive positioning, security analysis, and improvement recommendations for LiteLLM.
Scores are editorial opinions as of 2026-03-06.