Seldon Core
Enterprise-grade ML model serving platform for Kubernetes with advanced features including multi-armed bandits, outlier detection, concept drift detection, and model graphs (chaining models in pipelines). Seldon Core v2 is a complete rewrite with a Pipeline abstraction for composing models, transformers, and routers into inference graphs. Used in regulated industries for production ML deployment with explainability, monitoring, and governance. Backed by Seldon Technologies (commercial company).
Score Breakdown
🔒 Security
Apache 2.0, open source. Authentication via Kubernetes RBAC plus the Istio service mesh. SOC 2 compliance is available with the commercial Seldon Deploy product. Maintained by Seldon Technologies, an established company with a security track record and support for regulated-industry deployments.
Best When
You need enterprise ML serving with model pipelines, A/B testing, drift detection, and explainability on Kubernetes — especially in regulated industries requiring ML governance.
Avoid When
You need simple model serving or don't have Kubernetes infrastructure — BentoML or Ray Serve are more accessible.
Use Cases
- Deploy model inference pipelines where predictions route through preprocessors, multiple models, and postprocessors as a graph of components
- Implement multi-armed bandit routing to continuously test model variants and automatically route more traffic to better-performing models
- Deploy production ML with built-in outlier detection and drift detection to flag when inputs deviate from the training distribution
- Serve models with explainability endpoints (SHAP, LIME, Anchor) alongside predictions for regulated-industry compliance
- Build agent inference pipelines that combine multiple models (e.g., retriever + ranker + LLM) as a Seldon Pipeline definition
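Calling a deployed pipeline uses the same V2 Inference Protocol as a single model: named input tensors with a shape, datatype, and flattened data. A minimal client-side sketch, assuming a pipeline named `retriever-ranker` behind the Seldon gateway (the hostname, pipeline name, and input tensor name are illustrative, not from the Seldon docs):

```python
import json
import urllib.request

# V2 Inference Protocol request body: a list of named input tensors,
# each with shape, datatype, and flattened data.
payload = {
    "inputs": [
        {
            "name": "text",              # illustrative tensor name
            "shape": [1],
            "datatype": "BYTES",
            "data": ["what is concept drift?"],
        }
    ]
}

# Hypothetical in-cluster gateway URL; real deployments route through
# the Seldon/Istio ingress configured for the cluster.
req = urllib.request.Request(
    "http://seldon-mesh/v2/models/retriever-ranker/infer",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # uncomment against a live cluster
```

The same request shape works for both a single model and a multi-component pipeline, which is what makes the graph composition transparent to callers.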
Not For
- Simple single-model serving — TorchServe or BentoML are simpler for basic inference without pipeline composition
- Teams without Kubernetes experience — Seldon requires significant Kubernetes and operator knowledge to operate
- LLM-first workloads without traditional ML — vLLM or TGI are more optimized for pure LLM serving
Interface
Authentication
Seldon Core v2 relies on Kubernetes RBAC and the Istio service mesh for authentication: the open source version uses Kubernetes-native auth, and Istio can enforce JWT-based auth on inference endpoints. The commercial Seldon Deploy product adds enterprise authentication on top.
Pricing
Seldon Core is Apache 2.0 licensed and free to self-host. Seldon Technologies offers Seldon Deploy as a commercial product layered on top of the open source core.
Agent Metadata
Known Gotchas
- ⚠ Seldon Core v2 is a full rewrite from v1 — v1 documentation and code examples don't apply to v2; verify which version is deployed
- ⚠ Pipeline component startup order matters — downstream components wait for upstream model servers to be ready before accepting traffic
- ⚠ Seldon uses V2 Inference Protocol (KServe-compatible) for REST and gRPC — agents must use correct request format per protocol spec
- ⚠ Multi-model pipelines add network latency for each hop — pipeline graphs with many components can have significantly higher total latency
- ⚠ Custom server implementations must implement V2 Inference Protocol — simple Flask wrappers won't work without protocol compliance
- ⚠ Drift detection and outlier detection run as separate server processes — they consume GPU/CPU in addition to the main model server
- ⚠ Seldon operator logs and CRD events are the primary debugging interface — agents need kubectl access or Seldon Deploy dashboard
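Because simple Flask wrappers won't work, a custom server has to return responses in the V2 (Open Inference Protocol) shape rather than an ad-hoc JSON schema. A minimal sketch of that contract as a plain function, with a placeholder "model" that just scores input lengths (the model name, tensor names, and scoring logic are all illustrative; real custom servers typically implement this via a framework such as MLServer):

```python
def v2_infer(request: dict) -> dict:
    """Map a V2 Inference Protocol request dict to a V2 response dict.

    Placeholder inference: returns the length of each input string.
    The field names (inputs/outputs, name, shape, datatype, data)
    follow the V2 protocol; the logic is a stand-in.
    """
    # Pull the input tensor by name, per the protocol's named-tensor model.
    texts = next(t for t in request["inputs"] if t["name"] == "text")["data"]
    scores = [float(len(t)) for t in texts]  # stand-in for real inference
    return {
        "model_name": "my-model",  # illustrative
        "outputs": [
            {
                "name": "score",
                "shape": [len(scores)],
                "datatype": "FP32",
                "data": scores,
            }
        ],
    }

resp = v2_infer({
    "inputs": [
        {"name": "text", "shape": [2], "datatype": "BYTES",
         "data": ["hello", "drift?"]}
    ]
})
```

A wrapper that returns any other JSON shape will fail protocol validation even if the HTTP plumbing works, which is the root of the Flask-wrapper gotcha above.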
Full Evaluation Report
Detailed scoring breakdown, competitive positioning, security analysis, and improvement recommendations for Seldon Core.
Scores are editorial opinions as of 2026-03-06.