Seldon Core

Enterprise-grade ML model serving platform for Kubernetes with advanced features including multi-armed bandits, outlier detection, concept drift detection, and model graphs (chaining models in pipelines). Seldon Core v2 is a complete rewrite with a Pipeline abstraction for composing models, transformers, and routers into inference graphs. Used in regulated industries for production ML deployment with explainability, monitoring, and governance. Backed by Seldon Technologies (commercial company).

Evaluated Mar 06, 2026 · v2.x
Category: AI & Machine Learning · Tags: kubernetes, model-serving, inference, mlops, open-source, enterprise, a-b-testing
⚙ Agent Friendliness
54
/ 100
Can an agent use this?
🔒 Security
82
/ 100
Is it safe for agents?
⚡ Reliability
70
/ 100
Does it work consistently?

Score Breakdown

⚙ Agent Friendliness

MCP Quality
--
Documentation
72
Error Messages
68
Auth Simplicity
78
Rate Limits
82

🔒 Security

TLS Enforcement
90
Auth Strength
80
Scope Granularity
78
Dep. Hygiene
80
Secret Handling
82

Apache 2.0, open source. Kubernetes RBAC + Istio service mesh. SOC2 for commercial Seldon Deploy. Established company (Seldon Technologies) with security track record. Regulated industry deployments supported.

⚡ Reliability

Uptime/SLA
75
Version Stability
68
Breaking Changes
60
Error Recovery
78

Best When

You need enterprise ML serving with model pipelines, A/B testing, drift detection, and explainability on Kubernetes — especially in regulated industries requiring ML governance.

Avoid When

You need simple model serving or don't have Kubernetes infrastructure — BentoML or Ray Serve are more accessible.

Use Cases

  • Deploy model inference pipelines where predictions route through preprocessors, multiple models, and postprocessors as a graph of components
  • Implement multi-armed bandit routing to continuously test model variants and automatically route more traffic to better-performing models
  • Deploy production ML with built-in outlier detection and drift detection to flag when inputs deviate from training distribution
  • Serve models with explainability endpoints (SHAP, LIME, Anchor) alongside predictions for regulated industry compliance
  • Build agent inference pipelines that combine multiple models (e.g., retriever + ranker + LLM) as a Seldon Pipeline definition
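Pipelines and models like those above are invoked over the V2 (Open) Inference Protocol mentioned in the gotchas. Below is a minimal sketch of building a protocol-compliant request body in Python using only the standard library; the host (`seldon-mesh.example.com`) and pipeline name (`my-pipeline`) are illustrative assumptions, and the exact routing path through the Seldon mesh gateway depends on your deployment:

```python
import json
import urllib.request

def build_v2_request(input_name, data, datatype="FP32"):
    """Build a V2 Inference Protocol request body for a flat list of values."""
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [1, len(data)],  # batch of one row
                "datatype": datatype,
                "data": data,
            }
        ]
    }

def infer(host, model, payload):
    """POST the payload to a V2 inference endpoint and return the parsed JSON response."""
    req = urllib.request.Request(
        f"http://{host}/v2/models/{model}/infer",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example payload for a hypothetical pipeline named "my-pipeline":
payload = build_v2_request("features", [0.1, 0.2, 0.3])
# infer("seldon-mesh.example.com", "my-pipeline", payload)
```

The same request shape works for single models and pipelines, since both are exposed through the V2 protocol; only the name in the URL changes.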

Not For

  • Simple single-model serving — TorchServe or BentoML are simpler for basic inference without pipeline composition
  • Teams without Kubernetes experience — Seldon requires significant Kubernetes and operator knowledge to operate
  • LLM-first workloads without traditional ML — vLLM or TGI are more optimized for pure LLM serving

Interface

REST API
Yes
GraphQL
No
gRPC
Yes
MCP Server
No
SDK
Yes
Webhooks
No

Authentication

Methods: bearer_token, api_key
OAuth: Yes · Scopes: Yes

Seldon Core v2 uses Kubernetes RBAC and Istio for auth. Seldon Deploy (commercial product) adds enterprise auth. Service mesh (Istio) provides JWT-based auth for inference endpoints. Open source version relies on Kubernetes native auth.
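When inference endpoints sit behind Istio's JWT-based auth, requests must carry an `Authorization` bearer header. A hedged sketch of attaching one with the standard library; the hostname, model name, and token value are placeholders, and how tokens are issued is cluster-specific (not part of Seldon itself):

```python
import json
import urllib.request

def authed_infer_request(host, model, payload, token):
    """Build a V2 inference request carrying a bearer token for an
    Istio-protected endpoint. Token issuance is cluster-specific."""
    return urllib.request.Request(
        f"https://{host}/v2/models/{model}/infer",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

# Placeholder host, model, and token for illustration:
req = authed_infer_request("seldon.example.com", "my-model",
                           {"inputs": []}, "eyJ...token")
```

The open source path is thus ordinary HTTP with a bearer header; anything richer (OIDC flows, API keys) comes from the service mesh or Seldon Deploy, not from Seldon Core.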

Pricing

Model: open_source
Free tier: Yes
Requires CC: No

Seldon Core is Apache 2.0. Seldon Technologies offers Seldon Deploy as a commercial product on top. Core OSS is free to self-host.

Agent Metadata

Pagination
none
Idempotent
Full
Retry Guidance
Not documented

Known Gotchas

  • Seldon Core v2 is a full rewrite from v1 — v1 documentation and code examples don't apply to v2; verify which version is deployed
  • Pipeline component startup order matters — downstream components wait for upstream model servers to be ready before accepting traffic
  • Seldon uses V2 Inference Protocol (KServe-compatible) for REST and gRPC — agents must use correct request format per protocol spec
  • Multi-model pipelines add network latency for each hop — pipeline graphs with many components can have significantly higher total latency
  • Custom server implementations must implement V2 Inference Protocol — simple Flask wrappers won't work without protocol compliance
  • Drift detection and outlier detection run as separate server processes — they consume GPU/CPU in addition to the main model server
  • Seldon operator logs and CRD events are the primary debugging interface — agents need kubectl access or Seldon Deploy dashboard
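Given the startup-ordering gotcha above and the absence of documented retry guidance, a common client-side pattern is to poll the V2 readiness endpoint before sending traffic. A sketch with an injectable probe function so the loop itself is testable; the readiness path follows the V2 Inference Protocol, while the host name is an assumption:

```python
import time
import urllib.request
import urllib.error

def http_ready_probe(host, model):
    """Return True if GET /v2/models/<model>/ready answers 200 (V2 protocol)."""
    try:
        with urllib.request.urlopen(f"http://{host}/v2/models/{model}/ready") as r:
            return r.status == 200
    except urllib.error.URLError:
        return False

def wait_until_ready(probe, attempts=10, delay=1.0):
    """Poll `probe` with a fixed delay between attempts; True once it reports ready."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False

# ready = wait_until_ready(
#     lambda: http_ready_probe("seldon-mesh.example.com", "my-model"))
```

For multi-component pipelines, probing the pipeline name rather than each component avoids racing the internal startup ordering that Seldon manages itself.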

Scores are editorial opinions as of 2026-03-06.
