Seldon Core
Enterprise-grade ML model serving platform for Kubernetes with advanced features including multi-armed bandits, outlier detection, concept drift detection, and model graphs (chaining models in pipelines). Seldon Core v2 is a complete rewrite with a Pipeline abstraction for composing models, transformers, and routers into inference graphs. Used in regulated industries for production ML deployment with explainability, monitoring, and governance. Backed by Seldon Technologies (commercial company).
Score Breakdown
🔒 Security
Apache 2.0, open source. Authentication via Kubernetes RBAC plus the Istio service mesh. SOC 2 compliance is available with the commercial Seldon Deploy product. Maintained by Seldon Technologies, an established company with a security track record and support for regulated-industry deployments.
Best When
You need enterprise ML serving with model pipelines, A/B testing, drift detection, and explainability on Kubernetes — especially in regulated industries requiring ML governance.
Avoid When
You need simple model serving or don't have Kubernetes infrastructure — BentoML or Ray Serve are more accessible.
Use Cases
- Deploy model inference pipelines where predictions route through preprocessors, multiple models, and postprocessors as a graph of components
- Implement multi-armed bandit routing to continuously test model variants and automatically route more traffic to better-performing models
- Deploy production ML with built-in outlier detection and drift detection to flag when inputs deviate from the training distribution
- Serve models with explainability endpoints (SHAP, LIME, Anchor) alongside predictions for regulated-industry compliance
- Build agent inference pipelines that combine multiple models (e.g., retriever + ranker + LLM) as a Seldon Pipeline definition
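Calling a deployed pipeline uses the same V2 Inference Protocol as a single model: named input tensors with a shape, datatype, and flattened data. A minimal client-side sketch, assuming a pipeline named `retriever-ranker` behind the Seldon gateway (the hostname, pipeline name, and input tensor name are illustrative, not from the Seldon docs):

```python
import json
import urllib.request

# V2 Inference Protocol request body: a list of named input tensors,
# each with shape, datatype, and flattened data.
payload = {
    "inputs": [
        {
            "name": "text",              # illustrative tensor name
            "shape": [1],
            "datatype": "BYTES",
            "data": ["what is concept drift?"],
        }
    ]
}

# Hypothetical in-cluster gateway URL; real deployments route through
# the Seldon/Istio ingress configured for the cluster.
req = urllib.request.Request(
    "http://seldon-mesh/v2/models/retriever-ranker/infer",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # uncomment against a live cluster
```

The same request shape works for both a single model and a multi-component pipeline, which is what makes the graph composition transparent to callers.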
Not For
- Simple single-model serving — TorchServe or BentoML are simpler for basic inference without pipeline composition
- Teams without Kubernetes experience — Seldon requires significant Kubernetes and operator knowledge to operate
- LLM-first workloads without traditional ML — vLLM or TGI are more optimized for pure LLM serving
Interface
Authentication
Seldon Core v2 relies on Kubernetes RBAC and the Istio service mesh for authentication: the open source version uses Kubernetes-native auth, and Istio can enforce JWT-based auth on inference endpoints. The commercial Seldon Deploy product adds enterprise authentication on top.
Pricing
Seldon Core is Apache 2.0 licensed and free to self-host. Seldon Technologies offers Seldon Deploy as a commercial product layered on top of the open source core.
Agent Metadata
Known Gotchas
- ⚠ Seldon Core v2 is a full rewrite from v1 — v1 documentation and code examples don't apply to v2; verify which version is deployed
- ⚠ Pipeline component startup order matters — downstream components wait for upstream model servers to be ready before accepting traffic
- ⚠ Seldon uses V2 Inference Protocol (KServe-compatible) for REST and gRPC — agents must use correct request format per protocol spec
- ⚠ Multi-model pipelines add network latency for each hop — pipeline graphs with many components can have significantly higher total latency
- ⚠ Custom server implementations must implement V2 Inference Protocol — simple Flask wrappers won't work without protocol compliance
- ⚠ Drift detection and outlier detection run as separate server processes — they consume GPU/CPU in addition to the main model server
- ⚠ Seldon operator logs and CRD events are the primary debugging interface — agents need kubectl access or Seldon Deploy dashboard
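Because simple Flask wrappers won't work, a custom server has to return responses in the V2 (Open Inference Protocol) shape rather than an ad-hoc JSON schema. A minimal sketch of that contract as a plain function, with a placeholder "model" that just scores input lengths (the model name, tensor names, and scoring logic are all illustrative; real custom servers typically implement this via a framework such as MLServer):

```python
def v2_infer(request: dict) -> dict:
    """Map a V2 Inference Protocol request dict to a V2 response dict.

    Placeholder inference: returns the length of each input string.
    The field names (inputs/outputs, name, shape, datatype, data)
    follow the V2 protocol; the logic is a stand-in.
    """
    # Pull the input tensor by name, per the protocol's named-tensor model.
    texts = next(t for t in request["inputs"] if t["name"] == "text")["data"]
    scores = [float(len(t)) for t in texts]  # stand-in for real inference
    return {
        "model_name": "my-model",  # illustrative
        "outputs": [
            {
                "name": "score",
                "shape": [len(scores)],
                "datatype": "FP32",
                "data": scores,
            }
        ],
    }

resp = v2_infer({
    "inputs": [
        {"name": "text", "shape": [2], "datatype": "BYTES",
         "data": ["hello", "drift?"]}
    ]
})
```

A wrapper that returns any other JSON shape will fail protocol validation even if the HTTP plumbing works, which is the root of the Flask-wrapper gotcha above.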
Full Evaluation Report
Detailed scoring breakdown, competitive positioning, security analysis, and improvement recommendations for Seldon Core.
Scores are editorial opinions as of 2026-03-06.