statsmodels

Python library for statistical modeling and econometrics that provides OLS regression, GLMs, logistic regression, time series models (ARIMA, VAR, SARIMAX), and a comprehensive suite of hypothesis tests with R-style model summaries.

Evaluated Mar 06, 2026 (version: current)
Category: Other
Tags: python, statistics, econometrics, regression, time-series, hypothesis-testing, arima
⚙ Agent Friendliness: 64 / 100 (Can an agent use this?)
🔒 Security: 88 / 100 (Is it safe for agents?)
⚡ Reliability: 78 / 100 (Does it work consistently?)

Score Breakdown

⚙ Agent Friendliness

MCP Quality: --
Documentation: 82
Error Messages: 74
Auth Simplicity: 100
Rate Limits: 100

🔒 Security

TLS Enforcement: 90
Auth Strength: 90
Scope Granularity: 85
Dep. Hygiene: 84
Secret Handling: 90

No network layer; security surface limited to local file I/O for model persistence

⚡ Reliability

Uptime/SLA: 82
Version Stability: 80
Breaking Changes: 75
Error Recovery: 74

Best When

You need rigorous statistical inference with p-values, confidence intervals, and diagnostic tests rather than pure predictive accuracy, especially for econometric or academic research.

Avoid When

Your goal is maximizing predictive accuracy on held-out data rather than statistical inference — use scikit-learn or XGBoost instead.

Use Cases

  • Fitting OLS regression with full statistical output (coefficients, p-values, confidence intervals, R-squared) in a .summary() table
  • Building ARIMA and SARIMAX time series models for forecasting with an explicitly specified order (automatic order selection requires an external wrapper such as auto_arima from pmdarima)
  • Running hypothesis tests (t-tests, chi-square, Granger causality, cointegration, heteroskedasticity) on data
  • Estimating generalized linear models (Poisson, negative binomial, logit, probit) with link functions for count or binary outcomes
  • Analyzing panel data for econometric research, using mixed-effects models (MixedLM) for random effects and dummy variables in OLS for fixed effects
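
A minimal sketch of the first use case, assuming synthetic data and the column names `x` and `y` (both chosen here for illustration). It fits OLS through the formula API, prints the R-style summary, and reads the same numbers back as pandas objects.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (illustrative only): y depends linearly on x plus noise.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=200)})
df["y"] = 1.5 + 2.0 * df["x"] + rng.normal(scale=0.5, size=200)

# The formula API adds the intercept term automatically.
result = smf.ols("y ~ x", data=df).fit()

# Human-readable table for display.
print(result.summary())

# Machine-readable values, indexed by term name ("Intercept", "x").
coefs = result.params              # pandas Series of coefficients
pvals = result.pvalues             # pandas Series of two-sided p-values
ci = result.conf_int(alpha=0.05)   # DataFrame: column 0 = lower, 1 = upper
r2 = result.rsquared               # float
```

The attributes on the results object are usually the safer route for downstream code than parsing the printed table.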

Not For

  • Prediction-focused machine learning pipelines — scikit-learn's fit/predict API is better suited for that workflow
  • Deep learning or neural network models
  • Real-time streaming statistical analysis at high throughput

Interface

REST API: No
GraphQL: No
gRPC: No
MCP Server: No
SDK: Yes
Webhooks: No

Authentication

Methods: none
OAuth: No
Scopes: No

Local Python library — no authentication required

Pricing

Model: open_source
Free tier: Yes
Requires CC: No

BSD 3-Clause license; completely free and open source

Agent Metadata

Pagination: none
Idempotent: Full
Retry Guidance: Not documented

Known Gotchas

  • The statsmodels API deliberately differs from scikit-learn's: data is passed to the model constructor, and fit() returns a separate Results object rather than the fitted estimator itself, so models are not directly compatible with an sklearn Pipeline
  • The formula API (smf.ols('y ~ x', data=df)) and the array API (sm.OLS(y, X)) behave differently: the formula API adds an intercept automatically, while the array API does not. Forgetting sm.add_constant() in the array API silently fits a model with no intercept (a regression through the origin), with no warning or error
  • Failed convergence in MLE optimization (logit, ARIMA) emits a warning rather than raising an exception; agents must check result.mle_retvals (for example its 'converged' entry) or capture warnings to detect it
  • ARIMA order selection is not automatic by default — agents must specify (p, d, q) order explicitly or use auto_arima from pmdarima as a wrapper
  • The .summary() output is a human-readable text/HTML object designed for display, not a machine-readable dict — use result.params, result.pvalues, result.conf_int() to extract values programmatically

Alternatives

Scores are editorial opinions as of 2026-03-06.
