statsmodels

Python library for statistical modeling and econometrics that provides OLS regression, GLMs, logistic regression, time series models (ARIMA, VAR, SARIMAX), and a comprehensive suite of hypothesis tests with R-style model summaries.

Evaluated Mar 06, 2026

Other python statistics econometrics regression time-series hypothesis-testing arima

⚙ Agent Friendliness

/ 100

Can an agent use this?

🔒 Security

/ 100

Is it safe for agents?

⚡ Reliability

/ 100

Does it work consistently?

Score Breakdown

⚙ Agent Friendliness

MCP Quality

Documentation

Error Messages

Auth Simplicity

100

Rate Limits

100

🔒 Security

TLS Enforcement

Auth Strength

Scope Granularity

Dep. Hygiene

Secret Handling

No network layer; security surface limited to local file I/O for model persistence

⚡ Reliability

Uptime/SLA

Version Stability

Breaking Changes

Error Recovery

Best When

You need rigorous statistical inference with p-values, confidence intervals, and diagnostic tests rather than pure predictive accuracy, especially for econometric or academic research.

Avoid When

Your goal is maximizing predictive accuracy on held-out data rather than statistical inference — use scikit-learn or XGBoost instead.

Use Cases

• Fitting OLS regression with full statistical output (coefficients, p-values, confidence intervals, R-squared) in a .summary() table
• Building ARIMA and SARIMAX time series models for forecasting, with the (p, d, q) order specified explicitly (order selection is not automatic)
• Running hypothesis tests (t-tests, chi-square, Granger causality, cointegration, heteroskedasticity) on data
• Estimating generalized linear models (Poisson, negative binomial, logit, probit) with link functions for count or binary outcomes
• Analyzing panel or grouped data for econometric research, using mixed-effects models (MixedLM) for random effects and entity dummy variables in OLS for fixed effects

Not For

• Prediction-focused machine learning pipelines — scikit-learn's fit/predict API is better suited for that workflow
• Deep learning or neural network models
• Real-time streaming statistical analysis at high throughput

Interface

REST API

GraphQL

gRPC

MCP Server

SDK

Yes

Webhooks

Authentication

Methods: none

OAuth: No Scopes: No

Local Python library — no authentication required

Pricing

Model: open_source

Free tier: Yes

Requires CC: No

BSD 3-Clause license; completely free and open source

Agent Metadata

Pagination

none

Idempotent

Full

Retry Guidance

Not documented

Known Gotchas

⚠ statsmodels API is deliberately different from scikit-learn — it uses fit() returning a Results object, not a fitted estimator, so sklearn Pipeline is not directly compatible
⚠ Formula API (smf.ols('y ~ x', data=df)) and array API (sm.OLS(y, X)) behave differently: the formula API adds an intercept automatically while the array API does not. Forgetting sm.add_constant() in the array API fits a regression through the origin without raising an error or warning
⚠ Convergence warnings from MLE optimization (logit, ARIMA) do not raise exceptions — agents must check result.mle_retvals or inspect warnings to detect failed convergence
⚠ ARIMA order selection is not automatic by default — agents must specify (p, d, q) order explicitly or use auto_arima from pmdarima as a wrapper
⚠ The .summary() output is a human-readable text/HTML object designed for display, not a machine-readable dict — use result.params, result.pvalues, result.conf_int() to extract values programmatically

Alternatives

scikit-learn scipy-stats pingouin pymc r-via-rpy2



Scores are editorial opinions as of 2026-03-06.