Tabby

Self-hosted AI coding assistant with a REST API. Tabby runs locally or on-premises, providing GitHub Copilot-like code completions using open-source LLMs (CodeLlama, StarCoder, DeepSeek Coder). Ships with IDE plugins (VS Code, JetBrains, Vim/Neovim) and a REST API for custom integrations. Enterprise features include RBAC, Git context indexing, and a team management UI. Core value: full data privacy with no code sent to external services.

Evaluated Mar 07, 2026 · v0.18+
Homepage ↗ · Repo ↗
Category: AI & Machine Learning
Tags: code-completion, self-hosted, ai-coding, ide-plugin, open-source, local-llm, privacy, rust
⚙ Agent Friendliness
62
/ 100
Can an agent use this?
🔒 Security
82
/ 100
Is it safe for agents?
⚡ Reliability
74
/ 100
Does it work consistently?

Score Breakdown

⚙ Agent Friendliness

MCP Quality
--
Documentation
80
Error Messages
75
Auth Simplicity
88
Rate Limits
95

🔒 Security

TLS Enforcement
85
Auth Strength
80
Scope Granularity
75
Dep. Hygiene
82
Secret Handling
88

Full data sovereignty — no code leaves the network, and self-hosting eliminates SaaS data-retention risks. TLS configuration is the admin's responsibility. Apache 2.0 licensed, Rust-based server.

⚡ Reliability

Uptime/SLA
75
Version Stability
75
Breaking Changes
72
Error Recovery
75

Best When

You have security requirements preventing code from leaving your network and have GPU infrastructure available for inference. Best for privacy-first teams or enterprises with compliance requirements.

Avoid When

You can use cloud-based tools and prioritize completion quality over privacy — Cursor or GitHub Copilot will outperform self-hosted open models.

Use Cases

  • Provide AI code completions to development teams without sending proprietary code to external APIs — all inference runs on-premises
  • Build custom coding assistant workflows using Tabby's REST API for code completion and repository context retrieval
  • Index team repositories in Tabby's context engine to provide codebase-aware completions that reflect internal patterns and libraries
  • Offer AI coding assistance in air-gapped environments where external API access is prohibited by security policy
  • Run as a shared team coding assistant on a GPU server, providing Copilot-equivalent features without per-seat SaaS costs

Not For

  • Teams without GPU infrastructure — Tabby requires a GPU for good performance; CPU-only inference is extremely slow
  • Users wanting the absolute best code completion quality — GitHub Copilot (GPT-4) and Cursor (Claude) have higher quality than available open-source models
  • Quick setup without infrastructure — requires Docker/GPU server setup vs. simple SaaS signup

Interface

REST API
Yes
GraphQL
Yes
gRPC
No
MCP Server
No
SDK
No
Webhooks
No

Authentication

Methods: bearer_token
OAuth: No
Scopes: No

Bearer token auth via Authorization header. Tokens created in Tabby admin UI. Enterprise version adds LDAP/SSO integration and per-user token management.
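A minimal client sketch of the bearer-token flow. The `/v1/completions` path and `segments` payload shape follow Tabby's published OpenAPI surface, but verify both against your server version's Swagger UI; this builds the request rather than sending it:

```python
import json

TABBY_URL = "http://localhost:8080"  # adjust to your deployment

def build_completion_request(token: str, prefix: str,
                             language: str = "python",
                             suffix: str = "") -> tuple:
    """Build URL, headers, and JSON body for a Tabby completion call.

    The token comes from the Tabby admin UI and is sent as a
    standard Authorization: Bearer header.
    """
    url = f"{TABBY_URL}/v1/completions"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "language": language,
        "segments": {"prefix": prefix, "suffix": suffix},
    }).encode("utf-8")
    return url, headers, body

# Sending is then a single HTTP POST, e.g. with urllib:
#   req = urllib.request.Request(url, data=body, headers=headers)
#   completion = json.load(urllib.request.urlopen(req))
```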

Pricing

Model: open_source
Free tier: Yes
Requires CC: No

Core Tabby is Apache 2.0 open source. Enterprise adds LDAP, RBAC, and admin dashboard. Infrastructure costs (GPU server) are the primary cost driver.

Agent Metadata

Pagination
offset
Idempotent
Partial
Retry Guidance
Not documented
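Since retry behavior is not documented, agent integrations should assume nothing from the server and apply conservative client-side backoff. A generic sketch — retrying on any exception here for brevity; narrowing this to transport errors and 5xx responses is an assumption about failure modes, not documented Tabby behavior:

```python
import random
import time

def backoff_delays(retries: int = 5, base: float = 0.5, cap: float = 30.0):
    """Exponential backoff with full jitter: uniform(0, min(cap, base * 2^n))."""
    for attempt in range(retries):
        yield random.uniform(0, min(cap, base * (2 ** attempt)))

def call_with_retry(fn, retries: int = 5, base: float = 0.5,
                    cap: float = 30.0):
    """Call fn(); on failure, sleep a jittered backoff delay and retry.

    Raises the last exception if all attempts fail.
    """
    last = None
    for delay in backoff_delays(retries, base, cap):
        try:
            return fn()
        except Exception as exc:  # narrow to transport/5xx errors in real code
            last = exc
            time.sleep(delay)
    raise last
```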

Known Gotchas

  • Tabby requires a CUDA-capable GPU for production use — CPU inference is available but 10-50x slower, making interactive completions impractical
  • Model download on first startup can take 10-30 minutes depending on model size — health check endpoint (/health) returns unhealthy until model is loaded
  • Repository context indexing is asynchronous — after configuring a Git repository, wait for the indexer to complete before expecting context-aware completions
  • Tabby's completion API format is compatible with GitHub Copilot's API — IDE plugins using Copilot protocol work without modification
  • Tabby Enterprise LDAP integration requires network access to the LDAP server from the Tabby container — network policy must be configured correctly
  • Model selection affects VRAM requirements significantly — CodeLlama 7B requires ~8GB VRAM, 34B requires ~70GB; choose model based on available hardware
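Given the slow first startup noted above, orchestration scripts should gate traffic on the health endpoint rather than a fixed sleep. A sketch with injectable clock/sleep for testability; `probe` would be, for example, an HTTP GET against `/health` that returns True on a 200 response:

```python
import time

def wait_until_healthy(probe, timeout: float = 1800.0, interval: float = 5.0,
                       clock=time.monotonic, sleep=time.sleep) -> bool:
    """Poll probe() until it returns True or the timeout expires.

    First startup can take 10-30 minutes while models download, so the
    default timeout is generous. Exceptions from probe() are swallowed
    because the server may not be accepting connections yet.
    Returns True once healthy, False on timeout.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        try:
            if probe():
                return True
        except Exception:
            pass  # server not up yet; keep polling
        sleep(interval)
    return False
```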


Scores are editorial opinions as of 2026-03-07.
