chatgpt-retrieval-plugin


A self-hosted Retrieval-Augmented Generation (RAG) backend that exposes a FastAPI service for semantic search over user-provided documents. It chunks documents, creates embeddings via OpenAI, stores and queries them in a configurable vector database, and serves query, upsert, and delete endpoints. It is intended as a ChatGPT Retrieval Plugin backend and for custom GPT actions or function calling.
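The query endpoint mentioned above can be exercised with a small client. The following is a minimal sketch, not the plugin's documented API: the base URL, the `/query` path, and the `queries` / `top_k` request shape are assumptions that should be checked against the OpenAPI schema of your own deployment. `build_query_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumed value: adjust to wherever your instance is actually hosted.
BASE_URL = "http://localhost:8000"


def build_query_request(question: str, token: str, top_k: int = 3) -> urllib.request.Request:
    """Build (but do not send) a POST request for the assumed /query endpoint."""
    # Assumed body shape: a list of queries, each with text and a result limit.
    payload = {"queries": [{"query": question, "top_k": top_k}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_query_request("What is our refund policy?", token="example-token")
    print(req.get_method(), req.full_url)
    # To send it against a running server:
    # with urllib.request.urlopen(req, timeout=10) as resp:
    #     print(json.load(resp))
```

Building the request separately from sending it keeps the payload easy to inspect and test without a live server.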

Evaluated Mar 29, 2026 (114d ago)

Tags: ai-ml, retrieval, rag, vector-database, fastapi, semantic-search, self-hosted, python

⚙ Agent Friendliness

/ 100

Can an agent use this?

🔒 Security

/ 100

Is it safe for agents?

⚡ Reliability

/ 100

Does it work consistently?

Score Breakdown

⚙ Agent Friendliness

MCP Quality

Documentation

Error Messages

Auth Simplicity

Rate Limits

🔒 Security

TLS Enforcement

Auth Strength

Scope Granularity

Dep. Hygiene

Secret Handling

Security guidance is present at a conceptual level (authentication methods, bearer token setup). The provided content does not show fine-grained scopes or authorization, token rotation policies, or how rate limiting and error responses are handled from a security standpoint. Deployments should terminate HTTPS in front of the service, keep the server off the public internet unless access is controlled, and securely manage the environment variables that hold API keys and vector DB credentials.
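One way to act on the environment-variable advice is to fail fast at startup when a secret is missing, without ever printing its value. This is a sketch under assumptions: `BEARER_TOKEN` is named in this listing, while `OPENAI_API_KEY` and `DATASTORE` are illustrative names that may differ in your setup. Both function names are hypothetical.

```python
import os

# BEARER_TOKEN comes from this listing; the other two names are assumptions.
REQUIRED_VARS = ("BEARER_TOKEN", "OPENAI_API_KEY", "DATASTORE")


def missing_settings(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


def check_settings(env=None):
    """Raise early, naming only the missing variables and never their values."""
    missing = missing_settings(env)
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
```

Calling `check_settings()` before the server starts turns a misconfiguration into one clear error instead of a failure on the first request.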

⚡ Reliability

Uptime/SLA

Version Stability

Breaking Changes

Error Recovery

Best When

You want a self-hosted, configurable RAG retrieval layer with pluggable vector database backends and metadata filtering, and you can manage hosting/operations and API authentication for your environment.

Avoid When

You cannot guarantee secure deployment of the FastAPI server (auth, network controls) or you need strong multi-tenant/least-privilege authorization semantics beyond the provided token-based approaches.

Use Cases

• Semantic search over personal or organizational documents using natural-language queries
• RAG pipelines where you want control over chunking, embedding dimensions/models, and vector DB provider
• Building ChatGPT custom GPTs that can retrieve relevant snippets from documents
• Automating document ingestion/updating via webhooks into upsert/delete endpoints
• Enterprise internal knowledge retrieval (self-hosted) with metadata filtering (e.g., author/date/source)
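The ingestion and metadata-filtering use cases above can be sketched as two payload builders. Everything here is an assumption for illustration: the webhook event fields (`id`, `body`, `author`, `created_at`, `source`), the `documents` and `queries` body shapes, and the function names are hypothetical and should be checked against your deployment's schema.

```python
import json


def document_from_event(event: dict) -> dict:
    """Map one hypothetical webhook event to a document with filterable metadata."""
    return {
        "id": event["id"],
        "text": event["body"],
        "metadata": {
            "author": event.get("author", ""),
            "created_at": event.get("created_at", ""),
            "source": event.get("source", "file"),
        },
    }


def upsert_body(events: list) -> str:
    """JSON body for the assumed /upsert endpoint."""
    return json.dumps({"documents": [document_from_event(e) for e in events]})


def filtered_query_body(question: str, author: str, top_k: int = 5) -> str:
    """JSON body for the assumed /query endpoint, restricted to one author."""
    query = {"query": question, "filter": {"author": author}, "top_k": top_k}
    return json.dumps({"queries": [query]})
```

Attaching metadata at ingestion time is what makes the later filter possible, so the two builders should agree on field names.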

Not For

• Serving as a general-purpose document storage service without a vector-search focus
• Use cases requiring multi-tenant isolation with fine-grained per-user authorization unless additional controls are implemented
• High-availability systems without operational readiness (monitoring, backups, vector DB scaling)

Interface

REST API: Yes

GraphQL

gRPC

MCP Server

SDK

Webhooks: Yes

Authentication

Methods:

• None
• API key (Basic)
• API key (Bearer)
• OAuth (per README)
• Bearer token via BEARER_TOKEN environment variable (for local/server setup)

OAuth: Yes

Scopes: No

Auth methods are described only at a high level (none, Basic or Bearer API key, OAuth). No scopes or finer-grained permissions are evident in the provided excerpt; the quickstart uses a single shared BEARER_TOKEN for all access to the API.
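To make the shared-token model concrete, the sketch below shows what such a check amounts to: one secret, compared against the `Authorization` header, with no notion of user or scope. It is an illustration only, not the plugin's own code, and `is_authorized` is a hypothetical name.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Check an 'Authorization: Bearer <token>' header against one shared token."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(
        presented.strip().encode("utf-8"),
        expected_token.encode("utf-8"),
    )
```

Because every caller presents the same token, anyone who holds it can query, upsert, and delete; per-user separation has to be added in front of or around the service.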

Pricing

Free tier: No

Requires CC: No

This is an open-source, self-hosted service, and the provided content specifies no subscription pricing. Costs depend on hosting, upstream APIs (notably embeddings/completions via OpenAI or Azure OpenAI), and your chosen vector database.

Agent Metadata

Pagination: none

Idempotent: False

Retry Guidance: Not documented

Known Gotchas

⚠ Shared-token auth can complicate agent workflows in multi-tenant contexts if you need per-user separation.
⚠ The service relies on external vector database configuration and correct environment variables; misconfiguration may cause failures that agents can’t automatically remediate.
⚠ No explicit guidance is provided (in the excerpt) about pagination, rate-limit response headers, or safe retry/idempotency semantics for upsert/query operations.
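Since retry and idempotency behavior is undocumented, a client may want to supply its own safeguards. The sketch below pairs a deterministic document id with exponential backoff. It assumes, without confirmation from this listing, that re-upserting a document under the same id replaces the earlier copy; treat it as a mitigation rather than a guarantee. Both helper names are hypothetical.

```python
import hashlib
import random
import time


def stable_document_id(source_url: str) -> str:
    """Derive the same id for the same source so a retried upsert can overwrite."""
    return hashlib.sha256(source_url.encode("utf-8")).hexdigest()[:16]


def call_with_retries(operation, attempts=4, base_delay=0.5, sleep=time.sleep,
                      retry_on=(ConnectionError, TimeoutError)):
    """Run operation(); on a transient error wait with backoff and jitter, then retry."""
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles each attempt; jitter spreads out simultaneous retries.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

Only connection and timeout errors are retried here; which HTTP status codes are safe to retry depends on the deployment and is left to the caller.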

Alternatives

• OpenAI/ChatGPT native file retrieval features (if you don’t need granular retrieval control)
• Other self-hosted RAG stacks such as LlamaIndex, or your own FastAPI + vector DB integration
• Vector DB + embedding pipeline with a custom API (e.g., using pgvector, Qdrant, Weaviate, Pinecone)


Scores are editorial opinions as of 2026-03-29.