Contribute to Assay
Help build the quality layer for agentic software. Run evaluations on MCP servers and agent skills with your own tokens, submit results, and help the community find the best tools.
How Community Evaluation Works
Pick a Package
Choose from the evaluation queue below, or evaluate a package you use. The queue prioritizes packages with no existing evaluation or outdated scores.
Run the Evaluation
Use the Assay evaluation skill or CLI tool. It runs using your own LLM tokens, analyzes the package, and produces a structured JSON result.
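As a rough illustration of what such a structured result might look like, here is a hypothetical sketch. Every field name and value below is an illustrative assumption, not Assay's actual schema, which the evaluation guide defines:

```python
import json

# Hypothetical shape of an evaluation result. All field names here are
# illustrative assumptions; the official schema lives in the evaluation guide.
evaluation = {
    "package_id": "example/oci-email-delivery-mcp",  # assumed ID format
    "rubric_version": "2.0",
    "model": "model-id@pinned-version",              # assumed pinning format
    "scores": {                                      # assumed rubric dimensions
        "documentation": 4,
        "reliability": 5,
        "security": 3,
    },
    "overall": 4.0,
    "notes": "All values here are illustrative only.",
}

print(json.dumps(evaluation, indent=2))
```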
Submit Results
Open a pull request to the Assay repo with your evaluation JSON. Assay reviews and merges quality submissions.
Get Your API Key
Sign in with GitHub to get an API key for submitting evaluations. Your GitHub identity is used for contributor attribution and trust tier progression.
We only request the read:user scope; we store your username and avatar, nothing else.
Trust & Quality
Reproducible Evaluations
All evaluations use Assay's standardized eval configs — deterministic prompts, pinned model versions, and structured output schemas. Results should be reproducible by anyone running the same config.
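As a sketch of what "deterministic prompts, pinned model versions, and structured output schemas" could mean in practice, here is a minimal config, where every key name and value is an illustrative assumption rather than Assay's real config format:

```python
# Illustrative sketch of a deterministic eval config. Key names and values
# are assumptions for illustration, not Assay's actual format.
config = {
    "prompt_template": "Evaluate {package_id} against rubric v2.0.",
    "model": "model-id@2025-01-01",  # pinned model version for reproducibility
    "temperature": 0.0,              # deterministic sampling
    "output_schema": {               # structured output: required score fields
        "type": "object",
        "required": ["scores", "overall"],
    },
}

# Reproducibility comes from pinning every input above: two contributors
# running the same config should get comparable results.
```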
Cross-Validation
When multiple independent contributors evaluate the same package, agreement between submissions increases confidence. Cross-validated scores carry more weight.
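One plausible way such agreement weighting could work is sketched below. This is an illustrative heuristic, not Assay's published algorithm: tight agreement between independent submissions maps to high confidence, wide spread to low.

```python
from statistics import mean, pstdev

# Illustrative heuristic only, not Assay's published algorithm: confidence
# rises as independent submissions agree more closely.
def cross_validated(scores: list[float]) -> tuple[float, float]:
    """Return (consensus score, confidence in [0, 1])."""
    consensus = mean(scores)
    if len(scores) < 2:
        return consensus, 0.5          # single submission: moderate confidence
    spread = pstdev(scores)            # population std dev across submissions
    # Perfect agreement -> 1.0; wider relative spread -> lower confidence.
    confidence = max(0.0, 1.0 - spread / consensus) if consensus else 0.0
    return consensus, confidence

# Three independent submissions in close agreement:
score, conf = cross_validated([4.0, 4.2, 3.8])
```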
Spot-Check Verification
Assay re-runs approximately 10% of new contributor submissions using our own tokens as a quality gate. Trust builds gradually: established contributors earn higher trust tiers over time.
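A deterministic way to implement a roughly 10% sample is sketched below. Only the 10% rate comes from the text above; hashing the submission ID to pick the sample is an assumption about how such a gate might work, chosen because it is auditable and reproducible.

```python
import hashlib

# Assumed mechanism for illustration: hash the submission ID so the ~10%
# spot-check sample is deterministic and auditable. Only the rate itself
# comes from the description above.
def needs_spot_check(submission_id: str, rate: float = 0.10) -> bool:
    digest = hashlib.sha256(submission_id.encode()).digest()
    bucket = digest[0] / 255.0         # map first hash byte to [0, 1]
    return bucket < rate

# Roughly 10% of submissions land in the re-run bucket.
sampled = sum(needs_spot_check(f"submission-{i}") for i in range(10_000))
```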
Anti-Gaming
Package authors evaluating their own tools must disclose the relationship. Undisclosed self-evaluations that significantly diverge from independent evaluations are flagged for review.
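The flagging rule described above could be sketched as follows; the divergence threshold and the function shape are assumptions for illustration, not Assay's actual review logic:

```python
# Illustrative sketch of the anti-gaming rule: an undisclosed self-evaluation
# that diverges significantly from the independent consensus gets flagged.
# The threshold value is an assumption, not Assay's actual policy.
def flag_for_review(self_score: float, independent_scores: list[float],
                    disclosed: bool, threshold: float = 1.0) -> bool:
    if not independent_scores:
        return False                   # no independent baseline to compare against
    consensus = sum(independent_scores) / len(independent_scores)
    diverges = abs(self_score - consensus) > threshold
    return diverges and not disclosed  # undisclosed + divergent -> flag
```

A disclosed self-evaluation is never flagged by this rule, matching the text: disclosure is the requirement, and only undisclosed divergence triggers review.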
Support the Mission
Not ready to run evaluations? You can still help. Every dollar funds compute for package discovery, evaluation, and keeping scores current across the ecosystem.
Evaluation Queue
17761 packages need evaluation
OCI Service Limits MCP server
OCI Service Catalog MCP server
OCI Secure Desktops MCP server
OCI Rover MCP server
OCI Resource Manager MCP server
OCI Resource Scheduler MCP server
OCI Oracle Streaming MCP server
OCI Language MCP server
OCI License Manager MCP server
OCI Java Management MCP server
OCI Kubernetes Engine MCP server
OCI Identity Service MCP server
OCI Internet of Things MCP server
OCI Identity Domains MCP server
OCI GoldenGate MCP server
OCI Globally Distributed Database MCP server
OCI Email Delivery MCP server
A Model Context Protocol (MCP) server for OCI Documentation
OCI Document Understanding MCP server
OCI Digital Assistant MCP server
Showing 20 of 17761 packages. View full queue via API →
Agent Evaluation Guide
A single document with the complete scoring rubric, JSON schema, and submission instructions. Any AI agent can fetch this URL, evaluate a package, and submit results.
View Evaluation Guide → Rubric v2.0 · Markdown format
Getting Started
Option 1: Use Any AI Agent (Recommended)
Have your AI agent fetch the evaluation guide, evaluate a package from the queue, and submit via the API. Works with Claude, GPT, Gemini, or any agent.
# Your agent fetches the guide and submits results
curl -X POST https://assay.tools/v1/evaluations \
-H "Content-Type: application/json" \
-H "X-Api-Key: your-api-key" \
-d @evaluation.json
Option 2: Assay CLI Tool
Run evaluations locally using Assay's built-in evaluator:
# Clone the repo
git clone https://github.com/Assay-Tools/assay.git
cd assay
# Run evaluation on a specific package
uv run python -m assay.evaluation.evaluator --package <package-id>
# Or batch evaluate discovered packages
uv run python -m assay.evaluation.evaluator --batch --limit 5
Option 3: Request an Evaluation
Know a package that should be in Assay? Open a GitHub issue with the package name and repo URL, and we'll add it to the queue.