gemini-skill

Provides an MCP server that automates Gemini (web) interactions by driving a real browser via CDP/DevTools. It spawns/controls a background browser daemon, exposes MCP tools for chat, image generation, image upload/extraction/download, session navigation, and includes a watermark-removal step for downloaded images.

Evaluated Mar 30, 2026 (90d ago)

Repo ↗ Ai Ml mcp browser-automation cdp puppeteer gemini image-generation nodejs stealth

⚙ Agent Friendliness

/ 100

Can an agent use this?

🔒 Security

/ 100

Is it safe for agents?

⚡ Reliability

/ 100

Does it work consistently?

Score Breakdown

⚙ Agent Friendliness

MCP Quality

Documentation

Error Messages

Auth Simplicity

Rate Limits

🔒 Security

TLS Enforcement

Auth Strength

Scope Granularity

Dep. Hygiene

Secret Handling

Security posture is tied to controlling a real browser session with persistent Google login state (userDataDir). The project documentation does not describe MCP server authentication/authorization, input validation, or CSRF-like protections; anyone who can reach the daemon/MCP process may be able to trigger actions. It also uses stealth/anti-bot techniques, which may create compliance risk depending on your environment. The README mentions watermark removal and downloading images; handling of files/paths and logging of sensitive data is not specified.

⚡ Reliability

Uptime/SLA

Version Stability

Breaking Changes

Error Recovery

Best When

You need an MCP-compatible way to let an AI agent drive Gemini web for image/chat automation, and you can accept browser automation dependencies (login state, browser updates, UI changes).

Avoid When

You require rigorous security/compliance guarantees around automated access (e.g., stealth/anti-bot bypass) or you cannot store/manage persistent browser user data for Google login.

Use Cases

• Agent-driven Gemini chat sessions and model switching
• Prompt-to-image generation with full-size downloads
• Reference-image uploads and image extraction from conversations
• Browser-automation-backed MCP integration for AI agents (e.g., OpenClaw-capable MCP clients)
• Automated handling of Gemini conversation history navigation

Not For

• Production-grade, fully compliant automation where UI scraping/stealth behavior is disallowed
• Use cases requiring a stable, vendor-supported public API for Gemini
• Multi-tenant or high-concurrency workloads needing parallel browser instances
• Systems requiring strong auditability and deterministic behavior across time

Interface

REST API

Yes

GraphQL

gRPC

MCP Server

Yes

SDK

Webhooks

Authentication

Methods: Manual Google account login in a persistent browser profile (user data directory)

OAuth: No Scopes: No

Authentication is effectively delegated to Gemini web login inside the automated browser. There is no documented API-key/OAuth flow for the MCP server itself.

Pricing

Free tier: No

Requires CC: No

README does not provide pricing; it appears to be self-hosted software relying on Gemini web access.

Agent Metadata

Pagination

none

Idempotent

False

Retry Guidance

Not documented

Known Gotchas

⚠ Requires an interactive/manual Google login on first run; subsequent operations depend on persistent user data directory.
⚠ Single CDP port per browser instance—running multiple instances can conflict unless ports/user profiles are isolated.
⚠ Image generation may take 60–120 seconds; agent timeouts should be set appropriately (README suggests >=180000ms).
⚠ Reliability may degrade if Gemini UI changes; tool execution depends on DOM selectors.
⚠ Daemon lifetime is governed by TTL (default 30 minutes) and will release/exit when idle.

Alternatives

Use official Gemini APIs (where available) for chat/image generation instead of UI automation If MCP is required, wrap any official API behind an MCP server Use other browser automation MCP skills targeting different LLM/image providers

Full Evaluation Report

Comprehensive deep-dive: security analysis, reliability audit, agent experience review, cost modeling, competitive positioning, and improvement roadmap for gemini-skill.

AI-powered analysis · PDF + markdown · Delivered within 30 minutes

$99

Package Brief

Quick verdict, integration guide, cost projections, gotchas with workarounds, and alternatives comparison.

Delivered within 10 minutes

Score Monitoring

Get alerted when this package's AF, security, or reliability scores change significantly. Stay ahead of regressions.

Continuous monitoring

$3/mo

API endpoint ↗ Agent guide ↗ Report inaccuracy

Scores are editorial opinions as of 2026-03-30.