gemini-skill
Provides an MCP server that automates Gemini (web) interactions by driving a real browser over the Chrome DevTools Protocol (CDP). It spawns and controls a background browser daemon; exposes MCP tools for chat, image generation, image upload, extraction, and download, and session navigation; and includes a watermark-removal step for downloaded images.
Score Breakdown
⚙ Agent Friendliness
🔒 Security
Security posture is tied to controlling a real browser session with persistent Google login state (userDataDir). The project documentation does not describe MCP server authentication/authorization, input validation, or CSRF-like protections; anyone who can reach the daemon/MCP process may be able to trigger actions. It also uses stealth/anti-bot techniques, which may create compliance risk depending on your environment. The README mentions watermark removal and downloading images; handling of files/paths and logging of sensitive data is not specified.
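Since no authentication is documented for the MCP server or daemon, one precaution is to confirm that any address they listen on is loopback-only. The sketch below is illustrative and not part of gemini-skill: `is_loopback_only` is a hypothetical helper, and how the project actually configures its bind address is not stated in the README.

```python
import ipaddress


def is_loopback_only(host: str) -> bool:
    """Return True if `host` is a loopback IP literal or the name 'localhost'.

    Hypothetical helper: it only inspects the configured host string and
    does not probe the network or resolve DNS names.
    """
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name); treat it as not provably local.
        return False
```

A wildcard bind such as `0.0.0.0` fails this check, which is the case that would expose the browser session to other machines on the network.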
⚡ Reliability
Best When
You need an MCP-compatible way to let an AI agent drive Gemini web for image/chat automation, and you can accept browser automation dependencies (login state, browser updates, UI changes).
Avoid When
You require rigorous security or compliance guarantees for automated access (the tool relies on stealth/anti-bot techniques), or you cannot store and manage persistent browser user data for Google login.
Use Cases
- Agent-driven Gemini chat sessions and model switching
- Prompt-to-image generation with full-size downloads
- Reference-image uploads and image extraction from conversations
- Browser-automation-backed MCP integration for AI agents (e.g., OpenClaw-capable MCP clients)
- Automated handling of Gemini conversation history navigation
Not For
- Production-grade, fully compliant automation where UI scraping/stealth behavior is disallowed
- Use cases requiring a stable, vendor-supported public API for Gemini
- Multi-tenant or high-concurrency workloads needing parallel browser instances
- Systems requiring strong auditability and deterministic behavior across time
Interface
Authentication
Authentication is effectively delegated to Gemini web login inside the automated browser. There is no documented API-key/OAuth flow for the MCP server itself.
Pricing
README does not provide pricing; it appears to be self-hosted software relying on Gemini web access.
Agent Metadata
Known Gotchas
- ⚠ Requires an interactive/manual Google login on first run; subsequent operations depend on persistent user data directory.
- ⚠ Each browser instance uses a single CDP port; running multiple instances can conflict unless ports and user profiles are isolated.
- ⚠ Image generation may take 60–120 seconds; set agent timeouts accordingly (README suggests at least 180000 ms, i.e. 180 seconds).
- ⚠ Reliability may degrade if Gemini UI changes; tool execution depends on DOM selectors.
- ⚠ Daemon lifetime is governed by a TTL (default 30 minutes); once idle for that long, the daemon releases and exits.
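Two of the gotchas above (port/profile conflicts and long image-generation calls) can be handled on the client side. The following is a minimal sketch under stated assumptions: `InstanceConfig`, `make_instances`, and `call_with_timeout` are hypothetical names, not part of gemini-skill, and the way the project actually accepts a CDP port or user data directory is not specified here. Only the 180-second figure comes from the README's suggestion.

```python
import asyncio
from dataclasses import dataclass
from pathlib import Path

# README suggests agent timeouts of at least 180000 ms for image generation.
IMAGE_TIMEOUT_S = 180.0


@dataclass(frozen=True)
class InstanceConfig:
    """Hypothetical per-instance settings: one CDP port and one profile dir."""
    cdp_port: int
    user_data_dir: Path


def make_instances(base_port: int, base_dir: Path, count: int) -> list[InstanceConfig]:
    """Give every instance its own port and its own user data directory."""
    return [
        InstanceConfig(cdp_port=base_port + i, user_data_dir=base_dir / f"profile-{i}")
        for i in range(count)
    ]


async def call_with_timeout(coro, timeout_s: float = IMAGE_TIMEOUT_S):
    """Await `coro`, raising asyncio.TimeoutError if it exceeds `timeout_s`."""
    return await asyncio.wait_for(coro, timeout=timeout_s)
```

In this sketch each instance would still need its own manual Google login on first run, since the profile directories are separate.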
Alternatives
Scores are editorial opinions as of 2026-03-30.