PyAutoGUI

Cross-platform GUI automation library that controls the mouse and keyboard and takes screenshots programmatically. PyAutoGUI features: pyautogui.moveTo(x, y), pyautogui.click(), pyautogui.typewrite('text') (the older name for pyautogui.write()), pyautogui.press('enter'), pyautogui.screenshot(), pyautogui.locateOnScreen() (image template matching), pyautogui.hotkey('ctrl', 'c'), pyautogui.scroll(), drag and drop, basic window management, a global PAUSE between actions, FAILSAFE (move the mouse to a screen corner to abort), and cross-platform support (Windows, macOS, Linux). It is a primary Python library for desktop GUI automation in agent computer-use tasks.

Evaluated: Mar 06, 2026 (v0.9.x)
Category: Developer Tools
Tags: python, pyautogui, gui-automation, mouse, keyboard, screenshot, desktop-automation
⚙ Agent Friendliness: 63 / 100 (Can an agent use this?)
🔒 Security: 83 / 100 (Is it safe for agents?)
⚡ Reliability: 72 / 100 (Does it work consistently?)

Score Breakdown

⚙ Agent Friendliness

MCP Quality: --
Documentation: 80
Error Messages: 75
Auth Simplicity: 98
Rate Limits: 98

🔒 Security

TLS Enforcement: 88
Auth Strength: 88
Scope Granularity: 80
Dep. Hygiene: 82
Secret Handling: 75

Full desktop control: pyautogui can drive any application, including password managers, banking apps, and system dialogs. Agent automation must not log or expose keystrokes that contain credentials. FAILSAFE provides an emergency stop but does not sandbox agent actions. Run sensitive automation under a dedicated user account.
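One way to keep credentials out of logs is to route all typing through a single wrapper that redacts anything marked sensitive. The sketch below is illustrative only: the logger name, `describe_for_log`, and `type_text` are hypothetical helpers, not part of PyAutoGUI, and the only library call assumed is `pyautogui.write()`.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("agent.keyboard")  # hypothetical logger name


def describe_for_log(text: str, sensitive: bool) -> str:
    """Return a log-safe description of text that is about to be typed."""
    return "<redacted>" if sensitive else repr(text)


def type_text(text: str, sensitive: bool = False,
              typer: Optional[Callable[[str], None]] = None) -> None:
    """Type text, logging its content only when it is not marked sensitive.

    `typer` is injectable so the wrapper can be exercised without a display;
    by default it falls back to pyautogui.write.
    """
    if typer is None:
        import pyautogui  # imported lazily so this module loads without a display
        typer = pyautogui.write
    logger.info("typing %s", describe_for_log(text, sensitive))
    typer(text)
```

A caller would use `type_text(password, sensitive=True)` for credential fields and plain `type_text("query")` elsewhere. This only protects the wrapper's own log line; it does not stop other tools from capturing the screen or keystrokes.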

⚡ Reliability

Uptime/SLA: 72
Version Stability: 75
Breaking Changes: 78
Error Recovery: 65

Best When

Automating desktop applications (non-browser) for agent computer-use tasks — PyAutoGUI provides simple cross-platform mouse/keyboard control for legacy applications, desktop GUIs, and screen-based automation where Playwright/Selenium don't apply.

Avoid When

You're automating web browsers (use Playwright), need headless operation without Xvfb, or require precise timing for fast UI interactions.

Use Cases

  • Agent desktop automation — pyautogui.click(500, 300); pyautogui.typewrite('agent task input', interval=0.05); pyautogui.press('enter') — agent clicks UI element and types text; desktop automation without Playwright or Selenium for non-browser apps; legacy application automation
  • Agent screenshot and OCR — screenshot = pyautogui.screenshot(); screenshot.save('agent_view.png') — agent captures current screen state; combined with Tesseract/EasyOCR for text extraction; agent perceives screen content for decision making
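  • Agent image-based clicking: location = pyautogui.locateOnScreen('button.png', confidence=0.9); pyautogui.click(location) clicks a button found by visual template matching; the agent finds UI elements from a screenshot without hardcoding coordinates; the confidence parameter enables fuzzy matching and requires OpenCV (opencv-python) to be installed; depending on the PyAutoGUI version, a failed match either returns None or raises ImageNotFoundException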
  • Agent hotkey sequences: pyautogui.hotkey('ctrl', 'alt', 't'); pyautogui.sleep(0.5); pyautogui.typewrite('ls -la\n') opens a terminal (on desktops that bind Ctrl+Alt+T to a terminal, such as Ubuntu's default) and runs a command; keyboard shortcut automation for agent desktop workflows
  • Agent failsafe automation — pyautogui.PAUSE = 0.5; pyautogui.FAILSAFE = True — 0.5 second pause between all pyautogui calls; move mouse to corner aborts automation; agent automation with safety mechanisms to prevent runaway actions
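The image-based clicking use case usually needs a wait loop, because the target may not be on screen yet. The following is a minimal sketch under stated assumptions: `wait_for_image`, `box_center`, and `click_when_visible` are hypothetical helper names, and the locate function is passed in so the polling logic does not depend on a display. It treats both a `None` result and an `ImageNotFoundException` as "not found yet", since behavior differs between PyAutoGUI versions.

```python
import time
from typing import Callable, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height)


def wait_for_image(locate: Callable[[], Optional[Box]],
                   timeout: float = 10.0,
                   interval: float = 0.5) -> Optional[Box]:
    """Poll `locate` until it returns a box or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            box = locate()
        except Exception as exc:
            # Matched by class name so this helper needs no pyautogui import.
            if type(exc).__name__ != "ImageNotFoundException":
                raise
            box = None
        if box is not None:
            return box
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


def box_center(box: Box) -> Tuple[int, int]:
    """Return the integer center point of a (left, top, width, height) box."""
    left, top, width, height = box
    return left + width // 2, top + height // 2


def click_when_visible(image_path: str, confidence: float = 0.9) -> bool:
    """Click the image once it appears; `confidence` requires OpenCV."""
    import pyautogui  # imported lazily; needs a display
    box = wait_for_image(
        lambda: pyautogui.locateOnScreen(image_path, confidence=confidence))
    if box is None:
        return False
    pyautogui.click(*box_center(box))
    return True
```

Returning `False` on timeout lets the agent decide whether to retry, take a fresh screenshot, or give up, rather than clicking a stale coordinate.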

Not For

  • Browser automation — use Playwright or Selenium; pyautogui can control browsers but lacks DOM access, wait conditions, and proper web automation
  • Fast precise automation — pyautogui.PAUSE slows all actions; for high-frequency automation use platform-specific APIs (win32api on Windows, Quartz on macOS)
  • Headless environments — pyautogui requires a display; for headless agent automation use virtual framebuffer (Xvfb on Linux) or Playwright headless
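For the headless case, an agent can check for a display before importing PyAutoGUI, so the failure is an explicit message instead of an import-time traceback. This is a sketch, assuming that on Linux the `DISPLAY` environment variable is the signal for a reachable X server; `has_display` and `load_pyautogui` are hypothetical helper names, and the check for Windows and macOS simply assumes a desktop session is present.

```python
import os
import sys
from typing import Mapping


def has_display(env: Mapping[str, str] = os.environ,
                platform: str = sys.platform) -> bool:
    """Return True if a GUI display appears to be available."""
    if platform.startswith("linux"):
        # PyAutoGUI talks to X11 on Linux, which is located through DISPLAY.
        return bool(env.get("DISPLAY"))
    # Assumption: Windows and macOS sessions have a desktop attached.
    return True


def load_pyautogui():
    """Import pyautogui only after confirming a display is present."""
    if not has_display():
        raise RuntimeError(
            "No display found: start a virtual framebuffer such as Xvfb "
            "and set DISPLAY before running GUI automation.")
    import pyautogui
    return pyautogui
```

Calling `load_pyautogui()` at agent startup keeps the display requirement in one place; the rest of the code can use the returned module as usual.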

Interface

REST API: No
GraphQL: No
gRPC: No
MCP Server: No
SDK: Yes
Webhooks: No

Authentication

Methods: none
OAuth: No
Scopes: No

No auth — local desktop automation library.

Pricing

Model: open_source
Free tier: Yes
Requires CC: No

PyAutoGUI is BSD licensed. Free for all use.

Agent Metadata

Pagination: none
Idempotent: Partial
Retry Guidance: Not documented

Known Gotchas

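  • PyAutoGUI requires a display: on a headless Linux server (Docker, CI), importing pyautogui fails because no X server is reachable (typically a KeyError for 'DISPLAY' or an Xlib display-connection error); agent GUI automation in CI must start Xvfb first: Xvfb :99 -screen 0 1280x720x24 & export DISPLAY=:99; Docker agent containers need Xvfb installed and started before the automation runs, and do not normally need --privileged for this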
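  • locateOnScreen misbehaves on HiDPI displays: on macOS Retina displays, screenshots are captured in physical pixels (typically 2x) while mouse functions use logical points, so a match found in the screenshot must be divided by the scale factor before clicking; on any scaled display (including Windows HiDPI), a template captured at a different scale factor will not match; capture templates on the target display with pyautogui.screenshot(), and derive the scale from the screenshot width divided by pyautogui.size().width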
  • pyautogui.PAUSE slows all operations: the default pyautogui.PAUSE=0.1 adds 100 ms after every mouse/keyboard call, so 100 actions take at least 10 seconds; agent loops with many actions can lower pyautogui.PAUSE (down to 0) and add explicit waits only where the UI needs them; do this only with FAILSAFE left enabled, since without the pause the target UI may not keep up with input and a runaway script is harder to stop
  • typewrite() only accepts ASCII: pyautogui.typewrite('hello') works; pyautogui.typewrite('héllo') may silently skip the non-ASCII character; agent automation typing Unicode can use pyperclip.copy() followed by pyautogui.hotkey('ctrl', 'v') (or 'command', 'v' on macOS) to paste from the clipboard as a workaround
  • FAILSAFE triggers at a screen corner: with pyautogui.FAILSAFE=True, a PyAutoGUI call raises pyautogui.FailSafeException when the cursor is at a fail-safe corner, such as (0, 0) on the primary monitor; on multi-monitor setups where the top-left of the desktop is not the primary monitor's origin, agent automation may trigger FAILSAFE accidentally; keep FAILSAFE=True and test on the target display configuration
  • Screen resolution changes break coordinate-based automation: pyautogui.click(800, 600) hardcodes pixel coordinates, so resolution changes or window repositioning break agent automation; use pyautogui.locateOnScreen() to find elements visually, or pygetwindow to compute window-relative coordinates
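Two of the gotchas above, HiDPI coordinate scaling and non-ASCII typing, reduce to small pure functions plus a thin PyAutoGUI call. The sketch below is illustrative: all function names are hypothetical, the scale is assumed to be uniform in both axes, and `enter_text` assumes `pyperclip` is installed alongside PyAutoGUI.

```python
import sys
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height) in screenshot pixels


def screenshot_scale(screenshot_width: int, logical_width: int) -> float:
    """Ratio of screenshot pixels to logical mouse coordinates (2.0 on many Retina displays)."""
    return screenshot_width / logical_width


def to_click_point(box: Box, scale: float) -> Tuple[int, int]:
    """Convert a match box in screenshot pixels to a logical click point."""
    left, top, width, height = box
    return (round((left + width / 2) / scale),
            round((top + height / 2) / scale))


def is_typeable(text: str) -> bool:
    """True if every character is printable ASCII, a newline, or a tab."""
    return all(32 <= ord(ch) < 127 or ch in "\n\t" for ch in text)


def paste_modifier(platform: str = sys.platform) -> str:
    """Modifier key for paste: 'command' on macOS, 'ctrl' elsewhere."""
    return "command" if platform == "darwin" else "ctrl"


def enter_text(text: str) -> None:
    """Type ASCII directly; paste anything else through the clipboard."""
    import pyautogui  # imported lazily; needs a display
    import pyperclip
    if is_typeable(text):
        pyautogui.write(text)
    else:
        pyperclip.copy(text)  # note: this overwrites the user's clipboard
        pyautogui.hotkey(paste_modifier(), "v")
```

In practice the scale would be computed once per session, for example from `pyautogui.screenshot().width` and `pyautogui.size().width`, and reused for every match returned by `locateOnScreen`.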



Scores are editorial opinions as of 2026-03-06.
