{"id":"ui-tars-desktop","name":"UI-TARS Desktop","af_score":66.5,"security_score":65.0,"reliability_score":null,"what_it_does":"UI-TARS Desktop is an open-source multimodal AI agent stack that enables natural-language control of GUIs (desktop, browser, terminal) via vision-language models. It includes Agent TARS (a CLI/web agent) and UI-TARS Desktop (a native GUI automation app), both built with MCP (Model Context Protocol) as their kernel.","best_when":"You want an open-source, multimodal computer-use agent that can control GUIs by seeing the screen, supports local models for privacy, and integrates with the MCP ecosystem for tool extensibility.","avoid_when":"You need a managed, hosted service with guaranteed uptime. This is self-hosted open-source software that requires significant setup and model access.","last_evaluated":"2026-03-01T09:50:06.337222+00:00"}