Google Unveils Gemini 2.5 Computer Use That Clicks, Types, and Scrolls Like Humans

Today, Google unveiled the Gemini 2.5 Computer Use model, an AI system designed to interact with user interfaces (UIs). Built on Gemini 2.5 Pro, the model gives AI agents visual understanding and reasoning capabilities, enabling them to navigate web interfaces in a browser as well as Android UIs.

The Gemini 2.5 Computer Use model mimics human behavior by executing actions such as clicking, typing, and scrolling to complete tasks. It scored 88.9% on the WebVoyager benchmark, ahead of OpenAI’s Computer-Using Agent, which scored 87%. It also outperformed OpenAI’s Operator agent on the Online-Mind2Web benchmark.

These results indicate that Google has trained a state-of-the-art AI agent capable of reliably performing complex tasks within browsers. The model also offers advantages in both accuracy and latency over competitors such as Claude Sonnet 4.5 and OpenAI’s Computer-Using Agent.


Versions of this model are currently deployed in Project Mariner and in AI Mode in Google Search. The Gemini 2.5 Computer Use model is also accessible via API through Google AI Studio and Vertex AI, so developers and enterprises can build on it.
