HeadlinesBriefing.com

DeepMind Unveils Gemini 2.5 Computer Use Model for UI‑Driven Agents

Google DeepMind Blog

Google DeepMind has released the Gemini 2.5 Computer Use model, a specialized model built on Gemini 2.5 Pro that can interact with web and mobile interfaces. It is exposed through a new `computer_use` tool in the Gemini API, letting developers automate tasks that previously required manual UI navigation.

The design relies on a loop: the model receives the user request, a screenshot, and the history of recent actions, and returns a function call such as click or type. After each action is executed, the updated screenshot and URL are fed back into the loop, so the agent can complete multi-step workflows such as submitting a form or booking an appointment.
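The loop described above can be sketched in a few lines of Python. This is a minimal illustration and does not use the Gemini API: `Action`, `Observation`, `run_agent_loop`, `scripted_model`, and `fake_browser` are all hypothetical names introduced here, and the model and browser are replaced by stand-ins so the control flow can be run on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """A function call proposed by the model, e.g. click or type."""
    name: str
    args: dict


@dataclass
class Observation:
    """What the environment reports back after an action."""
    screenshot: bytes
    url: str


def run_agent_loop(
    request: str,
    propose_action: Callable[[str, Observation, List[Action]], Optional[Action]],
    execute: Callable[[Action], Observation],
    first_observation: Observation,
    max_steps: int = 10,
) -> List[Action]:
    """Alternate between asking for an action and executing it.

    Stops when the model stand-in returns None (task done) or after
    max_steps iterations, whichever comes first.
    """
    history: List[Action] = []
    observation = first_observation
    for _ in range(max_steps):
        # Inputs per step: user request, current screen, action history.
        action = propose_action(request, observation, history)
        if action is None:
            break
        # Executing the action yields the updated screenshot and URL.
        observation = execute(action)
        history.append(action)
    return history


def scripted_model(script: List[Action]):
    """Hypothetical stand-in for the model: replays a fixed list of actions."""
    remaining = list(script)

    def propose(request, observation, history):
        return remaining.pop(0) if remaining else None

    return propose


def fake_browser(action: Action) -> Observation:
    """Hypothetical executor: pretends the action ran and reports a new screen."""
    return Observation(screenshot=b"", url=f"https://example.test/after-{action.name}")


if __name__ == "__main__":
    steps = run_agent_loop(
        "Submit the contact form",
        scripted_model([
            Action("click", {"x": 10, "y": 20}),
            Action("type", {"text": "hello"}),
        ]),
        fake_browser,
        Observation(screenshot=b"", url="https://example.test/"),
    )
    print([step.name for step in steps])
```

The `max_steps` cap is a design choice of this sketch, not something the announcement specifies; it simply keeps a misbehaving loop from running indefinitely.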

Benchmarks from Browserbase show Gemini 2.5 Computer Use outperforms competing systems on web‑control tests while keeping latency low. DeepMind also embeds per‑step safety checks and developer‑configurable controls to curb risky actions such as CAPTCHA bypass or high‑stakes commands.
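One way to picture a per-step check with developer-configurable controls is a small policy gate that runs before every action. The sketch below is an assumption about how such a gate could look on the developer's side, not DeepMind's implementation: `SafetyPolicy`, `Decision`, `check_step`, and the action names `solve_captcha` and `submit_payment` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_CONFIRMATION = "require_confirmation"
    BLOCK = "block"


@dataclass(frozen=True)
class SafetyPolicy:
    """Developer-configured sets of action names (hypothetical)."""
    blocked_actions: frozenset = frozenset()
    confirm_actions: frozenset = frozenset()


def check_step(action_name: str, policy: SafetyPolicy) -> Decision:
    """Classify a proposed action before it is executed."""
    # Blocking takes precedence over asking for confirmation.
    if action_name in policy.blocked_actions:
        return Decision.BLOCK
    if action_name in policy.confirm_actions:
        return Decision.REQUIRE_CONFIRMATION
    return Decision.ALLOW


if __name__ == "__main__":
    policy = SafetyPolicy(
        blocked_actions=frozenset({"solve_captcha"}),
        confirm_actions=frozenset({"submit_payment"}),
    )
    for name in ("click", "submit_payment", "solve_captcha"):
        print(name, check_step(name, policy).value)
```

In an agent loop, a check like this would run on each proposed action, with execution skipped on `BLOCK` and paused for user approval on `REQUIRE_CONFIRMATION`.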

Early adopters have used the preview in production for UI testing, workflow automation, and AI‑powered search features, which suggests the model has practical value for software engineering and personal assistant workflows.