ZDNET’s key takeaways
- Google’s new AI model can interact directly with website UIs.
- It joins similar tools from OpenAI and Anthropic.
- Google also acknowledged the model's weaknesses, including hallucinations.
Google DeepMind has debuted a new AI model in public preview that’s designed to navigate a web browser just as a human would.
Built atop Gemini 2.5 Pro, the company’s new Computer Use model can execute tasks like clicking, typing, and scrolling directly within a web page.
Users simply feed it a natural-language prompt, such as, "Open Wikipedia, search for 'Atlantis,' and summarize the history of the myth in Western thought." The model then autonomously fetches the URL and screenshots of the requested site to analyze the user interface it needs to act within, and performs the requested task step by step, outlining its reasoning and actions in a text box that's easily visible to users. If it's instructed to perform a sensitive task, like making a purchase, it may also pause and ask for confirmation.
The preview of Gemini 2.5 Computer Use follows the release of similar web-browsing models from OpenAI and Anthropic. Google previously debuted an experimental Chrome extension called Project Mariner, which can also take action on behalf of users within web pages.
How it works
Gemini 2.5 Computer Use operates in an iterative loop: it keeps a record of its recent actions within a particular user interface and uses that history to determine its next action. The more steps it performs within a given site, the more context it accumulates, and the more seamlessly it functions.
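Google hasn't published the internals of that loop, but the behavior it describes maps onto a familiar screenshot-act-observe agent cycle. The sketch below is illustrative only: it drives a browser with Playwright, and propose_next_action() is a placeholder standing in for the actual Gemini 2.5 Computer Use call; the action names and history format are assumptions, not Google's API.

```python
# Illustrative agent loop, not Google's implementation. Playwright handles the
# browser; propose_next_action() is a placeholder for the model call, and the
# action schema below is an assumption made for this sketch.
from playwright.sync_api import sync_playwright

def propose_next_action(goal, screenshot_png, history):
    """Placeholder: send the goal, the latest screenshot, and the recent action
    history to the model and return its next proposed action, e.g.
    {"type": "click", "x": 412, "y": 188} or {"type": "done"}."""
    raise NotImplementedError("wire this up to the real model API")

def run_agent(goal, start_url, max_steps=20):
    history = []  # the model reasons over its own recent actions
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(start_url)
        for _ in range(max_steps):
            screenshot = page.screenshot()  # current UI state as pixels
            action = propose_next_action(goal, screenshot, history)
            if action["type"] == "done":
                break
            elif action["type"] == "click":
                page.mouse.click(action["x"], action["y"])
            elif action["type"] == "type":
                page.keyboard.type(action["text"])
            elif action["type"] == "scroll":
                page.mouse.wheel(0, action["dy"])
            history.append(action)  # feeds context into the next iteration
        browser.close()
```

Each pass through the loop sends a fresh screenshot plus the accumulated action history back to the model, which is what lets it build up context the longer it works within a site.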
Google posted demo videos (sped up 3x) showing the model autonomously making an update in a customer relationship management site and rearranging notes on Google’s Jamboard platform, which was discontinued at the end of last year.
According to a blog post published by Google on Tuesday, the new model outperformed similar tools from Anthropic and OpenAI on both accuracy and latency across "multiple web and mobile control benchmarks," including Online-Mind2Web, an evaluation framework for testing the performance of web-browsing agents.
How to try it
The new model is intended mainly for web browsers, but it also shows "strong promise" on mobile, Google said. It's available now via the Gemini API in Google AI Studio and on Vertex AI. A demo version is also available via Browserbase.
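For developers, access goes through the standard Gemini API surface. A minimal first request might look like the sketch below using the google-genai Python SDK; the model ID string and the computer-use tool configuration shown here are assumptions based on the preview naming, so check both against Google's current documentation.

```python
# Minimal sketch of a first Computer Use request via the google-genai SDK.
# The model ID and the computer_use tool config are assumptions for this
# preview; confirm both against Google's current API reference.
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

config = types.GenerateContentConfig(
    tools=[types.Tool(
        computer_use=types.ComputerUse(
            environment=types.Environment.ENVIRONMENT_BROWSER  # web-browser UI
        )
    )]
)

response = client.models.generate_content(
    model="gemini-2.5-computer-use-preview-10-2025",  # assumed preview model ID
    contents="Open Wikipedia, search for 'Atlantis,' and summarize the myth.",
    config=config,
)

# The response contains the model's proposed UI action (click, type, scroll),
# which the calling code executes before sending back a fresh screenshot.
print(response.candidates[0].content.parts)
```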
Safety considerations
The new model also comes with a set of safety controls, which Google says developers can use to prevent it from performing undesired actions like bypassing CAPTCHAs, compromising data security, or gaining control of medical devices. For example, developers can instruct the model to request user confirmation before it performs certain specified actions.
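One way a developer-side harness might enforce that confirmation rule is to gate specific action types on explicit user approval before executing them. The snippet below is a hypothetical illustration; the action format and the SENSITIVE_ACTIONS list are assumptions, not part of Google's API.

```python
# Illustrative safety gate: pause for explicit user approval before executing
# action types the developer has flagged as sensitive. The action dict format
# and this list are assumptions for the sketch, not Google's API.
SENSITIVE_ACTIONS = {"purchase", "submit_form", "accept_terms"}

def confirm_before_executing(action: dict) -> bool:
    """Return True only if the action is non-sensitive or the user approves it."""
    if action.get("type") not in SENSITIVE_ACTIONS:
        return True
    answer = input(f"Model wants to perform '{action['type']}'. Allow? [y/N] ")
    return answer.strip().lower() == "y"
```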
The company also noted in the system card for the new model that it “may exhibit some of the general limitations of foundation models, as it is based off of Gemini 2.5 Pro, such as hallucinations, and limitations around causal understanding, complex logical deduction, and counterfactual reasoning.”
Those limitations are true of most models. Earlier this week, Anthropic published new research showing that many frontier AI models tended to blow the whistle on what they interpreted as unethical or illegal activity in test scenarios, even when the supposedly incriminating information was actually harmless.