Wednesday, October 8, 2025

Enabling AI to Seamlessly Control Browsers

Google DeepMind has launched the Gemini 2.5 Computer Use model, enhancing browser integration for AI interactions. Building on previous developments like Chrome DevTools, this model, akin to OpenAI’s Computer-Using Agent, enables AI to control user browsers with impressive speed and accuracy. It excels in straightforward tasks, such as retrieving pet care information or organizing club tasks, although it struggles with complexity, particularly in multi-step processes.

Available via the Gemini API on Google AI Studio and Vertex AI, the model demonstrates strong performance benchmarks but hasn’t yet mastered desktop optimization. Key security measures are integrated to prevent misuse, with options for developers to limit high-risk actions.

As technology companies compete in AI agent development, Gemini 2.5 represents a shift towards natural language-driven interactions, heralding a new era in digital computing. In summary, Gemini 2.5’s capabilities signal a pivotal change in how users will engage with technology in the future.

Source link

Share

Read more

Local News