Friday, October 10, 2025

AI Achieves Human-Like Skill in Browser Navigation

Google’s Gemini 2.5 Computer Use model is designed to interact with web interfaces the way a person does. Available through the Gemini API in Google AI Studio and Vertex AI, it performs tasks such as scrolling, clicking, and filling in forms without relying on predefined scripts. According to Google DeepMind, the model performs well in browser and mobile applications and works as an “agentic” AI that carries out multi-step processes on its own. It operates a virtual browser and interprets user interfaces in real time, which could extend automation for businesses in sectors such as finance and healthcare. Reliability and privacy remain open challenges, but early testing indicates strong potential. Google is also pushing for integration with tools like Google Workspace, which could make the model a factor in everyday productivity. Google is inviting developers to experiment with the model and help refine its capabilities, while the ethical considerations around automating complex online tasks still need to be addressed.

