The IMO gold medal appears to have been overshadowed by recent advances in AI mathematics. Google’s Aletheia, powered by Gemini 3 Deep Think, solved 6 of the 10 problems in the FirstProof competition without any human assistance. The problems were written by 11 renowned mathematicians, were original, and were not available online, which minimized the risk of cheating. By contrast, OpenAI’s model answered 5 questions correctly but relied on human intervention during testing.
FirstProof featured challenging problems, including one that had remained unsolved until Aletheia resolved it independently. Aletheia worked through the problems in real time and produced logically rigorous solutions without human formatting. By dynamically allocating its reasoning resources, it managed computation effectively enough to get through the hardest questions. The result gives Google a slight edge over OpenAI in AI-driven mathematics and sets a higher bar for future challenges. The next round of questions is expected in mid-March.