Friday, September 26, 2025

OpenAI Claims GPT-5 Is Approaching Expert-Level Work Quality

OpenAI recently introduced GDPval, a benchmark that compares the output of its GPT-5 model with the work of industry professionals. The test aims to measure how closely AI systems approach human performance in economically significant sectors, a crucial component in the pursuit of artificial general intelligence (AGI). OpenAI claims that GPT-5 and Anthropic’s Claude Opus 4.1 “are already approaching the work quality of industry experts.”

Initial results show that GPT-5 was rated as good as or better than experts in 40.6% of tasks, while Claude Opus 4.1 reached 49%. OpenAI acknowledges, however, that GDPval covers only a fraction of actual job responsibilities, and it plans to create more comprehensive assessments.

The results suggest that AI can enhance productivity, allowing professionals to focus on higher-value tasks. OpenAI’s leaders express optimism about the progress shown on GDPval and foresee further advances in AI capabilities that will support human workers more effectively.
