OpenAI’s GPT-5: Claims of Approaching Expert-Level Work

OpenAI recently introduced GDPval, a benchmark that compares the performance of its GPT-5 model with that of industry professionals. The test aims to measure how closely AI systems approach human output in economically significant sectors, a crucial component in the pursuit of artificial general intelligence (AGI). OpenAI claims that GPT-5 and Anthropic’s Claude Opus 4.1 “are already approaching the work quality of industry experts.”

Initial results show that GPT-5 was rated as good as or better than industry experts in 40.6% of cases, while Claude Opus 4.1 reached 49% of tasks on the same measure. Despite these findings, OpenAI acknowledges that GDPval covers only a fraction of actual job responsibilities and says it plans to create more comprehensive assessments.

The results suggest AI can enhance productivity, allowing professionals to focus on higher-value tasks. OpenAI’s leaders express optimism about the progress of GDPval and foresee further advancements in AI capabilities that will support human workers more effectively.

