OpenAI’s GPT-5: Claims of Approaching Expert-Level Work

OpenAI recently introduced GDPval, a benchmark that compares the output of AI models such as GPT-5 with work produced by industry professionals. The benchmark aims to measure how closely AI systems approach human output in economically significant sectors, a crucial component in the pursuit of artificial general intelligence (AGI). OpenAI claims that its GPT-5 and Anthropic’s Claude Opus 4.1 “are already approaching the work quality of industry experts.”

Initial results show that GPT-5’s output was rated as good as or better than the experts’ work in 40.6% of cases, while Claude Opus 4.1 reached 49% on the same measure. Despite these findings, OpenAI acknowledges that GDPval covers only a fraction of actual job responsibilities and says it plans to create more comprehensive assessments.

The results suggest AI can enhance productivity, allowing professionals to focus on higher-value tasks. OpenAI’s leaders express optimism about the progress of GDPval and foresee further advancements in AI capabilities that will support human workers more effectively.
