Is OpenAI’s New Open-Source Model More Intelligent Than a 10-Year-Old?

I recently tested OpenAI’s gpt-oss:20b, its first open-source model based on GPT-4, which has a June 2024 knowledge cutoff. The model can also run web searches, which may extend what it can answer. I put it up against a real-world challenge: a UK 11+ practice test. I ran the model on my RTX 5080 GPU, which struggled with its demands, and prompted it to solve the test questions.

The results were disappointing: it answered only 9 of 80 questions correctly and often produced completely irrelevant responses. Although its output showed some reasoning skill, the model frequently skipped problems or generated its own quiz instead of answering my prompts. Increasing the context length improved its performance slightly, but processing remained slow and many outputs were still irrelevant. Overall, my 10-year-old son outperformed the AI, but the experiment still yielded valuable insights into AI reasoning and its limitations, even with the challenges posed by my hardware.
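For readers who want to try something similar, here is a minimal sketch of how such a test could be run and scored. It is not the setup used above: it assumes the model is served locally through Ollama's HTTP API on its default port (the write-up does not say which runtime was used), and the prompt wording, the `num_ctx` value of 8192, and the helper names `build_request`, `ask`, and `score` are all hypothetical. Sending one question per request and setting the context window explicitly are just one way to approach the skipped questions and context-length effects described above.

```python
import json
import urllib.request

# Assumption: a local Ollama server on its default port; not stated in the write-up.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(question: str, num_ctx: int = 8192) -> dict:
    """Build a non-streaming generate request with an explicit context window."""
    return {
        "model": "gpt-oss:20b",
        # Hypothetical prompt wording, one question per request.
        "prompt": "Answer this 11+ practice question. "
                  f"Reply with the answer only.\n\n{question}",
        "stream": False,
        "options": {"num_ctx": num_ctx},  # context length in tokens
    }


def ask(question: str, num_ctx: int = 8192) -> str:
    """Send one question to the local server and return the model's text."""
    data = json.dumps(build_request(question, num_ctx)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


def score(answers: list[str], key: list[str]) -> float:
    """Fraction of answers matching the key, ignoring case and outer whitespace."""
    correct = sum(
        a.strip().lower() == k.strip().lower() for a, k in zip(answers, key)
    )
    return correct / len(key)
```

With a scorer like this, 9 correct answers out of 80 works out to 0.1125, or just over 11%. Exact string matching is a deliberately strict choice; free-text answers would need a more forgiving comparison.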
