AI Showdown: Grok Amazes Mrwhosetheboss While ChatGPT Emerges Victorious

In a recent YouTube video, Mrwhosetheboss compared four AI models across a range of tasks: Grok (Grok 3), Gemini (2.5 Pro), ChatGPT (GPT-4o), and Perplexity (Sonar Pro). To test real-world problem solving, he asked how many Aerolite 29″ suitcases fit in a Honda Civic trunk. Grok gave the correct answer, “2,” while ChatGPT and Gemini both offered practical insights. In a cake-making test, Grok correctly identified dried porcini mushrooms as the odd item out, while the other models faltered. All four models hallucinated at some point during the tests, confidently presenting incorrect information. The final ranking was ChatGPT (29 points), Grok (24 points), Gemini (22 points), and Perplexity (19 points). The results highlight both the progress and the limitations of current AI models on everyday problems.

