In a recent YouTube video, Mrwhosetheboss compared four AI models, Grok (Grok 3), Gemini (2.5 Pro), ChatGPT (GPT-4o), and Perplexity (Sonar Pro), across a range of tasks. To test real-world problem solving, he asked how many Aerolite 29″ suitcases fit in the trunk of a Honda Civic. Grok gave the correct answer, “2,” while ChatGPT and Gemini both offered practical insights. He also asked the models for cake-making advice, where Grok correctly identified dried porcini mushrooms as the odd item out, while the other models faltered. Throughout the tests, all of the models hallucinated at some point, confidently presenting incorrect information. The final ranking was ChatGPT (29 points), Grok (24 points), Gemini (22 points), and Perplexity (19 points). The evaluation highlights both the progress and the limitations of current AI models as they are increasingly used to solve everyday problems.