A recent report by Google examines the factual accuracy of AI chatbots and finds that even the best-performing models top out at about 70% accuracy. In other words, roughly one in three responses may be incorrect, even when delivered confidently. Google’s Gemini 3 Pro leads the pack with a 69% accuracy rate, followed by Gemini 2.5 Pro and OpenAI’s ChatGPT-5 at 62%. Anthropic’s Claude 4.5 Opus and xAI’s Grok 4 lag behind, scoring 51% and 54%, respectively. The FACTS Benchmark Suite, developed in conjunction with Kaggle, focuses on real-world accuracy and exposes weak performance on multimodal tasks, where accuracy in interpreting charts falls below 50%. This raises concerns for critical fields like finance and healthcare: AI chatbots are improving, but they are not yet reliable enough to be trusted blindly. It is therefore wise to treat them as assistants rather than definitive sources of truth.
