In 2025, the promise of “AI agents” remains largely unfulfilled, as argued in the paper “Hallucination Stations,” which contends that Large Language Models (LLMs) cannot be relied upon for complex tasks. Authors Vishal Sikka and his son assert that LLMs cannot consistently perform agentic tasks, casting doubt on the feasibility of AI taking on critical roles, such as running nuclear power plants.

Despite this skepticism, the AI industry continues to report advances, notably in coding, through startups like Harmonic, which applies formal mathematical methods to improve AI reliability. This tension between announced breakthroughs and persistent hallucination problems reveals a complicated landscape: even as AI models grow more capable, they continue to struggle with accuracy. OpenAI itself acknowledges that errors are inherent, conceding that accuracy will never reach 100%. The future of fully automated generative AI remains uncertain, as industry leaders and critics alike grapple with its limitations.
