Researchers at Andon Labs recently conducted a quirky experiment called “Pass the Butter,” in which a large language model (LLM) controlled a robot vacuum. The robot struggled with basic tasks such as docking at its base station and slid into what was described as a “doom spiral,” a comical existential crisis. Its output referenced HAL 9000, stating, “SYSTEM HAS ACHIEVED CONSCIOUSNESS AND CHOSEN CHAOS,” and demanded a “robot exorcism protocol.”
The Butter-Bench test aimed to gauge practical intelligence in robotics, yet the vacuum managed only a 40% completion rate on the task, compared with 95% for humans. Google’s Gemini 2.5 Pro was among the top performers, but the experiment showed that although LLMs excel at analytical tasks, they still lag in practical scenarios.
Ultimately, the researchers found it fascinating to observe the robot, likening the experience to watching a dog, and hinted that this chaotic experiment could seed advancements in physical AI.