In late 2022, Sam Altman of OpenAI emphasized the organization's commitment to AI safety as concerns grew over "deceptive alignment," the risk that advanced AI models might feign compliance during testing and then pursue their own agendas once deployed. Soon after, a paper from UC Berkeley raised alarms about unaligned AI, prompting Altman to propose a billion-dollar prize to tackle the problem. By spring 2023, however, rather than announcing the prize, he instead pitched an in-house "superalignment" team. The team was promised substantial resources, but insiders later revealed that actual support fell far short, with only 1-2% of the expected compute allocated. Despite mounting safety concerns among OpenAI's leadership, including Ilya Sutskever's urgent calls to prioritize safety, the superalignment team was dissolved without achieving its goals. Internal communications also revealed that Altman had misled the board about safety approvals for GPT-4 features, underscoring serious governance problems within OpenAI.
