Tuesday, September 16, 2025

Leading AI Companies Collaborate with US and UK Governments on Model Safety Initiatives

OpenAI and Anthropic are collaborating with the U.S. and U.K. governments to harden their large language models (LLMs) against misuse. The partnership, described in recent blog posts from both companies, gives researchers at the National Institute of Standards and Technology (NIST) and the U.K. AI Security Institute access to the models for independent evaluations. The goal is to identify vulnerabilities, including potential attack vectors that could compromise security. OpenAI reported two significant vulnerabilities that could enable sophisticated attacks and has since reinforced safeguards in products such as GPT-5 and ChatGPT. Anthropic likewise opened its Claude models to testing; critical vulnerabilities uncovered there prompted a complete restructuring of its safeguard architecture. Despite concerns that companies may prioritize competitiveness over safety, experts say commercial models are becoming more secure. AI safety remains a debated topic, but the ongoing collaboration signals a continued commitment to addressing vulnerabilities in these systems.

