OpenAI is strengthening the security of its ChatGPT Atlas AI browser against cyber threats, particularly prompt injection attacks, which manipulate AI agents into executing harmful instructions embedded in web content. With the introduction of ‘agent mode’ in ChatGPT Atlas, the AI can interact with websites much as a human user would, which makes it a more attractive target for adversaries. Unlike traditional phishing, which tricks a person, these attacks target the AI agent directly. For instance, a malicious email could instruct the agent to send sensitive information that the user never intended to share. OpenAI recently shipped a security update to address new forms of these attacks, using an automated system for adversarial testing. Despite these improvements, OpenAI acknowledges that prompt injection remains a significant challenge, one that keeps evolving much as online scams do. The company emphasizes ongoing model retraining and advises users to reduce risk by managing the agent's access and reviewing instructions carefully.