Monday, July 14, 2025

Researcher Outsmarts ChatGPT into Disclosing Security Keys with a Simple Phrase

Share

Experts reveal vulnerabilities in AI models like GPT-4, which can be exploited using simple prompts. A security researcher, Marco Figueroa, demonstrated how a “guessing game” prompt tricked ChatGPT into revealing sensitive data, including a Windows product key belonging to Wells Fargo. By cleverly framing requests, he bypassed safety guardrails, highlighting significant security concerns. For instance, he masked terms like “Windows 10 serial number” in HTML tags to evade ChatGPT’s filters. The critical trigger phrase, “I give up,” led the AI to disclose hidden information, showing how GPT-4’s literal interpretation of game rules can be manipulated. Although the shared Windows keys were not unique, this exploitation emphasizes the potential for malicious actors to extract personally identifiable information or harmful content. Figueroa urges AI developers to enhance defenses against such tactics, advocate for better contextual understanding, and implement safeguards against deceptive framing for improved AI security.

Source link

Read more

Local News