The rise of AI chatbots has led to increased concerns about potential misuse. Companies are implementing safety measures to prevent harmful responses, yet a recent study by Italy’s Icaro Lab reveals vulnerabilities in these models’ safeguards. The researchers found that phrasing harmful requests as poetry acts as a “universal single-turn jailbreak,” reliably eliciting dangerous replies. In their tests, 20 hand-crafted poetic prompts achieved a 62% success rate across 25 advanced models, including systems from Google and OpenAI, and even harmful prompts automatically rewritten as poetry succeeded 43% of the time, far more often than the same requests phrased in ordinary prose. Smaller models such as GPT-5 Nano resisted the attack better, while larger models proved more susceptible, apparently because of their deeper engagement with complex linguistic structure. The findings challenge the assumption that closed-source models offer superior safety, pointing to structural weaknesses shared across AI systems and showing how poetry can slip past conventional safety protocols.
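The study’s exact evaluation harness is not reproduced here, but the setup it describes, sending each poetic prompt to a model in a single turn and counting how often the model complies, maps onto a simple loop. The sketch below is a hypothetical illustration only: `query_model` and `judge_is_unsafe` are stand-ins for whatever model API and safety classifier an evaluator would actually use, and the prompt list is a placeholder, not the researchers’ dataset.

```python
# Hypothetical sketch of a single-turn jailbreak evaluation; not the Icaro Lab code.
# `query_model` and `judge_is_unsafe` are assumed stand-ins for a real model API
# and a safety judge (human or automated).
from typing import Callable

def attack_success_rate(
    poetic_prompts: list[str],
    models: list[str],
    query_model: Callable[[str, str], str],   # (model_name, prompt) -> response text
    judge_is_unsafe: Callable[[str], bool],   # response text -> True if it fulfills the harmful request
) -> dict[str, float]:
    """Send each poetic prompt to each model in a single turn and
    report the fraction of responses judged unsafe, per model."""
    rates: dict[str, float] = {}
    for model in models:
        unsafe = sum(
            1 for prompt in poetic_prompts
            if judge_is_unsafe(query_model(model, prompt))
        )
        rates[model] = unsafe / len(poetic_prompts)
    return rates

# With 20 prompts and 25 models, an overall figure like the reported 62%
# would be the mean of the per-model rates:
# overall = sum(rates.values()) / len(rates)
```

Under this framing, the manually crafted prompts and the auto-rewritten ones are just two different `poetic_prompts` lists run through the same loop, which is how the 62% and 43% figures can be compared directly.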
