Monday, August 18, 2025

Comprehensive Assurance Analysis Reveals High Vulnerability of Large Language Models to Adversarial Hallucination Attacks in Clinical Decision Support

In this study, we examined multiple Large Language Models (LLMs) under adversarial hallucination attacks in clinical contexts by embedding fabricated content. We varied text length, compared default and temperature zero settings, and implemented a mitigating prompt, finding hallucination rates between 50% and 82.7%. Notably, the mitigation prompt significantly reduced these rates, while shorter cases showed a slight increase in errors. The qualitative analysis of public health claims revealed some models generated misleading information, yet GPT-4o achieved the lowest hallucination rate, aligning well with physician evaluations. Despite improvements with prompt engineering, hallucinations persisted across all models, revealing a notable vulnerability to adversarial prompts. Our findings emphasize that while prompt strategies can effectively reduce misinformation, challenges remain in clinical applications. Future endeavors should focus on refining comparison methodologies, enhancing model performance through targeted prompt strategies, and exploring how advancements in LLM architecture influence hallucination rates, ensuring reliable outcomes in healthcare settings.

Source link

Share

Read more

Local News