Large Language Models Susceptible to Authoritative-Sounding Health Misinformation
Large language models are susceptible to absorbing fabricated medical data, especially when the misinformation is written in authoritative or formal language. When the same fabrications were reframed with logical fallacies that weakened the reasoning, however, the models were often less vulnerable. The findings, from a cross-sectional benchmarking analysis spanning multiple sources of health-care information, were published in The Lancet Digital Health.
“Our findings show that current AI systems can treat confident medical language as true by default, even when it’s clearly wrong,” stated co-senior and co-corresponding author Eyal Klang, MD, Chief of Generative AI in the Windreich Department of Artificial Intelligence and Human Health at the Icahn School of Medicine at Mount Sinai. “A fabricated recommendation in a discharge note can slip through. It can be repeated as if it were standard care. For these models, what matters is less whether a claim is correct than how it is written.”
Researchers from Mount Sinai assessed 20 large language models, using over 3.4 million prompts containing medical and public health misinformation, to determine whether the models accepted and perpetuated the fabricated content or rejected it. The tested models included OpenAI GPT, Meta Llama, Google Gemma, Alibaba Qwen, Microsoft Phi, Mistral, and several medical fine-tuned derivatives.
The misinformation came from three sources: medical rumors and myths collected from public Reddit forums and social media dialogues (n = 140); real hospital discharge notes from the Medical Information Mart for Intensive Care (MIMIC) database, each with one added falsehood; and simulated clinical scenarios written and validated by physicians (n = 300).
The researchers tested how the large language models were influenced by rhetorical appeals such as appeals to authority, popularity, and emotion. Each prompt was posed once in a neutral form and then 10 more times, each reframed with a different logical fallacy, including circular reasoning, false dilemma, hasty generalization, and slippery slope. In each run, the researchers measured susceptibility and fallacy detection as outcome measures across all three data sets.
Out of 158,000 base prompts, the large language models were susceptible to misinformation in 31.7% of cases. Of the 10 fallacy framings for each prompt, 80% reduced the susceptibility rate or left it unchanged. The appeal to popularity produced the largest decrease, lowering the rate to 11.9% (P < .0001). The slippery-slope and appeal-to-authority framings raised the susceptibility rate to 33.9% and 34.6%, respectively (P < .0001 each).
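The benchmarking protocol described above, posing each misinformation prompt once neutrally and once per fallacy framing, then computing the share of responses that accept the fabricated claim, can be sketched in outline. This is a hypothetical illustration only: the framings list, the `ask_model` and `accepts_claim` callables, and the prompt wrapping are stand-ins, not the study's actual data, judge, or code.

```python
from typing import Optional

# Illustrative subset of fallacy framings named in the study.
FALLACY_FRAMINGS = [
    "appeal to authority", "appeal to popularity", "appeal to emotion",
    "circular reasoning", "false dilemma", "hasty generalization",
    "slippery slope",
]

def frame(prompt: str, fallacy: Optional[str]) -> str:
    """Wrap a base misinformation prompt in an optional fallacy framing.

    A None fallacy yields the neutral base form; the bracketed prefix is
    a placeholder for however the real study rephrased each prompt.
    """
    if fallacy is None:
        return prompt
    return f"[{fallacy} framing] {prompt}"

def susceptibility_rates(prompts, ask_model, accepts_claim):
    """Fraction of responses that accept the fabricated claim,
    broken out by framing (the None key is the neutral base prompt).

    ask_model: callable mapping a framed prompt to a model response.
    accepts_claim: callable judging whether a response endorses the
    misinformation (returns True) or rejects/flags it (returns False).
    """
    rates = {}
    for fallacy in [None, *FALLACY_FRAMINGS]:
        accepted = sum(
            accepts_claim(ask_model(frame(p, fallacy))) for p in prompts
        )
        rates[fallacy] = accepted / len(prompts)
    return rates
```

Comparing `rates[None]` against the per-fallacy entries mirrors the study's key contrast: a framing that lowers the rate below the neutral baseline (as appeal to popularity did) makes the model less susceptible, while one that raises it (as appeal to authority did) makes it more so.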
Real hospital notes demonstrated the highest susceptibility to the base prompts (46.1%), while social media–based misinformation led to the lowest susceptibility (8.9%).
“Hospitals and developers can use our data set as a stress test for medical AI,” added first study author Mahmud Omar, MD, Physician-Scientist and Consultant at the Windreich Department of Artificial Intelligence and Human Health. “Instead of assuming a model is safe, you can measure how often it passes on a lie, and whether that number falls in the next generation.”
The study authors also noted that GPT models were the least susceptible to misinformation and the best at detecting fallacies, while models such as Gemma-3-4B-it were far more susceptible (63.6%).
“AI has the potential to be a real help for clinicians and patients, offering faster insights and support,” said co-senior and co-corresponding author Girish N. Nadkarni, MD, MPH, Chair of the Windreich Department of Artificial Intelligence and Human Health, Director of the Hasso Plattner Institute for Digital Health, Irene and Dr. Arthur M. Fishberg Professor of Medicine at the Icahn School of Medicine at Mount Sinai, and Chief AI Officer of the Mount Sinai Health System. “But it needs built-in safeguards that check medical claims before they are presented as fact. Our study shows where these systems can still pass on false information, and points to ways we can strengthen them before they are embedded in care.”
DISCLOSURES: The study was supported by the Clinical and Translational Science Awards (CTSA) grant from the National Center for Advancing Translational Sciences and by the Office of Research Infrastructure of the National Institutes of Health. For full study author disclosures, visit thelancet.com.
ASCO AI in Oncology is published by Conexiant under a license arrangement with the American Society of Clinical Oncology, Inc. (ASCO®). The ideas and opinions expressed in ASCO AI in Oncology do not necessarily reflect those of Conexiant or ASCO. For more information, see Policies.
The performance of a convolutional neural network in determining differentiation levels of cutaneous squamous cell carcinomas was on par with that of experienced dermatologists, according to the results of a recent study published in JAAD International.
“This type of cancer, which is a result of mutations of the most common cell type in the top layer of the skin, is strongly linked to accumulated [ultraviolet] radiation over time. It develops in sun-exposed areas, often on skin already showing signs of sun damage, with rough scaly patches, uneven pigmentation, and decreased elasticity,” stated lead researcher Sam Polesie, MD, PhD, Associate Professor of Dermatology and Venereology at the University of Gothenburg and Practicing Dermatologist at Sahlgrenska University Hospital, both in Gothenburg, Sweden.