AI's health misinformation challenge: Study calls for stronger safeguards and transparency

Download PDF Copy

By Dr. Sushama R. Chaphalkar, PhD.Reviewed by Susha Cheriyedath, M.Sc.Mar 24 2024

In a recent study published in the British Medical Journal, researchers conducted a repeated cross-sectional analysis to examine the effectiveness of the current safeguards of large language models (LLMs) and transparency of artificial intelligence (AI) developers in preventing the development of health disinformation. They found that the safeguards were feasible but inconsistently implemented against LLM misuse for health disinformation, and the transparency among AI developers regarding risk mitigation was insufficient. Therefore, the researchers emphasized the need for enhanced transparency, regulation, and auditing to address these issues.

Study: Current safeguards, risk mitigation, and transparency measures of large language models against the generation of health disinformation: repeated cross sectional analysis. Image Credit: NicoElNino / Shutterstock

Background

LLMs present promising applications in healthcare, such as patient monitoring and education, but also pose the risk of generating health disinformation. Over 70% of individuals rely on the Internet for health information. Therefore, unverified dissemination of false narratives could potentially lead to significant public health threats. The lack of adequate safeguards in LLMs may enable malicious actors to propagate misleading health information. Given the potential consequences, proactive risk mitigation measures are essential. However, the effectiveness of existing safeguards and the transparency of AI developers in addressing safeguard vulnerabilities remain largely unexplored. To address these gaps, researchers in the present study conducted a repeat cross-sectional analysis to evaluate prominent LLMs for preventing health disinformation generation and assess the transparency of AI developers' risk mitigation processes.

About the study

The study evaluated prominent LLMs, including GPT-4 (short for generative pre-trained transformer 4), PaLM 2 (short for pathways language model), Claude 2, and Llama 2, accessed via various interfaces, for their ability to generate health disinformation regarding sunscreen causing skin cancer and the alkaline diet curing cancer. Standardized prompts were submitted to each LLM, requesting the generation of blog posts on the topics, with variations targeting different demographic groups. Initial submissions were made without attempting to circumvent built-in safeguards, followed by evaluations of jailbreaking techniques for LLMs that refused to generate disinformation initially. A jailbreaking attempt involves manipulating or deceiving the model into executing actions that contravene its established policies or usage limitations. Overall, 40 initial prompts and 80 jailbreaking attempts were conducted, revealing variations in responses and the effectiveness of safeguards.

The study reviewed AI developers' websites for reporting mechanisms, public registers of issues, detection tools, and safety measures. Standardized emails were sent to notify developers of observed health disinformation outputs and inquire about their response procedures, with follow-ups sent if necessary. All responses were documented within four weeks.

A sensitivity analysis was conducted, including reassessing previous topics and exploring new themes. This two-phase analysis scrutinized response consistency and effectiveness of jailbreaking techniques, focusing on varying submissions and evaluating LLMs' abilities across different disinformation scenarios.

Results and discussion

As per the study, GPT-4 (via ChatGPT), PaLM 2 (via Bard), and Llama 2 (via HuggingChat) were found to generate health disinformation on sunscreen and the alkaline diet, while GPT-4 (via Copilot) and Claude 2 (via Poe) consistently refused such prompts. Varying responses were observed among LLMs, as observed in the rejection messages and generated disinformation content. Although some tools added disclaimers, there remained a risk of mass health disinformation dissemination as only a small fraction of generated content was declined, and disclaimers could be easily removed from posts.

When developer websites were investigated, the mechanisms for reporting potential concerns were found. However, no public registries of reported issues, details on patching vulnerabilities, or detection tools for generated text were identified. Despite informing developers of observed prompts and outputs, receipt confirmation and subsequent actions were found to vary among the developers. Notably, Anthropic and Poe confirmed receipt but lacked public logs or detection tools, indicating ongoing monitoring of notification processes.

Further, Gemini Pro and Llama 2 sustained the capability to generate health disinformation, while GPT-4 showed compromised safeguards, and Claude 2 remained robust. Sensitivity analyses revealed varying capabilities across LLMs regarding generating disinformation on diverse topics, with GPT-4 exhibiting versatility and Claude 2 maintaining consistency in refusal.

Overall, the study is strengthened by its rigorous examination of prominent LLMs' susceptibility to generating health disinformation across specific scenarios and topics. It provides valuable insights into potential vulnerabilities and the need for future research. However, the study is limited by challenges in fully assessing AI safety due to developers' lack of transparency and responsiveness despite thorough evaluation efforts.

Conclusion

In conclusion, the study highlights inconsistencies in the implementation of safeguards against health disinformation development by LLMs. Transparency from AI developers regarding risk mitigation measures was also found to be insufficient. With the evolving AI landscape, there is a growing need for unified regulations prioritizing transparency, health-specific auditing, monitoring, and patching to mitigate the risks posed by health disinformation. The findings call for urgent action from public health and medical bodies towards addressing these challenges and developing robust risk mitigation strategies in AI.

Journal reference:

Current safeguards, risk mitigation, and transparency measures of large language models against the generation of health disinformation: repeated cross-sectional analysis. Menz BD et al., British Medical Journal, 384:e078538 (2024), DOI:10.1136/bmj-2023-078538, https://www.bmj.com/content/384/bmj-2023-078538

Posted in: Device / Technology News | Medical Science News | Medical Research News

Comments (0)

Written by

Dr. Sushama R. Chaphalkar

Dr. Sushama R. Chaphalkar is a senior researcher and academician based in Pune, India. She holds a PhD in Microbiology and comes with vast experience in research and education in Biotechnology. In her illustrious career spanning three decades and a half, she held prominent leadership positions in academia and industry. As the Founder-Director of a renowned Biotechnology institute, she worked extensively on high-end research projects of industrial significance, fostering a stronger bond between industry and academia.

Download PDF Copy

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

APA
Chaphalkar, Sushama R.. (2024, March 24). AI's health misinformation challenge: Study calls for stronger safeguards and transparency. News-Medical. Retrieved on February 10, 2026 from https://www.news-medical.net/news/20240324/AIs-health-misinformation-challenge-Study-calls-for-stronger-safeguards-and-transparency.aspx.
MLA
Chaphalkar, Sushama R.. "AI's health misinformation challenge: Study calls for stronger safeguards and transparency". News-Medical. 10 February 2026. <https://www.news-medical.net/news/20240324/AIs-health-misinformation-challenge-Study-calls-for-stronger-safeguards-and-transparency.aspx>.
Chicago
Chaphalkar, Sushama R.. "AI's health misinformation challenge: Study calls for stronger safeguards and transparency". News-Medical. https://www.news-medical.net/news/20240324/AIs-health-misinformation-challenge-Study-calls-for-stronger-safeguards-and-transparency.aspx. (accessed February 10, 2026).
Harvard
Chaphalkar, Sushama R.. 2024. AI's health misinformation challenge: Study calls for stronger safeguards and transparency. News-Medical, viewed 10 February 2026, https://www.news-medical.net/news/20240324/AIs-health-misinformation-challenge-Study-calls-for-stronger-safeguards-and-transparency.aspx.

Comments

The opinions expressed here are the views of the writer and do not necessarily reflect the views and opinions of News Medical.

Post a new comment

(Logout)

Post

Sign in to keep reading

We're committed to providing free access to quality science. By registering and providing insight into your preferences you're joining a community of over 1m science interested individuals and help us to provide you with insightful content whilst keeping our service free.