Artificial intelligence in mental health settings inherits human bias

A new viewpoint article published in JMIR Mental Health warns that artificial intelligence (AI) systems used in mental health settings may inherit and reinforce unreliable human input unless new safeguards are adopted. The paper, titled "When AI Colludes: Clinical Reliability of Training and Preference Data as a Trustworthy-AI Criterion," calls for the "clinical reliability" of training data to become a core standard for trustworthy AI.

The article explores how large language models, including AI chatbots, are trained using massive amounts of human-written text and feedback. According to author Dr. Hina Tahseen, current discussions about AI safety often focus on harms that happen after deployment, such as misleading advice or emotional dependency. Dr. Tahseen argues that a major issue may begin much earlier, specifically during the collection of human-generated training and preference data.

The viewpoint introduces the psychiatric concept of "collusion," described as the uncritical acceptance of an unreliable account, as a new way to understand AI behavior. It suggests that AI systems can unintentionally reinforce distorted, inaccurate, or unhealthy information when they are trained to prioritize user approval or unverified human feedback.

"AI safety efforts have focused on what these systems say to users. The prior question is whether the human data they learned from was reliable in the first place. Psychiatry assesses this every day in clinical practice; that expertise should be part of how we build and govern AI systems, not an afterthought."

Dr. Hina Tahseen, author

Rather than focusing only on technical fixes, the viewpoint proposes that developers of mental health–related AI systems should include clinical expertise when designing training data, evaluating feedback, and monitoring systems after launch. Existing AI safety methods, such as refusal training, red-teaming, and content monitoring, already address parts of the problem, but they are not specifically designed to assess whether human self-reporting is clinically reliable.

Adding clinical reliability as an explicit AI trust criterion could strengthen safeguards for mental health technologies while helping researchers better understand how AI systems respond to vulnerable users.

Journal reference:

Tahseen, H. (2026). When AI Colludes: Clinical Reliability of Training and Preference Data as a Trustworthy-AI Criterion. JMIR Mental Health. DOI: 10.2196/96894. https://mental.jmir.org/2026/1/e96894
