Artificial intelligence in mental health settings inherits human bias

JMIR Publications, May 27, 2026 (Reviewed)

A new viewpoint article published in JMIR Mental Health warns that artificial intelligence (AI) systems used in mental health settings may inherit and reinforce unreliable human input unless new safeguards are adopted. The paper, titled "When AI Colludes: Clinical Reliability of Training and Preference Data as a Trustworthy-AI Criterion," calls for the "clinical reliability" of training data to become a core standard for trustworthy AI.

The article explores how large language models, including AI chatbots, are trained using massive amounts of human-written text and feedback. According to author Dr. Hina Tahseen, current discussions about AI safety often focus on harms that arise after deployment, such as misleading advice or emotional dependency. Dr. Tahseen argues that a major issue may begin much earlier, during the collection of human-generated training and preference data.

The viewpoint introduces the psychiatric concept of "collusion," described as the uncritical acceptance of an unreliable account, as a new way to understand AI behavior. It suggests that AI systems can unintentionally reinforce distorted, inaccurate, or unhealthy information when they are trained to prioritize user approval or unverified human feedback.

"AI safety efforts have focused on what these systems say to users. The prior question is whether the human data they learned from was reliable in the first place. Psychiatry assesses this every day in clinical practice; that expertise should be part of how we build and govern AI systems, not an afterthought."

Dr. Hina Tahseen, author

Rather than focusing only on technical fixes, the viewpoint proposes that developers of mental health–related AI systems should include clinical expertise when designing training data, evaluating feedback, and monitoring systems after launch. Existing AI safety methods, such as refusal training, red-teaming, and content monitoring, already address parts of the problem, but they are not specifically designed to assess whether human self-reporting is clinically reliable.

Adding clinical reliability as an explicit AI trust criterion could strengthen safeguards for mental health technologies while helping researchers better understand how AI systems respond to vulnerable users.

Source:

JMIR Publications

Journal reference:

Tahseen, H. (2026). When AI Colludes: Clinical Reliability of Training and Preference Data as a Trustworthy-AI Criterion. JMIR Mental Health. DOI: 10.2196/96894. https://mental.jmir.org/2026/1/e96894
