ChatGPT fills gaps like a confabulator, not a hallucinating mind

From ChatGPT’s confident falsehoods to Whisper’s phantom words, the article argues that AI errors may offer a powerful mirror for how predictive systems, human or artificial, construct meaning when information is incomplete.

Perspective: Does ChatGPT need a psychiatrist? Similarities between human psychopathology and errors in large language models. Image Credit: Summit Art Creations / Shutterstock

A recent Perspective article published in the journal NPP–Digital Psychiatry and Neuroscience compared errors in artificial intelligence (AI) systems with confabulations and hallucinations observed in psychiatry.

Large language models (LLMs) and speech recognition tools have gained widespread popularity, transforming education, business, healthcare, and research. However, significant concerns remain over the tendency of these AI systems to produce misinformation. For instance, ChatGPT sometimes generates text that is factually incorrect but appears plausible.

Likewise, automatic speech recognition (ASR) tools, such as Whisper, can produce severe transcription errors under some conditions, yielding output that is nonsensical or unfaithful to the source audio. Such errors are often called hallucinations, although the term needs clarification. In humans, hallucinations are false sensory experiences that occur in the absence of external stimuli. LLM errors do not involve perception and are better described as confabulations.

In contrast, ASR errors are structurally different from LLM errors and may be more akin to hallucinations in a functional, rather than experiential, sense. Although LLM errors may appear at first glance to be software bugs, they resemble psychiatric phenomena in several ways. Outputs of both artificial and biological systems can appear coherent, context-sensitive, and confident while being detached from reality. In the article, the authors explored the parallels between human psychiatric symptoms and AI model output errors.

Confabulations in AI Systems and Humans

Confabulations are false memories produced to fill memory gaps, with no intent to deceive. These fabrications appear coherent in the context of the individual’s life. They are most common in conditions associated with memory impairment, e.g., Korsakoff’s syndrome and dementia. Confabulations range from minor inaccuracies to highly detailed fabrications, highlighting the brain’s role in memory construction and reconstruction.

LLMs, such as ChatGPT, exhibit similar flaws. Under certain conditions, ChatGPT generates inaccurate or even nonsensical information. When information is missing, it produces incorrect responses that appear plausible; that is, it fills functional “memory gaps” that stem from training data, parameter encoding, or context limitations rather than from human-like episodic memory gaps. Vague, ambiguous, or broad prompts encourage AI models to fill in missing information based on assumptions derived from their training data.

Tricky questions and multi-step reasoning tasks can also lead the model to generate logically consistent but incorrect responses. Notably, both LLM and human confabulations are context-dependent. In humans, emotional states, strong beliefs, or leading questions increase the odds of filling memory gaps with inaccurate but plausible details. Similarly, GPT models can confidently generate inaccurate responses when the prompt itself embeds assumptions.

The similarities between LLM and human confabulation lie in gap-filling, observable behavior, and coherence-seeking, but the mechanisms remain fundamentally distinct. LLMs lack self-modeling consciousness, executive control, or episodic memory. While earlier LLMs operated within fixed parameters and lacked persistent memory across interactions, newer models store limited user-controlled information across sessions but do not incorporate it into a continuously self-updating model.

Hallucinations in Humans and ASR Systems

ASR and human hallucinations show some superficial similarities. In Whisper, more than one-third of hallucinations are explicitly harmful, involving content such as demographic stereotypes, physical violence or death, and sexual innuendo. Similarly, human auditory verbal hallucinations (AVHs) often feature threatening and negative content. Voices in both non-clinical and clinical groups often deliver threats of violence and verbal abuse.

Further, both human and ASR hallucinations show repetition. Whisper hallucinations can loop phrases or words endlessly, much as human AVHs repeat themes and wording. At a behavioral level, both ASR systems and humans appear particularly vulnerable to hallucination-like errors or experiences when perceptual signals are degraded or weak. Human AVHs involve aberrant corollary discharge mechanisms, cortical-subcortical loops, and predictive processing of sensory input.

On the other hand, Whisper hallucinations reflect the completion of probabilistic patterns applied to acoustic features. These systems do not hear voices but compute probabilities over acoustic-to-text mappings. Repetitive or harmful content does not arise from misattributed inner speech, affective states, or threat processing, but from statistical regularities in the training data.
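The looping behavior described above can be screened for after transcription. The following is a minimal Python sketch of such a heuristic, assuming the transcript is already available as plain text. The function names (`longest_repeat_run`, `looks_like_loop`) and the threshold of three repeats are hypothetical choices for illustration; this is not how Whisper works internally, nor a method described in the Perspective article.

```python
def longest_repeat_run(words, max_ngram=4):
    """Return (ngram, repeats) for the n-gram repeated most often back to back.

    Only immediately consecutive repetitions are counted, which is the
    pattern seen in looping transcripts ("thank you thank you thank you").
    """
    best = ((), 1)
    for n in range(1, max_ngram + 1):
        i = 0
        # A repeat needs at least two full n-grams starting at position i.
        while i + 2 * n <= len(words):
            gram = tuple(words[i:i + n])
            repeats = 1
            j = i + n
            while tuple(words[j:j + n]) == gram:
                repeats += 1
                j += n
            if repeats > best[1]:
                best = (gram, repeats)
            # Skip past a detected run; otherwise advance by one word.
            i = j if repeats > 1 else i + 1
    return best


def looks_like_loop(transcript, min_repeats=3):
    """Flag a transcript whose longest back-to-back repetition reaches min_repeats.

    The threshold is an illustrative assumption, not an empirically tuned value.
    """
    _, repeats = longest_repeat_run(transcript.lower().split())
    return repeats >= min_repeats
```

A flagged transcript is only a candidate for human review: ordinary speech also contains legitimate repetition, so a check like this cannot by itself tell a looping error from a faithful transcription.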

Mitigation Strategies

In psychiatry, cognitive-behavioral therapy is used to improve the ability of individuals to critically assess the validity of their hallucinations. Similar mechanisms could be integrated into AI systems. Plausibility assessment can be operationalized using uncertainty estimation methods. Moreover, internal consistency checks can be implemented to require the model to re-assess its output.
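To make the idea concrete, here is a minimal Python sketch of a plausibility check built on uncertainty estimation. It assumes the same question has already been put to a model several times and that the answers are available as plain strings; no real model API is called. The function names (`normalize`, `answer_entropy`, `consistency_check`) and the 1-bit threshold are illustrative assumptions, not taken from the Perspective article, and exact-match grouping is only a crude stand-in for grouping answers by meaning.

```python
import math
from collections import Counter


def normalize(answer):
    """Crude stand-in for semantic grouping: lowercase, drop punctuation, trim."""
    kept = "".join(ch for ch in answer.lower() if ch.isalnum() or ch.isspace())
    return kept.strip()


def answer_entropy(samples):
    """Shannon entropy (in bits) of the distribution over distinct answers."""
    counts = Counter(normalize(s) for s in samples)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def consistency_check(samples, max_entropy_bits=1.0):
    """Summarize agreement among repeated answers and flag uncertain cases.

    max_entropy_bits is a hypothetical threshold chosen for illustration.
    """
    if not samples:
        raise ValueError("at least one sampled answer is required")
    counts = Counter(normalize(s) for s in samples)
    top_answer, top_count = counts.most_common(1)[0]
    entropy = answer_entropy(samples)
    return {
        "answer": top_answer,
        "agreement": top_count / len(samples),
        "entropy_bits": entropy,
        "flag_for_review": entropy > max_entropy_bits,
    }
```

With four samples of which three agree, the agreement is 0.75 and the entropy is about 0.81 bits, so the answer passes under this threshold; four mutually different answers give 2 bits and are flagged. High agreement does not guarantee correctness, since a model can be consistently wrong, so a check of this kind complements rather than replaces external verification.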

Increasing resource allocation for LLMs, including multi-pass verification or additional processing steps, can reduce error rates by enabling more thorough, slower internal assessment. Other mitigation strategies include retrieval-augmented generation, cross-model verification, multi-agent debate, semantic entropy methods, prompt design, and temperature tuning. A similar principle may apply to humans: error monitoring and cognitive performance rely on sufficient neurobiological resources, which are restored by processes such as sleep. Strengthening systemic capacity, rather than targeting individual symptoms, may therefore decrease susceptibility to errors.
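Of the strategies listed above, temperature tuning is the simplest to illustrate. The Python sketch below shows the standard softmax-with-temperature calculation that language models commonly use to turn raw scores (logits) into next-token probabilities. The function name and the example logits are made up for illustration; real models work over vocabularies of many thousands of tokens.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores into probabilities, scaled by a sampling temperature.

    Lower temperatures concentrate probability on the highest-scoring option;
    higher temperatures flatten the distribution and make unlikely options
    more probable.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
example_logits = [2.0, 1.0, 0.1]
cautious = softmax_with_temperature(example_logits, temperature=0.5)
exploratory = softmax_with_temperature(example_logits, temperature=2.0)
```

Here `cautious` places more of its probability on the first candidate than `exploratory` does. A lower temperature makes output more predictable, but it does not make the top-ranked option true, which is why the article lists temperature tuning alongside verification-based strategies rather than in place of them.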

Concluding Remarks

Collectively, confabulation-like and hallucination-like errors are well-recognized problems in LLMs and LLM-dependent software, underscoring the need for continued human oversight. These errors superficially resemble pathological symptoms reported in neurology and psychiatry. Errors in human minds and machines are features of systems built for prediction and explanation. As such, comparing confabulations and hallucinations in LLMs and humans can improve understanding of both, provided the parallels are treated as provisional, model-dependent, and mechanistically limited.

Journal reference:
  • de Boer JN, Ciampelli S, Hailemariam AK, Koops S, Sommer IEC (2026). Does ChatGPT need a psychiatrist? Similarities between human psychopathology and errors in large language models. NPP–Digital Psychiatry and Neuroscience, 4(1), 12. DOI: 10.1038/s44277-026-00064-1, https://www.nature.com/articles/s44277-026-00064-1

Written by Tarun Sai Lomte

Tarun is a writer based in Hyderabad, India. He has a Master’s degree in Biotechnology from the University of Hyderabad and is enthusiastic about scientific research. He enjoys reading research papers and literature reviews and is passionate about writing.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Sai Lomte, Tarun. (2026, June 11). ChatGPT fills gaps like a confabulator, not a hallucinating mind. News-Medical. Retrieved on June 11, 2026 from https://www.news-medical.net/news/20260611/ChatGPT-fills-gaps-like-a-confabulator-not-a-hallucinating-mind.aspx.

  • MLA

    Sai Lomte, Tarun. "ChatGPT fills gaps like a confabulator, not a hallucinating mind". News-Medical. 11 June 2026. <https://www.news-medical.net/news/20260611/ChatGPT-fills-gaps-like-a-confabulator-not-a-hallucinating-mind.aspx>.

  • Chicago

    Sai Lomte, Tarun. "ChatGPT fills gaps like a confabulator, not a hallucinating mind". News-Medical. https://www.news-medical.net/news/20260611/ChatGPT-fills-gaps-like-a-confabulator-not-a-hallucinating-mind.aspx. (accessed June 11, 2026).

  • Harvard

    Sai Lomte, Tarun. 2026. ChatGPT fills gaps like a confabulator, not a hallucinating mind. News-Medical, viewed 11 June 2026, https://www.news-medical.net/news/20260611/ChatGPT-fills-gaps-like-a-confabulator-not-a-hallucinating-mind.aspx.
