Advanced AI models outperform pediatricians in diagnosing rare diseases

Artificial intelligence (AI) is increasingly being explored as a tool to support clinical decision-making, yet its real-world performance in pediatric diagnosis remains unclear. A Pediatric Investigation study using authentic clinical cases now reports that advanced AI models outperform clinicians in diagnostic accuracy, particularly for rare diseases, and that an estimated combined human-AI approach achieves the highest overall accuracy. The findings highlight the potential of AI as a complementary tool to improve diagnostic precision and patient outcomes.

Accurate diagnosis in pediatric care can be particularly challenging, especially when rare diseases present with subtle or overlapping symptoms. Early uncertainty in diagnosis may delay treatment and increase the risk of complications. While artificial intelligence (AI) has shown potential in healthcare, most previous studies have relied on simplified or curated cases rather than real-world clinical data. This leaves an important gap in understanding how large language models perform in everyday clinical settings, where decisions are often made with limited information.

Against this backdrop, a team of researchers led by Dr. Cristian Launes from Hospital Sant Joan de Déu in Barcelona, Spain, evaluated the performance of AI models using real pediatric clinical cases. The study, published in the journal Pediatric Investigation on 25 March 2026, compared four advanced language models with 78 pediatric clinicians across 50 cases, including both common conditions and rare diseases.

Dr. Launes is a Clinical Professor and pediatrician at Hospital Sant Joan de Déu, Barcelona. His expertise spans pediatric infectious diseases, with a particular focus on respiratory viral infections and pediatric epidemiology.

To reflect real clinical practice, the researchers used patient summaries based on the first 72 hours of presentation. Each case was assessed multiple times to examine both diagnostic accuracy and consistency. Performance was evaluated based on whether the correct diagnosis appeared as the top prediction or within the top five suggestions.
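The Top-1 / Top-5 scoring described above can be sketched in a few lines. This is an illustrative reconstruction, not the study's code; the case data and diagnosis names are invented for the example.

```python
# Minimal sketch of Top-1 / Top-5 diagnostic scoring.
# Cases and diagnoses below are illustrative, not from the study.

def top_k_hit(ranked_predictions, correct_diagnosis, k):
    """True if the correct diagnosis appears among the top k suggestions."""
    return correct_diagnosis in ranked_predictions[:k]

cases = [
    # (ranked differential diagnosis, ground-truth diagnosis)
    (["bronchiolitis", "pneumonia", "asthma"], "bronchiolitis"),
    (["gastroenteritis", "appendicitis", "intussusception"], "intussusception"),
]

top1 = sum(top_k_hit(preds, truth, 1) for preds, truth in cases) / len(cases)
top5 = sum(top_k_hit(preds, truth, 5) for preds, truth in cases) / len(cases)
print(top1, top5)  # fraction of cases solved at Top-1 and at Top-5
```

Scoring at both cutoffs separates "named the diagnosis outright" from "included it in the differential", which matters when AI output is used to broaden a clinician's hypothesis list.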

The results showed that the most advanced AI models achieved higher diagnostic accuracy than clinicians overall. This advantage was particularly evident in rare disease cases, where AI systems were more likely to identify correct diagnoses that clinicians initially missed. However, clinicians demonstrated strengths in certain complex or context-dependent scenarios, highlighting differences in how humans and AI approach diagnostic reasoning.

Importantly, the study did not evaluate a real-time, interactive "human-plus-AI" diagnostic workflow. Instead, the researchers estimated potential complementarity using a prespecified "union" approach, asking whether the correct diagnosis appeared in the Top-5 list of either clinicians or model runs. Under this estimate, the best-performing pairing reached 94.3% Top-5 union accuracy, suggesting that clinicians and AI may contribute different correct hypotheses in difficult cases, particularly for rare diseases. "Our results suggest that AI can be evaluated as a clinician-supervised second opinion, especially in difficult cases where rare diseases are involved," said Dr. Launes. "Rather than replacing clinicians, these tools may help broaden the differential diagnosis and reduce the likelihood of missed diagnoses - as long as outputs are interpreted critically and within robust oversight frameworks."
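The "union" estimate above can be sketched as follows: a case counts as solved if the correct diagnosis appears in either the clinician's or the model's Top-5 list. Again, the cases are invented for illustration and the 94.3% figure comes from the paper, not from this sketch.

```python
# Sketch of the prespecified Top-5 "union" complementarity estimate.
# Example cases are hypothetical, not taken from the study.

def union_top5_hit(clinician_top5, model_top5, correct):
    """Counted correct if EITHER Top-5 list contains the true diagnosis."""
    return correct in clinician_top5 or correct in model_top5

cases = [
    # (clinician Top-5, model Top-5, ground-truth diagnosis)
    (["pneumonia", "bronchiolitis"], ["Kawasaki disease", "sepsis"], "Kawasaki disease"),
    (["appendicitis"], ["gastroenteritis"], "intussusception"),
]

union_acc = sum(union_top5_hit(c, m, t) for c, m, t in cases) / len(cases)
print(union_acc)
```

Note that this is an upper-bound style estimate of complementarity: it assumes a supervising clinician would recognize the correct hypothesis when either source proposes it, which a real-time interactive workflow would still need to confirm.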

From a governance perspective, medical diagnostic decision-support systems are generally considered high-risk applications under the European Union AI Act. This classification implies expectations around risk management, data governance, transparency, human oversight, and cybersecurity. The authors emphasize that any clinical use should remain advisory, with clear accountability, monitoring, and safeguards to address variability and the risk of misleading outputs.

The researchers also observed that additional clinical information improved diagnostic performance for both groups. When more detailed data, such as laboratory or imaging results, were included, accuracy increased. This finding underscores the importance of continuous clinical assessment and suggests that AI systems may be most effective when integrated into evolving, information-rich workflows.

"The interaction between data quality and diagnostic performance is critical. AI systems perform best when they are part of a continuous clinical process, where clinicians iteratively gather, verify, and curate the evolving clinical picture to feed the model, with ongoing reassessment and human oversight - not a one-time input-output tool."

Dr. Cristian Launes from Hospital Sant Joan de Déu, Barcelona

These findings highlight the potential of AI-assisted tools to support earlier and more accurate diagnosis, particularly for rare diseases where expertise may be limited. In the longer term, integrating AI into clinical workflows could enable more collaborative and data-driven decision-making, while also encouraging closer collaboration between clinicians, engineers, and policymakers.

Overall, this study demonstrates that advanced AI models can outperform clinicians in certain pediatric diagnostic tasks, particularly for rare conditions, with the greatest estimated benefit when AI outputs are combined with human expertise. Although challenges such as variability in responses and the need for appropriate oversight remain, the findings point to a promising role for AI as a supportive tool in pediatric healthcare.

Journal reference:

Launes, C., et al. (2026). Large-language-models for pediatric diagnosis: Performance evaluation using real-world clinical notes from common and rare cases. Pediatric Investigation. DOI: 10.1002/ped4.70053. https://onlinelibrary.wiley.com/doi/10.1002/ped4.70053
