Can one AI analyze all medical scans? MedVersa shows promise across multiple imaging tasks

A massive new multimodal AI system trained on tens of millions of medical images could help unify fragmented radiology tools and assist doctors in interpreting scans and generating reports more efficiently.

Study: MedVersa: A Generalist Foundation Model for Diverse Medical Imaging Tasks. Image Credit: Thitisan / Shutterstock

In a recent study published in the journal NEJM AI, researchers introduced “MedVersa”, a generalist artificial intelligence (AI) model capable of ingesting and interpreting a wide range of medical imaging modalities and task types. Unlike traditional AI models trained for specific, limited tasks, MedVersa was built on tens of millions of medical imaging instances, allowing it to detect pathologies and generate reports within a unified analytical framework.

Encouragingly, when MedVersa’s performance was compared with that of a human radiologist in a blinded evaluation of chest radiograph reports, the model produced reports that were judged clinically comparable to human-written reports in many cases, particularly for scans with normal findings, while significantly reducing the time human radiologists spend documenting their findings. Together, these results position MedVersa as a promising step toward a new generation of unified, multimodal foundation models that may help consolidate the fragmented ecosystem of AI tools currently used in clinical care settings.

Background: Fragmentation of Medical Artificial Intelligence Tools

Recent advances in computational power and model design have allowed several artificial intelligence (AI) tools to be approved for medical use, but their deployment remains fragmented. A model trained on X-ray datasets, for example, can accurately detect pneumonia in chest X-rays but cannot draw on MRI or ultrasound data for a holistic patient evaluation.

These "specialist" models often struggle to adapt to complex clinical workflows in which a patient’s diagnosis involves multiple data types. Computational biologists sought to address this limitation by introducing the concept of Generalist Medical Artificial Intelligence (GMAI).

Their goal was to create a "foundation model" (similar to the technology underlying ChatGPT, Google Gemini, and other large language models [LLMs]) that can process multimodal inputs and outputs. Unfortunately, previous attempts to realize this concept largely focused on text-based inputs and could not handle the complex visual tasks indispensable in radiology.

Development of the MedVersa Multimodal AI Model

The present study aimed to address this functional gap by engineering “MedVersa,” a radiology-focused generalist AI model capable of ingesting, annotating, diagnosing, reporting, and documenting multimodal clinical imaging data. The model was trained using “MedInterp”, a massive dataset aggregating 91 public datasets that together comprised over 29 million medical instances, including images, bounding-box annotations, segmentation masks, captions, and other vision–language supervision signals used across diverse imaging tasks.

The model features a unique architecture that uses a trained LLM as an “orchestrator”, evaluating users' requirements (e.g., "Where is the patient’s tumor?") and dynamically selecting appropriate internal vision modules within the MedVersa framework for request execution. Unlike previous GMAIs, which were primarily text-based, MedVersa was designed to either generate a text response or deploy specialized "vision modules" for object detection or segmentation.
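The orchestration pattern described above can be sketched in miniature. The module names, the keyword-based router, and the return values below are illustrative assumptions only; MedVersa's actual orchestrator is a trained LLM that selects modules from learned context, not a keyword matcher.

```python
# Minimal sketch of an orchestrator dispatch loop. All function names and
# return values are hypothetical stand-ins, not MedVersa's real API.

def detect_module(image):
    """Stand-in vision module: a real one would return bounding boxes."""
    return {"task": "detection", "boxes": [(40, 60, 120, 140)]}

def segment_module(image):
    """Stand-in vision module: a real one would return a pixel mask."""
    return {"task": "segmentation", "mask": "binary-mask placeholder"}

def text_module(image, query):
    """Stand-in generator: a real one would draft free-text findings."""
    return {"task": "report", "text": "No acute cardiopulmonary abnormality."}

def orchestrate(query, image):
    """Route a user request to a vision module or to text generation.

    A trained orchestrator LLM would make this decision from context;
    here a simple keyword heuristic stands in for that decision.
    """
    q = query.lower()
    if any(w in q for w in ("where", "locate", "find")):
        return detect_module(image)
    if any(w in q for w in ("segment", "outline", "contour")):
        return segment_module(image)
    return text_module(image, query)

result = orchestrate("Where is the patient's tumor?", image=None)
print(result["task"])  # localization queries route to the detection module
```

The design point the sketch illustrates is that routing happens per request: the same model entry point can answer a question in text or hand off to a specialized vision head, which is what distinguishes MedVersa from earlier text-only GMAI attempts.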

MedVersa can consequently process inputs as diverse as 2D X-rays, 3D CT and MRI scans, and patients’ clinical history text simultaneously. Following model training, MedVersa’s performance was validated across nine distinct imaging tasks against two types of comparators: (1) approved specialist AI models and (2) board-certified radiologists (n = 10).

Evaluation Framework and Comparative Testing

Performance evaluation required the expert (an AI model or a human radiologist) to review chest X-ray reports generated by humans, GPT-4o, and MedVersa. Crucially, experts were blinded to the data source. Performance was scored on the clinical accuracy of the output and on evaluation efficiency (time taken to complete the evaluation and generate a report).

Study Findings: Performance Across Imaging Tasks

Study findings revealed that MedVersa’s GMAI architecture was competitive with and frequently exceeded traditional “gold standard” specialist models across many object-detection and segmentation evaluation metrics.

For report generation, MedVersa achieved a BLEU-4 score (higher is better; BLEU-4 measures overlap between generated and reference text in word sequences of up to four words) of 17.8, compared with MAIRA’s 14.2, BiomedGPT’s 12.0, and Med-PaLM M’s 11.5. On RadCliQ (lower is better; it measures deviation from human clinical reporting), MedVersa scored 2.71 versus MAIRA’s 3.10 and BiomedGPT’s 3.25. Med-PaLM M’s slightly better RadCliQ score (2.67) was statistically indistinguishable from MedVersa’s.
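To make the BLEU-4 metric concrete, here is a minimal single-sentence sketch of its core computation (clipped n-gram precision combined with a brevity penalty). The study itself would use a standard corpus-level implementation; this stripped-down version omits smoothing and multi-reference handling.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-word sequences in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Unsmoothed single-sentence BLEU-4 in [0, 1].

    Geometric mean of clipped 1- to 4-gram precisions, scaled by a
    brevity penalty that discourages overly short candidates.
    """
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipping: each candidate n-gram is credited at most as many
        # times as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(1, len(cand) - n + 1)
        if overlap == 0:
            return 0.0  # any zero precision zeroes the geometric mean
        log_precisions.append(math.log(overlap / total))
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(1, len(cand)))
    return bp * math.exp(sum(log_precisions) / max_n)
```

A report identical to its reference scores 1.0; scores reported in studies such as this one are conventionally multiplied by 100, so MedVersa's 17.8 corresponds to a raw score of 0.178.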

Comparison With Human Radiologist Reporting

When compared with human experts, researchers found that MedVersa’s reports were clinically comparable to human-written reports in 64% of cases. For scans with normal findings, this equivalence increased to 91%. However, for scans with abnormal findings involving more complex pathology, equivalence was substantially lower, and human-written reports were more often preferred by reviewing radiologists.

Researchers also demonstrated that using MedVersa as an assistant enabled doctors to complete report-drafting workflows more quickly. It reduced report-writing time and, crucially, resulted in fewer "urgent" discrepancies (errors requiring immediate attention) than reports drafted by GPT-4o (a 20% reduction in the 5-to-10-minute reporting interval).

Conclusions: Toward Unified Clinical AI Assistants

The present study positions MedVersa as an important step toward a unified clinical assistant, in contrast to the traditionally fragmented landscape of AI tools. Its architecture, which leverages an LLM to orchestrate specialized vision tools, enabled the model to achieve performance competitive with or exceeding specialized AI models across several tasks while significantly streamlining and accelerating expert human radiologists’ workflows.

However, the study emphasizes that while MedVersa excelled at routine cases, board-certified radiologists remain preferred for complex, abnormal cases involving intricate pathologies, underscoring the importance of expert supervision. The authors also note that broader generalizability across imaging modalities remains an ongoing challenge because several non–chest X-ray datasets in the study were dominated by segmentation tasks rather than full diagnostic interpretation.

Consequently, while the present study validates MedVersa as a powerful proof-of-concept, future GMAI models should be trained with expanded datasets that include more modalities (e.g., genetic information and electronic health records [EHRs]) to fully realize the potential of AI-assisted, human expert-mediated patient care.

Journal reference:
  • Zhou, H.-Y., Acosta, J. N., Adithan, S., Datta, S., Topol, E. J., & Rajpurkar, P. (2026). MedVersa: A Generalist Foundation Model for Diverse Medical Imaging Tasks. NEJM AI. DOI: 10.1056/AIoa2500595. https://ai.nejm.org/doi/full/10.1056/AIoa2500595
Written by

Hugo Francisco de Souza


Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Francisco de Souza, Hugo. (2026, March 08). Can one AI analyze all medical scans? MedVersa shows promise across multiple imaging tasks. News-Medical. Retrieved on March 09, 2026 from https://www.news-medical.net/news/20260308/Can-one-AI-analyze-all-medical-scans-MedVersa-shows-promise-across-multiple-imaging-tasks.aspx.

