Artificial intelligence (AI) has already changed many aspects of medical imaging, but traditional ophthalmic AI systems are often built for a single disease, dataset, or task. That narrow design can limit generalizability across hospitals, imaging devices, patient populations, and disease categories. Ophthalmology is well suited to the next stage of AI development because it depends heavily on standardized, image-rich examinations such as fundus imaging and optical coherence tomography (OCT). Foundation models may help move the field beyond narrow task-specific systems by learning reusable representations from large and diverse datasets. At the same time, more evidence is still needed on how these models can be integrated into ophthalmology safely, effectively, and fairly.
Researchers from the Eye Center of the Second Affiliated Hospital, Zhejiang University School of Medicine/Zhejiang University Eye Hospital, together with Professor Andrzej Grzybowski of Poland and collaborators from China, the United States, the United Kingdom, Australia, Singapore, Poland, and Hong Kong, published a review in Advances in Ophthalmology Practice and Research on October 24, 2025 (DOI: 10.1016/j.aopr.2025.10.004). The article systematically examines vision and vision-language foundation models in ophthalmology, with particular attention to diagnostic performance, clinical potential, interpretability, fairness, deployment barriers, and future directions.
Following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, the authors searched PubMed, Web of Science, Scopus, and Google Scholar for studies published between January 2020 and July 2025. Ten studies met the inclusion criteria, covering representative ophthalmic foundation models such as RETFound, FLAIR, VisionFM, EyeCLIP, FMUE, MetaGP, MINIM, RETFound-DE, RetiZero, and OSPM. Retinal disease diagnosis was the most common application area, particularly diabetic retinopathy (DR), age-related macular degeneration (AMD), and diabetic macular edema (DME). RETFound achieved an area under the curve (AUC) of 0.94 for DR detection on the EyePACS dataset, while VisionFM reached an AUC of 0.974 for AMD in external validation. In glaucoma, RETFound-DE achieved an AUC of 0.902 on REFUGE-2, and EyeCLIP showed promising performance across several external datasets. For ocular surface tumors (OSTs), OSPM achieved AUC values of about 0.986 to 0.993. The review also highlighted RetiZero's ability to recognize more than 400 rare fundus diseases, with a top-five accuracy of 75.6%. Several models also showed few-shot and zero-shot learning capacity, suggesting that they may adapt to new diagnostic tasks with limited labeled data.
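For readers less familiar with the two metrics quoted above, the following is a minimal sketch of how AUC and top-k accuracy are typically computed. It is not taken from the review or from any of the models it covers; the function names and the toy labels and scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case, with ties counted as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def top_k_accuracy(true_classes: Sequence[int],
                   class_scores: Sequence[Sequence[float]],
                   k: int = 5) -> float:
    """Fraction of cases whose true class is among the k highest-scoring classes."""
    hits = 0
    for truth, scores in zip(true_classes, class_scores):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_classes)


# Toy binary task (hypothetical data): 1 = disease present, 0 = absent.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(f"AUC: {roc_auc(toy_labels, toy_scores):.3f}")  # AUC: 0.889

# Toy three-class task (hypothetical data), scored with k = 2.
toy_truth = [0, 2, 1]
toy_class_scores = [[0.5, 0.3, 0.2], [0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
print(f"Top-2 accuracy: {top_k_accuracy(toy_truth, toy_class_scores, k=2):.3f}")  # Top-2 accuracy: 0.667
```

Under this definition, an AUC of 0.94 means a randomly chosen diseased eye receives a higher score than a randomly chosen healthy eye about 94% of the time, and a top-five accuracy of 75.6% means the correct diagnosis appears among the model's five highest-ranked candidates in roughly three of four cases.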
The authors said these findings point to a shift in ophthalmic AI, from single-purpose algorithms to more flexible systems that can connect images, clinical language, and patient data.
"Foundation models may help clinicians extract more value from the data already generated in routine eye care," they said. "But strong performance on research datasets is only a starting point. To be trusted in clinical settings, these tools still need transparency, careful validation, and designs that support rather than replace clinical judgment."
The review also shows that real-world adoption will require more than strong accuracy metrics. Current foundation models still face major challenges, including limited data diversity, algorithmic bias, overfitting, high computational demands, limited interpretability, interoperability with electronic health record (EHR) systems, and insufficient clinical validation. The authors argue that future work should prioritize three things: larger and more representative datasets; explainable AI tools such as saliency maps, Shapley Additive Explanations (SHAP), and counterfactual reasoning; and post-deployment monitoring for fairness and performance drift. If these challenges can be addressed, foundation models may support earlier diagnosis, improve referral decisions, expand access to specialist-level eye care, and help build safer, more scalable AI-assisted ophthalmic workflows.
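As a rough illustration of what post-deployment fairness monitoring could involve, the sketch below compares sensitivity across patient subgroups and reports the largest gap. This is an assumption about one simple form such a check might take, not a method described in the review; the function names, the subgroup labels "A" and "B", and the data are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence


def sensitivity_by_group(labels: Sequence[int],
                         predictions: Sequence[int],
                         groups: Sequence[str]) -> Dict[str, float]:
    """Sensitivity (true-positive rate) computed separately for each subgroup.

    Subgroups with no positive cases are omitted from the result.
    """
    true_positives = defaultdict(int)
    positives = defaultdict(int)
    for y, y_hat, group in zip(labels, predictions, groups):
        if y == 1:
            positives[group] += 1
            if y_hat == 1:
                true_positives[group] += 1
    return {g: true_positives[g] / positives[g] for g in positives}


def largest_gap(metric_by_group: Dict[str, float]) -> float:
    """Difference between the best- and worst-performing subgroup."""
    values = list(metric_by_group.values())
    return max(values) - min(values)


# Hypothetical monitoring batch: 1 = referable disease, 0 = no referable disease.
batch_labels = [1, 1, 1, 1, 0, 1, 1, 0]
batch_predictions = [1, 1, 1, 0, 0, 1, 0, 0]
batch_groups = ["A", "A", "A", "A", "A", "B", "B", "B"]

per_group = sensitivity_by_group(batch_labels, batch_predictions, batch_groups)
print(per_group)               # {'A': 0.75, 'B': 0.5}
print(largest_gap(per_group))  # 0.25
```

Repeating a check like this on successive batches of real-world cases is one way a widening subgroup gap or a gradual decline in performance might be noticed; which metric to track and what size of gap should trigger a review would depend on the clinical setting.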
Journal reference:
Jin, K., et al. (2026). A systematic review of vision and vision-language foundation models in ophthalmology. Advances in Ophthalmology Practice and Research. DOI: 10.1016/j.aopr.2025.10.004. https://www.sciencedirect.com/science/article/pii/S2667376225000514?via%3Dihub