Artificial intelligence could make cancer diagnosis safer and fairer by learning when to defer to human pathologists without overloading them, according to researchers from the University of Surrey and Monash University.
The approach tackles two critical problems that have limited the use of AI-assisted decision-making in cancer pathology, radiology and other fields where human expertise remains essential. Current collaborative human-AI systems require every expert to review each case during training, creating an expensive and time-consuming process. They also tend to overwork the most accurate experts during testing, risking burnout and errors.
The research introduces a probabilistic method that allows AI systems to learn from incomplete expert input while distributing workload evenly across teams.
The research team tested their approach on colon cancer pathology images, where three professional pathologists classified tissue samples into normal, precancerous and cancerous categories. Even when 70 per cent of expert annotations were missing during training, the system maintained high accuracy whilst ensuring no single pathologist was overwhelmed with cases.
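The missing-annotation setting described above can be simulated by randomly hiding expert labels during training. The sketch below is illustrative only: the sample counts, the 10 per cent simulated expert error rate, and the masking scheme are assumptions for the example, not the study's actual protocol.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_experts = 1000, 3          # illustrative sizes, not the study's
classes = ["normal", "precancerous", "cancerous"]

# Ground-truth labels and simulated expert annotations.
y = rng.integers(0, len(classes), size=n_samples)
expert_labels = np.tile(y, (n_experts, 1)).T        # start from perfect agreement
noise = rng.random((n_samples, n_experts)) < 0.10   # 10% simulated expert error
expert_labels[noise] = rng.integers(0, len(classes), size=noise.sum())

# Hide 70% of expert annotations, mirroring the missing-annotation experiment.
observed = rng.random((n_samples, n_experts)) >= 0.70
masked_labels = np.where(observed, expert_labels, -1)   # -1 marks "not reviewed"

frac_missing = 1 - observed.mean()
print(f"fraction of annotations missing: {frac_missing:.2f}")
```

A training procedure for this setting then has to learn from `masked_labels`, where most entries carry no expert opinion at all.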
Professor Gustavo Carneiro, co-author of the study from the Centre for Vision, Speech and Signal Processing at the University of Surrey, said:
"In cancer pathology and radiology, we know that overloading experts leads to mistakes. There is a documented case where a radiologist misdiagnosed a patient after interpreting 162 cases in one day, when the average is only 50. Our system prevents this by ensuring work is distributed fairly while maintaining high accuracy. The AI learns to handle routine cases independently and defer complex ones to humans, but crucially, it doesn't always defer to the same person."
The challenge is particularly acute in cancer diagnosis, where distinguishing between benign, precancerous and malignant tissue requires expert judgement, but pathologists face growing caseloads. An AI system that can confidently handle straightforward cases whilst flagging complex ones for human review could reduce pressure on specialists without compromising diagnostic accuracy.
Dr Cuong Nguyen, lead author and researcher at Surrey's Centre for Vision, Speech and Signal Processing, said:
"Previous systems assumed you could get every expert to review every training sample, which simply is not realistic for large datasets or busy clinical teams. We have shown you can train effective Human-AI systems even when experts only review portions of the data. This makes the technology far more practical for real-world deployment in cancer pathology and other high-stakes medical fields."
The system uses an algorithm that treats both the choice of which expert to consult and any missing expert opinions as latent variables to be inferred jointly during training. It also includes a mechanism that controls how much work is assigned to each expert and to the AI classifier itself, allowing organisations to set workload limits during training rather than adjusting them afterwards.
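One way to read that description is as a gating distribution over the AI classifier and each expert, where unobserved expert opinions are filled in from the observed ones and a penalty keeps each participant's expected workload near a target budget. The sketch below is a minimal illustration under those assumptions: the softmax gating, the simple mean imputation (standing in for the paper's probabilistic inference), the uniform budget, and the squared-error penalty are all choices made for the example, not the authors' exact formulation.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def deferral_loss(gate_logits, ai_correct, expert_correct, observed,
                  budget, penalty=1.0):
    """Expected-error loss for a defer-to-expert system with missing annotations.

    gate_logits   : (n, 1 + K) scores over [AI, expert_1, ..., expert_K]
    ai_correct    : (n,) 1.0 where the AI classifier is right
    expert_correct: (n, K) 1.0 where each expert is right (ignored if unobserved)
    observed      : (n, K) boolean mask of which expert opinions exist
    budget        : (1 + K,) target share of cases per decision-maker
    """
    n = gate_logits.shape[0]
    probs = softmax(gate_logits)                      # who handles each case
    correct = np.concatenate([ai_correct[:, None], expert_correct], axis=1)
    # Handle missing opinions: replace an unobserved expert's correctness with
    # the per-case mean of the observed columns (the AI column is always there,
    # so the denominator is never zero).
    obs = np.concatenate([np.ones((n, 1), bool), observed], axis=1)
    fill = (correct * obs).sum(1, keepdims=True) / obs.sum(1, keepdims=True)
    correct = np.where(obs, correct, fill)
    expected_error = (probs * (1.0 - correct)).sum(1).mean()
    workload = probs.mean(0)                          # expected share per decider
    workload_penalty = ((workload - budget) ** 2).sum()
    return expected_error + penalty * workload_penalty, workload

# Toy usage with random data; only ~30% of expert opinions are observed.
rng = np.random.default_rng(0)
n, K = 200, 3
gate_logits = rng.normal(size=(n, K + 1))
ai_correct = (rng.random(n) < 0.85).astype(float)
expert_correct = (rng.random((n, K)) < 0.92).astype(float)
observed = rng.random((n, K)) >= 0.70
budget = np.full(K + 1, 1.0 / (K + 1))                # spread work evenly
loss, workload = deferral_loss(gate_logits, ai_correct, expert_correct,
                               observed, budget)
print(f"loss={loss:.3f}, workload={np.round(workload, 3)}")
```

Minimising such a loss pushes the gate towards accurate deciders while the penalty term discourages routing every hard case to the same expert, which is the workload-balancing behaviour the quote above describes.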
The research addresses growing concerns about AI deployment in healthcare, where purely automated systems may miss important details, but consulting humans for every decision is impractical and costly. The team also tested the approach on chest X-ray interpretation and bone disease imaging, demonstrating its versatility across different medical imaging tasks.
The research was presented at the International Conference on Learning Representations (ICLR) 2025.