Reports that medical errors are the third leading cause of death in the United States have led the Institute of Medicine and several state legislatures to suggest that data from patient safety event reporting systems could help health care providers better understand safety hazards and, ultimately, improve patient care.
"Tens of thousands of these safety report databases provide a free text field that does not constrain the reporter to fixed, predefined categories.
Srijan Sengupta, assistant professor of statistics in the College of Science and a faculty member at the Discovery Analytics Center
Sengupta has received an $815,218 Research Project Grant (R01) from the National Institutes of Health to develop novel statistical methods to analyze such unstructured data in safety reports.
"Detailed information that spans multiple categories can be more valuable than identifying an event by just checking off a category," he said.
For example, in one free text field it was reported that a patient was administered prescribed premedication prior to a scheduled MRI but, because of a miscommunication about transport policy, was not put on the MRI table until a full hour later. At that point she was uncomfortable and tried to get out of the scanner. The technician sent the patient back to her room given the circumstances. Upon learning about this chain of events, her family expressed their belief that her behavior was due to the medication having worn off.
In a structured text field, this situation would be classified simply as a "diagnosis/imaging" event, Sengupta said, but analysis of the free text reveals several other contributing factors, including medication and communication, that would also need improvement for a more favorable outcome.
"Identifying temporal trends and patterns in unstructured data is particularly important to improving patient safety and patient care," Sengupta said. "What may seem like an infrequent hazard at a hospital may be part of a broader national trend when viewed across health care systems. Using our algorithms to effectively analyze documents from reporting systems has the potential to dramatically improve the safety and quality of care by exposing possible weaknesses in the care process."
Sengupta said that this funded project builds on his existing research on social network analysis, statistical process monitoring, and anomaly detection.
The grant also gives two Virginia Tech seniors majoring in computational modeling and data analytics, Cameron Bissell and Raghav Sawhney, an opportunity to work with Sengupta on applying text analytics to find relationships in the safety report data.
"Since I am considering a career in data science and in medicine, I believe this research is a great way for me to gain experience in data science and it also provides valuable insight into the medical field," said Bissell.
"Working with Professor Sengupta on this project is giving me an opportunity to apply the machine learning techniques I have learned in class to real life applications and problem solving," said Sawhney, who is planning to continue with research in the machine learning field.
For the three-year project, which began on Aug. 1, 2019, Sengupta is partnering with Raj Ratwani, director of the MedStar Health National Center for Human Factors in Healthcare and associate professor at Georgetown University School of Medicine, and Allan Fong, research scientist with the center.
MedStar Health's Human Factors Center is contributing its expertise on machine learning and natural language processing and providing domain knowledge on patient safety, on reporting databases that can be used to identify safety trends and patterns, and on how health care providers can use the outputs from this grant to improve their safety.
Ratwani said that tens of thousands of safety issues are reported to the FDA, but most health care providers are unaware of them because the reports are not analyzed and presented in a form that providers can use in practice. As a result, U.S. health care providers may continue to use technologies and processes that have already been reported to the FDA as having potential issues that could result in patient harm.
"This research is critical to identifying patterns in the reported data and turning data into knowledge that the health care provider can then use to assess the safety of their technologies and processes and develop actions and interventions to prevent patients from being harmed by recognized hazards," said Ratwani.
"Releasing open-source software that will enable other practitioners in public and private health care systems to implement our methods on their own proprietary datasets will be one of the most important outcomes of our research," said Sengupta.