Medical insurance claims might do more than help pay for health concerns; they could help predict them, according to new findings from an interdisciplinary Penn State research team published in BMJ Health & Care Informatics. The researchers developed machine learning models that assess the connections among hundreds of clinical variables, including doctor visits and health care services for seemingly unrelated medical conditions, to predict the likelihood of autism spectrum disorder in young children.
"Insurance claim data, which is de-identified and widely available in marketing scan datasets, provides thorough, longitudinal medical details about the patient. The scientific literature in the field suggests that kids with autism spectrum disorder also often have higher rates of clinical symptoms, such as different types of infections, gastrointestinal problems, seizures, as well as behavior indications. Those symptoms are not a cause of autism but are often manifested among kids with autism especially at young ages, so we were inspired to synthesize the medical information to quantify and predict that associated likelihood."
Qiushi Chen, Corresponding Author, Assistant Professor of Industrial and Manufacturing Engineering, Penn State College of Engineering
The researchers fed the data into machine learning models, training them to assess hundreds of variables and find correlations associated with an increased likelihood of autism spectrum disorder.
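As a rough illustration of this kind of training, the sketch below fits a small logistic regression to made-up claim counts. It is not the team's method: the article does not say which model types were used, and the feature names, data rows, labels, learning rate, and function names here are all hypothetical. The actual study assessed hundreds of variables rather than three.

```python
import math

# Hypothetical per-child features derived from claims: counts of visits
# for infections, gastrointestinal problems, and seizures.
# The rows and labels below are made up purely for illustration.
FEATURES = [
    [0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0],
    [3, 2, 0], [2, 3, 1], [4, 1, 1], [3, 3, 0],
]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = later ASD diagnosis (synthetic)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_risk(weights, row):
    """Return the modeled probability for one child's feature row."""
    bias, coefs = weights[0], weights[1:]
    return sigmoid(bias + sum(c * x for c, x in zip(coefs, row)))


def train(features, labels, learning_rate=0.3, epochs=5000):
    """Fit logistic regression with batch gradient descent."""
    n_rows, n_cols = len(features), len(features[0])
    weights = [0.0] * (n_cols + 1)  # weights[0] is the bias term
    for _ in range(epochs):
        grad = [0.0] * (n_cols + 1)
        for row, label in zip(features, labels):
            error = predict_risk(weights, row) - label
            grad[0] += error
            for j, x in enumerate(row):
                grad[j + 1] += error * x
        weights = [w - learning_rate * g / n_rows for w, g in zip(weights, grad)]
    return weights


if __name__ == "__main__":
    model = train(FEATURES, LABELS)
    # Score two new (hypothetical) children with few and many flagged claims.
    for row in ([0, 1, 0], [3, 2, 1]):
        print(row, round(predict_risk(model, row), 3))
```

The output of `predict_risk` is a likelihood between 0 and 1, not a diagnosis. In a setting like the one described, such a score would only flag children for closer attention.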
"Autism spectrum disorder is a developmental disability," said co-author Guodong Liu, associate professor of public health sciences, of psychiatry and behavioral health and of pediatrics at Penn State College of Medicine. "It takes observation and several screenings for a clinician to make a diagnosis. The process is usually lengthy, and many kids miss the window for early interventions -; the most effective way to improve outcomes."
One of the commonly used screening tools to help identify young children with an elevated likelihood of autism spectrum disorder is called the Modified Checklist for Autism in Toddlers (M-CHAT), which is normally given at routine well-child visits at 18 and 24 months old. It consists of 20 questions focused on behaviors related to eye contact, social interactions and some physical milestones such as walking. Guardians answer based on their observations, but, according to Chen, development varies so significantly at these ages that the tool may misidentify children. As a result, children often are not officially diagnosed until they are four or five years old, meaning they miss years of potential early interventions.
"Our new model, which quantifies the sum of identified risk factors together to inform the likelihood level, is already comparable to -; and in some cases even slightly better than -; the existing screening tool," Chen said. "When we combine the model with the screening tool, we have a very promising approach for clinicians."
According to Liu, it would be practically feasible to integrate the model with the screening tool for clinical use.
"A unique strength of this work is that this clinical informatics approach can be easily incorporated into the clinical flow," Liu said. "The prediction model could be embedded in a hospital's Electronic Health Record system, which is used to chart patient health, as a clinical decision support tool to flag the high-risk children so that both clinicians and the families could take actions sooner."
This work, funded by the National Institutes of Health, the Penn State Social Science Research Institute and the Penn State College of Engineering, is the basis of a new $460,000 grant awarded to Chen and Whitney Guthrie, clinical psychologist at the Children's Hospital of Philadelphia Center for Autism Research and assistant professor of psychiatry and pediatrics at the University of Pennsylvania Perelman School of Medicine, by the National Institute of Mental Health.
They are using the new grant to analyze precisely how well the combined hospital record data and screening results predict autism diagnoses, and to explore other potential screening tools that could better equip clinicians to help their patients.
"Not only is the current tool missing many children on the autism spectrum, but many children who are detected by our screening tools experience long waitlists because of our limited diagnosis capacity," Guthrie said. "Although it does detect many children, the M-CHAT also has very high rates of false positives and false negatives, which means that many autistic children are missed, and other children are referred for an autism evaluation when they may not need one. Both problems contribute to the long wait -; often many months or even years -; for further evaluation. The consequences for children who are missed by our current screening tools are particularly important because delayed diagnosis often means that children miss the window for early intervention entirely. Pediatricians need better screening tools to accurately identify all children who need an autism evaluation as early as possible."
Part of the problem is the limited number of psychologists, developmental pediatricians and other experts in pediatric development who can make an autism spectrum disorder diagnosis. According to Chen, the solution may exist in industrial engineering.
"The key idea is improving how we use resources," Chen said. "With Dr. Guthrie's clinical expertise and my group's modeling capabilities, we aim to develop a tool that primary care physicians without specialized training can apply to make confident assessments to diagnose children as early as possible in order to get the care they need as soon as possible."
Additional paper contributors include first author Yu-Hsin Chen, a graduate student pursuing her doctorate in industrial and manufacturing engineering who will also write her dissertation on the grant work; and co-author Lan Kong, professor of public health sciences, Penn State College of Medicine.
Chen, Y-H., et al. (2022) Early detection of autism spectrum disorder in young children with machine learning using medical claims data. BMJ Health & Care Informatics. doi.org/10.1136/bmjhci-2022-100544.