Machine learning reveals dental caries heterogeneity in NHANES

A new article published in the Journal of Dental Research describes the development of an integrated data-cleaning and subtype-discovery pipeline that uses unsupervised machine learning to analyze and visualize data patterns in the National Health and Nutrition Examination Survey (NHANES) database.

Authored by Alena Orlenko of Cedars-Sinai Medical Center, Los Angeles, CA, USA, and colleagues, "Uncovering Dental Caries Heterogeneity in NHANES Using Machine Learning" addresses the limitations of NHANES, one of the largest curated repositories of nationally representative, population-level health indicators. The authors establish a data-cleaning pipeline with a novel outlier detection algorithm and apply unsupervised machine learning to identify phenotype subtypes within NHANES dental caries data.

"By bringing the power of machine learning to a large national data set, the authors identify key clusters of factors linked to caries in children or seniors," said Nick Jakubovics, Editor-in-Chief of the Journal of Dental Research. "The next challenge is to build on this information and find more effective methods to prevent caries in different groups of people."

The study demonstrates a robust data-cleaning and subtype-discovery pipeline that could be applied to investigate other health conditions using NHANES and similar databases as a foundation for machine learning predictive modeling. Applied to NHANES data, the pipeline identified substantial age-driven heterogeneity in dental caries, suggesting that stratification by age is crucial for future predictive modeling.

This integrative approach systematically addresses data quality issues and supports exploratory analysis that reveals the data patterns behind each subtype and the variables linked to the clinical heterogeneity of caries. It uncovered novel associations between caries status and lead/pollutant exposure, specific laboratory markers, food types, and sleep patterns, reflecting additional disease markers in susceptible populations. This demonstrates the value of integrating data science techniques with large-scale observational data to gain deeper insights into complex, multifactorial diseases.
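
As a rough illustration of the general idea (clean the data, then look for subtypes without labels), the following minimal Python sketch may help. It is not the authors' pipeline: the article does not describe their outlier detection algorithm, so a median-absolute-deviation filter and plain k-means are swapped in as stand-ins, and the records (age, number of decayed teeth) are invented for the example.

```python
import random
import statistics

# Invented toy records: (age in years, number of decayed teeth).
# The last record is a deliberately implausible entry.
records = [
    (7, 2), (8, 3), (9, 2), (10, 4),      # children
    (68, 6), (70, 7), (72, 5), (75, 6),   # seniors
    (9, 60),                              # implausible value
]

def mad_filter(rows, column, threshold=3.5):
    """Drop rows whose value in `column` has a modified z-score above threshold.

    Stand-in for the study's outlier detection step (an assumption).
    """
    values = [row[column] for row in rows]
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return list(rows)  # no spread to judge against, keep everything
    return [row for row in rows
            if 0.6745 * abs(row[column] - median) / mad <= threshold]

def kmeans(rows, k, iterations=50, seed=0):
    """Plain k-means on tuples of numbers; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assign each row to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(row, centers[j])))
            for row in rows
        ]
        # Move each center to the mean of its assigned rows.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

cleaned = mad_filter(records, column=1)   # step 1: data cleaning
labels, centers = kmeans(cleaned, k=2)    # step 2: subtype discovery
print(list(zip(cleaned, labels)))
```

On this toy data the filter discards the implausible record and the two clusters separate by age, which loosely mirrors the age-driven heterogeneity the study reports. A real analysis would need feature scaling, survey weights, and many more variables.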

Journal reference:

Orlenko, A., et al. (2025) Uncovering Dental Caries Heterogeneity in NHANES Using Machine Learning. Journal of Dental Research. doi:10.1177/00220345251398027. https://journals.sagepub.com/doi/10.1177/00220345251398027
