A sweeping plasma metabolite map from nearly 390,000 participants links blood chemistry to disease risk, comorbidity patterns, and future diagnostic possibilities.

Study: A large-scale human plasma metabolite atlas from over 380,000 participants. Image Credit: ArtemisDiana / Shutterstock
In a recent study published as an 'article in press' in the journal Communications Biology, researchers described the development and predictive performance of a novel comprehensive map linking NMR-derived plasma metabolic traits to diverse health traits and disease outcomes. The study specifically mapped 251 NMR-derived circulating plasma metabolic traits against hundreds of health traits and diseases in nearly 390,000 individuals.
The analysis uncovered tens of thousands of robust associations between participants’ metabolomic profiles and health traits, prevalent diseases, and incident disease risks. These results suggest that metabolite profiles may help identify shared metabolomic signatures and possible biological pathways across seemingly unrelated conditions. The authors present the resulting map (“plasma metabolomic atlas”) as a validated open-access resource that could support future predictive and personalized medicine.
Background
Despite decades of research, the field of medicine has historically faced significant challenges in identifying reliable, early diagnostic biomarkers that enable highly personalized treatment. While recent advances in high-throughput genomic and transcriptomic sequencing technologies now provide valuable baseline data, they do not by themselves fully capture patients' current (“snapshot”) health status or consistently predict future clinical outcomes.
Metabolomics, the study of small-molecule intermediates and end-products of cellular processes, offers a complementary approach. Recent research has established that an individual’s blood-derived metabolite profile can help quantify their current physiological state and inform individualized risk assessments, a core goal of precision medicine.
These benefits are largely attributed to the metabolites’ rapid, measurable responses to environmental changes and disease onset, thereby allowing them to serve as potentially sensitive markers for early risk assessment. Unfortunately, despite this clinical potential, previous research has largely focused on isolated diseases or narrow sets of phenotypes.
Reviews in the field note that scientists have so far lacked a unified, large-scale platform to evaluate how key metabolites behave across a wide spectrum of health conditions. It has remained unclear whether certain molecules consistently track, contribute to, or are potentially causally linked with multimorbidity across multiple organ systems, and how well these chemical markers predict short-term versus long-term health outcomes.
About the study
The present study aimed to address these persistent data limitations and facilitate the medical community's transition toward proactive, precision healthcare by developing a comprehensive metabolic map (“plasma-derived metabolite atlas”) capable of predicting some future diagnoses, especially within defined risk windows.
The study analyzed data from an initial discovery cohort (n = 212,751 individuals; mean age = 56.6 years) and an independent validation cohort (n = 177,013 participants) enrolled in the UK Biobank (total n = 389,764).
Study data were obtained using high-throughput nuclear magnetic resonance (NMR) spectroscopy to quantify distinct metabolic traits (n = 251) in participants’ blood plasma samples. These measured traits included 170 absolute concentrations (e.g., lipids, amino acids, and ketone bodies) alongside 81 calculated (derived) metabolic ratios.
The measured metabolic traits were tested against 884 health-related traits, 722 prevalent (existing) diseases, and 1,137 incident (future) diseases. Statistical analyses included logistic regression and Cox proportional hazards models to assess associations between metabolites and clinical outcomes, with a mean follow-up of 13.8 years.
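To give a sense of the scale of this testing, the short sketch below counts the metabolite-outcome pairs implied by the figures above. It assumes that every metabolic trait was tested against every outcome, and it applies a Bonferroni correction purely for illustration; the article does not state which multiple-testing correction the study used, and the function names are hypothetical.

```python
# Counts are taken from the article. The all-pairs design and the Bonferroni
# rule are illustrative assumptions, not the study's documented procedure.
N_METABOLIC_TRAITS = 251
OUTCOME_COUNTS = {
    "health_traits": 884,
    "prevalent_diseases": 722,
    "incident_diseases": 1137,
}


def count_tests(n_traits, outcome_counts):
    """Number of metabolite-outcome pairs if every trait meets every outcome."""
    return {name: n_traits * n for name, n in outcome_counts.items()}


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


tests_per_category = count_tests(N_METABOLIC_TRAITS, OUTCOME_COUNTS)
total_tests = sum(tests_per_category.values())
threshold = bonferroni_threshold(0.05, total_tests)

for name, n in tests_per_category.items():
    print(f"{name}: {n:,} tests")
print(f"total: {total_tests:,} tests")
print(f"Bonferroni threshold at alpha = 0.05: {threshold:.2e}")
```

Under these assumptions the atlas involves several hundred thousand individual tests, which is why replication in an independent cohort matters for judging robustness.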
Furthermore, the study used bidirectional two-sample Mendelian randomization (MR) to investigate potential causal relationships in observed associations. Finally, the study leveraged LightGBM machine learning algorithms to develop disease prediction models.
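As a rough illustration of how two-sample MR turns genetic summary statistics into a causal estimate, the sketch below implements the widely used fixed-effect inverse-variance weighted (IVW) estimator. The article does not say which MR estimator the authors applied, so IVW is an assumption here, and the per-variant effect sizes are invented toy values rather than study data.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    Equivalent to a weighted regression through the origin of the
    variant-outcome effects on the variant-exposure effects, with
    weights 1 / se_outcome**2.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(
        w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome)
    )
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error


# Hypothetical toy summary statistics for three genetic variants.
# Exposure = a metabolic trait, outcome = a disease (log-odds scale).
toy_beta_exposure = [0.10, 0.20, 0.15]
toy_beta_outcome = [-0.02, -0.04, -0.03]
toy_se_outcome = [0.01, 0.01, 0.01]

estimate, std_error = ivw_estimate(
    toy_beta_exposure, toy_beta_outcome, toy_se_outcome
)
print(f"IVW estimate: {estimate:.3f} (SE {std_error:.3f})")
```

A bidirectional analysis would repeat the same calculation in the reverse direction, using variants associated with the disease as instruments and the metabolite as the outcome, to check whether the disease might instead be altering metabolite levels.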
Study findings
The study’s analyses identified 67,505 metabolite-trait associations and 41,214 associations between metabolites and incident diseases. Notably, 83.3% of prevalent disease associations and 74.0% of incident disease associations replicated in the independent validation cohort, supporting the robustness of the plasma metabolite atlas.
The study further identified several molecules as candidate markers associated with multimorbidity patterns. For example, glycoprotein acetyls (GlycA) were found to be broadly associated with mental and behavioral disorders, including mood disorders, depression, and anxiety disorders.
Similarly, creatinine emerged as an important feature in disease prediction models, ranking as a primary feature in 97.8% of high-performing machine-learning models for prevalent disease.
The researchers then extended the atlas beyond association mapping by examining disease clusters, predictive performance, and genetically informed causal signals.
The predictive analyses also showed notable power for participant-specific metabolite profiles. For prevalent type 2 diabetes (T2D), metabolite-based prediction models achieved an area under the curve (AUC) of 0.892, significantly outperforming traditional demographic predictors (AUC = 0.790). For 5-year incident T2D, metabolite-based models achieved an AUC of 0.828.
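For readers unfamiliar with the AUC metric, the minimal sketch below computes it from first principles as the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case. The labels and scores are invented toy values, and the function is a generic illustration, not the study's evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (case, non-case) pairs in which the case has
    the higher score; tied scores count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    non_cases = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for case_score in cases:
        for other_score in non_cases:
            if case_score > other_score:
                wins += 1.0
            elif case_score == other_score:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Hypothetical toy data: 1 = has the disease, 0 = does not.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_metabolite_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
toy_demographic_scores = [0.7, 0.4, 0.3, 0.6, 0.5, 0.2, 0.1]

auc_metabolite = roc_auc(toy_labels, toy_metabolite_scores)
auc_demographic = roc_auc(toy_labels, toy_demographic_scores)
print(f"toy metabolite model AUC:  {auc_metabolite:.3f}")
print(f"toy demographic model AUC: {auc_demographic:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported values of 0.892 and 0.790 mean the metabolite-based model ranked cases above non-cases more often than the demographic model did.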
However, demographic features showed higher average predictive performance across all outcomes, while combined models incorporating both metabolites and demographic variables improved performance for many diseases. This suggests that metabolomics provides complementary biological information rather than replacing established demographic and clinical predictors.
Finally, the genetic MR analysis identified 61 putative causal effects of metabolites on disease. For example, a triglyceride-related large LDL lipid ratio, reported in the paper as triglycerides to L-LDL-C%, showed inverse putative causal associations with several coronary outcomes, including major coronary events.
Conclusions
The present, systematically validated plasma metabolomic atlas is among the first to empirically demonstrate that blood-derived metabolites can improve prediction for selected prevalent and short-term incident diseases compared with demographic predictors alone.
While the current NMR-based platform primarily captures lipid-centric biological data and is limited in its coverage of the broader polar metabolome, the atlas represents a valuable open-access tool for understanding the chemical and biological roots of human illness.
Moving forward, integrating these insights into everyday clinical practice will require broader validation, mechanistic studies, and testing in more diverse populations. If confirmed, such approaches could allow future clinical systems to improve risk stratification and identify complex comorbidity patterns before formal diagnosis.
Journal reference:
- Li, Z., Miao, Y., Jin, L., Ma, Y., & Zhang, X. (2026). A large-scale human plasma metabolite atlas from over 380,000 participants. Communications Biology. Article in press. DOI: 10.1038/s42003-026-10505-4. https://www.nature.com/articles/s42003-026-10505-4