A better method to predict polygenic inheritance across multiple ancestries

New methods of analysis and novel markers are currently being identified to predict conditions with polygenic inheritance. These include polygenic risk scores (PRS), which are based on the presence of single nucleotide polymorphisms (SNPs) in several genes. However, their utility is limited, as PRS are largely based on data derived from European populations.

A new paper in Nature Genetics reports the results obtained from the use of a new PRS calculator called CT-SLEB on a multi-national GWAS database.

Study: A new method for multiancestry polygenic prediction improves performance across diverse populations.


SNPs are gene variants in which one of several possible bases occurs at a given position in the DNA sequence. A variant must be detectable in 1% or more of the population to be considered an SNP.

Genome-wide association studies (GWAS) have been used to identify large numbers of SNPs that are linked to complex traits and diseases. PRS uses combinations of SNPs to provide a predicted risk for the occurrence of complex traits and disease conditions.
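As a rough illustration of how a PRS combines SNPs, the sketch below treats the score as a weighted sum: each SNP's effect size multiplied by the number of copies of the effect allele a person carries. The SNP IDs, effect sizes, and genotype are made up for illustration, and the function name is hypothetical rather than taken from any PRS software.

```python
# Hypothetical per-SNP effect sizes (e.g. log odds ratios from a GWAS).
# The SNP IDs and numbers are invented for illustration only.
effect_sizes = {"rs0001": 0.12, "rs0002": -0.05, "rs0003": 0.30}

# One person's genotype: copies (0, 1 or 2) of the effect allele per SNP.
genotype = {"rs0001": 2, "rs0002": 1, "rs0003": 0}


def polygenic_risk_score(weights, dosages):
    """Sum of effect size times allele count over all scored SNPs."""
    # SNPs missing from the genotype contribute nothing to the score.
    return sum(beta * dosages.get(snp, 0) for snp, beta in weights.items())


score = polygenic_risk_score(effect_sizes, genotype)
print(round(score, 2))
```

Because the weights come from the GWAS training population, a score built this way can carry over poorly to people whose ancestry differs from that population.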

PRS built on SNP-trait associations have largely been derived from European cohorts, thereby limiting their generalizability. Especially in African populations, the PRS calculated based on these studies has produced inaccurate results.

As a result, current PRS cannot be used clinically without favoring European-ancestry populations. Thus, the use of GWAS from across multiple populations could facilitate the development of better PRS from larger training samples.

To this end, previous studies have combined GWAS information from the target population with information from larger European populations. However, an ideal PRS would require an appropriate sample size with sufficient power, thus indicating the need for better methods, as well as larger and more diverse databases.

About the study

The current study reports on the performance of CT-SLEB, a powerful computational tool based on clumping and thresholding (CT), superlearning (SL), and empirical Bayes (EB) methods, as compared with nine other methods. While CT selects which SNPs should be considered when calculating PRS in the target population, EB is a method used to estimate the SNP coefficient. SL uses a mix of PRSs from various SNP selection criteria.
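To make the clumping and thresholding idea concrete, here is a minimal sketch of a generic, single-population version of that step: keep SNPs whose GWAS p-value falls below a cutoff, then drop any SNP that is strongly correlated (in linkage disequilibrium) with a more significant SNP already kept. This is not the CT-SLEB implementation, and the EB and SL steps are not shown. The function name, SNP IDs, p-values, and cutoffs are all assumptions chosen for illustration.

```python
def clump_and_threshold(p_values, ld_r2, p_cutoff, r2_cutoff):
    """Greedy clumping and thresholding over SNP summary statistics.

    p_values: dict of SNP id -> GWAS p-value
    ld_r2:    dict of frozenset({snp_a, snp_b}) -> squared correlation (LD)
    """
    # Thresholding: keep only SNPs at or below the p-value cutoff,
    # ordered from most to least significant.
    candidates = sorted(
        (snp for snp, p in p_values.items() if p <= p_cutoff),
        key=lambda snp: p_values[snp],
    )
    selected = []
    for snp in candidates:
        # Clumping: skip a SNP strongly correlated with one already kept.
        # Pairs absent from ld_r2 are treated as uncorrelated.
        in_ld = any(
            ld_r2.get(frozenset((snp, kept)), 0.0) > r2_cutoff
            for kept in selected
        )
        if not in_ld:
            selected.append(snp)
    return selected


# Invented summary statistics for four SNPs.
p_values = {"rsA": 1e-9, "rsB": 1e-6, "rsC": 0.003, "rsD": 0.2}
ld_r2 = {frozenset(("rsA", "rsB")): 0.8, frozenset(("rsB", "rsC")): 0.05}

kept = clump_and_threshold(p_values, ld_r2, p_cutoff=0.01, r2_cutoff=0.1)
print(kept)
```

In this toy example, rsB is dropped because it is correlated with the more significant rsA, and rsD is dropped because its p-value is above the cutoff.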

CT-SLEB requires GWAS summary statistics from both European and non-European training datasets, a tuning dataset that yields the best parameters for the target population, and a validation dataset that provides the final prediction for the target population.
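The roles of the tuning and validation datasets can be sketched as follows. Several candidate PRSs are scored on the tuning set, the best-performing setting is chosen, and accuracy is then reported on separate validation individuals. This sketch simply picks the single best p-value cutoff, which is a simplification and an assumption: the article describes CT-SLEB as combining PRSs through superlearning rather than picking one. All numbers and names are invented.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Hypothetical candidate PRSs for the same tuning individuals,
# one list per p-value cutoff used to select SNPs.
tuning_scores = {
    5e-8: [0.1, 0.4, 0.2, 0.9],
    1e-3: [0.2, 0.5, 0.6, 1.0],
}
tuning_trait = [1.0, 2.0, 3.0, 4.0]

# Tuning step: pick the cutoff whose PRS best tracks the trait.
best_cutoff = max(
    tuning_scores,
    key=lambda cutoff: r_squared(tuning_scores[cutoff], tuning_trait),
)

# Validation step: report accuracy on individuals not used for tuning.
validation_scores = {
    5e-8: [0.2, 0.3, 0.5, 0.4],
    1e-3: [0.3, 0.1, 0.8, 0.7],
}
validation_trait = [1.5, 1.0, 3.5, 3.0]
validation_r2 = r_squared(validation_scores[best_cutoff], validation_trait)

print(best_cutoff, round(validation_r2, 3))
```

Keeping the validation individuals out of the tuning step is what makes the final accuracy figure an honest estimate for the target population.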

These results were obtained using GWAS simulations on large populations spanning five different ancestries. Data sources included 23andMe, Inc., the Global Lipids Genetics Consortium (GLGC), All of Us (AoU), and UK Biobank (UKBB), covering European (EUR), African (AFR; primarily African American), Latino, East Asian, and South Asian (SAS) populations.

GWAS data from over five million individuals from several different ancestral groups were included in the analysis, about 1.2 million of whom were of non-European ancestry. The data were used to predict multi-ancestry PRS from a combination of European and less abundant non-European population data.

In addition to providing comparative data on CT-SLEB and other approaches, the scientists also generated validated PRS for 13 complex traits using this multi-ancestry PRS tool.

What did the study show?

CT-SLEB improved PRS performance in non-European groups as compared to other, simpler tools. This remained true regardless of whether the training dataset was small or large, whereas training sample size did affect the accuracy of other PRS calculators. The greatest number of comparisons was conducted between CT-SLEB and PRS-CSx, the latter being a Bayesian approach.

CT-SLEB maintained or even surpassed the predictive accuracy of other tools that rely more heavily on computational analysis. As the sample size increases, CT-SLEB becomes more accurate, irrespective of the polygenicity, whereas with smaller samples, it performs better with lower polygenicity.

PRS-CSx performed better than CT-SLEB in many settings; however, both platforms performed best when using data from all five ancestries. Using two-ancestry data, CT-SLEB generated African PRS 25 times faster than PRS-CSx, at only 4.3 minutes. When based on five-ancestry data, CT-SLEB was over 90 times faster, taking almost the same time, compared with 420 minutes for PRS-CSx.
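The runtime figures above can be checked with simple arithmetic. The sketch below assumes that the 420 minutes refers to the PRS-CSx runtime on five-ancestry data, and that "25 times faster" and "over 90 times faster" are plain ratios of runtimes. The implied values are back-of-the-envelope estimates derived from the article's numbers, not figures reported by the study.

```python
# Figures quoted in the article.
ct_sleb_two_ancestry_min = 4.3    # CT-SLEB runtime, two-ancestry data
two_ancestry_speedup = 25         # "25 times faster"
prs_csx_five_ancestry_min = 420   # assumed to be the PRS-CSx runtime
five_ancestry_speedup = 90        # "over 90 times faster"

# Implied PRS-CSx runtime on two-ancestry data.
implied_prs_csx_two = ct_sleb_two_ancestry_min * two_ancestry_speedup

# Upper bound on the CT-SLEB runtime on five-ancestry data.
implied_ct_sleb_five_max = prs_csx_five_ancestry_min / five_ancestry_speedup

print(f"PRS-CSx, two ancestries: about {implied_prs_csx_two:.0f} minutes")
print(f"CT-SLEB, five ancestries: under {implied_ct_sleb_five_max:.1f} minutes")
```

Under these assumptions, PRS-CSx would need roughly 108 minutes on two-ancestry data, while CT-SLEB would stay under about 4.7 minutes on five-ancestry data, which is consistent with "almost the same time" as its 4.3-minute two-ancestry run.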

The predictive performance of PRS for minority groups generated by CT-SLEB was comparable to that for the European population if the former had sample sizes at least 45% as large as European cohorts. However, the sample size required for accurate prediction varies dramatically with the heritability of various traits.

CT-SLEB is easily scalable and can process a much larger number of SNPs. Thus, this platform is capable of improving its PRS performance in minority groups within the American population by using denser SNP panels.

For many polygenic traits, including the clinically important cardiovascular disease (CVD) trait, CT-SLEB predicted the risk much better than PRS-CSx and PolyPred-S+. Overall, these three outperformed other platforms; however, none was superior in all settings.  

“Even with the best-performing method and large sample, a substantial gap remained for PRS performance in non-EUR populations compared with the EUR Population.”

What are the implications?

“CT-SLEB is a new and computationally scalable method for generating powerful PRSs using data from GWASs in diverse populations.”

The study findings emphasize the need to use multiple methods to generate PRSs across multiple ancestries. CT-SLEB produced the greatest improvement in PRS performance for African American populations, which represent African-origin populations that have little baseline polygenic data and correspondingly lower polygenic prediction accuracy.

The simulation studies showed the need to determine sample sizes appropriate for such prediction. These studies also highlight the effects of variations in SNP density when predicting the risk of a trait among people of multiple ancestries, which will affect the choice of method for PRS generation.

CT-SLEB produces predictions an order of magnitude faster than PRS-CSx and is easily scalable for handling large increases in the number of SNPs and more populations.

Journal reference:
  • Zhang, H., Zhan, J., Jin, J., et al. (2023). A new method for multiancestry polygenic prediction improves performance across diverse populations. Nature Genetics. doi:10.1038/s41588-023-01501-z.

Written by

Dr. Liji Thomas

Dr. Liji Thomas is an OB-GYN, who graduated from the Government Medical College, University of Calicut, Kerala, in 2001. Liji practiced as a full-time consultant in obstetrics/gynecology in a private hospital for a few years following her graduation. She has counseled hundreds of patients facing issues from pregnancy-related problems and infertility, and has been in charge of over 2,000 deliveries, striving always to achieve a normal delivery rather than operative.


Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Thomas, Liji. (2023, October 02). A better method to predict polygenic inheritance across multiple ancestries. News-Medical. Retrieved on November 30, 2023 from https://www.news-medical.net/news/20231002/A-better-method-to-predict-polygenic-inheritance-across-multiple-ancestries.aspx.

  • MLA

    Thomas, Liji. "A better method to predict polygenic inheritance across multiple ancestries". News-Medical. 30 November 2023. <https://www.news-medical.net/news/20231002/A-better-method-to-predict-polygenic-inheritance-across-multiple-ancestries.aspx>.

  • Chicago

    Thomas, Liji. "A better method to predict polygenic inheritance across multiple ancestries". News-Medical. https://www.news-medical.net/news/20231002/A-better-method-to-predict-polygenic-inheritance-across-multiple-ancestries.aspx. (accessed November 30, 2023).

  • Harvard

    Thomas, Liji. 2023. A better method to predict polygenic inheritance across multiple ancestries. News-Medical, viewed 30 November 2023, https://www.news-medical.net/news/20231002/A-better-method-to-predict-polygenic-inheritance-across-multiple-ancestries.aspx.


The opinions expressed here are the views of the writer and do not necessarily reflect the views and opinions of News Medical.