In a recent article published in Nature Medicine, researchers investigated the potential implications of single-nucleotide variations in gut-residing bacterial species (microbiome) on human health.
It is well recognized that the bacterial species constituting the gut microbiome affect human (host) health and are implicated in diseases including inflammatory bowel disease (IBD) and obesity.
Previous genome-wide association studies (GWASs) have revealed that bacterial single-nucleotide polymorphisms (SNPs) can grant bacteria the ability to cause infections in new host species.
Moreover, the subspecies variations, strain diversity, mobile gene(s) composition, and copy number variations of diverse gut microbiomes affect the phenotypic traits of each host differently.
Despite the relevance of SNP-level diversity in the gut microbiome to host–microbiome interactions, few metagenomics-based studies have systematically investigated the link between bacterial SNPs and human traits.
About the study
Thus, in the present study, researchers designed a framework for metagenome-wide association studies (MWASs) to systematically detect SNP-level intra-species variability in the gut-resident bacteria and identify the mechanistic link between individual bacterial SNPs and human traits/phenotypes, in this case, body mass index (BMI).
They obtained metagenomic samples from a cohort of 7,190 healthy individuals from Israel but ultimately analyzed only the 7,056 participants whose records for age, sex, and BMI were complete.
First, they identified genomic sequences that were unique to one species using the Unique Relative Abundances (URA) algorithm, which aligned the sequenced reads to a larger high-quality reference set of species (read assignment step). Next, they compared all reads assigned to the same genomic position to find the global major allele.
Further, they filtered all genomic positions by their coverage (≥1,000 samples) and variability (major allele frequency ≤99%, on average). The gut bacterial population can have any number of allele copies. Therefore, the team modeled each sample’s genotype as a continuous number (0 to 1), representing the 'major allele frequency'.
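The genotype encoding and filtering described above can be illustrated with a minimal sketch. It assumes that the global major allele is the most common base after pooling reads from all samples at a position, and that the filters are applied to the per-sample frequencies of that allele. The function names and the input layout are hypothetical and are not taken from the study's code.

```python
from collections import Counter


def major_allele_frequencies(reads_by_sample):
    """Encode one genomic position as a continuous genotype per sample.

    reads_by_sample maps a sample id to the list of bases observed at the
    position (hypothetical layout). Assumes at least one read overall.
    Returns the global major allele and, per sample, the fraction of reads
    carrying it (a number between 0 and 1).
    """
    pooled = Counter()
    for bases in reads_by_sample.values():
        pooled.update(bases)
    major = pooled.most_common(1)[0][0]  # assumption: pooled majority vote
    freqs = {
        sample: bases.count(major) / len(bases)
        for sample, bases in reads_by_sample.items()
        if bases  # samples without coverage are skipped
    }
    return major, freqs


def passes_filters(freqs, min_samples=1000, max_mean_major_freq=0.99):
    """Keep a position only if it is covered widely and varies enough."""
    if len(freqs) < min_samples:
        return False
    mean_freq = sum(freqs.values()) / len(freqs)
    return mean_freq <= max_mean_major_freq


reads = {"s1": list("AAAG"), "s2": list("AAGG"), "s3": list("AAAA")}
major, freqs = major_allele_frequencies(reads)
print(major, freqs)
```

With only three toy samples, the position would fail the default coverage filter of 1,000 samples; lowering min_samples to 3 lets it through because the mean major allele frequency is 0.75.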
In total, 12,686,191 genomic positions, spread across the genomes of 348 gut bacterial species, were marked as SNPs. The average number of SNPs detected in a genome was reported as 3,221.
Then, they fitted a linear regression model for each SNP, with the major allele frequency as the independent (explanatory) variable and BMI as the dependent (explained) variable. They then applied a clumping procedure: the SNPs of each species associated with the phenotype were sorted by the P value of the association, the SNP with the smallest P value was selected first, and all SNPs correlated with it were removed.
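As a rough illustration of the per-SNP regression, the sketch below fits BMI against the major allele frequency of a single SNP by ordinary least squares. The P value uses a normal approximation to the test statistic, which is an assumption made here to keep the example dependency-free; it is reasonable only for large samples, and the study's exact test is not described in this article. The function name and the toy data are hypothetical.

```python
import math


def snp_bmi_regression(maf, bmi):
    """Simple linear regression of BMI on one SNP's major allele frequency.

    Returns (slope, intercept, two-sided P value for slope == 0).
    The P value relies on a normal approximation (assumption).
    """
    n = len(maf)
    mean_x = sum(maf) / n
    mean_y = sum(bmi) / n
    sxx = sum((x - mean_x) ** 2 for x in maf)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(maf, bmi))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance and standard error of the slope
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(maf, bmi))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    z = slope / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return slope, intercept, p_value


# Toy data: BMI rises by 4 points from frequency 0 to 1, plus small noise
maf = [i / 10 for i in range(11)]
bmi = [22 + 4 * x + (0.1 if i % 2 else -0.1) for i, x in enumerate(maf)]
slope, intercept, p_value = snp_bmi_regression(maf, bmi)
print(round(slope, 3), round(intercept, 3), p_value)
```

The slope is the expected BMI difference between a sample carrying only the major allele and one carrying none of it, which is how an effect size per SNP can be read off this kind of model.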
The researchers assessed the statistical significance of each SNP–phenotype association from its P value and corrected all P values using the Bonferroni method. With a correlation coefficient threshold of 0.3 for clumping, the result was a filtered list of SNPs that were associated with the phenotype and uncorrelated with each other.
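The correction and clumping steps can be sketched as follows. This is a simplified illustration for the SNPs of one species, not the authors' implementation. It assumes that "correlated" means an absolute Pearson correlation of at least 0.3 between the major allele frequency vectors of two SNPs, and that the selection repeats on the remaining SNPs until none are left. All names are hypothetical.

```python
import math


def bonferroni(p_values, n_tests=None):
    """Multiply each P value by the number of tests, capped at 1."""
    m = n_tests or len(p_values)
    return [min(1.0, p * m) for p in p_values]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def clump(snps, threshold=0.3):
    """Greedy clumping of one species' associated SNPs.

    snps is a list of (snp_id, p_value, maf_vector) tuples. The SNP with
    the smallest P value is kept, every SNP correlated with it at or above
    the threshold is dropped, and the step repeats on what remains.
    """
    remaining = sorted(snps, key=lambda s: s[1])
    kept = []
    while remaining:
        lead = remaining.pop(0)
        kept.append(lead[0])
        remaining = [
            s for s in remaining
            if abs(pearson(lead[2], s[2])) < threshold
        ]
    return kept


a = [0.1, 0.2, 0.3, 0.4, 0.5]
b = [0.12, 0.19, 0.33, 0.41, 0.48]  # nearly identical pattern to a
c = [0.5, 0.1, 0.3, 0.1, 0.5]       # uncorrelated with a
print(clump([("a", 1e-9, a), ("b", 1e-7, b), ("c", 1e-8, c)]))
```

In the toy input, SNP b is removed because it tracks the lead SNP a closely, while SNP c survives as an independent signal.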
To isolate unique SNPs associated with the host phenotype from potentially confounding phenotypes arising from differences in host diet, medications, and physical activity, they used a common GWAS approach, where other host traits, e.g., age, served as covariates.
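To make the role of covariates concrete, the sketch below fits BMI on a SNP's major allele frequency together with one covariate and compares the SNP coefficient with and without the adjustment. It is a toy example under stated assumptions: a plain least-squares fit solved through the normal equations, invented data in which BMI depends only on age, and a hypothetical function name. The study's actual covariate set and fitting procedure are not detailed in this article.

```python
def ols(columns, y):
    """Least-squares fit of y on an intercept plus the given columns.

    Returns the coefficients [intercept, b_1, ..., b_k]. Assumes the
    columns are not collinear.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations: (X^T X | X^T y)
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Invented data: BMI depends on age only, but the SNP tracks age
age = [20, 30, 40, 50, 60, 70]
maf = [0.2, 0.35, 0.4, 0.55, 0.6, 0.75]
bmi = [18 + 0.1 * a for a in age]

unadjusted = ols([maf], bmi)[1]
adjusted = ols([maf, age], bmi)[1]
print(round(unadjusted, 3), round(adjusted, 6))
```

Without the covariate, the SNP appears strongly associated with BMI; once age is included, its coefficient collapses to approximately zero, which is the kind of confounded signal that covariate adjustment is meant to screen out.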
BMI-associated SNPs were detected in the genomes of 27 bacterial species. The researchers therefore investigated whether the relative abundance of these bacterial species was also associated with BMI. Using the relative abundance of each species as a covariate helped separate SNP-level associations from species-level abundance effects.
The researchers then assessed the robustness and replicability of the observed SNP–phenotype associations using an independent cohort of 8,204 individuals from the Dutch Microbiome Project.
Of 1,358 bacterial SNPs found to be associated with host BMI, only 40 showed independent associations.
The researchers also estimated the statistical power of a similar MWAS analysis at various sample sizes. In 44% of cases, a species had an SNP associated with BMI even though the relative abundance of that species was not associated with BMI.
Thus, 12 of the 27 bacterial species with BMI-associated SNPs showed no BMI association in the relative abundance analysis. For example, a BMI-associated SNP was found in an inflammatory pathway of Bilophila wadsworthia and another group of SNPs in a region encoding energy metabolism functions in a Faecalibacterium prausnitzii genome.
Interestingly, 52% of the BMI-associated SNPs were discovered in species unrelated to BMI by their relative abundance.
In a geographically and technically distinct Dutch cohort, 17 of 40 BMI-SNP associations were replicated (42.5%), and an additional one was significantly associated but in a reverse direction, suggesting that these associations were not random.
Moreover, seven of 14 species in which SNP–BMI associations replicated in the second cohort did not have species-level relative abundance associations with BMI, further validating the additional information found at the SNP level.
Additional MWAS analyses for the 40 SNPs, using diet, medications, and exercise as covariates in the regression analysis, showed that these factors could not explain most SNP–BMI associations. Diet and exercise confounded only two SNP–BMI associations, possibly because they affect bacterial genetics and host obesity status independently.
Genus- or species-level taxonomic characterization of the gut microbiome is insightful. However, it does not provide a comprehensive understanding of the interconnectedness of the gut microbiome and human health. In contrast, a finer-resolution view of host–microbiome interactions, in particular at the level of SNPs, could help identify specific bacterial functions that are associated with host traits.
The MWAS framework used in the current study overcame the limitations of human GWASs and showed how individual SNPs in the microbiome are associated with host BMI.
It demonstrated how each observed association can be mapped to a specific bacterium, gene locus, and even protein domain and studied in its functional context, which helped generate mechanistic hypotheses on the microbiome’s impact on host weight.
Interestingly, some BMI-associated SNPs may have a causal role, and once validated, they might help develop personalized therapeutics. For instance, the average BMI difference between the allele groups was >2 points for some SNPs, the equivalent of a 5.8 kg difference for a 1.7-meter-tall individual. Thus, if causality is confirmed, treatments based on these SNPs could potentially have large effect sizes. Likewise, some BMI-associated SNPs discovered in this study were adaptive, which might aid improvements in microbiome-based treatments.
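The weight equivalent quoted above follows directly from the definition of BMI as weight in kilograms divided by height in meters squared: at a fixed height, a BMI difference translates into a weight difference of that value multiplied by the squared height. A short check (the function name is only illustrative):

```python
def weight_difference_kg(bmi_difference, height_m):
    """Weight change matching a BMI change at fixed height.

    BMI = weight / height**2, so delta_weight = delta_BMI * height**2.
    """
    return bmi_difference * height_m ** 2


print(round(weight_difference_kg(2, 1.7), 1))  # 5.8
```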
Future research should improve this MWAS framework by developing methods accounting for bacterial population structure.