In a recent study published in eBioMedicine, researchers developed the Transmission Fitness Polymorphism (TFP) scanner analysis pipeline to detect severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants with high growth rates, serving as leading indicators to generate early warning signals (EWS) for epidemic waves.
Study: Phylogenomic early warning signals for SARS-CoV-2 epidemic waves.
Background
Coronavirus disease 2019 (COVID-19) has caused recurring epidemic waves associated with the continual emergence of new SARS-CoV-2 variants.
Rapid identification of the variants is crucial for predicting future waves and implementing countermeasures like social distancing, vaccination, or improvements to healthcare capacity.
Statistical approaches for generating EWS have been developed, frequently based on the incidence or prevalence of infectious diseases. Machine learning has demonstrated the ability to improve sensitivity and specificity.
In addition, researchers have attempted to create EWS using indirect data such as polymerase chain reaction (PCR) cycle threshold (Ct) values, behavioral abnormalities, and job absenteeism.
About the study
In the present study, researchers explored the utility of SARS-CoV-2 genomic sequence data for generating EWS of future COVID-19 waves, analyzing United Kingdom (UK) COVID-19 pandemic data from August 2020 to March 2022.
The team identified leading indicators that generate early warning signals ahead of an exponential rise in COVID-19-related hospitalizations.
Subsequently, they compared the performance of SARS-CoV-2 phylogeny-based leading indicators with non-phylogeny-based ones, such as new hospital admissions, test positivity rates, PCR Ct values, CoMix survey data, and Google mobility data.
They explored the sensitivity of EWS lead duration to the cutoff set for false positive EWS. The study goal was to maximize lead time and minimize the number of false positive EWS to improve the effectiveness of countermeasures.
The team examined large SARS-CoV-2 phylogenies and determined logistic growth rates (LGRs) for clusters within each phylogeny using a generalized linear model (GLM) and a generalized additive model (GAM).
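As a rough illustration of what a logistic growth rate measures, the sketch below fits a plain logistic regression of cluster membership against sampling time and returns the slope. It is not the TFP scanner code: the function name, the encoding of samples as 1 (in the cluster) or 0 (comparison sample), and the data are all assumptions made for this example, and the study's GLM and GAM fits may differ in detail.

```python
import math

def logistic_growth_rate(times, in_cluster, iterations=25):
    """Fit logit(P(sample is in cluster)) = a + b * t by Newton's method.

    Returns the slope b, i.e. the change in log-odds per unit of time.
    Hypothetical helper for illustration only.
    """
    t_mean = sum(times) / len(times)
    ts = [t - t_mean for t in times]  # centre time for numerical stability
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for t, y in zip(ts, in_cluster):
            p = 1.0 / (1.0 + math.exp(-(a + b * t)))
            w = p * (1.0 - p)
            g_a += y - p          # gradient of the log-likelihood
            g_b += (y - p) * t
            h_aa += w             # negative Hessian entries
            h_ab += w * t
            h_bb += w * t * t
        det = h_aa * h_bb - h_ab * h_ab
        if det < 1e-12:
            break
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return b

# Invented data: 10 samples per day for 10 days, with the cluster's share rising.
cluster_counts = [1, 1, 2, 3, 4, 5, 6, 7, 8, 9]
times, labels = [], []
for day, k in enumerate(cluster_counts):
    times += [day] * 10
    labels += [1] * k + [0] * (10 - k)

print(round(logistic_growth_rate(times, labels), 3))  # positive slope per day
```

A positive slope means the cluster is gaining ground on the comparison samples; a cluster with an unusually large slope is the kind of candidate a growth-rate scan would flag.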
They also calculated a molecular clock outlier (MCO) statistic, which assesses the degree to which evolutionary rates diverge in a phylogenetic branch. In the TFP scanner analysis, they varied the minimum cluster age, maximum cluster age, and minimum number of descendants per cluster across 24 parameter settings.
The team applied filters to the clusters used to generate the leading indicator time series, which comprised both existing and external clusters.
They estimated EWS lead durations relative to COVID-19 epidemic wave start dates, determined by applying an optimal GAM to new hospital admissions data from the UK.
They used TFP scanner input parameter sets, varied cluster filters, different possible leading indicators, and a range of EWS threshold values to generate 1.40 million EWS time series. For comparison, they also created EWS using non-phylogeny-derived potential leading indicators.
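The relationship between a leading indicator, an EWS threshold, and a lead duration can be sketched as follows. This is a minimal illustration, assuming an EWS is raised on the first date the indicator exceeds the threshold; the function names, dates, and values are invented and are not taken from the study.

```python
from datetime import date

def first_signal_date(indicator, threshold):
    """Return the earliest date on which the indicator exceeds the threshold, or None."""
    for day in sorted(indicator):
        if indicator[day] > threshold:
            return day
    return None

def lead_time_days(wave_start, signal_date):
    """Positive values mean the signal preceded the wave start; negative values mean a lag."""
    return (wave_start - signal_date).days

# Invented indicator values (for example, a maximum cluster growth rate) keyed by date.
indicator = {
    date(2000, 1, 1): 0.10,
    date(2000, 1, 8): 0.35,
    date(2000, 1, 15): 0.80,
    date(2000, 1, 22): 1.20,
}
wave_start = date(2000, 2, 1)  # invented wave start date

signal = first_signal_date(indicator, threshold=0.5)
print(lead_time_days(wave_start, signal))  # prints 17
```

Under this convention, a signal raised after the wave has already started yields a negative lead time, which corresponds to the lag reported for some waves.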
The team used contemporary trees to replicate real-time analysis and avoid data alterations. On May 3, 2022, they connected genomic sequences in the trees to patient case metadata obtained from COG-UK via CLIMB.
They selected only Pillar 2 (P2) samples to avoid sampling bias in Pillar 1 (P1) hospital samples and provide a more representative sample of SARS-CoV-2 transmission in the community.
Results
Phylogeny-derived leading indicators, such as the maximal logistic growth rate (LGR) among the predominant Pango lineage clusters and the average LGR across more numerous clusters, showed promising results in generating EWS ahead of significant increases in COVID-19 hospitalizations in the UK.
Lead times ranged from 20 days ahead of the wave (for the SARS-CoV-2 Delta variant wave) to a lag of seven days (for the SARS-CoV-2 B.1.177 variant wave), with a mean lead time of five days, indicating that the indicators can anticipate epidemic waves.
The phylogenomic approach evaluated SARS-CoV-2 phylogenomic data and extracted EWS for COVID-19-related hospitalizations in successive pandemic waves.
Phylogeny-derived leading indicators performed better than non-phylogeny-derived ones concerning lead time and minimizing false positive EWS. The team achieved longer lead times by tolerating more false-positive EWS.
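The trade-off between lead time and false positives can be illustrated with a toy threshold sweep. The sketch below makes several simplifying assumptions that are not from the study: a signal is an upward crossing of the threshold, it counts as a true positive only if a wave starts within a fixed horizon afterwards, and the indicator values are invented.

```python
def evaluate_threshold(values, wave_starts, threshold, horizon=30):
    """Return (lead_times, false_positives) for one threshold.

    values: indicator value per day (list index = day number).
    wave_starts: day numbers on which waves begin.
    A signal with no wave start in the following `horizon` days is a false positive.
    """
    signals = [
        day for day in range(1, len(values))
        if values[day] > threshold and values[day - 1] <= threshold
    ]
    lead_times = {}
    false_positives = 0
    for s in signals:
        upcoming = [w for w in wave_starts if 0 <= w - s <= horizon]
        if not upcoming:
            false_positives += 1
            continue
        wave = min(upcoming)
        # Keep the earliest qualifying signal, i.e. the longest lead time.
        lead_times[wave] = max(lead_times.get(wave, 0), wave - s)
    return lead_times, false_positives

# Invented indicator: a brief spurious bump, then a sustained rise before a wave on day 50.
values = [0.05] * 4 + [0.3] * 3 + [0.05] * 28 + [0.4] * 5 + [0.9] * 20
wave_starts = [50]

for threshold in (0.2, 0.5):
    print(threshold, evaluate_threshold(values, wave_starts, threshold))
```

In this toy series, the lower threshold gives a 15-day lead at the cost of one false positive, while the higher threshold gives a 10-day lead with none.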
Conclusion
Overall, the study findings highlighted the development of the TFP scanner pipeline to identify SARS-CoV-2 strains with high growth rates and generate early warning signals for COVID-19 waves in the United Kingdom.
The phylogenomic approach, based on the logistic growth rates of phylogenetic clusters, produced lead times ahead of epidemic waves, which could help public health authorities prepare countermeasures.
The EWS lead times indicate that the method could benefit broader SARS-CoV-2 surveillance programs and may apply to other nations and regions with various sequencing capacities and sampling procedures. Future studies could analyze EWS produced from wastewater and diagnostic test samples.