ABSTRACT
Accurate, reliable, and timely estimates of pathogen variant risk are essential for informing effective public health responses to infectious diseases. Despite decades of use for influenza vaccine strain selection and PCR-based molecular diagnostics, data on pathogen variant prevalence and growth advantage have risen to their current prominence only during the SARS-CoV-2 pandemic. However, such data are still often sparse, for example when a novel variant is initially rare or a region has limited sequencing. To ensure that real-time estimates of risk are available in such data-sparse conditions, we develop a hierarchical modeling approach that estimates variant fitness advantage and prevalence by pooling data across geographic regions. We apply this method to estimate SARS-CoV-2 variant dynamics at the country level and assess its stability with retrospective validation. Our results show that, even when sequencing data are sparse, this approach yields more stable and robust estimates than established single-country estimation approaches. We discuss how this method can inform risk assessment of novel emerging variants and provide situational awareness of currently circulating variants, for a range of pathogens and use cases.
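To illustrate the general idea of pooling across regions, the following is a minimal sketch of partial pooling under a normal-normal hierarchical model. It is not the model used in this work: the function name `partial_pool`, the input format, and the between-country standard deviation `tau` (treated here as known) are all assumptions made for illustration. Each country's growth-advantage estimate is pulled toward a precision-weighted cross-country mean, and noisier estimates are pulled further.

```python
from typing import Dict, Tuple


def partial_pool(
    estimates: Dict[str, Tuple[float, float]], tau: float
) -> Dict[str, float]:
    """Shrink per-country estimates toward a shared mean.

    estimates maps a country name to (point estimate, standard error).
    tau is an assumed, known between-country standard deviation.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    if not estimates:
        raise ValueError("estimates must not be empty")

    # Precision of each country's estimate around the shared mean.
    precisions = {
        name: 1.0 / (se**2 + tau**2) for name, (_, se) in estimates.items()
    }
    total_precision = sum(precisions.values())

    # Precision-weighted estimate of the shared (cross-country) mean.
    mu = (
        sum(precisions[name] * est for name, (est, _) in estimates.items())
        / total_precision
    )

    pooled = {}
    for name, (est, se) in estimates.items():
        # Weight on the country's own data; small when its standard error is large.
        weight = tau**2 / (tau**2 + se**2)
        pooled[name] = weight * est + (1.0 - weight) * mu
    return pooled
```

In this sketch, a country with abundant sequencing (small standard error) keeps an estimate close to its own data, while a data-sparse country borrows most of its estimate from the other countries.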
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
We acknowledge the financial support of The Rockefeller Foundation, which funded this work.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
* zsusswein{at}rockfound.org; abento{at}rockfound.org
This version of the manuscript has been updated to clarify the timeline and implementation of the variant dashboard based on the method in this work. It also makes modifications to use GISAID's preferred styling, citation, and method of crediting originating labs.
Data Availability
All code is publicly available in this GitHub repository: https://github.com/PandemicPreventionInstitute/ppi-variant-tracker-manuscript. We do not include any data in the repository, but all results can be reproduced using the GISAID SARS-CoV-2 and influenza metadata, which are available to authenticated users.