ABSTRACT
Accurate, reliable, and timely estimates of pathogen variant risk are essential for informing public health responses. Unprecedented rates of genomic sequencing have generated new insights into variant dynamics. However, estimating the fitness advantage of a novel variant shortly after emergence, or its dynamics more generally in data-sparse settings, remains difficult. This challenge is exacerbated in countries where surveillance is limited or intermittent. To stabilize inference in these data-sparse settings, we develop a hierarchical modeling approach to estimate variant fitness advantage and prevalence by pooling data across geographic regions. We demonstrate our method by reconstructing SARS-CoV-2 BA.5 variant emergence, and assess performance using retrospective, out-of-sample validation. We show that stable and robust estimates can be obtained even when sequencing data are sparse. Finally, we discuss how this method can inform risk assessment of novel variants and provide situational awareness on circulating variants for a range of pathogens and use cases.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
We acknowledge the financial support of The Rockefeller Foundation, which funded this work.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
* zsusswein{at}rockfound.org; abento{at}rockfound.org
The manuscript has been revised to:
- Include additional references to the literature.
- Clarify the situations in which the described method is most appropriately applied.
- Add further validation and comparison to existing methods.
Data Availability
All code is publicly available in this GitHub repository: https://github.com/PandemicPreventionInstitute/ppi-variant-tracker-manuscript. The repository does not include any data, but authenticated GISAID users can reproduce all results using the GISAID SARS-CoV-2 and influenza metadata.