Abstract
SARS-CoV-2 evolution threatens vaccine- and natural infection-derived immunity, and the efficacy of therapeutic antibodies. Herein we sought to predict Spike amino acid changes that could contribute to future variants of concern. We tested the importance of features comprising epidemiology, evolution, immunology, and neural network-based protein sequence modeling. This analysis identified the primary biological drivers of SARS-CoV-2 intra-pandemic evolution. We found evidence that resistance to population-level host immunity has increasingly shaped SARS-CoV-2 evolution over time. We identified, with high accuracy, mutations that will spread up to four months in advance, across different phases of the pandemic. Behavior of the model was consistent with a plausible causal structure wherein epidemiological variables integrate the effects of diverse and shifting drivers of viral fitness. We applied our model to forecast mutations that will spread in the future, and characterized how these mutations affect the binding of therapeutic antibodies. These findings demonstrate that it is possible to forecast the driver mutations that could appear in emerging SARS-CoV-2 variants of concern. This modeling approach may be applied to any pathogen with genomic surveillance data, and so may address other rapidly evolving pathogens such as influenza, and unknown future pandemic viruses.
Competing Interest Statement
M.C.M., I.B., J.diI., E.F., L.S., F.A.L., G.S., D.C., H.W.V. and A.T. are employees of, and may hold shares in, Vir Biotechnology Inc.
Funding Statement
D.L.R. is funded by the MRC (MC_UU_12014/12).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
N/A
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
We gratefully acknowledge the authors, originating and submitting laboratories of the sequences from GISAID used in the current study. The full acknowledgement list can be found in Suppl. File S4.