Abstract
The COVID-19 pandemic has seen the persistent emergence of fitter Variants of Concern (VOCs) that have successfully out-competed circulating strains, but the determinants of viral fitness remain unknown. Here we define the ‘Distinctiveness’ of SARS-CoV-2 sequences based on a proteome-wide comparison with all prior sequences from the same geographical region. From the perspective of viral evolution, Distinctiveness captures “regional herd exposure” and has an advantage over the canonical concept of mutation, which is defined relative to an ancestral reference sequence that does not change over time. By assessing the correlation between Distinctiveness and change in prevalence for all circulating lineages in each region when a new lineage is introduced, we find that the relative Distinctiveness of emergent SARS-CoV-2 lineages is associated with their competitive fitness (Pearson r = 0.67). Further, by comparing the Delta variant in India and Brazil, we show that the same lineage can have different Distinctiveness-contributing positions in different geographical regions, depending on the variants that previously circulated in those regions. Finally, analysis of Omicron lineages in India and the USA shows that the BA.1 and BA.2 sub-lineages have comparable Distinctiveness, suggesting that they may have similar levels of competitive fitness. Overall, our study proposes that augmenting the ongoing surveillance of highly mutated variants with real-time assessment of Distinctiveness can aid pandemic preparedness.
Competing Interest Statement
All authors are employees of nference and have financial interests in the company. nference is collaborating with bio-pharmaceutical, medical device and diagnostics companies, public health agencies, academic medical centers and health systems on data science initiatives unrelated to this study. These collaborations had no role in study design, data collection and analysis, decision to publish, or preparation of this manuscript.
Funding Statement
This study was self-funded by nference. No external funding was received for this study.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
+ Joint first authors
Data Availability
All SARS-CoV-2 sequences and associated metadata were downloaded from GISAID.