Abstract
SARS-CoV-2 is a betacoronavirus responsible for the COVID-19 pandemic, which has affected millions of people worldwide, with no dedicated treatment or vaccine currently available. Because both pharmaceutical research against SARS-CoV-2 and the most frequently used tests for infection depend on the genomic and peptide sequences of the virus for their efficacy, understanding the mutation rates and mutational content of the virus is critical. Two key proteins for SARS-CoV-2 infection and replication are the S protein, responsible for viral entry into host cells, and RdRp, the RNA-dependent RNA polymerase responsible for replicating the viral genome. Due to their roles in the viral cycle, these proteins are crucial for the fitness and infectiousness of the virus. Our previous findings showed that the two most frequently observed mutations in the SARS-CoV-2 genome, 14408C>T in the RdRp coding region and 23403A>G in the S gene, are correlated with higher mutation density over time. In this study, we further detail the selection dynamics and mutation rates of SARS-CoV-2 genes, comparing isolates carrying both mutations with isolates carrying neither. We find that the S gene and the RdRp coding region show the highest variance between the two genotypes, and that their selection dynamics contrast with each other over time. The S gene displays higher positive selection in mutant isolates early on and undergoes increasing negative selection over time, whereas the RdRp region in the mutant isolates shows strong negative selection throughout the pandemic.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
Necla Kochan was supported by the Scientific and Technological Research Council of Turkey (TUBITAK-STAR). Yavuz Oktay is supported by the Turkish Academy of Sciences Young Investigator Program (TUBA-GEBİP). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Only publicly available data from the GISAID database were used for in silico analyses; therefore, no approval from an IRB was sought.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
The data are available at Mendeley Data: Kochan, Necla; Eskier, Doga; Suner, Asli; Karakulah, Gokhan; Oktay, Yavuz (2020), SARS-CoV-2 GISAID UK-US isolates (2020-09-07) genotyping VCF, Mendeley Data, V1, doi: 10.17632/5dfj2hhnng.1