Abstract
Importance SARS-CoV-2 genomic variants impact the overall sensitivity of COVID-19 diagnosis, leading to false-negative diagnoses and the continued spread of the virus.
Objective To evaluate how nucleotide variability in the target primer-binding sites of SARS-CoV-2 genomes may impact diagnosis using different recommended primer/probe sets, and to suggest the best primer/probe sets for diagnosis.
Design We downloaded 105,118 public SARS-CoV-2 genomes from GISAID (September 25, 2020), removed genomes of apparently poor quality (genome length <29 kb and/or >5% ambiguous bases) and those with missing metadata, and analyzed complementarity for the 13 most widely used diagnostic primer/probe sets for RT-PCR detection. We calculated the N rate and the percentage of genomes recovered for every primer/probe set, taking viral origin and clade into account.
Results Our findings indicate that the Paris_nCoV-IP2, -IP4 and WHO|E_Sarbeco primer/probe sets for COVID-19 currently perform best diagnostically worldwide, recovering >99.5% of the good-quality SARS-CoV-2 genomes from GISAID with no mismatches. The Chinese_CDC|2019-nCoV-NP primer/probe set, among the first to be designed during the pandemic, was the most susceptible to the SARS-CoV-2 variants that are currently most abundant. Mismatches in the binding sites for this set are more frequent in Clade-GR and are highly prevalent in over 30 countries, including Brazil and India, two of the hardest-hit countries.
Conclusions Detection of SARS-CoV-2 in patients may be hampered by significant variability in parts of the viral genome that are targeted by some widely used primer sets. The geographic distribution of different viral clades indicates that continuous assessment of primer sets via sequencing-based surveillance and viral evolutionary analysis is critical to accurate diagnostics. This study highlights sequence variance in target regions that may reduce the efficiency of primer:target hybridization, which in turn may lead to the undetected spread of the virus. Because of this variance, the Chinese_CDC|2019-nCoV-NP set should be used with caution, or avoided, especially in countries with a high prevalence of the GR clade.
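The quality filter described under Design can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, "29kb" is read as 29,000 bases, and any character other than A, C, G or T is assumed to count as an ambiguous base.

```python
# Illustrative sketch only; names and the definition of "ambiguous" are assumptions.

def ambiguous_fraction(seq: str) -> float:
    """Fraction of positions that are not an unambiguous A, C, G or T."""
    if not seq:
        return 1.0  # treat an empty record as fully ambiguous
    seq = seq.upper()
    unambiguous = sum(seq.count(base) for base in "ACGT")
    return (len(seq) - unambiguous) / len(seq)


def passes_quality(seq: str, min_length: int = 29_000,
                   max_ambiguous: float = 0.05) -> bool:
    """Keep genomes of at least 29 kb with at most 5% ambiguous bases."""
    return len(seq) >= min_length and ambiguous_fraction(seq) <= max_ambiguous


def filter_genomes(genomes: dict[str, str]) -> dict[str, str]:
    """Return only the genomes that pass both quality thresholds."""
    return {name: seq for name, seq in genomes.items() if passes_quality(seq)}
```

Applied to a mapping of genome identifiers to sequences, `filter_genomes` would drop records shorter than 29 kb or with more than 5% ambiguous positions; filtering on missing metadata is not covered here.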
Question How variable are the binding-sites of primers/probes used for COVID-19 diagnosis?
Findings We investigated nucleotide variations in the primer-binding sites used for COVID-19 diagnosis in 93,143 SARS-CoV-2 genomes and found primer sets, such as Chinese_CDC|2019-nCoV-NP, that target regions of increasing nucleotide variance over time. These variations are more frequent in Clade-GR, whose frequency is increasing worldwide. Paris_nCoV-IP2, IP4 and WHO|E_Sarbeco performed best.
Meaning We suggest that the use of some sets be halted, and we reinforce the importance of continuous surveillance of SARS-CoV-2 variation to prompt the use of the best primers.
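The question of how variable a primer-binding site is can be made concrete with a small Python sketch. It rests on several assumptions that do not come from the study: a genome is counted as "recovered" only when every oligo in a set has a perfect match somewhere in the genome, the oligos are supplied in genome-sense orientation (reverse primers are reverse-complemented first), and the function names are hypothetical. The brute-force scan is meant for illustration, not for tens of thousands of genomes.

```python
# Illustrative sketch only; the recovery criterion and all names are assumptions.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def min_mismatches(genome: str, oligo: str) -> int:
    """Fewest mismatches between the oligo and any same-length genome window."""
    genome, oligo = genome.upper(), oligo.upper()
    k = len(oligo)
    if k == 0 or len(genome) < k:
        raise ValueError("oligo must be non-empty and no longer than the genome")
    return min(
        sum(a != b for a, b in zip(genome[i:i + k], oligo))
        for i in range(len(genome) - k + 1)
    )


def recovery_rate(genomes: dict[str, str], oligos: list[str]) -> float:
    """Percentage of genomes in which every oligo has a perfect match."""
    if not genomes:
        return 0.0
    recovered = sum(
        all(min_mismatches(seq, oligo) == 0 for oligo in oligos)
        for seq in genomes.values()
    )
    return 100.0 * recovered / len(genomes)
```

Grouping the input genomes by clade or country before calling `recovery_rate` would give per-group percentages comparable in spirit to those reported above.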
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The author(s) received no financial support for the work presented here.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
IRB/oversight body is exempted
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Emails: Renan – renan.valieris{at}accamargo.org.br, Michal – m.kowalski{at}doctoral.uj.edu.pl, Alina – fshodan{at}gmail.com, Witold - witold.wydmanski{at}uj.edu.pl, Foox - jof3004{at}med.cornell.edu, Giovana – giovana.torrezan{at}accamargo.org.br, Ewelina - ewelina.pospiech{at}uj.edu.pl, Wojciech - w.branicki{at}gmail.com, Venkat - kasthuri.j.venkateswaran{at}jpl.nasa.gov, Bharath - bharath.prithiviraj{at}brooklyn.cuny.edu, Dhamodharan - rbdhamu{at}avanzbio.co.in, Klas - klas.udekwu{at}slu.se, Diana – dnoronha{at}accamargo.org.br, Dirce – dirce.carraro{at}accamargo.org.br, Chris - chm2042{at}med.cornell.edu
Data Availability
The raw data set used in this paper can be downloaded from https://www.gisaid.org/