A prospective diagnostic study to measure the accuracy of detection of SARS-CoV-2 Variants Of Concern (VOC) utilising a novel RT-PCR GENotyping algorithm in an In silico Evaluation (VOC-GENIE)
================================================================================================================================================================================================

* Daryl Borley
* R.A. Trevor
* Alex Richter
* Stephen Kidd
* Nick Cortes
* Nathan Moore
* Alice Goring
* Kate Templeton
* Prachi Teltumbde
* Seden Grippon
* Paul Oladimeji
* Aida Sanchez-Bretano
* Andrew Dawson
* Joanne E Martin

## Abstract

**Background** SARS-CoV-2 variants of concern (VOCs) have been associated with a higher rate of transmission and with evasion of immunisation and antibody therapeutics. Variant sequencing is widely utilised in the UK. However, as of 08 April 2021 only 0.5% (~650k) of the 133 million cumulative positive cases worldwide had been sequenced (in GISAID), with 97% of those sequences coming from Europe and North America, and only ~0.25% (~320k) were variant sequences. This may be due to the limited availability and high cost of sequencing, and to the infrastructure and expert staff it requires. Public health decisions based on a non-randomised sample of 0.5% of positive cases may be insufficiently powered, and subject to sampling bias and systematic error. In addition, sequencing is rarely available *in situ* in a clinically relevant timeframe and is therefore not currently compatible with patient diagnosis and treatment pathways. We therefore investigated an alternative approach using polymerase chain reaction (PCR) genotyping to detect the key single nucleotide polymorphisms (SNPs) associated with increased transmission and immune evasion in SARS-CoV-2 variants.

**Methods** We investigated the utility of SARS-CoV-2 SNP detection with a panel of PCR-genotyping assays in a large data set of 640,482 high quality, full length SARS-CoV-2 sequences using a prospective *in silico* trial design, and explored the potential impact of rapid *in situ* variant testing on the COVID-19 diagnosis and treatment patient pathway.

**Results** Five SNPs were selected by screening the published literature for a reported association with increased transmission and/or immune evasion. 344,881 sequences contained one or more of the five SNPs. This panel of SNPs identified the four variants of concern (VOCs) and sequences containing the E484K and L452R escape mutations.

**Interpretation** The *in silico* analysis suggests that the key mutations and variants of SARS-CoV-2 may be reliably detected using a focused algorithm of biologically relevant SNPs. This highlights the potential for rapid *in situ* PCR genotyping to complement sequencing, or to be used in its place in settings where sequencing is not feasible, accessible or affordable. Rapid detection of variants with *in situ* PCR genotyping may facilitate a more effective COVID-19 diagnosis and treatment patient pathway.

**Funding** The study was funded by Primer Design (UK), with kind contributions from all academic partners.

## Introduction

In December 2020, the first SARS-CoV-2 variant of concern (VOC), 20I/501Y.V1 (B.1.1.7), was identified in Kent, UK, and 20H/501Y.V2 (B.1.351) was identified in South Africa. Subsequently, VOCs were identified in Brazil and Bristol, and more than 20 other significant variants have been identified globally.

Variants of concern are associated with a higher rate of transmission, mortality and morbidity and/or the potential to evade immunisation and/or antibody therapeutics.1 B.1.1.7 has been associated with increased transmission and mortality risk (hazard ratio 1.64, 95% CI 1.32 - 2.04)2 and vaccines are reported to offer diminished efficacy against variants with the E484K escape mutation.3 VOCs with the E484K spike protein mutation include B.1.351 (South Africa), VOC-202102/02 (B.1.1.7 with E484K) and P.1 (Brazil). E484K has been described as an escape mutation due to an association with resistance to convalescent sera and antibody therapies and with increased re-infection rates.4 The E484K and L452R mutations have also been associated with diminished vaccine efficacy; for example, the Oxford/AstraZeneca vaccine was reported to be only 21.9% effective against the South African variant5 in *in vitro* studies, and the Pfizer–BioNTech COVID-19 vaccine elicits antibodies that only partially recognise this variant.6

The hospital acquired SARS-CoV-2 infection rate was estimated to be 12.5% in April 2020.7 Variants with escape mutations are associated with infections with higher mortality rates, and the previously used convalescent serum and monoclonal antibodies may not be as effective.8 This may lead to an increase in hospital admission and emergency room attendance rates, resulting in a need for rapid *in situ* variant testing to reduce the risk of nosocomial infection with variants.

The UK approach to variant testing is a leading continuous nationwide surveillance programme using genome sequencing.9,10 Currently, patients' positive reverse transcription polymerase chain reaction (RT-PCR) samples are reported and then sent to large central sequencing facilities with a turn-around time of 1-2 weeks. While this timeframe may be sufficient for epidemiological surveillance, this approach is neither patient centric nor able to deliver results in a clinically relevant timeframe.
Sequencing is currently not suitable for *in situ* variant detection because of its inherent lack of speed, limited on-site availability, and cost. In addition, few nations are meeting the minimum requirements, set out by different countries, for the sample numbers needed for adequate sequencing-based variant surveillance. There are only ~685k high quality sequences in GISAID11, representing just 0.51% of the 133.1 million cumulative positive cases worldwide (as of 08 April 2021), with approximately 97% of those sequences coming from Europe and North America.

An alternative approach utilises RT-PCR genotyping to identify SNPs and was found to provide rapid, cost-effective, and reliable variant monitoring9 in a pilot trial. The accuracy of variant PCR-genotyping has not yet been reported with a large data set. Thus, in this study we investigated a large data set of ~640,000 SARS-CoV-2 sequences with PCR-genotyping using a prospective *in silico* trial design, and explored the potential impact on the patient pathways.

## Methods

Five SARS-CoV-2 SNPs were prospectively selected from a review of the published literature for an association with (i) increased SARS-CoV-2 transmission and/or (ii) diminished efficacy of monoclonal antibody therapy, convalescent plasma therapy, vaccine derived immunity, or naturally acquired immunity.10 The study team members who prospectively selected the SNPs were different from those who performed the *in silico* analysis or the investigation of the SNP dataset.

The publicly available Spike Protein Sequence alignment was downloaded from GISAID11 on the 19th of March 2021. Subsequently, a further alignment using a multiple sequence alignment program (MAFFT) with a NJ / UPGMA phylogeny was performed. The sequences with more than 1% ambiguous bases were removed from the original dataset of 781,815 sequences.
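
As a rough illustration of the quality filter described above, the following Python sketch drops any record in which more than 1% of positions are ambiguous. The function names, the toy records and the use of `X` as the ambiguity symbol are assumptions made for illustration; the study does not state how the filter was implemented.

```python
def ambiguous_fraction(seq: str, ambiguous: str = "X") -> float:
    """Fraction of positions in `seq` that carry an ambiguity symbol.

    `ambiguous` defaults to "X" (an assumption for protein alignments).
    """
    if not seq:
        return 1.0  # treat an empty record as fully ambiguous
    return sum(seq.count(symbol) for symbol in ambiguous) / len(seq)


def keep_high_quality(records: dict, max_fraction: float = 0.01) -> dict:
    """Keep records whose ambiguous fraction does not exceed `max_fraction`."""
    return {
        name: seq
        for name, seq in records.items()
        if ambiguous_fraction(seq) <= max_fraction
    }


# Toy records, 200 positions each (hypothetical data, not GISAID sequences).
toy_records = {
    "clean": "A" * 200,             # 0% ambiguous: kept
    "minor": "X" * 1 + "A" * 199,   # 0.5% ambiguous: kept
    "poor": "X" * 3 + "A" * 197,    # 1.5% ambiguous: removed
}
kept = keep_high_quality(toy_records)
```

Applied to the full download, a filter of this kind would correspond to the reduction from the original dataset to the 640,482 sequences carried forward into the analysis.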
Each protein sequence was then interrogated in Excel using the function search string =IF(RIGHT(LEFT(amino acid position, is residue X), 1)=Residue of interest, 1, 0) to determine the number of sequences that contained each SNP of interest. The =IF(AND(Residue X, Residue Y…),1,0) functions were then used to sequentially determine the sequences containing the SNPs, following the algorithm shown in Figure 1, to exclude the wild type sequences at each level of the analysis.

![Figure 1](https://www.medrxiv.org/content/medrxiv/early/2021/05/09/2021.05.05.21256396/F1.medium.gif)

[Figure 1](http://medrxiv.org/content/early/2021/05/09/2021.05.05.21256396/F1): Diagram of the number of GISAID sequences relating to mutations of interest at each step of the algorithm to distinguish VOCs

[Table 1](http://medrxiv.org/content/early/2021/05/09/2021.05.05.21256396/T1): List of the five mutations of interest and their corresponding effect on transmission and immune evasion

## Results

The five SARS-CoV-2 single nucleotide polymorphisms (SNPs) were selected for their reported association with (i) an increase in SARS-CoV-2 transmission and/or (ii) diminished efficacy of monoclonal antibody therapy, convalescent plasma therapy, vaccine derived immunity, or naturally acquired immunity.10

The sequences with more than 1% ambiguous bases were removed from the original dataset of 781,815 sequences. The remaining 640,482 sequences were investigated using the VOC-GENIE algorithm shown in Figure 1.

At step (1), 640,482 sequences were assessed for the presence of the N501Y mutation associated with increased transmission. 324,234 sequences contained the SNP, while 316,248 were wild type at this position and were assessed at step (2) for the E484K and L452R escape mutations. Surprisingly, 7,979 sequences contained the E484K mutation but neither N501Y nor the B.1.1.7 identifier.
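
The spreadsheet test described in the Methods (take the leftmost characters up to a position, then the last of those) amounts to reading one residue at a 1-based position. A minimal Python sketch of the same idea follows; the helper names and the toy sequences are hypothetical, and the sketch assumes that alignment columns coincide with spike residue numbering, which the text does not confirm.

```python
# Wild-type residues at the spike positions named in the text.
WILD_TYPE = {417: "K", 452: "L", 484: "E", 501: "N"}


def has_residue(seq: str, position: int, residue: str) -> int:
    """Return 1 if the 1-based `position` of `seq` holds `residue`, else 0.

    Mirrors the spreadsheet pattern IF(RIGHT(LEFT(seq, position), 1) = residue, 1, 0).
    """
    if position < 1 or position > len(seq):
        return 0
    return int(seq[position - 1] == residue)


def has_all(seq: str, *sites: tuple) -> int:
    """Return 1 only if every (position, residue) pair matches, like IF(AND(...), 1, 0)."""
    return int(all(has_residue(seq, pos, res) for pos, res in sites))


def toy_spike(*mutations: tuple) -> str:
    """Build a 600-residue placeholder sequence with the given substitutions."""
    residues = ["A"] * 600
    for position, residue in {**WILD_TYPE, **dict(mutations)}.items():
        residues[position - 1] = residue
    return "".join(residues)


# Toy panel: one wild type, one N501Y, one E484K without N501Y.
panel = [toy_spike(), toy_spike((501, "Y")), toy_spike((484, "K"))]

# Step (1): split on N501Y; step (2): look for E484K among the N501Y-negative.
n501y_positive = [s for s in panel if has_residue(s, 501, "Y")]
n501y_negative = [s for s in panel if not has_residue(s, 501, "Y")]
e484k_without_n501y = sum(has_residue(s, 484, "K") for s in n501y_negative)

# The reported step (1) split accounts for every sequence analysed.
assert 324_234 + 316_248 == 640_482
```

A position-based lookup of this kind is only as reliable as the alignment behind it, since insertions or deletions upstream of a site would shift the column that holds the residue of interest.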
At step (3), only 2,866 sequences contained the E484K escape mutation, while B.1.1.7 was detected in over 321k sequences. The L452R SNP was detected in 13 sequences at step (4). At step (5), the B.1.1.7 with E484K (Bristol), South African and Brazil (P.1) variants can be distinguished using the K417T/N mutation in combination with the E484K, N501Y and B.1.1.7 SNPs. 234 Bristol VOC sequences were identified, along with 1,766 South African and 866 Brazilian sequences. The panel of five SNPs detected all the VOCs and mutations of biological significance.

The potential impact of rapid PCR identification of variants on patient pathways is illustrated in Figure 2.

![Figure 2](https://www.medrxiv.org/content/medrxiv/early/2021/05/09/2021.05.05.21256396/F2.medium.gif)

[Figure 2](http://medrxiv.org/content/early/2021/05/09/2021.05.05.21256396/F2): Pathways with simulated timelines for identification of variants used globally

## Discussion

The SARS-CoV-2 sequences in GISAID represent only 0.5% of the 133 million cumulative cases worldwide, with approximately 97% of those sequences coming from Europe and North America. This is a very small non-randomised sample, which may be insufficiently powered and subject to sampling bias and systematic error; our current understanding of the global epidemiology of SARS-CoV-2 may therefore be extremely limited.

The algorithm of five SNPs identified all four of the variants of concern and the mutations of biological significance, together with two previously unreported SNP combinations: (a) N501Y + L452R and (b) E484K in the absence of N501Y (Figure 1).

The other potential use of SNP based variant testing is in those cases where high quality nucleic acid is not available for sequencing. 141,297 out of 780,815 sequences in GISAID were of poor quality or incomplete; for such samples, PCR SNP testing may be a more robust technique.
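
The stepwise logic reported in the Results can be sketched as a small decision function. This is one reading of the text rather than the authors' implementation: Figure 1 is not reproduced here, the order of the checks and the output labels are assumptions, and `b117_marker` is a hypothetical flag standing in for the B.1.1.7 identifier, which the text does not name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpCall:
    """Genotyping result for one sample (field names are illustrative)."""
    n501y: bool
    e484k: bool
    l452r: bool
    k417: str          # "K" (wild type), "N" or "T"
    b117_marker: bool  # hypothetical stand-in for the B.1.1.7 identifier


def classify(call: SnpCall) -> str:
    """Assign a label following the five steps described in the Results."""
    if not call.n501y:
        # Step (2): escape mutations among N501Y-negative samples.
        # E484K is checked first; that ordering is an assumption.
        if call.e484k:
            return "E484K without N501Y"
        if call.l452r:
            return "L452R without N501Y"
        return "wild type at the panel sites"
    if call.e484k:
        # Step (5): K417 separates the three E484K-carrying VOCs.
        if call.k417 == "N":
            return "B.1.351 (South Africa)"
        if call.k417 == "T":
            return "P.1 (Brazil)"
        if call.b117_marker:
            return "B.1.1.7 with E484K (Bristol)"
        return "N501Y + E484K, unassigned"
    if call.l452r:
        return "N501Y + L452R"  # step (4)
    if call.b117_marker:
        return "B.1.1.7"        # step (3)
    return "N501Y, unassigned"


# Counts reported at step (5); they sum to the 2,866 E484K-positive
# sequences reported at step (3).
reported_step5 = {"Bristol": 234, "South Africa": 1_766, "Brazil": 866}
assert sum(reported_step5.values()) == 2_866

example = classify(SnpCall(n501y=True, e484k=True, l452r=False, k417="T", b117_marker=False))
```

Samples that fall into an "unassigned" branch match no known combination in this sketch, which is the kind of result the Discussion proposes referring on for sequencing.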
Further, patient samples with a VOC genotype also have a higher viral load27, meaning that these samples are more likely to be sequenced successfully, which biases the results of epidemiological mapping. Accurate genomic sequencing requires high quality, full length RNA from a sample with a high viral load (approximately Ct < 30), whereas, owing to the amplification step, PCR is able to detect VOCs in samples with a low viral load.

From these initial results, PCR genotyping may offer a cheaper, more accessible and faster alternative to sequencing9 and could therefore be deployed and utilised more broadly to build a more representative data set for public health decision making. Our analysis demonstrates the potential of PCR SNP genotyping to provide rapid *in situ* variant detection with a widely accessible and clinically relevant approach.

PCR genotyping is limited to SNPs that have already been identified by sequencing, and so sequencing is still required for *de novo* detection. However, public health and clinical interventions are not based on *de novo* detection but on the evidence that a known variant has a deleterious adaptation. Thus, PCR genotyping may be better suited for clinical and population variant detection, allowing the scarce sequencing resources to be better utilised for *de novo* detection of biologically relevant SNPs.

The potential changes to the COVID-19 patient pathway from including PCR genotyping are explored in Figure 2:

1. PCR and then VOC PCR where novel variants could be identified for sequencing.
2. Current Pathway: PCR and then sequencing to identify VOCs
3. Positive PCR reflexed on to VOC PCR for variant identification
4. Clinically diagnosed hospital patients tested directly for variants

Usually, positive COVID-19 tests are sent to a network of specialised centres for sequencing and to determine the presence of SNPs and VOCs.
This is the traditional pathway 2 in Figure 2 and is associated with a delay in the availability of results, increased costs, and the need for more samples from individual patients. Alternatively, in pathway 3, the PCR VOC pathway, a single positive sample is used to detect VOCs and the key SNPs in approximately two hours. Pathway 1 is the most efficient and effective in settings where sequencing is available: samples follow the PCR VOC pathway, and only samples that do not match known VOCs and SNP combinations, i.e. potential *de novo* variants, are sent for sequencing.

In the hospital setting, pathway 4, the PCR detection of VOCs can be used in a near patient setting to facilitate testing and treatment decisions in a clinically relevant timeframe, and also to test all staff and patients attending or being admitted to hospitals to reduce the risk of nosocomial transmission. This pathway is also applicable to office, transport, educational and large event settings, which have acted as super spreader events for SARS-CoV-2, and it perhaps exemplifies how a simple and reliable rapid PCR technology might be deployed to underpin a 'return to normal' economic recovery.

## Data Availability

The data can be made available on request.

## Contributorship statement

Contributors: All authors (DB, RAT, PT, SG, PO, ASB, AR, SK, NM, NC, AG, AP, HM, AD, JM) contributed to designing the work, analysing the data, and drafting and revising the manuscript.

## Declaration of interests

Stephen Kidd, Nick Cortes, Nathan Moore, Kate Templeton, Alex Richter and Alice Goring have no conflicting interests. R.A Trevor, Daryl Borley, Paul Oladimeji, Prachi Teltumbde, Seden Grippon, Andrew Dawson and Aida Sanchez-Bretano are employees of Novacyt group, which is a medical diagnostics company operating in the COVID-19 variant testing field.
R.A Trevor has no additional direct conflicts but is a shareholder in a number of unrelated private and public companies that do not operate in the COVID-19 or diagnostics field. Joanne Martin has no direct conflicts of interest. She is a principal investigator of a care home trial using Novacyt rapid testing and National Specialty Advisor for Pathology for NHS England and Improvement. She is a director and shareholder of Biomoti, a drug delivery company, and has a shareholding in Glyconics, a diagnostics company.

## Role of the funding source

The funder of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report. All authors had full access to all the data in the study and had final responsibility for the decision to submit for publication.

## Acknowledgement

Tom Jefferson contributed to data acquisition.

## Footnotes

* Daryl.Borley{at}novacyt.com, CTO{at}novacyt.com, A.G.Richter{at}bham.ac.uk, stephen.kidd{at}hhft.nhs.uk, Nick.Cortes{at}hhft.nhs.uk, Nathan.moore{at}hhft.nhs.uk, Alice.Goring{at}hhft.nhs.uk, Kate.Templeton{at}nhslothian.scot.nhs.uk, Prachi.teltumbde{at}novacyt.com, Seden.Grippon{at}novacyt.com, Paul.oladimeji{at}novacyt.com, Aida.Sanchez{at}novacyt.com, Andrew.Dawson{at}novacyt.com, j.e.martin{at}qmul.ac.uk
* Received May 5, 2021.
* Revision received May 5, 2021.
* Accepted May 9, 2021.
* © 2021, Posted by Cold Spring Harbor Laboratory. This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/)

## References

1. McNally A. What makes new variants of SARS-CoV-2 concerning is not where they come from, but the mutations they contain. BMJ 2021; 372:504
2. Challen R, Brooks-Pollock E, Read J M, Dyson L, Tsaneva-Atanasova K, Danon L et al.
Risk of mortality in patients infected with SARS-CoV2 variant of concern 202012/1: matched cohort study. BMJ 2021; 372:579. Doi: 10.1136/bmj.n579
3. Wise J. Covid-19: The E484K mutation and the risks it poses. BMJ 2021; 372:359
4. Choi B, Choudhary MC, Regan J, Sparks JA, Padera RF, Qiu X, et al. Persistence and Evolution of SARS-CoV-2 in an Immunocompromised Host. N Engl J Med. 2020;383(23):2291–3
5. Garcia-Beltran WF, Lam EC, Denis KS et al. Multiple SARS-CoV-2 variants escape neutralization by vaccine-induced humoral immunity. Cell. 2021. [online] Available at: [https://www.sciencedirect.com/science/article/pii/S0092867421002981](https://www.sciencedirect.com/science/article/pii/S0092867421002981) [Accessed April 8, 2021].
6. Madhi SA, Baillie V, Cutland CL et al. Efficacy of the ChAdOx1 nCoV-19 Covid-19 vaccine against the B.1.351 variant. New England Journal of Medicine (2021). [online] Available at: [https://www.nejm.org/doi/full/10.1056/NEJMoa2102214](https://www.nejm.org/doi/full/10.1056/NEJMoa2102214) [Accessed April 8, 2021].
7. Carter B, Collins JT, Barlow-Pay F et al. Nosocomial COVID-19 infection: examining the risk of mortality. The COPE-Nosocomial Study (COVID in Older People). Journal of Hospital Infection; 106(2):367–394.
8. Abdool Karim SS, de Oliveira T. New SARS-CoV-2 variants - clinical, public health, and vaccine implications. New England Journal of Medicine (2021). [online] Available at: [https://www.nejm.org/doi/full/10.1056/NEJMc2100362](https://www.nejm.org/doi/full/10.1056/NEJMc2100362)
9. Harper H, Burridge A, Winfield M et al. Detecting SARS-CoV-2 variants with SNP genotyping. PLoS One. 2021 Feb 24;16(2):e0243185.
10. [https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment\_data/file/954990/s1015-sars-cov-2-immunity-escape-variants.pdf](https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/954990/s1015-sars-cov-2-immunity-escape-variants.pdf) [Accessed April 8, 2021].
11. Elbe S, Buckland-Merrett G. Data, disease and diplomacy: GISAID's innovative contribution to global health. Global Challenges. 2017 Jan;1(1):33–46.
12. Starr TN, Greaney AJ, Hilton SK et al. Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding. Cell, 2021; 182: 1295–1310. Doi: [https://doi.org/10.1016/j.cell.2020.08.012](https://doi.org/10.1016/j.cell.2020.08.012)
13. Santos J, Passos G. The high infectivity of SARS-CoV-2 B.1.1.7 is associated with increased interaction force between Spike-ACE2 caused by the viral N501Y mutation. medRxiv, 2021; (published online Jan 1) (preprint). Doi: [https://doi.org/10.1101/2020.12.29.424708](https://doi.org/10.1101/2020.12.29.424708)
14. Leung K, Shum Marcus HH, Leung GM, Lam TTY, Wu JT. Early transmissibility assessment of the N501Y mutant strains of SARS-CoV-2 in the United Kingdom, October to November 2020. Euro Surveillance. 2021; 26(1): pii=2002106. Doi: [https://doi.org/10.2807/1560-7917.ES.2020.26.1.2002106](https://doi.org/10.2807/1560-7917.ES.2020.26.1.2002106)
15. Shen X, Tang H, McDanal C, et al. SARS-CoV-2 variant B.1.1.7 is susceptible to neutralizing antibodies elicited by ancestral Spike vaccines. bioRxiv, 2021; (published online Jan 29) (preprint). Doi: [https://doi.org/10.1101/2021.01.27.428516](https://doi.org/10.1101/2021.01.27.428516)
16. Wang Z, Schmidt F, Weisblum Y, et al. mRNA vaccine-elicited antibodies to SARS-CoV-2 and circulating variants. bioRxiv, 2021; (published online Jan 30) (preprint). Doi: [https://doi.org/10.1101/2021.01.15.426911](https://doi.org/10.1101/2021.01.15.426911)
17. Chen RE, Zhang X, Case JB et al. Resistance of SARS-CoV-2 variants to neutralization by monoclonal and serum-derived polyclonal antibodies. Nat Med, 2021; 27:717–726. Doi: [https://doi.org/10.1038/s41591-021-01294-w](https://doi.org/10.1038/s41591-021-01294-w)
18. Jangra S, Ye C, Rathnasinghe R, et al. The E484K mutation in the SARS-CoV-2 spike protein reduces but does not abolish neutralizing activity of human convalescent and post-vaccination sera. medRxiv, 2021; (published online Jan 29) (preprint). Doi: [https://doi.org/10.1101/2021.01.26.21250543](https://doi.org/10.1101/2021.01.26.21250543)
19. Garcia-Beltran W, Lam EC, Denis KSt, et al. Multiple SARS-CoV-2 variants escape neutralization by vaccine-induced humoral immunity. Cell, 2021;184:1–12. Doi: [https://doi.org/10.1016/j.cell.2021.03.013](https://doi.org/10.1016/j.cell.2021.03.013)
20. Luan B, Huynh T. Insights on SARS-CoV-2's Mutations for Evading Human Antibodies: Sacrifice and Survival. bioRxiv, 2021; (published online Feb 7) (preprint). Doi: [https://doi.org/10.1101/2021.02.06.430088](https://doi.org/10.1101/2021.02.06.430088)
21. Yin R, Guest JD, Taherzadeh G, et al. Structural and energetic profiling of SARS-CoV-2 antibody recognition and the impact of circulating variants. bioRxiv, 2021; (published online Mar 21) (preprint). Doi: [https://doi.org/10.1101/2021.03.21.436311](https://doi.org/10.1101/2021.03.21.436311)
22. Souza Santos K, Ramos Oliveira J, Machado RRG, et al. Immunodominant B cell epitope in SARS-CoV-2 RBD comprises a B.1.351 and P.1 mutation hotspot: implications for viral spread and antibody escape. medRxiv, 2021; (published online Mar 12) (preprint). Doi: [https://doi.org/10.1101/2021.03.11.21253399](https://doi.org/10.1101/2021.03.11.21253399)
23. Yang Q, Hughes TA, Kelkar A, et al. Inhibition of SARS-CoV-2 viral entry upon blocking N- and O-glycan elaboration. eLife, 2020; (published online Oct 26). Doi: [https://doi.org/10.7554/eLife.61552](https://doi.org/10.7554/eLife.61552)
24. Kuzmina A, Khalaila Y, Voloshin O, et al. SARS-CoV-2 spike variants exhibit differential infectivity and neutralization resistance to convalescent or post-vaccination sera. Cell Host & Microbe, 2021; 29(4):522–528.e2. Doi: [https://doi.org/10.1016/j.chom.2021.03.008](https://doi.org/10.1016/j.chom.2021.03.008)
25. McCallum M, Bassi J, De Marco A, et al. SARS-CoV-2 immune evasion by variant B.1.427/B.1.429. bioRxiv, 2021; (published online Apr 1) (preprint). Doi: [https://doi.org/10.1101/2021.03.31.437925](https://doi.org/10.1101/2021.03.31.437925)
26. Gan HH, Twaddle A, Marchand B, Gunsalus KC. Structural modeling of the SARS-CoV-2 Spike/human ACE2 complex interface can identify high-affinity variants associated with increased transmissibility. bioRxiv, 2021; (published online Mar 22) (preprint). Doi: [https://doi.org/10.1101/2021.03.22.436454](https://doi.org/10.1101/2021.03.22.436454)
27. Kidd M, Richter A, Best A et al. S-variant SARS-CoV-2 lineage B1.1.7 is associated with significantly higher viral loads in samples tested by ThermoFisher TaqPath RT-qPCR. The Journal of Infectious Diseases. 2021; (published online Feb 13). Doi: [https://doi.org/10.1093/infdis/jiab082](https://doi.org/10.1093/infdis/jiab082)