ABSTRACT
The ability to distinguish between SARS-CoV-2 variants of concern (VOCs) is of ongoing interest due to differences in transmissibility, response to vaccination, clinical prognosis, and therapy. Although detailed genetic characterization requires whole-genome sequencing (WGS), targeted nucleic acid amplification tests can serve a complementary role in clinical settings, as they are more rapid and accessible than sequencing in most laboratories.
We designed and analytically validated a two-reaction multiplex reverse transcription quantitative PCR (RT-qPCR) assay targeting spike protein mutations L452R, E484K, and N501Y in Reaction 1, and del69-70, K417N, and T478K in Reaction 2. This assay had 95-100% agreement with WGS in 502 upper respiratory swabs collected between April 26 and August 1, 2021, consisting of 43 Alpha, 2 Beta, 20 Gamma, 378 Delta, and 59 non-VOC infections. Validation in a separate group of 230 WGS-confirmed Omicron variant samples collected in December 2021 and January 2022 demonstrated 100% agreement.
This RT-qPCR-based approach can be implemented in clinical laboratories already performing SARS-CoV-2 nucleic acid amplification tests to assist in local epidemiological surveillance and clinical decision-making.
INTRODUCTION
Since the original strain of SARS-CoV-2 virus was first discovered in late 2019, numerous new variants have been identified, including variants of concern (VOCs) Alpha (B.1.1.7 and Q.*), Beta (B.1.351), Gamma (P.1 and sublineages), Delta (B.1.617.2 and AY.*) and Omicron (B.1.1.529 and BA.*). Importantly, these VOCs differ in their clinical prognosis, transmissibility, antibody susceptibility, and response to vaccination (1-21). Whole-genome sequencing (WGS) has played a critical role in identifying the emergence of these new variants (22-24), and millions of distinct sequences have been deposited into public repositories such as the Global Initiative on Sharing Avian Influenza Data consortium’s database (GISAID). However, WGS has a relatively long turnaround time, is labor-intensive, and requires instruments, bioinformatics support, and specially-trained staff that may not be widely available to many clinical laboratories. Therefore, the development of reverse transcription quantitative PCR (RT-qPCR) assays to detect and differentiate SARS-CoV-2 variants may be an important real-time complement to WGS epidemiologic surveillance, and may directly impact the clinical care of individual patients by informing selection of expensive and potentially difficult-to-source monoclonal antibody therapies (1, 6, 12-16, 19, 20, 25).
In this study, we report the design of a multiplex RT-qPCR assay that detects the del69-70, K417N, and T478K mutations in SARS-CoV-2 spike protein and targets the wild-type 69-70 sequence as an internal control. We further evaluate the performance of this assay in combination with our previously described RT-qPCR assay for the detection of L452R, E484K, and N501Y (26), and demonstrate the utility of this targeted mutational analysis to accurately distinguish among VOCs.
MATERIALS AND METHODS
Assay Design
The spike protein mutations associated with each variant that are interrogated by the RT-qPCR assays are summarized in Figure 1. In the first reaction (Reaction 1), we utilized our previously described RT-qPCR assay to detect L452R, E484K, and N501Y mutations in spike Receptor Binding Domain (RBD) (26). The present study describes the combination of this assay with a second, newly designed reaction (Reaction 2), which detects the deletion of amino acids 69-70 in the spike N-Terminal Domain (del69-70), as well as K417N and T478K mutations in the RBD. We use allele-specific RT-qPCR with probe sequences designed to maximize the difference in annealing temperature between mutant and wild-type sequences, allowing for differential binding and amplification. The primer/probe sequences for each mutation site are summarized in Table 1. Additional details are provided in the Supplemental Methods, ssDNA sequences for analytical experiments (Supplemental Table 1), guidance for interpretation and reporting (Supplemental Table 2), analytical validation data (Supplemental Table 3), and in-silico analysis of primer and probe sequences (Supplemental Figure 1).
Clinical Specimens
The samples included in the initial phase of this study were upper respiratory swab specimens collected from patients as part of routine clinical care from April 26, 2021 to August 1, 2021. Testing was performed at Stanford Clinical Virology Laboratory, which provides virologic testing for all Stanford-affiliated hospitals and outpatient centers in the San Francisco Bay Area. These initial SARS-CoV-2 nucleic acid amplification tests (NAATs) tests prior to genotyping were conducted according to manufacturer and emergency authorization instructions as previously described (26), and in the Supplemental Methods. All samples that tested positive for SARS-CoV-2 RNA were reflexed to genotyping. We then excluded samples that were initially tested by laboratory-based methods with cycle threshold values (Ct) ≥35 or relative light units (RLU) ≤1100. We included all available samples initially tested at or near the point of care as Ct data was not readily available for real-time specimen triage for these samples. We also excluded follow-up specimens to eliminate patient-level duplicates. Subsequent validation of this assay for Omicron variant detection was conducted using a convenience set of 230 Omicron variant samples with available WGS data collected between December 2, 2021 and January 5, 2022. This study was conducted with Stanford institutional review board approval (protocol 57519), and individual consent was waived.
Whole-Genome Sequencing
To validate the genotyping RT-qPCR reactions, we tested their performance against WGS in a subset of the samples in the initial April 26, 2021 to August 1, 2021 cohort with Ct <30. Samples with non-dominant variant typing by RT-qPCR were prioritized for sequencing, with the remaining isolates chosen randomly to fill a sequencing run. WGS was conducted as described previously, using a lab-developed pipeline consisting of long-range PCR, followed by fragmentation, then single-end 150-cycle sequencing using MiSeq reagent kit V3 (Illumina, San Diego, CA) (26). Genomes were assembled via a custom assembly and bioinformatics pipeline using NCBI NC_045512.2 as reference. Whole-genome sequences with ≥75% genome coverage to a depth of at least 10 reads were accepted for interpretation. Median number of aligned reads was 485,870 (interquartile range [IQR] 289,363-655,481), while median genome coverage to a depth of at least 10 reads was 99.3% (IQR 97.1-99.3%). Mutation calling required a depth of at least 12 reads with a minimum variant frequency of 20%. PANGO lineage assignment was performed using https://pangolin.cog-uk.io/ running pangolin version 3.1.17, while Nextclade Web v1.13.1 and auspice.us 0.8.0. were used to perform phylogenetic placement (2, 27, 28). Both lineage and clade assignments were performed on February 1, 2022. WGS data was deposited in GISAID (Supplemental Table 4).
Statistical Analysis
Positive percent agreement (PPA) and negative percent agreement (NPA) were reported with Clopper-Pearson score 95% binomial confidence intervals (CI) using WGS as the reference method. Analyses were conducted using the R statistical software package. This study was reported in accordance with Standards for the Reporting of Diagnostic Accuracy Studies (STARD) guidelines.
RESULTS
During the initial study period of April 26, 2021 to August 1, 2021, the Stanford Clinical Virology Laboratory received 102,158 specimens from 70,544 unique individuals. A total of 1,657 samples from unique individuals tested positive for SARS-CoV-2, of which 1,093 (66%) had genotyping RT-qPCR Reaction 1 and Reaction 2 performed, and 502 (30.3%) had successful WGS performed (Supplemental Figure 2). Of note, Reaction 1 was performed in near real-time, while Reaction 2 was performed retrospectively. Overall, this subset of sequenced samples, had patient and testing characteristics that closely resembled those of the larger cohorts (Supplemental Table 5).
The assay resulted as “Unable to Genotype” in 152 of 1,093 samples (14%) due to lack of amplification of any target in either or both reactions. Assay failure occurred predominantly in samples originally tested at or near the point of care (119/341, 35%), where all positive samples were triaged for genotyping without any filter. In contrast, assay failure occurred much less frequently in samples originally tested in the moderate-to-high complexity virology lab (33/752, 4%), where samples with lower viral loads were not triaged for genotyping.
For the combination of Reactions 1 and 2, the PPAs for del69-70, L452R, T478K, E484K, and N501Y were 100% (Table 2). Across all six loci, only K417N had a false negative, resulting in a PPA of 96% (27/28). In this sample, WGS showed a synonymous T to C mutation at position 1254 of the spike gene corresponding to amino acid position 418, changing the codon from ATT to ATC. This single base pair substitution likely decreased the annealing temperature, causing probe dropout and a false negative result.
The NPAs for del69-70, K417N, T478K, and N501Y were 100% (Table 2). L452R had an NPA of 95% (94/99) and E484K had an NPA of 99% (464/467). At the L452 locus, there were five samples positive for L452R mutation by RT-qPCR that were negative by WGS. Manual review of the WGS data showed that these were likely false negative WGS results due to insufficient (<12 reads) coverage at this codon. There were 3-9 reads containing the L452R mutation identified in the WGS primary data in each of these five samples. These five samples were all in the Delta lineage based on mutations found at other positions by sequencing.
For the E484K target, there were three samples that tested positive for the E to K mutation but in fact had a E484Q mutation determined by WGS. In both the E to K mutation (GAA to AAA) as well as the E to Q mutation (GAA to CAA), there was a single base substitution at the first position of the codon resulting in nonspecific probe binding. These three samples had a distinct blunted amplification curve with high Ct values associated with E484Q, as previously described (29).
Of note, there was a subset of variant AY.2, involving four specimens in our cohort, that had a V70F mutation causing both del69-70 and wt69-70 probes not to bind. However, because this variant would have T478K and K417N detected, the wt69-70 signal was not needed as an amplification control. This scenario has been reflected in the clinical interpretation table (Supplemental Table 2).
SARS-CoV-2 positive specimens collected starting December 2, 2021 began to show an unusual combination of mutations: presence of K417N and del69-70 only in Reaction 2, with all targets including internal control N501 not detected in Reaction 1. Based on in-silico analysis, we determined that these cases likely represented Omicron variant. While most Omicron variant strains possess del69-70, K417N, T478K, and N501Y mutations, they also have mutations at A67V, S477N, and Q498R, which would be predicted to interfere with binding of the del69-70/wt69-70, T478K, and N501Y/N501 probes, respectively. The del69-70 probe likely was able to retain some degree of binding due to the wider melting temperature differential of a 6-nucleotide deletion compared to a point mutation. As such, we validated this assay for Omicron detection using a set of 230 SARS-CoV-2 positive samples confirmed to be Omicron by WGS. We found that the unique pattern of K417N and del69-70 in Reaction 2, along with failure to amplify any target including internal control in Reaction 1, was present in 230/230 (100%, 95% CI 98-100%) Omicron samples tested. This pattern was not seen in any of the 1,093 non-Omicron samples previously genotyped.
We next predicted the WHO variant designation of samples using RT-qPCR and correlated them with the PANGO lineage assignments based on WGS data (Table 3). Mapping the genotyping results of the cohort based on RT-qPCR mutation analysis onto the Nextclade phylogenetic tree demonstrated close correlation with their WHO variant designations (Figure 2). Among the 732 clinical samples that were tested by both RT-qPCR and WGS, 43 samples (5.9%) were Alpha (B.1.1.7 or Q.3), 2 samples (0.3%) were Beta (B.1.351), 20 samples (2.7%) were Gamma (P.1 and sublineages), 378 samples (51.6%) were Delta (B.1.617.2 or AY.*), and 230 samples (31.4%) were Omicron (B.1.1.529 or BA.*). There were no RT-qPCR false negatives in assigning samples to these lineages. In addition, there were 59 samples (8.1%) tested by WGS that did not correspond to a WHO VOC as of February 2, 2022. Within this subset, there were 4 samples that were erroneously assigned as Gamma and 1 that was assigned as Beta by RT-qPCR. By WGS, these samples were variant of interest (VOI) Mu (B.1.621 or BB.2). This variant shares mutations E484K and N501Y with both the Beta and Gamma variants. A subset of Mu also includes the K417N mutation which is seen in the Beta variant. Thus, our PCR assay could not distinguish VOI Mu from VOCs Beta and Gamma. Our interpretation table included in the supplementary information reflects this limitation (Supplemental Table 2). The remaining 54 samples did not contain mutation patterns associated with VOCs.
DISCUSSION
The ability to distinguish between SARS-CoV-2 VOCs is important for epidemiologic surveillance, and in certain circumstances, the care of individual COVID-19 patients. In this study, we describe a two-reaction, multiplex RT-qPCR genotyping approach that examines the spike mutations del69-70, K417N, L452R, T478K, E484K, and N501Y. This targeted mutational analysis can be used to differentiate between the WHO VOCs Alpha (B.1.1.7 and Q.*), Beta (B.1.351), Gamma (P.1 and sublineages), Delta (B.1.617.2 and AY.*), and Omicron (B.1.1.519 and BA.*), as well as identify samples which cannot be categorized into a known VOC or VOI. Because the first part of this approach, Reaction 1, has been previously described, this current study focuses on Reaction 2 and the integrated results of the two-reaction test (26). Overall, these reactions showed high concordance with WGS, demonstrating over 95% PPA and NPA for all targeted mutations.
Several groups have previously described similar approaches to SARS-CoV-2 variant determination by RT-qPCR and digital droplet RT-PCR, particularly for the spike del69-70, E484K, and N501Y positions (30-37). Some of these assays included additional mutation sites that were not in our study, such as spike del144 or ORF1a Δ3675–3677 (30, 36). These earlier assays, published prior to the rise of Delta, primarily targeted VOCs Alpha, Beta, and Gamma. This was then followed by a surge of reports on the detection of the Delta variant. Garson et al. utilized double-mismatch allele-specific RT-PCR at L452R and T478K to differentiate Delta variant from other VOCs in 42 UK patient samples (38). Aoki et al. described an approach that combines nested PCR along with high-resolution melting analysis at those same mutations, which was validated in a small Japanese patient cohort (39). Barua et al. used a slightly different approach, taking advantage of the difference in melting temperature of a probe targeted to Delta mutation in spike T478K compared to other variants for a Delta-specific RT-FRET-PCR assay (40). Another defining feature of VOC Delta is spike del156-157, which was the target of a Delta variant PCR test developed by Hamill et al. (41). To our knowledge, the two-reaction multiplex RT-qPCR approach outlined in this study examining six different mutation sites is the most comprehensive variant genotyping test described that can identify Alpha, Beta, Gamma, Delta, and Omicron variants.
Multiplex RT-qPCR SARS-CoV-2 genotyping takes advantage of a commonly-used molecular technique that can be implemented by laboratories using existing equipment, materials, and personnel. Because this assay is more accessible and has a more rapid turnaround time than WGS, we envision it serving a complimentary role to sequencing. The genotyping RT-qPCR can provide more detailed and up-to-date epidemiological information by increasing the sample size of categorized variants in each geographic region, and can be essential in tracking local outbreaks in areas without direct access to WGS. For individual patients, the turnaround time of several hours also allows it to directly impact clinical care. For example, VOCs show differential susceptibility to monoclonal antibody treatments, and variant reporting could include this information (Supplemental Table 2) (1). Furthermore, current ongoing trials for small molecule drugs and other treatments may yield more information about variant-specific treatment strategies.
Importantly, RT-qPCR genotyping can help prioritize samples for sequencing. Although sequencing is needed for identification of novel variants and characterization of viral evolution, pre-screening by RT-qPCR can enrich for samples with atypical mutation patterns, lead to more efficient use of sequencing resources, and potentially more rapid identification of new variants.
This RT-qPCR approach has several limitations as evidenced by its assay failure rate of 14% across all tested samples in our initial cohort. Because multiplex RT-qPCRs involve a mixture of multiple sets of primers and probes, they are inherently less sensitive than single-target assays. For samples with RNA concentrations near the lower limits of detection, freeze-thaw cycles could impact RNA stability, and may not yield consistent results due to stochastic variation. This issue could be alleviated by implementing a Ct/RLU filter to only genotype samples most likely to yield interpretable results. Within our 1,093 sample cohort, the lower assay failure rate in samples tested in our clinical virology laboratory (4%) compared to near-care settings (35%) is likely attributable to genotyping only specimens with higher viral RNA levels. Note, however, that even with such filtering, mutation analysis by RT-qPCR remains more sensitive than WGS. The other limitation to this approach is the rapidly changing variant landscape which may render such an assay obsolete in the matter of weeks. However, the inclusion of multiple targets in key residues that influence viral fitness helps guard against this possibility, as evidenced by our ability to detect the emergence of Omicron variant in our population. Still, flexibility and vigilance are required to re-design and re-validate these types of assays as novel variants emerge.
In summary, we developed and validated a two-reaction multiplex RT-qPCR genotyping strategy that interrogates six clinically relevant mutations within the SARS-CoV-2 spike: del69-70, K417N, L452R, T478K, E484K, and N501Y. This approach allows for identification of WHO VOCs Alpha, Beta, Gamma, Delta and Omicron with excellent concordance to WGS. Overall, this method complements WGS, and is suitable for clinical decision-making, near real-time variant surveillance, and the triage of samples for sequencing.
Data Availability
All data produced in the present study are available upon reasonable request to the authors
ACKNOWLEDGEMENTS
We would like to thank the Stanford Clinical Virology Laboratory Staff for their excellence and perseverance in caring for our community while facing the unique challenges of this pandemic. We additionally are grateful to the Stanford Protein and Nucleic Acid Facility for their synthesis of reagents on short notice.