ABSTRACT
Background To limit viral transmission, COVID-19 testing strategies must evolve as new SARS-CoV-2 variants (and new respiratory viruses) emerge to ensure that the specimen types and test analytical sensitivities being used will reliably detect individuals during the pre-infectious and infectious periods. Our accompanying work demonstrated that there are extreme differences in viral loads among paired saliva (SA), anterior-nares swab (ANS) and oropharyngeal swab (OPS) specimens collected from the same person and timepoint. We hypothesized that these extreme differences may prevent low-analytical-sensitivity assays (such as antigen rapid diagnostic tests, Ag-RDTs) performed on a single specimen type from reliably detecting pre-infectious and infectious individuals.
Methods We conducted a longitudinal COVID-19 household-transmission study in which 228 participants collected SA, ANS, and OPS specimens for viral-load quantification by RT-qPCR, and performed an ANS Ag-RDT (Quidel QuickVue At-Home OTC COVID-19 Test) daily. We evaluated the performance of the Ag-RDT (n=2215 tests) to detect infected individuals (positive results in any specimen type by RT-qPCR) and individuals with presumed infectious viral loads (at or above thresholds of 10^4, 10^5, 10^6, or 10^7 copies/mL).
Results Overall, the daily Ag-RDT detected 44% (358/811) of timepoints from infected individuals. From 17 participants who enrolled early in the course of infection, we found that daily Ag-RDT performance was higher at timepoints when symptoms were reported, but symptoms only weakly correlated with SARS-CoV-2 viral loads, so ANS Ag-RDT clinical sensitivity remained below 50%. The three specimen types exhibited asynchronous presumably-infectious periods (regardless of the infectious viral-load threshold chosen) and the rise in ANS viral loads was delayed relative to SA or OPS for nearly all individuals, which resulted in the daily ANS Ag-RDT detecting only 3% of timepoints in the pre-infectious period and 63% in the infectious period. We evaluated a computationally-contrived combined AN–OP swab based on viral loads from ANS and OPS specimens collected at the same timepoint; when tested with similar analytical sensitivity as the Ag-RDT, this combined swab was predicted to have significantly better performance, detecting up to 82% of infectious individuals.
Conclusion Daily ANS rapid antigen testing missed virtually all pre-infectious individuals, and more than one third of presumed infectious individuals, due to the low analytical sensitivity of the assay, a delayed rise in ANS viral loads, and asynchronous infectious viral loads in SA or OPS. When high-analytical-sensitivity assays are not available and low-analytical-sensitivity tests such as Ag-RDTs must be used for SARS-CoV-2 detection, an AN–OP combination swab is predicted to be most effective for detection of pre-infectious and infectious individuals. More generally, low-analytical-sensitivity tests are likely to perform more robustly using oral-nasal combination specimen types to detect new SARS-CoV-2 variants and emergent upper respiratory viruses.
INTRODUCTION
Earliest detection of SARS-CoV-2 infections is critical for reducing transmission, minimizing spread of new variants, and initiating treatments sooner for better patient outcomes. Antigen rapid diagnostic tests (Ag-RDTs) with nasal swabs are increasingly used for SARS-CoV-2 screening and diagnosis globally.1-4 In the U.S., media and public-health figures advocate for the use of Ag-RDTs and a large government campaign5 is providing 1 billion of these tests free for at-home use. Indeed, Ag-RDTs are a powerful tool given the low cost (compared with molecular tests), speed, and portability—all of which improve accessibility in remote or lower-resourced settings and at-home use.6 However, Ag-RDTs and some rapid molecular tests have lower analytical sensitivity than most gold-standard reverse-transcription quantitative PCR (RT-qPCR) tests7 and require high viral loads to reliably yield positive results.8,9 Further, Ag-RDTs are frequently used in unintended ways. For example, although many Ag-RDTs are not authorized for asymptomatic use and/or have poor clinical sensitivity in asymptomatic populations9,10 they continue to be used widely for test-to-enter and serial-screening purposes.
One view of rapid antigen testing is that these tests (and other diagnostics with low analytical sensitivity) can still prevent or mitigate SARS-CoV-2 outbreaks if they are used frequently.10,11 This view is based on the assumption that viral loads rise quickly from infectious levels to those detectable by low-analytical-sensitivity assays, making high-frequency rapid antigen testing with immediate results more effective than a high-analytical-sensitivity test with delayed results. However, the data do not support this view; instead, numerous longitudinal assessments of viral loads counter this assumption, showing that several days can pass between when viral loads reach potentially infectious levels and when they reach the limits of detection (LODs) of low-analytical-sensitivity assays.12-17 Several studies have recovered replication-competent (infectious) virus from clinical specimens with viral loads between 10^4 and 10^7 copies/mL.11-20
In our recent large household transmission study with frequent (daily) sampling of saliva (SA), anterior-nares swabs (ANS) and oropharyngeal swabs (OPS),17 two major findings emerged that have implications for the ability of Ag-RDTs with nasal swabs to detect individuals infected with the Omicron variant, particularly in the earliest days of infection. First, we observed a delay in the rise of ANS viral loads relative to those in specimens from the oral cavity; this finding is consistent with previous reports by us21 and others22,23 for pre-Omicron SARS-CoV-2 variants. Second, we found that viral loads differed significantly (by more than 9 orders of magnitude) among different specimen types within the same person at the same timepoint. Individuals often had high and presumably infectious viral loads in one specimen type (e.g., OPS), yet had undetectable or very low viral loads in another type (e.g., ANS).17
Here, we tested whether these observed differences in viral loads in different specimen types and delays in the rise in ANS viral loads would result in poor performance of an ANS Ag-RDT to detect SARS-CoV-2 Omicron infections in the presumed pre-infectious and infectious periods.
Investigations of the performance of Ag-RDTs for detecting the pre-infectious and infectious periods are challenging to undertake because assessing for the presence of replication-competent virus in clinical specimens is complicated.24,25 Viral culture is difficult, costly, and restricted to specialized laboratories, so investigations of infectious virus are rarely done. Instead, high viral loads (above 10^4 to 10^7 copies/mL) can be used as a surrogate to infer infectiousness.11-20 Although several studies have evaluated the performance of Ag-RDTs, only a few studies have investigated the clinical sensitivity of Ag-RDTs relative to periods of known infectiousness26-28, and none considered infectious virus in specimen types other than the one tested by the Ag-RDT. Further, all participants in these research studies had already received a positive COVID-19 test result prior to beginning sampling, and most studies used a nasal-swab test as a determiner of initial positivity, which inherently selects individuals later in the infection with detectable viral loads in the nose.
In this investigation, we performed a community-based study to evaluate the ability of a daily ANS Ag-RDT to detect the pre-infectious and infectious periods of SARS-CoV-2 infection. Participants collected paired saliva, anterior-nares swab, and oropharyngeal swab specimens for SARS-CoV-2 testing and viral-load quantification by a high-analytical-sensitivity RT-qPCR test, as described in our related paper.24 At each daily timepoint, participants also performed an at-home Ag-RDT (the Quidel QuickVue At-Home OTC COVID-19 Test).
METHODS
Study Participants
Complete details of the participants and study inclusion/exclusion criteria can be found in the accompanying paper.24 Briefly, this case-ascertained study was conducted in the greater Los Angeles County area between November 2021 and March 2022 under Caltech IRB #20-1026. All participants provided written informed consent or verbal assent with written parental permissions (for minors). Children ages 8-17 years old additionally provided written assent.
We enrolled 228 participants, of whom 90 were determined by RT-qPCR to be infected with SARS-CoV-2 during enrollment (Fig 1, Fig 2). For analyses oriented to early infection (Fig 3, 4, 5), we analyzed data from 17 participants who were initially negative in at least one test (either RT-qPCR or Ag-RDT) upon enrollment (Fig 1A, Table S1).
Sample Collection, RT-qPCR Testing, and Variant Sequencing
Complete details of sampling can be found in our accompanying paper.24 Briefly, each day, participants completed an online symptom survey, then self-collected saliva, then anterior-nares swab, then posterior oropharyngeal (throat) swab specimens for RT-qPCR testing. Extraction and RT-qPCR were performed at Pangea Laboratories using the FDA-authorized Quick SARS-CoV-2 RT-qPCR Kit.29 This assay has a reported LOD of 250 copies/mL of sample, which we also verified prior to study initiation.24 Details of the quantification of viral load were described previously.24
Viral sequencing and variant determination were also performed at Pangea; full methods were described previously.24 Extraction, RT-qPCR, and sequencing operators and supervisors at Pangea Laboratories were blinded to which participant a sample originated from, as well as the infection status and antigen test results of all participants.
Viral loads, RNase P cycle threshold (Ct) values, and demographic information for 14 of these 17 participants are also reported in a companion manuscript.24
Antigen Testing for SARS-CoV-2
Immediately after packaging specimens to be delivered to Pangea for RT-qPCR analysis, participants followed manufacturer’s instructions to take an at-home Ag-RDT, the FDA-authorized Quidel QuickVue At-Home OTC COVID-19 Test,30-32 which uses a self-collected anterior-nares swab and has a reported LOD of 1.91×10^4 TCID50/mL. This test is authorized for use with symptomatic persons, or asymptomatic persons if tested twice in a 24-48-hour period. This test was selected because it is authorized and in use globally, and its performance has been the subject of several cross-sectional evaluations.2,33-35
Participants interpreted and reported their own antigen test results (positive, negative, or invalid), and photographed their test strips immediately. In the event of an invalid result, study coordinators called or text-messaged participants to request they immediately take an additional test; invalid results were replaced with subsequent valid results, when applicable. Participants recorded their test results and uploaded photos of the test strips to a secure REDCap server immediately after testing. All photographs were inspected by at least two study coordinators blinded to RT-qPCR results. Results as reported by the participants were analyzed and reported here. A discussion of the discrepancies between participant- and study-coordinator-interpreted test results (56 of 2,153 Ag-RDT results) can be found in the Supplemental Information.
The antigen test manufacturer reports that the Quidel QuickVue At-Home OTC COVID-19 Test has an LOD with >95% positivity at 1.91×10^4 TCID50/mL of commercial heat-inactivated SARS-CoV-2 particles.31 Conversion of TCID50/mL to viral load in copies/mL is not provided in the FDA documentation for this test, and the manufacturer was unable to provide this value, or a lot number or certificate of analysis for the heat-inactivated particles. Thus, we were unable to convert this LOD value from TCID50/mL to copies/mL.
Analysis of Overall Antigen Test Performance
The 228 participants enrolled in the study collected specimens at 2,215 timepoints. A composite RT-qPCR result was generated for each of the timepoints: the participant was considered infected if any of the specimen types yielded a positive result by RT-qPCR, uninfected if all specimen types resulted negative by RT-qPCR, and inconclusive if at least one specimen type resulted inconclusive while the other specimen types resulted negative by RT-qPCR. A total of 2,188 timepoints had valid, composite RT-qPCR results, 847 of which were considered infected. Of these 2,188 timepoints, 63 did not have associated Ag-RDT results reported by the participant. Three positive Ag-RDT results (from different participants) originated from a specific lot (#152000) of antigen test strips that consistently produced pink false-positive test lines even when only blank test buffer was applied during our in-house laboratory testing (see Supplementary Information). These three results were excluded. Of note, Ag-RDT strip lot numbers were not collected for all timepoints; thus, some additional false-positive Ag-RDT results may have originated from this lot but not been excluded from analyses. An additional seven timepoints had invalid Ag-RDT results. A total of 2,118 timepoints had valid, paired ANS Ag-RDT and composite RT-qPCR results (Fig 2A-C,H, Table S2).
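The composite rule above can be written down compactly. The following is a minimal sketch, not the study's code: the function name and the string labels are hypothetical, and the fallback to "inconclusive" for cases the text does not address (for example, a missing specimen) is an assumption.

```python
from typing import Iterable


def composite_result(calls: Iterable[str]) -> str:
    """Combine per-specimen RT-qPCR calls (SA, ANS, OPS) for one timepoint.

    Each call is assumed to be 'positive', 'negative', or 'inconclusive'.
    """
    calls = list(calls)
    # Any positive specimen type makes the timepoint "infected".
    if any(c == "positive" for c in calls):
        return "infected"
    # All specimen types negative makes the timepoint "uninfected".
    if calls and all(c == "negative" for c in calls):
        return "uninfected"
    # Otherwise (e.g., one inconclusive, the rest negative): inconclusive.
    # Treating any uncovered case as inconclusive is an assumption.
    return "inconclusive"


if __name__ == "__main__":
    print(composite_result(["negative", "positive", "negative"]))
    print(composite_result(["negative", "inconclusive", "negative"]))
```

Note that a positive result takes precedence over an inconclusive one, which matches the "any specimen type" wording of the definition.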
Statistical Analyses
A continuous infectious period was defined for each participant (Fig 3, Fig 5) as the first specimen collection timepoint where at least one specimen type had a viral load above an infectious viral-load threshold, until the last timepoint where at least one specimen type had a viral load above the infectious viral-load threshold. Analyses were performed separately for infectious viral-load thresholds of 10^4, 10^5, 10^6, and 10^7 copies/mL (thresholds based on literature11-20). For certain analyses (Fig 4, Fig 5), the continuous infectious periods were defined for each participant either using all three specimen types or a single specimen type. Statistical significance between lengths of continuous infectious periods calculated using all specimen types or just ANS specimens was calculated using an upper-tailed related-sample t-test (Fig 4E). A two-stage Benjamini–Hochberg procedure was used to correct P-values for each comparison using a different infectious viral-load threshold.
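As an illustration of this definition, the sketch below finds the first and last timepoints at which any included specimen type reaches the threshold. It is a hypothetical helper, not the authors' implementation; the data layout (one list of daily viral loads per specimen type, `None` for undetected or missing) and the use of "at or above" for the comparison are assumptions, and the example values are invented.

```python
from typing import Dict, List, Optional, Tuple


def continuous_infectious_period(
    viral_loads: Dict[str, List[Optional[float]]],
    threshold: float,
) -> Optional[Tuple[int, int]]:
    """Return (first, last) timepoint indices with any specimen >= threshold.

    Returns None if no specimen type ever reaches the threshold.
    """
    n = max(len(series) for series in viral_loads.values())
    above = [
        any(
            t < len(series) and series[t] is not None and series[t] >= threshold
            for series in viral_loads.values()
        )
        for t in range(n)
    ]
    if not any(above):
        return None
    first = above.index(True)
    last = n - 1 - above[::-1].index(True)
    return first, last


# Invented daily viral loads (copies/mL) for one participant.
example = {
    "SA": [None, 2e4, 5e5, 1e3, None, None],
    "ANS": [None, None, 1e3, 3e6, 2e4, 5e2],
    "OPS": [5e3, 8e5, 1e6, 1e4, None, None],
}

if __name__ == "__main__":
    print(continuous_infectious_period(example, 1e4))
    print(continuous_infectious_period({"ANS": example["ANS"]}, 1e4))
```

Passing only the ANS series gives the single-specimen-type version of the period, which in this invented example starts two days later than the all-specimen version.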
Positive and negative percent agreement (Fig 2A-C) were calculated as the number of specimens with observed concordant results over the total number of specimens with positive or negative results, respectively, by a reference test (single specimen type RT-qPCR). Clinical sensitivity (Fig 2H, 5, 6) was calculated as the number of specimens with either observed or predicted positive results (based on viral loads above a specified assay LOD) over the total number of infected or infectious timepoints included. Individuals were considered infected at any timepoint where at least one specimen type collected had a positive result by RT-qPCR. Individuals were considered infectious based on criteria described above. Confidence intervals were calculated as described in the Clinical and Laboratory Standards Institute EP12-A2 User Protocol for Evaluation of Qualitative Test Performance.36 Participants collected ANS, SA, and OPS specimens for viral-load quantification and performed a rapid ANS Ag-RDT at each timepoint; these measurements were considered paired. Differences in the inferred or observed clinical sensitivity from these paired data were tested for statistical significance (Fig 6I) using the McNemar Exact Test37 performed using the statsmodels package in Python v3.8.8, with a Benjamini–Yekutieli procedure to correct P-values.
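To make the paired comparison concrete, the sketch below computes a two-sided exact McNemar P-value from the two discordant-pair counts and applies a Benjamini–Yekutieli adjustment, using only the Python standard library. It is not the study's code (which used statsmodels); the function names are hypothetical, and the choice of a two-sided test is an assumption. In statsmodels, `statsmodels.stats.contingency_tables.mcnemar` (with `exact=True`) and `statsmodels.stats.multitest.multipletests` (with `method="fdr_by"`) provide comparable functionality.

```python
from math import comb
from typing import List


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar P-value.

    b: pairs positive by test A only; c: pairs positive by test B only.
    Under the null, the discordant pairs split as Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def benjamini_yekutieli(pvalues: List[float]) -> List[float]:
    """Benjamini-Yekutieli adjusted P-values, returned in the input order."""
    m = len(pvalues)
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Step from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Invented counts: 10 timepoints detected only by one test, 0 only by the other.
    print(mcnemar_exact(0, 10))
    print(benjamini_yekutieli([0.01, 0.04, 0.03]))
```

Only the discordant pairs enter the test, which is why the measurements need to be paired by timepoint as described above.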
We also predicted the performance of computationally-contrived combination specimen types (SA-ANS, SA-OPS, AN–OP combination swab, or SA-ANS-OPS). These combination specimen types were defined as having the highest viral load of any specimen type included in the combination collected by a participant at a timepoint. The performance of these combination specimen types was inferred and compared as described above.
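A minimal sketch of how such a computationally-contrived combination specimen could be derived at a single timepoint is shown below. The function names, the dictionary layout, the treatment of undetected specimens as 0 copies/mL, and the example viral loads are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional


def combined_viral_load(
    loads: Dict[str, Optional[float]], members: Iterable[str]
) -> float:
    """Highest viral load among the member specimen types at one timepoint.

    Undetected or missing specimens (None) are treated as 0 copies/mL.
    """
    return max((loads.get(m) or 0.0) for m in members)


def inferred_positive(viral_load: float, lod: float) -> bool:
    """Predict a positive result when the viral load is at or above the LOD."""
    return viral_load >= lod


# Invented timepoint: high OPS load, undetectable ANS.
timepoint = {"SA": 2e3, "ANS": None, "OPS": 4e6}

if __name__ == "__main__":
    an_op = combined_viral_load(timepoint, ["ANS", "OPS"])
    print(an_op, inferred_positive(an_op, lod=1e6))
    ans_only = combined_viral_load(timepoint, ["ANS"])
    print(ans_only, inferred_positive(ans_only, lod=1e6))
```

Taking the maximum is an idealization: it assumes that combining specimens on one swab neither dilutes nor degrades the higher-load specimen.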
RESULTS
Antigen Rapid Diagnostic Test (Ag-RDT) Exhibits <50% Observed Clinical Sensitivity to Detect Infected Individuals Across the Course of the Infection
From 2,218 timepoints with valid, paired ANS RT-qPCR and ANS Ag-RDT results, we observed a positive percent agreement (PPA) of 47% (347 Ag-RDT-positive results of 731 ANS RT-qPCR-positive results; Fig 2A). This PPA is lower than the 83.5% (95% CI 74.9-89.6%) reported by the manufacturer, from a study of 91 individuals with ANS specimens positive by a comparator RT-PCR assay.31 Although positive RT-qPCR and negative Ag-RDT results were expected for timepoints with low ANS viral loads, the Ag-RDT resulted negative for more than half of RT-qPCR positive ANS specimens with viral loads above 10^4 copies/mL (Fig 2D). Relatedly, we observed that when the 680 ANS specimens with quantifiable viral loads were ordered by viral load, 95% PPA with Ag-RDT was observed with viral loads above 7.6×10^6 copies/mL (Fig 2G), suggesting this value as an estimate for the approximate LOD of the assay. The reported LOD of the Ag-RDT assay is 1.91×10^4 TCID50/mL;31 without lot information for the reported LOD validation, conversion to copies/mL is not possible. However, an approximate 1000-fold difference between RNA viral load and viral titer is reasonably expected.38
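One way to read the LOD estimate above is as the lowest viral-load cutoff at which the cumulative PPA, computed over all specimens at or above that cutoff, still meets the target. The sketch below implements that reading; the exact procedure used for Fig 2G is not stated in the text, so this is an assumption, and the function name and toy data are hypothetical.

```python
from typing import List, Optional, Tuple


def lowest_cutoff_with_ppa(
    specimens: List[Tuple[float, bool]], target: float = 0.95
) -> Optional[float]:
    """Lowest viral load at or above which cumulative Ag-RDT PPA >= target.

    specimens: (viral_load, ag_rdt_positive) pairs for RT-qPCR-positive
    specimens. Ties in viral load are not treated specially.
    """
    ordered = sorted(specimens, key=lambda s: s[0], reverse=True)
    best = None
    positives = 0
    for n, (load, ag_positive) in enumerate(ordered, start=1):
        positives += int(ag_positive)
        if positives / n >= target:
            best = load  # PPA among the n highest-load specimens meets target
    return best


# Invented specimens: Ag-RDT positive only at high viral loads.
toy = [(1e8, True), (1e7, True), (1e6, True), (1e5, False), (1e4, False)]

if __name__ == "__main__":
    print(lowest_cutoff_with_ppa(toy))
    print(lowest_cutoff_with_ppa(toy, target=0.75))
```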
We observed a negative percent agreement (NPA) of 97% (1,343 antigen-negative results of 1,385 ANS RT-qPCR-negative results). This is slightly lower than the NPA of 99.2% (95% CI 97.2-99.8%) observed by the Ag-RDT manufacturer.31 This decrease may be due to inclusion of an antigen test lot we found to consistently yield false-positive results (see Supplementary Information).
There is an important distinction between PPA and clinical sensitivity. We calculated the PPA to compare positive results by the Ag-RDT versus positive results by a reference test (RT-qPCR performed on a single specimen type). In contrast, we calculated clinical sensitivity of the Ag-RDT to detect an infected person (infected defined as RT-qPCR positive result in any specimen type at a given timepoint). Importantly, we find that PPA against ANS RT-qPCR results was significantly higher than an overall observed clinical sensitivity of 44% to detect infected individuals (Fig 2H, upper-tailed McNemar Exact Test, P<0.001). This demonstrates that comparisons of single specimen types can overestimate the clinical sensitivity of a test (Fig 2A-C). Further, more than half of timepoints with potentially infectious viral loads (>10^4 copies/mL in any specimen type) were missed by the ANS Ag-RDT (Fig 2E,F,I).
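The difference between the two metrics is only in the denominator, which the following sketch makes explicit using the counts reported in this paper (347/731 for PPA, 1,343/1,385 for NPA, and 358/811 for clinical sensitivity). The confidence intervals use the Wilson score method; that this matches the EP12-A2 procedure used in the study is an assumption, and the function name is hypothetical.

```python
from math import sqrt
from typing import Tuple


def wilson_interval(
    successes: int, total: int, z: float = 1.959964
) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.959964 gives ~95%)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half, center + half


# PPA: Ag-RDT positives among ANS RT-qPCR positives (single specimen type).
ppa = 347 / 731
# NPA: Ag-RDT negatives among ANS RT-qPCR negatives.
npa = 1343 / 1385
# Clinical sensitivity: Ag-RDT positives among timepoints that were
# RT-qPCR positive in ANY specimen type (composite reference).
sensitivity = 358 / 811

if __name__ == "__main__":
    for name, k, n in [("PPA", 347, 731), ("NPA", 1343, 1385),
                       ("Sensitivity", 358, 811)]:
        lo, hi = wilson_interval(k, n)
        print(f"{name}: {k / n:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

Because the composite reference counts timepoints that are positive only in SA or OPS, its denominator is larger, and the same Ag-RDT results yield a lower sensitivity than PPA.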
Of the 90 infected participants (Fig 1), 71 (79%) had a positive Ag-RDT result at least once during enrollment. We next sought to investigate how daily Ag-RDT results aligned with detection of infected and presumably infectious individuals longitudinally from the early stage of acute SARS-CoV-2 infection.
Analysis of Longitudinal Viral-Load Timecourses and Antigen Test Performance
Of the 90 SARS-CoV-2 infected participants in this study, we identified 17 participants who enrolled and began specimen collection early in the course of the infection (negative in at least one of the four tests performed [SA, ANS, or OPS RT-qPCR, or ANS Ag-RDT] in their first set of samples upon enrollment, followed by quantifiable virus in all three specimen types by RT-qPCR, Fig 1). We compiled each participant’s daily viral-load measurements and human RNase P Ct values for each specimen type (SA, ANS, OPS),24 self-reported symptoms, and ANS rapid antigen test results (Fig 3). We additionally plotted the presumed infectiousness of an individual at each timepoint based on viral loads in any specimen type exceeding the noted thresholds of 10^4 to 10^7 copies/mL (Fig 3).
All participants reported at least one COVID-19-like symptom at some point during their infection with symptom onset within 3 days of first detectable viral load, as determined by RT-qPCR. In this cohort, the sensitivity of antigen testing when the participant was symptomatic was significantly higher than when the participant was asymptomatic (Fig S1A), but in both groups the observed clinical sensitivity was low (<50%). Surprisingly, several participants reported zero symptoms on the day of their peak viral loads (Fig 3C, 3L, 3N), all of which were >10^8 copies/mL. Overall, we found only a weak relationship between viral load and symptoms (Fig S1B-E). Importantly, individuals had infectious viral loads in at least 30% of timepoints at which no symptoms were reported (Fig S1F).
All but two participants (Fig 3D and 3F) reached presumed infectious viral loads at least 1 day prior to daily rapid antigen testing yielding positive results. In six participants, the delay between first infectious specimen and antigen positivity was 1-2 days; in five participants, the delay was 3 days; for one participant (Fig 3C) the delay was 5 days and for another participant (Fig 3A) the delay was 8 days. Further, two participants (Fig 3B and 3E) had presumably infectious viral loads for several consecutive days, but never received a positive Ag-RDT result during enrollment. The first participant (Fig 3B) had high viral loads in OPS specimens for 8 days while ANS specimens remained at low levels (rising just above 10^4 copies/mL on only one day). Even very high-analytical-sensitivity RT-qPCR assays using ANS would not have reliably detected this participant’s infection because ANS specimens had low viral loads with inconsistently positive results throughout enrollment. In the second participant (Fig 3E) nasal viral loads exceeded 10^6 copies/mL on three days, but never yielded a positive Ag-RDT result, likely because these viral loads were too close to the Ag-RDT’s estimated LOD for reliable detection.
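The delays described above can be expressed as the gap between the first presumed-infectious timepoint and the first positive Ag-RDT. The sketch below is a hypothetical helper that assumes one timepoint per day with no gaps; the example series are invented.

```python
from typing import List, Optional


def detection_delay(
    infectious: List[bool], ag_positive: List[bool]
) -> Optional[int]:
    """Days from first presumed-infectious timepoint to first positive Ag-RDT.

    Returns None if the participant was never presumed infectious or never
    had a positive Ag-RDT. A negative value means the Ag-RDT turned
    positive before any specimen reached the infectious threshold.
    """
    if True not in infectious or True not in ag_positive:
        return None
    return ag_positive.index(True) - infectious.index(True)


if __name__ == "__main__":
    # Invented participant: infectious from day 1, Ag-RDT positive on day 4.
    print(detection_delay(
        [False, True, True, True, True],
        [False, False, False, False, True],
    ))
```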
In another participant (Fig 3D), we observed consistent false-positive Ag-RDT results, even when the ANS specimen was negative by RT-qPCR and by an iHealth rapid antigen test taken outside of the study on the final day of sampling. This participant continued to test positive by Ag-RDT even >30 days after his first detectable viral load, and when viral load was undetectable by RT-qPCR in all three specimen types. These antigen test strips were not from the lot that yielded consistently false-positive results. Several other participants (not in this cohort) exhibited a similar phenomenon of continuous false positives, some of which we were able to track to a specific test lot (see Supplemental Information).
Period of Presumed Infectiousness as a Factor of Infectious Viral-Load Threshold
It has been proposed that Ag-RDTs may detect presumed infectious individuals due to higher viral loads in these individuals. To assess this, individuals were presumed to be infectious based on viral loads exceeding infectious viral-load thresholds (IVLTs) of either 10^4, 10^5, 10^6 or 10^7 copies/mL. For each IVLT, the presence of specimens with infectious viral loads was plotted relative to the first RT-qPCR positive in any specimen type and positive paired Ag-RDT results were overlaid (Fig 4A-D). As the IVLT increased, the length of the total infectious period for each participant typically decreased.
All 17 individuals had presumed infectious viral loads (>10^4 copies/mL) in ANS specimens in at least one timepoint, and all but one individual (Fig 4A[N]) additionally exhibited presumed infectious viral loads in OPS and/or SA (Fig 4A-D). If the infectious periods in OPS and SA overlapped perfectly with the infectious period in ANS, then infectious viral loads in other specimen types would not affect the performance of the Ag-RDT to detect infectious individuals. However, we found that the presumed infectious periods for different specimen types are often asynchronous (Fig 4A-D). Moreover, positive antigen tests only overlapped with 60% of timepoints with infectious viral loads above 10^4 copies/mL and 80% of timepoints with infectious viral loads above 10^7 copies/mL.
Given that the infectious periods for different specimen types were often asynchronous (Fig 4A-D), considering infectiousness in all three specimen types yielded a significantly longer infectious period than if only ANS viral loads were considered (Fig 4E, Fig S2) across all infectious viral load thresholds. We also found that the infectious period in ANS and OPS together was longer than any other combination of two specimen types, and similar to that of all three specimen types (Fig S2). These results suggest that testing only single specimen types (such as ANS) may fail to detect individuals with infectious viral loads in untested specimen types.
Inferred Clinical Sensitivity in the Presumed Infectious Period Depends on Specimen Type, Infectious Viral-Load Threshold (IVLT) and Assay Analytical Sensitivity
We next hypothesized that the discrepancy in the length of the infectious period would decrease the clinical sensitivity of an ANS Ag-RDT in identifying potentially infectious individuals. To illustrate this point, we compared the inferred clinical sensitivity to detect individuals with infectious viral loads in only the tested specimen type (Fig 5A-C) or in any specimen type (Fig 5D-F).
First, this analysis demonstrates that setting an IVLT at or above the LOD of an assay will artificially inflate inferred clinical sensitivity of that assay in detecting infectious individuals. We highlight three instances (red boxes in Fig 5A-C) where inferred clinical sensitivities increase by up to 84% as a result of assay LOD being just slightly greater than or equal to the defined IVLT. Perfect performance is observed in the lower-right triangular matrix (Fig 5A-C) because the assay LOD is equal to or less than the IVLT; in these cases, only specimens with viral loads above the LOD (and therefore likely detectable) would be considered infectious by definition. Generally, clinical sensitivity increases as the infectious threshold increases, whereas inferred clinical sensitivity decreases as test LOD increases. This analysis shows that defining an infectious threshold that is similar to the assay LOD (or worse, using an assay’s LOD to define the infectious threshold), can grossly overestimate the inferred clinical sensitivity of the assay in detecting infectious individuals.
Second, the inferred clinical sensitivity when considering only the tested specimen type (Fig 5A-C) decreases substantially in nearly all combinations of IVLT and LOD when viral loads in all three specimen types are considered (Fig 5D-F); in many cases, inferred clinical sensitivity decreases by more than half. This demonstrates a serious implication of assuming an individual may only be infectious in one specimen type. Further, when infectiousness in any specimen type is considered, no single specimen type achieved a clinical sensitivity above 85% in detecting infectious individuals, regardless of selected test LOD or infectious threshold. This analysis clearly demonstrates that clinical sensitivity to detect presumed infectious individuals will be grossly overestimated for all combinations of IVLT/assay LOD when the viral loads in untested specimen types are not considered.
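Both effects described in this section can be reproduced on toy data. The sketch below computes inferred clinical sensitivity for one tested specimen type, defining infectious timepoints either by the tested specimen alone or by any specimen type. The function name, data layout, "at or above" comparisons, and viral loads are all invented for illustration and are not the study's data or code.

```python
from typing import Dict, List


def inferred_sensitivity(
    timepoints: List[Dict[str, float]],
    tested: str,
    ivlt: float,
    lod: float,
    any_type: bool,
) -> float:
    """Fraction of infectious timepoints where the tested specimen is >= lod.

    any_type=False: infectious means the tested specimen is >= ivlt.
    any_type=True:  infectious means any specimen type is >= ivlt.
    """
    if any_type:
        infectious = [tp for tp in timepoints if max(tp.values()) >= ivlt]
    else:
        infectious = [tp for tp in timepoints if tp[tested] >= ivlt]
    if not infectious:
        return float("nan")
    detected = sum(tp[tested] >= lod for tp in infectious)
    return detected / len(infectious)


# Invented viral loads (copies/mL) with a delayed rise in ANS.
toy_timepoints = [
    {"SA": 1e5, "ANS": 0.0, "OPS": 1e6},
    {"SA": 1e6, "ANS": 1e4, "OPS": 1e7},
    {"SA": 1e7, "ANS": 1e6, "OPS": 1e8},
    {"SA": 1e6, "ANS": 1e8, "OPS": 1e6},
    {"SA": 1e3, "ANS": 1e5, "OPS": 1e4},
]

if __name__ == "__main__":
    for any_type in (False, True):
        for ivlt in (1e4, 1e6):
            s = inferred_sensitivity(toy_timepoints, "ANS", ivlt, 1e6, any_type)
            print(f"any_type={any_type} IVLT={ivlt:.0e} LOD=1e+06 -> {s:.2f}")
```

With the LOD equal to the IVLT and infectiousness defined by ANS alone, sensitivity is 100% by construction; once SA and OPS viral loads are allowed to define infectiousness, the same ANS test detects only half of the infectious timepoints in this toy example.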
Performance of Daily Rapid Antigen Tests to Detect the Pre-Infectious and Infectious Periods
We next wished to investigate the performance of a daily rapid antigen test to detect the pre-infectious and infectious periods. We separately analyzed IVLTs of 10^4 to 10^7 copies/mL and plotted the observed clinical sensitivity of the Ag-RDT alongside the inferred clinical sensitivity of ANS specimens tested by an assay with a similar LOD of 10^6 copies/mL.
We found strong agreement between the inferred clinical sensitivity of ANS specimens tested with an assay with LOD of 10^6 copies/mL and the observed clinical sensitivity of the Ag-RDT in both the pre-infectious and infectious periods and across all four infectious thresholds (Fig 6A-D,I). This analysis supports that the performance of a given specimen type and assay analytical sensitivity can successfully be predicted using observed quantitative viral loads.
In the pre-infectious period, we observed that all low-analytical-sensitivity ANS tests (inferred by viral loads and observed Ag-RDT) failed to detect infections across all four infectious viral load thresholds. ANS rapid antigen testing was positive in, at most, 1 of 34 timepoints in the pre-infectious period of infection (Fig 6C,D).
Increasing the IVLT increased the pre-infectious period and decreased the infectious period, and therefore also increased the observed and inferred clinical sensitivity of these low-analytical-sensitivity assays to detect infectious individuals (Fig 6A-D). However, even at the highest IVLT (10^7 copies/mL), we observed that ANS Ag-RDTs detected only 63% of presumed infectious individuals (95% CI 54-71%, Fig 6D). Overall, ANS Ag-RDTs had poor detection of both pre-infectious and infectious individuals.
Ag-RDTs Using AN–OP Combination Swab Inferred to Significantly Improve Detection in Infectious Period
We next evaluated the clinical sensitivity of different specimen types tested with either high- or low-analytical-sensitivity assays to detect individuals during the infectious period. No single specimen type (SA, ANS, or OPS) achieved 95% inferred clinical sensitivity with either a high- (LOD of 10^3 copies/mL) or low-analytical-sensitivity (LOD of 10^6 copies/mL) assay, for any IVLT (Fig 6E-H). However, high-analytical-sensitivity testing yielded significantly higher inferred clinical sensitivity over low-analytical-sensitivity testing for all specimen types, at all IVLTs (Fig 6I).
We had observed that considering infectious viral loads in multiple specimen types yielded significantly longer infectious periods (Fig 4), that individuals most frequently achieve infectious viral loads in ANS or OPS first (Fig S3-6), and that a combination of ANS and OPS captured the longest duration of any two-specimen-type combination (Fig S2). This led us to propose that a specimen type that combined AN and OP sampling on a single swab might exhibit improved performance. We created a computationally-contrived AN–OP combination swab specimen containing the higher viral load of either specimen type collected by a participant at a given timepoint. This AN–OP combination swab specimen was predicted to perform significantly better than any single specimen type, including the observed performance of rapid ANS Ag-RDT. The AN–OP combination swab’s performance was predicted to be superior when using either a high-analytical-sensitivity assay (LOD of 10^3 copies/mL) or a low-analytical-sensitivity assay (LOD of 10^6 copies/mL), across all IVLTs (Fig 6E-I). The combination AN–OP swab specimen type also had improved performance over all other possible two-specimen combination specimen types, and other assay LODs (Fig S7).
DISCUSSION
Our results revealed several important findings relevant to the use of Ag-RDTs and, by extension, other tests with low and moderate analytical sensitivity such as some nucleic acid amplification tests (NAATs) that forgo nucleic acid extraction and purification.
First, our community-based testing of the Ag-RDT showed low clinical sensitivity for detecting infected persons at any stage of infection. Overall, the observed clinical sensitivity of the Ag-RDT in our participant population was only 44%. This performance is consistent with what we24 (and others39) have predicted based on the estimated analytical sensitivity of the assay and measured ANS viral loads, suggesting user error did not substantially affect performance. This is, however, lower than the manufacturer-reported sensitivity of 83.5%,31 which was observed under a different study design: PPA was calculated relative to an ANS RT-PCR reference test, as opposed to our calculation of clinical sensitivity to detect infected status based on composite RT-qPCR results from multiple specimen types. Additionally, data from the manufacturer indicate that nearly all (84 of 91) reference-test-positive specimens originated from individuals after symptom onset, whereas our design includes both symptomatic and asymptomatic timepoints.
Many Ag-RDTs are recommended or validated for use only by symptomatic individuals, but in practice these tests are also often used for asymptomatic test-to-enter or serial-screening purposes. Although we found that Ag-RDT performance was significantly better at symptomatic timepoints than asymptomatic timepoints, the observed clinical sensitivity to detect infected persons even at symptomatic timepoints was low (<50%). Further, we found that individuals had infectious viral loads in at least one third of asymptomatic timepoints, and multiple participants had no symptoms on the day of peak viral load (sometimes >10⁸ copies/mL).
Second, the ANS Ag-RDT only detected 3% of the presumed pre-infectious timepoints and 63% of the presumed infectious timepoints. Missing almost all of the pre-infectious period and much of the infectious period can be attributed to ANS viral loads sometimes rising to the LOD of the Ag-RDT after SA and OPS have achieved infectious viral loads (9 of 17 participants). Importantly, the total period of presumed infectiousness is significantly longer when one accounts for the viral loads in multiple specimen types, and not just in ANS. Studies that have assessed the period of infectiousness by viral culture of only one specimen type may have underestimated the total infectious period.12,26,40,41 Evidence for recommended isolation periods should consider multiple specimen types,40 particularly if negative Ag-RDT results are used to release individuals from isolation. Moreover, studies evaluating low-analytical-sensitivity testing relative to infectiousness in only one specimen type will likely overestimate the performance of that test to detect the full infectious period.42,43 Several outbreak models16,20 have simulated the performance of low-analytical-sensitivity tests; test performance will be overestimated if infectiousness in only the specimen type used for testing is considered. Further, the simulated performance of tests with high LODs (such as low-analytical-sensitivity Ag-RDTs) will be drastically different depending on the IVLT used in outbreak models: if the IVLT is at or above the LOD of the simulated test, artificially high or even perfect performance will be calculated simply as a result of the chosen parameters.
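The dependence of simulated performance on the chosen IVLT can be made concrete with a small Python sketch. It assumes, as a simplification, that infectiousness is judged only from the tested specimen type and that a test is positive whenever the viral load is at or above the LOD; the function name and viral loads are hypothetical.

```python
def single_specimen_sensitivity(loads, lod, ivlt):
    """Sensitivity when infectiousness is judged from the tested specimen only."""
    infectious = [v for v in loads if v >= ivlt]
    if not infectious:
        return None
    return sum(1 for v in infectious if v >= lod) / len(infectious)


# Made-up ANS viral loads in copies/mL, for illustration only.
ans_loads = [5e2, 3e4, 8e5, 2e6, 6e7]
LOD = 1e6  # assumed LOD of a low-analytical-sensitivity test

by_ivlt = {
    ivlt: single_specimen_sensitivity(ans_loads, LOD, ivlt)
    for ivlt in (1e4, 1e5, 1e6, 1e7)
}
```

Once the IVLT reaches the LOD, every timepoint counted as infectious is by definition above the LOD, so the computed sensitivity is 100% regardless of the underlying data.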
We do not believe that the poor performance of the Ag-RDT in our study is due to a particular artifact of the study implementation, for two reasons. First, our results are generally consistent with a previous study of pre-Omicron variant infections: the performance of an ANS Ag-RDT to detect individuals positive by nasopharyngeal swab (NPS) RT-qPCR was <50% in the first days of infection, and rose to a maximum of 77% three days after symptom onset.42 However, our study is not directly comparable because that study assessed only one reference specimen type (NPS). Second, observed ANS Ag-RDT performance was in excellent agreement with the performance predicted based on ANS viral loads measured by RT-qPCR, which used a separate ANS swab and sample tube.
Only assays with the highest analytical sensitivity performed well for detecting the pre-infectious and infectious periods, because they were able to capture low viral loads in the tested specimen type while individuals had infectious viral loads in untested specimen types. For example, in one individual (Fig 3A), ANS viral loads were absent or below 10³ copies/mL for 5 days, causing a delay of 8 days between the first RT-qPCR positive in any specimen type and the first positive ANS antigen result. During this time, the individual had infectious viral loads in multiple SA and OPS specimens. The delayed rise in ANS viral loads relative to saliva is consistent with what has been observed for prior variants,21 as well as for the Omicron variant,28 but this work is the first analysis that takes into account the viral loads in all three major specimen types to analyze the entirety of the presumed infectious period. We see that when only a single specimen type is sampled (e.g., the nasal cavity in the case of Ag-RDTs), even daily testing can fail to detect pre-infectious and infectious individuals.
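A detection delay of this kind can be computed from a daily timecourse as sketched below. The tuple layout and the example series are hypothetical; the series is constructed only to reproduce an 8-day gap and is not the participant's actual data.

```python
def detection_delay_days(series):
    """Days from first RT-qPCR positive (any specimen) to first positive Ag-RDT.

    `series` is a list of (day, any_pcr_positive, ag_rdt_positive) tuples
    sorted by day. Returns None if either event never occurs.
    """
    first_pcr = next((day for day, pcr, _ in series if pcr), None)
    first_ag = next((day for day, _, ag in series if ag), None)
    if first_pcr is None or first_ag is None:
        return None
    return first_ag - first_pcr


# Made-up daily series: RT-qPCR positive from day 1, Ag-RDT positive from day 9.
series = [(day, day >= 1, day >= 9) for day in range(12)]
delay = detection_delay_days(series)
```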
Third, for low-analytical-sensitivity assays, the use of combination specimen types can significantly improve performance. High-analytical-sensitivity assays can capture instances of low viral loads in the tested specimen type while another, untested specimen type has high, infectious viral loads. This is not the case for low-analytical-sensitivity tests, so the impact of combination specimen types is more pronounced. Our data suggest that for infection with the Omicron variant, an AN–OP combination swab specimen type would be significantly more effective than single specimen types for detecting all periods of infection, including the earliest days of infection, at an LOD similar to that of the daily ANS Ag-RDT we evaluated (10⁶ copies/mL). The significantly better performance of this AN–OP combination swab specimen type over individual specimen types was robust to IVLTs from 10⁴ to 10⁷ copies/mL. Many countries already have authorized and implemented the use of combination specimen types, including for Ag-RDTs,4 yet this is not the case in the U.S., with most Ag-RDTs using nasal swabs.
We acknowledge several limitations of our study. First, we only used a single Ag-RDT, the Quidel QuickVue At-Home OTC COVID-19 Test. Although other Ag-RDTs have slightly different LODs,33-35 the concordance we observed between the inferred clinical sensitivity based on ANS viral loads and the observed clinical sensitivity of the Ag-RDT supports that these findings are generalizable, driven by the extreme differences in viral loads among specimen types, and that the performance of other Ag-RDTs can be inferred from viral-load data. Second, we make inferences about the value of a combination AN–OP swab for improved detection (including using assays with a similar LOD to the Ag-RDT used here) based on viral-load measurements, but we did not directly test a combination swab in this study population. Finally, this study was performed in the context of one SARS-CoV-2 variant and in one geographical area. As new variants (and respiratory viruses) emerge, it will be critical to re-evaluate viral-load timecourses in different specimen types to ensure specimen types and assay LODs are judiciously chosen to effectively detect the pre-infectious and infectious periods. In the absence of studies informing on these factors, combination oral-nasal swabs are likely the most robust approach to detect infections in the pre-infectious and infectious periods.
Data Availability
The data underlying the results presented in the study can be accessed via CaltechDATA: https://data.caltech.edu/records/20223.
COMPETING INTERESTS STATEMENT
RFI is a co-founder, consultant, and a director and has stock ownership of Talis Biomedical Corp.
FUNDING
This work was funded by the Ronald and Maxine Linde Center for New Initiatives at the California Institute of Technology and the Jacobs Institute for Molecular Engineering for Medicine at the California Institute of Technology. AVW is supported by a UCLA DGSOM Geffen Fellowship.
Supplemental Information
Supplementary Analyses
Discordance in Participant Interpretation of Antigen Test Results
In 2.5% of antigen tests (56 of 2,153 tests), a pink (positive) test line was visible to two study coordinators in the uploaded photographs, but the result was reported as negative by the participant. In most cases the pink lines were faint and may have been overlooked by the participants. It is also possible that in some cases the test was photographed late; per the manufacturer’s guidance, the test result is only valid at the 10-min mark. One participant with a dark pink line was queried and reported poor close-range vision; this participant had a housemate help with all further interpretations. In one case, a participant reported an invalid result, but a blue control line was visible to two study coordinators. In this manuscript, we used the participants’ interpretations in all analyses. Although only 2.5% of all rapid antigen test results had discordant interpretations, 14% of participants (33 of 228) had at least one discordant interpretation; this discordance underlines that user error can affect the sensitivity of these at-home tests in real-world settings.
Faulty Antigen Test Lot
In mid-January 2022 we observed that two asymptomatic participants had consecutive positive antigen test results, but negative results by RT-qPCR in all three specimen types tested. Further investigation revealed that the most recently taken false positives from these two participants were from the same antigen strip lot (Quidel QuickVue At-Home OTC COVID-19 Test #152000). A third participant (Fig 3D) also had a single false-positive test from this lot the same week. This lot was immediately pulled from circulation in the study, and reported to the manufacturer and to the FDA (via a MedWatch Voluntary Report).
To investigate the issue further in the laboratory, blank antigen test buffer and commercial nasal fluid from healthy human donors were applied to antigen test strips from this lot. Only this lot, not several other lots tested, consistently yielded faint but visible pink test lines. Full details of this follow-up investigation are in preparation.
Following an IRB amendment, participants began photographing the antigen test strip lot number visible when they reported their Ag-RDT results. Known test results from this faulty lot were marked as invalid and excluded from analysis (Fig 2). In one of the 17 participants enrolled during the early period of infection (Fig 3D), the antigen test result from this lot is noted with a “?” on the participant’s plot, and the datapoint was excluded from subsequent analyses (Fig 4, 6). We emphasize that because we were not recording antigen test lot numbers from the beginning of the study, we do not know how many of the results prior to mid-January were from lot #152000.
AUTHOR CONTRIBUTIONS (listed alphabetically by last name)
Reid Akana (RA): Collaborated with AVW in creating digital participant symptom surveys; assisted with data quality control/curation with NS, HD, SC; created current laboratory information management system (LIMS) for specimen logging and tracking. Created an iOS application for sample logging/tracking. Configured an SQL database for data storage. Created an Apache server and websites to view study data. Configured an FTPS server to catalog PCR data. Wrote a Python package to access study data. Trained study coordinators on SQL. Performed troubleshooting and QC of the LIMS. Made Figures 4, 5, S2, S3, S4, S5. Wrote and edited the manuscript with AVW and NS.
Alyssa M. Carter (AMC): Assisted with the inventory and archiving of >6,000 samples at Caltech; coordinated shipment of samples to Caltech with AER and JRBR; assisted with procurement of antigen tests; assisted with organizing volunteers and making participant kits; assisted AER in developing and implementing QC for participant kits. Led the in-lab investigation of antigen false-positive results; designed and performed experiments for lot analysis of the Quidel QuickVue At-Home OTC COVID-19 Tests. Provided feedback and edited the manuscript.
Yap Ching Chew (YCC): Primary liaison with Caltech team. Prepared and provided Zymo SafeCollect kits and related materials to Caltech team. Supervised the extraction, PCR, and QC teams at Pangea Laboratory. Sent PCR results daily to Caltech team. Arranged for Pangea team to perform viral-variant sequencing on selected samples; reported results and provided sequencing files.
Saharai Caldera (SC): Study coordinator; recruited, enrolled and maintained study participants with NS and HD; study-data quality control, curation and archiving with RA, NS, HD and MKK; supplies acquisition with AER, NS, HD and MKK.
Hannah Davich (HD): Lead study coordinator; co-wrote participant informational sheets with NS; developed recruitment strategies and did outreach with NS; participant kit creation and co-coordinated kit-making by volunteers with AER; recruited, enrolled and maintained study participants with NS and SC; managed the study-coordinator inventory; study-data quality control, curation and archiving with RA, NS, SC and MKK; supplies acquisition with AER, NS, SC and MKK.
Matthew Feaster (MF): Co-investigator; collaborated with AVW, MMC, NS, Y-YG, RFI on study design and recruitment strategies; provided guidance and expertise on SARS-CoV-2 epidemiology and local trends.
Ying-Ying Goh (Y-YG): Co-investigator; collaborated with AVW, MMC, NS, MF, RFI on study design and recruitment strategies; provided guidance and expertise on SARS-CoV-2 epidemiology and local trends.
Rustem F. Ismagilov (RFI): Principal investigator; collaborated with AVW, MMC, NS, MF, Y-YG on study design and recruitment strategies; provided leadership, technical guidance, oversight of all analyses, and was responsible for obtaining the primary funding for the study.
Mi Kyung Kim (MKK): Study coordinator (part-time); maintaining participants with NS, HD, and SC; study-data quality control, curation and archiving with RA, NS, SC and HD; supplies acquisition with AER, NS, SC and HD; collected contact info for local university/college student health centers for recruitment outreach; assembled Table S1 with NS.
John Raymond B. Reyna (JRBR): Organized sample labeling and short-term storage of all samples at Pangea Laboratory. Arranged shipment of all samples to Caltech team. Assisted with processing of the specimens.
Anna E. Romano (AER): Co-coordinated kit-making by volunteers with HD; implemented QC process for kit-making; participated in kit making; managed logistics for the inventory and archiving of >6,000 samples at Caltech; supplies acquisition with HD, NS, SC and MKK; assisted with securing funding; compiled antigen lot data to assist false-positive antigen test investigation; organized and performed QC on sequencing data. Provided feedback and edited the manuscript.
Natasha Shelby (NS): Study administrator; collaborated with AVW, RFI, Y-YG, MF on initial study design and recruitment strategies; co-wrote IRB protocol and informed consent with AVW; co-wrote enrollment questionnaire and post-study questionnaire with AVW; initiated the collaboration with Zymo and served as primary liaison throughout study; reviewed pilot sampling data and amended instructional sheets/graphics for specimen collections in collaboration with Zymo; co-wrote participant informational sheets with HD; hired, trained, and supervised the study-coordinator team; developed recruitment strategies and did outreach with HD; recruited, enrolled and maintained study participants with HD and SC; co-developed participant keep/drop criteria with AVW; performed the daily upload, review, and QC of PCR data received from Zymo; made the daily keep/drop decisions based on viral-load trajectories in each household; made all phone calls to alert presumptive positives of their status and provide resources; study-data quality control, curation and archiving with RA, HD, SC and MKK; organized archiving of all participant data and antigen-test photographs; supplies acquisition with AER, HD, SC and MKK; assisted with securing funding; managed the overall study budget; assembled Fig 1 with AVW; assembled Table S1 with MKK; made Fig 3 with AVW; managed citations and reference library; verified the underlying data with AVW and RA; co-wrote and edited the manuscript with AVW and RA.
Matt Thomson (MT): Assisted with statistical approach and analyses.
Colten Tognazzini (CT): Coordinated the recruitment efforts at PPHD with case investigators and contact tracers; provided guidance and expertise on SARS-CoV-2 epidemiology and local trends.
Alexander Viloria Winnett (AVW): Collaborated with NS, RFI, Y-YG, MF on initial study design and recruitment strategies; co-wrote IRB protocol and informed consent with NS; co-wrote enrollment questionnaire and post-study questionnaire with NS; co-developed participant keep/drop criteria with NS; funding acquisition; designed and coordinated LOD validation experiments; selected and prepared specimens for viral-variant sequencing with NS, YCC, and AER; assisted with the inventory and archiving of >6,000 specimens at Caltech with AER and AMC; minor role supporting outreach by HD and NS; minor role supporting kit-making by AER, HD and AMC; verified the underlying data with NS and RA; assembled Fig 1 with NS; made Fig 3 with NS; performed analysis and prepared Fig 2, Fig 6, Fig S1, and Table S2. Co-wrote and edited the manuscript with NS and RA.
Taikun Yamada (TY): Performed the RT-qPCR COVID-19 testing at Pangea Laboratory.
ACKNOWLEDGEMENTS
We sincerely thank the study participants for making this work possible. We thank Lauriane Quenee, Grace Fisher-Adams, Junie Hildebrandt, Megan Hayashi, RuthAnne Bevier, Chantal D’Apuzzo, Ralph Adolphs, Victor Rivera, Steve Chapman, Gary Waters, Leonard Edwards, Gaylene Ursua, Cynthia Ramos, and Shannon Yamashita for their assistance and advice on study implementation and/or administration. We thank Jessica Leong, Ojas Pradhan, Si Hyung Jin, Emily Savela, Bridget Yang, Ekta Patel, Hsiuchen Chen, Paresh Samantaray, Zeynep Turan, Cindy Kim, Trinity Lee, Vanessa Mechan, Katherine Stiefel, Rosie Zedan, Rahulijeet Chadha, Minkyo Lee, and Jenny Ji for volunteering their time to help with this study. We thank Prabhu Gounder, Tony Chang, Jennifer Howes, and Nari Shin for their support with recruitment. Finally, we thank all the case investigators and contact tracers at the Pasadena Public Health Department and Caltech Student Wellness Services for their efforts in study recruitment and their work in the pandemic response.