ABSTRACT
Background Screening testing, often via self-collected specimens, remains a key strategy to detect infections early and prevent SARS-CoV-2 transmission, and to enable earlier initiation of treatment. However, which specimen type best detects the earliest days of infection remains controversial. Further, the analytical sensitivity of diagnostic tests must also be considered, as viral loads below a test’s limit of detection (LOD) are likely to yield false-negative results. Comparisons of quantitative, longitudinal SARS-CoV-2 viral-load timecourses in multiple specimen types can determine the best specimen type and test analytical sensitivity for earliest detection of infection.
Methods We conducted a COVID-19 household transmission study between November 2021 and February 2022 that enrolled 228 participants and analyzed 6,825 samples using RT-qPCR to quantify viral-load timecourses in three specimen types (saliva [SA], anterior-nares swab [ANS], and oropharyngeal swab [OPS]). From this study population, 14 participants enrolled before or at the incidence of infection with the Omicron variant. We compared the viral loads in specimens collected from each person at the same timepoint, and the longitudinal viral-load timecourses from each participant. Using these viral loads, we inferred the clinical sensitivity of each specimen type to detect infected, pre-infectious, and infectious individuals (based on presumably infectious viral-load levels) using assays with a range of analytical sensitivities. We also inferred the clinical sensitivity of computationally-contrived specimen types representing combinations of single specimen types.
Results We found extreme differences (up to 10⁹ copies/mL) in viral loads between paired specimen types in the same person at the same timepoint, and that longitudinal viral-load timecourses across specimen types did not correlate. Because of this lack of correlation, infectious viral loads were often observed in different specimen types asynchronously throughout the course of the infection. In the first 4 days of infection, no single specimen type was inferred to achieve >95% detection of infected or infectious individuals, even with the highest analytical sensitivity assays. In nearly all participants (11/14), a rise in ANS viral loads was delayed (as many as 7 days) relative to SA and OPS. We also observed that ANS and OPS had the most complementary viral-load timecourses, resulting in optimal inferred performance with a computationally-contrived combined anterior-nares–oropharyngeal (AN–OP) swab specimen type. The combination AN–OP swab had superior inferred clinical sensitivity in the first 8 days of infection with both high- and low-analytical-sensitivity assays. This AN–OP swab was also inferred to significantly improve detection of pre-infectious and infectious individuals over any single specimen type.
Conclusion Our work demonstrates that the viral load in one specimen type cannot reliably predict the viral load in another specimen type. Combination specimen types may offer a more robust approach for earliest detection of new variants and respiratory viruses when viral kinetics are still unknown.
INTRODUCTION
As new SARS-CoV-2 variants-of-concern (and new respiratory viruses) emerge with different viral kinetics,1 it is important to continually re-evaluate testing strategies (including specimen type and test analytical sensitivity) for detecting pre-infectious and infectious individuals. Early detection limits transmission within communities and the global spread of new variants. Appropriate screening prior to congregate settings (camps/schools, airplanes/cruises, etc.) can reduce outbreaks.2-4 Early detection also enables earlier initiation and more effective treatments.5-8
Testing strategies to achieve detection in the pre-infectious and infectious periods of new SARS-CoV-2 variants or new viruses require filling two critical knowledge gaps: (1) Which respiratory specimen accumulates virus first? (2) What is the appropriate test analytical sensitivity to detect the earliest accumulation of virus (in the pre-infectious and infectious stages)? These two gaps must be filled in parallel. The first gap is knowing which specimen type accumulates virus first. Commonly, an individual’s infection is described by the viral load sampled from a single specimen type. This practice is appropriate when there is one principal specimen type (e.g., HIV in blood plasma). However, some respiratory infections, including SARS-CoV-2, exhibit viral tropism and the shifting of viral loads across respiratory sampling sites.9,10
Nasopharyngeal (NP) swabs have been the gold standard for SARS-CoV-2 detection, but NP swabs are poorly tolerated and challenging for serial sampling, so many alternate specimen types are now widely used.11,12 Some of these are suitable for routine testing,13 and are approved for self-collection (e.g., saliva [SA], anterior-nares swabs [ANS], and oropharyngeal swabs [OPS]) in some countries. Studies comparing paired specimen types in the same person at the same timepoint have found that cycle threshold (Ct, a proxy for viral load) values can differ significantly between specimen types.14-17 In some cases, viral loads in one specimen type are low or even absent while viral loads in another type may be high.16 Viral variant plays a role too; one study15 suggested that healthcare-worker-acquired mid-turbinate (MT) swabs had significantly better performance than self-collected oral swabs for Beta and Delta variants but not Omicron.14 Several studies,18,19 news media,20 and social media posts have speculated that in Omicron infections viral load accumulates in oral specimens before the nasal cavity, but findings from single timepoints are contradictory and rigorous comparisons starting from the incidence of infection are needed. Thus, despite compelling anecdotal evidence that oral specimens might be superior for detection of Omicron infections, nasal swabs (including those in rapid antigen tests) continue to be the dominant specimen type used in the U.S., including for workplace screenings and at-home testing. The lack of consensus on optimal specimen type calls into question whether it is appropriate to describe a SARS-CoV-2 infection by the viral load in only one specimen type.
The second knowledge gap is the analytical sensitivity needed for reliable detection of pre-infectious and infectious individuals. The assay analytical sensitivity (or limit of detection, LOD) required to detect an infection depends on the viral load in the specimen type tested. Generally, the LOD of a molecular test describes its ability to detect and quantify target at or above a certain concentration in that specimen type with >95% probability.21 In early SARS-CoV-2 variants, some studies showed that saliva accumulated virus earlier than nasal swabs, but at very low levels16,22; therefore, detecting earliest saliva accumulation required a high-analytical-sensitivity (i.e., low LOD) assay.23
Detection in the pre-infectious period is ideal so that infected persons can isolate before transmission occurs, and detection during the infectious period is critical to minimize outbreaks. Replication-competent virus has been recovered from saliva,24 OPS,25 and nasal swabs,26 but it is both impractical and infeasible to perform viral culture on each positive specimen to determine if a person is infectious. However, studies that performed both culture and RT-qPCR have found that lower Ct values (higher viral loads) are associated with infectious virus. Specific viral loads likely to be infectious for each specimen type have not been established,27 partly because Ct values are not comparable across assays28 and culture methods differ. However, as a general reference, viral loads of 10⁴–10⁷ RNA copies/mL are associated with the presence of replication-competent virus in culture.29-38 In the SARS-CoV-2 human challenge study, viral loads of just 10² (nose) and 10⁴ (throat) copies/mL were associated with infectious virus in the first days of infection.39 Modeling papers have used infectious thresholds of 10⁴ and 10⁵ copies/mL.34,38 The enormous range (more than 4 orders of magnitude) in observed viral loads that correspond with infectiousness emphasizes why robust quantitative measurements of viral loads in different specimen types are needed to make predictions about testing strategies that will detect the pre-infectious and infectious periods.
There are three main methods for defining the infectious period for an individual based on viral loads, all of which will be utilized in this manuscript. First, the infectious period may be defined as the continuous period between the first specimen (of any type) with an infectious viral load until the first timepoint after which no specimen has an infectious viral load.40-42
However, high-frequency measurements show that viral loads fluctuate, so some timepoints within the continuous infectious period contain lower viral loads and negative viral-culture results.43-46 To account for viral-load fluctuations, one may instead define an instantaneous infectious period (i.e., an individual is presumed infectious only when at least one specimen type has a viral load above the infectious viral load threshold). Both methods neglect the role of the neutralizing immune response, and the impact of infection stage on viral-culture positivity.30,47 To account for these factors, the presumed infectious period may be limited to a number of days following symptoms or the first infectious timepoint.
The “pre-infectious” period is all SARS-CoV-2-positive timepoints prior to the first timepoint in which any specimen type contains viral load greater than the indicated infectious viral load threshold. In the pre-infectious period, viral loads in some specimen types may be low, so both selection of the right specimen type, and an assay with appropriate analytical sensitivity are required to reliably detect the earliest presence of virus. What remains largely unknown is the load at which virus first becomes detectable, when viral load rises to infectious levels, and when viral loads rise to levels detectable by tests with low analytical sensitivity (high LODs). Importantly, how all these factors may differ among different specimen types remains unknown.
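The continuous, instantaneous, and pre-infectious definitions above can be illustrated with a minimal Python sketch. This is not the study's code: the data layout (one dictionary of viral loads per daily timepoint), the function names, and the treatment of any detected load as "positive" are assumptions made purely for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical layout: one dict per daily timepoint, mapping specimen type
# to viral load in copies/mL (None means the specimen was negative).
Timepoint = Dict[str, Optional[float]]


def max_load(tp: Timepoint) -> float:
    """Highest viral load across all specimen types at one timepoint."""
    return max((v or 0.0) for v in tp.values())


def instantaneous_infectious(timecourse: List[Timepoint], threshold: float) -> List[bool]:
    """Presumed infectious only where at least one specimen exceeds the threshold."""
    return [max_load(tp) > threshold for tp in timecourse]


def continuous_infectious(timecourse: List[Timepoint], threshold: float) -> List[bool]:
    """Presumed infectious from the first through the last timepoint above the threshold."""
    flags = instantaneous_infectious(timecourse, threshold)
    if True not in flags:
        return flags
    first = flags.index(True)
    last = len(flags) - 1 - flags[::-1].index(True)
    return [first <= i <= last for i in range(len(flags))]


def pre_infectious(timecourse: List[Timepoint], threshold: float) -> List[bool]:
    """Positive timepoints before the first timepoint above the threshold.

    Assumption: any detected (non-zero) load counts as SARS-CoV-2-positive.
    """
    flags = instantaneous_infectious(timecourse, threshold)
    first = flags.index(True) if True in flags else len(flags)
    return [i < first and max_load(tp) > 0.0 for i, tp in enumerate(timecourse)]
```

Under these assumptions, a day on which all loads dip below the threshold between two infectious days is counted as infectious by the continuous definition but not by the instantaneous one.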
The assumption made early in the COVID-19 pandemic that viral load always rises rapidly from undetectable to likely infectious,48 has been challenged by several longitudinal studies of viral load in different specimen types.16,22,45,46,49 The finding that viral load can rise slowly over several days (not hours) is encouraging, because this window provides more time to identify and isolate pre-infectious individuals.16,22 However, making use of this opportunity requires a thorough understanding of how viral load changes in each specimen type early in infection for each variant.
Similarly, to reliably detect an infectious person, an infectious specimen must be tested with an assay that has an LOD below the infectious viral load for that specimen type. However, many authorized COVID-19 tests (including rapid antigen tests)11,50 have LODs well above the range of reported infectious viral loads.51,52 Moreover, infectious viral-load values may differ by variant, necessitating studies of within-host viral-load dynamics over time in multiple specimen types as variants emerge.
Filling these two critical (and inter-related) knowledge gaps about specimen type and assay LOD requires high-frequency quantification of viral loads (not semi-quantitative Ct values) in multiple specimen types starting from the incidence of infection (not after a positive test or symptom onset, as is commonly done). Moreover, quantification must be performed with a high-analytical-sensitivity assay to capture low viral loads in the first days of detectable infection. It is challenging to acquire such data. Individuals at high risk of infection must be prospectively enrolled prior to detectable infection and tested longitudinally at high frequency in multiple paired specimen types.
To our knowledge, only three studies have reported longitudinal viral-load timecourses in multiple, paired specimen types from early infection. A university study26 captured daily saliva and nasal-swab samples for 2 weeks from 60 individuals, only 3 of whom were negative for SARS-CoV-2 upon enrollment, allowing only limited inferences about earliest detection. In our prior study, we captured twice-daily viral-load timecourses from 72 individuals for 2 weeks,53 seven of whom were negative upon enrollment.16 In 6 of 7 individuals, we inferred from viral-load quantifications that a high-analytical-sensitivity saliva assay would detect infection earlier than a low-analytical-sensitivity ANS test. However, participants in both studies were infected with pre-Omicron variants, and neither used OPS, which have been speculated to be superior at detecting early Omicron infections.20 Only one longitudinal study18 analyzed viral loads in all three major specimen types (SA, ANS and OPS) over time in Omicron. Unfortunately, daily measurements in all three sample types were captured for only two individuals, both of whom were already positive upon enrollment.
Here, we measured and analyzed the viral-load timecourses of the Omicron variant in three specimen types appropriate for self-sampling (SA, ANS, and OPS) by individuals starting from before the incidence of infection as part of a household transmission study in Southern California. From these data, we inferred the clinical sensitivity of each specimen type across a range of analytical sensitivities, to determine which specimen type and analytical sensitivity would yield the most reliable detection of pre-infectious and infectious individuals infected with the currently circulating Omicron variant. An accompanying paper reports the results of daily rapid antigen testing in an expanded cohort from this study.54
METHODS
Study Participants
This case-ascertained study of household transmission was conducted in the greater Los Angeles County area between November 23, 2021 and March 1, 2022 under Caltech IRB #20-1026.55 Additional details can be found in the Supplemental Information. A total of 228 participants from 56 households were enrolled in the study, 90 of whom tested positive for SARS-CoV-2 infection during enrollment (Fig 1A).
We limited our analyses to 14 individuals (Table 1B, Table S1, Fig 2) who enrolled in the study at or before the incidence of acute SARS-CoV-2 infection. To be included in the cohort, a participant must have had at least one specimen type with viral loads below quantification upon enrollment, followed by positivity and quantifiable viral loads in all three specimen types.
Sample Collection
Each day, participants reported symptoms, then self-collected saliva (SA), anterior-nares swab (ANS), and posterior oropharyngeal (hereafter oropharyngeal) swab (OPS) specimens for RT-qPCR testing in Zymo Research’s SafeCollect devices (CE-marked for EU use), following manufacturer’s instructions.56,57 Participants collected specimens immediately upon enrollment, then daily every morning upon waking, as morning sample collection has been shown to yield higher viral loads than evening collection.53
RT-qPCR Testing for SARS-CoV-2
Extraction and RT-qPCR were performed at Pangea Laboratories (Tustin, CA, USA) using the FDA-authorized Quick SARS-CoV-2 RT-qPCR Kit, with results (positive, negative, inconclusive) assigned per manufacturer criteria.58 Additional details can be found in Supplemental Information. This assay has a reported LOD of 250 copies/mL of sample.
Quantification of Viral Load from RT-qPCR Result
To quantify viral load in RT-qPCR specimens, contrived specimens across a 13-point standard curve (dynamic range from 250 copies/mL to 4.50×10⁸ copies/mL) for each specimen type were generated at Caltech and underwent extraction and RT-qPCR as described above. All three replicates at 250 copies/mL of specimen were detected, independently validating the reported LOD for the assay. For each specimen type, the standard curve generated an equation to convert from SARS-CoV-2 N gene Ct values to viral loads in genomic copy equivalents (hereafter copies) per mL of each specimen type. See Supplemental Information for additional details and equations.
Positive specimens with viral loads that would be quantified below the assay LOD (250 copies/mL) were considered not quantifiable.
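As an illustration of how a standard curve converts Ct values to viral loads, the following Python sketch assumes the common log-linear form Ct = slope × log10(copies/mL) + intercept. The slope and intercept values below are hypothetical placeholders, not the study's fitted coefficients (those are given in the Supplemental Information), and the function names are our own.

```python
from typing import Optional

LOD_COPIES_PER_ML = 250.0  # assay LOD reported in the Methods

# HYPOTHETICAL (slope, intercept) pairs for Ct = slope * log10(load) + intercept.
# The real per-specimen-type equations are in the Supplemental Information.
CURVES = {
    "SA": (-3.32, 40.0),
    "ANS": (-3.32, 40.0),
    "OPS": (-3.32, 40.0),
}


def ct_to_viral_load(ct: float, specimen_type: str) -> float:
    """Invert the log-linear standard curve to get copies/mL from a Ct value."""
    slope, intercept = CURVES[specimen_type]
    return 10 ** ((ct - intercept) / slope)


def quantifiable_load(ct: Optional[float], specimen_type: str) -> Optional[float]:
    """Return the viral load, or None if negative or below the assay LOD."""
    if ct is None:  # no amplification: specimen is negative
        return None
    load = ct_to_viral_load(ct, specimen_type)
    # Positive specimens below the LOD are treated as not quantifiable.
    return load if load >= LOD_COPIES_PER_ML else None
```

With a slope near -3.32, each additional cycle corresponds to roughly a halving of the starting template, so small Ct differences translate into large differences in copies/mL.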
Viral Sequencing and Lineage/Variant Determination
Viral sequencing of at least one specimen for each participant with incident infection was performed on ANS or OPS specimens with moderate to high viral loads by Zymo Research at Pangea Lab. The sequencing protocol used a variant ID detection workflow that closely resembles the Illumina COVIDSeq™ NGS Test.59,60 See Supplemental Information for details.
Statistical Analyses
Comparison of Viral-Load Timecourses Across Specimen Types
To quantify the difference between viral-load timecourses, we first aligned each timecourse to the time of collection of the first SARS-CoV-2-positive specimen (of any type) for each participant. Differences between viral loads from the same infection timepoint were quantified. We compared both intra- and interparticipant viral-load timecourses: when the lengths of two participant timecourses differed, the longer timecourse was truncated. We then hypothesized that if the viral-load timecourses followed the same time-dependent distribution, then the observed ‘noise’ between these viral-load measurements would be attributable to expected sampling noise.
Expected sampling noise was estimated as a zero-centered normal distribution fitted on human RNase P control target measurements (Fig S4B, additional details can be found in Supplemental Information). The distribution of observed ‘noise’ between viral-load measurements was obtained by performing maximum likelihood estimation (MLE) on each pair of viral-load timecourses being compared. We then tested whether observed differences in viral load across pairs of viral-load timecourses could be explained by expected sampling noise alone. P-values were then obtained by performing upper-tailed Kolmogorov–Smirnov tests over the differences between the distributions of the observed noise across viral-load timecourses and expected sampling noise. Two-stage Benjamini–Hochberg correction was used to limit the false discovery rate to 5%; viral-load timecourse comparisons with adjusted P-values <0.05 were considered statistically significantly different. See Supplemental Information for additional details. Analyses were performed in Python 3.8 using the scipy package.61
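To make the multiple-testing step concrete, the sketch below implements the standard one-stage Benjamini–Hochberg step-up adjustment in plain Python. Note that this is a simplification: the analysis described above used the two-stage variant, which additionally estimates the proportion of true null hypotheses. The function name is hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return one-stage BH-adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A comparison would then be called significant when its adjusted P-value is below 0.05, mirroring the decision rule stated above.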
Inference of Clinical Sensitivity by Viral-Load Quantification
Inferred clinical sensitivity of a given specimen type and analytical sensitivity was calculated for each timebin of infection as the number of specimens of a given type with viral load above a given LOD divided by all participants considered infected (Fig 4) or infectious (Figs 6-7) at that timepoint. Confidence intervals were calculated as described in the Clinical Laboratory Standards Institute EP12-A2 User Protocol for Evaluation of Qualitative Test Performance.62 Statistical testing for differences in inferred clinical sensitivity was performed for paired data (comparing performance at two LODs for one specimen type, or at one LOD for two paired specimen types collected by a participant at a timepoint) using the McNemar Exact Test,63 and for unpaired data (comparing the performance of one specimen type at one LOD between infection stages) using the Fisher Exact Test.64 Analyses were performed in Python v3.8.8.
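The following Python sketch illustrates the kind of calculation involved. It assumes a Wilson score interval for the confidence limits and a two-sided exact (binomial) McNemar test on discordant pairs; whether these match the exact procedures applied in the study is an assumption, and the function names are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(detected: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    p = detected / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    b: pairs detected by test 1 only; c: pairs detected by test 2 only.
    """
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(math.comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Example using counts reported in the Results: 52 of 59 timepoints detected.
sensitivity = 52 / 59
low, high = wilson_interval(52, 59)
```

Only discordant pairs carry information in the McNemar test, which is why paired specimens that are both detected or both missed do not enter the calculation.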
Participants were considered infected from the time of collection of the first SARS-CoV-2-positive specimen (of any type) until negative in all three specimen types by RT-qPCR. Individuals were presumed to be infectious based on whether the viral load in any specimen type was above or below an infectious viral-load threshold (10⁴, 10⁵, 10⁶, or 10⁷ copies/mL) and fulfilled additional definitions described in the Introduction.
Combination specimen types (e.g., anterior-nares–oropharyngeal [AN–OP] combination swab) were computationally-contrived to have either the maximum (Fig 5-8) or average (Fig S7) viral loads from the specimen types included in the combination that were collected by a participant at that timepoint.
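A minimal Python sketch of this computation is shown below. The dictionary layout, the function name, and the choice to treat undetected specimens as contributing a load of zero are illustrative assumptions rather than details taken from the study.

```python
from statistics import mean
from typing import Dict, List, Optional


def combined_load(
    loads: Dict[str, Optional[float]],
    types: List[str],
    how: str = "max",
) -> float:
    """Viral load of a computationally-contrived combination specimen.

    loads: viral load (copies/mL) per specimen type at one timepoint;
           None means not detected (assumed here to contribute 0).
    types: specimen types included in the combination, e.g. ["ANS", "OPS"].
    how:   "max" for the maximum, "mean" for the average.
    """
    values = [loads.get(t) or 0.0 for t in types]
    if how == "max":
        return max(values)
    if how == "mean":
        return mean(values)
    raise ValueError("how must be 'max' or 'mean'")


# Example: an AN-OP combination at a timepoint where only OPS is positive.
timepoint = {"SA": 2e3, "ANS": None, "OPS": 4e6}
an_op_max = combined_load(timepoint, ["ANS", "OPS"], how="max")
an_op_mean = combined_load(timepoint, ["ANS", "OPS"], how="mean")
```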
RESULTS
Among the 228 enrolled participants, incident SARS-CoV-2 infection was observed in 14 participants (Fig 1, see Methods), all of whom were enrolled before or at the start of acute SARS-CoV-2 infection with the Omicron variant of concern.65 All 14 had also received at least one vaccine dose more than 2 weeks prior to enrollment (Table S1). From this cohort, 260 saliva (SA), 260 oropharyngeal swab (OPS), and 260 anterior-nares swab (ANS) specimens were collected for viral-load quantification (see Methods) and plotted relative to enrollment in the study (Fig 2). All participants additionally took daily rapid antigen tests; an analysis of the antigen testing is in a companion manuscript.54
Viral-load timecourses in the earliest stage of acute SARS-CoV-2 infection differed substantially among specimen types and participants (Fig 2). In only 2 (Fig 2A, I) of the 14 participants, viral loads became quantifiable in all three specimen types at the same timepoint; in most (11 out of 14) participants (Fig 2B,C,D,E,F,G,H,J,K,L,M), SA or OPS were positive first, while ANS remained negative or at low, inconsistently positive viral loads for up to the first 6 days of infection. However, later in the infection, the average peak viral load in ANS was significantly higher than SA or OPS (Fig S1).
In one individual (Fig 2B), SARS-CoV-2 was detectable in ANS for only 4 days, with only one specimen rising to 10⁴ copies/mL, while OPS viral loads exceeded 10⁶ copies/mL for many days. Even daily testing of ANS with a moderate (LOD of 10⁴ copies/mL) analytical sensitivity assay would not have reliably detected this participant’s infection.
SARS-CoV-2 Viral Loads Differ Significantly Between Specimen Types During the Early Period of Infection
We next sought to quantify the magnitude of differences in viral load across paired specimens of different types, to answer three questions: (i) Are differences in viral loads between specimen types large enough to impact detectability by assays with varying analytical sensitivity? (ii) Are differences in viral loads attributable to variability in participant sampling behavior? (iii) Are viral-load timecourses in different specimen types correlated with each other?
First, we calculated the absolute (Fig 3A) and relative (fold) differences (Fig S3) in viral loads between paired specimens of different types collected by the same participant at the same timepoint. Large (several orders of magnitude) differences in absolute viral loads were observed between paired specimen types for both the first 4 days from the incidence of infection (Fig 3A) and all timepoints (Fig 3B). We observed absolute differences of more than 9 orders of magnitude, and all specimen type comparisons had median absolute differences greater than 10⁴ copies/mL, a scale of difference likely to impact the detection of SARS-CoV-2 across different specimen types.
If the observed differences in viral loads between specimen types were the result of variability in sample collection, we would expect the fold differences to be similar to the variability of the human RNase P control marker, but RNase P Ct measurements were relatively stable for each specimen type collected by participants across their timecourse (Fig 2, Fig S4). The average standard deviation in RNase P Ct across participants was <1.5 in all specimen types (SA: 1.37, ANS: 1.42, OPS: 1.46) over the entire course of enrollment (Fig S4B), which corresponds to, at most, a 2.8-fold change in target abundance. In contrast, most (84%) comparisons between specimen types had greater than a 2.8-fold difference in viral loads (Fig S2B), suggesting the extreme differences in viral load were not due to variability in sampling.
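The conversion from a Ct standard deviation to a fold change can be made explicit. Assuming approximately 100% amplification efficiency (template doubling each cycle, which is an assumption of this worked example rather than a value reported here), a difference of ΔCt cycles corresponds to a 2^ΔCt-fold difference in target abundance:

```latex
\text{fold change} = 2^{\Delta C_t},
\qquad
2^{1.46} \approx 2.75 \;(\text{i.e., at most about } 2.8\text{-fold for the largest SD, OPS}).
```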
Although the differences in viral loads across paired specimens of different types were extreme, we recognized the possibility that the longitudinal timecourse of viral loads from different specimen types in a person might still correlate. For example, viral loads in one specimen type (e.g., saliva) might be consistently lower than those in another (e.g., ANS), but follow the same pattern throughout acute infection. If this were the case, viral load measured in one specimen type would still be associated with the viral load in another specimen type despite extreme absolute differences.
To test for associations between specimen types, we measured the correlation between viral-load timecourses from each specimen type collected by the same participant. We also measured the correlations between viral-load timecourses of each specimen type collected by different participants and represented these intra- and interparticipant correlations as a matrix for the 42 viral-load timecourses (14 participants with three specimen types each, Fig 3C-D). The strength of the correlation (Fig 3C) was quantified by estimating the standard deviation of pairwise differences in viral load across the two timecourses. The statistical significance of the correlations between viral-load timecourses (Fig 3D) was then calculated by comparing the distribution of pairwise differences in viral-load timecourses to a distribution of expected sampling noise.
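One simple way to estimate such a standard deviation is sketched below. For a zero-centered normal distribution, the maximum likelihood estimate of the standard deviation is the root mean square of the observed differences. The sketch assumes differences are taken on a log10 scale, that undetected loads are floored at 1 copy/mL, and that the longer timecourse is truncated to the length of the shorter one; the log scale and the floor are our assumptions, and the study's exact procedure is described in its Supplemental Information.

```python
import math
from typing import List


def log10_differences(a: List[float], b: List[float], floor: float = 1.0) -> List[float]:
    """Pairwise log10 differences between two aligned viral-load timecourses.

    The longer timecourse is truncated; loads below `floor` (an assumed
    placeholder for undetected specimens) are raised to `floor`.
    """
    n = min(len(a), len(b))
    return [
        math.log10(max(x, floor)) - math.log10(max(y, floor))
        for x, y in zip(a[:n], b[:n])
    ]


def zero_centered_sigma_mle(diffs: List[float]) -> float:
    """MLE of sigma for a normal distribution with mean fixed at zero."""
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical timecourses (copies/mL), aligned to the first positive specimen.
saliva = [1e3, 1e5, 1e7, 1e6]
nasal = [1e3, 1e4, 1e5]
sigma = zero_centered_sigma_mle(log10_differences(saliva, nasal))
```

A larger estimated standard deviation indicates that two timecourses diverge more than a smaller one would, which can then be compared against the spread expected from sampling noise alone.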
We found that viral-load timecourses in different specimen types collected by the same individual often do not correlate. Importantly, in nearly all participants (13 of 14), viral loads in at least two specimen types from the same participant had significantly different timecourses. In 38% of comparisons (16 of 42), we observed significantly different viral-load timecourses for each of the three specimen types from the same individual (Fig 3D). In some instances, the viral-load timecourses of specimen types from the same participant were less correlated with each other than with those of other participants. For example (see white circles in Fig 3D), the SA viral-load timecourse for individual A was not significantly different from the SA timecourses for participants D, F, G, H, J, K, L, or M; however, individual A’s SA timecourse was significantly different from their own OPS timecourse.
Within the same individual, OPS and ANS viral-load timecourses were most commonly different (64%, 9 of 14 individuals). Additionally, in 29% (4 of 14) of individuals, SA and ANS viral-load timecourses differed significantly, and in 21% (3 of 14) of individuals, their own SA and OPS viral-load timecourses were significantly different (Fig 3D).
The Inferred Clinical Sensitivity to Detect SARS-CoV-2 Infection Strongly Depends on Infection Stage, Specimen Type, and Assay Analytical Sensitivity
Because viral load determines whether an assay with a given analytical sensitivity will reliably yield a positive result, we hypothesized that the extreme differences in viral loads among different specimen types would significantly impact the clinical sensitivity of COVID-19 tests performed on different specimen types during different stages of the infection.
To examine the inferred clinical sensitivity to detect SARS-CoV-2 infections as a factor of both specimen type and test LOD, aligned viral-load timecourses were divided into 4-day timebins. For specimens collected within each timebin, we assumed that viral loads above a given assay LOD would reliably yield a positive result; we did not assume that specimens that were positive but not quantifiable would be reliably detected by our high-analytical-sensitivity assay (LOD of 250 copies/mL). The inferred clinical sensitivity of detecting infected persons by each specimen type and assay LOD during each timebin was calculated as the proportion of specimens with viral loads greater than the assay LOD, divided by all timepoints collected by infected participants in that same timebin.
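The binning and proportion described above can be sketched in Python as follows. The tuple layout, the half-open bins (days 0 up to but not including 4, then 4 up to 8, and so on), and the use of "at or above the LOD" as the detection rule are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (days since the participant's first positive specimen, viral load in
# copies/mL for one specimen type, or None if not detected/quantifiable)
Observation = Tuple[float, Optional[float]]


def sensitivity_by_timebin(
    observations: List[Observation],
    lod: float,
    bin_days: int = 4,
) -> Dict[int, float]:
    """Inferred clinical sensitivity per timebin for one specimen type and LOD."""
    detected: Dict[int, int] = defaultdict(int)
    total: Dict[int, int] = defaultdict(int)
    for day, load in observations:
        timebin = int(day // bin_days)  # 0 -> days [0, 4), 1 -> days [4, 8), ...
        total[timebin] += 1
        # Assumed rule: a specimen is detected if its load is at or above the LOD.
        if load is not None and load >= lod:
            detected[timebin] += 1
    return {b: detected[b] / total[b] for b in sorted(total)}


# Hypothetical ANS observations pooled from infected participants.
ans = [(0, None), (1, 5e2), (2, 2e4), (3, 3e6), (4, 1e7), (5, None)]
high_sensitivity = sensitivity_by_timebin(ans, lod=1e3)
low_sensitivity = sensitivity_by_timebin(ans, lod=1e6)
```

Every timepoint from an infected participant stays in the denominator, so a specimen that is negative or below the LOD counts as a missed detection rather than being dropped.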
For all specimen types and timebins, testing with a high-analytical-sensitivity assay (LOD of 10³ copies/mL) yielded significantly better inferred clinical sensitivity to detect infected persons than testing with a low-analytical-sensitivity assay (LOD of 10⁶ copies/mL) (Table S4A-I).
Given the value of early detection of SARS-CoV-2, we specifically examined the first 4 days of infection, when individuals are often pre-symptomatic. During this stage of infection, no single specimen type achieved >90% inferred clinical sensitivity with any LOD (Fig 4A,B,C). The range of inferred clinical sensitivities across specimen types even at the lowest value of LOD (50.8% to 83.1%) suggests that no single specimen type will reliably provide earliest detection of infection with the Omicron variant.
In the first 4 days, at all LODs, ANS had the poorest inferred clinical sensitivity of all three specimen types. Even with a high-analytical-sensitivity assay (LOD of 10³ copies/mL), ANS was predicted to miss more than half (54%, 32 of 59) of timepoints from infected persons. SA and OPS specimens had significantly better inferred clinical sensitivity than ANS when a high-analytical-sensitivity assay (LOD of 10³ copies/mL) was used, and worse (but not significantly) when a low-analytical-sensitivity assay (LOD of 10⁶ copies/mL) was used (Table S4J-Z).
As infection progresses to days 4-8, individuals are more likely to become symptomatic. Inferred ANS performance improved significantly during days 4-8 (Table S4 AH-AN) and became significantly better than SA and OPS at LODs of 10³ copies/mL and above (Table S4AO-BB). This improvement can be attributed to the rise to high viral loads in ANS during this period: SARS-CoV-2 was not detected in almost half of ANS specimens in days 0-4, but in days 4-8, more than half of ANS specimens had high viral loads (>10⁶ copies/mL) (Fig S1D,E). In contrast, during both timebins, almost half of all SA or OPS specimens had viral loads between 10⁴ and 10⁷ copies/mL, and thus detection using SA or OPS was more dependent on assay LOD (Fig S1D,E).
Computationally-Contrived Combination Specimen Types Have Higher Inferred Clinical Sensitivity Than Any Single Specimen Type
The extreme differences and lack of correlation in viral loads among specimen types as well as the poor performance of all three specimen types in all timebins and all test LODs led us to hypothesize that combination specimen types might exhibit better clinical sensitivity. We generated computationally-contrived specimen types representing combinations of specimen types. For each timepoint, the viral load of a combination specimen type was the highest viral load of any single specimen type included in the combination. We then inferred the clinical sensitivity of these combination specimen types with assays of different analytical sensitivities for each timebin (Fig 4D-G).
As expected, all combination specimen types had better inferred clinical sensitivity than single specimen types for all timebins. ANS and OPS exhibited the least similar (and therefore most complementary) viral-load timecourses (Fig 3D), and thus had the best inferred performance of all the two-specimen-type combinations (Fig 4E).
During the earliest phase of infection (days 0-4) and with the use of a high-analytical-sensitivity assay (LOD of 10³ copies/mL), the AN–OP combination swab was predicted to detect 88% (52 of 59, Fig 4E) of specimens from infected persons, which was significantly better inferred clinical sensitivity than for ANS alone (46%, Fig 4B, Table S4BC) or for OPS alone (78%, Fig 4C, Table S4BD). With a low-analytical-sensitivity assay (LOD 10⁶ copies/mL), the AN–OP combination swab also exhibited significantly better inferred clinical sensitivity (53%) than ANS (31%) or OPS (32%, Table S4BF,BG).
We recognize that specimen-collection and processing factors (e.g., buffer volumes, type and carrying capacity of swabs), may cause dilution effects that would impact the viral load for combination specimen types. To account for this, we also performed this analysis by calculating the viral load of a computationally-contrived combination specimen as the average (rather than maximum) viral load of paired single specimen types in each combination (Fig S7). Using the average introduced at most a 2- or 3-fold correction for the two- or three-specimen combinations, respectively, because viral loads differed by orders of magnitude (Fig 3). Clinical sensitivities of combination specimen types remained similar (Fig S7I-J) to those calculated in Fig 4 and the AN–OP combination swab remained superior with this alternate calculation (Fig S7F).
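The bound on the averaging correction follows from the fact that the mean of n non-negative values is never smaller than the maximum divided by n. A minimal sketch, using hypothetical viral loads rather than study data, shows that the ratio approaches n when loads differ by orders of magnitude:

```python
from typing import Sequence

def fold_reduction_from_averaging(loads: Sequence[float]) -> float:
    """Ratio of the maximum to the mean viral load at one timepoint."""
    peak = max(loads)
    if peak == 0:
        return 1.0  # nothing detected in any specimen; averaging changes nothing
    return peak / (sum(loads) / len(loads))

# Hypothetical loads (copies/mL) differing by five orders of magnitude.
two_way = fold_reduction_from_averaging([1e7, 1e2])          # close to 2
three_way = fold_reduction_from_averaging([1e7, 1e2, 0.0])   # close to 3
equal = fold_reduction_from_averaging([5e5, 5e5])            # exactly 1
```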
Differences in Viral Load Across Individual Specimen Types Hinder the Detection of Presumably Infectious Individuals by Tests Utilizing Single Specimen Types
Prompt identification of individuals who are or will become infectious can prevent further transmission. We next compared the ability of each specimen type and assay analytical sensitivity to detect presumably infectious individuals. In the absence of direct viral culture, an individual was presumed to be infectious if the viral load in any specimen type collected from that participant at a given timepoint was above an infectious viral-load threshold. As mentioned in the Introduction, known infectious viral loads span a broad range from 10^4 to 10^7 copies/mL (Fig 2, Fig 6). We performed separate analyses for each infectious viral-load threshold to test the robustness of our conclusions.
Because of the extreme differences in viral-load timecourses, a presumed non-infectious viral load in one specimen type did not reliably indicate that a participant would have presumed non-infectious viral loads in all specimen types. At the highest infectious viral-load threshold (10^7 copies/mL), 59 of 197 timepoints had an infectious viral load in at least one paired specimen type (Fig 5A). Therefore, in the remaining 70% (138 of 197) of timepoints, a presumed non-infectious viral load in one specimen type was inferred to correctly indicate that the participant did not have an infectious viral load in any specimen type collected at that timepoint and could be presumed non-infectious. In contrast, at the lowest infectious viral-load threshold (10^4 copies/mL), a presumed non-infectious viral load in one specimen type indicated a non-infectious participant only about 24% of the time (47 of 197 timepoints). The difference in concordance is partly due to the decreasing number of presumed infectious timepoints as the infectious viral-load threshold increases (Fig 5A).
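The presumed-infectiousness rule and the resulting concordance can be sketched as follows. This is an illustration only: the viral loads are hypothetical, the helper names are ours, and treating a load exactly equal to the threshold as infectious is an assumption, since the text says only "above". The final lines recompute the percentages reported above from the stated counts.

```python
from typing import Dict, List

def is_presumed_infectious(timepoint: Dict[str, float], threshold: float) -> bool:
    """Infectious if ANY specimen type is at or above the threshold (boundary is an assumption)."""
    return any(load >= threshold for load in timepoint.values())

def fraction_non_infectious(timepoints: List[Dict[str, float]], threshold: float) -> float:
    """Share of timepoints with no specimen type at or above the threshold."""
    n_non = sum(not is_presumed_infectious(tp, threshold) for tp in timepoints)
    return n_non / len(timepoints)

# Hypothetical paired viral loads (copies/mL); 0.0 means not detected.
timepoints = [
    {"SA": 3e3, "ANS": 0.0, "OPS": 6e5},
    {"SA": 2e8, "ANS": 4e2, "OPS": 1e6},
    {"SA": 0.0, "ANS": 0.0, "OPS": 5e2},
    {"SA": 7e4, "ANS": 9e7, "OPS": 2e3},
]
low_threshold = fraction_non_infectious(timepoints, 1e4)
high_threshold = fraction_non_infectious(timepoints, 1e7)

# Percentages from the counts reported in the text.
pct_at_1e7 = round(100 * (197 - 59) / 197)  # 138 of 197 timepoints
pct_at_1e4 = round(100 * 47 / 197)          # 47 of 197 timepoints
```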
Across infectious viral-load thresholds, we saw a pattern that suggested combination specimen types together might capture the most presumably infectious timepoints (Fig 5B-C). Testing for infectious viral loads with SA alone would miss more than half of presumed infectious individuals, and testing with either OPS or ANS alone would miss more than one third of presumed infectious individuals (Fig 5B). Interestingly, across all presumed infectious viral-load thresholds, 90–95% of timepoints with an infectious viral load in any specimen type had infectious viral loads in either ANS or OPS specimen. This complementarity suggested that an AN–OP combination swab specimen type could be superior for detecting nearly all infectious individuals.
We sought to interrogate this complementarity between ANS and OPS specimens further by comparing the viral loads of the three specimen types at each of the 150 timepoints in which at least one specimen had viral loads above a 10^4 copies/mL infectious viral-load threshold (Fig 5C). We found that 52% of individuals with presumed non-infectious viral loads by SA, 38% of individuals with presumed non-infectious viral loads by OPS, and 30% of individuals with presumed non-infectious viral loads by ANS had infectious viral loads in another specimen type collected at the same timepoint. In these cases, high-analytical-sensitivity testing could potentially capture individuals with infectious viral loads in specimen types other than the one tested. However, 19% of SA, 20% of ANS, and 13% of OPS specimens had either undetectable or unquantifiable viral loads while another specimen type in the same individual had an infectious viral load (Fig 5C). In such cases, testing a single specimen type even with a very-high-analytical-sensitivity assay (e.g., LOD of 250 copies/mL) would not reliably detect a presumably infectious person.
The extreme differences and lack of correlation between viral loads in paired specimens of different types suggest that an infected person could have low viral loads in one specimen type while having high and infectious loads in another. This prompted us to ask how specimen type and assay LOD would affect the detection of infectious individuals at different stages of the infection.
Inferring Detection of Infectious Individuals by Specimen Type and Assay Analytical Sensitivity Across Infectious Viral-Load Thresholds
Having assessed which specimen types contained infectious viral loads, we wanted to assess the ability of each single or combination specimen type to detect presumed infectious individuals across different stages of acute infection, when tested by assays with either high- (LOD 10^3 copies/mL) or low- (LOD 10^6 copies/mL) analytical sensitivity (Fig 6). Regardless of specimen type, the inferred clinical sensitivity of both high- and low-analytical-sensitivity assays to detect presumed infectious individuals typically increased as the infectious viral-load threshold increased. This was more pronounced for the low-analytical-sensitivity assay (LOD of 10^6 copies/mL). This pattern is intuitive; specimens with viral loads above the infectious viral-load threshold but below the LOD are considered infectious but missed, decreasing the inferred clinical sensitivity. Increasing the infectious viral-load threshold would exclude those specimens, thereby resulting in better inferred clinical sensitivity. This effect is heightened when infectiousness is presumed based only on the viral loads in one specimen type, and perfect performance is observed when the infectious viral-load threshold is at or above the assay LOD. However, this is unrealistic as it does not account for infectious viral loads in other specimen types (Fig 5). We presumed infectiousness whenever viral load in any specimen type exceeded the infectious viral-load threshold. Thus, the poor inferred sensitivities at infectious viral-load thresholds above 10^6 copies/mL are attributable to infectious viral loads in specimen types other than the one tested (Fig 6).
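The interaction between assay LOD and infectious viral-load threshold described above can be made concrete with a small sketch. It assumes that the denominator is every timepoint with any specimen type at or above the threshold and that the numerator is the subset whose tested specimen (or contrived combination) is at or above the LOD; the function name and viral loads are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

def sensitivity_for_infectious(
    timepoints: List[Dict[str, float]],
    tested: Sequence[str],
    lod: float,
    threshold: float,
) -> Optional[float]:
    """Inferred sensitivity of the tested specimen type(s) to detect presumed infectious timepoints."""
    infectious = [tp for tp in timepoints if max(tp.values()) >= threshold]
    if not infectious:
        return None  # no presumed infectious timepoints to evaluate
    detected = sum(max(tp[t] for t in tested) >= lod for tp in infectious)
    return detected / len(infectious)

# Hypothetical paired viral loads (copies/mL); 0.0 means not detected.
timepoints = [
    {"SA": 2e4, "ANS": 0.0, "OPS": 5e5},
    {"SA": 3e5, "ANS": 2e7, "OPS": 1e4},
    {"SA": 8e6, "ANS": 4e2, "OPS": 6e7},
    {"SA": 5e3, "ANS": 0.0, "OPS": 9e4},
]

# Low-analytical-sensitivity assay (LOD 1e6 copies/mL) on ANS alone.
ans_low_threshold = sensitivity_for_infectious(timepoints, ["ANS"], 1e6, 1e4)
ans_high_threshold = sensitivity_for_infectious(timepoints, ["ANS"], 1e6, 1e7)
# Same assay on a contrived AN-OP combination.
an_op_high_threshold = sensitivity_for_infectious(timepoints, ["ANS", "OPS"], 1e6, 1e7)
```

In this toy data, raising the threshold improves the inferred sensitivity of ANS, but it stays below 100% because one timepoint is infectious only by its oral specimens, which is the effect attributed to other specimen types in the text.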
Three major patterns remained consistent regardless of the infectious viral-load threshold, so for simplicity the rest of this section will describe inferred clinical performances and statistical comparisons using only an infectious viral-load threshold of 10^5 copies/mL.
First, during the first 4 days of infection, all specimen types were inferred to have significantly worse performance when tested with a low-analytical-sensitivity assay (LOD of 10^6 copies/mL) compared with a high-analytical-sensitivity assay (LOD of 10^3 copies/mL, Table S4BK-BQ). Even when tested with a high-analytical-sensitivity assay, no single specimen type achieved >95% inferred clinical sensitivity (Fig 6A,B,C).
Because the rise in ANS viral load was delayed relative to SA or OPS in most participants (Fig 2), ANS exhibited significantly lower performance than SA and OPS during days 0-4 (Table S4BS,BT,BU). At an assay LOD of 10^3 copies/mL, the inferred clinical sensitivity of ANS was only 57% (Fig 6C). This suggests that ANS testing, even with high analytical sensitivity, would miss approximately 40% of presumed infectious individuals at the very beginning, and often pre-symptomatic, stage of the infection.
Second, from days 4 to 8 of infection, when ANS viral loads increased rapidly in many participants (Fig 2, Fig 4), ANS had significantly higher inferred clinical sensitivity regardless of LOD (Fig 6C versus Fig 6J, Table S4BU,BV) and significantly better performance than SA (Fig 6H) and OPS (Fig 6I) across LODs (Table S4BW-BZ).
Third, the higher clinical sensitivity of OPS during days 0 to 4, and of ANS at days 4 to 8, suggested complementarity. This is supported by the observations that ANS and OPS have some of the most extreme differences in viral load (Fig 3A,B), that the highest proportion of individuals had significantly different ANS and OPS viral-load timecourses (Fig 3D), and that only rarely do individuals have infectious viral loads in saliva alone (Fig 5, Table S2).
The AN–OP combination swab (Fig 6F) was inferred to perform significantly better than all single specimen types alone (Fig 6A,B,C) when tested with either a high- or low-analytical-sensitivity assay during the first 4 days of infection, and significantly better than SA (Fig 6H,O) and OPS (Fig 6I,P) when tested with either analytical sensitivity during later stages of infection (Fig 6M,T, Table S4CA-CJ). In addition, AN–OP had significantly better performance than ANS when tested with a low-analytical-sensitivity assay during days 4-8 (Fig 6M versus Fig 6J).
The combination of all three specimen types (Fig 6G,N,U) would by definition capture all presumed infectious individuals. However, this combination type never had significantly higher inferred clinical sensitivity than AN–OP combination swab (Table S4CM-CR).
Performance of Specimen Types and Analytical Sensitivities in the Pre-infectious and Infectious Periods
For public-health purposes, understanding assay performance during the pre-infectious and infectious periods, rather than in timebins relative to the incidence of infection, is more informative and actionable. Therefore, we next evaluated the performance of each single specimen type and the AN–OP combination swab for each assay LOD during the presumed pre-infectious and infectious periods (Fig 7). We chose to focus our analysis on the AN–OP combination swab specimen type because of the high performance it exhibited to detect both infected (Fig 4) and presumed infectious (Fig 6) individuals.
As described in the Introduction, there are several methods for defining the infectious period based on viral loads. To ensure our conclusions were robust across different definitions of the infectious period, we compared the results of our analysis across three common definitions: a “continuous” infectious period, an “instantaneous” infectious period, and a “day [0-5]” infectious period (only the first 5 days after an initial infectious specimen).
At all infectious viral-load thresholds above 10^5 copies/mL, the AN–OP combination swab had the highest inferred clinical sensitivity of any specimen type to detect pre-infectious individuals (Fig 7E,I,M). In all cases where the assay LOD was at least 2 orders of magnitude lower than the infectious viral-load threshold, there were more than 10 detectable specimens available for comparison of inferred clinical sensitivity, and the AN–OP combination swab was inferred to perform significantly better than ANS (Table S4CS-DT). With an infectious viral-load threshold of 10^4 copies/mL, fewer pre-infectious timepoints were available for analysis. In this case, we see that ANS had very low performance, but no specimen type emerged as optimal (Fig 7A).
Three major trends held across all infectious viral-load thresholds and all definitions of the infectious period. First, ANS had similar performance to SA and OPS when testing with high-analytical-sensitivity assays (LODs at or below 10^3 copies/mL), except when the infectious period is defined as the 5 days following the first infectious specimen. This definition selects earlier timepoints, prior to the rise in ANS viral loads (Fig 2, Fig 5C), so ANS testing has lower inferred clinical sensitivity to detect both infected (Fig 4B) and infectious (Fig 6C) individuals. Second, as noted previously (Fig 4), ANS performance was more robust to differences in assay LOD than SA and OPS because ANS loads tended to be either very low or very high (>10^6 copies/mL), whereas SA and OPS tended to fluctuate between 10^4 and 10^7 copies/mL (Fig S1D,E). Furthermore, in all but one comparison (Fig 7D), ANS was inferred to have higher performance than SA or OPS alone when tested with lower-analytical-sensitivity assays (LODs at and above 10^5 copies/mL). Third, the AN–OP combination swab always had the highest inferred clinical sensitivity at all LODs.
DISCUSSION
In this study, we observed extreme and statistically significant differences in SARS-CoV-2 viral loads among three common respiratory specimen types (saliva, anterior-nares swab, and oropharyngeal swab) collected at the same timepoint by 14 individuals enrolled before or at the incidence of acute infection. In all 14 individuals we also observed that the viral-load measurements in different specimen types followed significantly different timecourses. These intra-participant differences were as extreme as those observed between participants (Fig 3C-D). The differences in viral load resulted in significantly different inferred clinical sensitivities to detect both infected and infectious individuals depending on the infection stage, specimen type, and analytical sensitivity (LOD) of the assay. This led us to conclude that the SARS-CoV-2 viral load from any single specimen type can only describe the state of the specimen type tested, and not the general state of the individual’s infection.
Because of the extreme differences in viral-load patterns in the early and pre-infectious periods of infection, there was no single optimal specimen type for detecting Omicron. However, ANS was the poorest specimen type for detection in the first 4 days of infection. We observed a delay in detectable SARS-CoV-2 in nasal-swab specimens relative to oral specimens similar to what has been observed previously16,67 with earlier SARS-CoV-2 variants. In our study, 12 of 14 participants (86%) infected with the Omicron variant were either negative in ANS or had ANS viral loads below 250 cp/mL at the incidence of infection (the first day viral RNA was detectable in any specimen type). In 3 of these 12 positive participants (25%), ANS was undetectable for more than 3 days (Fig 1B,C,H). Because of the delay in ANS viral loads in the first days of infection, the inferred clinical sensitivity of ANS specimens at the beginning of infection was low (<60%), even with high-analytical-sensitivity assays.
Furthermore, we found that low-analytical-sensitivity testing was inferred to have poor performance for early detection in all specimen types. High-analytical-sensitivity assays (e.g., LODs of 10^3 copies/mL) were inferred to improve clinical sensitivity in all specimen types and at all stages of infection. At high (compared with low) analytical sensitivities, SA and OPS testing had significantly better inferred clinical sensitivity for detecting the first 4 days of infection.
We also found that there was no optimal single specimen type for detection of presumed infectious individuals (based on viral-load thresholds of 10^4 to 10^7 copies/mL or greater in any specimen type). Of the three single specimen types, ANS testing was inferred to miss the lowest proportion of presumed infectious individuals overall, but ANS still missed at least 30% of presumably infectious timepoints because of high viral loads in oral specimen types (Fig 6). The failure to detect presumed infectious individuals was inferred to be even worse when using tests of low analytical sensitivity. Real-world testing of low-analytical-sensitivity assays is needed to confirm this inferred clinical sensitivity. To address this point, rapid antigen testing results for a broader cohort from this study population are reported in a companion paper.54
Testing with combination specimen types (e.g., a single swab of both the oropharynx and anterior nares) was inferred to yield significantly improved clinical sensitivity to detect both infected (Fig 4) and presumed infectious individuals (Fig 6,7) than any single specimen type. Combination swabs are already common in many regions of the world.11 In the United Kingdom, the National Health Service website even states that PCR tests that rely only on nasal swabbing will be “less accurate” than those with a combined nose and tonsil swab.68 The U.K. also uses a combination throat–nasal swab for rapid antigen testing.69 However, standard COVID-19 testing practice in the U.S. is to test only one specimen type, usually from the nasal cavity. Despite hundreds of emergency use authorizations (EUAs) that the FDA has issued for detection of SARS-CoV-2, including 280 molecular tests and 49 rapid antigen tests,12 none use a combination specimen type.
Studies comparing single and combination specimen types have generally shown that combination specimens are either equivalent or superior to single specimens. Three studies have found that nasal-OP swabs performed better than NP swabs.70-72 Another study found that self-collected combined nasal-OP swabs performed better than oral fluid/saliva.73 A separate study found lower Ct values (higher viral loads) using a combined NP–OP swab, compared with either swab alone or sputum.74 But several other studies have found non-inferiority (no advantage) to combined nasal-oropharyngeal testing over nasal testing alone.25,75-77 Importantly, in all studies evaluating combination swabs, sample collection began after the onset of COVID-like symptoms and/or after an initial positive test (usually by nasal swab) and thus likely missed the earliest days of infection, which is the time period when we found the greatest benefit of sampling SA or OPS.
Our study has four limitations. First, although this is the most comprehensive study of complete viral loads in multiple specimen types to date, data are from a limited number of individuals and a limited demographic. Obtaining early viral-load timecourses from these 14 individuals required enrollment and daily testing of 228 participants for a total of 6,825 RT-qPCR tests. Future studies for new SARS-CoV-2 variants and new respiratory viruses should ideally involve multi-institution partnerships to enroll a diverse cohort from a broad geographic range. Second, we presumed infectiousness based on viral-load thresholds in three specimen types; we did not perform viral culture on these specimens (we also acknowledge that specimen types not collected here could have contained infectious viral loads25,78,79). Third, Omicron remains a relevant variant, but additional variants will continue to develop and may exhibit different patterns in their viral-load timecourses by specimen type, so community-based studies like this one will need to be done regularly to identify optimal testing methods for new variants (and emerging respiratory viruses). Finally, although our results allow us to infer that early infections would be most reliably detected with a combination AN–OP swab, we did not actually collect combination specimens in this study. Our inferences should be verified using actual combination swabs compared with individual OP swabs and individual AN swabs in a longitudinal study design that captures the earliest days of detectable virus.80
Conclusion
The unique study design employed here demonstrates why comparisons between specimen types and tests, and inferences about the distributions of early viral loads, cannot use only samples collected in the middle or late periods of infection. Viral-load dynamics in the earliest days of infection are very different from those in the middle and late stages. Moreover, the results presented here could not have been gleaned from single-timepoint measurements. Only with longitudinal, quantitative studies that begin at the incidence of infection can appropriate comparisons between specimen types and test analytical sensitivities be made.
This study design allowed us to reveal that viral loads in different specimen types within the same person can differ by more than 9 orders of magnitude at the same timepoint, and that viral loads in different specimen types collected by the same person often do not correlate over time. No single SARS-CoV-2 specimen type can reliably describe an infected person and infectiousness cannot be ruled out from low viral load in a single specimen type. The use of multiple specimen types and longitudinal sampling further allowed us to identify that ANS and OPS specimens are the most different and therefore complementary in Omicron infections, so a combination nasal-oropharyngeal swab is likely to be the most reliable, particularly when using low-analytical-sensitivity assays (see related paper examining antigen test performance54). Such combination AN–OP swab tests are already common in many countries outside of the U.S. but have not yet been approved in the U.S. Ideally, quantitative longitudinal studies of differences in viral loads in each specimen type starting immediately at the incidence of infection should be routinely performed to update testing strategies for new emerging variants and new respiratory viruses. In the absence of such studies, the use of combination specimen types is prudent.
Data Availability
The data underlying the results presented in the study can be accessed at CaltechDATA: https://data.caltech.edu/records/20223.
COMPETING INTERESTS STATEMENT
RFI is a co-founder, consultant, and a director and has stock ownership of Talis Biomedical Corp.
FUNDING
This work was funded by the Ronald and Maxine Linde Center for New Initiatives at the California Institute of Technology and the Jacobs Institute for Molecular Engineering for Medicine at the California Institute of Technology. AVW is supported by a UCLA DGSOM Geffen Fellowship.
Supplemental Information
Supplemental Materials and Methods
Study Participants
All adult participants provided written informed consent, and minors provided assent and their legal guardian provided written permission. Individuals were eligible for enrollment if someone in their home had recently (within 5 days) become positive for SARS-CoV-2, or if they had a recent known exposure to a person suspected to be SARS-CoV-2-positive. All participants had to be 6 years of age or older and fluent in English.
Extraction and RT-qPCR
Participants packaged their specimens each morning for transport by medical courier to Pangea Laboratories in Tustin, CA, USA. Most specimens were received at the facility within 10 hours of collection; some specimens were received at the facility ∼24-48 hours after donation due to transport delays. Most specimens were extracted and run in RT-qPCR within a few hours of arrival to the facility. Extraction and RT-qPCR operators and supervisors (at Pangea Laboratory) were blinded to which participant a specimen originated from, as well as the infection status and test results of participants.
Extraction and RT-qPCR were performed using the FDA-authorized Quick SARS-CoV-2 RT-qPCR Kit,58 which extracts nucleic acids using the Quick-DNA/RNA Viral MagBead Kit (Zymo Research, Catalog #R2141) followed by amplification of three target regions within the SARS-CoV-2 N gene.
A specimen was considered inconclusive if the human RNase P Ct value was >40 or not detected. If RNase P had a Ct < 40, then for a SARS-CoV-2 N gene target Ct value <40 the sample was considered positive. If the SARS-CoV-2 target Ct value was 40-45 it was considered inconclusive, and if >45 or not detected it was considered negative.
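The interpretation rules above can be written as a short decision function. This is a sketch, not the kit's software: it takes a single N gene Ct value (how the three N gene targets are combined is not described here), represents "not detected" as `None`, and treats an RNase P Ct of exactly 40 as valid, which is an assumption because the text leaves that boundary unspecified.

```python
from typing import Optional

def classify_specimen(rnase_p_ct: Optional[float], n_gene_ct: Optional[float]) -> str:
    """Classify one specimen from its RNase P control Ct and SARS-CoV-2 N gene Ct."""
    # Human control failed: Ct > 40 or not detected.
    if rnase_p_ct is None or rnase_p_ct > 40:
        return "inconclusive"
    # Viral target not detected, or Ct > 45.
    if n_gene_ct is None or n_gene_ct > 45:
        return "negative"
    if n_gene_ct < 40:
        return "positive"
    # Viral target Ct between 40 and 45 inclusive.
    return "inconclusive"
```

For example, `classify_specimen(28.0, 42.0)` returns `"inconclusive"`, whereas `classify_specimen(28.0, 25.0)` returns `"positive"`.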
Quantification of Viral Load from RT-qPCR Result
To quantify viral load in RT-qPCR specimens, a 9-point standard curve was generated at Caltech using dilutions of commercial heat-inactivated SARS-CoV-2 particles (BEI Cat. N4-52286 Lot 70034991). To achieve higher concentrations and greater dynamic range in the standard curve, volume from a participant saliva specimen previously quantified to have a viral load of 6.44×10^9 copies/mL53 was used to generate 4 additional points. Diluted particles or volume from the participant specimen was spiked into pooled matrix from freshly collected SA, ANS, or OPS specimens from SARS-CoV-2-negative donors, collected as described above. Specimens were then shipped to Pangea Laboratories (concentrations blinded) for extraction and RT-qPCR testing. Three of three replicates at 250 copies/mL of specimen were detected, independently validating the reported LOD for the assay.
From the dynamic range of the standard curve (250 copies/mL to 4.50×10^8 copies/mL), the following equations were used to convert RT-qPCR SARS-CoV-2 N gene Ct value to viral load in genomic copy equivalents (copies) per mL of each specimen type:
Viral Load in copies/mL saliva = $2^{(\mathrm{Ct} - 42.374)/(-0.8973)}$
Viral Load in copies/mL buffer for nasal swabs = $2^{(\mathrm{Ct} - 43.050)/(-0.9282)}$
Viral Load in copies/mL buffer for oropharyngeal swabs = $2^{(\mathrm{Ct} - 43.903)/(-0.9653)}$
Positive specimens with viral loads that would be quantified below the assay LOD (250 copies/mL) were considered not quantifiable, as amplification and resulting Ct values become noisy at these very low viral loads.
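The three conversion equations and the quantification cutoff can be applied as in the following sketch. The intercepts, slopes, and the 250 copies/mL LOD are taken from the text above; the function names, the dictionary layout, and the use of `None` for "not quantifiable" are illustrative choices.

```python
from typing import Optional

# (intercept, slope) of Ct versus log2(viral load), per specimen type.
STANDARD_CURVES = {
    "SA": (42.374, -0.8973),   # saliva
    "ANS": (43.050, -0.9282),  # anterior-nares swab buffer
    "OPS": (43.903, -0.9653),  # oropharyngeal swab buffer
}
ASSAY_LOD = 250.0  # copies/mL

def ct_to_viral_load(ct: float, specimen_type: str) -> float:
    """Convert an N gene Ct value to viral load in copies/mL."""
    intercept, slope = STANDARD_CURVES[specimen_type]
    return 2.0 ** ((ct - intercept) / slope)

def quantify(ct: float, specimen_type: str) -> Optional[float]:
    """Return the viral load, or None when it falls below the assay LOD (not quantifiable)."""
    load = ct_to_viral_load(ct, specimen_type)
    return load if load >= ASSAY_LOD else None
```

Because the slopes are negative, lower Ct values map to higher viral loads; a positive saliva specimen with a Ct of 40, for instance, converts to fewer than 250 copies/mL and would be reported as not quantifiable.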
Viral Sequencing and Lineage/Variant Determination
Whenever possible, we sequenced the putative index case’s highest-viral-load nasal-swab specimen. When this was not possible (e.g., if the index case was not enrolled, the index case’s highest-viral-load nasal-swab specimen was insufficient for sequencing, or available specimen volume was limited), we chose an alternate high-viral-load (viral load >2×10^4 copies/mL) nasal or oropharyngeal swab specimen from the index case or a secondary case in the household.
All sequencing was performed by Zymo Research at Pangea Lab using a variant ID detection workflow that closely resembles the Illumina COVDISeq™ NGS Test (EUA).59,60 In brief, RNA extracted from samples underwent cDNA synthesis using random hexamers according to the manufacturer’s recommendation (Illumina, Catalog #20043675).
The SARS-CoV-2 virus genome was amplified using primers designed to tile across the full sequence length as originally described by the ARTICnetwork (https://artic.network/ncov-2019). Amplicons containing the SARS-CoV-2 viral genome fragments were then pooled and subjected to tagmentation to further fragment and tag amplicons with adapter sequences. Adapter-tagged amplicons then underwent a second round of PCR amplification using a PCR master mix and unique index adapters. The indexed libraries were then pooled and cleaned up for downstream sequencing.
Finished libraries were sequenced on an Illumina MiniSeq using a PE 100 bp read configuration to a depth of approximately 100,000 reads per library. Illumina sequence reads were converted from bcl to fastq files, adaptor trimmed, then quality filtered using standard parameters. Variant calls as described by Phylogenetic Assignment of Named Global outbreak LINeages software 2.3.2 (github.com/cov-lineages/pangolin) were made using a custom bioinformatics data analysis pipeline developed by Zymo Research.
Shuffled Viral-Load Timecourses and Data Validations
In addition to controls built into the study design (e.g., specimens have barcodes specific to each specimen type, barcodes are confirmed to be the expected specimen type when packaging specimen-collection materials prior to delivery to participants, participants take and package specimen types in a specific order during each timepoint, and the receiving laboratory assesses arriving specimens for the presence of a swab), we assessed mathematically whether the observed viral loads were likely to come from viral-load timecourses of their designated specimen type, or whether they could have been switched between specimen types. We assessed the correlation between the viral load for a given specimen at a timepoint and either the viral load in the same specimen type or the viral load from a different, randomly selected specimen type at the following timepoint (Fig S3), for all measurements. The correlation between viral-load measurements from randomly selected specimen types is significantly different (P<0.001) from the correlations between viral-load measurements from the same specimen type (Fig S3C). Erroneously assigned specimen types would yield similar (P>0.01) correlations for both randomized and non-randomized viral-load timecourses. The analysis showed greater standard deviation for shuffled compared with unshuffled viral-load timecourses, suggesting that all specimens were correctly assigned to specimen type by participants.
Estimations of Sample Noise with RNase P
To estimate expected sampling noise that would affect viral-load measurements in each specimen type, we examined RT-qPCR Ct measurements of the human RNase P control target in the same specimen type from each of the 14 participants in this cohort (Fig 2; Fig S4B). The standard deviation of the RNase P Ct was calculated for each timecourse and then averaged over all 14 participants: the average standard deviation of RNase P Ct for saliva specimens was 1.37, nasal-swab specimens was 1.42, and oropharyngeal swab specimens was 1.46 (Fig S4B). We then used the average standard deviation of RNase P Ct across all three specimen types (1.42 Ct) as the overall estimate of sampling noise in all viral-load measurements, which is consistent with the standard deviation (1.7 Ct) of SARS-CoV-2 N2 gene Ct values in two MT nasal-swab specimens collected immediately in sequence in a separate study.66
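The noise estimate above is a mean of per-participant standard deviations, followed by a mean across specimen types. A minimal sketch of that calculation is given below; the Ct values are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not state whether the sample or population form was used. The last line recomputes the overall estimate from the three averages reported in the text.

```python
from statistics import mean, stdev
from typing import Dict, List

def mean_timecourse_sd(ct_by_participant: Dict[str, List[float]]) -> float:
    """Average, over participants, of the standard deviation of RNase P Ct within each timecourse."""
    return mean(stdev(cts) for cts in ct_by_participant.values())

# Hypothetical RNase P Ct timecourses for one specimen type.
example = {
    "participant_1": [24.0, 26.0],
    "participant_2": [25.0, 25.0, 25.0],
}
example_sd = mean_timecourse_sd(example)

# Overall estimate from the reported per-specimen-type averages (SA, ANS, OPS).
overall_sd = round(mean([1.37, 1.42, 1.46]), 2)
```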
[see attached Table S4.xlsx]
Table S4. Statistical comparisons of inferred clinical sensitivity drawn from Fig 7 and 8. For select comparisons (across specimen types, assay LODs, infection stages/timebins, or IVLTs), the comparison is stated, along with the inferred clinical sensitivity (with 95% Confidence Intervals), statistical method, and significance of the difference. Index is referenced in the main text. Bolded cells in each row indicate the groups being compared. Values under Contingency Table indicate number of specimens. ‘Infectious’ indicates timepoints from individuals with a viral load in any specimen type above the infectious viral-load threshold listed in parentheses. Test Methods: A-Lower-Tailed McNemar Exact Test, B-Upper-Tailed McNemar Exact Test, C-Two-Tailed McNemar Exact Test, D-Lower-Tailed Fisher Exact Test. SA, saliva; ANS, anterior-nares swab; OPS, oropharyngeal swab; AN–OP, anterior-nares–oropharyngeal combination swab; SA–ANS, saliva–anterior-nares combination specimen; SA–OPS, saliva–oropharyngeal swab combination specimen; SA–ANS–OPS, saliva–anterior-nares–oropharyngeal swab combination specimen.
AUTHOR CONTRIBUTIONS (listed alphabetically by last name)
Reid Akana (RA): Collaborated with AVW in creating digital participant symptom surveys; assisted with data quality control/curation with NS, HD, SC; created current laboratory information management system (LIMS) for specimen logging and tracking. Creation of iOS application for sample logging/tracking. Configured an SQL database for data storage. Created an Apache server and websites to view study data. Configured FTPS server to catalog PCR data. Wrote a Python package to access study data. Trained study coordinators on SQL. Troubleshooting and QC of LIMS. Made Fig 3(C-D) and SI Figs S3, S4, S5, Table S2, S3, S4, S5, S6. Wrote and edited the manuscript with AVW and NS.
Alyssa M. Carter (AMC): Assisted with the inventory and archiving of >6,000 samples at Caltech; coordinated shipment of samples to Caltech with AER and JRBR; assisted with procurement of antigen tests; assisted with organizing volunteers and making participant kits; assisted AER in developing and implementing QC for participant kits. Provided feedback and edited the manuscript.
Yap Ching Chew (YCC): Primary liaison with Caltech team. Prepared and provided Zymo SafeCollect kits and related materials to Caltech team. Supervised the extraction, PCR, and QC teams at Pangea Laboratory. Sent PCR results daily to Caltech team. Arranged for Pangea team to perform viral-variant sequencing on selected samples; reported results and provided sequencing files.
Saharai Caldera (SC): Study coordinator; recruited, enrolled and maintained study participants with NS and HD; study-data quality control, curation and archiving with RA, NS, HD and MKK; supplies acquisition with AER, NS, HD and MKK.
Hannah Davich (HD): Lead study coordinator; co-wrote participant informational sheets with NS; developed recruitment strategies and did outreach with NS; participant kit creation and co-coordinated kit-making by volunteers with AER; recruited, enrolled and maintained study participants with NS and SC; managed the study-coordinator inventory; study-data quality control, curation and archiving with RA, NS, SC and MKK; supplies acquisition with AER, NS, SC and MKK.
Matthew Feaster (MF): Co-investigator; collaborated with AVW, MMC, NS, Y-YG, RFI on study design and recruitment strategies; provided guidance and expertise on SARS-CoV-2 epidemiology and local trends.
Ying-Ying Goh (Y-YG): Co-investigator; collaborated with AVW, MMC, NS, MF, RFI on study design and recruitment strategies; provided guidance and expertise on SARS-CoV-2 epidemiology and local trends.
Rustem F. Ismagilov (RFI): Principal investigator; collaborated with AVW, MMC, NS, MF, Y-YG on study design and recruitment strategies; provided leadership, technical guidance, and oversight of all analyses, and was responsible for obtaining the primary funding for the study.
Mi Kyung Kim (MKK): Study coordinator (part-time); maintained study participants with NS, HD, and SC; study-data quality control, curation and archiving with RA, NS, SC and HD; supplies acquisition with AER, NS, SC and HD; collected contact information for local university/college student health centers for recruitment outreach; assembled Table S1 with NS.
John Raymond B. Reyna (JRBR): Organized sample labeling and short-term storage of all samples at Pangea Laboratory. Arranged shipment of all samples to Caltech team. Assisted with processing of the specimens.
Anna E. Romano (AER): Co-coordinated kit-making by volunteers with HD; implemented QC process for kit-making; participated in kit-making; managed logistics for the inventory and archiving of >6,000 samples at Caltech; supplies acquisition with HD, NS, SC and MKK; assisted with securing funding; compiled Table S3; organized and performed QC on sequencing data. Provided feedback and edited the manuscript.
Natasha Shelby (NS): Study administrator; collaborated with AVW, RFI, Y-YG, MF on initial study design and recruitment strategies; co-wrote IRB protocol and informed consent with AVW; co-wrote enrollment questionnaire and post-study questionnaire with AVW; initiated the collaboration with Zymo and served as primary liaison throughout study; reviewed pilot sampling data and amended instructional sheets/graphics for specimen collections in collaboration with Zymo; co-wrote participant informational sheets with HD; hired, trained, and supervised the study-coordinator team; developed recruitment strategies and did outreach with HD; recruited, enrolled and maintained study participants with HD and SC; co-developed participant keep/drop criteria with AVW; performed the daily upload, review, and QC of PCR data received from Zymo; made the daily keep/drop decisions based on viral-load trajectories in each household; made all phone calls to alert presumptive positives of their status and provide resources; study-data quality control, curation and archiving with RA, HD, SC and MKK; organized archiving of all participant data and antigen-test photographs; supplies acquisition with AER, HD, SC and MKK; assisted with securing funding; managed the overall study budget; assembled Fig 1 with AVW; assembled Table S1 with MKK; made Fig 2 with AVW; managed citations and reference library; verified the underlying data with AVW and RA; co-wrote and edited the manuscript with AVW and RA.
Matt Thomson (MT): Assisted with statistical approach and analyses.
Colten Tognazzini (CT): Coordinated the recruitment efforts at PPHD with case investigators and contact tracers; provided guidance and expertise on SARS-CoV-2 epidemiology and local trends.
Alexander Viloria Winnett (AVW): Collaborated with NS, RFI, Y-YG, MF on initial study design and recruitment strategies; co-wrote IRB protocol and informed consent with NS; co-wrote enrollment questionnaire and post-study questionnaire with NS; co-developed participant keep/drop criteria with NS; funding acquisition; designed and coordinated LOD validation experiments; selected and prepared specimens for viral-variant sequencing with NS, YCC, and AER; assisted with the inventory and archiving of >6,000 specimens at Caltech with AER and AMC; minor role supporting outreach by HD and NS; minor role supporting kit-making by AER, HD and AMC; verified the underlying data with NS and RA; assembled Fig 1 with NS; made Fig 2 with NS; performed analysis and prepared Figs 4-7, Table S2, Figs S1, S2, S6, S7, S8, S9. Co-wrote and edited the manuscript with NS and RA.
Taikun Yamada (TY): Performed the RT-qPCR COVID-19 testing at Pangea Laboratory.
ACKNOWLEDGEMENTS
We sincerely thank the study participants for making this work possible. We thank Lauriane Quenee, Grace Fisher-Adams, Junie Hildebrandt, Megan Hayashi, RuthAnne Bevier, Chantal D’Apuzzo, Ralph Adolphs, Victor Rivera, Steve Chapman, Gary Waters, Leonard Edwards, Gaylene Ursua, Cynthia Ramos, and Shannon Yamashita for their assistance and advice on study implementation and/or administration. We thank Jessica Leong, Ojas Pradhan, Si Hyung Jin, Emily Savela, Bridget Yang, Ekta Patel, Hsiuchen Chen, Paresh Samantaray, Zeynep Turan, Cindy Kim, Trinity Lee, Vanessa Mechan, Katherine Stiefel, Rosie Zedan, Rahulijeet Chadha, Minkyo Lee, and Jenny Ji for volunteering their time to help with this study. We thank Prabhu Gounder, Tony Chang, Jennifer Howes, and Nari Shin for their support with recruitment. Finally, we thank all the case investigators and contact tracers at the Pasadena Public Health Department and Caltech Student Wellness Services for their efforts in study recruitment and their work in the pandemic response.