Abstract
A growing number of studies provide insight into how SARS-CoV-2 spreads1-7. Yet, many factors that characterize its transmissibility remain unclear, including mechanistic correlates of overdispersion, viral kinetics, the extent to which respiratory droplets and aerosols carry viable virus and the infectiousness of asymptomatic, presymptomatic and pediatric cases7. Here, we developed a comprehensive dataset of respiratory viral loads (rVLs) via systematic review and investigated these factors using meta-analyses and modeling. By comparing cases of COVID-19, SARS and influenza A(H1N1)pdm09, we found that heterogeneity in rVL was associated with overdispersion and facilitated the distinctions in individual variation in infectiousness among these emergent diseases. For COVID-19, case heterogeneity was broad throughout the infectious period, although rVL tended to peak at 1 day from symptom onset (DFSO) and be elevated for 1-5 DFSO. While most cases presented minimal risk, highly infectious ones could spread SARS-CoV-2 by talking, singing or breathing, which shed virions at comparable rates via droplets and aerosols. Coughing shed considerable quantities of virions, predominantly via droplets, and greatly increased the contagiousness of many symptomatic cases relative to asymptomatic ones. Asymptomatic and symptomatic infections showed similar likelihoods of expelling aerosols with SARS-CoV-2, as did adult and pediatric cases. Children tended to be less contagious by droplet spread than adults based on tendencies of symptomatology rather than rVL. Our findings address longstanding questions on SARS-CoV-2 transmissibility and present pertinent considerations for disease control.
Main body
The novel coronavirus SARS-CoV-2 has spread globally, causing the coronavirus disease 2019 (COVID-19) pandemic with more than 38.0 million infections and 1.0 million deaths (as of 13 October 2020)8. While the basic reproductive number has been estimated to be 2-3.61,2, transmissibility of SARS-CoV-2 is highly overdispersed, with numerous instances of superspreading9-11 and few cases (10-20%) causing many secondary infections (80%)4-6. Estimates indicate 40-87% of cases are asymptomatic or paucisymptomatic, which facilitates transmission2,3,12. Infectiousness begins around −3 to −2 DFSO and has been estimated to last 8-10 DFSO13-15. Hence, the temporal infectiousness profile of COVID-19 resembles that of influenza16, whereas its overdispersion is similar to that of severe acute respiratory syndrome (SARS)17.
For respiratory virus transmission, airway epithelial cells shed virions to the extracellular fluid before atomization (from breathing, talking, singing, coughing and aerosol-generating procedures) partitions them into a polydisperse mixture of droplets (>5 μm) and aerosols (≤5 μm) that are expelled to the ambient environment7. Based on mass, droplets tend to settle gravitationally, whereas aerosols remain suspended and travel based on airflow profiles. Although proximity has been associated with infection risk for COVID-1918, studies have also suggested that long-range aerosol spread occurs conditionally9-11.
Despite these analyses, key factors that characterize the transmissibility of SARS-CoV-2 remain unclear. To investigate many of these factors, we synthesized evidence by systematic review, compared SARS-CoV-2 rVLs among subgroups, associated overdispersion, analyzed viral kinetics and modeled the likelihood of shedding viable virus via droplets and aerosols across respiratory activities, DFSO, case heterogeneity and subgroups. For inference on transmissibility, we compared these analyses with those of SARS-CoV-1 (the most closely related human coronavirus) and A(H1N1)pdm09 (the most recent pandemic influenza virus), which spread by contact, droplets and aerosols19,20.
Systematic review
To develop the comprehensive dataset, we conducted a systematic review on quantitative VL measurements for respiratory specimens taken during the infectious periods of SARS-CoV-2 (−3 to 10 DFSO)13-15, SARS-CoV-1 (0-20 DFSO)21 and A(H1N1)pdm09 (−2 to 9 DFSO)16 (Methods). The systematic search (Supplementary Tables 1-5) identified 4,274 results. After screening and full-text review, 63 studies met the inclusion criteria and were used for analysis (Fig. 1) (n=9,692 total specimens), which included adult (n=5,124) and pediatric (n=1,593) cases and measurements for asymptomatic (n=2,378), presymptomatic (n=28) and symptomatic (n=7,231) infections. According to a hybrid Joanna Briggs Institute critical appraisal checklist, risk of bias was low to moderate for included studies (Extended Data Table 1).
Meta-analysis and subgroup analyses of rVL
For each study in the systematic dataset, we used specimen concentrations to estimate rVLs (Methods). We then performed a random-effects meta-analysis (Extended Data Fig. 1), which showed that, during the infectious periods, the expected rVL of SARS-CoV-2 was comparable to that of SARS-CoV-1 (p=0.148, two-sided Welch’s t-test) but lesser than that of A(H1N1)pdm09 (p<0.0001). We also performed random-effects subgroup analyses for COVID-19 (Fig. 2), which showed that expected SARS-CoV-2 rVLs were consistent between pediatric and adult cases (p=0.861) and between symptomatic/presymptomatic and asymptomatic infections (p=0.951).
Association of heterogeneity in rVL with overdispersion
Since few cases drive the transmission of SARS-CoV-2 (dispersion parameter k, 0.10-0.58)4-6 and SARS-CoV-1 (k, 0.16-0.17)17 whereas A(H1N1)pdm09 (k, 7.4-14.4)22,23 spreads more homogeneously, we sought to find a mechanistic association for overdispersion. As an empirical estimate, k depends on myriad extrinsic (behavioral, environmental and invention) and host (host defenses) factors. However, since dispersion is similar across distinct outbreaks of a virus17, we hypothesized that an intrinsic virological factor mediates k for these emergent respiratory infections.
We assessed heterogeneity in rVL. For all three viruses, rVLs best conformed to Weibull distributions (Extended Data Fig. 2), and we fitted the entirety of individual sample data for each virus in the systematic dataset (Fig. 3a, Extended Data Fig. 2n). While COVID-19 and SARS cases tended to shed lesser virus than those with A(H1N1)pdm09 (Extended Data Fig. 1), broad heterogeneity in rVL inverted this relationship for highly infectious individuals (Extended Data Fig. 3a-c). At the 90th case percentile (cp), the estimated rVL was 8.91 (8.82-9.00, 95% confidence interval [CI]) log10 copies/ml for SARS-CoV-2, whereas it was 8.62 (8.47-8.76) log10 copies/ml for A(H1N1)pdm09. Moreover, heterogeneity in rVL was similar among adult, pediatric, symptomatic/presymptomatic and asymptomatic COVID-19 cases (Extended Data Fig. 3d-g), with standard deviations (SDs) of 2.01-2.06 log10 copies/ml (Extended Data Table 2).
To investigate the relationship between k and heterogeneity in rVL, we performed a meta-regression using each included study (Fig. 3b). The analysis showed a negative association (p=0.031, meta-regression slope t-test), indicating that heterogeneity in rVL intrinsically facilitated the distinctions in overdispersion among the emergent infections.
Kinetics of SARS-CoV-2 rVL
To investigate dynamics, we delineated SARS-CoV-2 rVLs by DFSO and fitted the mean estimates to a mechanistic epithelial cell-limited model for viral kinetics (Fig. 3c, Methods). The outputs indicated that, on average, each productively infected cell in the airway epithelium shed SARS-CoV-2 at 1.11 (0.51-1.71, 95% CI) copies/ml day-1 and infected up to 10.6 susceptible cells (Extended Data Table 3). The turnover rate for infected epithelial cells was 0.55 (0.23-0.87) days-1, while the half-life of SARS-CoV-2 in the respiratory tract was 4.35 (2.23-97.8) hours. By extrapolating the model to an initial rVL of 0-1 log10 copies/ml, the estimated incubation period was 5.38-4.52 days, which agrees with epidemiological findings1. Conversely, the expected duration of shedding was 26.8 DFSO. Thus, SARS-CoV-2 replicated exponentially in the respiratory tract based on a mean rate constant of 4.02×10−7 (3.01×10−7-5.03×10−7) (copies/ml)-1 day-1 after infection. Around 1 DFSO, rVL peaked, as the number of infected epithelial cells reached equilibrium, and then diminished exponentially.
As SARS-CoV-2 rVLs showed widespread heterogeneity across the infectious period, we fitted distributions for each DFSO (Fig. 3d), which showed that high rVLs also increased from the presymptomatic period before decreasing towards the end of the first week of illness. At the 90th cp, SARS-CoV-2 rVL peaked at 1 DFSO at 9.83 (9.12-10.61) log10 copies/ml, an order of magnitude greater than the overall 90th-cp estimate (Extended Data Table 2). The estimate remained ≥9.33 log10 copies/ml between 1-5 DFSO. At −1 DFSO, the 90th-cp rVL was 8.30 (6.88-10.02) log10 copies/ml, while it was 7.92 (7.34-8.55) log10 copies/ml at 10 DFSO (Extended Data Fig. 3h-s).
Likelihood of droplets and aerosols containing virions
Since rVL is an intensive quantity, the volume fraction of virions is low and viral partitioning coincides with atomization, we used Poisson statistics to model likelihood profiles. To calculate an unbiased estimator of partitioning (the expected number of viable copies per particle), our method multiplied rVL estimates with the volumes of atomized particles and an assumed viability proportion of 0.1% after dehydration (Methods).
When expelled by the mean COVID-19 case across the infectious period, respiratory particles showed minimal likelihoods of carrying viable SARS-CoV-2 (Fig. 4a,b). Aerosols (dehydrated aerodynamic diameter [da]≤5 µm) were <0.001% likely to contain a virion. Droplets also had low likelihoods: at da=40 µm, they were ≤0.4% likely to contain a virion.
COVID-19 cases with high rVLs, however, expelled particles with considerably greater likelihoods of carrying viable copies (Fig. 4a,b). For the 98th cp at 1 DFSO, 18.2% (8.8-27.6%) of aerosols (da=5 µm) contained at least one SARS-CoV-2 virion. For da>14.4 µm, droplets were >99% likely to contain virions, with large ones carrying tens to hundreds.
Shedding SARS-CoV-2 via respiratory activities
Using the partitioning estimates in conjunction with published profiles of the particles expelled by respiratory activities (Extended Data Fig. 4), we modeled the rates at which talking, singing, breathing and coughing shed viable SARS-CoV-2 across da (Fig. 4c-f). Among the non-presenting activities, singing emitted virions most rapidly followed by talking and then breathing, although talking loudly was similar to singing (Extended Data Fig. 4c,d). These activities produced more aerosols than droplets, but particle size correlated with the likelihood of containing virions. Thus, talking, singing and breathing shed SARS-CoV-2 at similar rates via aerosols and droplets: aerosols mediated 25.2-43.4% of the virions expelled by the non-presenting activities (Fig. 4g). In comparison, coughing shed far greater quantities of virions (Fig. 4f), of which >99.9% were carried by droplets.
We further examined the influences of case heterogeneity and disease course on expelling SARS-CoV-2 (Fig. 4h, Extended Data Fig. 5). The estimated total shedding rates (over all particle sizes) for a respiratory activity spanned ≥8.55 orders of magnitude on each DFSO; cumulatively from −1 to 10 DFSO, they spanned 11.2 orders of magnitude. Hence, most cases expelled a negligible number of SARS-CoV-2 virions by talking, singing or breathing. Shedding occurred most rapidly at 1 DFSO: for the 98th cp, singing discharged 31.5 (3.26-379, 95% CI) virions/min to the ambient environment, while talking emitted 4.67 (0.48-56.1) virions/min and breathing exhaled 1.27 (0.13-15.2) virions/min; these estimates were two orders of magnitude greater than those for the 86th cp. For the 98th cp at −1 DFSO, singing shed 1.31 (0.01-406) virions/min and breathing exhaled 5.24×10−2 (5.28×10−4-16.3) virions/min. The estimates at 7-10 DFSO were comparable to these presymptomatic ones (Fig. 4h, Extended Data Fig. 5).
At 1 DFSO, coughing expelled 2.13×106 (2.20×105-2.56×107) virions/cough for the 98th cp, 90.4 (24.6-372) virions/cough for the 50th cp and 2.66 (0.65-13.1) virions/cough for the 25th cp (Extended Data Fig. 5c). At −1 and 10 DFSO, these estimates were reduced by ∼2 orders of magnitude. Thus, most symptomatic cases shed considerable quantities of SARS-CoV-2 by coughing; a single cough accounted for the virions emitted by weeks of singing for a case.
As indicated by similar mean rVLs (Fig. 2) and heterogeneities in rVL (Extended Data Table 2), asymptomatic, symptomatic/presymptomatic, adult and pediatric COVID-19 cases showed similar profiles for total shedding rates (Extended Fig. 6a-d). The estimates showed that the top 6.1%, 2.4% and 1.1% of pediatric cases shed ≥1 virion/min by singing, talking and breathing, respectively, while 62.5% expelled ≥10 virions/cough. In general, highly infectious COVID-19 cases expelled virions more rapidly than did ones with A(H1N1)pdm09 (Extended Data Fig. 6f).
Discussion
This study provided comprehensive, systematic analyses of several factors characterizing the transmissibility of SARS-CoV-2. First, we evaluated the influence of heterogeneity in rVL. Our findings show that broad heterogeneity in rVL facilitates greater variation in individual infectiousness in the COVID-19 pandemic than was found in the 2009 H1N1 pandemic. For each respiratory activity, SARS-CoV-2 shedding rates span >11 orders of magnitude throughout the infectious period. While most COVID-19 cases present minimal transmission risk by talking, singing or breathing, highly infectious ones, including asymptomatic and presymptomatic infections, can spread SARS-CoV-2 through these activities. Our model estimates, when corrected to copies rather than virions, align with recent clinical findings for exhalation rates of SARS-CoV-224. Moreover, the findings suggest that heterogeneity in rVL may be a virological factor generally associated with overdispersion for respiratory infections. In this case, rVL distribution may serve as an early correlate for transmission patterns, including superspreading, during outbreaks of novel respiratory viruses, providing insight for disease control before large-scale epidemiological analyses empirically characterize k.
Second, we analyzed SARS-CoV-2 kinetics during respiratory infection. While heterogeneity remains broad throughout the infectious period, the systematic dataset indicates that rVL tends to peak at 1 DFSO and be elevated for 1-5 DFSO, coinciding with the period of highest attack rates observed among close contacts25. These results indicate that transmission risk tends to be greatest soon after illness rather than in the presymptomatic period, which concurs with large tracing studies (6.4-12.6% of secondary infections from presymptomatic transmission)26,27 rather than early temporal models (∼44%)14. Furthermore, our kinetic analysis suggests that, on average, SARS-CoV-2 reaches diagnostic concentrations 1.60-3.22 days after respiratory infection (−3.78 to −2.16 DFSO), assuming assay detection limits of 1-3 log10 copies/ml, respectively, for nasopharyngeal swabs immersed in 1 ml of transport media.
Third, we modeled the likelihood of shedding SARS-CoV-2 via aerosols. Talking, singing and breathing shed SARS-CoV-2 at comparable rates through droplets and aerosols (up to tens to hundreds of virions/min). As airborne spread is recognized as a key mode of transmission for A(H1N1)pdm0920, our model estimates and comparative analyses support, particularly for highly infectious cases, airborne spread as a transmission mode for SARS-CoV-2. While our models delineated aerosols from droplets at the classical threshold (da=5 µm), recent reports show that, based on emission vectors and environmental conditions, respiratory particles larger than 5 µm can also travel >2 m in air28,29, further supporting the plausibility of the airborne transmission of SARS-CoV-2. However, with short durations of stay in well-ventilated areas, the concentration, and exposure risk, of aerosols remains correlated with proximity to infectious cases18,28.
Fourth, we assessed the relative infectiousness of COVID-19 subgroups. Since rVL distributions are similar among subgroups and the predominant source of aerosols is the non-presenting respiratory activities (talking, singing and breathing), symptomatic and asymptomatic infections present similar risks for aerosol spread, as do adult and pediatric cases. However, most cases shed considerable numbers of virions via large droplets by coughing, a common symptom of COVID-1930. Thus, symptomatic infections tend to be significantly more contagious than asymptomatic ones, providing a reason as to why asymptomatic cases transmit SARS-CoV-2 at lower relative rates3, especially in close contact31, despite similar rVLs and increased contact patterns. Accordingly, children (48-54% of symptomatic cases present with cough)32,33 tend to be less contagious by droplet spread than adults (68-80%)30,33 based on tendencies of symptomatology rather than rVL.
Our study has limitations. The systematic search found a limited number of studies reporting quantitative specimen measurements from the presymptomatic period, meaning these estimates may be sensitive to sampling bias. Although additional studies have reported semiquantitative metrics (cycle thresholds), these data were excluded because they cannot be compared on an absolute scale due to batch effects34, limiting use in compound analyses. Furthermore, our analyses considered population-level estimates of the infectious periods and viability proportions, which omit individual variation in the dynamics of virus viability. Some patients shed SARS-CoV-2 with diminishing viability soon after symptom onset13, while others produce replication-competent virus for weeks35. It remains unelucidated how case characteristics and environmental factors affect the viability dynamics of SARS-CoV-2.
Taken together, our findings provide a potential path forward for disease control. They highlight the disproportionate role of high-risk cases, settings and circumstances in propelling the COVID-19 pandemic. Since highly infectious cases, regardless of age or symptomatology, can rapidly shed SARS-CoV-2 via both droplets and aerosols, airborne spread should also be recognized as a transmission risk, including for superspreading. Strategies to abate infection should limit crowd numbers and duration of stay while reinforcing distancing and then widespread mask usage; well-ventilated settings can be recognized as lower risk venues. Coughing sheds considerable quantities of virions for most infections, while rVL tends to peak at 1 DFSO and can be high throughout the infectious period. Thus, immediate, sustained self-isolation upon symptom presentation is crucial to curb transmission from symptomatic cases. While diagnosing COVID-19, qRT-PCR can also help to triage contact tracing, prioritizing patients with higher specimen measurements: for nasopharyngeal swabs immersed in 1 ml of transport media, ≥7.14 (7.07-7.22, 95% CI) log10 copies/ml corresponds to ≥80th cp. Doing so may identify asymptomatic and presymptomatic cases more efficiently, a key step towards mitigation as the pandemic continues.
Data Availability
Data will be made available upon request. All raw data, code and model outputs from this study will be made publicly available in online repositories after peer review. Search strategies for the systematic review are shown in Supplementary Tables 1-5. The systematic review protocol was prospectively registered on PROSPERO (registration number, CRD42020204637).
Methods
Search strategy, selection criteria and data collection
We undertook a systematic review and prospectively submitted the systematic review protocol for registration on PROSPERO (registration number, CRD42020204637). Other than the title of this study, we have followed PRISMA reporting guidelines36. The systematic review was conducted according to Cochrane methods guidance37.
The search included papers that (i) reported positive, quantitative measurements (copies/ml or an equivalent metric) of SARS-CoV-2, SARS-CoV-1 or A(H1N1)pdm09 in human respiratory specimens (ETA, NPA, NPS, OPS, POS and Spu) from COVID-19, SARS or A(H1N1)pdm09 cases; (ii) reported data that could be extracted from the infectious periods of SARS-CoV-2 (defined as −3 to +10 DFSO for symptomatic cases and 0 to +10 days from the day of laboratory diagnosis for asymptomatic cases), SARS-CoV-1 (defined as 0 to +20 DFSO or the equivalent asymptomatic period) or A(H1N1)pdm09 (defined as −2 to +9 DFSO for symptomatic cases and 0 days to +9 days from the day of laboratory diagnosis for asymptomatic cases); and (iii) reported data for two or more cases with laboratory-confirmed COVID-19, SARS or A(H1N1)pdm09. Quantitative specimen measurements were considered after RNA extraction for diagnostic sequences of SARS-CoV-2 (Ofr1b, N, RdRp and E genes), SARS-CoV-1 (Ofr1b, N and RdRp genes) and A(H1N1)pdm09 (HA and M genes).
Studies were excluded, in the following order, if they (i) studied an ineligible disease; (ii) had an ineligible study design, including those that were reviews of evidence (e.g., scoping, systematic, narrative), did not include primary clinical human data, reported data for less than two cases due to an increased risk of selection bias, were incomplete (e.g., ongoing clinical trials), did not report an RNA extraction step before measurement or were studies measuring environmental samples; (iii) reported an ineligible metric for specimen concentration (e.g., qualitative RT-PCR or cycle threshold [Ct] values without calibration included in the study); (iv) reported quantitative measurements from an ineligible specimen type (e.g., blood specimens, pooled specimens or self-collected POS or Spu patient specimens in the absence of a healthcare professional); (v) reported an ineligible sampling period (consisted entirely of data that could not be extracted from within the infectious period); or (vi) were duplicates of an included study (e.g., preprinted version of published paper or duplicates not identified by Covidence). We included data from control groups receiving standard of care in interventional studies but excluded data from the intervention group. Patients in the intervention group are, by definition, systematically different from general case populations because they receive therapies not being widely used for treatment, which may influence virus concentrations. Interventional studies examining the comparative effectiveness of two or more treatments were excluded for the same reason. Studies exclusively reporting semiquantitative measurements (e.g., Ct values) of specimen concentration were excluded, as these measurements are sensitive to batch inconsistencies and, without proper calibration, cannot be compared on an absolute scale across studies34.
We searched, without the use of filters or language restrictions, the following sources: MEDLINE (via Ovid, 1946 to 7 August 2020), EMBASE (via Ovid, 1974 to 7 August 2020, Cochrane Central Register of Controlled Trials (via Ovid, 1991 to 7 August 2020), Web of Science Core Collection (including: Science Citation Index Expanded, 1900 to 7 August 2020; Social Sciences Citation Index, 1900 to 7 August 2020; Arts & Humanities Citation Index, 1975 to 7 August 2020; Conference Proceedings Citation Index - Science, 1990 to 7 August 2020; Conference Proceedings Citation Index - Social Sciences & Humanities, 1990 to 7 August 2020; and Emerging Sources Citation Index, 2015 to 7 August 2020), as well as MedRxiv and BioRxiv (both searched through Google Scholar via the Publish or Perish program, to 7 August 2020). We also gathered studies by searching through the reference lists of review articles identified by the database search, by searching through the reference lists of included articles, through expert recommendation (by Epic J. Topol, Akiko Iwasaki and A. Marm Kilpatrick on Twitter) and by hand-searching through journals (Nature, Nat. Med., Science, NEJM, Lancet, Lancet Infect. Dis., JAMA, JAMA Intern. Med. and BMJ). A comprehensive search was developed by a librarian, which included subject headings and keywords. The search strategy had 3 main concepts (disease, specimen type and outcome), and each concept was combined using the appropriate Boolean operators. The search was tested against a sample set of known articles that were pre-identified. The line-by-line search strategies for all databases are included in Supplementary Tables 1-5. The search results were exported from each database and uploaded to the Covidence online system for deduplication and screening.
Two authors independently screened titles and abstracts, reviewed full texts, collected data and assessed risk of bias via Covidence and a hybrid critical appraisal checklist based on the Joanna Briggs Institute (JBI) tools for case series, analytical cross-sectional studies and prevalence studies38-40. To evaluate the sample size in a study, we used the following calculation: where n* is the sample size threshold, z is the z-score for the level of confidence (95%), σ is the standard deviation (assumed to be 3 log10 copies/ml, one quarter of the full range of rVLs) and d is the marginal error (assumed to be 1 log10 copies/ml, based on the minimum detection limit for qRT-PCR across studies)41. The hybrid JBI critical appraisal checklist is shown in the Supplementary Notes. Inconsistencies were resolved by discussion and consensus.
The search found 27 studies for COVID-1913,35,42-66, 9 studies for SARS67-75 and 27 studies for A(H1N1)pdm0976-102. Quantitative specimen measurements were collected directly if reported numerically or using WebPlotDigitizer 4.3 (https://apps.automeris.io/wpd/) if reported graphically. For included studies, we also collected the relevant numbers of cases, types of cases, volumes of transport media, pharmacotherapies, DFSO (for symptomatic cases) or day relative to initial laboratory diagnosis (for asymptomatic cases) on which each specimen was taken and numbers of tested specimens. Hospitalized cases were defined as those being tested in a hospital setting and then admitted. Non-admitted cases were defined as those being testing in a hospital setting but not admitted. Community cases were defined as those being tested in a community setting. Symptomatic, presymptomatic and asymptomatic infections were defined as in the study. Based on rare description in the included studies, paucisymptomatic infections, when defined in a study, were included with symptomatic ones. Pediatric cases were defined as those of 18 years of age or lower or as defined in the study. Adult cases were defined as those above 18 years of age or as defined in the study.
Meta-analysis of rVLs
Based on the search design and composition of included studies, the meta-analysis overall estimates were the expected SARS-CoV-2, SARS-CoV-1 and A(H1N1)pdm09 rVL when encountering a COVID-19, SARS or A(H1N1)pdm09 case, respectively, during their infectious period. To determine rVLs, data collected on positive, quantitative specimen measurements were converted to the RNA concentration in the respiratory tract. Viral concentrations in respiratory specimens were denoted as specimen measurements, whereas viral concentrations in the respiratory tract were denoted as rVLs. For example, measurements from swabbed specimens (NPS and OPS) typically report the RNA concentration in viral transport media. Based on the expected uptake volume for swabs (0.128 ± 0.031 ml, mean ± SD)103 or reported collection volume for expulsed fluid in each study (e.g., 0.5 to 1 ml) along with the reported volume of transport media in each study (e.g., 1 ml), we calculated the dilution factor for each respiratory specimen to estimate the rVLs. If the diluent volume was not reported, then the dilution factor was calculated assuming a volume of 1 ml (NPS and OPS), 2 ml (POS and ETA) or 3 ml (NPA) of transport media43,45,71. Unless dilution was reported for Spu specimens, we used the specimen measurement as the rVL13. The non-reporting of diluent volume was noted as an element increasing risk of bias in the hybrid JBI critical appraisal checklist. Viral load estimates (based on instrumentation, calibration, procedures and reagents) are not standardized. While the above procedures (including only quantitative measurements after extraction, collecting assay detection limits, correcting for specimen dilution) have considered many of these factors, non-standardization is an inherent limitation in interpreting specimen measurements across studies.
Pooled estimates and 95% CIs for the expected rVL of each virus across their infectious period were calculated using a random-effects meta-analysis. The estimates for rVL assumed that each viral copy was extracted and quantified from the tested specimen aliquot. For studies reporting summary statistics in medians and interquartile or total ranges, we derived estimates of the mean and variance and calculated the 95% CIs104. All calculations were performed in units of log10 copies/ml. Between-study heterogeneity in meta-analysis was assessed using the I2 and τ2 statistics. The weighting for each study in its virus group was calculated as the reciprocal of the rVL variance.
Subgroup analyses of rVLs
Subgroup analyses were conducted to compare the expected rVLs of SARS-CoV-2 in adult, pediatric, symptomatic and asymptomatic COVID-19 cases, as previously defined, during the infectious period. The overall estimate for each subgroup was the expected rVL when encountering a case of that subgroup during the infectious period. Studies reporting data exclusively from a subgroup of interest were included in the analysis without modification. For studies in which data for these subgroups constituted only part of its dataset, rVLs from the subgroup were collected to calculate the mean, variance and 95% CIs. All calculations were performed in units of log10 copies/ml. In the analysis, we excluded studies with only a single case in our subgroups of interest. Pooled estimates and 95% CIs for each subgroup were calculated using a random-effects meta-analysis, in which between-study heterogeneity was assessed using the I2 and τ2 statistics. The weighting for each study in its subgroup was calculated as the reciprocal of the rVL variance.
Distribution of rVL
To analyze heterogeneity in rVLs, we pooled the entirety of individual sample data (reported as individual specimen measurements rather through descriptive statistics) in the systematic dataset by disease, COVID-19 subgroups and DFSO. For analyses of SARS-CoV-2 dynamics across DFSO, we included estimated rVLs from negative qRT-PCR measurements of respiratory specimens (n=3, 3, 6, 8, 12, 15, 13, 17 and 14 negative specimens for 2, 3, 4, 5, 6, 7, 8, 9 and 10 DFSO, respectively) for cases that had previously been quantitatively confirmed to have COVID-19. These rVLs were estimated based on the reported assay detection limit in the respective study. Probability plots and modified Kolmogorov–Smirnov tests were used to determine the suitability of normal, lognormal, gamma and Weibull distributions to describe the distribution of rVLs for SARS-CoV-2, SARS-CoV-1 and A(H1N1)pdm09. For each virus, the data best conformed to Weibull distributions, which is described by the probability density function where α is the shape factor, β is the scale factor and υ is rVL (υ ≥ 0 log10 copies/ml). In this distribution, the value of the rVL at the xth percentile was determined using the quantile function,
For cp curves, we used eq. (3) to determine rVLs from the 1st cp to the 99th cp (step size, 1%). Curve fitting to eq. (2) and calculation of eq. (3) and its 95% CI was performed using the Distribution Fitter application in Matlab R2019b (MathWorks, Inc., Natick, Massachusetts, USA).
Meta-regression of k and heterogeneity in rVL
To assess the relationship between k and heterogeneity in rVL, we performed a univariate meta-regression (log k = a(SD) + b, where a is the slope for association and b is the intercept) between pooled estimates of k (based on studies describing community transmission) for COVID-19 (k=0.409)4-6,105-108, SARS (k=0.165)17 and A(H1N1)pdm09 (k=8.155)22,23 and the SD of the rVLs in each study. Since the negative binomial distribution, from which k is derived17, is analogous to a compound Poisson distribution in which each random variable is Log(k)-distributed, the meta-regression was performed with log k. Based on negligible between-study heterogeneity, we used a fixed-effects model. This analysis assumes that the SD of rVLs in each study estimates SD of rVL for the disease. Thus, for weighting in the meta-regression, we used the proportion of rVL samples for each study relative to the entire systematic dataset (Wi = (ni/ntotal)). The regression line, its 95% CI and its Pearson correlation coefficient (r) were presented along with the p-value for association (meta-regression slope t-test for a) between the two variables. The meta-regression assumed that the viability proportion (for viruses exiting the respiratory tract) was similar across cases for a given respiratory infection; it could be a different value for different diseases. The meta-regression also assumed that the rate profile of particles expelled by respiratory activities (e.g., talking) is similar among the diseases. The limit of detection for qRT-PCR instruments used in the included studies did not significantly affect the analysis of heterogeneity in rVL, as these limits tended to be below the values found for specimens with low virus concentrations.
Viral kinetics
To model the kinetics of SARS-CoV-2 rVL, we used a mechanistic epithelial cell-limited model for the respiratory tract109, based on the system of differential equations: where T is the number of uninfected target cells, I is the number of productively infected cells, V is the rVL, β is the infection rate constant, p is the rate at which airway epithelial cells shed virus to the extracellular fluid, c is the clearance rate of the virus and δ is the clearance rate of productively infected cells. Parameter units are summarized in Extended Data Table 3. Using these parameters, the viral half-life in the respiratory tract (t1/2 = ln 2/c) and the half-life of productively infected cells (t1/2 = ln 2/δ) and their 95% CIs could be estimated. Moreover, the cellular basic reproductive number (the expected number of secondary infected cells from a single productively infected cell placed in a population of susceptible cells) was calculated by where T0 is the initial number of susceptible cells109.
For initial parameterization, eqs. (4)-(6) were simplified according to a quasi-steady state approximation110 to where r = pβ/c, for a form with greater numerical stability. The system of differential equations was fitted on the mean estimates of SARS-CoV-2 rVL between −2 and 10 DFSO using the entirety of individual sample data in units of copies/ml. Numerical analysis was implemented using the Fit ODE app in OriginPro 2019b (OriginLab Corporation, Northampton, Massachusetts, USA) via the Runge-Kutta method and initial parameters V0, I0 and T0 of 4 copies/ml, 0 cells and 5×107 cells, respectively, for the range −5 to 10 DFSO. The analysis was first performed with eqs. (8)-(9). These output parameters were then used to initialize final analysis using eqs. (4)-(6), where the estimates for β and δ were input as fixed and variable parameters, respectively. The fitted line and its coefficient of determination (r2) were presented.
To estimate the average incubation period, we extrapolated the kinetic model to 0 and 1 log10 copies/ml pre-symptom onset. To estimate the average duration of shedding, we extrapolated the model to 0 log10 copies/ml post-symptom onset. Unlike experimental estimates, this estimate for duration of shedding was not defined by assay detection limits. These analyses had limitations. To estimate the average DFSO on which SARS-CoV-2 concentration reached diagnostic levels, we extrapolated the model pre-symptom onset to the equivalent of 1 and 3 log10 copies/ml in specimen concentration (chosen as example assay detection limits), as described by the dilution factor estimation above. The average time from respiratory infection to reach diagnostic levels was then calculated by subtracting these values from the incubation period for 0 log10 copies/ml. However, the extrapolated time for SARS-CoV-2 to reach diagnostic concentrations in the respiratory tract should be validated in tracing studies, in which contacts are prospectively subjected to daily sampling.
Considerations for particle dehydration
The desiccation time of a particle in air was described , where b is prefactor for dehydration rate which depends on the environmental conditions, di is the initial hydrated diameter and ddes is particle diameter after desiccation111. After desiccation, the remaining non-volatile matter (ions, molecules, viruses and cells) governs particle size, which is approximately 0.44 times the initial size of particles atomized in the respiratory tract112. Dehydrated aerodynamic diameter was calculated by da = dp(ρ/ρ0)1/2, where dp is the dehydrated particle size, ρ is the material density of the respiratory particle and ρ0 is the reference material density (1 g/cm3). For conservative estimates, the value of b was taken to be 64.9 µm2/s113 based on conditions of room temperature and a relative humidity of 59% (near the upper limit of 60% for healthcare and typical indoor specifications)114. The equation for desiccation time indicated that respiratory particles begin to dehydrate immediately upon release to the ambient environment. Desiccation occurred rapidly, as the equation estimated that an 11.4-µm particle desiccated to 5 µm in 1.6 s within the model conditions, and this value was an upper limit for the desiccation times of aerosols (tdes≤ 1.6 s).
Likelihood of respiratory particles containing virions
To calculate an unbiased estimator for viral partitioning (the expected number of viable copies in an expelled particle at a given size), we multiplied rVLs with the volume equation for spherical particles during atomization and the estimated viability proportion: where λ is the expectation value, ρ is the material density of the respiratory particle (997 g/m3), υp is the volumetric conversion factor (1 ml/g), γ is the viability proportion, υ is the rVL and d is the hydrated diameter of the particle during atomization. The model assumed γ was 0.1% for the viruses. For influenza, approximately 0.1% of copies in particles expelled from the respiratory tract represent viable virus115, which is equivalent to one in 3 log10 copies/ml for rVL or, after dilution in transport media, roughly one in 4 log10 copies/ml for specimen concentration. Recent reports have detected culture-positive respiratory specimens with SARS-CoV-2 concentrations down to 4 log10 copies/ml13, including from pediatric patients62 and in the presymptomatic period15, suggesting the assumption was also suitable for SARS-CoV-2.
Likelihood profiles were determined using Poisson statistics, as described by the probability mass function where k is the number of virions partitioned within the particle. For λ, 95% CIs were determined using the variance of its rVL estimate. To determine 95% CIs for likelihood profiles from the probability mass function, we used the delta method, which specifies where σ2D is the covariance matrix of θ and ġ(θ) is the gradient of g(θ). For the univariate Poisson distribution, σ2D = λ and
Based on the relative relationship between the residence time of expelled particles before assessment (∼5 s)116 in the referenced study115 and the estimated dehydration rates of expelled particles, we took the viability proportion (0.1%) to be for dehydrated particles. The model calculated partitioning of copies using the hydrated volume and then applied the viability proportion for number of virions in particles after emission and dehydration. Thus, we compared likelihoods among expelled, dehydrated particles. In Fig. 4, the comparison between hydrated and dehydrated diameters showed only the relationship in particle size and not the relationship in likelihood of containing viable virus. Based on the scope of the study, the model did not account for the virion half-life as particles deposit onto surfaces or remain suspended in air117.
Rate profiles of particles expelled by respiratory activities
For the rate profiles of particles expelled during respiratory activities, we used distributions from the literature. For coughing, we considered the rate (particles/cough) of expelling particles at different sizes, as determined by Loudon and Roberts118, by calculating the mean number of respiratory particles expelled per cough based on subject tests RI, RII, LI, LII and EI (EII was presumed to be an outlier based on the relative rate when compared to EI). These particles were taken to be dehydrated based on the deposition time in the experiment relative to estimated dehydration rates. We compared this rate profile to that of Duguid119, which were taken to be hydrated particles based on experimental design. For talking, singing and breathing, we obtained data from Morawaska et al120. Rate profiles (particles/min) were calculated by converting the normalized concentration (particles/cm3) at each particle size based on normalization (32 channels per decade) for the aerodynamic particle sizer (APS) used, unit conversion (cm3 to L) and the sample flow rate (1 L/min). Rate profiles of talking and singing were isolated from breathing by subtracting the contribution of breathing to the combined data. Particles were taken to be dehydrated based on the minimum particle age in the measurements. Based on the APS used, the analyzed range for da was 0.3-20 μm. While larger droplets may potentially be expelled by the respiratory activities, the data suggested that their emission rates were minimal, and there was a limited bias associated with instrumentation. We compared these data for talking with rate profiles of talking loudly and talking quietly from Asadi et al121. For data reported in a size channel, we took the particle size to be the median value. Curves based on discrete particle measurements were connected using the nonparametric Akima spline function.
Shedding virions via respiratory droplets and aerosols
To determine the respiratory shedding rate across particle size, rVL estimates and the hydrated diameters of particles expelled by a respiratory activity were input into eq. (10), and the output was then multiplied by the rate profile of the activity (talking, singing, breathing or coughing). Dehydration and viability considerations were continued from the likelihood models. The model used particle profiles from (coughing) Loudon and Roberts118 or (talking, singing and breathing) Morawaska et al120.
To determine the total respiratory shedding rate for a given respiratory activity across cp, we determined the cumulative hydrated volumetric rate (by summing the hydrated volumetric rates across particle sizes for that respiratory activity) and input it into eq. (10). Using rVLs as determined by the Weibull quantile functions, we then calculated the Poisson means and their 95% CIs at different cps.
To assess the relative contribution of aerosols and droplets to mediating respiratory viral shedding for a given respiratory activity, we calculated the proportion of the cumulative hydrated volumetric rate contributed by aerosols (da≤5 μm) or droplets (da>5 μm) for that respiratory activity. Since the Poisson mean was proportional to cumulative volumetric rate, this estimate of the relative contribution of aerosols and droplets to respiratory viral shedding was consistent among viruses and cps in the model.
In this study, the model for shedding virions via droplets and aerosols did not delineate particles generated in the upper respiratory tract from those generated in the lower respiratory tract, as the sites of atomization remain poorly understood. It also did not differentiate cases with significant expectoration from those without it. In addition, it did not account for individual variation in the profiles of expelled particles; superemitters can expel respiratory particles at rates ∼3 times above median121.
Statistical analysis
For data collection, statistical analysis, coding and data visualization, we used Excel v16.40 (Microsoft Corporation, Redmond, Washington, USA), OriginPro 2019b (OriginLab Corporation, Northampton, Massachusetts, USA) and Matlab R2019b (MathWorks, Inc., Natick, Massachusetts, USA). Between-study heterogeneity in the random-effects meta-analyses was assessed using the I2 and τ2 statistics. Probability plots for normal, lognormal, gamma and Weibull distributions of rVLs were scored based on the Blom method. Modified Kolmogorov-Smirnov tests were used to determine the goodness of fits between rVLs (in log10 copies/ml) and normal, lognormal, gamma or Weibull distributions. By accepting the null hypothesis in the modified Kolmogorov–Smirnov test, the given distribution cannot be rejected to fit the data. Based on fitted Weibull distribution parameters, the Weibull quantile function was used to determine the rVL and its 95% CIs at a given cp. The association between k and rVL was assessed via meta-regression, and the p-value for association was based on the meta-regression slope t-test. Likelihood profiles were determined using the Poisson probability mass function and the unbiased estimator for the expected partitioning of virions at a given particle size. Variance on likelihood estimates was determined via the delta method. Since case variance or sample size may be unequal among the viral infections or subgroups, the two-sided Welch’s t-test was used to compare the difference of expected rVLs in the meta-analysis and subgroup analyses. For all statistical analyses, the significance level (α) was taken to be 0.05.
Data availability
Data will be made available upon request. All raw data, code and model outputs from this study will be made publicly available in online repositories after peer review. Search strategies for the systematic review are shown in Supplementary Tables 1-5. The systematic review protocol was prospectively registered on PROSPERO (registration number, CRD42020204637).
Author contributions
P.Z.C. designed the study, performed analyses, interpreted results and drafted the manuscript. P.Z.C. and N.B. conducted screening, appraised studies and drafted the review protocol. Z.P. developed and conducted the systematic review search. M.K. and D.N.F. assessed methods, interpreted results and contributed to the discussion. All authors reviewed and revised the manuscript. F.X.G. secured funds, interpreted results and supervised the project.
Competing interests
The authors declare no competing interests.
Supplementary information is available for this paper.
Correspondence and requests for materials should be addressed to F.X.G.
Acknowledgements
We thank T. Alba (Toronto) for discussion on statistical methods. We thank E. Lavezzo and A. Chrisanti (Padova) for responses to data inquiries. This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Toronto COVID-19 Action Fund. P.Z.C. was supported by the NSERC Vanier Scholarship (608544). D.N.F. was supported by the Canadian Institutes of Health Research (Canadian COVID-19 Rapid Research Fund, OV4-170360). F.X.G. was supported by the NSERC Senior Industrial Research Chair.
References
References (for Methods)
- 36.↵
- 37.↵
- 38.↵
- 39.
- 40.↵
- 41.↵
- 42.↵
- 43.↵
- 44.
- 45.↵
- 46.
- 47.
- 48.
- 49.
- 50.
- 51.
- 52.
- 53.
- 54.
- 55.
- 56.
- 57.
- 58.
- 59.
- 60.
- 61.
- 62.↵
- 63.
- 64.
- 65.
- 66.↵
- 67.↵
- 68.
- 69.
- 70.
- 71.↵
- 72.
- 73.
- 74.
- 75.↵
- 76.↵
- 77.
- 78.
- 79.
- 80.
- 81.
- 82.
- 83.
- 84.
- 85.
- 86.
- 87.
- 88.
- 89.
- 90.
- 91.
- 92.
- 93.
- 94.
- 95.
- 96.
- 97.
- 98.
- 99.
- 100.
- 101.
- 102.↵
- 103.↵
- 104.↵
- 105.↵
- 106.
- 107.
- 108.↵
- 109.↵
- 110.↵
- 111.↵
- 112.↵
- 113.↵
- 114.↵
- 115.↵
- 116.↵
- 117.↵
- 118.↵
- 119.↵
- 120.↵
- 121.↵