Abstract
Importance Several parameters driving the transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) remain unclear, including age-specific differences in infectivity and susceptibility, and the contribution of inapparent infections to transmission. Robust estimates of key time-to-event distributions remain scarce as well.
Objective Illustrate SARS-CoV-2 transmission patterns and risk factors, and estimate key time-to-event distributions.
Design, Setting, and Participants Individual-based data on 1,178 SARS-CoV-2 infected individuals and their 15,648 contacts identified by contact tracing monitoring over the period from January 13-April 02, 2020 were extracted from the notifiable infectious diseases reporting system in Hunan Province, China. Demographic characteristics, severity classification, exposure and travel history, and key clinical timelines were retrieved.
Exposures Confirmed SARS-CoV-2 infection by positive polymerase chain reaction test result of respiratory samples, and exposure to SARS-CoV-2 infected individuals via household, relative, social, and other types of contacts.
Main Outcomes and Measures The relative contribution of pre-symptomatic and asymptomatic transmission, key time-to-event parameters, and the effect of biological, demographic, and behavioral factors on SARS-CoV-2 infectivity and susceptibility were quantified.
Results Among SARS-CoV-2 infected individuals, the estimated mean serial interval was 5.5 days (95%CI −5.0, 19.9) and the mean generation time was 5.5 days (95%CI 1.7, 11.6). Infectiousness was estimated to peak 1.8 days before symptom onset, with 95% of transmission events occurring between −7.6 days and 7.3 days from the date of symptom onset. The proportion of pre-symptomatic transmission was estimated at 62.5%, while a lower bound for the proportion of asymptomatic transmission was 3.5%. Infectiousness of SARS-CoV-2 was not significantly different between working-age adults (15-59 years old) and other age groups (0-14 years old: p-value=0.16; 60 years and over: p-value=0.33), whilst susceptibility to SARS-CoV-2 infection was estimated to increase with age (p-value=0.03). In addition, transmission risk was higher for household contacts (p-value<0.001), but decreased in later generations of a cluster (second generation: OR=0.13, p-value<0.001; generations 3-4: OR=0.05, p-value<0.001, relative to generation 1) and for those exposed to infectors with a larger number of contacts (p-value=0.04).
Conclusions and Relevance These findings support the contribution of children to transmission and the importance of pre-symptomatic transmission, in turn highlighting the importance of large-scale testing, contact tracing activities, and the use of personal protective equipment during the COVID-19 pandemic.
Question What are the age-specific differences in infectivity and susceptibility to SARS-CoV-2? What is the role of asymptomatic and pre-symptomatic transmission in the COVID-19 pandemic? What are the risk factors associated with SARS-CoV-2 transmission?
Findings Infectiousness does not differ by age, while susceptibility to SARS-CoV-2 infection increases with age. Up to 62.5% of transmission events may occur before symptom onset and at least 3.5% of transmission events may be linked to asymptomatic shedding. Contacts in the household and exposure to primary cases increase the risk of transmission, while transmission decreases with increasing contacts.
Meaning Our findings support the contribution of children to transmission, and highlight the importance of pre-symptomatic transmission; containment measures should be adjusted accordingly.
Introduction
The outbreak of coronavirus disease 2019 (COVID-19) started in December 2019 in Wuhan, China 1. The outbreak, caused by the SARS-CoV-2 virus, quickly spread globally, leading WHO to declare a pandemic on March 11, 2020 2. Despite more than 11.8 million SARS-CoV-2 infected individuals confirmed worldwide as of July 09, 2020 3, there are still many unknowns in the epidemiology and natural history of COVID-19.
A key question under debate is whether the infectivity of, and susceptibility to, SARS-CoV-2 infection differs by age. In particular, the role of children in SARS-CoV-2 transmission has yet to be fully understood. Schools were closed in the early months of the pandemic in most countries 4,5, so that the low proportion of cases notified in young individuals 6 could be attributed to a low probability of developing symptoms 7,8, a low susceptibility to infection 9-11, and/or few contact opportunities relative to other age groups. The importance of each of these factors has thus far been difficult to disentangle. A related question is the probability of asymptomatic transmission from young individuals. In fact, it is often argued that the COVID-19 pandemic has been difficult to tackle because of the importance of pre-symptomatic and asymptomatic transmission. Evidence from confined settings such as households, homeless shelters, and nursing facilities supports the role of pre-symptomatic and asymptomatic transmission 10,12-15. Yet, a quantification of the contribution of asymptomatic and pre-symptomatic transmission in large populations is still lacking.
A full understanding of SARS-CoV-2 transmission patterns and risk factors is crucial to plan targeted COVID-19 responses, especially as countries relax costly lockdown policies and move towards case-based interventions (e.g., case isolation, quarantine of contacts, contact tracing). To define the temporal characteristics of the response strategies (e.g., duration of the quarantine and isolation period, definition of contacts to be traced) it is crucial to understand the age profile of infectiousness and to have robust estimates of key time-to-event distributions such as the generation time. These distributions were estimated in the early days of the pandemic based on the very first few clusters of cases and are thus subject to high uncertainty and variability between different studies 1,15,16. It is important to update these estimates using large-scale and harmonized epidemiological datasets.
In this study, we analyze 1,178 SARS-CoV-2 infected individuals and their 15,648 contacts identified by contact tracing operations carried out in the Hunan Province of China over the period from January 13-April 02, 2020. This comprehensive and detailed dataset compiled by the Hunan Provincial CDC sheds light on SARS-CoV-2 transmission patterns, risk factors, and the distribution of key time-to-event parameters.
Methods
COVID-19 surveillance system, field epidemiological investigations, and contact tracing
In response to the COVID-19 outbreak, in late December 2019, the Chinese Center for Disease Control and Prevention (China CDC) launched a new surveillance system for COVID-19 cases. A description of the surveillance system is reported elsewhere 1. On January 21, 2020, the first COVID-19 case was confirmed in Hunan Province. Since then, active field epidemiological investigations of suspected or confirmed SARS-CoV-2 infections as well as their contacts have been initiated.
The definition of suspected and confirmed COVID-19 cases (i.e., symptomatic individuals), as well as subjects with asymptomatic SARS-CoV-2 infections (i.e., asymptomatic subjects) was based on the New Coronavirus Pneumonia Prevention and Control Program published by the National Health Commission (NHC) of China and the World Health Organization (WHO) 17. A suspected COVID-19 case was defined as a person who met one or more clinical criteria and had an epidemiological link to SARS-CoV-2 positive individuals or history of travel to/from regions reporting widespread SARS-CoV-2 transmission (Appendix p2). A confirmed COVID-19 case was defined as a suspected case with positive real-time RT-PCR results, while an asymptomatic subject was defined as an individual with laboratory confirmation of SARS-CoV-2 infection, but without any clinical symptom (e.g., no fever or cough). Confirmed COVID-19 cases were categorized by clinical severity, including mild, moderate, severe and critical illnesses (as defined in Appendix, Tab. S1).
Once a suspected or confirmed COVID-19 case was identified, a field epidemiology investigation was undertaken by the local CDC. Data were collected on demographic characteristics, clinical symptoms, and activity patterns starting 14 days before symptom onset and until confirmation or isolation in the hospital. All cases detected between January 16 and April 02, 2020 were interviewed using a standardized questionnaire. In addition, each individual with suspected or confirmed SARS-CoV-2 infection was asked to provide a list of locations she/he visited (e.g., workplace, health-care facilities) and her/his contacts. On the basis of this list, active contact tracing was then initiated by the investigation team. Screening interviews, checks of travel records based on public security cameras and the traffic system, and digital health records were also used to assess whether an individual met the definition of close contact. Once a close contact was identified and traced, she/he was quarantined at a designated place (e.g., hotel room) or at home and followed up for 14 days 17. Close contacts were interviewed using a standardized form before they were quarantined. The form comprised basic demographic information (e.g., age and sex) and a detailed record of the timing, frequency, and type of exposures to the case(s) who triggered the investigation.
Specimen collection and laboratory testing
Upper respiratory specimens (nasopharyngeal and oropharyngeal swabs) were collected from all suspected cases as well as their close contacts. Before February 7, 2020, specimens were collected for testing from each close contact if she/he developed symptoms during the quarantine period. After February 7, 2020, specimens were collected at least once during quarantine, regardless of symptoms. After January 27, the designated hospitals and local CDCs were approved to conduct real-time RT-PCR assays for diagnosis of COVID-19 using a standardized laboratory testing procedure according to the “Novel coronavirus pneumonia Diagnosis and Treatment Program” released by the NHC of China. The assays were performed in laboratories equipped with BSL-2 facilities (Appendix p3-4).
Close contacts, sporadic cases and clusters
Close contacts were defined as individuals who had close-proximity interactions (within 1 meter) with clinically suspected and laboratory-confirmed SARS-CoV-2 cases, for the period from 2 days before, to 14 days after, the potential infector’s symptom onset. For those exposed to asymptomatic subjects, the contact period was from 2 days before, to 14 days after, a respiratory sample was taken for real-time RT-PCR testing. Close contacts included, but were not limited to, household contacts (i.e., household members regularly living with the case), relatives (i.e., family members who had close contacts with the case but did not live with the case), social contacts (i.e., a work colleague or classmate), and other close contacts (i.e., caregivers and patients in the same ward, persons sharing a vehicle, and those providing a service in public places, such as restaurants or movie theatres).
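The tracing window described above can be sketched as a small date check. This is only an illustration: the function names are hypothetical, the window bounds are assumed to be inclusive, and the proximity criterion (within 1 meter) is not modeled.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def contact_window(reference_date: date) -> Tuple[date, date]:
    """Window from 2 days before to 14 days after the reference date."""
    return reference_date - timedelta(days=2), reference_date + timedelta(days=14)


def is_in_window(contact_date: date,
                 onset: Optional[date],
                 sampling: Optional[date]) -> bool:
    """Check whether a contact date falls inside the tracing window.

    Symptomatic infectors are anchored on symptom onset; asymptomatic
    subjects (onset is None) are anchored on the date the respiratory
    sample was taken. Inclusive bounds are an assumption of this sketch.
    """
    reference = onset if onset is not None else sampling
    if reference is None:
        raise ValueError("need either an onset date or a sampling date")
    start, end = contact_window(reference)
    return start <= contact_date <= end
```

For example, with a symptom onset on February 3, 2020, a contact on February 1 would fall inside the window and a contact on January 31 would not.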
A cluster of SARS-CoV-2 infections was defined as a group of two or more confirmed cases or asymptomatic subjects with an epidemiologic link (Appendix p3). Epidemiologically linked cases were classified according to the generation of SARS-CoV-2 transmission and the setting where exposure took place, with primary cases considered as first generation. A sporadic case was defined as a confirmed case of SARS-CoV-2 infection (either symptomatic or asymptomatic) who did not belong to any of the reported clusters.
We define pre-symptomatic transmission as a direct transmission event that takes place before the date of symptom onset of the infector, while asymptomatic transmission is a transmission event from a person who never developed symptoms.
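As a minimal sketch of these two definitions, the hypothetical helper below labels a single transmission event. It assumes the transmission date is known and that a missing onset date means the infector never developed symptoms; neither assumption is guaranteed by field data.

```python
from datetime import date
from typing import Optional


def classify_transmission(transmission_date: date,
                          infector_onset: Optional[date]) -> str:
    """Label a transmission event using the definitions in the text.

    infector_onset is None when the infector never developed symptoms
    (an assumption of this sketch; in practice this requires complete
    follow-up of the infector).
    """
    if infector_onset is None:
        return "asymptomatic"
    if transmission_date < infector_onset:
        return "pre-symptomatic"
    return "symptomatic"
```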
Ethical approval statement
This study was approved by the ethics committee of the Hunan CDC with a waiver of informed consent due to a public health outbreak investigation (IRB No. 2020005).
Statistical analysis
We provide descriptive statistics of the characteristics of cases and their close contacts, including demographic factors and exposures (Appendix p5-p7). We estimated the incubation period (i.e., the time delay from infection to illness onset), the serial interval (i.e., the time interval between the onset of symptoms in a primary case and in her/his secondary cases), the generation time (i.e., the time interval between infection of the primary case and of her/his secondary cases), and the infectiousness profile (i.e., the daily distribution of the probability of transmission since the date of symptom onset; see 15,18 and Appendix p7-p10 for methods). We also estimated the interval from symptom onset to the sampling date of first PCR by using a maximum likelihood estimator and fitting three distributions (Weibull, gamma, and lognormal) (Appendix p10). The goodness of fit was assessed using Akaike information criterion (AIC). We restrict the estimation of incubation period to 268 locally acquired infections with information on both the date(s) of exposure and generation of SARS-CoV-2 transmission in the cluster.
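The distribution-selection step can be illustrated with a short sketch. The analysis in this study was run in R; the Python code below is not the authors' code. It fits the three candidate families by maximum likelihood using closed-form or one-dimensional profile likelihoods and compares them by AIC. The delay values are hypothetical, and the sketch assumes strictly positive observations (how negative delays were handled is not described in this section).

```python
import math


def ternary_max(f, lo, hi, iters=200):
    """Maximize a unimodal function f on [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


def fit_lognormal(x):
    """Closed-form MLE for the lognormal; returns (params, log-likelihood)."""
    logs = [math.log(v) for v in x]
    n = len(x)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((l - mu) ** 2 for l in logs) / n)
    ll = sum(-l - math.log(sigma) - 0.5 * math.log(2 * math.pi)
             - (l - mu) ** 2 / (2 * sigma ** 2) for l in logs)
    return {"mu": mu, "sigma": sigma}, ll


def fit_gamma(x):
    """Gamma MLE via the profile likelihood in the shape parameter."""
    n = len(x)
    mean = sum(x) / n
    sum_log = sum(math.log(v) for v in x)

    def profile(k):
        # For a given shape k, the scale MLE is mean / k.
        return ((k - 1) * sum_log - n * k
                - n * k * math.log(mean / k) - n * math.lgamma(k))

    k = ternary_max(profile, 1e-3, 100.0)
    return {"shape": k, "scale": mean / k}, profile(k)


def fit_weibull(x):
    """Weibull MLE via the profile likelihood in the shape parameter."""
    n = len(x)
    sum_log = sum(math.log(v) for v in x)

    def profile(k):
        # For a given shape k, scale**k has MLE mean(x**k).
        mean_pow = sum(v ** k for v in x) / n
        return n * math.log(k) - n * math.log(mean_pow) + (k - 1) * sum_log - n

    k = ternary_max(profile, 1e-3, 50.0)
    scale = (sum(v ** k for v in x) / n) ** (1 / k)
    return {"shape": k, "scale": scale}, profile(k)


def aic(loglik, n_params=2):
    """Akaike information criterion; all three families have 2 parameters."""
    return 2 * n_params - 2 * loglik


# Hypothetical delays in days, for illustration only (not study data).
delays = [2, 3, 3, 4, 5, 5, 6, 7, 8, 10, 12, 14]
fits = {"weibull": fit_weibull(delays),
        "gamma": fit_gamma(delays),
        "lognormal": fit_lognormal(delays)}
aics = {name: aic(ll) for name, (_, ll) in fits.items()}
best = min(aics, key=aics.get)  # family with the lowest AIC
```

Because all three families have two parameters, ranking by AIC here is equivalent to ranking by maximized log-likelihood.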
We rely on the contact tracing data to describe the age-specific contact matrices for SARS-CoV-2 infectors and their contacts (Appendix p11). Additionally, generalized linear mixed-effects models (GLMMs) for binary data with a logit link were built to quantify the effects of potential drivers of susceptibility and infectivity of the SARS-CoV-2 virus (i.e., odds ratio and marginal effect), based on 8,159 individual records of contacts who were exposed to locally transmitted cases (see Appendix p11-12). These risk factors include age and gender of infectors/contacts, type of contact, generation of SARS-CoV-2 transmission in a cluster, as well as the number of contacts of an infector. Statistical analyses were performed using the R software, version 3.5.0.
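To make the odds-ratio scale of a logit model concrete, the sketch below converts a baseline infection probability into the probability implied by a given odds ratio. The baseline value of 10% is hypothetical and chosen only for illustration; the odds ratio of 0.13 is the value reported in the abstract for second-generation infectors. The sketch ignores the random effects of the GLMM, so it describes the fixed-effect interpretation only.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Probability obtained by multiplying the baseline odds by odds_ratio."""
    if not 0 < baseline_prob < 1:
        raise ValueError("baseline_prob must be strictly between 0 and 1")
    odds = baseline_prob / (1 - baseline_prob)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)


# Hypothetical baseline risk of 10% for contacts of first-generation cases.
risk_second_generation = apply_odds_ratio(0.10, 0.13)
```

Under this hypothetical baseline, an odds ratio of 0.13 corresponds to a risk of roughly 1.4%, which shows that an odds ratio is close to, but not the same as, a risk ratio when the baseline risk is small.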
Results
Sample description
Between January 23, 2020 and April 02, 2020, 1,019 symptomatic cases and 159 asymptomatic subjects were reported and screened for inclusion (Fig. S1 and Tab. 1). Through active contact tracing, a total of 15,648 close contacts were identified, of whom 471 contacts were positive for SARS-CoV-2 infection. Among 1,178 SARS-CoV-2 infections, we identified 831 epidemiologically linked cases in 210 clusters. Of these clusters, 499 SARS-CoV-2 infections in 123 clusters had a clear epidemiological link to a previous SARS-CoV-2 infected individual. Of the 15,648 close contacts, 6,412 were identified by forward contact tracing and resulted in the identification of 285 symptomatic cases and 63 asymptomatic SARS-CoV-2 positive subjects. The remaining 9,236 close contacts were identified through backward contact tracing. The distribution of the cases and close contacts in time and space is presented in Fig. 1 and Fig. S2. Overall, the median ages of symptomatic cases, asymptomatic subjects, and their close contacts were 45 (IQR: 34-55), 36 (IQR: 19-52) and 40 (IQR: 27-52) years, respectively (Tab. 1). Cases aged 0-19 years presented milder or no clinical symptoms, while patients aged 40 years and older had more severe illness (P<0.001).
Time-to-key-event distributions
We analyzed 268 locally acquired confirmed cases belonging to 114 clusters, with information on both the date(s) of exposure and transmission generation in the cluster. We found that the best fitting distribution of the incubation period was a Weibull distribution with a mean of 6.4 days (95% CI: 0.7, 16.6 days) (Tab. S3). We performed a sensitivity analysis excluding cases having only an exposure end date (17 individuals) and we obtained similar estimates (Appendix, Tab. S3). Symptom onset dates were available for 245 transmission pairs; the resulting serial interval was estimated to have a mean of 5.5 days (95% CI: −5.0, 19.9 days) and a median of 4.8 days, based on a fitted gamma distribution. By considering only pairs with a single identified infector, we found that 14.0% (31/221) of the empirical serial intervals were negative. The mean time interval from symptom onset to the sampling date of first PCR was estimated to be 4.7 days (95% CI: −2.9, 14.7 days) using the best fitting gamma distribution, based on 531 PCR positive individuals. The generation time was estimated to be 5.5 days (95% CI: 1.7, 11.6 days). The estimated distributions of the incubation period and of the generation time show stark similarities (Fig. 2B).
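The empirical serial intervals mentioned above are differences between the symptom onset dates of infectee and infector, and can be negative when the infectee develops symptoms first. A minimal sketch, with hypothetical onset dates and helper names, is:

```python
from datetime import date


def serial_intervals(pairs):
    """Days from infector onset to infectee onset for each (infector, infectee) pair."""
    return [(infectee - infector).days for infector, infectee in pairs]


def share_negative(intervals):
    """Fraction of serial intervals that are strictly negative."""
    return sum(1 for d in intervals if d < 0) / len(intervals)


# Hypothetical transmission pairs (infector onset, infectee onset).
pairs = [
    (date(2020, 1, 25), date(2020, 1, 30)),  # infectee onset 5 days later
    (date(2020, 1, 28), date(2020, 1, 27)),  # infectee onset 1 day earlier
    (date(2020, 2, 1), date(2020, 2, 9)),
    (date(2020, 2, 3), date(2020, 2, 3)),    # same-day onset
]
intervals = serial_intervals(pairs)
negative_share = share_negative(intervals)

# Proportion reported in the text for pairs with a single identified infector.
reported_share = 31 / 221
```

The reported figure follows directly from the counts given in the text: 31 of 221 is about 14.0%.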
Pre-symptomatic transmission
Infectiousness was estimated to peak 1.8 days before symptom onset (Fig. 2A). We estimated the proportion of pre-symptomatic transmission (area under the curve, Fig. 2A) at 62.5%, with 95% of transmission events occurring between −7.6 days and 7.3 days of the date of symptom onset, under the intensive contact tracing and isolation strategy undertaken by the Hunan Province. From the analysis of the transmission chains reconstructed by field investigations, 43 pre-symptomatic transmission events were recorded in 23 clusters. A subset of those clusters is shown in Fig. 3A.
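The "area under the curve" reading of the pre-symptomatic proportion can be sketched numerically. The code below assumes, for illustration only, that the infectiousness profile is a gamma density shifted to start before symptom onset; the shape, scale, and shift values are hypothetical and are not the fitted values from this study, so the result is not expected to match 62.5%.

```python
import math


def gamma_pdf(x, shape, scale):
    """Gamma probability density; zero for non-positive x."""
    if x <= 0:
        return 0.0
    return math.exp((shape - 1) * math.log(x) - x / scale
                    - math.lgamma(shape) - shape * math.log(scale))


def trapezoid(f, a, b, n=20000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def presymptomatic_fraction(shape, scale, shift, upper=60.0):
    """Share of the infectiousness profile lying before symptom onset (t < 0).

    t is measured in days relative to symptom onset; the profile is a gamma
    density started `shift` days before onset (an assumption of this sketch).
    """
    def profile(t):
        return gamma_pdf(t + shift, shape, scale)

    before = trapezoid(profile, -shift, 0.0)
    total = trapezoid(profile, -shift, upper)
    return before / total


# Hypothetical parameters; the mode is (shape - 1) * scale - shift = -1.5 days.
fraction = presymptomatic_fraction(shape=4.0, scale=2.5, shift=9.0)
```

The same function applied to a profile estimated from transmission pairs would give the pre-symptomatic share directly, without any further modeling step.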
Asymptomatic transmission
From the analysis of contact tracing records, we identified 8 clusters with evidence of asymptomatic transmission. There were 11 asymptomatic infectors (5 primary and 6 secondary infections) associated with 15 of 25 local transmission events (10 secondary and 5 tertiary, Fig. 3B).
SARS-CoV-2 risk factors
We first explored differences in the age of SARS-CoV-2 infectors and infectees through the construction of age-specific transmission matrices (Fig. S4). The results suggest that people aged 15-59 years generated a larger mean number of cases than younger (0-14 years old) and older (60+ years old) individuals. Moreover, individuals over 60 years were infected more often, suggesting increased susceptibility. Next, to account for the possible effect of multiple confounding factors on the probability of transmission, we performed a multivariate regression analysis. We found that the age of the contact, the contact setting, and the generation of the infector in a cluster were important risk factors for transmission (Tab. 2). Infectiousness was not significantly different between working-age adults (15-59 years old) and other age groups (0-14 years old: p-value=0.162; 60 years and over: p-value=0.332); in contrast, susceptibility to SARS-CoV-2 infection increased with age (p-value=0.028, Model 2 in Tab. 2). Further, household contacts were associated with a significantly larger risk of SARS-CoV-2 infection than other types of contact. The GLMM suggests two other statistically significant risk factors: the generation in the transmission chain and the number of contacts identified for an infector (Tab. 2). In particular, the transmission risk was lower for later generations, possibly due to improved case isolation and contact quarantine that deplete the number of susceptible individuals in the cluster. We also found a slight but significant decrease in transmission risk from cases who reported more contacts. The inclusion of other potential risk factors, such as the gender of an infector/contacts and clinical severity of an infector, did not modify the risk of SARS-CoV-2 transmission and did not improve the fit of the model (Tab. S7, Fig. S5).
Discussion
This analysis of SARS-CoV-2 transmission patterns and risk factors in Hunan, China, is based on the largest contact tracing dataset considered thus far. We found no difference in infectiousness by age, while susceptibility to SARS-CoV-2 infection increased with age. We provide evidence of both pre-symptomatic and asymptomatic SARS-CoV-2 transmission, with the former potentially accounting for up to 62.5% of all transmission events in this dataset. In addition, we estimate that SARS-CoV-2 transmission in households is responsible for most secondary and tertiary infections. Further, within a cluster, individuals who were exposed to primary cases experienced a significantly higher risk of SARS-CoV-2 infection than those exposed to later cases.
The exposure history data used in this study were collected from in-depth epidemiological investigations, allowing us to provide robust estimation of several key time-to-event distributions. Previous estimates of the serial interval and incubation period were obtained from a limited number of infector-infectee pairs or from different data sources, thus suffering from large uncertainty 19,20. This may explain the large variability of the estimates, ranging from 4.0 days to 7.5 days for the serial interval 1,15,20-22 and from 4.8 days to 8.0 days for the incubation period 1,22-27. Our estimates fall within these intervals. Unlike the serial interval and the incubation period, only a few studies 28,29 provide estimates of the generation time, which is hard to directly infer from field investigations, as it requires knowledge of the infection date of both the infector and her/his infectees. Here, following an approach similar to He, et al 15, we estimate the mean generation time at 5.5 days, in general agreement with Ferretti, et al 29.
Previous studies show a relatively high proportion of pre-symptomatic transmission, but estimates vary significantly, ranging from 13% to 62% 15,29,30. Our estimate (62.5%) is on the high end of the range found in the literature. This may be due to two main factors. First, the fraction of pre-symptomatic transmission heavily depends on the intensity of contact tracing and isolation strategy (e.g., whether cases are promptly isolated in dedicated facilities at the time of symptom onset or are isolated at home). Second, the depth of the contact tracing investigation may determine the rate of ascertainment of index cases. Our analysis suggests a key role of interventions (e.g., contact tracing and case isolation) in decreasing the risk of infection, as the risk of infection decreased with the number of the generations in the transmission chain.
We found evidence of asymptomatic transmission in several clusters, with 15 secondary cases linked to asymptomatic infectors. Other studies provide evidence of asymptomatic infection 12,30,31, but do not quantify its contribution to transmission. In our study, we cannot provide a point estimate, as a fraction of asymptomatic infections may have been missed despite extensive PCR testing performed by the Hunan CDC. However, we can provide a lower bound; we estimate that at least 3.5% (15/432) of transmission events are associated with asymptomatic infectors, in agreement with Chen, et al (4.5% (6/132), p=0.602) 32.
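The lower bound and the comparison with Chen, et al can be checked from the counts alone. The text does not state which test produced p=0.602; the sketch below assumes a two-sided Fisher exact test on the 2x2 table of transmission events, which is one plausible choice and need not reproduce the reported value exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(k) for k in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Counts from the text: asymptomatic-linked events out of all events.
this_study = (15, 432)
chen_et_al = (6, 132)

lower_bound = this_study[0] / this_study[1]
p_value = fisher_exact_two_sided(
    this_study[0], this_study[1] - this_study[0],
    chen_et_al[0], chen_et_al[1] - chen_et_al[0],
)
```

Whatever the exact test used, the two proportions (about 3.5% and 4.5%) are close, and a test of this kind does not indicate a significant difference at the 5% level.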
In agreement with previous studies, we found that the risk of infection from a household member is larger than that resulting from other contacts 10,33. This may be explained by the duration, type, and frequency of contacts between household members as well as the impact of interventions (such as household quarantine) on household contacts. Consistent with the transmissibility of H1N1pdm influenza during the 2009 pandemic in the US 34, we found that SARS-CoV-2 transmissibility decreased with the number of contacts, although the effect is small. Further studies are needed to explain this connection.
Despite the challenges of reporting a low number of infections among children and the complexity of establishing epidemiologic links between children and adults within households 22, we assessed the effects of infector and infectee characteristics on SARS-CoV-2 susceptibility and infectivity. Our findings suggest that SARS-CoV-2 infectivity does not significantly differ by age, while the risk of SARS-CoV-2 infection steadily increases with age (in agreement with Zhang J, et al. 9,11). This implies that caution should be applied when evaluating policies that increase the number of contacts among children, such as re-opening of schools or summer camps.
Our study is not without limitations. First, it suffers from the classic limitations of any epidemiological field investigation. Despite the longitudinal and in-depth investigation of each case and her/his contacts, we could not always accurately reconstruct the entire transmission chain and avoid recall bias in individual records. Moreover, we cannot rule out the possibility of indirect exposures (e.g., contaminated surfaces), which may affect the identification of epidemiological links. Second, our sample size did not allow us to distinguish between different time periods of the pandemic in Hunan, while controlling for all the other covariates. Changes in population awareness and reactive behavioral response to the outbreak may affect the estimates provided in this study.
In conclusion, the evidence of pre-symptomatic and asymptomatic SARS-CoV-2 transmission shown in this study underlines the key role of undetectable SARS-CoV-2 transmission that can hinder control efforts. Control measures should thus be tailored accordingly, especially contact tracing, testing, and isolation. Our finding that transmission can occur up to 7 days before symptom onset lends support to personal precautions such as mask wearing. In addition, school reopening, and the consequent increase in the number of daily contacts among children and teenagers, is expected to increase the contribution of children to SARS-CoV-2 transmission. School outbreaks have already been reported on several occasions 5,35,36; time will tell whether schools can become a major focus of transmission in the coming months.
Author contributions
S. Hu, W. Wang, Y. Wang, L. Gao, and H. Yu had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: L. Gao, M. Ajelli, H. Yu.
Acquisition, analysis, or interpretation of data: K. Luo, L. Ren, Q. Sun, X. Chen, G. Zeng, J. Li, L. Liang, Z. Deng, W. Zheng, M. Li, H. Yang, J. Guo, K. Wang, X. Chen, Z. Liu, H. Yan, H. Shi, Z. Chen, Y. Zhou.
Drafting of the manuscript: S. Hu, W. Wang, M. Litvinova, M. Ajelli, H. Yu.
Critical revision of the manuscript for important intellectual content: K. Sun, A. Vespignani, C. Viboud, L. Gao, M. Ajelli, H. Yu.
Statistical analysis: W. Wang, Y. Wang, M. Litvinova, M. Ajelli.
Administrative, technical, or material support: K. Luo, Q. Sun, G. Zeng, Z. Deng, H. Yang, Z. Liu, K. Sun.
Supervision: L. Gao, M. Ajelli, H. Yu.
Obtained funding: L. Gao, H. Yu.
Conflicts of Interest Disclosures
Hongjie Yu has received research funding from Sanofi Pasteur, GlaxoSmithKline, Yichang HEC Changjiang Pharmaceutical Company, and Shanghai Roche Pharmaceutical Company. None of this research funding is related to COVID-19. All other authors report no competing interests.
Funding/Support
National Science Fund for Distinguished Young Scholars (No. 81525023), National Science and Technology Major Project of China (No. 2017ZX10103009-005, No. 2018ZX10713001-007, No. 2018ZX10201001-010), and Hunan Provincial Innovative Construction Special Fund: Emergency response to COVID-19 outbreak (No. 2020SK3012).
Role of the Funder/Sponsor
The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Data Availability
Individual-based data on 1,178 SARS-CoV-2 infected individuals and their 15,648 contacts identified by contact tracing monitoring over the period from January 13-April 02, 2020 were extracted from the notifiable infectious diseases reporting system in Hunan Province, China. Demographic characteristics, severity classification, exposure and travel history, and key clinical timelines were retrieved.