Abstract
Background The 2019-nCoV outbreak in Wuhan, China has attracted world-wide attention. As of February 11, 2020, a total of 44730 cases of novel coronavirus-infected pneumonia associated with COVID-19 were confirmed by the National Health Commission of China.
Methods Three approaches, namely Poisson likelihood-based method (ML), exponential growth rate-based method (EGR) and stochastic Susceptible-Infected-Removed dynamic model-based method (SIR), were implemented to estimate the basic and controlled reproduction numbers.
Results A total of 71 chains of transmission together with dates of symptoms onset and 67 dates of infections were identified among 5405 confirmed cases outside Hubei as reported by February 2, 2020. Based on this information, we find the serial interval having an average of 4.41 days with a standard deviation of 3.17 days and the infectious period having an average of 10.91 days with a standard deviation of 3.95 days.
Conclusions The controlled reproduction number is declining. It is lower than one in most regions of China, but is still larger than one in Hubei Province. Sustained efforts are needed to further reduce the Rc to below one in order to end the current epidemic.
1. Introduction
On December 29, 2019, Wuhan, the capital city of Hubei Province in Central China, has reported four cases of pneumonia with unknown etiology. Since then, the outbreak has rapidly worsened over a short span of time and has received considerable global attention. On January 7, 2020, the pathogen of the current outbreak was identified as a novel coronavirus (2019-nCoV), and its gene sequence was quickly submitted to WHO (The coronavirus was renamed COVID-19 by WHO on February 12).1,2 On January 30, WHO announced the listing of this novel coronavirus-infected pneumonia (NCP) as a “public health emergency of international concern”. As of February 11, 2020, the National Health Commission (NHC) of China had confirmed a total of 44730 cases of novel coronavirus-infected pneumonia (NCP) associated with the COVID-19 in Mainland China, including 1114 fatalities and 4771 recoveries.
Since January 19, 2020, strict containment measures, including travel restrictions, contact tracing, entry or exit screening, non-hospital isolation, quarantine, awareness campaigns and others, have been implemented by the Wuhan Government and quickly adopted by other cities within China with the aim to minimize virus transmission via human-to-human contact. Similar measures were previously employed in China in 2009 to tackle the outbreak of H1N1 including mandatory quarantine of anyone who has had close contact with confirmed patients.
This article investigates the change in the basic reproduction number R0 and controlled reproduction number Rc since the outbreak of COVID-19. We have found that the estimated controlled reproduction numbers Rc in all different regions are significantly smaller compared with the basic reproduction numbers R0, but Rc is still greater than one in Hubei Province.
2. Data
Data were collected from provincial/municipal health commissions in China as well as through ministry of health in other countries and regions with details of each confirmed case including case ID, region, age, gender, date of symptom onset, date of diagnosis, history of traveling to or residing in Wuhan, and, if any, related remarks such as contact identification, cases and case-related information. In addition, the collected dataset also contains date of infection and chains of transmission of infection which can be inferred from travel or residency history in Wuhan and other relevant information, if available, as follows:
If the individual has not been to Hubei Province recently, but were exposed within a four-day period (i.e., the individual has had contact with a confirmed case of NCP on a certain day), then the corresponding date of infection is inferred as the middle of the exposure period;
If the individual has travelled to Hubei Province but has returned within four days, then the date of infection is inferred as the middle of the travelling period;
If the individual has not been to Hubei Province recently, but in close contact with an imported case from Hubei, then the individual is identified to be infected by this imported case;
If the individual has not been to Hubei Province recently, but in close contact with a local case who was clearly infected before that individual, then this individual is identified to be infected by the corresponding local case.
Note: if the individual has been to Hubei Province, the transmission history would not be recorded despite the existence of contact tracing information.
3. Inference about the Serial Interval and Infectious Period
In this study, serial interval is defined as the time difference between dates of infection of successive cases in a chain of transmission (different textbooks may have different definitions). Infectious period is the duration of which an infected individual can transmit pathogens to a susceptible host. In this study, infectious period is defined as the time difference between date of infection and date of diagnosis as there is strong evidence showing that a diseased individual remains contagious even during the incubation period, and would be immediately isolated upon positive diagnosis hence losing the transmissibility. Both are key quantities that depict an epidemic and are essential to estimate the basic/controlled reproductive number, R0 / Rc. Among 139 chains of transmission identified from 5405 confirmed cases outside Hubei as recorded by February 2, 2020, none of them have their dates of infection acquired, but 71 of them have their dates of symptoms onset available.
Hence, the corresponding serial intervals were approximated by the differences in dates of symptom onset rather than the actual dates of infection, see Figure 1.
Histogram of serial interval with the average of 4.27 days before correction.
We can see that some serial intervals are negative which is certainly impossible by definition. However, noting that the serial intervals were approximated from the dates of symptoms onset, this suggests that the negative values could be caused by different lengths of incubation period between individuals. Here a simple correction is implemented by resetting the negative values to zeros. This shifts the average of the serial intervals to 4.41 days and the standard deviation to 3.17 days after corrections, see table 1. Note that the serial interval of SARS-nCoV in Hongkong was 8.4 days on average.3 In addition, a total of 67 cases in the collected data were able to identify the corresponding dates of infection. Figure 2 plots the histogram of infectious period while Table 2 shows the numerical summary.
numerical summary of serial intervals.
numerical summary of infectious period.
Histogram of infectious period with the average of 10.91 days.
We found that there were no significant demographical differences between the subset of cases used to estimate serial interval and infectious period and the cases in the full dataset. Therefore, the inference made on serial interval and infectious period based on the corresponding subsets should be able to represent the full dataset.
4. Estimation of Basic/Controlled Reproduction Number
Definition
The reproduction number R0is defined as the (average) number of new infections generated by one infected individual during the entire infectious period in a fully susceptible population.4 It can be also understood as the average number of infections caused by a typical individual during the early stage of an outbreak when nearly all individuals in the population are susceptible. The basic reproduction number reflects the ability of an infection spreading under no control. When the size of susceptible population is limited, the quantity, effective reproduction number Re, is used instead of R0. Similarly, the quantity, controlled reproduction number Rc, should be used to describe the ability of disease spreading when interventions (such as quarantine, isolation, or traffic control) are taking place. Hence a good measure of any intervention is to reduce Rc. Note that the disease will decline and eventually die out if Rc ≤1.
Methods
The basic reproduction number can be estimated through a variety of models.5 In this section, we have compared three most popular estimates of R0/Rc as shown below.
(1) Poisson Likelihood-based (ML) method
Let Nt be the number of reported new confirmed cases on day t. Suppose that the serial interval has a maximum of k days and the number of new cases generated by an infected individual is assumed to follow a Poisson distribution with parameter R0.6 The probability that the serial interval of an individual in 1 days is wj, which can be estimated from the empirical distribution of serial interval or by setting up a discretized Gamma prior on it. Thus, the likelihood function can be reduced into a thinned Poisson
where
The reproduction number R can be estimated by maximizing the likelihood function. Note that if the empirical distribution of serial interval is used or wj, are given, then
(2) Exponential growth rate-based (EGR) method
At the early period of an epidemic, the number of infected cases rises exponentially. Suppose the exponential epidemic growth rate (Malthusian coefficient) is R, which can be estimated by fitting a least square line to the daily number of reported new confirmed cases in a log-scale, namely, log (Nt). Let fG(t) denote the probability density function of serial interval. Hence the reproduction number can be calculated according to the Euler-Lotka equation in a moment generating form7
(3) Stochastic dynamic model-based method
Here we consider a stochastic Susceptible-Infected-Removed (SIR) model rather than a standard deterministic one. The major advantage of using a stochastic dynamic model is that it affords improved accounting for real variabilities and increases opportunity for quantifying uncertainties.8 Let S(t), I(t) and R(t) denote the number of susceptible, infectious and recovered population at time t respectively, and note that N= s(t) + 1(t) + R(t). Suppose that the infectious period of an individual is a random variable T∼ Exp(γ), then the reproduction number R= β(T) = β/ γ, where γ and β are the recovery rate and transmission rate respectively in the system of ordinary differential equation (ODE) below,
The maximum likelihood method is used to estimate model parameters where the likelihood is obtained by sequential Monte Carlo method, and parameters are estimated using the Iterated Filtering algorithm (IF2)9 implemented as mif in the R package pomp10 with s(0) equals the population of the region, R(0) = 0, I(0) is 10 times the average number of confirmed cases from Day 0 to Day 7 and γ = 10.91 obtained from the collected data described ahead.
Results
In this section we have estimated the basic reproduction number R0 and the controlled reproduction number Rc. Since January 19, 2020, various containment measures have been strictly implemented, especially after the State Council agreed to include NCP into the Management of the Infectious Diseases Law and the Health and Quarantine Law on January 20. Based on an average10.91-day infectious period estimate from our collected data, we expect a flatter rate of increment starting on January 29. Figure 3 plots the number of daily new cases in a log-scale against date, and, as anticipated, the trend supports our guess.Therefore, the quantities R0 and Rc are estimated based on collected data in two separate periods, i.e., from January 21 (the starting date of daily updates of confirmed cases nationwide) to January 28, and from January 29 to February 5 (the end date of this study) respectively.
Visualization of daily numbers of new confirmed case along with date, as obvious change of rate occurred on January 29 as expected.
The estimates of R0 and Rc by Poisson likelihood (ML) and exponential growth rate (EGR) in selected regions of China are listed in Table 3 and Table 4. Despite the disagreement between different estimation methods, all three methods indicate notable reductions from R0 to Rc which suggests an improvement in the current situation. This is possibly due to the effective interventions and prompt actions by the local and central governments to minimize further spreading. We also notice that EGR yields smaller estimates of Rc compared to other methods. This might be because the number of infected patients does not grow exponentially after such strict containment measures.
Estimates and 95% confidence intervals of basic reproduction number in some selected provinces (or cities) of China, from Jan 21 to Jan 28, 2020.
Estimates and 95% confidence intervals of controlled reproduction number in some selected provinces (or cities) of China, from Jan 29 to Feb 5, 2020.
Furthermore, the time-varying controlled reproduction number Rc(t) can be estimated through the Poisson likelihood (ML) method where t is from Feb 1 to Feb 10, 2020. For each Day t, the number of daily reported new cases from Day t-7 to Day t is used to estimate . Figure 4 plots the estimated controlled reproduction number
along with its 95% CI for selected regions of China. Note that the estimated
reflects the average spreading ability of the epidemic in a short period prior to Day t. As a result, the real-time Rc(t) might be overestimated if the general trend of Rc(t) is declining.
The estimated controlled reproduction number in (a) China, (b) Hubei, (c) Other provinces except Hubei, (d) Beijing, (e) Shanghai, (f) Guangdong, (g) Zhejiang, (h) Hunan, and (i) Henan. The dashed line is the 95% confidence interval.
5. Conclusion
Despite the continuous increase in new confirmed cases on a daily basis, the estimated controlled reproduction numbers Rc produced by all three methods in all different regions are significantly smaller compared with the basic reproduction numbers R0. As discussed in Section 4, the real-time controlled reproduction number may be even lower than the estimated values in Figure 3. Nonetheless, additional effort is needed to further reduce Rc below one in Hubei Province.
6. Discussion
The dataset used in this study is based on the confirmed cases reported by NHC China. However, during the period of data collection, the official guidelines for diagnosis and treatment were updated four times. The criteria of confirmation have evolved from the original “whole genome sequencing of the respiratory excretion” to “positive viral nucleic acid results by the RT-PCR of the respiratory excretion or viral gene sequence”, and, most currently, the inclusion of positive nucleic acid results of the blood sample. In the meanwhile, the confirmation process is simplified by removing the accreditation process by the national expert committee for confirmed cases. The fourth edition granted the accrediting authority to the municipalities.11 In addition, the medical resources in Hubei Province especially in Wuhan have been enhanced remarkably. All of these changes might result in a temporary surge of confirmed cases and lead to an overestimation of Rc, especially in Hubei Province.
Furthermore, the current containment measures mainly aim to cut the transmission from human to human via droplets of respiratory. However, other transmission pathways, including fecal-oral transmission and aerosol transmission, could not yet be excluded based on current evidence. If other transmission mechanisms do exist, the Rc values would remain high in the future unless further measures would intersect these transmission pathways.
Acknowledgments
We thank Taojun Hu, Xueqing Liu and Yuying Li from School of Public Health, Peking University for assistance of data collection.
Footnotes
↵* Joint first authors