Abstract
COVID-19 epidemic doubling time by Chinese province was increasing from January 20 through February 9, 2020. Yet, the harmonic mean doubling time was relatively short, ranging from 1.4 (Hunan, 95% CI, 1.2-2.0) to 3.0 (Xinjiang, 95% CI, 2.0-4.9) days, with an estimate of 2.5 days (95% CI, 2.4-2.7) for Hubei.
To the editor
Our ability to estimate the basic reproduction number of emerging infectious diseases is often hindered by the paucity of information about the epidemiological characteristics and transmission mechanisms of new pathogens (1). Alternative metrics could synthesize real-time information about the extent to which the epidemic is expanding over time. Such metrics would be particularly useful if they rely on minimal and routinely collected data that capture the trajectory of an outbreak (2).
Epidemic doubling times characterize the sequence of intervals at which the cumulative incidence doubles (3). Here we analyze the number of times the cumulative incidence doubles and the evolution of the doubling times of the COVID-19 epidemic in mainland China (4). We use province-level data from January 20 (when nationwide reporting began) through February 9, 2020. See Technical Appendix 1 for a sensitivity analysis based on a longer time period. If an epidemic is growing exponentially with a constant growth rate r, the doubling time remains constant and equals (ln 2)/r. An increase in the epidemic doubling time indicates a slowdown in transmission if the underlying reporting rate remains unchanged (Technical Appendix 2).
Daily cumulative incidence data were retrieved from provincial health commissions’ websites (Technical Appendix 3). Data were double-checked against the cumulative national total published by the National Health Commission (5), data compiled by the Centre for Health Protection, Hong Kong, when available (6), and data compiled by Johns Hopkins University (7). Whenever discrepancies arose, provincial government sources were deemed authoritative. Tibet was excluded from further analysis because only one case had been reported there during the study period. See Technical Appendix 1 for data.
From January 20 through February 9, the harmonic mean doubling time estimated from cumulative incidence ranged from 1.4 (95% CI, 1.2-2.0) days (Hunan) to 3.0 (95% CI, 2.0-4.9) days (Xinjiang). In Hubei, it was estimated as 2.5 (95% CI, 2.4-2.7) days. For illustrative purposes only, this estimate corresponds to an estimated effective reproductive number in a range of 2.1 to 4.2 in scenarios based on different assumptions and data sources. See Technical Appendix 2 for scenario analysis of various parameter values. The cumulative incidence doubled 6 times in Hubei. The harmonic mean doubling time in mainland China except Hubei was 1.8 (95% CI, 1.5-2.2) days. Provinces with a harmonic mean doubling time <2 days included Fujian, Guangxi, Hebei, Heilongjiang, Henan, Hunan, Jiangxi, Shandong, Sichuan, and Zhejiang (Figures 1 and S1).
As the epidemic progressed, it took longer for the cumulative incidence in mainland China (except Hubei) to double, which indicated an overall sub-exponential growth pattern outside Hubei (Figure S1A). In Hubei, the doubling time decreased and then increased. A gradual increase in the doubling time coincided with the social distancing measures and intra- and inter-provincial travel restrictions imposed across China since the implementation of the quarantine of Wuhan on January 23 (8).
Our estimates of doubling times are shorter than recent estimates of 7.4 days (95% CI, 4.2-14) (4), 6.4 days (95% CrI, 5.8-7.1) (9), and 7.1 days (95% CI, 3.0-20.5) (10). Li et al. covered cases reported by January 22 (4). Wu et al. statistically inferred case counts in Wuhan from internationally exported cases as of January 25 (9). Volz et al. identified a common viral ancestor on December 8, 2019, using Bayesian phylogenetic analysis and fitted an exponential growth model to estimate the epidemic growth rate (10). Our estimates are based on cumulative numbers of confirmed cases by date of report, by province, from January 20 through February 9. Our study is also congruent with Maier and Brockmann (11), who identified sub-exponential growth of the outbreak across provinces as mass quarantine and travel restrictions across mainland China began on January 23, 2020.
Our study is subject to limitations, including the underreporting of confirmed cases, as reported by Chinese media (12). One reason for underreporting is underdiagnosis due to a lack of diagnostic tests, healthcare workers, and other resources. Further, underreporting is likely heterogeneous across provinces. As long as reporting remains invariant over time within the same province, the calculation of doubling times remains reliable; however, this is a strong assumption. Growing awareness of the epidemic and increasing availability of diagnostic tests might have strengthened reporting over time, which could have artificially shortened the doubling time. Nevertheless, apart from Hubei and Guangdong (first case reported on January 19), nationwide reporting only began on January 20, and at this point, Chinese authorities had openly acknowledged the magnitude and severity of the epidemic. Due to a lack of detailed case data describing incidence trends for imported and local cases, we focused our analysis on the overall trajectory of the epidemic without adjusting for the role of imported cases in local transmission dynamics. Indeed, it is likely that the proportion of imported cases was substantial in provinces that reported only a few cases; their short doubling times in the study period could simply reflect rapid detection of imported cases. However, with the data until February 9, only two provinces had a cumulative case count of <40 (Table S1). It would be interesting to investigate the evolution of the doubling time after accounting for case importations if more detailed data become available.
To conclude, we observed an increasing trend in the epidemic doubling time of COVID-19 by Chinese province from January 20 through February 9, 2020. The harmonic mean doubling time of cumulative incidence in Hubei during the study period was short and estimated at 2.5 (95% CI, 2.4-2.7) days.
Data Availability
All data analyzed are publicly available aggregated data. The data are provided in Table S1 in Technical Appendix 1.
First author(s) biography
Kamalich Muniz-Rodriguez, MPH, is a doctoral student at the Jiann-Ping Hsu College of Public Health, Georgia Southern University. Her research interests include infectious disease epidemiology, digital epidemiology and disaster epidemiology.
Gerardo Chowell, PhD, is Professor of Epidemiology and Biostatistics, and Chair of the Department of Population Health Sciences at Georgia State University School of Public Health. As a mathematical epidemiologist, Prof Chowell studies the transmission dynamics of emerging infectious diseases, such as Ebola, MERS and SARS.
Disclaimer
This article does not represent the official positions of the Centers for Disease Control and Prevention, the National Institutes of Health, or the United States Government.
Acknowledgement
GC acknowledges support from NSF grant 1414374 as part of the joint NSF-NIH-USDA Ecology and Evolution of Infectious Diseases program. ICHF acknowledges salary support from the National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention (19IPA1908208). This article is not part of ICHF’s CDC-sponsored projects.
Technical Appendix 1
Additional information on our motivation, scope and methods
Motivation
R0 is a widely used indicator of transmission potential in a totally susceptible population and is driven by the average contact rate and the mean infectious period of the disease (1). Yet, it only characterizes transmission potential at the onset of the epidemic and varies geographically for a given infectious disease according to local healthcare provision, outbreak response, as well as socioeconomic and cultural factors. Furthermore, estimating R0 requires information about the natural history of the infectious disease. Thus, our ability to estimate reproduction numbers for novel infectious diseases is hindered by the paucity of information about their epidemiological characteristics and transmission mechanisms. More informative metrics could synthesize real-time information about the extent to which the epidemic is expanding over time. Such metrics would be particularly useful if they rely on minimal data on the outbreak’s trajectory (2).
Scope and definition
We restricted our analysis to mainland China in this paper. A ‘province’ herein encompasses three different types of political sub-divisions of mainland China, namely, a province, a directly administered municipality (Beijing, Chongqing, Shanghai, and Tianjin) and an autonomous region (Guangxi, Inner Mongolia, Ningxia, Tibet, and Xinjiang). Our analysis does not include the Hong Kong Special Administrative Region and the Macau Special Administrative Region, which are under the effective rule of the People’s Republic of China through the so-called ‘One Country, Two Systems’ political arrangements. Likewise, our analysis does not include Taiwan, which is de facto governed by a different government (the Republic of China).
Methods
Figures 1A, 1B, 2A, 2C and all figures in Technical Appendix 4 were created using R version 3.6.2 (R Core Team). The significance level was set a priori at α = 0.05.
Additional information on our results and discussion
Cumulative incidence over time
In Technical Appendix 4, we provide a total of 31 plots of cumulative incidence over time (left panel) and log10(cumulative incidence) over log10(time) (right panel) for Hubei province, mainland China (except Hubei), and each of the other provinces except Tibet (as there was only one case). Thus, we linearized an exponential curve. If social distancing had an impact, the slope of the log-log plot would decrease, indicating a decreasing epidemic growth rate.
Sensitivity analysis
We performed a sensitivity analysis by expanding our data analysis to the data since December 31, 2019, when Hubei first reported a cluster of pneumonia cases with unexplained etiology that turned out to be COVID-19. The only difference between the sensitivity analysis and the main analysis is the inclusion of Hubei and Guangdong data from December 31, 2019, through January 19, 2020, because nationwide reporting started on January 20, 2020. Accordingly, the only differences in results were found for Hubei and Guangdong. The harmonic mean doubling time for Hubei was 4.06 days (95% CI, 3.86-4.33), and the cumulative incidence in Hubei doubled nine times from December 31, 2019, through February 9, 2020 (Table S3, Figures S2, S3, S4). The first doubling time of Hubei (Figure S2) was long, reflecting that real-time data were unavailable before mid-January. Only from January 17, 2020, onward did data reporting become increasingly transparent and timely.
Authors’ contributions
Project management: Dr. Gerardo Chowell, Dr. Isaac Chun-Hai Fung and Ms. Kamalich Muniz-Rodriguez
Manuscript writing: Dr. Isaac Chun-Hai Fung and Dr. Gerardo Chowell
Manuscript editing and data interpretation: Ms. Kamalich Muniz-Rodriguez, Dr. Gerardo Chowell, Dr. Isaac Chun-Hai Fung, Dr. Lone Simonsen, and Dr. Cecile Viboud
MATLAB code: Dr. Gerardo Chowell
Doubling time calculation using MATLAB and presentation of results: Ms. Kamalich Muniz-Rodriguez, Dr. Gerardo Chowell and Dr. Isaac Chun-Hai Fung
Statistical analysis in R: Dr. Isaac Chun-Hai Fung, Ms. Kamalich Muniz-Rodriguez
Data management and quality check of epidemic data entry: Ms. Kamalich Muniz-Rodriguez, Dr. Isaac Chun-Hai Fung
Curation of epidemic data for countries and territories outside mainland China (including Hong Kong, Macao and Taiwan): Ms. Kamalich Muniz-Rodriguez and Ms. Sylvia K. Ofori
Curation of epidemic data for provinces in mainland China: Ms. Manyun Liu (from the early reports, up to Jan 24, 2020 data), Ms. Po-Ying Lai (since Jan 25, 2020 data to today), Mr. Chi-Hin Cheung (since Jan 27, 2020 data to today), and Ms. Kamalich Muniz-Rodriguez and Dr. Isaac Chun-Hai Fung (whenever there is a back-log).
Retrieval of epidemic data from official websites (downloading and archiving of China’s national and provincial authorities’ press releases): Ms. Manyun Liu and Dr. Dongyu Jia (at the very beginning of our project)
Retrieval of statistical data from the official website of National Bureau of Statistics of the People’s Republic of China: Mr. Chi-Hin Cheung
Retrieval of publicly available statistical data from various sources: Ms. Yiseul Lee, Dr. Isaac Chun-Hai Fung
Map creation: Ms. Kimberlyn M. Roosa
Literature review assistance: Ms. Sylvia K. Ofori
Plots of log cumulative incidence versus time: Ms. Sylvia K. Ofori
Technical Appendix 2: Dr. Isaac Chun-Hai Fung, Dr. Gerardo Chowell
Technical Appendix 3: Ms. Manyun Liu, Dr. Isaac Chun-Hai Fung
Technical Appendix 2: Harmonic mean doubling time and effective reproductive number
Doubling time calculation and its relationship with growth rate of an epidemic
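As the epidemic grows, the times at which cumulative incidence doubles are given by $t_{d_i}$ such that $2C(t_{d_i}) = C(t_{d_{i+1}})$, where $t_{d_0} = 0$, $C(t_{d_0}) = C_0$, and $i = 0, 1, 2, 3, \ldots, n_d$, where $n_d$ is the total number of times cumulative incidence doubles. The actual sequence of “doubling times” is defined as follows: $d_j = \Delta t_{d_j} = t_{d_j} - t_{d_{j-1}}$, where $j = 1, 2, 3, \ldots, n_d$.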
To quantify parameter uncertainty, we used parametric bootstrapping with a Poisson error structure around the harmonic mean of doubling times dj to obtain the 95% confidence interval. See references (1-3) for further details.
Doubling time calculation was conducted using MATLAB R2019b (Mathworks, Natick, MA).
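As a rough illustration of the definition above, the following Python sketch extracts the sequence of doubling times from a daily cumulative incidence series and takes their harmonic mean. It is not the authors' MATLAB code: the function name, the toy counts, and the use of linear interpolation between consecutive reports to locate each doubling are assumptions made for this example, and the bootstrap step is not shown.

```python
from statistics import harmonic_mean


def doubling_times(times, cumulative):
    """Return the doubling times d_j of a cumulative incidence series.

    Assumption (not from the paper): the moment at which the series reaches
    each target C_0 * 2**i is located by linear interpolation between
    consecutive reports.
    """
    if len(times) != len(cumulative) or len(times) < 2:
        raise ValueError("need two or more paired observations")
    if cumulative[0] <= 0:
        raise ValueError("initial cumulative incidence C_0 must be positive")

    durations = []
    target = 2 * cumulative[0]   # next level to reach: 2 * C(t_{d_i})
    last_crossing = times[0]     # t_{d_0}
    for k in range(1, len(times)):
        # Several doublings can fall inside one reporting interval.
        while cumulative[k] >= target:
            step = cumulative[k] - cumulative[k - 1]
            fraction = (target - cumulative[k - 1]) / step
            crossing = times[k - 1] + fraction * (times[k] - times[k - 1])
            durations.append(crossing - last_crossing)
            last_crossing = crossing
            target *= 2
    return durations


# Toy series (hypothetical counts, not data from Table S1).
days = [0, 1, 2, 3, 4]
cases = [10, 20, 30, 40, 80]
d = doubling_times(days, cases)
print(d)                 # sequence of doubling times, in days
print(harmonic_mean(d))  # harmonic mean doubling time, in days
```

The harmonic mean is used here because it is the summary statistic named in the text; it weights short doubling times more heavily than the arithmetic mean would.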
If we assume homogeneous mixing (equal probability of acquiring infection through contacts) and exponential growth, then C(t2) = C(t1)exp(rt), where t = t2 - t1, and therefore ln(C(t2)/C(t1)) = rt. When C(t2)/C(t1) = 2, t is the doubling time, i.e., t = td, and ln 2 = r td. Therefore, the doubling time, td, equals (ln 2)/r. See Vynnycky and White (4), panel 4.1, p. 74 for further explanation.
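This relationship can be checked numerically. The short Python sketch below converts between a doubling time and an exponential growth rate and confirms that a curve growing at rate r doubles after td days. The function names are illustrative and do not come from the paper; the 2.5-day input is the Hubei estimate from the main text.

```python
import math


def growth_rate_from_doubling_time(td):
    """r = ln(2) / td, valid under constant exponential growth."""
    return math.log(2) / td


def doubling_time_from_growth_rate(r):
    """td = ln(2) / r, the inverse of the relation above."""
    return math.log(2) / r


td_hubei = 2.5  # harmonic mean doubling time for Hubei (days), from the main text
r_hubei = growth_rate_from_doubling_time(td_hubei)

# Under C(t2) = C(t1) * exp(r * t), the ratio after t = td days should be 2.
ratio = math.exp(r_hubei * td_hubei)
print(r_hubei, ratio)
```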
Connecting the harmonic mean doubling time and effective reproductive number
We explored a number of scenarios using parameter values extracted from recent literature to calculate the effective reproductive number, R, based on our study’s estimated harmonic mean doubling time from January 20 through February 9, 2020. Instead of the basic reproductive number R0, R was estimated because social distancing measures had been introduced in many parts of China since the quarantine of Wuhan on January 23, 2020. Our goal is to provide some indication of the uncertainty around the effective reproductive number R based on the epidemic doubling time that we estimated.
We reviewed parameter values from selected epidemiological literature on COVID-19, listed in Table S4. We chose the harmonic mean doubling time of Hubei (the epicenter), and those of Hunan (lowest), and Xinjiang (highest) for our scenario analysis. We rounded our R estimates to 1 decimal place.
Part 1
We assumed that the incubation period could be taken as a proxy for the pre-infectious period (also known as the latent period). In Scenario 1, we used a mean incubation period (D1) of 6.4 days, estimated by Backer, Klinkenberg and Wallinga (5) based on 88 cases with known travel history to or from Wuhan, and a mean duration from onset of symptoms to isolation (D2) of 0.7 day, estimated by authors from Guangdong Provincial CDC using data on cases after January 19, 2020, in a study of 153 cases confirmed by January 23, 2020. In Scenario 2, we used the Guangdong CDC’s data, with D1 = 4.8 and D2 = 0.7.
Assuming that the pre-infectious period and the infectious period follow the exponential distribution, we used the following equation for R0 in Vynnycky and White (4), Table 4.1, Equation 4.13, to calculate R:
In mathematical notation, R0 = (1+rD1)(1+rD2)
Scenario 1
If D1 = 6.4 and D2 = 0.7, the effective reproductive number, R, in Hubei was 3.3, corresponding to a harmonic mean doubling time of 2.5 days from January 20 through February 9, 2020. In Hunan, with a harmonic mean doubling time of 1.4, the R was 5.6, while in Xinjiang, with a harmonic mean doubling time of 3.0, the R was 2.9 (Table S5).
Scenario 2
If D1 = 4.8 and D2 = 0.7, the R in Hubei was 2.8. In Hunan, the R was 4.5, while in Xinjiang, the R was 2.5 (Table S5).
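The Part 1 scenarios can be retraced with a few lines of Python. This is a sketch of the arithmetic only, not the authors' code: the function and variable names are made up for the example, and the inputs are the doubling times and (D1, D2) values quoted above, with r = (ln 2)/td. Rounded to one decimal place, the results should agree with the values reported for Scenarios 1 and 2.

```python
import math


def reproductive_number_exponential(td, d1, d2):
    """R = (1 + r*D1) * (1 + r*D2), with r = ln(2) / td.

    d1: mean pre-infectious period (days); d2: mean infectious period (days).
    Both periods are assumed to be exponentially distributed.
    """
    r = math.log(2) / td
    return (1 + r * d1) * (1 + r * d2)


# Harmonic mean doubling times (days) quoted in the main text.
doubling_time = {"Hubei": 2.5, "Hunan": 1.4, "Xinjiang": 3.0}
# Scenario number -> (D1, D2) in days, as given in Part 1.
scenarios = {1: (6.4, 0.7), 2: (4.8, 0.7)}

results = {
    (number, province): reproductive_number_exponential(td, d1, d2)
    for number, (d1, d2) in scenarios.items()
    for province, td in doubling_time.items()
}
for (number, province), value in results.items():
    print(f"Scenario {number}, {province}: R = {value:.1f}")
```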
Part 2
We used serial intervals (TS) of 7.5 days from Li et al. (6) and 3.96 days from Du et al. (7), and explored scenarios in which the ratio between the infectious period and the serial interval, fr, is 0.5, 0.75, and 1.
We used the following R0 equation in Vynnycky and White (4) (Table 4.1, Equation 4.16) to calculate R: R0 = 1 + growth rate ✕ serial interval + ratio between the infectious period and the serial interval ✕ (1 - ratio between the infectious period and the serial interval) ✕ (growth rate ✕ serial interval)^2
In mathematical notation, $R_0 = 1 + rT_S + f_r(1 - f_r)(rT_S)^2$
Scenario 3
If half of the serial interval of 7.5 days is infectious, fr = 0.5, the effective reproductive number, R, in Hubei was 4.2, corresponding to a harmonic mean doubling time of 2.5 days from January 20 through February 9, 2020. In Hunan, with a harmonic mean doubling time of 1.4 days, the R was 8.2, while in Xinjiang, with a mean doubling time of 3.0 days, the R was 3.5 (Table S5).
Scenario 4
If three-quarters of the serial interval of 7.5 days is infectious, fr = 0.75, the R in Hubei was 3.9. In Hunan, the R was 7.3, while in Xinjiang, the R was 3.3 (Table S5).
Scenario 5
If the entire serial interval of 7.5 days is infectious, fr = 1, the R in Hubei was 3.1. In Hunan, the R was 4.7, while in Xinjiang, the R was 2.7 (Table S5).
Scenario 6
If half of the serial interval of 3.96 days is infectious, fr = 0.5, the R in Hubei was 2.4. In Hunan, the R was 3.9, while in Xinjiang, the R was 2.1 (Table S5).
Scenario 7
If three-quarters of the serial interval of 3.96 days is infectious, fr = 0.75, the R in Hubei was 2.3. In Hunan, the R was 3.7, while in Xinjiang, the R was 2.1 (Table S5).
Scenario 8
If the entire serial interval of 3.96 days is infectious, fr = 1, the R in Hubei was 2.1. In Hunan, the R was 3.0, while in Xinjiang, the R was 1.9 (Table S5).
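Scenarios 3 through 8 follow the same pattern and can be retraced with the sketch below. As before, this is an illustration of the arithmetic rather than the authors' code: the names are hypothetical, r is taken as (ln 2)/td, and the inputs are the serial intervals, ratios, and doubling times stated in Part 2. Rounded to one decimal place, the output should agree with the values reported above.

```python
import math


def reproductive_number_serial_interval(td, ts, fr):
    """R = 1 + r*Ts + fr*(1 - fr)*(r*Ts)**2, with r = ln(2) / td.

    ts: serial interval (days); fr: ratio of infectious period to serial interval.
    """
    r = math.log(2) / td
    x = r * ts
    return 1 + x + fr * (1 - fr) * x ** 2


# Harmonic mean doubling times (days) quoted in the main text.
doubling_time = {"Hubei": 2.5, "Hunan": 1.4, "Xinjiang": 3.0}
# Scenario number -> (serial interval in days, fr), as given in Part 2.
scenarios = {
    3: (7.5, 0.5), 4: (7.5, 0.75), 5: (7.5, 1.0),
    6: (3.96, 0.5), 7: (3.96, 0.75), 8: (3.96, 1.0),
}

results = {
    (number, province): reproductive_number_serial_interval(td, ts, fr)
    for number, (ts, fr) in scenarios.items()
    for province, td in doubling_time.items()
}
for (number, province), value in results.items():
    print(f"Scenario {number}, {province}: R = {value:.1f}")
```

When fr = 1, the quadratic term vanishes and the expression reduces to R = 1 + r TS, which is why Scenarios 5 and 8 give the lowest values for each serial interval.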
Conclusions
The effective reproductive number, R, of COVID-19 in Hubei was in a range of 2.1 to 4.2 in scenarios based on different assumptions and data sources. For provinces with a short mean doubling time, such as Hunan, R could be in the range of 3.0 to 8.2, while in provinces with a long mean doubling time, such as Xinjiang, R could be in the range of 1.9 to 3.5. These estimates are meant to illustrate the range of possibilities, as they are critically dependent on the estimates of the other epidemiologic parameters, such as serial intervals.
Technical Appendix 3
Footnotes
Email addresses: km11200{at}georgiasouthern.edu (K. Muniz-Rodriguez); gchowell{at}gsu.edu (G. Chowell); westerpants{at}gmail.com (C.-H. Cheung); djia{at}georgiasouthern.edu (D. Jia); pylai{at}bu.edu (P.-Y. Lai); ylee97{at}student.gsu.edu (Y. Lee); ml16842{at}georgiasouthern.edu (M. Liu); so01935{at}georgiasouthern.edu (S. K. Ofori); kroosa1{at}student.gsu.edu (K. M. Roosa); lsimonsen2{at}gmail.com (L. Simonsen); viboudc{at}mail.nih.gov (C. Viboud); cfung{at}georgiasouthern.edu (I. C.-H. Fung)