Abstract
On January 23, 2020, China quarantined Wuhan to contain an emerging coronavirus (2019-nCoV). Here, we estimate the probability of 2019-nCoV importations from Wuhan to 369 cities throughout China before the quarantine. The expected risk exceeds 50% in 128 [95% CI 75-186] cities, including five large cities with no reported cases by January 26th.
A novel coronavirus (2019-nCoV) has emerged in Wuhan, China (1). On January 30, 2020, the World Health Organization declared it a public health emergency of international concern. As of January 31, 2020, there were 192 reported fatalities and 3215 laboratory-confirmed cases in Wuhan, 8576 additional cases spread across over 300 cities in mainland China, and 127 exported cases in 23 countries spanning Asia, Europe, North America, and Oceania. The rapid global expansion, rising fatalities, unknown animal reservoir, and evidence of person-to-person transmission potential (2) resemble the 2003 SARS epidemic and raise concerns about global spread.
On January 22nd, China announced a travel quarantine of Wuhan and, by January 30th, enlarged the radius to include 16 cities totalling 45 million people. At the time of the quarantine, China was already two weeks into the forty-day Spring Festival period, during which hundreds of millions of people travel to celebrate the Lunar New Year (3). Based on the timing of exported cases reported outside of China, we estimate that among cases that were infected by January 12th, only 9.82% (95% CrI: 2.58% - 59.44%) may have been confirmed in Wuhan by January 22nd. By limiting our estimate to infections occurring at least ten days before the quarantine, we are accounting for a 5-6 day incubation period followed by 4-5 days between symptom onset and case detection (2,4,5) (see Appendix for details). This lag between infection and detection (5) coupled with an overall low detection rate also suggests that newly infected cases that traveled out of Wuhan just prior to the quarantine may have remained infectious and undetected in dozens of Chinese cities for days to weeks. Moreover, these silent importations may have seeded sustained outbreaks that have not yet become apparent.
We estimated the probability of 2019-nCoV importations from Wuhan to cities throughout China before January 23rd using a simple model of exponential growth coupled with a stochastic model of human mobility among 370 Chinese cities (see Appendix). Given that 98% of all trips during this period are taken by train or car, our analysis of air, rail and road travel data yields more granular risk estimates than possible with air passenger data alone (6).
By fitting our epidemiological model to data on the first 19 cases reported outside of China, we estimate an epidemic doubling time of 7.38 (95% CrI: 5.58 - 8.92) days and a total of 11,213 (95% CrI: 1,590 - 57,387) cumulative infections in Wuhan by January 22nd (see Appendix for details). Both estimates are consistent with a recent epidemiological analysis of the first 425 patients confirmed in Wuhan (7). Based on these estimates of early epidemic growth, 128 Chinese cities have at least a 50% chance of having had 2019-nCoV case importations from Wuhan in the three weeks preceding the Wuhan quarantine (Figure 1). By January 26th, 107 of these high-risk cities had reported cases and 21 had not, including five cities with importation probabilities exceeding 99% and populations over two million people: Bazhong, Fushun, Laibin, Ziyang, and Chuxiong. Under our lower bound estimate for the doubling time (5.58 days), 186 of the 369 cities lie above the 50% importation risk threshold.
This risk assessment identified several cities throughout China that were likely to have undetected cases of 2019-nCoV in the week following the quarantine, and suggests that early 2020 ground and rail travel seeded cases far beyond the Wuhan region quarantine. We note that these conclusions are based on several key assumptions. Our mobility model may be demographically biased by the user base of Tencent, a major social media company that hosts apps including WeChat (∼1.13 billion active users in 2019) and QQ (∼808 million active users in 2019) (8). Further, there is considerable uncertainty regarding the lag between infection and case detection. Our assumption of a 10-day lag is based on early estimates for the incubation period of 2019-nCoV (7) and prior estimates of the lag between symptom onset and detection for SARS (9). We expect that estimates for the doubling time and incidence of 2019-nCoV will improve as reconstructed linelists (6) and more granular epidemiological data become available (see Appendix Table S2 for sensitivity analysis). However, our key qualitative insights are likely robust to these uncertainties, including extensive pre-quarantine 2019-nCoV exportations throughout China and far greater case counts in Wuhan than reported prior to the quarantine.
Data Availability
Data will become available upon publication in a peer-reviewed journal.
Supplementary Appendix
Data
We analyze the daily number of passengers traveling between Wuhan and 369 other cities in mainland China. We obtained mobility data from the location-based services (LBS) of Tencent (https://heat.qq.com/). Users permit Tencent to collect their real-time location information when they install apps (e.g., WeChat, QQ, and Tencent Map). From the geolocation of their users over time, Tencent reconstructed anonymized origin-destination mobility matrices by mode of transportation (air, road and train) between 370 cities in China (368 cities in mainland China and the Special Administrative Regions of Hong Kong and Macau). The data include 28 and 32 million trips to and from Wuhan, respectively, between December 3, 2016 and January 24, 2017. We estimate daily travel volume throughout the seven weeks preceding the Wuhan quarantine (December 1, 2019 - January 22, 2020) by aligning the dates of the Lunar New Year, resulting in a three-day shift. To infer the number of new infections in Wuhan per day between December 1, 2019 and January 22, 2020, we use the mean daily number of passengers travelling to the top 27 foreign destinations from Wuhan in 2019 or 2018, which are provided by recent studies (11–13).
Model
We consider a simple hierarchical model to describe the dynamics of 2019-nCoV infections, detections and spread.
(1) Epidemiological model
Based on the epidemiological evidence of the first 425 patients of 2019-nCoV that were confirmed in Wuhan by January 22, 2020 (7), we make the following assumptions regarding the number of new cases dIw(t) infected in Wuhan per day t:
The 2019-nCoV epidemic was growing exponentially between December 1, 2019 and January 22, 2020: $dI_w(t) = i_0 \, e^{\lambda t}$, where i0 denotes the number of initial cases on December 1, 2019 (14), and λ denotes the epidemic growth rate in Wuhan between December 1, 2019, and January 22, 2020.
After infection, new cases were detected with an average delay of D = 10 days (15), which comprises an incubation period of 5-6 days (7,16–20) and a delay from symptom onset to detection of 4-5 days (4,21). During this 10-day interval, cases are labeled infected. Given the uncertainty in these estimates, we also repeated the estimation assuming either a shorter delay (D = 6 days) or a longer delay (D = 14 days) between infection and case detection (Table S2).
Our model can be improved by incorporating the probability distribution for the delay between infection and detection, as reconstructed linelists (22–25) and more granular epidemiological data become available.
Under these assumptions, the number of infected cases at time t is given by $I_w(t) = \sum_{u=t-D}^{t} dI_w(u)$ and the prevalence of infected cases is given by $\xi(t) = I_w(t) / N_w$, where Nw = 11.08 million is the population size of Wuhan.
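The epidemiological assumptions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes daily incidence of the form i0 · exp(λ · t) with t counted in days since December 1, 2019, and it assumes that the infected pool on day t consists of everyone infected during days t − D through t. The function names are hypothetical.

```python
import math

N_WUHAN = 11.08e6  # population of Wuhan, as stated in the text
DELAY_DAYS = 10    # D: assumed delay from infection to detection


def daily_new_infections(t, i0, growth_rate):
    """New infections on day t, with t counted in days since December 1, 2019."""
    return i0 * math.exp(growth_rate * t)


def infected_cases(t, i0, growth_rate, delay=DELAY_DAYS):
    """Cases infected during days t - delay .. t (assumed window), i.e. not yet detected."""
    first_day = max(0, t - delay)
    return sum(daily_new_infections(u, i0, growth_rate) for u in range(first_day, t + 1))


def prevalence(t, i0, growth_rate, delay=DELAY_DAYS, population=N_WUHAN):
    """Fraction of the Wuhan population that is infected and undetected on day t."""
    return infected_cases(t, i0, growth_rate, delay) / population


# Posterior medians reported in this appendix; day 52 is January 22, 2020.
xi_jan22 = prevalence(52, i0=7.28, growth_rate=0.09394)
print(f"prevalence on January 22, 2020: {xi_jan22:.2e}")
```

Changing `delay` to 6 or 14 mirrors the shorter and longer detection delays considered in the sensitivity analysis.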
(2) Mobility model based on the Poisson process
We assume that visitors to Wuhan have the same daily risk of infection as residents of Wuhan.
Risk of exportation from infected residents
Under this assumption, the rate at which infected residents of Wuhan travel to city j on day t is γj,t = ξ(t) · Wj,t, where Wj,t is the number of Wuhan’s local residents that travel to city j on day t.
Risk of exportation from infected travelers
Similarly, the rate at which travelers from city j get infected and return to their home city while still infected is ψj,t = ξ(t) · Mj,t, where Mj,t is the number of travelers from city j to Wuhan on day t. We make the simplifying assumption that newly infected visitors to Wuhan will return to their home city while still infectious.
Assuming that the introduction of cases from the epidemic origin, Wuhan, to each city j is essentially a non-homogeneous Poisson process (26–28), we calculate the probability of introducing at least one case from Wuhan to city j by day t using $1 - \exp\left(-\int_{t_0}^{t} (\gamma_{j,u} + \psi_{j,u}) \, du\right)$, where t0 denotes the start time of the studied period (i.e., December 01, 2019).
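As a hedged sketch of the mobility model, the importation probability for one city can be computed from daily prevalence and daily traveler counts. The code below assumes the standard result for a non-homogeneous Poisson process, namely that the probability of at least one event equals one minus the exponential of the negative cumulative rate, and it approximates the cumulative rate by a daily sum. The function name and the traveler counts in the usage lines are hypothetical.

```python
import math


def importation_probability(prevalence_by_day, residents_out, visitors_in):
    """Probability of at least one importation into city j over the given days.

    prevalence_by_day: xi(t) for each day
    residents_out:     W_{j,t}, Wuhan residents traveling to city j each day
    visitors_in:       M_{j,t}, travelers from city j visiting Wuhan each day
    """
    if not (len(prevalence_by_day) == len(residents_out) == len(visitors_in)):
        raise ValueError("all daily series must have the same length")
    cumulative_rate = 0.0
    for xi, w, m in zip(prevalence_by_day, residents_out, visitors_in):
        gamma = xi * w  # rate of infected Wuhan residents traveling to city j
        psi = xi * m    # rate of infected visitors returning to city j
        cumulative_rate += gamma + psi
    return 1.0 - math.exp(-cumulative_rate)


# Hypothetical three-day example: prevalence grows while travel stays constant.
p = importation_probability([1e-5, 2e-5, 4e-5], [3000, 3000, 3000], [2500, 2500, 2500])
print(f"importation probability: {p:.3f}")
```

Because the daily rates add, a city with heavy travel in both directions reaches a high importation probability even when prevalence in Wuhan is low.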
Inference of epidemic parameters
We applied a likelihood-based method to estimate model parameters (i.e., the number of initial cases i0 and the epidemic growth rate λ) from the arrival times of the first 19 reported cases that travelled from Wuhan to 11 different cities outside of China, as of January 22, 2020 (Table S1). All 19 cases were Wuhan residents. We aggregated all other cities without cases reported by January 22, 2020 into a single location (j = 0).
Let Nj denote the number of Wuhan resident cases detected in location j outside of China, and xj,i denote the arrival time of the i-th resident case detected in location j. Let xj,0 denote the start of international surveillance for travelers from Wuhan (i.e. January 01, 2020 (29)) and E denote the end of the study period (i.e., January 22, 2020). As indicated above, the daily rate of infected residents of Wuhan arriving in location j at time t is γj,t. The log-likelihood for all the 19 cases reported outside of China by January 22, 2020, is given by: $\log L = \sum_{j} \left[ \sum_{i=1}^{N_j} \log \gamma_{j,x_{j,i}} - \int_{x_{j,0}}^{E} \gamma_{j,u} \, du \right]$
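For intuition, the generic log-likelihood of a non-homogeneous Poisson process observed over a surveillance window can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: it uses the textbook form (sum of log rates at the observed arrival times minus the cumulative rate over the window), discretizes time by day, and uses hypothetical function names.

```python
import math


def location_log_likelihood(arrival_days, daily_rate, start_day, end_day):
    """Log-likelihood of the arrivals observed at one location.

    arrival_days: days on which detected cases arrived (x_{j,i})
    daily_rate:   function mapping a day u to the arrival rate gamma_{j,u}
    start_day:    start of surveillance (x_{j,0}); end_day: end of study period (E)
    """
    log_rates = sum(math.log(daily_rate(x)) for x in arrival_days)
    # Daily sum standing in for the integral of the rate over the window.
    expected_arrivals = sum(daily_rate(u) for u in range(start_day, end_day + 1))
    return log_rates - expected_arrivals


def total_log_likelihood(locations, start_day, end_day):
    """Sum over locations; each entry is a pair (arrival_days, daily_rate)."""
    return sum(
        location_log_likelihood(days, rate, start_day, end_day)
        for days, rate in locations
    )
```

A location with no detected arrivals contributes only the negative cumulative rate, which is how the aggregated location j = 0 would inform the fit.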
Parameter estimation
We directly estimate the number of initial cases on December 01, 2019, and the epidemic growth rate λ during the period between December 01, 2019 and January 22, 2020. We infer these parameters in a Bayesian framework via a Markov chain Monte Carlo (MCMC) method with Hamiltonian Monte Carlo sampling. We use non-informative flat priors with i0 ∈ [0.1, 10] and λ ∈ [0.01, 0.5]. From these, we derive the doubling time of incident cases as dT = log(2)/λ and the cumulative number of cases and of reported cases by January 22, 2020. We also derive the basic reproduction number, assuming a susceptible-exposed-infectious-recovered (SEIR) model for 2019-nCoV in which the incubation period is exponentially distributed with mean L ∈ [3, 6] and the infectious period is also exponentially distributed with mean Z ∈ [2, 7]. The reproduction number is then given by R0 = (1 + λ · L) · (1 + λ · Z).
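The two derived quantities follow directly from the formulas in the preceding paragraph. The short sketch below evaluates them at the posterior median growth rate reported in this appendix; the function names are hypothetical, and the values chosen for L and Z are simply the endpoints of the stated ranges.

```python
import math


def doubling_time(growth_rate):
    """Doubling time of incident cases: dT = log(2) / lambda."""
    return math.log(2) / growth_rate


def basic_reproduction_number(growth_rate, incubation_mean, infectious_mean):
    """R0 = (1 + lambda * L) * (1 + lambda * Z) under the SEIR assumptions in the text."""
    return (1 + growth_rate * incubation_mean) * (1 + growth_rate * infectious_mean)


growth_rate = 0.09394  # posterior median reported in this appendix
print(f"doubling time: {doubling_time(growth_rate):.2f} days")
print(f"R0 at L=3, Z=2: {basic_reproduction_number(growth_rate, 3, 2):.2f}")
print(f"R0 at L=6, Z=7: {basic_reproduction_number(growth_rate, 6, 7):.2f}")
```

The reported credible interval for R0 also reflects posterior uncertainty in λ, so it is wider than the range obtained by varying L and Z at a single growth rate.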
We estimate the case detection rate in Wuhan by taking the ratio between the number of reported cases in Wuhan by January 22, 2020 and our estimates for the number of infections occurring at least 10 days prior (i.e., by January 12, 2020). We truncate our estimate 10 days before the quarantine to account for the estimated time elapsed between infection and case detection, assuming a 5-6 day incubation period (7,16–19,30) followed by 4-5 days between symptom onset and case detection (4,21). Given the uncertainty in these estimates, we also provide estimates assuming shorter and longer delays in the lag between infection and case reporting (Table S3).
We ran 10 chains in parallel. Trace plots and convergence diagnostics confirmed the convergence of the MCMC chains, with posterior median and 95% credible interval (CrI) estimates as follows:
Epidemic growth rate λ: 0.09394 (95% CrI: 0.0777 - 0.124), corresponding to an epidemic doubling time (of incident cases) of 7.38 (95% CrI: 5.58 - 8.92) days;
Number of initial cases in Wuhan on December 1, 2019: 7.28 (95% CrI: 2.04 - 9.87);
Basic reproduction number R0: 1.96 (95% CrI: 1.52 - 2.83);
Cumulative number of infections in Wuhan by January 22, 2020: 11,213 (95% CrI: 1,590 - 57,387);
Case detection rate by January 22, 2020: 9.82% (95% CrI: 2.58% - 59.44%). This is the ratio between the 425 cases confirmed in Wuhan during this period (31) and our estimate that 4,326 (95% CrI: 715 - 16,479) cumulative infections occurred by January 12, 2020 (i.e., at least ten days prior to the quarantine to account for the typical lag between infection and case detection).
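The detection rate arithmetic can be checked directly from the figures quoted above. The sketch below divides the 425 confirmed cases by the median and by the credible interval bounds of the estimated cumulative infections by January 12, 2020; the helper name is hypothetical.

```python
CONFIRMED_CASES = 425  # cases confirmed in Wuhan, as quoted in the text


def detection_rate_percent(confirmed, estimated_infections):
    """Share of estimated infections that were confirmed, in percent."""
    return 100.0 * confirmed / estimated_infections


median_rate = detection_rate_percent(CONFIRMED_CASES, 4326)
lower_rate = detection_rate_percent(CONFIRMED_CASES, 16479)  # most infections, lowest rate
upper_rate = detection_rate_percent(CONFIRMED_CASES, 715)    # fewest infections, highest rate

print(f"{median_rate:.2f}% ({lower_rate:.2f}% - {upper_rate:.2f}%)")
# prints "9.82% (2.58% - 59.44%)"
```

The upper bound on infections maps to the lower bound on the detection rate, and vice versa, because the estimate sits in the denominator.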
Supplementary Figures and Tables
Acknowledgements
We thank Henrik Salje, Dongsheng Luo, Bo Xu, Cécile Tran Kiem, Dong Xun, and Lanfang Hu for helpful discussions. We acknowledge the financial support from NIH (U01 GM087719), the Investissement d’Avenir program, the Laboratoire d’Excellence Integrative Biology of Emerging Infectious Diseases program (Grant ANR-10-LABX-62-IBEID), European Union V.E.O project, the Open Fund of Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Land and Resources (KF-2019-04-034), and the National Natural Science Foundation of China (61773091).
Code for estimating epidemiological parameters and probabilities of case introductions, as well as aggregate mobility data are available at: https://github.com/linwangidd/2019nCoV_EID.
Aggregate data are also available in Appendix Table S3. Additional code and data requests should be addressed to Lauren Ancel Meyers (laurenmeyers{at}austin.utexas.edu, 512-471-4950).