Abstract
On January 23, 2020, China quarantined Wuhan to contain an emerging coronavirus (COVID-19). We estimated the probability of transportation of COVID-19 from Wuhan to 369 cities in China before the quarantine. The expected risk is >50% in 130 (95% CI 89–190) cities and >99% in the 4 largest metropolitan areas of China.
In December 2019, a novel coronavirus (COVID-19) emerged in Wuhan, China (1). On January 30, 2020, the World Health Organization (WHO) declared the outbreak a public health emergency of international concern (2). By January 31, 2020, 192 fatalities and 3,215 laboratory-confirmed cases were reported in Wuhan; 8,576 additional cases were spread across >300 cities in mainland China; and 127 exported cases were reported in 23 countries/states spanning Asia, Europe, Oceania, and North America. The rapid global expansion, rising fatalities, unknown animal reservoir, and evidence of person-to-person transmission potential (3,8) initially resembled the 2003 SARS epidemic and raised concerns about global spread.
On January 22, 2020, China announced a travel quarantine of Wuhan and, by January 30, had expanded the radius to include 16 cities, encompassing a population of 45 million. At the time of the quarantine, China was already 2 weeks into the 40-day Spring Festival, during which several billion people travel throughout China to celebrate the Lunar New Year (4). Considering the timing of exported COVID-19 cases reported outside of China, we estimate that only 8.95% (95% CrI 2.22%–28.72%) of persons infected in Wuhan by January 12 might have been confirmed as cases by January 22, 2020. By limiting our estimate to infections occurring ≥10 days before the quarantine, we account for an estimated 5–6-day incubation period and 4–5 days between symptom onset and case detection (3,5,6,8) (Appendix). The low detection rate, coupled with an average lag of 10 days between infection and detection (6), suggests that newly infected persons who traveled out of Wuhan just before the quarantine might have remained infectious and undetected in dozens of cities in China for days to weeks. Moreover, these silent importations might already have seeded sustained outbreaks that were not immediately apparent.
We estimated the probability of transportation of infectious COVID-19 cases from Wuhan to cities throughout China before January 23 by using a simple model of exponential growth coupled with a stochastic model of human mobility among 369 cities in China (Appendix). Given that an estimated 98% of all trips between Wuhan and other Chinese cities during this period were taken by train or car, our analysis of air, rail, and road travel data yields more granular risk estimates than possible with air passenger data alone (7).
By fitting our epidemiologic model to data on the first 19 cases reported outside of China, we estimate an epidemic doubling time of 7.31 days (95% CrI 6.26–9.66 days) and a cumulative total of 12,400 (95% CrI 3,112–58,465) infections in Wuhan by January 22, 2020 (Appendix). Both estimates are consistent with a recent epidemiologic analysis of the first 425 cases confirmed in Wuhan (8). By assuming these rates of early epidemic growth, we estimate that 130 cities in China have a ≥50% chance of having had a COVID-19 case imported from Wuhan in the 3 weeks preceding the quarantine (Figure). By January 26, 107 of these high-risk cities had reported cases and 23 had not, including 5 cities with importation probabilities >99% and populations >2 million: Bazhong, Fushun, Laibin, Ziyang, and Chuxiong. Under our lower-bound estimate of 6.26 days for the doubling time, 190 of the 369 cities lie above the 50% threshold for importation. Our risk assessment identified several cities throughout China that were likely to be harboring as yet undetected cases of COVID-19 a week after the quarantine, suggesting that early 2020 ground and rail travel seeded cases far beyond the Wuhan region under quarantine.
Our conclusions are based on several key assumptions. To design our mobility model, we used data from Tencent (https://heat.qq.com), a major social media company that hosts applications, including WeChat (≈1.13 billion active users in 2019) and QQ (≈808 million active users in 2019) (9); consequently, our model might be demographically biased by the Tencent user base. Further, considerable uncertainty regarding the lag between infection and case detection remains. Our assumption of a 10-day lag is based on early estimates for the incubation period of COVID-19 (8) and prior estimates of the lag between symptom onset and detection for SARS (10). We expect that estimates for the doubling time and incidence of COVID-19 will improve as reconstructed linelists and more granular epidemiologic data become available (Appendix). However, our key qualitative insights likely are robust to these uncertainties, including extensive pre-quarantine COVID-19 exportations throughout China and far greater case counts in Wuhan than reported before the quarantine.
Data Availability
Data will become available upon publication in a peer-reviewed journal.
Supplementary Appendix
Data
We analyzed the daily number of passengers traveling between Wuhan and 369 other cities in mainland China. We obtained mobility data from the location-based services of Tencent (https://heat.qq.com). Users permit Tencent to collect their real-time location information when they install applications such as WeChat (≈1.13 billion active users in 2019), QQ (≈808 million active users in 2019), and Tencent Map. By using the geolocation of users over time, Tencent reconstructed anonymized origin–destination mobility matrices by mode of transportation (air, road, and train) between 370 cities in China, including 368 cities in mainland China and the Special Administrative Regions of Hong Kong and Macau. The data include 28 million trips to and 32 million trips from Wuhan during December 3, 2016–January 24, 2017. We estimated daily travel volume during the 7 weeks preceding the Wuhan quarantine, December 1, 2019–January 22, 2020, by aligning the dates of the Lunar New Year, resulting in a 3-day shift. To infer the number of new infections in Wuhan per day during December 1, 2019–January 22, 2020, we used the mean daily number of passengers traveling to the top 27 foreign destinations from Wuhan during 2018–2019, which was provided in other recent studies (1–3).
Model
We considered a simple hierarchical model to describe the dynamics of 2019 novel coronavirus (COVID-19) infections, detections, and spread.
Epidemiologic Model
By using epidemiologic evidence from the first 425 cases of COVID-19 confirmed in Wuhan by January 22, 2020 (4), we made the following assumptions regarding the number of new cases, dIw(t), infected in Wuhan per day, t.
The COVID-19 epidemic was growing exponentially during December 1, 2019–January 22, 2020, as determined by the following:

$$dI_w(t) = i_0 \times \exp(\lambda \times t),$$

in which i0 denotes the number of initial cases on December 1, 2019 (5), λ denotes the epidemic growth rate during December 1, 2019–January 22, 2020, and t is measured in days since December 1, 2019.
After infection, new cases were detected with a delay of D = 10 days (6), which comprises an incubation period of 5–6 days (4,7–11) and a delay from symptom onset to detection of 4–5 days (12,13). During this 10-day interval, we labeled cases as infected. Given the uncertainty in these estimates, we also performed the estimates by assuming a shorter delay (D = 6 days) and a longer delay (D = 14 days) between infection and case detection (Appendix Table 2).
Our model can be improved by incorporating the probability distribution for the delay between infection and detection, as reconstructed linelists (14–17) and more granular epidemiologic data are becoming available.
Under these assumptions, we calculated the number of infectious cases at time t by the following:

$$I_w(t) = \int_{t-D}^{t} dI_w(u)\,du.$$

The prevalence of infectious cases is given by the following:

$$\xi(t) = \frac{I_w(t)}{N_w},$$

in which $N_w$ = 11.08 million is the population of Wuhan.
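As a minimal numerical sketch of this step, the Python code below evaluates the daily incidence, the number of infectious cases, and the prevalence. It assumes that the number of infectious cases at time t is the integral of the daily incidence over the preceding D days and that prevalence is that number divided by the population of Wuhan. The function names, the use of posterior medians as default parameters, and the clamping of the integration window at day 0 are illustrative assumptions, not part of the original analysis.

```python
import math

# Posterior medians reported in this appendix, used here as plug-in defaults.
I0 = 7.78        # initial cases on December 1, 2019
LAMBDA = 0.095   # epidemic growth rate per day
D = 10.0         # assumed delay (days) between infection and detection
N_W = 11.08e6    # population of Wuhan


def new_infections(t, i0=I0, lam=LAMBDA):
    """Daily incidence dI_w(t), with t in days since December 1, 2019."""
    return i0 * math.exp(lam * t)


def infectious_cases(t, i0=I0, lam=LAMBDA, delay=D):
    """I_w(t): closed-form integral of dI_w(u) over [t - delay, t].

    Assumption for illustration: the window is clamped at day 0,
    i.e. no infections are counted before December 1, 2019.
    """
    lower = max(t - delay, 0.0)
    return i0 * (math.exp(lam * t) - math.exp(lam * lower)) / lam


def prevalence(t, i0=I0, lam=LAMBDA, delay=D, n_w=N_W):
    """xi(t): infectious cases divided by the population of Wuhan."""
    return infectious_cases(t, i0, lam, delay) / n_w


if __name__ == "__main__":
    t = 52  # January 22, 2020
    print(f"I_w({t}) = {infectious_cases(t):.0f}, xi({t}) = {prevalence(t):.2e}")
```

Because the defaults are posterior medians rather than full posterior samples, the output only approximates the quantities reported in this appendix.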
Mobility Model
We assume that visitors to Wuhan have the same daily risk for infection as residents of Wuhan and construct a nonhomogeneous Poisson process model (18–20) to estimate the exportation of COVID-19 by residents of and travelers to Wuhan. Let Wj,t denote the number of Wuhan residents who travel to city j at time t, and Mj,t denote the number of travelers from city j to Wuhan at time t. Then, the rate at which infected residents of Wuhan travel to city j at time t is given by γj,t = ξ(t) × Wj,t, and the rate at which travelers from city j get infected in Wuhan and return to their home city while still infected is Ψj,t = ξ(t) × Mj,t. This model assumes that newly infected visitors to Wuhan will return to their home city while still infectious.
Based on this model, the probability of introducing ≥1 case from Wuhan to city j by time t is given by

$$1 - \exp\left(-\int_{t_0}^{t} \left(\gamma_{j,u} + \Psi_{j,u}\right) du\right),$$

where t0 denotes the beginning of the study period, December 1, 2019.
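The introduction probability for a single city can be illustrated with a short Python sketch. It assumes the standard result for a nonhomogeneous Poisson process, namely that the probability of at least one arrival equals 1 minus the exponential of the negative cumulative rate, and it approximates the time integral by a sum over days. The function name and all input values are hypothetical.

```python
import math


def introduction_probability(xi, residents_out, visitors_back):
    """Probability of >=1 importation into city j over the study period.

    xi[t]            : prevalence of infectious cases in Wuhan on day t
    residents_out[t] : W_{j,t}, Wuhan residents traveling to city j on day t
    visitors_back[t] : M_{j,t}, travelers from city j to Wuhan on day t

    Assumption for illustration: the integral of (gamma + Psi) is
    approximated by a daily sum.
    """
    if not (len(xi) == len(residents_out) == len(visitors_back)):
        raise ValueError("all daily series must have the same length")
    cumulative_rate = sum(
        x * (w + m) for x, w, m in zip(xi, residents_out, visitors_back)
    )
    return 1.0 - math.exp(-cumulative_rate)


if __name__ == "__main__":
    # Hypothetical inputs: constant prevalence and travel volume for 10 days.
    days = 10
    p = introduction_probability([1e-4] * days, [1000.0] * days, [500.0] * days)
    print(f"{p:.3f}")  # → 0.777
```

In this toy case the cumulative rate is 1.5 expected importations, so the probability of at least one importation is 1 − exp(−1.5).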
Inference of Epidemic Parameters
We applied a likelihood-based method to estimate our model parameters, including the number of initial cases i0 and the epidemic growth rate λ, from the arrival times of the 19 reported cases transported from Wuhan to 11 cities outside of China, as of January 22, 2020 (Appendix Table 1). All 19 cases were Wuhan residents. We aggregated all other cities without cases reported by January 22, 2020 into a single location (j = 0).
Let Nj denote the number of infected Wuhan residents who were detected in location j outside of China, χj,i denote the time at which the i-th Wuhan resident case was detected in location j, χj,0 denote the time at which international surveillance for infected travelers from Wuhan began (January 1, 2020) (21), and E denote the end of the study period (January 22, 2020). As indicated above, the rate at which infected residents of Wuhan arrive at location j at time t is γj,t. Then, the likelihood for all 19 cases reported outside of China by January 22, 2020 is given by

$$L(i_0, \lambda) = \prod_{j} \left[ \prod_{i=1}^{N_j} \gamma_{j,\chi_{j,i}} \right] \exp\left(-\int_{\chi_{j,0}}^{E} \gamma_{j,u}\,du\right),$$

which yields the following log-likelihood function:

$$\log L(i_0, \lambda) = \sum_{j} \left[ \sum_{i=1}^{N_j} \log \gamma_{j,\chi_{j,i}} - \int_{\chi_{j,0}}^{E} \gamma_{j,u}\,du \right].$$
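A log-likelihood of this kind can be sketched in Python as follows. The sketch assumes the standard nonhomogeneous Poisson process form (sum of log rates at the detection times minus the integrated rate over the surveillance window), a constant mean daily number of travelers to each location, and a daily left Riemann sum in place of the integral. The function names, the dictionary-based inputs, and the example values are hypothetical and are not taken from the published code.

```python
import math


def prevalence(t, i0, lam, delay=10.0, n_w=11.08e6):
    """xi(t) under exponential growth, with t in days since December 1, 2019."""
    lower = max(t - delay, 0.0)  # assumption: no infections before day 0
    return i0 * (math.exp(lam * t) - math.exp(lam * lower)) / (lam * n_w)


def log_likelihood(i0, lam, detections, mean_daily_travelers, start=31, end=52):
    """Log-likelihood of detection times for travelers from Wuhan.

    detections           : {location: [detection day, ...]}
    mean_daily_travelers : {location: mean passengers per day from Wuhan}
    start, end           : surveillance window in days since December 1, 2019
                           (31 = January 1, 2020; 52 = January 22, 2020)
    """
    total = 0.0
    for j, w_j in mean_daily_travelers.items():
        # Sum of log rates gamma_{j,chi} at each detection time.
        for chi in detections.get(j, []):
            total += math.log(prevalence(chi, i0, lam) * w_j)
        # Integrated rate over the window, approximated by a daily sum.
        total -= sum(prevalence(t, i0, lam) * w_j for t in range(start, end))
    return total


if __name__ == "__main__":
    # Hypothetical inputs: two locations, one of them with two detected cases.
    travel = {"A": 300.0, "B": 100.0}
    cases = {"A": [40, 48]}
    print(log_likelihood(7.78, 0.095, cases, travel))
```

Locations without detected cases contribute only the negative integrated-rate term, which is how aggregating them into a single location (j = 0) still informs the estimates.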
Parameter Estimation
We directly estimated the number of initial cases, i0, on December 1, 2019, and the epidemic growth rate, λ, during December 1, 2019–January 22, 2020. We inferred the epidemic parameters in a Bayesian framework by using a Markov chain Monte Carlo (MCMC) method with Hamiltonian Monte Carlo sampling and noninformative flat priors. From these estimates, we derived the doubling time of incident cases as dT = log(2)/λ, as well as the cumulative number of cases and of reported cases by January 22, 2020. We also derived the basic reproduction number by assuming a susceptible-exposed-infectious-recovered (SEIR) model for COVID-19, in which the incubation period is exponentially distributed with mean L in the range of 3–6 days and the infectious period is also exponentially distributed with mean Z in the range of 2–7 days. The reproduction number is then given by R0 = (1 + λ × L) × (1 + λ × Z).
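The two derived quantities can be checked with a few lines of Python. This is a plug-in sketch that uses the posterior median growth rate and the endpoints of the stated ranges for L and Z; the function names are illustrative, and the results only approximate posterior summaries computed from the full MCMC samples.

```python
import math


def doubling_time(lam):
    """Doubling time of incident cases, dT = log(2) / lambda."""
    return math.log(2) / lam


def basic_reproduction_number(lam, incubation_mean, infectious_mean):
    """R0 = (1 + lambda * L) * (1 + lambda * Z) under the SEIR assumption."""
    return (1 + lam * incubation_mean) * (1 + lam * infectious_mean)


if __name__ == "__main__":
    lam = 0.095  # posterior median growth rate per day
    print(f"{doubling_time(lam):.2f}")                    # → 7.30
    print(f"{basic_reproduction_number(lam, 3, 2):.2f}")  # → 1.53
    print(f"{basic_reproduction_number(lam, 6, 7):.2f}")  # → 2.61
```

The plug-in doubling time of about 7.30 days differs slightly from the reported posterior median of 7.31 days because the median of a transformed quantity is computed over posterior samples rather than from the median of λ.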
We estimated the case detection rate in Wuhan by taking the ratio between the number of reported cases in Wuhan by January 22, 2020 and our estimates for the number of infections occurring ≥10 days prior (i.e., by January 12, 2020). We truncated our estimate 10 days before the quarantine to account for the estimated time between infection and case detection, assuming a 5–6 day incubation period (4,7–11) followed by 4–5 days between symptom onset and case detection (12,13). Given the uncertainty in these estimates, we also provide estimates assuming shorter and longer delays in the lag between infection and case reporting (Appendix Table 3).
We ran 10 chains in parallel. Trace plots and convergence diagnostics confirmed the convergence of the MCMC chains, with posterior median and 95% CrI estimates as follows:
Epidemic growth rate, λ: 0.095 (95% CrI 0.072–0.111), corresponding to an epidemic doubling time of incident cases of 7.31 (95% CrI 6.26–9.66) days;
Number of initial cases in Wuhan on December 1, 2019: 7.78 (95% CrI 5.09–18.27);
Basic reproductive number, R0: 1.90 (95% CrI 1.47–2.59);
Cumulative number of infections in Wuhan by January 22, 2020: 12,400 (95% CrI 3,112–58,465);
Case detection rate by January 22, 2020: 8.95% (95% CrI 2.22%–28.72%). This represents the ratio between the 425 confirmed cases in Wuhan during this period (22) and our estimate that 4,747 (95% CrI 1,480–19,151) cumulative infections occurred by January 12, 2020 (i.e., ≥10 days before the quarantine to account for the typical lag between infection and case detection).
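The case detection rate follows directly from the figures above, as the short Python check below shows. The function name is illustrative; the inputs are the 425 confirmed cases and the estimated cumulative infections by January 12, 2020. The upper bound on infections gives the lower bound on the detection rate, and vice versa.

```python
def detection_rate(confirmed_cases, estimated_infections):
    """Ratio of confirmed cases to estimated cumulative infections."""
    return confirmed_cases / estimated_infections


if __name__ == "__main__":
    confirmed = 425  # confirmed cases in Wuhan by January 22, 2020
    print(f"{detection_rate(confirmed, 4747):.2%}")   # → 8.95%
    print(f"{detection_rate(confirmed, 19151):.2%}")  # → 2.22%
    print(f"{detection_rate(confirmed, 1480):.2%}")   # → 28.72%
```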
Acknowledgements
We thank Henrik Salje, Dongsheng Luo, Bo Xu, Cécile Tran Kiem, Dong Xun, and Lanfang Hu for helpful discussions. We acknowledge the financial support from NIH (U01 GM087719), the Investissement d’Avenir program, the Laboratoire d’Excellence Integrative Biology of Emerging Infectious Diseases program (Grant ANR-10-LABX-62-IBEID), European Union V.E.O project, the Open Fund of Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Land and Resources (KF-2019-04-034), and the National Natural Science Foundation of China (61773091).
Code for estimating epidemiological parameters and probabilities of case introductions, as well as aggregate mobility data, is available at https://github.com/linwangidd/2019nCoV_EID. Aggregate data are also available in Appendix Table S3. Additional code and data requests should be addressed to Lauren Ancel Meyers (laurenmeyers{at}austin.utexas.edu, 512-471-4950).