Abstract
This paper proposes a quarantine-susceptible-exposed-infectious-resistant (QSEIR) model that accounts for the unprecedented strict quarantine measures adopted across almost the whole of China to contain the epidemic. We estimated the model parameters from published information using statistical methods and stochastic simulation, and selected the parameter set that achieved the best result in simulation tests. We then used the QSEIR model to make quantitative predictions of the future development of the epidemic in mainland China under different containment strategies, focusing on the sensitivity of the outcomes to parameter choices. The main results are as follows. If the strict quarantine measures are retained, the peak number of confirmed cases would lie in the range [52,438, 64,090] and the peak date would fall between February 7 and February 19, 2020. The epidemic would be brought under control during March 18-30, 2020, and would end between August 20 and September 1, 2020. With 80% probability, our predicted peak date was 4 days ahead of the observed date and the prediction error of the peak value was 0.43%; both estimates are much closer to the observed values than those of other published studies. The sensitivity analysis indicated that quarantine measures (or vaccination) are the most effective containment strategy, followed by measures that increase the cure rate (such as finding a specific medicine). The long-term simulation results and the sensitivity analysis for mainland China showed that the QSEIR model is stable and can be empirically validated. We suggest that the QSEIR model can be applied to predict the development of the epidemic in other regions or countries. In mainland China, the quarantine measures should not be relaxed before the end of March 2020. China can fully resume production, with appropriate anti-epidemic measures, beginning in early April 2020.
The results of this study also imply that other countries now facing outbreaks should act more decisively and take timely quarantine measures, even though these may have negative short-term public and economic consequences.
Introduction
In late December 2019, a case of atypical pneumonia caused by a virus called COVID-19 was first reported and confirmed in Wuhan, China. Although the initial cases were considered to be associated with the Huanan Seafood Market, the source of COVID-19 is still unknown. The number of confirmed cases in mainland China grew exponentially, from 41 on January 10, 2020 to 5,974 on January 28, 2020, far exceeding that of the SARS epidemic in 2003 (see figure 1). By February 22, 2020 (24:00 GMT), there had been 76,936 cumulative confirmed cases of COVID-19 infection in mainland China, including 2,442 cumulative deaths and 22,888 cumulative cured cases. Of these, 64,084 cumulative confirmed cases were in Wuhan, accounting for 83.3% of the cumulative confirmed cases in mainland China. Equally of concern, a WHO news release noted that 1,400 cases had been reported in 26 countries outside China, with the Republic of Korea (346), Japan (105) and Singapore (86) ranked as the top three (figure 2), while 35 cases had been reported in the United States of America.¹
The transmissibility of COVID-19, or at least its geographical distribution (figure 2), seems to be higher and broader than initially expected (Horton, 2020). Compared with SARS-CoV (9.56% mortality) and MERS-CoV (34.4% mortality), COVID-19 appears to be less virulent at this point, except for the elderly and those with underlying health conditions (table 1). COVID-19 has been confirmed to spread by human-to-human transmission and is very contagious. The basic reproduction number R0 for COVID-19 has been estimated by WHO and several research institutes to lie in the range 1.4-6.6 (table 2). This value is slightly higher than that of the 2003 SARS epidemic, and much higher than that of influenza and Ebola. The incubation period of COVID-19 in Wuhan is 5-10 days, with a mean of 7 days (Fan et al., 2020). On average, the duration from confirmation to cure or death is 10 days in nationwide reporting (Guan et al., 2020). A long incubation period, together with a large number of patients with mild symptoms, increases the difficulty of preventing and controlling the epidemic. The risk of travel-related spread of the disease has been noted by Bogoch et al. (2020) and Cao et al. (2020a), who pointed to the potential for further regional and global spread (Leung et al., 2020).
The epidemic broke out on the eve of the Spring Festival, and large-scale population movements and gatherings aggravated it. After the outbreak, local governments adopted a series of unprecedented mitigation policies to contain the spread of the epidemic. They launched a Class I response to major public health emergencies, with positively diagnosed cases either quarantined or placed under self-quarantine at home (Gan et al., 2020). Suspected cases were confined at home under monitoring. Most exits from and entries into cities were shut down. Certain categories of contact were banned; for instance, universities and schools remained closed, as did many businesses. People were asked to remain in their homes as much as possible (Fahrion et al., 2020). These interventions reduced contacts within the population to a certain extent, helped to cut off pathways for the spread of the virus, and reduced the rate of disease transmission.
However, long-term management and control has brought considerable inconvenience to people's daily lives. The failure of factories to restart on time and run normally after the Spring Festival has also had severe effects on the Chinese and global economies. Ayittey et al. (2020) and CNN Business (2020) estimated that China's GDP would decline by 4.5% year-on-year in Q1 2020, with a loss of up to $62 billion in that quarter. Zhang (2020), Huang (2020), Li and Zhang (2020) and IMF News (2020) projected that China's GDP growth would be 5.0%-5.6% in 2020, a decrease of 0.5-1.1 percentage points from 2019. IHS Markit (2020) estimated a reduction in global real GDP of 0.8% in Q1 and 0.5% in Q2 2020, and of 0.4% for 2020 as a whole. The longer the epidemic lasts, the more negative its impacts on China and the rest of the world, with the latter effects largely centered on disruptions to increasingly complicated supply chains. Therefore, it is important to understand the dynamic evolution of the epidemic in mainland China, to estimate when it will end, and to determine how this result depends on different containment strategies. These issues have important clinical and policy implications (Joseph et al., 2020).
QSEIR Model
The traditional susceptible-exposed-infectious-resistant (SEIR) model of infectious disease dynamics has been widely used to analyze and predict the development of epidemics (see Lipsitch et al., 2003; Pastor-Satorras, 2015). The SEIR model describes the flows of people between four states: susceptible (S), exposed (E), infected (I), and resistant (R). Each variable represents the number of people in the corresponding group. Let β denote the average number of exposed cases generated by one person infected with COVID-19. The parameter β is similar to the basic reproduction number, which can be thought of as the expected number of cases directly generated by one case in a population where all individuals are susceptible to infection. Because protective measures were taken, β should be smaller than the basic reproduction numbers in table 2. Per unit time, an individual in the exposed state (type E) moves to the infected state (type I) with probability δ, and an individual in the infected state (type I) moves to the cured state (type R) with probability γ or to the death state (type F) with probability η. In contrast to the traditional SEIR model, we propose a quarantine-susceptible-exposed-infectious-resistant (QSEIR) model that accounts for the unprecedented strict quarantine measures taken in mainland China to contain the epidemic. The parameter α(t) represents the proportion of people who were not restricted to a specific area and could come into contact with the COVID-19 virus during this special period. Both α(t) and β(t) vary with the strength of the prevention and control measures. To bring the model closer to reality, we added two further parameters that are absent from the standard SEIR model, Δ(t) and θ(t): Δ(t) is the proportion of people vaccinated at time t, and θ(t) is the natural mortality rate of the population in a region at time t (figure 3).
The value of δ(t) is closely related to the incubation and infectious periods of the virus, and γ(t) depends on the level of treatment and on patients' health status. We assume an incubation period of 7 days and a duration of 10 days from confirmation to cure or death, based on nationwide information (Guan et al., 2020; Fan et al., 2020). The model is a system of ordinary differential equations, described by the following equations.
Equation (6) is specifically designed to reflect China's actual epidemic prevention measures. In the actual calculations, Δ(t) was set to 0 because no vaccine has yet been developed. θ(t) can also be set to 0 if we are concerned only with the fatality of COVID-19. The other four parameters, β(t), γ(t), δ(t) and η(t), are not easy to determine, because the incubation period, the infectious period and the case statistics on which they closely depend have varying (and unknown) degrees of accuracy. The choice of estimation techniques for the key epidemiological parameters in the QSEIR model of COVID-19 has become a research priority (Cao et al., 2020b).
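To make the compartment flows described above concrete, the following is a minimal daily-step sketch in Python. It is not the authors' MATLAB program, and it does not reproduce equations (1)-(6), which are not shown here. The contact term (β·I·S divided by the unconfined pool α·N), the class and function names, and all numeric values in the usage lines are illustrative assumptions; Δ(t) and θ(t) are set to 0, as in the text.

```python
from dataclasses import dataclass


@dataclass
class QSEIRParams:
    alpha: float  # share of the population not confined and able to contact the virus
    beta: float   # exposed cases generated per infectious person per day
    delta: float  # daily probability of moving from E to I
    gamma: float  # daily probability of moving from I to R (cured)
    eta: float    # daily probability of moving from I to F (death)


def simulate_qseir(p, population, e0, i0, r0, f0, days):
    """Daily-step sketch; returns a list of (S, E, I, R, F) tuples."""
    pool = p.alpha * population  # assumed reading of alpha: people who can still mix
    s = max(pool - e0 - i0 - r0 - f0, 0.0)
    e, i, r, f = float(e0), float(i0), float(r0), float(f0)
    history = [(s, e, i, r, f)]
    for _ in range(days):
        # Assumed mass-action contact term, capped by the remaining susceptibles.
        new_exposed = min(p.beta * i * s / pool, s)
        new_infectious = p.delta * e
        new_cured = p.gamma * i
        new_dead = p.eta * i
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_cured - new_dead
        r += new_cured
        f += new_dead
        history.append((s, e, i, r, f))
    return history


# Illustrative values only; these are NOT the estimates reported in table 5.
params = QSEIRParams(alpha=7e-5, beta=0.3, delta=1 / 7, gamma=0.08, eta=0.02)
history = simulate_qseir(params, population=1.4e9,
                         e0=1000, i0=800, r0=30, f0=20, days=300)
peak_day = max(range(len(history)), key=lambda d: history[d][2])
print("peak day:", peak_day, "peak I:", round(history[peak_day][2]))
```

In this sketch the quarantine acts only through the size of the unconfined pool, so a smaller α directly lowers the number of people who can ever be infected; the source may couple α(t) to the dynamics differently.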
Data Source
We obtained time series data on the number of COVID-19 cases in mainland China from January 10 to February 22, 2020, as released by the National Health Commission of China and the provincial health commissions.² Because testing and treatment resources were limited in the face of a major outbreak with a sudden onset, there was under-screening and under-reporting in the epicenter, Wuhan, which biased the data in the early stages of the epidemic (Cao et al., 2020b). Note that this challenge also arose in SARS and other coronavirus outbreaks (Hartley and Smith, 2003; Razum and Becher, 2003). After the isolation of Wuhan on January 23, 2020, with stricter requirements for data reporting and improved detection capacity, the data became increasingly reliable.
Parameter Estimation
We estimated the model parameters inversely from the QSEIR model using equations (8)-(12); the resulting values of β(t), γ(t), η(t) and δ(t) are listed in table 4.
From equations (1)-(6), we obtain:
Some values of δ(t) in table 4 were greater than 1, which is clearly incorrect for a probability; this was mainly due to biases in the data during the early stages (Cao et al., 2020b). In step 1, we deleted these values and calculated the average, median and variance of the remaining values of the four parameters. In step 2, we deleted values greater than 1.5 times the column average. In step 3, we recalculated the average, median and variance of the remaining values of the four parameters (see table 5). Based on table 5, we restricted each of the four parameters to the range given by its average or median ± its variance. The parameter α(t) was roughly estimated as 1.2-2.0 times the number of cumulative confirmed cases on February 22, 2020 divided by the population of mainland China.
We then drew the values of these parameters randomly from their ranges and input them into the QSEIR model to obtain E(i), I(i), R(i) and F(i) for each day i. We used the real data E0, I0, R0 and F0 from February 13 to February 22, 2020 to test the accuracy of each simulation, measured by errck as defined in equation (13).
In equation (13), for the kth simulation, we first calculated the average of the absolute differences between the real data E0, I0, R0 and F0 and the corresponding simulated values E, I, R and F, and then summed the four averages.
We ran 50,000 simulations (figure 4), and the results converged. The minimum value of errck was 20.42% (figure 4); the estimated values of the five parameters in this case are listed in the last row of table 5 and were used in the long-term simulation. With the published data currently available, these parameters therefore give the QSEIR model a simulation accuracy of about 80%.
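The three trimming steps above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the sample variance is used because the text does not say which variance was computed, and the input values are invented for the example.

```python
from statistics import mean, median, variance


def trim_and_summarize(values, hard_cap=1.0, factor=1.5):
    """Apply the three trimming steps to one parameter column."""
    # Step 1: drop impossible values (a daily probability cannot exceed 1).
    step1 = [v for v in values if v <= hard_cap]
    # Step 2: drop values above 1.5 times the average of what remains.
    cutoff = factor * mean(step1)
    kept = [v for v in step1 if v <= cutoff]
    # Step 3: summary statistics of the surviving values.
    return {
        "kept": kept,
        "mean": mean(kept),
        "median": median(kept),
        "variance": variance(kept),  # sample variance: an assumption
    }


def search_range(summary, center="mean"):
    """Range 'average or median +/- variance' used to bound the random draws."""
    c = summary[center]
    return c - summary["variance"], c + summary["variance"]


# Invented daily estimates of one parameter, for illustration only.
summary = trim_and_summarize([0.10, 0.20, 0.15, 1.30, 0.90])
low, high = search_range(summary)
print(summary["kept"], low, high)
```

Here 1.30 is removed in step 1 and 0.90 in step 2, leaving three values from which the range is built.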
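The error measure described above can be sketched as below. Equation (13) is not reproduced in the text, so this is one possible reading: each absolute difference is divided by the real value so that the result can be expressed as a percentage, as errck is reported. That normalization, the function names and the sample numbers are assumptions.

```python
def series_error(real, simulated):
    """Mean absolute relative difference between two equally long series."""
    if len(real) != len(simulated) or not real:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(s - r) / r for r, s in zip(real, simulated)) / len(real)


def total_error(real_by_state, sim_by_state, states=("E", "I", "R", "F")):
    """Sum of the per-state average errors, as described for equation (13)."""
    return sum(series_error(real_by_state[k], sim_by_state[k]) for k in states)


# Invented two-day series, for illustration only.
real = {k: [100.0, 200.0] for k in ("E", "I", "R", "F")}
sim = {k: [110.0, 180.0] for k in ("E", "I", "R", "F")}
print(total_error(real, sim))
```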
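The stochastic search described above amounts to drawing parameter sets from their ranges and keeping the one with the smallest error. A minimal sketch follows; the uniform sampling, the function name and the toy objective are assumptions, since the text only says the values were set "randomly" within their ranges.

```python
import random


def random_search(ranges, objective, n_trials, seed=0):
    """Draw parameter sets uniformly from `ranges`; keep the lowest-error one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    best_params, best_error = None, float("inf")
    for _ in range(n_trials):
        candidate = {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        error = objective(candidate)
        if error < best_error:
            best_params, best_error = candidate, error
    return best_params, best_error


# Toy objective standing in for "run the model and compute the error".
best, err = random_search({"x": (0.0, 1.0)}, lambda p: abs(p["x"] - 0.3), 2000)
print(best, err)
```

In practice the objective would run the simulation with the candidate parameters and return the error against the February 13-22 data.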
Results
We set January 23, 2020 as the start date of the simulation, with the initial values of the variables taken as of this date (table 6). Setting the simulation period D to 300 days and running the MATLAB program of the QSEIR model with the best parameters we found gives the results shown in figure 5. The results show that, with 80% probability, the peak value of I would be 58,264 on February 13, 2020. After June 19, 2020, the value of I would be < 50, and from July 29, 2020 it would be smaller than 5. By August 26, 2020, I would be smaller than 1, implying that COVID-19 would essentially have ended. From March 17, 2020, E would be < 5 and, a week later on March 24, E would be < 1, which means that from this day on the epidemic would be totally controlled and no new infections would appear. The cumulative number of confirmed cases of COVID-19 in mainland China was estimated to be 97,653, and the cumulative number of deaths was estimated to be 8,754.
Allowing for the 20% estimation error in errc, the peak value of I would lie in the range [52,438, 64,090] and the peak date would fall between February 7 and February 19, 2020. The end date would fall between August 20 and September 1, 2020. The epidemic would be totally controlled during March 18-30, 2020.
Yan et al. (2020) predicted that the peak number of confirmed cases in mainland China would be > 40,000, Hermanowicz (2020) predicted 65,000, and Li and Feng (2020) estimated 51,600. A number of studies estimated the peak number and date of confirmed cases in mainland China in the early stage of the epidemic (Batista, 2020; Gamero et al., 2020; Hermanowicz, 2020; Liu et al., 2020a; Shi et al., 2020; Xiong and Yan, 2020). However, because understanding of the new virus and its transmission mechanisms was still limited, their predicted peak dates ranged from January 14, 2020 (Yan et al., 2020) to the beginning of March 2020 (Geng et al., 2020) (see table 3). Most fell in mid-February 2020, close to the real date of February 17, 2020. The results of Wang et al. (2020b), Gamero et al. (2020), Xiong and Yan (2020), Li et al. (2020c), Hermanowicz (2020) and Shi et al. (2020) were consistent with ours, which are closer to the observed data. With 80% probability, our predicted peak date is 4 days ahead of the real date and the prediction error of the peak value is 0.43%; both estimates are much closer to the observed values than those of other published studies.
Furthermore, existing studies have seldom provided estimates of the duration of the epidemic or of the effects of different containment strategies in mainland China. At the regional level, Wu et al. (2020b) concluded that the epidemic in Guangdong province would be totally controlled by mid to late March 2020. Guangdong ranked second among China's provinces in cumulative confirmed cases, with 1,342 on February 22, 2020, or 1.74% of the cumulative confirmed cases in mainland China. The date after which no new exposed cases appear in Guangdong should be similar to that for mainland China. The result of Wu et al. (2020b) is consistent with ours.
Yang et al. (2020) estimated that the end date in Chongqing would be about May 11, 2020. The number of confirmed cases in Chongqing peaked at 41 on January 30, 2020. Cumulative confirmed cases in Chongqing numbered 573 on February 22, 2020, accounting for 0.74% of the total in China. Its end date should therefore be much earlier than that of mainland China. The end date of August 26, 2020 for mainland China in our study is therefore partly supported by Yang et al. (2020).
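As a check on the arithmetic, the reported intervals are consistent with the point estimates of the baseline run widened by ±10% for the peak value and by ±6 days for the dates. The short sketch below reproduces them; the ±10% and ±6-day half-widths are our reading of the reported numbers, not a procedure stated in the text.

```python
from datetime import date, timedelta

peak_value = 58_264  # baseline peak of I on February 13, 2020

# Assumed reading: the 20% error is split evenly on either side of the estimate.
peak_range = (round(peak_value * 0.9), round(peak_value * 1.1))


def window(day, half_width_days=6):
    """Date window of +/- half_width_days around a point estimate (assumed)."""
    step = timedelta(days=half_width_days)
    return day - step, day + step


peak_window = window(date(2020, 2, 13))     # peak date
control_window = window(date(2020, 3, 24))  # E falls below 1
end_window = window(date(2020, 8, 26))      # I falls below 1

print(peak_range, peak_window, control_window, end_window)
```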
Sensitivity analysis
How would E, I, R and F change if the values of the parameters (α(t), β(t), η(t), δ(t), γ(t)) varied, or if the start date of the simulation (January 23, 2020) changed? We conducted sensitivity analyses of each of them in turn, in terms of their impact on the I index.
Figures 6-7 and tables 7-8 show that the larger the value of β(t) or δ(t), the higher the peak value of the I index and the earlier the peak. As β(t) or δ(t) increased, its sensitivity coefficient with respect to the I index decreased progressively. The sensitivity coefficient of α(t) with respect to the I index was the largest: when α(t) increased by 0.001%, 8,596 more confirmed cases would be observed (figure 10 and table 11). These results indicate that quarantine measures (or vaccination, which is not yet available) are the most effective containment strategy for controlling the epidemic. Figures 8-9 and tables 9-10 show that the greater the value of γ(t) or η(t), the smaller the peak value of the I index. The peak date of I was not very sensitive to changes in γ(t). When γ(t) increased by 1%, confirmed cases would decrease by between 4,395 and 7,432. When η(t) decreased by 1%, 4,138 to 4,640 additional confirmed cases could be expected. The average absolute sensitivity coefficients of γ(t) and η(t) with respect to I ranked second and third among the five parameters (tables 7-11). This shows that improving the cure rate, for example by developing specific medicines, is the second most effective measure.
If the start date of the simulation is changed from January 23 to January 30 or February 6, 2020, using the values of the variables in table 12 together with the same estimated parameter values in table 5 and the QSEIR program, we obtain the main results shown in figure 11 for the simulation starting on January 30. Compared with the baseline, the peak value of the I index increased by 0.9% or 1.5%. The peak date of I or the end date of COVID-19 would be 3 days or 1 day earlier (figure 12 and table 12). These results mean that the simulation is not sensitive to the start date; the QSEIR model system is stable.
Because of the downward pressure on the economy, some enterprises have resumed work one after another in compliance with the requirements of epidemic prevention and control. As newly confirmed cases have decreased day by day since February 17, 2020 and the outbreak has gradually been brought under control, some people have begun to relax their vigilance. Some have begun to travel; some go out without masks. If the control measures were slightly relaxed from March 10, with α(t) increasing by 0.00001 from 0.00006975, which corresponds to about 14,000 more people in S, the end date would be postponed from August 26, 2020 to September 14, 2020. The date on which the epidemic is brought under control would be postponed by 70 days, to June 2, 2020. The cumulative number of confirmed cases would increase from 97,653 to 111,619, up 14.3% (figure 13). Evidence suggests that the colossal public health efforts of the Chinese Government have saved thousands of lives (Editorial, 2020). This indicates that the quarantine measures should not be relaxed before the end of March 2020 in mainland China.
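A one-at-a-time sensitivity analysis of this kind can be sketched as below. The definition used here (relative change in the output divided by the relative change in one parameter) is an assumption, since the text does not define its sensitivity coefficient; the function name and the toy model are hypothetical stand-ins for a run of the QSEIR model that returns the peak of I.

```python
def sensitivity(model, params, name, rel_change=0.01):
    """Relative change in model output per relative change in one parameter."""
    base = model(params)
    bumped = dict(params)
    bumped[name] = params[name] * (1 + rel_change)  # vary one parameter only
    return (model(bumped) - base) / base / rel_change


def toy_peak(p):
    """Toy stand-in for 'peak of I' as a function of the parameters."""
    return p["a"] ** 2 * p["b"]


baseline = {"a": 2.0, "b": 5.0}
coefficients = {name: sensitivity(toy_peak, baseline, name) for name in baseline}
print(coefficients)
```

Ranking parameters by the absolute value of such coefficients gives an ordering comparable in spirit to the one reported in tables 7-11.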
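The figures in the relaxation scenario can be checked with a few lines of arithmetic. The population of 1.4 billion is an assumption introduced for this check (the text does not state the value used); the other numbers are those reported above.

```python
from datetime import date, timedelta

population = 1.4e9        # assumed population of mainland China
alpha_increase = 0.00001  # relaxation of alpha(t) from March 10

extra_susceptible = alpha_increase * population  # about 14,000 people

baseline_cases, relaxed_cases = 97_653, 111_619
relative_increase = (relaxed_cases - baseline_cases) / baseline_cases

# Control date postponed by 70 days from the baseline of March 24, 2020.
relaxed_control_date = date(2020, 3, 24) + timedelta(days=70)
end_date_delay = (date(2020, 9, 14) - date(2020, 8, 26)).days

print(round(extra_susceptible), f"{relative_increase:.1%}",
      relaxed_control_date, end_date_delay)
```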
Conclusion
This paper proposed a QSEIR model that accounts for the unprecedented strict quarantine measures and is therefore better suited to the epidemic situation in mainland China. Parameter estimation is the most critical part of using this kind of SEIR model to predict the trend of an epidemic (Cao et al., 2020b). We estimated the parameters of the QSEIR model inversely from published information using statistical methods and stochastic simulations, and from these experiments selected the parameters that achieved the best simulation test results. The application verified that the method is effective. The paper not only predicted the peak number and peak date of confirmed cases, but also provided estimates of the sensitivity of the QSEIR parameters, the duration of the epidemic and the effects of different containment strategies. The long-term simulation results and the sensitivity analysis for mainland China showed that the QSEIR model is stable and can be empirically validated. We suggest that the QSEIR model can be applied to predict the development of the epidemic in other regions or countries.
Discussions
In the QSEIR model, the parameters change dynamically from day to day. Parameter estimation is the most important part of this kind of SEIR model (Cao et al., 2020b), and this paper has illustrated a method for generating the estimates. Given the data limitations, we estimated a constant value for each parameter, with an error of 20% in simulation tests; this was the best result among 50,000 stochastic simulations within the parameters' statistical ranges. Applying these values in prediction gave better results than existing studies. With improved data quality and more data, time-varying parameters could be estimated and the forecasting accuracy of the model could be enhanced.
The vaccine research and development cycle is relatively long: it takes about 6-18 months from product research to large-scale production and rollout. It seems that a COVID-19 vaccine cannot be deployed on a large scale before the end of August 2020.³ However, COVID-19 is now spreading more seriously in other countries and regions, and there is also the possibility of its returning to China. As of March 7, 2020, 21,110 confirmed cases of COVID-19 had been reported in 93 countries/territories/areas. Hence, it is imperative that countries with the technical resources to conduct the necessary high-level research promote the development of vaccines and specific drugs for COVID-19. Until these become available, it is most important that appropriate quarantine measures are retained. In mainland China, the quarantine measures should not be relaxed before the end of March 2020. China can fully resume production, with appropriate anti-epidemic measures, beginning in early April 2020. The results of this study also imply that other countries now facing outbreaks should act more decisively and take timely quarantine measures, even though these may have negative short-term public and economic consequences (Editorial, 2020).
Data Availability
We collated epidemiological data from publicly available data sources (news, articles, press releases, and published reports from public health agencies). All the epidemiological information that we used is documented in the article.
Contributors
Liu X L designed the QSEIR model, developed the parameter estimation method, wrote the MATLAB program, obtained the results and wrote the draft of the manuscript. Hewings G suggested the sensitivity analysis of the parameters and the estimation of the effects of different containment strategies, and edited the manuscript. Wang S Y interpreted some of the results and provided policy implications. Qin M H, Xiang X, Zheng S and Li X F collected data and references and analyzed some of the data; the four of them contributed equally to the paper.
Declaration of interests
We declare no competing interests.
Acknowledgements
This paper was supported by the 2019 Chinese Government Scholarship and National Natural Science Foundation of China under Grants No. 71874184 and No. 71988101.