Abstract
Four months into the ongoing novel coronavirus disease 2019 (COVID-19) pandemic, this work provides a simple and direct projection of the outbreak spreading potential and the pandemic cessation dates in China, Iran, the Philippines and Taiwan, using the generalized logistic model (GLM). The short-term predicted number of cumulative COVID-19 cases matched the confirmed reports of those who were infected across the four countries, suggesting GLM as a valuable tool for characterizing the transmission dynamics process and the trajectory of COVID-19 pandemic along with the impact of interventions.
Introduction
Coronavirus disease 2019 (COVID-19), caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was identified in China, in November 2019, declared to be a public health emergency of international concern in two months, and recognized as a pandemic on 11th March 2020. As of 10th April 2020, approximately 1.7 million cases of COVID-19 have been reported in 210 countries and territories, resulting in approximately 95,800 deaths. Emerging infectious diseases (EID), which appear for the first time, or that may have existed previously but is rapidly are possibly the deadliest and continue to challenge human health.
As the world races to find a vaccine or a treatment to combat the pandemic, many concerns arise about the outbreak severity, particularly the potential number of infected people. Hence, it is of a great importance to estimate the outbreak evolution using epidemiology models. Here, the epidemiological dataset of confirmed cases with COVID-19 in China, Iran, the Philippines and Taiwan, as of 10th April 2020, was analyzed, using the generalized logistic model (GLM), also known as Richards’ model. This empirical function has made many remarkable coincidences with real SARS, Zika and Ebola epidemic data for real-time prediction of outbreak development (1–4). Early assessment of the severity of infection and transmissibility can help quantify COVID-19 pandemic potential and anticipate the likely number of infected people by the end of the epidemics.
Data and methods
Specific countries were selected as denoting different COVID-19 incidence scales. China and Iran, the two major centers of COVID-19 outbreak in eastern and southern Asia, respectively with tens of thousands of cases; the Philippines, a representative of an archipelagic country with thousands of cases, and Taiwan, with only hundreds of cases. For each country, the officially reported data on COVID-19 daily cases from the onset of the outbreak to April 10th, 2020 were collected from governmental or health authorities’ websites (Appendix Table, Figure). To allow outbreak projection, the data from the early phase of the outbreak (35–40 days) were fitted with the GLM as described previously (1,5). Briefly, this approach enables the evaluation of the cumulative number of COVID-19 cases, represented by Y(t). The dynamics of Y, in period t, can be expressed as:
Where K is the upper asymptote, or the maximum cumulative case incidence, r is the intrinsic growth rate and tm is the turning point, the time where maximum number of cases per day occur. The model predicts that cumulative COVID-19 case incidence follows an S-shaped curve and gradually reaches K. Not a single new COVID-19 case emerging within 3 consecutive months defines the end of the epidemic (1).
The basic reproduction number of COVID-19, R0, the expected number of secondary cases produced by a single infection was estimated as R0 = erT, where T is the mean serial interval; the time that elapses between onset of symptoms in the primary case and onset of symptoms of the secondary case.
The model should conform to several assumptions: first, as the number of tests conducted affects the reported number of daily cases, similar number of individuals are tested daily. Secondly, cases are not imported from outside the country. Thirdly, the model does not consider human behavior and is conditional on the assumption that public gatherings are highly limited, allowing the epidemic to follow its natural course. Likewise, implementation of intervention measures, such as enhanced hygiene, isolation, contact tracing, restrictions on social contacts and migration by air or train, are maintained continuously.
Results and Discussion
Corresponding to the basic premise of GLM, the cumulative COVID-19 cases curve of each country consists of a single peak of high incidence, resulting in a sigmoid curve with a single turning point (Figure). For all localities, high correlations between observed and predicted incidence were found (R2>0.99, p value<2.2e−16). To evaluate the forecasting performance of the model, the total number of cases on April 10th were estimated based on the observed incidence during the initial stage of the pandemic. A subset data of 35 days was used for China (January 21st to February 24th 2020), Iran (February 26th to March 31st 2020) and the Philippines (February 19th to March 24th 2020). In Taiwan, due to a smaller scale of cases, dataset of 40 days (February 25th to April 4th 2020) was required for optimal fit to the model. Forecasting the total number of cases on April 10th 2020 (Figure), was done 46, 10, 17 and 6 consecutive days ahead for China, Iran, the Philippines and Taiwan, respectively. In all cases, the predicted cumulative number of cases was similar to the one officially reported (Table). The predicted number was 81797 for China, 68225 for Iran, 4430 for the Philippines and 425 for Taiwan, while the observed total number of cases was 81953, 68192, 4195 and 382, respectively. The maximum predicted cumulative incidence, K, was estimated to be 81797 for China, 97576 for Iran, 6300 for the Philippines, and 479 for Taiwan [see Table for 95% confidence interval (CI)]. The earliest time for the current COVID-19 pandemic to cease, was evaluated to occur after 73 days (April 2nd) in China, 147 days (July 21st) in Iran, 122 days (June 19th) in the Philippines, and 80 days (May 14th) in Taiwan. A mean serial interval of 5.8 days (6) was used to calculate R0, the basic reproduction number of COVID-19 infections. R0 estimates were 3.59, 1.86, 2.99 and 2.26 in China, Iran, the Philippines and Taiwan, respectively, similar to those published previously (7) and to that of SARS (8). These early stages R0 are likely to decrease with control measurements policies continuation. Indeed, predicting COVID-19 dynamics based on its initial growth phase, revealed that the turning point of each country has occurred closer to the lower limit of the 95% CI and even earlier than expected, due to the control measurements effectiveness. The turning point was estimated as day 19.4±0.27 in China, 37.1±3.6 in Iran, 47.4±11.6 in the Philippines and 31.2±1.5 in Taiwan, whereas the observed highest number of daily cases in China (14108), Iran (3186), the Philippines (538) and Taiwan (27) has occurred on day 23, 34, 42, and 23, respectively (Table).
Forecasting COVID-19 pandemic is challenging in the context of an outbreak caused by novel pathogen for which its natural history and modes of transmission are unknown. Since the GLM is trained on the existing data and is designed to fit the development of epidemic curves, rather than EID estimation only, it could provide a good fit to the limited available COVID-19 epidemiological data to characterize the transmission dynamics process and the trajectory of COVID-19 pandemic along with the impact of interventions (9). This prediction is conditional to intervention measures continuation. Changes of the current policies or human behavior may affect the actual contact rate and the subsequent development of the epidemic. Additionally, testing kits deficiency can lead to poor diagnosis and incomplete data, which may implicate the model robustness. Even so, forecasting epidemic size and peak time may help clarify what the future holds, and could be useful to make long-term strategic decisions regarding the distribution of testing and treatment facilities that may be required, and may be helpful to assess the extent of protective and medical al equipment needed for the near future.
Data Availability
See Appendix Table
Acknowledgments
We thank the Adelis Foundation, Leona M. and Harry B. Helmsley Charitable Trust, Jeanne and Joseph Nissim Foundation for Life Sciences, Tom and Sondra Rykoff Family Foundation Research and the Raymond Burton Plant Genome Research Fund for supporting the AA lab activity. A.A. is the incumbent of the Peter J. Cohn Professorial Chair. The authors report no funding related to this research and no competing interests.