Predicting Trends of Coronavirus Disease (COVID-19) Using SIRD and Gaussian-SIRD Models ======================================================================================= * Ahmad Sedaghat * Amir Mosavi ## Abstract Eruption of COVID-19 patients in 215 countries worldwide have urged for robust predictive methods that can detect as early as possible size and duration of the contagious disease and also providing precision predictions. In many recent literatures reported on COVID-19, one or more essential parts of such investigation were missed. One of crucial elements for any predictive method is that such methods should fit simultaneously as many data as possible; these data could be total infected cases, daily hospitalized cases, cumulative recovered cases and deceased cases and so on. Other crucial elements include sensitivity and precision of such predictive methods on amount of data as the contagious disease evolved day by day. To show importance of these aspects, we have evaluated the standard SIRD model and a newly introduced Gaussian-SIRD model on development of COVID-19 in Kuwait. It is observed that SIRD model quickly pick up main trends of COVID-19 development; but Gaussian-SIRD model provides precise prediction at longer period of time. Keywords * Coronavirus disease * COVID-19 * outbreak model * Gaussian-SIRD model * SIRD model * epidemiological model ## I. Introduction COVID-19 outbreak was first reported as contiguous disease created by novel corona virus in Wuhan, China in late December 2019. Soon after, the world has faced with the most rapidly transmitting pandemic disease which halted our normal way of living with 11,312,265 infected and 531,257 deceased cases worldwide on 5th July 2020 [1]. Quick and precise prediction of endemic/pandemic contagious diseases dynamics is very important for health authorities and governments to tackle the disease in the most efficient and economical way. COVID-19 have caused many businesses broken down and several million people worldwide out of their jobs. SIR model is well known and widely used method introduced by Kermack andMcKendrick [2] in epidemiological studies. SIR method is based on 3-set of ordinary differential equations expressing susceptible, infected, and removed populations in a community with constant total population. SIR model assumes that sex, age, social behaviour, and similar factors has no effects on development of a contagious disease. Exact solutions to SIR model may improve predictive capabilities of this method by employing optimization techniques. However, there are a limited exact solution to SIR model in simplified forms. Bailey [3] provided a simplified form of SIR equations. Bohner et al. [4] reported exact solution to Bailey’s model of SIR equations. In his work, recovered population is not considered in susceptible and infection equations. Harko et al. [5] considered birth and death rates in an exact solution to SIR model; however, the validity of their method was not evaluated against actual epidemic data. Maliki [6] and Shabbir et al. [7] reported separately exact solutions to some simplified SIS and SIR equations. These simplified models do not include the removed populations and only consider susceptible and infected populations. In the present work, SIRD model consist of susceptible, infected, recovered, and deceased populations correspond to 4-set of ordinary differential equations (ODE) are considered. In the new Gaussian-SIRD model, a Gaussian function is assumed for infected population; hence, 4-set of explicit analytical solutions are found for SIRD populations. MATLAB optimization are applied to fit the SIRD and Gaussian-SIRD models with COVID-19 data in Kuwait. Goodness of fit functions for both methods were evaluated using the coefficient of determination (R2). Sensitivity and precision of both methods are compared for COVID-19 development in Kuwait and advantageous and pitfalls of each method are discussed and conclusions of this study are drawn. ## II. Materials and methdos To tackle a pandemic, correct evaluation of transmission and growth rates of disease and affected populations are essential. SIR model is the widely used method in epidemiology studies. In SIR type models, total population size is considered constant during an endemic/pandemic. SIR model does not consider human factors including sex of patients, age and social behaviour of patients, and also location of infected cases. These are shortcomings of SIR model for predicting pandemics such as COVID-19 for which strong dependency observed in individual age or sex and social behaviour [8, 9]. ### A. SIRD Model A modified version of SIR model to include deceased population referred as SIRD model here. SIRD model consist of 4-set of ordinary differential equations on susceptible, infected, recovered, and deceased population as follows [10, 11]: ![Formula][1] ![Formula][2] ![Formula][3] ![Formula][4] It is assumed that initial total population (*N*) is constant during an endemic/pandemic: ![Formula][5] In equations (1-5), *β* is the transmission rate, *γ**R* is the recovery rate, *γ**D* is the death rate in an endemic/pandemic. *γ* (= *γ**R* + *γ**D*) expresses the removing rate of recovered and deceased populations from susceptible population. An important factor in studying epidemiological field is called the reproduction number *R* (= *β*/*γ*), which indicates severity of an outbreak. If *R* value is larger than one then it is expected that infection will be spread in susceptible population. The larger *R* will indicate harder to control of spread of infection. Observing equations (1-5), an analytical solution to equation (2) for infected population will be sufficient to determine explicit solutions for all other populations in equations (1-5). In the next section, we introduce an analytical solution to SIRD model assuming a Gaussian distribution for infected populations. ### B. Gaussian-SIRD Model Normal or Gaussian distribution function is widely used in statistics to deal with continuous probability of real value of a random variable [12]. The normal probability density function is often used in studying natural and social subjects when real value of the random variables is not known. Gaussian distribution function of two parameter family (*μ, σ*) may be reformulated for infected population of a pandemic as follows [13]: ![Formula][6] In equation (6), *I*(*t*) is the Gaussian normal distribution function of daily infected population, *N* is the total population, *μ* is the mean (or expectation) of the distribution and *σ* is the standard deviation. The *μ*, and *σ* are model parameters and can be obtained by fitting Gaussian equation (6) with infectious population from a pandemic data. Exact solutions to recovered (R) and deceased (D) is simply obtained by plugging equation (6) into equations (3) and (4) to obtain cumulative solution as follows [13]: ![Formula][7] ![Formula][8] In equations (6) and (7), the error function erf(*x*) is a special function and is evaluated numerically. Substituting equation (6) into equation (1), one may obtain a solution for susceptible population as follows: ![Formula][9] ![Formula][10] By changing the total susceptible population (N) in SIRD model, the recovered and deceased will be massively changed. To fine-tune expected recovered and deceased cases, we propose to use variable recovery rate (*γ**R*) and deceased rate (*γ**D*) as follows: ![Formula][11] ![Formula][12] The power factor (n) is any number and can be adequately fine-tuned to best fit the recovered and deceased populations data. The above set of analytical equations (6-8, 11) provide a simple analytical solution to an endemic/pandemic data. We have applied optimization to get best value of coefficients in SIRD and Gaussian-SIRD models. MATLAB algorithm (lsqcurvefit) [14] was applied to find best coefficients in SIRD model equations (1-5). Also, MATLAB algorithm (fminsearch) [15] was used to find best fit coefficients in Gaussian-SIRD model equations (6-8, 11). The goodness of fitted COVID-19 data are examined using the coefficient of determination (R2) [16]. ### C. Evaluation on goodness of fit: Regression coefficient Regression coefficient (R2) is a statistical measure to compare predicted values (*y*) from SIRD or Gaussian-SIRD models here against actual data (*x*) for COVID-19. The evaluation of R2 is done separately for each population: susceptible, infected, recovered, and death as follows [16]: ![Formula][13] ![Graphic][14] in equation (14) is the average of actual COVID-19 data values. Bette fit functions indicated the regression coefficient (R2) close to unity. ### D. Sensitivity analysis To measure sensitivity of a predictive method, a comparison of actual data values (*x*) and predicted values (*y*) can be expressed in percentage of error as follows [16]: ![Formula][15] Error percentage values closer to zero provides a guide on sensitivity and precisions of these methods. ## III. Results We have investigated sensitivity of SIRD and Gaussian-SIRD models on predicting population size and peak day of infectious using 20, 40, 60, 80, 100, and 116 subsequent days of COVID-19 data in Kuwait [17]. Results of optimized SIRD [18] and Gaussian-SIRD models are presented and compared in Fig. 1 for susceptible, infected, recovered, and deceased populations. ![](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/30/2020.11.29.20240499/F1/graphic-15.medium.gif) [](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20240499/F1/graphic-15) ![](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/30/2020.11.29.20240499/F1/graphic-16.medium.gif) [](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20240499/F1/graphic-16) ![](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/30/2020.11.29.20240499/F1/graphic-17.medium.gif) [](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20240499/F1/graphic-17) Figure 1. SIRD and Gaussian-SIRD model results for 20, 40, 60, 80, 100, and 116 subsequent days using optimized coefficients for fitting COVID-19 in Kuwait (18 June 2020). As seen in Figs.1a-1d, the peak day of infection is overpredicted by Gaussian-SIRD model. Gaussian-SIRD model cannot predicts correctly the peak of infection day until 60 days after outbreak. Figs. 1a-1h indicate that SIRD model have started from an overpredicted size of active infected cases until it merges to smaller size; whilst Gaussian-SIRD behaviour is opposite started from small population size to more realistic size after 80 days. Figs. 1a-1j also indicate that SIRD model have predicted a nearly fixed population size for recovered cases; but Gaussian-SIRD model overpredicted the recovered cases until 100 days from the outbreak. Figs. 1k-1l, shows that both methods have nearly predicated same results after 116 days from outbreak; although better fit is observed using Gaussian-SIRD model to COVID-19 data. Table 1 shows accuracy and optimized coefficients of SIRD and Gaussian-SIRD equations for 20, 40, 60, 80, 100, and 116 consequent days of COVID-19 data in Kuwait. COVID-19 data were obtained from the ministry of health (MOH) web-link in Kuwait. View this table: [TABLE 1.](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20240499/T1) TABLE 1. Optimized parameters and regression coefficients obtained for SIRD and Gaussian-SIRD models using 20, 40, 60, 80, 100, and 116 subsequent days of COVID-19 in Kuwait (18 June 2020). From Table 1, it is observed that for 20 days data SIRD model cannot provide any meaningful value for regression coefficient; however, Gausian-SIRD model cannot provide only for death cases (zero case data). Also, in SIRD model, the total population of N=50,000 is used; but Gaussian-SIRD model requires larger total population size of N=700,000 to properly fit COVID-19 data. As seen in Table 1, SIRD model gives a large re-production number at start of the pandemic; therefore, it is a best candidate for detecting severity of a pandemic. Gaussian-SIRD method is somehow insensitive in terms of detecting severity of the pandemic and cannot be a good indicator. It gives the re-production number greater than one after 100 days of outbreak. Figure 2 compares results of SIRD and Gaussian-SIRD models on predicting peak day of infection of COVID-19 in Kuwait. ![](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/30/2020.11.29.20240499/F2/graphic-19.medium.gif) [](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20240499/F2/graphic-19) ![](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/30/2020.11.29.20240499/F2/graphic-20.medium.gif) [](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20240499/F2/graphic-20) Figure 2. Comparison of SIRD and Gaussian-SIRD models on precision and sensitivity for predicting peak day of infection using 20, 40, 60, 80, 100, and 116 subsequent days of COVID-19 data in Kuwait (18 June 2020). Fig. 2a shows that SIRD model started with low susceptible, high active infected, low recovered, high deceased and high total infected cases using 20 days of COVID-19 data. The values are gradually moderated as days of pandemic passed by. All high values are giving true warning on spread of disease; although not realistic. In contrast, Gaussian-SIRD model in Fig. 2b started with least alarming results except with large number of total infected cases with zero death cases on peak day of infection. Fig. 2c shows that how SIRD model predictions are moderated gradually by adding more COVID-19 data, whilst Fig. 2d shows that Gaussian-SIRD model have picked up true dynamics of the pandemic after 60 days. Fig. 2e indicates that predicted SIRD populations are widely changes as time evolved; however, Fig. 2f indicate that Gaussian-SIRD model provide more precise prediction on population size after 60 days. Fig. 2g indicates that percentage error of SIRD model remain high for deceased population except after 116 days of data; although other population size prediction errors are not converging to zero after 116 days of data. This is maybe due to high sensitivity of SIRD model to data values than Gaussian-SIRD model (see Fig. 2h). The Gaussian-SIRD model converged to error values below 10% except for deceased population after 116 days. Table 2 compares percentage error on prediction of population sizes on peak day of infection using both methods. Gaussian-SIRD model offers more precise fit after 116 days of COVID-19; yet too late for any practical use. View this table: [TABLE 2.](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20240499/T2) TABLE 2. Sensitivity of SIRD and Gaussian-SIRD models on predicting precise population sizes on peak day of infection as COVID-19 evolved on 20, 40, 60, 80, 100, and 116 subsequent days of the pandemic in Kuwait (18 June 2020). ## IV. Conclutions Construction of analytical solutions to SIR type models may provide quick and precise guide on development of endemic/pandemic diseases to assist health organization and policy makers on controlling crisis. In this paper, a new method is introduced to integrate Gaussian distribution function within SIRD model to study endemic/pandemic data. Gaussian distribution function is a widely used statistical tool in natural and social sciences. Sensitivity and precision of both SIRD and Gaussian-SIRD methods are investigated here using COVID-19 data in Kuwait and the following conclusions are drawn: * Accuracy trends of SIRD model using 20, 40, 60, 80, 100, and 116 days of COVOD-19 data in Kuwait are assessed and it is found that sensitivity of SIRD model is high which gives quickly right alarm even with 20 days of data. * Gaussian-SIRD model requires a large susceptible population to accurately track development of COVID-19; yet the sensitivity of model is very low to be considered as a quick tool for detecting alarming condition. * SIRD model predictions on size of exposed population is not very accurate and error percentage remain high even after 116 days of COVID-19 outbreaks. * Gaussian-SIRD model showed some merits in terms of precision provided below 10% error on population sizes after 116 days of COVID-19 outbreak. Integrating suitable probability density functions within SIR type models provide analytical solutions to epidemic/pandemic situations. These analytical solutions can be easily optimized to fit epidemic/pandemic data to provide fast and robust predictions.. ## Data Availability Data Online available here: [https://www.worldometers.info/coronavirus/](https://www.worldometers.info/coronavirus/) [https://www.worldometers.info/coronavirus/](https://www.worldometers.info/coronavirus/) [https://www.worldometers.info](https://www.worldometers.info) * Received November 29, 2020. * Revision received November 29, 2020. * Accepted November 30, 2020. * © 2020, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/) ## References 1. [1].Worldometer, [https://www.worldometers.info/world-population/kuwait-population/](https://www.worldometers.info/world-population/kuwait-population/), Accessed 15 June 2020. 2. [2].Kermack, W. O. and McKendrick, A. G. “A Contribution to the Mathematical Theory of Epidemics.” Proc. Roy. Soc. Lond. A 115, 700–721, 1927. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1098/rspa.1927.0118&link_type=DOI) 3. [3]. N.T.J. Bailey, The Mathematical Theory of Infectious Diseases and Its Applications, second ed., Hafner Press [Macmillan Publishing Co., Inc.] New York, 1975, p. xvi+413. 4. [4]. Martin Bohner, Sabrina Streipert, Delfim F.M. Torres, Exact solution to a dynamic SIR model, Nonlinear Analysis: Hybrid Systems, Volume 32, 2019, Pages 228–238. 5. [5]. Tiberiu Harko, Francisco S.N. Lobo, M.K. Mak, Exact analytical solutions of the Susceptible-Infected-Recovered (SIR) epidemic model and of the SIR model with equal death and birth rates, Applied Mathematics and Computation, Volume 236, 2014, Pages 184–194. 6. [6]. S O Maliki, Analysis of Numerical and Exact solutions of certain SIR and SIS Epidemic models, Journal of Mathematical Modelling and Application, 2011, Vol. 1, No. 4, 51–56. 7. [7]. G. Shabbir, H. Khan, and M. A. Sadiq, A note on Exact solution of SIR and SIS epidemic models, arxiv:1012.5035, 22 Dec 2010. 8. [8]. Weston C. Roda, Marie B. Varughese, Donglin Han, Michael Y. Li, Why is it difficult to accurately predict the COVID-19 epidemic? Infectious Disease Modelling, Volume 5, 2020, Pages 271–281. 9. [9]. Lin Jia, Kewen Li, Yu Jiang, Xin Guo, Ting zhao, Prediction and analysis of Coronavirus Disease 2019, arxiv:2003.05447v2, arXiv 2020. 10. [10].Jones, D. S. and Sleeman, B. D. Ch. 14 in Differential Equations and Mathematical Biology. London: Allen & Unwin, 1983. 11. [11].Hethcote H (2000). “The Mathematics of Infectious Diseases”. SIAM Review. 42 (4): 599–653. Bibcode:2000, SIAMR.42.599H. doi:10.1137/s0036144500371907. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1137/s0036144500371907&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=WOS:00016567&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F11%2F30%2F2020.11.29.20240499.atom) 12. [12].Amari, Shun-ichi; Nagaoka Hiroshi, (2000). Methods of Information Geometry. Oxford University Press. ISBN 978-0-8218-0531-2. 13. [13].Bryc, Wlodzimierz (1995). The Normal Distribution: Characterizations with Applications. Springer-Verlag. ISBN 978-0-387-97990-8. 14. [14].MathWorks, [https://www.mathworks.com/help/optim/ug/lsqcurvefit.html](https://www.mathworks.com/help/optim/ug/lsqcurvefit.html), Accessed on 18 June 2020. 15. [15].MathWorks, [https://www.mathworks.com/help/matlab/ref/fminsearch.html](https://www.mathworks.com/help/matlab/ref/fminsearch.html), Accessed on 18 June 2020. 16. [16].Devore, Jay L. (2011). Probability and Statistics for Engineering and the Sciences (8th ed.). Boston, MA: Cengage Learning. pp. 508–510. ISBN 978-0-538-73352-6. 17. [17].COVID-19 updates, State of Kuwait live, [https://corona.e.gov.kw/En/](https://corona.e.gov.kw/En/), Accessed 27 May 2020. 18. [18]. Ahmad Sedaghat, Negar Mostafaeipour, Seyed Amir Abbas Oloomi, Prediction of COVID-19 Dynamics in Kuwait using SIRD Model, Integrative Journal of Medical Sciences, [https://doi.org/10.15342/ijms.7.170](https://doi.org/10.15342/ijms.7.170), June 2020. [1]: /embed/graphic-1.gif [2]: /embed/graphic-2.gif [3]: /embed/graphic-3.gif [4]: /embed/graphic-4.gif [5]: /embed/graphic-5.gif [6]: /embed/graphic-6.gif [7]: /embed/graphic-7.gif [8]: /embed/graphic-8.gif [9]: /embed/graphic-9.gif [10]: /embed/graphic-10.gif [11]: /embed/graphic-11.gif [12]: /embed/graphic-12.gif [13]: /embed/graphic-13.gif [14]: /embed/inline-graphic-1.gif [15]: /embed/graphic-14.gif