Regression Models for Predictions of COVID-19 New Cases and New Deaths Based on May/June Data in Ethiopia ========================================================================================================= * Alemayehu Siffir Argawu ## Abstract As the 15 of June 2020, we have 7,984,067 total COVID-19 cases, globally and 435,181 total deaths. Ethiopia was ranked 2nd and 15th in the table by 176 new cases and by 3,521 total new cases from African countries. Then, this study aimed to predict COVID-19 new cases and new deaths based on May/June data in Ethiopia using regression model. In this study, I used Pearson’s correlation analysis and the linear regression model to predict COVID-19 new cases and new deaths based on the available data from 12th May to 10th June 2020 in Ethiopia. There was a significant positive correlation between COVID-19 new cases and new deaths with different related variables. In the regression models, the simple linear regression model was a better fit the data of COVID-19 new cases and new deaths than as compared with quadratic and cubic regression models. In the multiple linear regression model, variables such as the number of days, the number of new laboratory tests, and the number of new cases from AA city significantly predicted the COVID-19 new cases. In this model, the number of days and new recoveries significantly predicted new deaths of COVID-19. The number of days, daily laboratory tests, and new cases from Addis Ababa city significantly predicted new COVID-19 cases, and the number of days and new recoveries significantly predicted new deaths from COVID-19. According to this analysis, if strong preventions and action are not taken in the country, the predicted values of COVID-19 new cases and new deaths will be 590 and 12 after two months (after 9th of August) from now, respectively. The researcher recommended that the Ethiopia government, Ministry of Health and Addis Ababa city administrative should give more awareness and protections for societies, and they should also open more COVID-19 laboratory testing centers. Generally, the obtained results of this study may help Ethiopian decision-makers put short-term future plans to face this epidemic. Keywords * COVID-19 Cases * Days * Tests * Correlation * Regression Model ## 1. Introduction Corona virus disease (COVID-19) is an infectious disease that is caused by severe acute respiratory syndrome known as corona virus. The virus was first identified on 31 December 2019 in the city of Wuhan, which is the capital of Hubei Province in China. Some of the common signs of COVID-19 include fever, shortness of breath and dry coughs. Other uncommon symptoms include muscle pain, mild diarrhea, abdominal pain, sputum production, loss of smell, and sore throat. On 11 of March 2020, the WHO announced that it was a global pandemic [1–5]. On 15th of June 2020, as the Worldometer corona virus updates information reported that we have 8,028,253 total COVID-19 cases, 436,276 total deaths with 5.4% fatality rate and 4,148,128 total recovered with 51.7% recovered rate as globally. It was distributed from highest to lowest ranks of the new cases and deaths by the World Regions as follows: North America has led by 2,480,701 total new cases (1st) and 144,979 total deaths (2nd), Europe has 2,398,779 total new cases (2nd) and 188,001 total deaths (1st), Asia has 1,616,962 total new cases (3rd) and 40,248 total deaths (4th), South America has 1,425,696 total new cases (4th) and 60,457 total deaths (3rd), Africa has 244,578 total new cases (5th) and 6,490 total deaths (5th), and the last in both ranks is Oceania, which has 8,931 cases and 124 deaths. In the report, the male and female cases were 71% and 29%, respectively [6]. On this date, Ethiopia was ranked as the 2nd, 15th, 16th and 23rd on the table by 176 new cases, by 3,521 total COVID-19 cases, by 60 total deaths (1.7%) and by 620 total recovered (17.6%) as compared from African countries as the Worldometer corona virus updates information showed [6]. The report also showed that Ethiopia was listed on the 27th place by the capacity of COVID-19 laboratory tests. It was 1,629 tests per 1,000,000 populations. This is bad news for Ethiopia. Currently, the Ethiopian population is near 115 million. This is the fact that Ethiopia has a very low proportion of COVID-19 laboratory tests compared with other countries tests. This report indicated that Ethiopia needs increasing efforts and strategies to increase daily laboratory tests. Otherwise, she will be the next ‘‘African USA’’. Then, this study aimed to achieve the following objectives: 1. To predict COVID-19 new cases based on the number of days, and daily laboratory tests using a regression model. 2. To predict new deaths based on the number of days, new cases, and new recovered using a regression model. ## 2. Method ### 2.1. Data The COVID-19 new case report data were collected from Ethiopia Ministry of Health and Ethiopian Public Heath Institution reports from their face book and telegram pages [7, 8]. 14th of March 2020 was the first date that COVID-19 was confirmed in Ethiopia. The time period of data was from the 12th of May to the 10th of June 2020 (for the last 30 days). The data included the total number of new cases, date of recorded, number of new total COVID-19 cases, number of new deaths, number of new recoveries, persons who have contacted infected cases, number of male total COVID-19 cases, number of new cases from AA city, and others. In this study, I used Pearson’s correlation analysis and the linear regression model to predict COVID-19 new cases based on the available data from 12th May to 10th June 2020 in Ethiopia. ### 2.2. Regression Model Regression analysis is a statistical technique for investigating and modeling the relationship between variables. Applications of regression are numerous and occur in almost every field, including engineering, life and biological sciences, and other sciences. In fact, regression analysis may be the most widely used statistical technique. The regression model has many variants such as linear regression, polynomial regression, and others [9]. ### 2.2.1. Linear Regression Model The fitted multiple linear regression model was used to determine the most predicator variables for COVID-19 new cases and new deaths from 12th May to 10th June 2020 in Ethiopia. ![Formula][1] Where: Y is COVID-19 new cases or new deaths. And X1, X2, and X3 are independent variables with their corresponding coefficients (b1, b2, and b3). And, b is the intercept coefficient in the model. ### 2.2.2. Polynomial Regression Model The fitted cubic regression model was used to estimate the parameters of independent variables as X, X2, and X3. All the estimated parameters (b1, b2, and b3) shown the change of Y when the independent variable changed from x to x+1. ![Formula][2] Where: Y is COVID-19 new cases or new deaths. And X is an independent variable with corresponding regression coefficient b(s). And, b is the intercept coefficient in the model. ## 3. Result ### 3.1. Frequency Statistics of COVID-19 Cases #### Total number of COVID-19 laboratory tests The total numbers of COVID-19 laboratory tests were 158,521 and 121,938 from 14th of March to 10th of June and 12th of May to 10th of June, respectively. This indicates that the testing rate was increased to 77%. The newly reported COVID-19 laboratory tests peaked at 6,092 on June 07, 2020. #### Prevalence of COVID-19 cases The total COVID-19 cases were 2,506 and 2,257 from 14th of March to 10th of June and from 12th of May to 10 of June, respectively. This indicates that new cases from 12th May to 10th June (duration of 30 days) were *increased by 10 times* from 14th March to 11th May (duration of 60 days). Thus, the prevalence of COVID-19 new cases increased from 158 to 185 per 10,000 laboratory tests per day from 12th May to 10th June in Ethiopia, 2020. The new cases peaked at 190 cases on June 09, 2020. #### Death rate The crude mortality or death rate was 1.4% (35) and 1.3% (30) from 14 to 10 June and from 12 May to 10 June, respectively. This showed that the death of COVID-19 was *6 times greater from* 12th May to 10th June than from 14 March to 11 May in Ethiopia, 2020. New deaths peaked at 7 deaths on June 7th in 2020. #### Recovery rate The recovery rates were 16% (401) and 13% (294) from 14 March to 10 June and from 12 May to 10 June, respectively. In addition, 73.3% of the total recovered cases were reported from 12th May to 10th June in Ethiopia, 2020. View this table: [Table 1.](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/T1) Table 1. Percentage and rates of COVID-19 cases from 14th of March to 10 of June and from 12th May to 10th June in Ethiopia, 2020 ### 3.2. COVID-19 New Cases by Regions and Genders Table 2 below illustrates the percentages of COVID-19 cases by gender and region from 12th May to 10th June (for the last 30 days) in Ethiopia, 2020. From the total number of 2,257 COVID-19 new cases, the majorities (64%) were males and 36% of them were females. This indicated that male group was infected less in Ethiopia as compared with world male cases was 71% [6]. Addis Ababa city has covered the majority (74%) of the Pandemic. The Somali region (7%) has taken the 2nd highest coverage of the virus. Oromia and Amhara regions have equally shared the COVID-19 new cases (each 6%). However, Tigray and other regions (such as SNNP, Afar, Harare, etc.) have taken only 3% of COVID-19 new cases by each. In addition, foreign natives have received only 1% of the distribution. From the history of infected cases, patients’ contacts and travel histories not known accounted for the majority (54%, 594) of the COVOD-19 total COVID-19 cases. The patients’ travel history from abroad and contacts with other new cases covered 26% (283) and 20% (218), respectively. All these histories of the cases were considered from May 12th to June 2nd, 2020. View this table: [Table 2.](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/T2) Table 2. Prevalence of COVID-19 new cases by gender and region from 12th May to 10th June (for the last 30 days) in Ethiopia, 2020 ### 3.3. Descriptive Statistics of COVID-19 Cases Table 3 below demonstrates the descriptive statistics of the COVID-19 new cases from 12th May to 10th June in Ethiopia, 2020. View this table: [Table 3.](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/T3) Table 3. Descriptive statistics of the COVID-19 cases from 12th May to 10th June (for the last 30 days) in Ethiopia, 2020 #### Daily laboratory tests The average value of COVID-19 conducted laboratory tests was 4,065 per day with its min (1,775) and max (6,187) in the given duration. #### COVID-19 new cases The average value of COVID-19 new cases was 75 per day, with minimum and maximum values of 2 and 190, respectively. AA city had recorded the highest COVID-19 new cases (56) per day in the given duration. The average values of the minimum and maximum ages of COVID-19 new cases were 9.4 years and 71 years with their smallest and largest ages of 1 month and 115 years, respectively. ### 3.4. Correlation Analysis for COVID-19 New Cases The correlation analysis showed that there were significantly positive correlations between COVID-19 new cases and the number of days, daily laboratory tests, new cases of males, new cases of females, new cases from AA city, new cases from foreign natives, new cases with unknown contact and travel histories, and new cases with contact with infected persons (table 4). View this table: [Table 4.](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/T4) Table 4. Correlation analysis between COVID-19 New Cases and related variable ### 3.5. Correlation Analysis for New Deaths Additionally, the correlation analysis for new deaths showed that there were significantly positive correlations between the number of new deaths and the number of days, COVID-19 new cases, daily laboratory tests, new cases from AA city, new recoveries, new cases of males, new cases of females and maximum age of new cases. However, there was a significantly negative correlation between new deaths and the minimum age of new cases (Table 5). View this table: [Table 5.](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/T5) Table 5. Correlation analysis for the relationship b/n New Deaths of COVID19 and related variables ### 3.6. Regression Model for COVID-19 New Cases Table 6 showed that the linear regression model has the highest F-value (120.7) and the smallest MSE value (637.4) as compared with quadratic and cubic models. And, the number of days was significant predictor for new cases in the linear regression model (p-value of 0.000). But, this variable and its two expressions were not predictors in the quadratic and cubic regression model. View this table: [Table 6.](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/T6) Table 6. Regression model summary and parameter estimates to predict COVID-19 New Cases View this table: [Table 6.](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/T7) Table 6. Regression model summary and parameter estimates to predict New Deaths Thus, the fitted linear regression model was much better than the quadratic and cubic models. However, the three models have a similar R square values as 81% and 82% variations of COVID-19 new cases was explained by the models (figure 1). Then, the estimated linear regression equation is ![Formula][3] ![Figure 1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/09/07/2020.09.04.20188094/F1.medium.gif) [Figure 1:](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/F1) Figure 1: Simple line graph of COVID-19 new cases by the number of days This implied that the new cases will be increased to 585 after 100 days. And the table also showed that a daily laboratory test was significant predictor for new cases in the linear regression model (p-value of 0.000). But, this variable and its two expressions were not predictors in the quadratic and cubic regression model. And, the fitted linear regression model has the highest F-value (19.5) and but not the smallest MSE value (1993.3) as compared with quadratic and cubic models. Thus, the fitted linear regression model was much better than the quadratic and cubic models. However, the cubic regression model has a better R square value as 46% variations of COVID-19 new cases was explained by the model. And, the linear regression model explained 41% of the variations (figure 2). From the table, the fitted linear regression equation is ![Formula][4] ![Figure 2:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/09/07/2020.09.04.20188094/F2.medium.gif) [Figure 2:](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/F2) Figure 2: Simple line graph of COVID-19 new cases by daily laboratory tests This reported that the new cases will be raised to 3,400 if 100,000 laboratory tests were conducted daily. Similarly, new case from Addis Ababa city was predicted significantly the new cases in the linear regression model (p-value of 0.000) with R2 = 93% of the new cases variations was explained by cases from Addis Ababa city (figure 3). Then, the estimated linear regression equation is ![Formula][5] ![Figure 3:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/09/07/2020.09.04.20188094/F3.medium.gif) [Figure 3:](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/F3) Figure 3: Simple line graph of COVID-19 new cases by new cases from AA city This suggested that the country’s new cases will be increased to 12,000 if 10,000 new cases were found in AA city. ### 3.7. Regression Model for New Deaths Table 7 presented that the linear regression model has the highest F-value (20.2) and the smallest MSE value (1.7) as compared with quadratic and cubic models. And, the number of days was significant predictor for new deaths in the linear regression model (p-value of 0.000). But, this variable and its two expressions were not predictors in the quadratic and cubic regression model. Thus, the fitted linear regression model was much better than the quadratic and cubic models. However, the quadratic or cubic models have a better R2 = 48% and the linear regression model has R2 = 42% of variation of new death was explained by the models (figure 4). Then, the estimated linear regression equation is ![Formula][6] ![Figure 4:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/09/07/2020.09.04.20188094/F4.medium.gif) [Figure 4:](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/F4) Figure 4: Simple line graph of COVID-19 new deaths by the number of days This suggested that the new deaths will be increased to 12 after 100 days. And the table also showed that a new case was significant predictor for new deaths in the linear regression model (p-value of 0.001). And, the fitted linear regression model has the highest F-value (19.5) and the smallest MSE value (2.02) as compared with quadratic and cubic models. Thus, the fitted linear regression model was much better than the quadratic and cubic models. However, the cubic regression model has a better R square value as 36% variations of COVID-19 new deaths was explained by the model. And, the linear regression model explained 31% of the variations (fig 5). From the table, the fitted linear regression equation is ![Formula][7] ![Figure 5:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/09/07/2020.09.04.20188094/F5.medium.gif) [Figure 5:](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/F5) Figure 5: Simple line graph of new deaths by new cases This indicated that the new deaths will be raised to 16 if 1,000 new cases were found. Similarly, new recovered was predicted significantly the new deaths in the linear regression model (p-value of 0.000) with R2 = 57% of new deaths variations was explained by new recovered (figure 6). Then, the estimated linear regression equation is ![Formula][8] ![Figure 6:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/09/07/2020.09.04.20188094/F6.medium.gif) [Figure 6:](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/F6) Figure 6: Simple line graph of new deaths by new recovered This implied that the new deaths will be increased to 100 if 1,000 patients recovered from the virus. ### 3.8. Multiple Linear Regression Model for COVID-19 New Cases In this model, COVID-19 new cases were predicted significantly by the number of days, daily laboratory tests and new cases from Addis Ababa city at the 5%, 10% and 1% levels of significance, respectively. The fitted MLR for COVID-19 New Cases, ![Formula][9] 1. COVID-19 new cases are predicted to increase 135 when the number of days increases by 100 days while holding other variables constant. 2. COVID-19 new cases are predicted to increase 503 when the daily laboratory tests rise by 100,000 tests while holding other variables constant. 3. COVID-19 new cases are predicted to increase 881 when the new cases from Addis Ababa city increase by 10,000 tests while holding other variables constant. In addition, it is predicted to be 0 (negative cases not applicable) when the three variables are zero. In the model, R2 = 96% of the variation in COVID-19 new cases was explained by the model (predictors). In the hypothesis test (F-statistic = 205.1 with DF = 3 & 26, p-value = 0.000), there was enough evidence to reject the null hypothesis that all the model’s coefficients are 0. The residual standard error = 12.16 shows how far the observed total COVID-19 new cases (Y-values) are from the predicted total COVID-19 new cases (![Graphic][10])(Figure 7, R-software output). ![Figure 7:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/09/07/2020.09.04.20188094/F7.medium.gif) [Figure 7:](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/F7) Figure 7: R Output of MLR for estimating the parameter to predict COVID-19 New Cases ### 3.9. Multiple Linear Regression (MLR) Model for New Deaths Due to COVID-19 In this model, new deaths due to COVID-19 were predicted significantly by the number of days and new recoveries at the 10% and 1% levels of significance, respectively. The fitted MLR for new deaths due to COVID-19, ![Formula][11] 1. New deaths due to COVID-19 are predicted to increase by 6 when the number of days increases by 100 days while holding other variables constant. 2. New deaths are predicted to increase 79 when the new recoveries rise by 1,000 tests while holding other variables constant. 3. In addition, it is predicted to be 0 (negative deaths not applicable) when the two variables are zero. In the model, R2 = 62.4% of the variation in new deaths was explained by the model (predictors). In the hypothesis test (F-statistic = 22.4 with DF = 2 & 27, p-value = 0.000), there was enough evidence to reject the null hypothesis that all the model’s coefficients are 0. The residual standard error = 1.07 shows how far the observed new deaths (Y-values) are from the predicted new deaths ![Graphic][12](Figure 8, R-software output). ![Figure 8:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/09/07/2020.09.04.20188094/F8.medium.gif) [Figure 8:](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/F8) Figure 8: R-Output of MLR for estimating the parameter to predict new deaths ![Figure 9:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/09/07/2020.09.04.20188094/F9.medium.gif) [Figure 9:](http://medrxiv.org/content/early/2020/09/07/2020.09.04.20188094/F9) Figure 9: R-Output of Multiple Linear Regression Assumptions ## 4. Discussion In the correlation analysis for COVID-19, new cases had significant and positive correlations with the number of days (r = 0.901), daily laboratory tests (r = 0.641), new recoveries (r = 0.389), new cases from males (r = 0.985), new cases from females (r = 0.964), new cases from AA city (r = 0.965), new cases from foreign natives (r = 0.416), new cases had unknown contact and travel histories (r = 0.958), and new cases had contacts with infected persons (r = 0.534). In the correlation analysis, new deaths due to COVID-19 were significantly and positively correlated with the number of days (r = 0.648), COVID-19 new cases (r = 0.555), daily laboratory tests (r = 0.445), new cases from AA city (r = 0.533), new recoveries (r = 0.753), new cases of male (r = 0.53), new cases of female (r = 0.562) and the maximum age of new cases (r = 0.400, weak). However, it was significantly and negatively correlated with the minimum age of new cases (r = –0.426). As worldwide, a one study shown that there was a strong significant relationship between new deaths and new cases (r = 0.766, p = 0.01) per country. And, a similar study from Philippine showed that there was a correlation significantly between new cases and deaths (r = 0.440, p = 0.002). Other study from Algeria found there dates had correlations with daily new COVID-19 cases, and with daily new deaths [10–12]. And other highly qualified study from Indian found that the number of daily deaths was significantly and positively correlated with the number of recoveries (r = 0.28, p = 0.042), and the number of tests was significant and positively with COVID-19 cumulative infected cases (r = 0.59, p = 0.000) and with cumulative deaths (r = 0.55, p = 0.000). And, daily infections had significantly and positively correlated with daily deaths (r = 0.49, p = 0.000) [13]. The simple linear regression model was a better fit the data of COVID-19 new cases and new deaths than quadratic and cubic regression models. In this fitted model, COVID-19 new cases were significantly predicted by the number of days (*B* = 5.85), daily laboratory tests (*B* = 0.034), and new cases from Addis Ababa city *(B* = 1.2) at 5% level of significance. And, new deaths were significantly predicted by the number of days (*B = 0.12*), new cases *(B* = 0.016), and new recovered (*B = 0.1*) at 5% level of significance. A study from Indian found that the linear regression growth model was more specific to predict the number of affected cases of COVID-19 than the exponential growth model. These models used for forecasting in long term intervals. And, other study from Indian showed that the linear model was the best fitting model for Region III from May 3rd to May 15th. Another study from India used simple linear regression analysis of the number of deaths as a function of the number of confirmed cases. In this study, the coefficient of determination (R2) was calculated to be 0.997, which implies a strong linear correlation between confirmed and dead cases. Whereas, the quadratic model was the best fitting model for Region II from April 6th to May 2nd [14–16]. In the multiple linear regression modl, COVID-19 new cases were predicted significantly by the number of days, daily laboratory tests and new cases from Addis Ababa city at the 5%, 10% and 1% levels of significance, respectively. Thus, COVID-19 new cases are predicted to increase 135, 503 and 881 when the number of days increases by 100 days, the daily laboratory tests increase by 100,000 tests, and the new cases from Addis Ababa city increase by 10,000 tests while holding other variables constant. In the multiple linear regression model, new deaths due to COVID-19 were predicted significantly by the number of days and new recoveries at the 10% and 1% levels of significance, respectively. Thus, new deaths due to COVID-19 were predicted to increase 6 and 79 when the number of days increased by 100 days and the new recoveries rose by 1,000 tests while holding other variables constant, respectively. This study agreed with a study from Kenya showing that there was a correlation between COVID-19 new cases and contact persons made by the confirmed status as well as the number of flights from foreign countries to Kenya. The study using univariate analysis of the generalized linear model showed that contact persons in Kenya had 0.265 effects on COVID-19 cases in Kenya. In the multivariate analysis, the contact persons and flights to Kenya had 0.278 and 3,309 effects on COVID-19 cases in Kenya at the 5% and 10% levels of significance, respectively. Researchers in Kenya also used the compound Poisson regression model, which showed that as the COVID-19 day increased by 235, the COVID-19 new cases were projected to 83,418 new cases [17]. A study from India used a linear regression analysis to predict the average week 5 and 6 death counts. In the study, there was a strong correlation between weeks 5 and 6 death counts with total cases, active cases, recoveries, and week 4 death counts. Despite this, the week 4 variables (total cases, active cases, and recoveries) were not significantly predicted by weeks 5 and 6 deaths count. However, the week 4 death counts significantly predicted the week 5 death count. Therefore, my study agreed with this study on the correlation analysis but not on the linear regression analysis [18]. ### Limitations of The study The main limitation of this analysis was that the data were not found together as collectively for all the previous reports and were taken from the face book and telegram pages of Ethiopia Ministry of Health. Second, limiting my analysis was that some data values were missed to report for 8 dates (such as the contact and travel history of the cases). ### Strength of the Study Despite all the limitations, the greatest strength of this study was the very high adjusted R2 found in the predictive model. Three predictors for COVID-19 new cases were found in the multiple linear regression model, and its assumptions were fitted. In addition, there was cross-validation with two different software programs (R and SPSS). ## 5. Conclusion and Recommendation ### 5.1. Conclusion There were 2,506 total COVID-19 cases and 35 deaths due to COVID-19 with a crude mortality of 1.4% from 14th march to 10th June in Ethiopia, 2020. However, the total cases and total deaths of COVID-19 were 10 times and 6 times more, respectively, from the 12th of May to the 10th of June compared to the 14th of March to the 11th of May 2020 in Ethiopia. In the correlation analysis, the COVID-19 new cases were significantly correlated with the number of days, daily laboratory tests, new recoveries, new cases of males, new cases of females, new cases from Addis Ababa city, new cases with unknown contact and travel histories, and new cases that had contacts with infected persons. In the correlation analysis, the new deaths due to COVID-19 were significantly correlated with the number of days, daily laboratory tests, COVID-19 new cases, new recoveries, new cases of males, new cases of females, new cases from Addis Ababa city, maximum age of new cases, and minimum age of new cases. In the simple linear regression, variables such as the number of days (coefficient *B* = 5.9), daily laboratory tests (coefficient *B* = 0.0334), new recoveries (*B* = 1.83), new cases of males (*B* = 1.58), new cases of females (*B* = 2.4), new cases from Addis Ababa city *(B* = 1.2), new cases from foreign natives (*B* = 19.2), new cases that had contact with other infected persons *(B* = 2.38), and new cases with unknown contact and travel histories *(B* = 1.2) significantly predicted COVID-19 new cases. In the simple linear regression model, variables such as the number of days (*B = 0.124*), COVID-19 new cases *(B* = 0.016), daily laboratory tests (*B = 0.001*), new recoveries (*B = 0.104*), new cases from Addis Ababa city (*B = 0.019*), new cases of males (*B = 0.025*), new cases of females (*B = 0.041*), minimum age of new cases (*B = –0.104*) and maximum age of new cases (*B = 0.037*) significantly predicted new deaths. In the multiple linear regression model, variables of the number of days, daily laboratory tests and new cases from Addis Ababa city significantly predicted COVID-19 new cases at the 5%, 10% and 1% levels of significance, respectively. Thus, COVID-19 new cases are predicted to increase 135, 503 and 881 when the number of days increases by 100 days, the daily laboratory tests increase by 100,000 tests, and the new cases from Addis Ababa city increase by 10,000 tests while holding other variables constant. In the multiple linear regression model, variables of the number of days and new recoveries are predicted new deaths due to COVID-19 at the 10% and 1% levels of significance, respectively. Thus, new deaths due to COVID-19 were predicted to increase 6 and 79 when the number of days increased by 100 days and the new recoveries rose by 1,000 tests while holding other variables constant, respectively. Finally, according to this analysis, if strong preventions and action are not taken in the country, the predicted values of COVID-19 new cases and new deaths will be 590 and 12 after two months (after 9th of August) from now, respectively. ### 5.2. Recommendation Even if Ethiopia has taken strong measures, including complete lockdown of both its internal and external borders and announced the command posts and keeping social isolation for the last three months, the number of new cases and deaths due to COVID-19 new cases were highly increased day to day. The research has predicted the total number of COVID-19 new cases where it is easy to see how it is likely to progress in the future. The above information should help the government make plans on how to deal with pandemics, especially when dealing with the current situation in Ethiopia. The prevalence of the disease and its crude mortality from 12th May to 10th June 2020 increased more and more, and Ethiopia might be one of the top countries from Africa by leading this Pandemic for the next months if very strong necessary measures will not be taken into consideration. The government must come up with more isolation beds, more trained health care professionals, and more mass education and campaigns with the aim of ensuring that the public has information about how to stop the spread of the virus. Let us consider that a huge population of the Ethiopia population lives in rural areas, more education as well as infrastructure must be done in the rural area to make preparations in case the COVID-19 finds its way more for the districts and the villages. Moreover, the Ethiopia government, the Ministry of Health and Regional Governments (especially the AA city administrative and Somali region) should give more awareness and protections collaboratively for societies, and they should also open more COVID-19 laboratory testing health centers in different areas of the country to ensure that those health centers can test more persons as the number of days increases, and the number of new cases will be highly increased. With these preventive and curative measures, the severity of COVID-19 will be limited when compared to other countries, such as the USA, South Africa and Egypt, which are now leading in the number of new cases and deaths in the world and in Africa. This research work will be extended after looking for the spread of the disease instantaneously by using a comparison of linear regression and nonlinear regression models. ## Data Availability The data are available if required. ## Data Availability The data are available if required. ## Funding The research work was not supported by anyone. ## Conflict of interest The author declares that I have no competing interests. ## Author contributions All the research’s sections were performed by A.S. Argawu. I was the only author of the research. ## Acknowledgments The data was collected from the daily reports of Ethiopia Ministry of Health’s, and Ethiopian Public Health Institute’s face book pages and telegram addresses from 12th May to 10th June 2020. So, I acknowledged them too much. ([https://www.facebook.com/EthiopiaMoH](https://www.facebook.com/EthiopiaMoH)). ## Abbreviations AA : Addis Ababa city CI : Confidence Interval MLR : Multiple Linear Regression SLR : Simple Linear Regression * Received September 4, 2020. * Revision received September 4, 2020. * Accepted September 7, 2020. * © 2020, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/) ## References 1. [1]. Y. Wang, M. Hu, Q. Li, X.-P. Zhang, G. Zhai, and N. Yao, “Abnormal respiratory patterns classifier may contribute to large-scale screening of people infected with COVID-19 in an accurate and unobtrusive manner,” 2020, [Online]. Available: [http://arxiv.org/abs/2002.05534](http://arxiv.org/abs/2002.05534) 2. [2].WHO and W. (PRC) Aylward, Bruce (WHO); Liang, “Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19),” WHO-China Jt. Mission Coronavirus Dis. 2019, vol. 2019, no. February, pp. 16–24, 2020, [Online]. Available: [https://www.who.int/docs/default-source/coronaviruse/who-china-joint-mission-on-covid-19-final-report.pdf](https://www.who.int/docs/default-source/coronaviruse/who-china-joint-mission-on-covid-19-final-report.pdf) 3. [3]. Z. Hu, Q. Ge, S. Li, L. Jin, and M. Xiong, “Artificial Intelligence Forecasting of Covid-19 in China,” no. April, 2020, [Online]. Available: [http://arxiv.org/abs/2002.07112](http://arxiv.org/abs/2002.07112) 4. [4]. Z. Tao, J. Tian, Y. Pei, M. Yuan, Y. Zhang, and F. Dai, “A new coronavirus associated with human respiratory disease in China,” vol. 579, no. January, 2020, doi: 10.1038/s41586-020-2008-3. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41586-020-2008-3&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=32015508&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F09%2F07%2F2020.09.04.20188094.atom) 5. [5].W. H. Organisation, “WHO Director-General’s opening remarks at the media briefing on COVID-19-11 March 2020,” WHO Dir. Gen. speeches, no. March, p. 4, 2020, [Online]. Available: [https://www.who.int/dg/speeches/detail/who-director-general-s-opening-remarks-at-the-mediabriefing-on-covid-19\---|11-march-2020](https://www.who.int/dg/speeches/detail/who-director-general-s-opening-remarks-at-the-mediabriefing-on-covid-19\---|11-march-2020) 6. [6].Worldometer, “Coronavirus Cases,” Worldometer, pp. 1–22, 2020, doi: 10.1101/2020.01.23.20018549V2. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1101/2020.01.23.20018549V2&link_type=DOI) 7. [7].E. G. Organization and S. All, “Ministry of Health, Ethiopia Albums,” vol. 2020, pp. 1–55, 2020. 8. [8].E. Medical and S. All, “Ethiopian Public Health Institute Albums,” vol. 2020, pp. 1–34, 2020. 9. [9]. G. G. Douglas C. Montgomery, Elizabeth A. Peck, Introduction to Linear Regression Analysis. A JOHN WILEY & SONS, INC., PUBLICATION, 2012. 10. [10]. E. J. Montalban, R. Canja, R. Caluban, and M. Van Buladaco, “Correlational Analysis of COVID-19 Deaths and Recoveries per Country,” no. June, 2020, doi: 10.2139/ssrn.3624543. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2139/ssrn.3624543&link_type=DOI) 11. [11]. J. A. Tadle, C. M. Regalado, R. J. De Jesus, and M. Van Buladaco, “Correlational Analysis of COVID-19 Number of Daily New Cases and Number of Daily Deaths per Day: A Case in the Philippines,” SSRN Electron. J., no. January, 2020, doi: 10.2139/ssrn.3624526. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2139/ssrn.3624526&link_type=DOI) 12. [12]. B. Anwar et al., “iMedPub Journals Correlation between New Daily COVID-19 Cases, New Daily Deaths and Google Trends for Masks in Algeria,” vol. 2020, no. Figure 1, pp. 10–12, 2020, doi: 10.36648/1989-5216.12.4.319. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.36648/1989-5216.12.4.319&link_type=DOI) 13. [13]. H. Yonar, “Modeling and Forecasting for the number of cases of the COVID-19 pandemic with the Curve Estimation Models, the Box-Jenkins and Exponential Smoothing Methods,” Eurasian J. Med. Oncol., vol. 4, no. 2, pp. 160–165, 2020, doi: 10.14744/ejmo.2020.28273. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.14744/ejmo.2020.28273&link_type=DOI) 14. [14]. A. Maharana, “Predicting the COVID-19 positive cases in India with concern to Lockdown by using Mathematical and Machine Learning based Models Abstract: Keywords: Introduction:,” 2020. 15. [15]. V. K. Sharma and U. Nigam, “Modelling of Covid-19 cases in India using Regression and Time Series models,” no. around 168500, 2020. 16. [16]. P. Chauhan, A. Kumar, P. Jamdagni, and D. Physics, “Regression Analysis of COVID-19 Spread in India and its Different,” vol. 2020, no. June, pp. 1–14, 2020. 17. [17]. J. O. Odhiambo, P. Ngare, P. Weke, and R. O. Otieno, “Modelling of COVID-19 Transmission in Kenya Using Compound Poisson Regression Model,” J. Adv. Math. Comput. Sci., no. May, pp. 101–111, 2020, doi: 10.9734/jamcs/2020/v35i230252. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.9734/jamcs/2020/v35i230252&link_type=DOI) 18. [18]. S. Ghosal, S. Sengupta, M. Majumder, and B. Sinha, “Prediction of the number of deaths in India due to SARS-CoV-2 at 5–6 weeks,” Diabetes Metab. Syndr. Clin. Res. Rev., vol. 14, no. 4, pp. 311–315, 2020, doi: 10.1016/j.dsx.2020.03.017. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.dsx.2020.03.017&link_type=DOI) [1]: /embed/graphic-1.gif [2]: /embed/graphic-2.gif [3]: /embed/graphic-12.gif [4]: /embed/graphic-14.gif [5]: /embed/graphic-16.gif [6]: /embed/graphic-18.gif [7]: /embed/graphic-20.gif [8]: /embed/graphic-22.gif [9]: /embed/graphic-24.gif [10]: /embed/inline-graphic-1.gif [11]: /embed/graphic-26.gif [12]: /embed/inline-graphic-2.gif