Abstract
Infectious disease outbreaks pose a significant threat to human health worldwide. The outbreak of pandemic coronavirus disease 2019 (COVID-19) has caused a global health emergency. Identifying regions at high risk of a COVID-19 outbreak is a major priority for governmental organizations and epidemiologists worldwide. The aims of the present study were to analyze the risk factors of the coronavirus outbreak and to identify areas with a high risk of human infection with the virus in Fars Province, Iran. A geographic information system (GIS)-based machine learning algorithm (MLA), the support vector machine (SVM), was used to assess the outbreak risk of COVID-19 in Fars Province. The daily observations of infected cases were fitted with third-degree polynomial and autoregressive moving average (ARMA) models to examine the patterns of virus spread in the province and in Iran. The results for Fars Province were compared with the data for Iran and the world. Sixteen effective factors were selected for spatial modelling: minimum temperature of the coldest month (MTCM), maximum temperature of the warmest month (MTWM), precipitation in the wettest month (PWM), precipitation of the driest month (PDM), distance from roads, distance from mosques, distance from hospitals, distance from fuel stations, human footprint, density of cities, distance from bus stations, distance from banks, distance from bakeries, distance from attraction sites, distance from automated teller machines (ATMs), and density of villages. The predictive ability of the SVM model was assessed using the receiver operating characteristic area under the curve (ROC-AUC) validation technique. The validation showed that the SVM achieved AUC values of 0.786 (March 20), 0.799 (March 29), and 0.866 (April 10), indicating good predictive performance. The average growth rate (GR) of active cases in Fars over a period of 41 days was 1.26, whereas it was 1.13 for both Iran and the world.
The third-degree polynomial and ARMA models revealed an increasing trend for GR with evidence of a turning point, suggesting that extensive quarantines have been effective. The general trends of virus spread in Iran and Fars Province were similar, although more explosive growth of infected cases is expected for the country as a whole. The results of this study may assist in planning COVID-19 prevention and control, and the predictive capability gained would have wide-ranging benefits.
Introduction
In December 2019, several cases of pneumonia were reported in Wuhan, China [1-2]. In January 2020, a novel coronavirus (2019-nCoV) was confirmed in Wuhan [3]. The virus was subsequently named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and the disease it causes was formally named COVID-19. The virus raised concerns within China and the global community because it was believed to be transmitted from human to human [4]. Initially, China witnessed the largest outbreak, in Hubei and nearby provinces. The spread in China was controlled soon thereafter through stringent preventive measures, but other parts of the world (Europe, the Middle East, and the United States) were increasingly affected through transmission by infected travellers from China. A similar outbreak soon followed in other Asian countries [5]. The global spread to more than 150 countries led to the declaration in mid-March 2020 that COVID-19 was a pandemic [6]. By April 10, 2020, there were nearly 1.70 million cases worldwide, with 102,684 deaths attributed to COVID-19 [7]. Currently, the United States has the largest number of confirmed cases, while Italy has reported the highest number of deaths [7-8]. Iran, with 68,192 recorded cases and 4,232 deaths, is the most affected country in the Middle East (as of April 10, 2020), and infected cases are expected to surge in the coming days [7, 9]. The outbreak of COVID-19 has disrupted and depressed the world economy, and Iran is among the countries most severely affected by economic losses, which are compounded by politically motivated sanctions imposed by other governments [10]. The problem is exacerbated because no specific medicine is yet available for treating COVID-19, although a few pre-existing drugs are being tested, so regions are presently concentrating their efforts on keeping the infection rate at a level that helps to reduce the spread of the virus [11].
This has led most states to impose lockdowns, encourage social distancing, and restrict the sizes of gatherings to limit transmission [12]. There is a pressing need for scientific communities to aid governments in their efforts to control and prevent transmission of the virus [13].
During previous virus outbreaks stemming from Zika, influenza, West Nile, Dengue, Chikungunya, Ebola, Marburg, and Nipah, geographic information systems (GISs) played important roles by providing insight via risk mapping, spatial forecasting, monitoring of the spatial distribution of supplies, and spatial logistics for management [13]. In the current situation, risk mapping is critical and may help governments track and manage the disease as it spreads in the places at highest risk. Sánchez-Vizcaíno et al. [14] used a multi-criteria decision making (MCDM) model to map the risk of Rift Valley fever in Spain. Traditional statistical techniques have also been used to detect the risk of outbreak [14]. Reeves et al. [15] employed an ecological niche modelling (ENM) technique for mapping the transmission risk of MERS-CoV, the Middle East respiratory syndrome coronavirus, which belongs to the same family as SARS-CoV-2. Similar techniques were used by Nyakarahuka et al. [16] to map the risks of Ebola and Marburg viruses in Uganda; they assessed the importance of environmental covariates using the maximum entropy model.
More recently, the use of machine learning algorithms (MLAs) for mapping the risk of virus transmission has been increasing, owing to the demonstrated superior predictive accuracy of MLA models over traditional methods [17]. Jiang et al. [18] employed three MLAs, namely backward propagation neural network (BPNN), gradient boosting machine (GBM), and random forest (RF), to map the risk of an outbreak of Zika virus. Tien Bui et al. [17] compared different MLAs, including artificial neural network (ANN) and support vector machine (SVM), with ensemble models including AdaBoost, bagging, and random subspace for modelling malaria transmission risk. Similarly, GBM, RF, and generalized additive modelling (GAM) were used by Carvajal et al. [19] to model the patterns of dengue transmission in the Philippines. Mohammadinia et al. [20] employed geographically weighted regression (GWR), generalized linear model (GLM), SVM, and ANN to develop a forecast map of leptospirosis; GWR and SVM produced highly accurate predictions. The literature shows that very few studies have used GIS to analyse the COVID-19 outbreak in human communities. Kamel Boulos and Geraghty [21] described the use of online and mobile GIS for mapping and tracking COVID-19, whilst Zhou et al. [13] outlined the challenges of using GIS with SARS-CoV-2 big data sources. To our knowledge, no study has focused on mapping the outbreak risk of the COVID-19 pandemic. The aims of the present study were to analyze the risk factors of the coronavirus outbreak and to test the SVM model for mapping areas with a high risk of human infection with the virus in Fars Province, Iran. The outcome of the present study lays a foundation for better planning and for understanding the factors that accelerate virus spread, for use in disease control plans in human communities.
Materials and methods
Study area
The study area is in the southern part of Iran, covers 122,608 square kilometres, and is located between 27°2′ and 31°42′ N and between 50°42′ and 55°36′ E. Fars is the fourth largest province in Iran (7.7% of the total area), with a population of 4,851,274 (based on the 2016 report). Fars Province is divided into 36 counties, 93 districts, and 112 cities (Fig 1).
Methodology
The multi-phased workflow implemented in this investigation (Fig. 2) is described comprehensively below.
Preparation of location of COVID-19 active cases
A dataset of active cases of COVID-19 in Fars was prepared to analyse the relationships between the locations of active cases and the effective factors that may be useful for predicting outbreak risk. The data used in this research were collected on April 10, 2020 from Iran's Ministry of Health and Medical Education (IMHME).
Preparation of effective factors
Choosing appropriate effective factors to predict the risk of pandemic spread is vital, as their quality affects the validity of the results [17]. Since there have been no previous studies of risk for COVID-19 distribution, the selection of effective factors is quite a challenging task. Ongoing research on the pandemic has revealed that local and community-wide transmission of the virus largely happens in public places, where people are most likely to come into contact with the largest number of potential carriers of the infection [22]. Wang et al. [23] indicated that meteorological conditions, such as rapidly warming temperatures in 439 cities around the world, resulted in a decline of COVID-19 cases. Accordingly, in this research we selected the sixteen most relevant effective factors for outbreak risk mapping of COVID-19 in Fars Province: minimum temperature of the coldest month (MTCM), maximum temperature of the warmest month (MTWM), precipitation in the wettest month (PWM), precipitation of the driest month (PDM), distance from roads, distance from mosques, distance from hospitals, distance from fuel stations, human footprint, density of cities, distance from bus stations, distance from banks, distance from bakeries, distance from attraction sites, distance from automated teller machines (ATMs), and density of villages. All the effective factors employed in this research were generated using ArcGIS 10.7.
A few studies have established that variation in temperature may affect the transmission of COVID-19 [23]. It has also been reported that changes in temperature affected the SARS outbreak, which was caused by a coronavirus of the same type as SARS-CoV-2 [24]. Recently, Ma et al. [2] reported that increases in temperature and humidity were associated with a decline in deaths caused by SARS-CoV-2. Thus, climatic factors such as temperature and precipitation can have an impact on the outbreak of SARS-CoV-2. The temperature and precipitation data for Fars Province, namely MTWM, MTCM, PDM, and PWM, were acquired from WorldClim (https://www.worldclim.org/). In this study, the MTWM of Fars Province ranges from 27.7°C to 41.8°C (Fig 3a), whereas the MTCM ranges between −15.3°C and 10.4°C (Fig 3b). The PWM of the study area varies between 28 mm and 86 mm (Fig 3c), and the PDM is presented in Fig 3d.
Proximity to public places where people come into close contact with each other, including roads, mosques, hospitals, fuel stations, bus stations, banks, bakeries, attraction sites, and ATMs, can also be considered a significant factor influencing the distribution of COVID-19. The distance from roads ranges from 0 to 45 in the study area (Fig 3e), whereas the distance from mosques varies between 0 and 0.71 (Fig 3f) and the distance from fuel stations spans 0 to 0.67 (Fig 3g). The distances from bus stations, banks, bakeries, attraction sites, and ATMs in Fars Province have a minimum value of 0 and maximum values of 1.31, 0.68, 0.97, 0.79, and 0.78, respectively (Fig 3h – 3l). Since humans are the potential carriers of COVID-19, the human footprint (HFP) can aid in understanding the terrestrial biomes on which humans have more influence and access [25]. In this study, the HFP of the study area was acquired from the Global Human Footprint Dataset. The HFP of Fars Province ranges from 6 to 78 (Fig 3m), where the minimum value represents the places with the least human access and the maximum value refers to the regions with the highest human influence and access. Population density is also considered an important factor in the spread of the disease [26-27]. Gilbert et al. [28] revealed that the number of COVID-19 cases was proportional to the population density in Africa. Accordingly, in this research the densities of cities and villages were assessed; the density of cities in Fars Province ranges between 0 and 0.60 (Fig 3n), while the density of villages varies from 0 to 0.58 (Fig 3o). The distance from hospitals ranges from 0 to 1.11 (Fig 3p).
Evaluation of variable importance using ridge regression
The association between the locations of COVID-19 active cases and the effective factors was evaluated using ridge regression in order to assess the significance of each effective factor in predicting the outbreak risk [17]. To our knowledge, no previous study in epidemic outbreak risk mapping has used ridge regression to determine the significance of effective factors, although the algorithm has been used for modelling purposes in various fields [29]. Ridge regression was first proposed by Hoerl and Kennard [30]; it uses an L2-norm regularization penalty to reduce model complexity and control overfitting. It was also developed to avoid the excessive instability and collinearity problems of the least squares estimator [31]. The ‘caret’ package (https://cran.r-project.org/web/packages/caret/caret.pdf) of R 3.5.3 was used to assess variable importance with ridge regression.
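As an illustration of how ridge regression can rank factors, the following minimal sketch fits the closed-form ridge estimator on a tiny synthetic dataset and orders the predictors by the absolute size of their standardized coefficients. It is not the ‘caret’ workflow used in this study: the data, the penalty value `lam`, and the function names `ridge_fit` and `rank_importance` are hypothetical, and treating absolute standardized coefficients as importance is an assumption.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def standardize(col):
    """Centre a column and scale it to unit (population) standard deviation."""
    mean = sum(col) / len(col)
    sd = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5
    return [(v - mean) / sd for v in col]


def ridge_fit(columns, y, lam):
    """Ridge coefficients (X'X + lam*I)^-1 X'y on standardized predictors."""
    xs = [standardize(c) for c in columns]
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]
    k = len(xs)
    xtx = [[sum(p * q for p, q in zip(xs[i], xs[j])) + (lam if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(p * q for p, q in zip(xs[i], yc)) for i in range(k)]
    return solve_linear(xtx, xty)


def rank_importance(names, coefs):
    """Order factor names by decreasing absolute coefficient (an assumption)."""
    pairs = sorted(zip(names, coefs), key=lambda p: abs(p[1]), reverse=True)
    return [name for name, _ in pairs]


# Hypothetical sample: case presence (1) or absence (0) at six locations.
names = ["dist_bus", "mtwm"]
dist_bus = [0.1, 0.4, 0.2, 0.9, 0.7, 0.3]
mtwm = [30.0, 35.0, 41.0, 33.0, 38.0, 29.0]
y = [1, 1, 1, 0, 0, 1]

coefs = ridge_fit([dist_bus, mtwm], y, lam=1.0)
ranking = rank_importance(names, coefs)
print(ranking)
```

Standardizing the predictors first keeps the penalty from favouring factors that happen to be measured on a larger scale, which is why the coefficients can be compared with each other at all.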
Machine learning algorithm (MLA)
Support vector machine
SVM is a widely used MLA in diverse fields of research. It is based on statistical learning theory and the structural risk minimization principle introduced by Vapnik [32], and it is used for both classification and regression problems [33-34]. SVM is highly effective in classifying both linearly separable and inseparable data classes [35]. It uses an optimal hyperplane to separate linearly separable data, whereas kernel functions are employed to transform inseparable data into a higher-dimensional space so that they can be easily categorized [36]. Assume a calibration dataset (sm, tm), where m = 1, 2, 3, …, x; sm refers to the sixteen independent factors; tm takes the values 0 and 1, which correspond to the risk and non-risk classes; and x represents the total number of calibration data. The algorithm seeks an optimal hyperplane for separating these classes by using the distance between them, which can be formulated as follows [37]: where ‖p‖ denotes the norm of the normal to the hyperplane and a is a constant. When the Lagrangian multiplier (λm) and the cost function are introduced, the expression can be given as follows [38]:
In the case of an inseparable dataset, a slack variable δm is added to Eq. (2), as follows [32]:
Accordingly, Eq. (3) can be expressed as follows [32]:
Moreover, SVM offers four kernel functions (linear, polynomial, radial basis function (RBF), and sigmoid) for constructing an optimal margin in the case of an inseparable dataset [32]. Mohammadinia et al. [20] revealed that the RBF kernel produces higher prediction accuracy than the other kernel types for epidemic outbreak risk mapping. Thus, in this study, RBF is used for creating decision boundaries, and the kernel function is expressed as follows [32]: where K(za, zb) refers to the kernel function and v represents its parameter.
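For reference, the textbook SVM formulation can be written in the notation defined above (p, a, sm, tm, λm, δm, x, and v). This is a sketch of the standard forms and is not taken from the cited sources; it assumes that the class labels are recoded as tm ∈ {−1, +1}, as the standard form requires, and the penalty weight C is a hypothetical symbol that does not appear in the text.

```latex
% Hard-margin problem: maximize the margin by minimizing the norm of p
\min_{p,\,a}\ \frac{1}{2}\lVert p \rVert^{2}
\quad \text{subject to} \quad t_m\,(p \cdot s_m + a) \ge 1, \qquad m = 1,\dots,x

% Lagrangian with multipliers \lambda_m \ge 0
L = \frac{1}{2}\lVert p \rVert^{2}
    - \sum_{m=1}^{x} \lambda_m \bigl[\, t_m\,(p \cdot s_m + a) - 1 \,\bigr]

% Soft-margin constraint with slack variables \delta_m \ge 0
t_m\,(p \cdot s_m + a) \ge 1 - \delta_m

% Soft-margin objective; C > 0 is an assumed penalty weight
\min_{p,\,a,\,\delta}\ \frac{1}{2}\lVert p \rVert^{2} + C \sum_{m=1}^{x} \delta_m

% RBF kernel with parameter v
K(z_a, z_b) = \exp\!\bigl(-v\,\lVert z_a - z_b \rVert^{2}\bigr)
```

A larger v makes the RBF kernel more local, so the decision boundary follows individual calibration points more closely; a smaller v gives a smoother boundary.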
Analysis of growth rate for active and death cases of COVID-19
In this study, the growth rate (GR) of active and death cases for the world, Iran, and Fars Province was evaluated using data acquired from WHO and IMHME between February 26, 2020 and April 10, 2020 for active cases and between March 3, 2020 and April 10, 2020 for death cases.
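The formula used for GR is not stated here. One common definition, assumed in the sketch below, is the ratio of new cases on a given day to new cases on the previous day, with the value set to zero when the previous day had no new cases. The function names and the case counts are hypothetical.

```python
def growth_rates(cumulative):
    """Daily GR: new cases today divided by new cases yesterday (assumed definition)."""
    new = [b - a for a, b in zip(cumulative, cumulative[1:])]
    rates = []
    for prev, cur in zip(new, new[1:]):
        # Assumption: the ratio is undefined when the previous day had
        # no new cases, so it is reported as 0 in that case.
        rates.append(cur / prev if prev > 0 else 0.0)
    return rates


def mean_growth_rate(cumulative):
    """Average of the daily growth rates over the whole period."""
    rates = growth_rates(cumulative)
    return sum(rates) / len(rates)


# Hypothetical cumulative case counts on six consecutive days.
counts = [10, 14, 20, 29, 29, 35]
print(growth_rates(counts))
print(mean_growth_rate(counts))
```

Under this definition a GR above 1 means that daily new cases are accelerating, a GR of 1 means that they are steady, and a GR below 1 means that they are slowing.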
Validation of outbreak risk map
Checking a calibrated model against untouched testing data is vital for determining the scientific robustness of the prediction [33]. In this research, we used ROC-AUC values to validate the COVID-19 outbreak risk map generated by the SVM model. ROC-AUC is a widely used validation technique for analysing the predictive ability of a model [35]. A model is considered perfect, very good, good, moderate, or poor if its AUC value is 1.0-0.9, 0.9-0.8, 0.8-0.7, 0.7-0.6, or 0.6-0.5, respectively [39].
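The validation step can be sketched as follows. The AUC is computed here through its rank interpretation (the probability that a randomly chosen case location receives a higher risk score than a randomly chosen non-case location), and the result is mapped to the qualitative bands quoted above. The labels and scores are hypothetical, and assigning a boundary value such as 0.9 to the higher band is an assumption, since the quoted bands overlap at their edges.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_class(auc):
    """Map an AUC value to the qualitative bands quoted from [39]."""
    if auc >= 0.9:
        return "perfect"
    if auc >= 0.8:
        return "very good"
    if auc >= 0.7:
        return "good"
    if auc >= 0.6:
        return "moderate"
    if auc >= 0.5:
        return "poor"
    return "below the quoted bands"  # not covered by [39]; label is an assumption


# Hypothetical test set: 1 = active-case location, 0 = non-case location.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(auc, auc_class(auc))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch; a rank-based implementation would be preferable for large test sets.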
Models for infection cases trend
The behavior of the infection-case variable was captured by a third-degree polynomial (cubic) specification as follows:
$$\mathrm{Infection}(t) = \beta_0 + \beta_1 t + \beta_2 t^{2} + \beta_3 t^{3}$$
where Infection(t) represents the total infected cases on day t, t denotes the days starting from the 19th of February for Iran and one week later for Fars province, and β0 to β3 are the coefficients to be estimated. Other specifications, including quadratic and fourth-degree polynomial forms, were also examined, and the cubic form was selected on the basis of its predictions. In the literature, this form of specification has been applied by Aik et al. [40] to examine Salmonellosis incidence in Singapore. We also used an ARMA model to compare the process generating the variable for Iran and Fars province. This model combines two processes: an autoregressive (AR) process and a moving average (MA) process. An ARMA model of order (p,q) can be written as [41]:
$$x_t = c + \sum_{i=1}^{p} \varphi_i\, x_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j}$$
where x is the dependent variable, ε is the white-noise stochastic error term, c is a constant, and φi and θj are the AR and MA coefficients. In the applied model, x is the total number of infected cases and t counts the days from the first day on which infection cases occurred. Benvenuto et al. [42] also applied an ARIMA model to predict the epidemiological trend of COVID-19.
Results
Outcome of the variable importance analysis
The analysis of variable importance using ridge regression revealed that distance from bus stations, distance from hospitals, and distance from bakeries have the highest significance, whereas distance from ATMs, distance from attraction sites, distance from fuel stations, distance from mosques, distance from roads, MTCM, density of cities, and density of villages exhibit moderate importance. Distance from banks, MTWM, HFP, PWM, and PDM were the least influential factors (Fig 4).
COVID-19 outbreak risk map using SVM
The COVID-19 outbreak risk map generated using SVM shows that the risk of SARS-CoV-2 ranges from −0.25 to 1.22 (March 29) and from −0.35 to 1.21 (April 10), where −0.25 and −0.35 represent the lowest risk of a SARS-CoV-2 outbreak and 1.22 and 1.21 indicate the regions of Fars Province that are likely to experience a higher risk of a COVID-19 outbreak (Fig 5, a-b). It can be observed from Fig 5b (April 10) that Shiraz County and its surrounding counties, including Firouzabad, Jahrom, Sarvestan, Arsanjan, Marvdasht, Sepidan, Abadeh, Khorrambid, Rostam, Larestan and Kazeron, have the highest risk of being the epicentre of a SARS-CoV-2 outbreak. In addition, counties such as Eghlid and Fasa also lie in the high-risk zone.
Outcome of growth rate analysis
The GR results for active cases in the world, Iran, and Fars Province are presented in Fig 6. The highest GR of active cases in the world, Iran, and Fars Province occurred on March 11 (GR=1.95), February 26 (GR=2.41), and March 15 (GR=4.8), respectively. The average GR of active cases in the world, Iran, and Fars Province from March 1 to April 10 was 1.13, 1.13, and 1.25, respectively. The highest GR values of active cases in Fars Province were observed on March 16 (GR=4.80), March 09 (GR=3.20), March 20 (GR=2.40), March 22 (GR=2.10), April 1 (GR=2.10), and March 26 (GR=1.90). On the other hand, between February 27 and February 29 the GR of active cases in Fars Province was zero, followed by a GR value of 0.3 on March 14, March 19, and March 21, whereas the lowest GR of active cases in the world and Iran was observed on March 4 (GR=0.89) and March 3 (GR=0.67), respectively.
Results of death cases in world, Iran, and Fars Province are given in Fig 7.
Of a total of 1,762 active cases of COVID-19 in Fars Province, 42 died between February 24 and April 10. The highest GR values of death cases in Fars Province were reported on March 24 (GR=4.00), March 26 (GR=3.00), March 22 (GR=2.00), March 4 (GR=2.00), and April 5 (GR=2.00). The GR of death cases was equal to zero from March 5 to March 11, March 15 to March 21, March 28 to April 4, and April 5 to April 8. Although the deaths on March 31, April 3, April 7, and April 10 numbered 3, 2, 4, and 1, respectively, the daily growth rate on those days was zero. The average GR in Fars Province over the 41 days was 0.49, whereas the rates for the world and Iran were 1.15 and 1.10, respectively. Fig 7 shows that the highest GR values of death cases in the world and Iran were similar, occurring on March 08 (GR=2.17) and March 03 (GR=2.50), respectively. In contrast, the lowest GR of death cases was observed on March 09 (GR=0.87), April 08 (GR=0.87), and March 04 (GR=0.60).
The active cases in the 31 provinces of Iran as of March 25 are presented in Fig 8. The number of active cases per 100,000 population varies from 0.4 to 13.1. The figure also shows that Bushehr and Fars provinces have the lowest cumulative rates of active cases, whereas the highest rates were observed in Qom, Semnan, Mazandaran, Gilan, and Golestan. Qom Province was the first place in Iran where the outbreak of COVID-19 was recorded.
A comparison of the age classes of death cases in China, Iran, and Fars Province is presented in Table 1. The percentages of death cases for China refer to February 29, whereas those for Iran and Fars Province refer to March 14 and March 31, respectively. Table 1 shows that the age class above 50 years has the highest death rate, so this age class is highly sensitive to COVID-19.
Validation outcome of outbreak risk map
The ROC-AUC validation technique was used in this research to validate the COVID-19 outbreak risk map generated by SVM. The model achieved an AUC value of 0.786 with a standard error of 0.031, indicating good predictive accuracy when verified against the remaining 30% testing dataset collected on March 20, 2020 (Fig 9 and Table 2).
When tested with active case locations on March 29, 2020, the model achieved a higher AUC value of 0.799, which indicates that the forecast precision of the outbreak risk map is stable and good (Fig 10 and Table 3). Change detection on April 10, 2020 shows that the accuracy of the built model increased to 86.6% (AUC=0.868) (Fig 11 and Table 4).
Comparison of Fars province and Iran infection cases
Two tools were applied to compare the general trend of infection in Fars province and Iran. The first is a third-degree polynomial model, presented in Fig 12. The second is an ARMA model, presented in Table 5. Fig 12 shows the trend of infection cases in Iran and Fars province, where the predicted values closely track the actual values. The R² values also indicate that the estimated models have significant predictive power. The infection cases increase over the selected horizon.
The first derivative of the estimated model, which is a second-degree polynomial, represents the daily infection cases. Based on the daily infection model, there is a turning point for both the national and the provincial cases. The turning point for provincial daily infections was found to be day 75; in other words, a decreasing trend in daily infections is expected after 75 days.
The corresponding value for Iran is 211, which is much higher than the provincial one. There is some evidence that a turning point in infection is to be expected. For instance, turning points have been reported for SARS incidence [43], HAV [44], ARI [45], and A (H1N1)v [46]. It is worth noting that a turning point means that, after the peak has passed, a decreasing trend is expected. On the 38th day of infection, Fars province accounted for around 2.84% of the total Iranian cases, while its population share is more than 6% (Statistical Center of Iran, 2016). Given the values obtained for the turning points and the infection share, the measures taken by the provincial government may be considered more effective than those taken in other provinces as a whole. However, it should be taken into consideration that Fars province experienced its first infection cases 7 days after Qom and Tehran, the provinces considered to be the starting point of the virus outbreak in Iran. This might have given the provincial government and households time to take measures to cope with the widespread outbreak. It is worth noting that comparing the specified models is more appropriate for investigating the effectiveness of the measures taken by the corresponding health body than for predicting future values.
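The turning-point calculation can be sketched as follows, assuming the cubic cumulative model b0 + b1·t + b2·t² + b3·t³. Its first derivative gives the daily cases, and the daily-case curve peaks where the second derivative is zero, at t = −b2 / (3·b3), provided b3 is negative. The coefficients below are hypothetical and were chosen only so that the peak falls near day 75; they are not the fitted values.

```python
def daily_cases(coeffs, t):
    """First derivative of the cubic b0 + b1*t + b2*t**2 + b3*t**3."""
    _, b1, b2, b3 = coeffs
    return b1 + 2 * b2 * t + 3 * b3 * t ** 2


def turning_point(coeffs):
    """Day on which the daily-case curve peaks, or None if it has no peak."""
    _, _, b2, b3 = coeffs
    if b3 >= 0:
        # The quadratic opens upward (or is linear), so there is no maximum.
        return None
    # Setting the second derivative 2*b2 + 6*b3*t to zero gives the peak.
    return -b2 / (3 * b3)


# Hypothetical coefficients (b0, b1, b2, b3), not taken from the study.
coeffs = (0.0, 5.0, 4.5, -0.02)
peak_day = turning_point(coeffs)
print(peak_day)
```

Because a polynomial fitted to a short window extrapolates poorly, a peak day obtained this way is better read as a basis for comparing regions than as a forecast.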
The ARMA time series models for the infection variables of Fars province and Iran are presented in Table 5. These models describe the process generating the variables over the time horizon. To make the models more comparable, a 38-day time horizon was selected. This is the period for which data are available, starting on the 19th of February for Iran and one week later for Fars province. As shown in Table 5, both series are generated by an ARMA (2, 1) process. However, the absolute values of the AR terms for Fars province are lower than those for Iran, indicating a slower increasing trend for Fars province than for Iran. Regarding the values of the AR roots, the autoregressive (AR) process is not explosive in either model. Benvenuto et al. [42] also found that the spread of COVID-19 tends to decrease slightly. Generally speaking, the diagnostic statistics indicate that the estimated models are acceptable: the Q-statistics reveal that the residuals are not significantly correlated, and the Jarque-Bera statistic supports the normality of the residuals at conventional significance levels. In addition, the autoregressive conditional heteroscedasticity (ARCH) effect was insignificant in both models, indicating low volatility in the infection cases trend, a fact that is not easily seen in the trends shown in Fig 12. Finally, all AR and MA roots were found to lie inside the unit circle, indicating that the ARMA process is (covariance) stationary and invertible.
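The root condition for an ARMA (2, 1) process can be checked with a short sketch. It assumes the inverse-root convention, in which the AR part x_t = φ1·x_{t−1} + φ2·x_{t−2} + ε_t is stationary when both solutions of z² − φ1·z − φ2 = 0 lie inside the unit circle, and the MA(1) part is invertible when |θ1| < 1. The coefficient values are hypothetical and are not those reported in Table 5.

```python
import cmath


def ar2_inverse_roots(phi1, phi2):
    """Solutions of z**2 - phi1*z - phi2 = 0 (inverse roots of the AR(2) polynomial)."""
    disc = cmath.sqrt(phi1 * phi1 + 4 * phi2)
    return ((phi1 + disc) / 2, (phi1 - disc) / 2)


def is_stationary(phi1, phi2):
    """True when both inverse AR roots lie strictly inside the unit circle."""
    return all(abs(r) < 1 for r in ar2_inverse_roots(phi1, phi2))


def is_invertible_ma1(theta1):
    """True when the single inverse MA root lies strictly inside the unit circle."""
    return abs(theta1) < 1


# Hypothetical ARMA(2, 1) coefficients.
phi1, phi2, theta1 = 0.5, 0.3, 0.4
print(is_stationary(phi1, phi2), is_invertible_ma1(theta1))
```

Complex arithmetic is used so that the check also covers oscillating processes, where the two AR roots form a complex conjugate pair.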
Discussion
There is a great need for robust new scientific results that could aid in containing the COVID-19 pandemic and preventing it from spreading. Spatial mapping of COVID-19 outbreak risk can help governments and policy-makers implement strict measures in the regions of a city or a country where the risk of outbreak is very high. It is therefore crucial to identify the regions at high outbreak risk through predictive modelling with the help of machine learning algorithms (MLAs). In recent times, MLAs have demonstrated promising results in forecasting epidemic outbreak risk [17]. In this research, the SVM model, which showed good forecast accuracy, was used to map the outbreak risk of COVID-19. Similarly, Mohammadinia et al. [20] revealed that GWR and SVM had the highest precision in mapping the occurrence of leptospirosis. Ding et al. [47] employed three MLAs (SVM, RF, and GBM) to map the transmission risk of mosquito-borne diseases and reported that all three achieved excellent validation outcomes. Machado et al. [48] also applied RF, SVM, and GBM in modelling the porcine epidemic diarrhoea virus and demonstrated 90% specificity for SVM. Tien Bui et al. [17] stated that SVM achieved an AUC value of 0.968 in mapping the susceptibility to malaria. The ability to classify inseparable data classes is the greatest benefit of the SVM model [49]. It is among the most precise and robust MLAs [50]. SVM is useful and achieves high prediction accuracy when handling a small dataset, and Huang and Zhao [51] demonstrated that it also yields excellent precision in predictive modelling when a large dataset is used. The algorithm has a very low probability of overfitting and is not disproportionately affected by noisy data [49]. Behzad et al. [52] revealed that SVM had a large capacity for generalization and lasting forecast accuracy.
It should also be noted that the predictive accuracy of the SVM model largely depends on the choice of kernel function [50]. Among the four kernel functions of SVM, RBF has been shown to generate highly accurate models [49]. SVM includes diverse kinds of classification functions, which are responsible for controlling overfitting and simplifying data and which need only minor tuning of model parameters [53]. The significance of each effective factor employed in this research was assessed using ridge regression. Since no previous COVID-19 study has outlined the proper effective factors, the outcome of this research can help scientists experiment with the same and additional effective factors for COVID-19 outbreak risk mapping. The proximity factors distance from bus stations, distance from hospitals, and distance from bakeries were the most influential in forecasting the COVID-19 outbreak risk, whereas other proximity factors, such as distance from ATMs, distance from attraction sites, distance from fuel stations, distance from mosques, and distance from roads, had moderate influence, followed by MTCM, density of cities, and density of villages. It should be noted that the climatic factors MTWM, PWM, and PDM had the least significance in mapping the outbreak risk. This suggests that the precipitation factors PWM and PDM are not associated with the transmission of COVID-19 in Fars Province, whereas among the temperature factors MTCM had moderate influence in mapping COVID-19 outbreak risk and MTWM had the least significance. This outcome indicates that proximity factors had a high influence on the transmission of SARS-CoV-2. In addition, the present study suggests that an increase in temperature will not reduce SARS-CoV-2 cases, although it has also been reported that increases in temperature and absolute humidity could decrease deaths among patients affected by COVID-19 [54].
A third-degree polynomial model and ARMA models were applied to examine the behaviour of infection in Fars province and Iran. The general trends of infection in Iran and Fars province are similar, although more explosive behaviour is expected for Iran's cases. The methodology and effective factors used in this research can be adapted in studies conducted in other parts of the world for preventing and controlling the outbreak risk of COVID-19.
Conclusions
Mapping of SARS-CoV-2 outbreak risk can aid decision makers in drafting effective policies to minimize the spread of the disease. In this research, GIS-based SVM was used to map the COVID-19 outbreak risk in Fars Province, Iran. Sixteen effective factors, namely MTCM, MTWM, PWM, PDM, distance from roads, distance from mosques, distance from hospitals, distance from fuel stations, human footprint, density of cities, distance from bus stations, distance from banks, distance from bakeries, distance from attraction sites, distance from automated teller machines (ATMs), and density of villages, were selected along with the locations of active cases of SARS-CoV-2. The results of ridge regression revealed that distance from bus stations, distance from hospitals, and distance from bakeries had the highest significance, and this outcome was used in mapping the outbreak risk of the pandemic with SVM. The generated model had good predictive accuracy, with AUC values of 0.786 and 0.799 when verified against the locations of active cases on March 20 and March 29, 2020. The Iranian government should take strict preventive measures to control the outbreak of SARS-CoV-2 in Shiraz, a tourism destination, and in the counties at high risk. Based on the results of the polynomial and ARMA models, the infection behavior is not expected to follow an explosive process; however, the general trend of infection will last for several months, especially in Iran as a whole. A slower trend is expected in Fars Province, suggesting that extensive home quarantine and restrictions on travel and movement were good strategies for disease control in the province. The main policy implication is that the infection cases may, to some extent, be controlled using more effective measures. Although the estimated models may be used to predict infections in the following days, this contribution is less significant than the other implications derived from them.
Generally speaking, a decreasing trend is expected; however, this may be reversed if the ongoing efforts are relaxed, which points to the need to maintain measures such as quarantine or even to adopt more restrictive ones.
Data Availability
All the data are available in the manuscript.
Competing interests
The authors declare that they have no competing interests.
Funding
Shiraz University, Iran, Grant No. 96GRD1M271143.
Availability of data and materials
All data and materials used in this work were publicly available.
Ethics approval and consent to participate
Ethical approval and individual consent were not applicable.
Authors’ contributions
HRP, SP, BH, ZF, NS, MHT, BH, SB, and JPT contributed to study design, the literature search, data collection, data analysis, software working, and writing of this article. All authors read and approved the final draft of the manuscript.
Consent for publication
Not applicable.