Abstract
Dengue fever is an infectious disease caused by Flavivirus transmitted by Aedes mosquito. This disease predominantly occurs in the tropical and subtropical regions. With no specific treatment, the most effective way to prevent dengue is vector control. The dependence of Aedes mosquito population on meteorological variables make prediction of dengue infection possible using conventional statistical and epidemiologic models. However, with increasing average global temperature, the predictability of these models may be lessened employing the need for artificial neural network. This study uses artificial neural network to predict dengue incidence in the entire Philippines with humidity, rainfall, and temperature as independent variables. All generated predictive models have mean squared logarithmic error of less than 0.04.
1. Introduction
Dengue fever is an infectious disease caused by Flavivirus transmitted by the vector mosquito Aedes. The dengue virus has 4 serotypes (DENV-1, DENV-2, DENV-3, DENV-4) which mostly occur in the urban and sub-urban areas in tropical and subtropical regions. It is estimated that 50 million people have dengue infection globally per annum. In the Philippines, an estimated 170,000 cases occurred annually. Currently, there are no specific treatments for dengue and the most effective way to prevent it is through vector control. [1-6]
The 2 most important vectors for dengue transmission are: Aedes aegypti and Aedes albopictus. The lifecycle of these vectors is divided into egg, larva, pupa, and adult stages; which is heavily influenced by different meteorological, geological, and anthropological variables. The typical adult mosquito lays egg just above the waterline of a stagnant water. It will take 48 hours in warm, humid environment for the embryo to develop. Once developed, the eggs are tolerant to desiccation up to more than a year.[1] The adult mosquito then typically emerges after 10 days. The adult female mates and takes blood meal necessary for egg maturation. The blood meal biting activity takes place in the morning and afternoon. [7, 8] However, blood meals also do occur at night in lighted rooms. [3]
Temperature, rainfall, and relative humidity affects the transmission of dengue. [9] Annual rainfall of more than 200 cm provides conducive growth of Aedes population. [10] Mosquito population growth is more abundant in sea level up to 500 meters above sea level, although they can thrive up to 1,200 meters. [11] Climate variability strongly influence dengue epidemic. Maximal temperature of more than 32°C, maximal relative humidity of more than 95% influence the incubation period, feeding frequency, and longevity of Aedes. [12] There is low mosquito mortality in temperature between 15°C to 30°C. Pupae development occurs in less than 1 day in 32°C to 34°C but takes 4 days in 22°C. [13-17]
With the dependence of Aedes mosquito population dynamics on weather variables; climate change, undoubtedly, will have its impact on the spread of dengue infection. It is estimated that the average global temperature will increase by 2°C to 4.5°C by year 2100. [18] There are many researches that predicted dengue infection using weather variables. Different statistical models were employed such as: Poisson regression, autoregressive integrated moving average (ARIMA), seasonal autoregressive integrated moving average (SARIMA) [19, 22], negative binomial, quasi-likelihood regression [19, 20], distributed lag non-linear model (DLNM) [21]. There are also attempts to employ machine learning techniques such as random forest and gradient boosting to make dengue infection predictions. [23-25] These predictive models have varying degree of predictions on dengue incidence.
With the ease of accessibility and less expensive computing power available nowadays, there is increased application of artificial neural network in making predictions in different areas of science. Coupled with the use of programming and software packages such as Python and TensorFlow, this research will attempt to predict dengue incidence in the Philippines using artificial neural network.
2. Materials and Methods
2.1. Study Setting
The study was conducted in all 17 administrative regions in the Philippines: Region 1 to 12 (including 4-A and 4-B), Autonomous Region of Muslim Mindanao (ARMM), Cordillera Autonomous Region (CAR), and CARAGA. The Philippines has two seasons: wet (June to October) and dry (November to May). However, there are 4 climate types in the Philippines based on modified Coronas classification: Type I (dry from November to April, we from May to October), Type II (seasonal rainfall from November to December), Type III (same as Type I but with maximum rainfall from May to October), and Type IV (evenly distributed rainfall annually). [26]
2.2. Data Collection
Data on dengue incidence is freely available in the Department of Health (DOH) website as PDF reports. These dengue reports are sum of all case definition of a dengue case with or without the confirmation of polymerase chain reaction. This is attributed to the limited resources available especially in remote and rural areas where dengue case definition is based on signs and symptoms only. The meteorological data: humidity (as %), rainfall (as mm), temperature (as °C) were requested from the Philippine Atmospheric, Geophysical and Astronomical Services Administration (PAGASA) and were received as excel files. Missing dengue values were addressed by getting the average value from the same region from different year period but of the same week time frame. Missing meteorological values were filled using the average value from same weather station from different year period but of the same month time frame. These files from DOH and PAGASA were encoded to CSV files for data analysis. Dengue data were reported on a weekly basis from each administrative regions while the meteorological data were reported on a monthly basis from each weather stations where most administrative regions have 2 to 3 weather stations. To reconcile these differences, the weekly tally of dengue were summed up to reflect a month value while the meteorological values were averaged out from the weather stations to reflect the regional value. The data on dengue and weather variables is from year 2013 to 2018.
2.3. Artificial Neural Network
Data were analyzed in Python 3 using Jupyter Notebook as the interface while also employing several libraries such as Numpy, Pandas, Keras, and Tensorflow. A training and test set were created for each region. The training set consists of data from 2013 to 2017 and the test set contains data from 2018. An artificial neural network was created with 1 input layer, 6 hidden layers, and 1 output layer as shown in Figure 1. The input layer is composed of humidity, rainfall, and temperature. The output layer is composed of dengue incidence.
The artificial neural network uses rectified linear unit (ReLu) as the activation function and adaptive moment estimation (Adam) as an optimiser. The artificial neural network was trained with batch size of 24 in 500 epochs for each administrative region. The created model from the training set was used to predict the dengue values in the test set. The prediction was evaluated using mean squared logarithmic error (MSLE).
3. Results
Table 1 shows the descriptive statistics of monthly humidity, rainfall, temperature, and dengue incidence for each administrative region. ARMM has the lowest average dengue incidence of 157.26 while Region 4-A has the highest at 2,616.97. NCR has the lowest average humidity of 75.24% while CAR has the highest at 87.25%. Region 12 has the lowest average rainfall of 81.99 mm, while Region 7 has the highest at 1,130.61 mm. CAR has the lowest average temperature of 19.47°C, while Region 11 has the highest at 28.7°C.
The resulting predictive models from the artificial neural network has a mean squared logarithmic error of less than 0.04 in all administrative regions. Table 2 provides the values of MSLE in each region. Region 9 has the lowest MSLE of 0.006 while Region 2 has the highest at 0.0376. Figure 2 provides comparison of the actual and predicted dengue incidence in each administrative regions using the generated predictive models.
4. Discussion
Although there is low MSLE in each administrative region, visual inspection of the actual and predicted dengue incidence revealed that there are predictive models that are better than the other. It should be noted that the predictive models are unique to each region since they are trained separately and have their own artificial neural network even though they have the same network architecture. The administrative regions that have visually performed well are: ARMM, CAR, Region 1, 3, 4-A, and 4-B. The predictive models also had inefficiencies in identifying the peak dengue incidence which are evident in Region 2, 5, 10, and ARMM. The worst predictive models are in CARAGA, NCR, Region 2, 7, 8, 9, 11, and 12.
Predictions made by an artificial neural network is different from statistical or epidemiological modelling. Artificial neural network utilizes collection of artificial neurons that take the weighted inputs, pass it through an activation function, to produce an output. [27] In this study, there are 3 inputs: humidity, rainfall, and temperature; and there is 1 output which is a single value of dengue incidence. The idea is, at any given values of the 3 input variables, what will be the resulting dengue incidence. The rectified linear unit (ReLU) [28] is the activation function used to avoid having negative values for the dengue incidence. These multiple units and layers of computation can make better predictions.
There are several limitations that was encountered in this study. The study encompasses entire administrative region, which means micro-climate variability from each city or municipality can be a contributing factor to the dengue incidence. The meteorological variables provided by PAGASA was limited to 3 (humidity, rainfall, temperature), although the request includes flood occurrence and average sunlight. These 3 significant meteorological variables appeared in the researches done in the Philippines. [23, 29] However, flood occurrence may help in dengue incidence prediction because flushing can potentially reduce dengue incidence. [30] Average sunlight have inconclusive relationship to dengue incidence using statistical model. [31] However, this might be proven otherwise if artificial neural network is use.
The impact of climate change can influence the transmission of dengue to other places other than the tropical and subtropical regions. By the end of this century, dengue epidemic potential for Aedes aegypti could occur in 10 European cities (Madeira, Malaga, Athens, Rome, Nice, Paris, London, Amsterdam, Berlin, Stockholm) with continued current rate of greenhouse gas emission. [32] The complexities of weather and climate influences on dengue transmission are not easily modelled with statistical approach [33] which makes artificial neural network more helpful in predicting dengue incidence more accurately.
5. Conclusion
Close fidelity on the predicted and actual dengue incidence in some administrative region in the Philippines prove that artificial neural network can be implemented for predicting dengue. Further work can be done in optimizing the artificial neural network architecture: number of neurons, number of hidden layers, and additional input meteorological variables (flood occurrence, average sunlight). It is recommended that future research be focused on the city or municipal level of dengue cases and weather variables measurement to be able to create a local public health policy.
Data Availability
The dengue data is available in the Department of Health website. The weather variables are requested from Philippine Atmospheric, Geophysical, and Astronomical Services in the Philippines.
Footnotes
Table 1: Region 4-B humidity mean, SD, and max revision. Table 2: Region 4-B MSLE value, revision. Figure 2: Region 4-B, revision.