PT - JOURNAL ARTICLE
AU - Onovo, Amobi Andrew
AU - Atobatele, Akinyemi
AU - Kalaiwo, Abiye
AU - Obanubi, Christopher
AU - James, Ezekiel
AU - Gado, Pamela
AU - Odezugo, Gertrude
AU - Ogundehin, Dolapo
AU - Magaji, Doreen
AU - Russell, Michele
TI - Using Supervised Machine Learning and Empirical Bayesian Kriging to reveal Correlates and Patterns of COVID-19 Disease outbreak in sub-Saharan Africa: Exploratory Data Analysis
AID - 10.1101/2020.04.27.20082057
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.04.27.20082057
4099 - http://medrxiv.org/content/early/2020/05/02/2020.04.27.20082057.short
4100 - http://medrxiv.org/content/early/2020/05/02/2020.04.27.20082057.full
AB - Introduction: Coronavirus disease 2019 (COVID-19) is an emerging infectious disease that was first reported in Wuhan, China [1,2], and has subsequently spread worldwide. Knowledge of coronavirus-related risk factors can help countries build more systematic and successful responses to the COVID-19 outbreak. Here we used Supervised Machine Learning and Empirical Bayesian Kriging (EBK) techniques to reveal correlates and patterns of the COVID-19 outbreak in sub-Saharan Africa (SSA). Methods: We analyzed time series aggregate data compiled by Johns Hopkins University on the outbreak of COVID-19 across SSA. COVID-19 data were merged with socio-demographic and health indicator survey data for the 39 of SSA’s 48 countries that reported confirmed cases and deaths from coronavirus between February 28, 2020 and March 26, 2020. We used a supervised machine learning algorithm, Lasso, for variable selection and statistical inference. EBK was also used to create a raster estimating the spatial distribution of the COVID-19 outbreak. Results: The Lasso cross-fit partialing-out predictive model identified seven variables significantly associated with the risk of coronavirus infection (i.e.
new HIV infections among pediatric, adolescent, and middle-aged adult people living with HIV (PLHIV), time (days), pneumococcal conjugate-based vaccine, incidence of malaria and diarrhea treatment). Our study indicates that the doubling time in new coronavirus cases was 3 days. The steady three-day decrease in the coronavirus outbreak rate of change (ROC) from 37% on March 23, 2020 to 23% on March 26, 2020 indicates the positive impact of countries’ steps to stymie the outbreak. The interpolated maps show that coronavirus is rising every day and appears to be severely confined in South Africa. In the West African region (i.e. Burkina Faso, Ghana, Senegal, Côte d’Ivoire, Cameroon, and Nigeria), we predict that new cases and deaths from the virus are most likely to increase. Interpretation: Integrated and efficiently delivered interventions to reduce HIV, pneumonia, malaria and diarrhea are essential to accelerating global health efforts. Scaling up screening and increasing COVID-19 testing capacity across SSA countries can help provide a better understanding of how the pandemic is progressing and possibly ensure a sustained decline in the ROC of the coronavirus outbreak. Funding: Authors were wholly responsible for the costs of data collation and analysis. Competing Interest Statement: The authors have declared no competing interest. Author Declarations: All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript. Yes. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. Data used for this analysis were obtained from the COVID-19 Data Resource Hub established by the Tableau community and included near real-time data compiled by Johns Hopkins University. Additional data from socio-demographic and health indicator surveys were derived from web resources of the World Bank, UNICEF, WHO and UNAIDS. All data used are publicly available, and sources are cited throughout. https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data
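The abstract reports a doubling time and a daily rate of change (ROC) but does not say how either was computed. The sketch below shows one common way the two quantities relate. It assumes that the ROC is the day-over-day proportional change in cumulative cases and that growth is exponential at a constant rate. The function names `rate_of_change` and `doubling_time` are hypothetical and do not come from the paper.

```python
import math


def rate_of_change(previous: float, current: float) -> float:
    """Proportional change between two consecutive cumulative counts.

    Assumption: this is how the ROC in the abstract is defined.
    """
    if previous <= 0:
        raise ValueError("previous count must be positive")
    return (current - previous) / previous


def doubling_time(daily_rate: float) -> float:
    """Days needed for cases to double at a constant daily growth rate.

    Solves (1 + r) ** t = 2 for t, i.e. t = ln(2) / ln(1 + r).
    """
    if daily_rate <= 0:
        raise ValueError("daily_rate must be positive")
    return math.log(2) / math.log(1 + daily_rate)


if __name__ == "__main__":
    # ROC values quoted in the abstract for March 23 and March 26, 2020.
    for roc in (0.37, 0.23):
        print(f"ROC {roc:.0%}: doubling time {doubling_time(roc):.2f} days")
```

Under these assumptions, a daily ROC of 23% corresponds to a doubling time of a little over three days, which is in line with the three-day figure reported in the Results. The authors may have used a different estimator.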