RT Journal Article
SR Electronic
T1 Predicting the Evolution of COVID-19 Mortality Risk: a Recurrent Neural Network Approach
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2020.12.22.20244061
DO 10.1101/2020.12.22.20244061
A1 Villegas, Marta
A1 Gonzalez-Agirre, Aitor
A1 Gutiérrez-Fandiño, Asier
A1 Armengol-Estapé, Jordi
A1 Carrino, Casimiro Pio
A1 Pérez Fernández, David
A1 Soares, Felipe
A1 Serrano, Pablo
A1 Pedrera, Miguel
A1 García, Noelia
A1 Valencia, Alfonso
YR 2021
UL http://medrxiv.org/content/early/2021/01/11/2020.12.22.20244061.abstract
AB Background The propagation of COVID-19 in Spain prompted the declaration of the state of alarm on 14 March 2020. By 2 December 2020, the infection had been confirmed in 1,665,775 patients and had caused 45,784 deaths. This unprecedented health crisis challenged the ingenuity of all professionals involved. Decision support systems in clinical care and health services management were identified as crucial in the fight against the pandemic. Methods This study applies Deep Learning techniques to mortality prediction for COVID-19 patients. Two datasets with clinical information (medication, laboratory tests, vital signs, etc.) were used. They comprise 2,307 and 3,870 COVID-19 patients admitted to two Spanish hospital chains. First, we built a sequence of temporal events gathering all the clinical information for each patient. Next, we used the temporal sequences to train a Recurrent Neural Network (RNN) model with an attention mechanism to explore interpretability. We conducted extensive experiments and trained the RNNs in different settings, performing hyperparameter search and cross-validation. We ensembled the resulting RNNs to reduce variability and enhance sensitivity. Results We assessed the performance of our models using global metrics, by averaging the performance across all the days in the sequences.
We also measured day-by-day metrics, starting from the day of hospital admission and from the outcome day, and evaluated the daily predictions. Regarding sensitivity, when compared to more traditional models, our best two RNN ensemble models outperform a Support Vector Classifier by 6 and 16 percentage points, and a Random Forest by 23 and 18 points. For the day-by-day predictions from the outcome date, the models also achieved better results than the baselines, showing their ability to make early predictions. Conclusions We have shown the feasibility of our approach to predict the clinical outcome of patients infected with SARS-CoV-2. The result is a time series model that can support decision-making in healthcare systems and aims at interpretability. The system is robust enough to deal with real-world data and can overcome the problems derived from the sparsity and heterogeneity of the data. In addition, the approach was validated using two datasets showing substantial differences. This not only validates the robustness of the proposal but also meets the requirements of a real scenario where interoperability between hospitals’ datasets is difficult to achieve. Competing Interest Statement The authors have declared no competing interest. Funding Statement This work has been funded by the State Secretariat for Digitalization and Artificial Intelligence (SEDIA) to carry out specialised technical support activities in supercomputing within the framework of the Plan TL 23 signed on 14 December 2018.
The tasks done by Hospital Universitario 12 de Octubre were supported by PI18/00981, funded by the Carlos III Health Institute from the Spanish National plan for Scientific and Technical Research and Innovation 2017-2020 and the European Regional Development Funds (FEDER). Author Declarations I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The secondary anonymized data from HM Hospitales were downloaded from https://www.hmhospitales.com/coronavirus/covid-data-save-lives/english-version and are subject to the CBE (Comité de Bioética de España) https://saib.es/wp-content/uploads/Informe-CBE-investigacion-COVID-19.pdf. The dataset required an application for access; the application was submitted and we received the approval and access details via email on May 7, 2020. Usage of anonymized data from the "Hospital Universitario 12 de Octubre" was approved by the hospital's IRB (CEIm number: 20/666) and shared with BSC under a Collaboration Agreement. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The HM dataset is available at the link provided.
H12O data is subject to legal restrictions and cannot currently be distributed. Code is available under the MIT license. https://www.hmhospitales.com/coronavirus/covid-data-save-lives https://github.com/PlanTL-SANIDAD/covid-predictive-model