PT - JOURNAL ARTICLE
AU - Halasz, Geza
AU - Sperti, Michela
AU - Villani, Matteo
AU - Michelucci, Umberto
AU - Agostoni, Piergiuseppe
AU - Biagi, Andrea
AU - Rossi, Luca
AU - Botti, Andrea
AU - Mari, Chiara
AU - Maccarini, Marco
AU - Pura, Filippo
AU - Roveda, Loris
AU - Nardecchia, Alessia
AU - Mottola, Emanuele
AU - Nolli, Massimo
AU - Salvioni, Elisabetta
AU - Mapelli, Massimo
AU - Deriu, Marco Agostino
AU - Piga, Dario
AU - Piepoli, Massimo
TI - Predicting clinical outcomes in the Machine Learning era: The Piacenza score a purely data driven approach for mortality prediction in COVID-19 Pneumonia
AID - 10.1101/2021.03.16.21253752
DP - 2021 Jan 01
TA - medRxiv
PG - 2021.03.16.21253752
4099 - http://medrxiv.org/content/early/2021/03/20/2021.03.16.21253752.short
4100 - http://medrxiv.org/content/early/2021/03/20/2021.03.16.21253752.full
AB - Background: Several models have been developed to predict mortality in patients with COVID-19 pneumonia, but only a few have demonstrated sufficient discriminatory capacity. Machine-learning (ML) algorithms represent a novel approach for data-driven prediction of clinical outcomes, with advantages over statistical modelling. We developed the Piacenza score, an ML-based score, to predict 30-day mortality in patients with COVID-19 pneumonia. Methods: 852 patients (mean age 70 years, 70% male) were enrolled from February to November 2020. The dataset was randomly split into derivation and test sets. The Piacenza score was obtained through the Naïve Bayes classifier and externally validated on 86 patients. Using a forward-search algorithm, the following six features were identified: age; mean corpuscular haemoglobin concentration; PaO2/FiO2 ratio; temperature; previous stroke; gender.
If one or more of the features are not available for a patient, the model can be retrained using only the features provided. We also compared the Piacenza score with the 4C score and with a Naïve Bayes algorithm with 14 variables chosen a priori. Results: The Piacenza score showed an AUC of 0.78 (95% CI 0.74-0.84, Brier score 0.19) in the internal validation cohort and 0.79 (95% CI 0.68-0.89, Brier score 0.16) in the external validation cohort, showing accuracy comparable to that of the 4C score and of the Naïve Bayes model with features chosen a priori, which achieved an AUC of 0.78 (95% CI 0.73-0.83, Brier score 0.26) and 0.80 (95% CI 0.75-0.86, Brier score 0.17), respectively. Conclusion: A personalized ML-based score with purely data-driven feature selection is feasible and effective for predicting mortality in patients with COVID-19 pneumonia. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: None. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: AUSL Piacenza ethics committee. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. If requested, we agree to publicly share the study data and analysis source code.