Effectiveness, Explainability and Reliability of Machine Meta-Learning Methods for Predicting Mortality in Patients with COVID-19: Results of the Brazilian COVID-19 Registry
Abstract
Objective To provide a thorough comparative study of state-of-the-art machine learning methods and statistical methods for predicting in-hospital mortality in COVID-19 patients using data available at hospital admission; to study the reliability of the predictions of the most effective methods by correlating the predicted probability of the outcome with the accuracy of the methods; and to investigate how explainable the predictions produced by the most effective methods are.
Materials and Methods De-identified data were obtained from COVID-19-positive patients in 36 participating hospitals, from March 1 to September 30, 2020. Demographic, comorbidity, clinical presentation and laboratory data were used as training data to develop COVID-19 mortality prediction models. Multiple machine learning and traditional statistical models were trained on this prediction task using a k-fold cross-validation procedure, from which we assessed performance and interpretability metrics.
Results The Stacking of machine learning models improved over the previous state-of-the-art results by more than 26% in predicting the class of interest (death), achieving an AUROC of 87.1% and a macro F1 of 73.9%. We also show that some machine learning models can be highly interpretable and reliable, yielding more accurate predictions while providing a good explanation for the ‘why’.
Conclusion The best results were obtained using the meta-learning ensemble model – Stacking. State-of-the-art explainability techniques such as SHAP values can be used to draw useful insights into the patterns learned by machine learning algorithms. Machine learning models can be more explainable than traditional statistical models while also yielding highly reliable predictions.
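The evaluation quantities named in the abstract (macro F1, AUROC, and the relation between predicted probability and accuracy) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the 0.5 decision threshold, the two-bin confidence grouping, and the toy labels and scores are all assumptions made for illustration, and the numbers bear no relation to the registry data.

```python
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def auroc(y_true, y_score):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_by_confidence(y_true, y_score, n_bins=2, threshold=0.5):
    """Group predictions by confidence in the predicted class; return accuracy per bin."""
    bins = defaultdict(list)
    for t, s in zip(y_true, y_score):
        pred = int(s >= threshold)
        # Confidence in the predicted class lies in [0.5, 1.0] for threshold 0.5.
        confidence = s if pred == 1 else 1.0 - s
        idx = min(int((confidence - 0.5) / 0.5 * n_bins), n_bins - 1)
        bins[idx].append(pred == t)
    return {idx: sum(hits) / len(hits) for idx, hits in sorted(bins.items())}


# Toy data, invented for illustration: 1 = death, 0 = survival.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]  # hypothetical predicted P(death)
y_pred = [int(s >= 0.5) for s in y_score]

print(macro_f1(y_true, y_pred))                 # 0.75
print(auroc(y_true, y_score))                   # 0.875
print(accuracy_by_confidence(y_true, y_score))  # {0: 0.5, 1: 1.0}
```

In this toy example the high-confidence bin is more accurate than the low-confidence bin, which is the kind of pattern a reliability analysis looks for. A real analysis would use out-of-fold predictions from the cross-validation procedure rather than hand-written scores.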
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study was funded by CAPES and FAPEMIG.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study protocol was approved by the Brazilian National Commission for Research Ethics (CAAE 30350820.5.1001.0008). Individual informed consent was waived due to the severity of the situation and the use of de-identified data, based on medical chart review only.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
All data produced in the present study are available upon reasonable request to the authors.