RT Journal Article
SR Electronic
T1 Development and evaluation of a machine learning-based in-hospital COvid-19 Disease Outcome Predictor (CODOP): a multicontinental retrospective study
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2021.09.20.21263794
DO 10.1101/2021.09.20.21263794
A1 Klén, Riku
A1 Purohit, Disha
A1 Gómez-Huelgas, Ricardo
A1 Casas-Rojo, José Manuel
A1 Antón Santos, Juan Miguel
A1 Núñez-Cortés, Jesús Millán
A1 Lumbreras, Carlos
A1 Ramos-Rincón, José Manuel
A1 Young, Pablo
A1 Ramírez, Juan Ignacio
A1 Titto Omonte, Estela Edith
A1 Artega, Rosmery Gross
A1 Canales Beltrán, Magdy Teresa
A1 Valdez, Pascual
A1 Pugliese, Florencia
A1 Castagna, Rosa
A1 Funke, Nico
A1 Leiding, Benjamin
A1 Gómez-Varela, David
YR 2021
UL http://medrxiv.org/content/early/2021/11/29/2021.09.20.21263794.abstract
AB Background: More contagious SARS-CoV-2 virus variants, breakthrough infections, waning immunity, and sub-optimal rates of COVID-19 vaccination account for a new surge of infections leading to record numbers of hospitalizations and deaths in several European countries. This is a particularly concerning scenario for resource-limited countries, which have a lower vaccination rate and fewer clinical tools to fight against the next pandemic waves. There is an urgent need for clinically valuable, generalizable, and parsimonious triage tools assisting the appropriate allocation of hospital resources. We aimed to develop and extensively validate CODOP, a machine learning-based tool for accurately predicting the clinical outcome of hospitalized COVID-19 patients. Methods: CODOP was built using modified stable iterative variable selection and linear regression with lasso regularisation. To avoid generalization problems, CODOP was trained and tested with three time-sliced and geographically distinct cohorts encompassing 40 511 blood-based analyses of COVID-19 patients from more than 110 hospitals in Spain and the USA during 2020-21.
We assessed the discriminative ability of the model using the Area Under the Receiver Operating Characteristic curve (AUROC) as well as horizon and Kaplan-Meier risk stratification analyses. To account for the fluctuating pressure on hospitals throughout the pandemic, we offer two online CODOP calculators suited for undertriage or overtriage scenarios. We challenged their generalizability and clinical utility through an evaluation on a cohort of patients hospitalized in five hospitals from three Latin American countries. Findings: CODOP uses 12 clinical parameters commonly measured at hospital admission and associated with the pathophysiology of COVID-19. CODOP reaches high discriminative ability up to nine days before clinical resolution (AUROC: 0·90-0·96, 95% CI 0·879-0·970), is well calibrated, and enables effective dynamic risk stratification during hospitalization. The two CODOP online calculators demonstrate their potential for triage decisions when challenged with the distinctive Latin American evaluation cohorts (73-100% sensitivity and 84-100% specificity). Interpretation: The high predictive performance of CODOP in geographically dispersed patient cohorts and its ease of use strongly suggest its clinical utility as a global triage tool, particularly in resource-limited countries. Funding: The Max Planck Society. Evidence before this study: We searched PubMed for articles on in-hospital COVID-19 mortality predictive models, using the search terms “coronavirus”, “COVID-19”, “risk”, “death”, “mortality”, and “prediction”, focusing on studies published between March 1, 2020 and August 31, 2021. The studies we identified generally used small to medium-sized cohorts of patients that are geographically restricted to small regions of the developed world (often to a single city).
We found no studies that challenged their models in extended cohorts of patients from very distinct health system populations, particularly from resource-limited countries. Further, most previous models are rigid in that they do not account for the fluctuating availability of hospital resources during the pandemic (e.g., beds, oxygen supply). These and other limitations have been pointed out by expert reviews indicating that published in-hospital COVID-19 mortality predictive models are subject to high risk of bias, report over-optimistic performance, and have limited clinical value in assisting daily triage decisions. A parsimonious, accurate and extensively validated model is yet to be developed. Added value of this study: We analysed clinical data from different cohorts totalling 21 607 COVID-19 patients treated in more than 110 hospitals in Spain and the USA during three different pandemic waves extending from February 2020 to April 2021. The new CODOP in-hospital mortality prediction model is based on 11 blood biochemistry parameters (representing the main biological pathways involved in the pathogenesis of SARS-CoV-2) plus age, all of them commonly measured upon hospitalization. CODOP accurately predicted mortality risk up to nine days before clinical resolution (AUROC: 0·90-0·96, 95% CI 0·879-0·970), is well calibrated, and enables effective dynamic risk stratification during hospitalization. We offer two online CODOP calculator subtypes (https://gomezvarelalab.em.mpg.de/codop/) tailored to overtriage and undertriage scenarios.
The online calculators reached the desired prediction performance in five independent evaluation cohorts gathered in hospitals of three Latin American countries from March 7, 2020 to June 7, 2021. Implications of all the available evidence: We present here a highly accurate, parsimonious and extensively validated COVID-19 in-hospital mortality prediction model, derived from working with the largest number and the most geographically extended representation of patients and health systems to date. The rigorous analytical methods, the generalizability of the model in distinct world regions, and its flexibility to account for the changing availability of hospital resources point to CODOP as a clinically useful tool that could improve outcome prediction and the management of hospitalized COVID-19 patients. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: The Max Planck Society supports the payment of the article processing fees. No other funding supported the study. The funders of the study had no role in study design, data collection, data analysis, interpretation of data, writing of the report, or in the decision to submit the paper for publication. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The use of anonymized clinical data of all patients with COVID-19 in this study has been approved by the institutional ethical review boards of each institution participating in this study: the Provincial Research Ethics Committee of Malaga (Spain), the Ethical Committee of the Hospital de Infecciosas F. J.
Muniz, the Ethical Committee of the Hospital Britanico, the Bioethical Committee of the Instituto Hondureno de Seguridad Social, the Ethical Committee of the Caja Petrolera de Salud, and the Ethical Committee of the Hospital San Juan de Dios. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The data that support the findings of this study are available on request from either the SEMI-COVID-19 Scientific Committee and the Registry Coordinating Centre or the different Latin American hospitals.