PT - JOURNAL ARTICLE
AU - Roland, Theresa
AU - Böck, Carl
AU - Tschoellitsch, Thomas
AU - Maletzky, Alexander
AU - Hochreiter, Sepp
AU - Meier, Jens
AU - Klambauer, Günter
TI - Machine Learning based COVID-19 Diagnosis from Blood Tests with Robustness to Domain Shifts
AID - 10.1101/2021.04.06.21254997
DP - 2021 Jan 01
TA - medRxiv
PG - 2021.04.06.21254997
4099 - http://medrxiv.org/content/early/2021/04/09/2021.04.06.21254997.short
4100 - http://medrxiv.org/content/early/2021/04/09/2021.04.06.21254997.full
AB - We investigate machine learning models that identify COVID-19 positive patients and estimate the mortality risk based on routinely acquired blood tests in a hospital setting. However, during pandemics or new outbreaks, disease and testing characteristics change, and thus we face domain shifts. Domain shifts can be caused, e.g., by changes in the disease prevalence (spreading or tested population), by refined RT-PCR testing procedures (taking samples, laboratory), or by virus mutations. Therefore, machine learning models for diagnosing COVID-19 or other diseases may not be reliable and may degrade in performance over time. To counteract this effect, we propose methods that first identify domain shifts and then reverse their negative effects on the model performance. Frequent re-training and reassessment, as well as stronger weighting of more recent samples, keep model performance and credibility at a high level over time. Our diagnosis models are constructed and tested on large-scale data sets, steadily adapt to observed domain shifts, and maintain high ROC AUC values over the course of a pandemic. Competing Interest Statement: The authors have declared no competing interest. Clinical Trial: approval number: 1104/2020 (ethics committee of the Johannes Kepler University, Linz). Funding Statement: This project was funded by the Medical Cognitive Computing Center (MC3) and AI-MOTION (LIT-2018-6-YOU-212).
We thank the projects Medical Cognitive Computing Center (MC3), AI-MOTION (LIT-2018-6-YOU-212), DeepToxGen (LIT-2017-3-YOU-003), AI-SNN (LIT-2018-6-YOU-214), DeepFlood (LIT-2019-8-YOU-213), PRIMAL (FFG-873979), S3AI (FFG-872172), DL for granular flow (FFG-871302), ELISE (H2020-ICT-2019-3 ID: 951847), AIDD (MSCA-ITN-2020 ID: 956832). We thank Janssen Pharmaceutica, UCB Biopharma SRL, Merck Healthcare KGaA, Audi.JKU Deep Learning Center, TGW LOGISTICS GROUP GMBH, Silicon Austria Labs (SAL), FILL Gesellschaft mbH, Anyline GmbH, Google, ZF Friedrichshafen AG, Robert Bosch GmbH, Software Competence Center Hagenberg GmbH, TÜV Austria, and the NVIDIA Corporation. We thank Franz Grandits, Innosol, for the daily download of the age distribution data of the newly infected COVID-19 patients from BMSGPK. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: ethics committee of the Johannes Kepler University, Linz, Austria (approval number: 1104/2020). All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The data set is not available for public use due to data privacy reasons. Code is provided at https://github.com/ml-jku/covid.
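
The abstract mentions "stronger weighting of more recent samples" when re-training. The sketch below illustrates one possible way to do this, using exponential decay by sample age. It is not taken from the repository linked above; the function name `recency_weights`, the `half_life_days` parameter, and the choice of an exponential decay are all assumptions made for illustration only.

```python
def recency_weights(ages_in_days, half_life_days=30.0):
    """Return normalized sample weights that decay with sample age.

    Hypothetical helper: a sample that is `half_life_days` older than
    another receives half of its weight. The weights sum to 1.
    """
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    if not ages_in_days:
        return []
    # Exponential decay: weight halves every `half_life_days`.
    raw = [0.5 ** (age / half_life_days) for age in ages_in_days]
    total = sum(raw)
    return [w / total for w in raw]


# Example: three samples taken today, 30 days ago, and 60 days ago.
weights = recency_weights([0, 30, 60], half_life_days=30.0)
```

Weights of this kind could, for instance, be passed as the `sample_weight` argument that many scikit-learn estimators accept in `fit`, so that each re-training run emphasizes recent data. The half-life would need to be chosen and validated for the data at hand.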