PT - JOURNAL ARTICLE
AU - Cruz, Beatriz Garcia Santa
AU - Bossa, Matías Nicolás
AU - Sölter, Jan
AU - Husch, Andreas Dominik
TI - Public Covid-19 X-ray datasets and their impact on model bias - a systematic review of a significant problem
AID - 10.1101/2021.02.15.21251775
DP - 2021 Jan 01
TA - medRxiv
PG - 2021.02.15.21251775
4099 - http://medrxiv.org/content/early/2021/02/19/2021.02.15.21251775.short
4100 - http://medrxiv.org/content/early/2021/02/19/2021.02.15.21251775.full
AB - Computer-aided diagnosis for COVID-19 based on chest X-ray suffers from weak bias assessment and limited quality control. Undetected bias, induced by inappropriate use of datasets and improper consideration of confounders, prevents the translation of prediction models into clinical practice. This study provides a systematic evaluation of publicly available COVID-19 chest X-ray datasets, determining their potential use and evaluating potential sources of bias. Only 5 out of 256 identified datasets met at least the criteria for proper assessment of risk of bias and could be analysed in detail. Remarkably, almost all of the datasets utilised in 78 papers published in peer-reviewed journals are not among these 5 datasets, thus leading to models with high risk of bias. This raises concerns about the suitability of such models for clinical use. This systematic review highlights the limited description of datasets employed for modelling and helps researchers select the most suitable datasets for their task. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This work was supported by the Luxembourg National Research Fund (FNR) COVID-19/2020-1/14702831/AICovIX/Husch grant. Beatriz Garcia Santa Cruz is supported by the FNR within the PARK-QC DTU (PRIDE17/12244779/PARK-QC). Andreas Husch is partially supported by the Fondation Cancer Luxembourg. The authors would like to thank Prof. Peter Gemmar, Trier University of Applied Sciences, Trier, Germany and Prof. Frank Hertel, Centre Hospitalier de Luxembourg, Luxembourg, for discussions. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: No ethics needed. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. All data used is publicly available.