Abstract
Objective To challenge clinicians and informaticians to learn about potential sources of bias in medical machine learning models through investigation of data and predictions from an open-source severity of illness score.
Methods Over a two-day period (total elapsed time approximately 28 hours), we conducted a datathon that challenged interdisciplinary teams to investigate potential sources of bias in the Global Open Source Severity of Illness Score (GOSSIS-1). Teams were invited to develop hypotheses, to use tools of their choosing to identify potential sources of bias, and to provide a final report.
Results Five teams participated, three of which included both informaticians and clinicians. Most (4/5) used Python for analyses; the remaining team used R. Common analysis themes included the relationship of the GOSSIS-1 prediction score with demographics and care-related variables; relationships between demographics and outcomes; calibration and factors related to the context of care; and the impact of missingness. Representativeness of the population, differences in calibration and model performance among groups, and differences in performance across hospital settings were identified as possible sources of bias.
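As an illustration of what a between-group calibration check can look like, the following minimal sketch compares mean predicted risk with the observed outcome rate in each group. It is not taken from any team's analysis: the function name `calibration_by_group`, the field names `group`, `predicted_risk`, and `died`, and the toy records are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


def calibration_by_group(records, group_key,
                         pred_key="predicted_risk", outcome_key="died"):
    """Summarize calibration-in-the-large per group.

    Returns {group: (n, mean_predicted, observed_rate, observed/expected)}.
    All key names are hypothetical placeholders, not GOSSIS-1 field names.
    """
    # Per group: [count, sum of predicted risks, number of observed events]
    sums = defaultdict(lambda: [0, 0.0, 0])
    for row in records:
        acc = sums[row[group_key]]
        acc[0] += 1
        acc[1] += row[pred_key]
        acc[2] += row[outcome_key]

    result = {}
    for group, (n, pred_sum, events) in sums.items():
        mean_pred = pred_sum / n
        observed = events / n
        # A ratio above 1 means the score under-predicts risk in this group.
        ratio = observed / mean_pred if mean_pred > 0 else float("nan")
        result[group] = (n, mean_pred, observed, ratio)
    return result


# Made-up toy records, for illustration only.
records = [
    {"group": "A", "predicted_risk": 0.10, "died": 0},
    {"group": "A", "predicted_risk": 0.30, "died": 1},
    {"group": "B", "predicted_risk": 0.20, "died": 1},
    {"group": "B", "predicted_risk": 0.20, "died": 1},
]

summary = calibration_by_group(records, "group")
for group, (n, mean_pred, observed, ratio) in sorted(summary.items()):
    print(f"{group}: n={n} predicted={mean_pred:.2f} "
          f"observed={observed:.2f} O/E={ratio:.2f}")
```

Observed-to-expected ratios that differ markedly between groups would suggest that the score is calibrated differently for those groups. With real data, small groups would need confidence intervals before any such conclusion is drawn.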
Discussion Datathons are a promising approach for challenging developers and users to explore questions relating to unrecognized biases in medical machine learning algorithms.
Author summary Disadvantaged groups are at risk of being adversely impacted by biased medical machine learning models. To avoid these undesirable outcomes, developers and users must understand the challenges involved in identifying potential biases. We conducted a datathon aimed at challenging a diverse group of participants to explore an open-source patient severity model for potential biases. Five groups of clinicians and informaticians used tools of their choosing to evaluate possible sources of bias, applying a range of analytic techniques and exploring multiple features. By giving diverse participants hands-on experience with meaningful data, datathons have the potential to raise awareness of potential biases and promote best practices in developing fair and equitable medical machine learning models.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
CMH and HH are supported by NIH Grant 5R01NS118716-03. CMH is also supported by 5K23HD099331. TP and LAC are supported by National Institutes of Health Grant OT2OD032701. LAC is also funded by NIH Grants R01 EB017205 and DS-I Africa U54 TW012043-01 and the National Science Foundation through ITEST #2148451. EPC is supported by NLM Training Grant T15LM007059.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
An ethics statement has been added to the methods section.
Data Availability
All data produced in the present study are available upon reasonable request to the authors.