ABSTRACT
Objectives This study aimed to develop and validate a machine learning model to predict deterioration using Australian hospital data, paying particular attention to the role of predictors not included in current scoring systems.
Design Retrospective cohort study using electronic health records from a large metropolitan health service.
Setting General hospital wards, excluding the Emergency Department, Intensive Care Unit, and Palliative Care.
Participants Inpatients over the age of 18.
Main Outcome Measures The primary outcomes of deterioration were mortality and ICU transfer within 24 hours of a newly available observation. A Gradient Boosted Tree model was estimated using patient demographics, vital signs, pathology results, and linear trends. Resulting feature importance was investigated using Shapley values. The model performance was validated against existing scoring systems, including Between the Flags (BTF) and the Modified / National Early Warning Score (MEWS/NEWS).
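As an illustration of the kind of derived predictors described here, the following is a minimal sketch of a linear trend (least-squares slope over recent observations) and two ratio features of the type named in the Results. The function names, the units, the use of FiO2 as a fraction, and the choice of observation window are all assumptions made for this example; the study does not specify how its trends or ratios were computed.

```python
from statistics import mean


def linear_trend(times_h, values):
    """Least-squares slope of `values` against `times_h` (units per hour).

    Hypothetical helper: returns 0.0 when a slope cannot be estimated
    (fewer than two observations, or all observations at the same time).
    """
    if len(times_h) != len(values):
        raise ValueError("times_h and values must have the same length")
    if len(values) < 2:
        return 0.0
    t_bar = mean(times_h)
    v_bar = mean(values)
    denom = sum((t - t_bar) ** 2 for t in times_h)
    if denom == 0:
        return 0.0
    numer = sum((t - t_bar) * (v - v_bar) for t, v in zip(times_h, values))
    return numer / denom


def spo2_fio2_ratio(spo2_pct, fio2_fraction):
    """Oxygen saturation (%) divided by inspired oxygen fraction (assumed 0.21 to 1.0)."""
    return spo2_pct / fio2_fraction


def hr_sbp_ratio(heart_rate_bpm, systolic_bp_mmhg):
    """Heart rate divided by systolic blood pressure."""
    return heart_rate_bpm / systolic_bp_mmhg


# Example: respiratory rate rising from 16 to 24 breaths/min over 8 hours.
rr_slope = linear_trend([0, 4, 8], [16, 20, 24])
```

In this sketch a positive slope indicates a rising vital sign; a tree-based model could take the slope alongside the most recent raw value as separate inputs.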
Results A Gradient Boosted Tree model was developed using data from 121,608 patients and tested on 20,605 patients. The model, named aWARE, demonstrated higher discriminative ability (AUROC for mortality = 0.93; AUROC for ICU transfer = 0.84) and better calibration than the baseline scores. Across both outcomes, the 10 most influential unique features were age, oxygen saturation to inspired oxygen ratio, respiratory rate, white cell count, venous lactate, heart rate to systolic blood pressure ratio, albumin, oxygen saturation, urea and heart rate. Of these, only 3 are included in BTF.
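For readers less familiar with the discrimination metric reported above, AUROC can be read as the probability that a randomly chosen patient who experienced the outcome receives a higher risk score than a randomly chosen patient who did not. The sketch below computes it directly from that definition. The function name and the toy labels and scores are illustrative assumptions and are not data from this study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for patients with the outcome, 0 for those without.
    scores: model risk scores, higher meaning higher predicted risk.
    Tied scores count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
example = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

The pairwise form is quadratic in the number of observations, so it suits small illustrations only; a rank-based implementation would be the usual choice for a cohort of this size.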
Conclusion The machine learning model proposed in this study identified more deteriorating patients and produced fewer false-positive alerts than Between the Flags. Feature importance highlighted the gap between strong predictors of deterioration and the parameters used in current scoring systems.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This research has been partly supported by the NSW Agency for Clinical Innovation (ACI) Grant Scheme 23/24. Its contents are the responsibility of the authors and their institutions and do not necessarily reflect the views of the ACI.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Ethics approval for this study was provided by the Sydney Local Health District Human Research Ethics Committee under the project Data derived Risk assessment using the Electronic Medical record through Application of Machine Learning (DREAM). Local reference number: CH62/6/2018-203. REGIS reference number: 2019/PID09922.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
Secondary contact: Jonathan Daniel Greenberg, j.greenberg@student.unsw.edu.au
Declarations of interest: None
Table 3 revised; Results section minor revision.
Data Availability
All data produced in the present work are contained in the manuscript.