PT - JOURNAL ARTICLE
AU - Levy, Joshua J.
AU - Lima, Jorge F.
AU - Miller, Megan W.
AU - Freed, Gary L.
AU - O’Malley, A. James
AU - Emeny, Rebecca T.
TI - Investigating the potential for machine learning prediction of patient outcomes: a retrospective study of hospital acquired pressure injuries
AID - 10.1101/2020.03.29.20047084
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.03.29.20047084
4099 - http://medrxiv.org/content/early/2020/04/08/2020.03.29.20047084.short
4100 - http://medrxiv.org/content/early/2020/04/08/2020.03.29.20047084.full
AB - Background: While recent research efforts to reduce pressure ulcers in the clinical context have focused on key retrospective characteristics, little work has focused on creating real-time predictive models to prevent this avoidable hospital-acquired injury. Furthermore, existing machine learning heuristics often fail to surpass traditional statistical models or provide individual-level risk assessments with explanations for each patient. Thus, we sought to compare the predictive performance of five machine learning and traditional statistical modeling techniques to predict the occurrence of Hospital Acquired Pressure Injuries (HAPI).
Methods: Electronic Medical Record (EMR) information was collected from 57,227 hospitalizations, containing 241 positive HAPI cases, acquired from Dartmouth Hitchcock Medical Center from April 2011 to December 2016. The five classifiers were trained to predict HAPI incidence and performance was assessed using the C-statistic or Area Under the Receiver Operating Curve (AUC).
Results: Logistic Regression was the best modeling approach (AUC=0.91±0.034). We report discordance between the predictors deemed important by the machine learning models and those deemed important by the traditional statistical model.
We provide means to visually assess factors important to every patient’s prediction, regardless of the modeling approach, through Shapley Additive Explanations.
Conclusions: Machine learning models will continue to inform decision making processes but should be compared to traditional modeling approaches to ensure proper utilization. Disagreements between important predictors found by traditional and machine learning modeling approaches can potentially confuse clinicians and as such need to be reconciled. Future efforts to analyze time-stamped, prospective medical record data will be enhanced by patient-specific details. These developments represent important steps forward in developing real-time predictive models that can be integrated and readily deployed in electronic medical record systems to reduce unnecessary harm.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: The Dartmouth Clinical and Translational Science Institute supported RTE under the award number UL1TR001086 from the National Center for Advancing Translational Sciences (NCATS) of the National Institutes of Health (NIH). JJL is supported by the Burroughs Wellcome Fund Big Data in the Life Sciences training grant at Dartmouth. The funding bodies above did not have any role in the study design, data collection, analysis and interpretation, or writing of the manuscript.
Author Declarations:
All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript. Yes
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes
The EMR dataset curated from Dartmouth-Hitchcock records contains information that could compromise research participant privacy/consent and thus cannot be released due to HIPAA regulations. An IRB approval is required for on-site access and review of the data.
HAPI: Hospital Acquired Pressure Injuries
EMR: Electronic Medical Records
AUC: Area Under the Receiver Operating Curve; C-Statistic
LOS: Length of Stay
OR: Time in Operating Room
MICE: Multiple Imputation by Chained Equations
SMOTE: Synthetic Minority Over-Sampling Technique
CRP: C-Reactive Protein Levels
NPO: Patient’s Diet Taken by Mouth
SHAP: Shapley Additive Explanations