Abstract
Hospital-acquired infections (HAIs) contribute to increased mortality rates and extended hospital stays. Patients with complex neurological impairments, secondary to conditions such as acquired brain injury or progressive degenerative conditions, are particularly prone to HAIs and often have the worst resulting clinical outcomes and the highest associated cost of care. Research indicates that prompt identification of such infections can significantly reduce mortality and shorten hospitalisation. The current standard of care for timely detection of HAIs in inpatient acute and post-acute care settings in the UK is the National Early Warning Score 2 (NEWS2). Despite its strengths, NEWS2 has been shown to have poor prognostic accuracy for specific indications, such as infections. This study developed a machine learning (ML)-based risk stratification tool to predict the likelihood of infection in inpatients with complex acquired neurological conditions, using routinely collected electronic health record (EHR) data covering more than 800 patients and more than 400,000 observations collected over 4 years. The framework was built with a combination of historical patient data, clinical coding, observations, clinician-reported outcomes, and textual data. We evaluated its ability to identify individuals at elevated risk of infection within a 7-day time frame, retrospectively, over a 1-year “silent-mode” evaluation. We investigated several time-to-event model configurations, including manual feature-based and data-driven deep generative techniques, to jointly estimate the timing and risk of infection onset.
The models developed in this study achieved high prognostic accuracy and robust calibration from 72 to 6 hours before clinical suspicion of infection, with AUROC values ranging from 0.776 to 0.889 and well-calibrated risk estimates across those time intervals (integrated Brier score, IBS < 0.178). Furthermore, by assigning model-generated risk scores to distinct categories (low, moderate, high, severe), we separated patients with a higher susceptibility to infection from those with lower risk profiles. Post-hoc explainability analysis identified key risk factors, such as vital signs, recent infection history, and patient age, which aligned well with prior clinical knowledge. Our findings highlight the framework’s potential to deliver accurate and explainable insights, facilitating clinician trust and supporting integration into real-world patient care workflows. Given the heterogeneous and complex patient population, and our under-utilisation of the data recorded in routine clinical notes and lab reports, future research has considerable scope to improve performance by expanding the model’s multimodal capabilities, improving its generalisability, and adding model personalisation steps.
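As an illustration of the stratification step described above, the following minimal sketch maps a model-generated risk score to one of the four categories. The cut-points, the 0 to 1 score range, and the function name `risk_category` are hypothetical assumptions for illustration only; the abstract does not state the thresholds used in the study.

```python
from bisect import bisect_right

# Hypothetical cut-points on a 0-1 risk score (assumed; not taken from the study).
THRESHOLDS = (0.25, 0.50, 0.75)
LABELS = ("low", "moderate", "high", "severe")


def risk_category(score: float) -> str:
    """Map a risk score in [0, 1] to a category label.

    A score exactly equal to a cut-point falls into the higher category.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"risk score must lie in [0, 1], got {score}")
    # bisect_right counts how many cut-points are <= score,
    # which is the index of the matching label.
    return LABELS[bisect_right(THRESHOLDS, score)]


if __name__ == "__main__":
    for s in (0.10, 0.25, 0.60, 0.90):
        print(f"{s:.2f} -> {risk_category(s)}")
```

In practice, cut-points of this kind would be chosen on held-out data, for example to balance alert burden against sensitivity, rather than fixed in advance.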
Competing Interest Statement
A.P.C. is an employee and shareholder of Sanome, UK. T.P. is a shareholder of Sanome and was employed at Sanome during the completion of this work. P.A. is an employee and shareholder of PatientSource. L.B. is an employee of RHN. S.D. is an employee of RHN.
Funding Statement
This study was funded by Sanome Ltd.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study was conducted in accordance with the ethical principles outlined in the Declaration of Helsinki and received a favourable ethical opinion from the NHS Research Ethics Committee (reference 24/NE/0008), supported by the Confidentiality Advisory Group (reference 23/CAG/0110). All data handling and analysis were performed in compliance with confidentiality guidelines and data protection regulations. The use of retrospective patient data was ethically reviewed to ensure that the study adhered to the principles of transparency, data privacy, and minimal risk to participants.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
tom.pease{at}sanome.com
philip.ashworth{at}patientsource.co.uk
lbradley{at}rhn.org.uk
sduport{at}rhn.org.uk
Data Availability
All data referenced in this manuscript are hospital data and are therefore not publicly available due to patient privacy and confidentiality restrictions. Access may be granted on request to the corresponding author, subject to relevant institutional and ethical approvals.