ABSTRACT
Background Real-time prediction is key to prevention and control of healthcare-associated infections. Contacts between individuals drive infections, yet most prediction frameworks fail to capture the dynamics of contact. We develop a real-time machine learning framework that incorporates dynamic patient contact networks to predict patient-level hospital-onset COVID-19 infections (HOCIs), which we test and validate on international multi-site datasets spanning epidemic and endemic periods.
Methods Our framework extracts dynamic contact networks from routinely collected hospital data and combines them with patient clinical attributes and background contextual hospital data to forecast the infection status of individual patients. We train and test the HOCI prediction framework using 51,157 hospital patients admitted to a UK (London) National Health Service (NHS) Trust from 01 April 2020 to 01 April 2021, spanning UK COVID-19 surges 1 and 2. We then validate the framework by applying it to data from a non-UK (Geneva) hospital site during an epidemic surge (40,057 total inpatients) and to data from the same London Trust from a subsequent period post surge 2, when COVID-19 had become endemic (43,375 total inpatients).
Findings Based on the training data (London data spanning surges 1 and 2), the framework achieved high predictive performance using all variables (AUC-ROC 0·89 [0·88–0·90]) but was almost as predictive using only contact network variables (AUC-ROC 0·88 [0·86–0·90]), and more so than using only hospital contextual (AUC-ROC 0·82 [0·80–0·84]) or patient clinical (AUC-ROC 0·64 [0·62–0·66]) variables. The top three risk factors we identified consisted of one hospital contextual variable (background hospital COVID-19 prevalence) and two contact network variables (network closeness, and number of direct contacts to infectious patients), and together achieved AUC-ROC 0·85 [0·82–0·88]. Furthermore, the addition of contact network variables improved performance relative to hospital contextual variables on both the non-UK (AUC-ROC increased from 0·84 [0·82–0·86] to 0·88 [0·86–0·90]) and the UK validation datasets (AUC-ROC increased from 0·52 [0·49–0·53] to 0·68 [0·64–0·70]).
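To make the two contact network variables named above concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes, hypothetically, that a contact is two patients staying on the same ward at overlapping times, and it uses one common definition of closeness (number of reachable patients divided by the sum of shortest-path distances to them). The record layout, function names, and toy data are all illustrative assumptions; the framework's actual contact definition and closeness normalisation may differ.

```python
from collections import deque
from itertools import combinations

# Hypothetical ward-stay records: (patient_id, ward, start_day, end_day),
# with end_day exclusive. These are toy values, not study data.
stays = [
    ("A", "W1", 0, 3),
    ("B", "W1", 1, 4),
    ("C", "W1", 2, 5),
    ("C", "W2", 5, 7),
    ("D", "W2", 6, 8),
    ("E", "W3", 0, 2),
]


def build_contact_graph(stays, window_start, window_end):
    """Assumed contact rule: same ward, overlapping time, inside the window."""
    graph = {}
    # Register every patient present in the window, including isolated ones.
    for patient, _, start, end in stays:
        if start < window_end and end > window_start:
            graph.setdefault(patient, set())
    for (p, ward_p, s_p, e_p), (q, ward_q, s_q, e_q) in combinations(stays, 2):
        if p == q or ward_p != ward_q:
            continue
        overlap_start = max(s_p, s_q, window_start)
        overlap_end = min(e_p, e_q, window_end)
        if overlap_start < overlap_end:
            graph[p].add(q)
            graph[q].add(p)
    return graph


def closeness(graph, node):
    """Reachable patients divided by the sum of shortest-path distances (BFS)."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    total = sum(dist.values())
    if total == 0:  # isolated patient: no contacts in this window
        return 0.0
    return (len(dist) - 1) / total


def infectious_contacts(graph, node, infectious):
    """Number of direct contacts of `node` that are currently infectious."""
    return len(graph[node] & set(infectious))


if __name__ == "__main__":
    graph = build_contact_graph(stays, window_start=0, window_end=8)
    for patient in sorted(graph):
        print(patient, closeness(graph, patient),
              infectious_contacts(graph, patient, {"A"}))
```

Recomputing the graph over a sliding window (for example, calling `build_contact_graph` once per day) is one simple way to obtain time-varying features of the kind the abstract calls dynamic; the window length here is an arbitrary choice for illustration.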
Interpretation Our results suggest that dynamic patient contact networks can be a robust predictor of respiratory viral infections spreading in hospitals. Their integration in clinical care has the potential to enhance individualised infection prevention and early diagnosis.
Funding Medical Research Foundation, World Health Organization, Engineering and Physical Sciences Research Council, National Institute for Health Research, Swiss National Science Foundation, German Research Foundation.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
AM was supported in part by a scholarship from the Medical Research Foundation National PhD Training Programme in Antimicrobial Resistance Research (MRF-145-0004-TPG-AVISO), as well as by the National Institute for Health Research Academy. RLP was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) Project-ID 424778381-TRR 295. MO received funding from the Swiss National Science Foundation. AH is a National Institute for Health Research Senior Investigator. AH is also partly funded by the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Healthcare Associated Infections and Antimicrobial Infections in partnership with Public Health England, in collaboration with Imperial Healthcare Partners, University of Cambridge and University of Warwick (NIHR grant code: NIHR200876). AM, RLP, and MB acknowledge funding from EPSRC grant EP/N014529/1 to MB, supporting the EPSRC Centre for Mathematics of Precision Healthcare. The underlying investigation also received financial support from the World Health Organization (WHO). The research was also supported by the NIHR Imperial Biomedical Research Centre and by the iCARE environment and used the iCARE team and data resources. The views expressed in this publication are those of the author(s) and not necessarily those of the NHS, the National Institute for Health Research, the Department of Health and Social Care or Public Health England or those of the WHO.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Patient data were extracted and de-identified from the business intelligence system (London), iCARE (London), and from in-house electronic health records (Geneva). All analyses were approved by ethical committees (London: Imperial College London NHS Trust service evaluations [Ref: 386, 379, 473] and ethics approval under 15_LO_0746; Geneva: Cantonal Ethics Committee [no. CCER 2020-00827]).
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
* m.barahona{at}imperial.ac.uk
Data Availability
The processed anonymised training and testing dataset used in this study can be made available upon reasonable request to the corresponding author. Patient pathways will not be provided as these are withheld by the corresponding author's organisation to preserve patient privacy. Data from the Imperial Clinical Analytics Research and Evaluation (iCARE) platform used in this study may be available to researchers on request. External validation data sources will not be provided as these are withheld by their owners. Data regarding hospital COVID-19 admissions is freely available via the NHS website. The code of the method is freely available as an R package with exemplar data sets.