Abstract
Background Catheter-associated urinary tract infections (CA-UTIs) significantly increase clinical burdens. Identifying patients at high risk of CA-UTIs is crucial in clinical practice. In this study, we developed and externally validated an explainable, prognostic prediction model of CA-UTIs among hospitalized individuals receiving urinary catheterization.
Methods We used a retrospective cohort design with data from a clinical research database covering three hospitals in Taiwan. We developed the prediction model using data from two hospitals and used the third hospital’s data for external validation. We selected predictors by multivariate regression analysis using a Cox proportional-hazards model. Both statistical and computational machine learning algorithms were applied for predictive modeling: (1) ridge regression; (2) decision tree; (3) random forest (RF); (4) extreme gradient boosting; and (5) deep-insight visible neural network. We evaluated calibration, clinical utility, and discrimination ability on the validation set to choose the best model. Shapley additive explanations were used to assess the explainability of the best model.
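The three evaluation criteria named above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy labels and probabilities, and the choice of the Brier score and decision-curve net benefit as stand-ins for "calibration" and "clinical utility" are all assumptions made for illustration. The Brier score in particular mixes calibration with discrimination, so it is only a rough proxy.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Discrimination: chance that a random positive is ranked above a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count pairwise wins; ties count as half a win.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Clinical utility at one risk threshold, as used in decision curve analysis."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are down-weighted by the odds of the threshold.
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


# Toy data (hypothetical, not from the study).
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]

print("AUROC:", auroc(labels, probs))
print("Brier score:", brier_score(labels, probs))
print("Net benefit at 0.3:", net_benefit(labels, probs, 0.3))
```

In a comparison like the one described, each candidate model would be scored this way on the same held-out data, and the metrics weighed together rather than ranked on a single number.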
Results We included 122,417 instances from 20-to-75-year-old subjects with multiple visits (n=26,401) and multiple orders of urinary catheterization per visit (n=35,230). Fourteen predictors were selected from 20 candidate variables. The best prediction model was the RF for predicting CA-UTIs within 6 days. It detected 97.63% (95% confidence interval [CI]: 97.57%, 97.69%) of CA-UTI-positive cases, and 97.36% (95% CI: 97.29%, 97.42%) of individuals predicted to be CA-UTI negative were true negatives. Among those predicted to be CA-UTI positive, we expected 22.85% (95% CI: 22.79%, 22.92%) to truly be high-risk individuals. We also provide a web-based application and a paper-based nomogram for using the best model.
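The three percentages reported above appear to correspond to what are commonly called sensitivity, negative predictive value (NPV), and positive predictive value (PPV). The short Python sketch below shows how each is derived from confusion-matrix counts. The class name and the counts are hypothetical and are not the study's data; they were chosen only to show how a model can combine high sensitivity and NPV with a modest PPV when the outcome is uncommon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cell counts of a 2x2 confusion matrix (hypothetical container)."""
    tp: int  # predicted positive, truly positive
    fp: int  # predicted positive, truly negative
    fn: int  # predicted negative, truly positive
    tn: int  # predicted negative, truly negative


def sensitivity(c: ConfusionCounts) -> float:
    """Share of true positives that the model detects."""
    return c.tp / (c.tp + c.fn)


def npv(c: ConfusionCounts) -> float:
    """Share of predicted negatives that are truly negative."""
    return c.tn / (c.tn + c.fn)


def ppv(c: ConfusionCounts) -> float:
    """Share of predicted positives that are truly positive."""
    return c.tp / (c.tp + c.fp)


# Illustrative counts only: 100 true cases among 10,000 instances.
counts = ConfusionCounts(tp=95, fp=300, fn=5, tn=9600)

print(f"sensitivity = {sensitivity(counts):.4f}")
print(f"NPV         = {npv(counts):.4f}")
print(f"PPV         = {ppv(counts):.4f}")
```

With a rare outcome, even a small false-positive rate produces many more false positives than true positives, which is why PPV can stay low while sensitivity and NPV are both high.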
Conclusions Our prediction model was clinically accurate, detecting most CA-UTI-positive cases while correctly ruling out most individuals predicted to be negative. However, future studies are needed to prospectively evaluate the implementation, validity, and reliability of this prediction model among users of the web application and nomogram, and the model’s impacts on patient outcomes.
Competing Interest Statement
The authors have declared no competing interest.
Clinical Protocols
https://github.com/herdiantrisufriyana/ml_nomogram
Funding Statement
This work was supported by: (1) the Postdoctoral Accompanies Research Project from the National Science and Technology Council in Taiwan [grant no. NSTC111-2811-E-038-003-MY2] to HS; (2) the Ministry of Science and Technology in Taiwan [grant nos. MOST110-2628-E-038-001 and MOST111-2628-E-038-001-MY2] to ECYS; and (3) the Higher Education Sprout Project from the Ministry of Education in Taiwan [grant nos. DP2-111-21121-01-A-05 and DP2-TMU-112-A-13] to ECYS. These funding bodies had no role in the study design; in the collection, analysis, and interpretation of the data; in the writing of the report; or in the decision to submit the article for publication. The authors declare that they have no competing interests.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The Taipei Medical University Joint Institutional Review Board gave ethical approval for this work (TMU-JIRB no.: N202209030).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The data that support the findings of this study are available from the Clinical Data Center, Office of Data Science, Taipei Medical University, Taiwan, but restrictions apply to the availability of these data, which were used under license for the current study (access approval no.: A202206008), and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of the Clinical Data Center. To obtain this permission, one needs to request access from the Clinical Data Center (https://ods.tmu.edu.tw/). We shared the programming code for all analyses in this study (https://github.com/herdiantrisufriyana/colab_uti), including predictive modeling.