ABSTRACT
Background Few tools are designed specifically to assess mortality risk in patients with atrial fibrillation (AF). This study aimed to use machine learning methods to identify relevant variables and to develop an easily applicable prognostic score for predicting 1-year mortality in patients with AF.
Methods This single-center retrospective cohort study used the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database and included patients aged 18 years or older with AF. Candidate variables comprised demographic characteristics, comorbidities, clinical scores, vital signs, laboratory test results, and medication use. Variable importance from an XGBoost model guided the development of a logistic regression model, which formed the basis of the AF score. Decision curve analysis was used to compare the AF score with other scores. Data were analyzed with Python and R.
Results A cohort of 59,595 patients with AF was obtained from the MIMIC-IV database; these patients were predominantly elderly (median age 77.3 years) and male (55.6%). The XGBoost model predicted 1-year mortality with an area under the curve (AUC) of 0.833 (95% confidence interval: 0.826-0.839) and highlighted the importance of the Charlson Comorbidity Index (CCI) and the presence of metastatic solid tumors.
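To illustrate how an AUC and a 95% confidence interval of this kind can be obtained, the following is a minimal Python sketch, not the authors' code. It computes the AUC as the proportion of (event, non-event) pairs that the model ranks correctly and, as an assumption, uses a percentile bootstrap for the interval; the abstract does not state which interval method was used. The function names `auc` and `bootstrap_auc_ci` and the toy labels and scores are hypothetical.

```python
import random


def auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random
    negative; tied scores count as half a correct ranking."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    chosen here only for illustration)."""
    auc(y_true, y_score)  # validates that both classes are present
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class; draw again
        stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data only: 1 = died within 1 year, 0 = survived (hypothetical values).
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9]
print(auc(y_true, y_score))  # 0.875 (14 of 16 pairs ranked correctly)
print(bootstrap_auc_ci(y_true, y_score))
```

The pairwise loop is quadratic in the number of patients, which is fine for a sketch but would need a rank-based implementation for a cohort of this size.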
The CRAMB score (Charlson comorbidity index, readmission, age, metastatic solid tumor, and maximum blood urea nitrogen) outperformed the CCI and CHA2DS2-VASc scores in predicting 1-year mortality. In the test set, the AUC for the CRAMB score was 0.756 (95% confidence interval: 0.748-0.764), higher than that of the CCI (0.720; 95% confidence interval: 0.712-0.728) and the CHA2DS2-VASc score (0.609; 95% confidence interval: 0.600-0.618). Decision curve analysis showed that the net benefit of the CRAMB score was consistently positive and greater than that of the default strategies and the other scoring systems across the entire threshold range. The calibration plot for the test set indicated that the CRAMB score was well calibrated.
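For readers unfamiliar with decision curve analysis, the following minimal Python sketch shows how net benefit is conventionally computed at a single threshold probability and compared with the "treat all" default strategy (the "treat none" strategy has a net benefit of zero by definition). It uses the standard formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. It is not the authors' code; the function names and toy data are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk is at or
    above the threshold probability."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the default strategy that treats every patient."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data only: 1 = died within 1 year, 0 = survived (hypothetical values).
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9]

for pt in (0.1, 0.3, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(f"pt={pt:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}  treat-none=0.000")
```

A decision curve is obtained by repeating this calculation over a grid of thresholds and plotting net benefit against the threshold for each score and each default strategy.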
Conclusions This study’s primary contribution is the establishment of a benchmark for using machine learning models in the construction of a score for mortality prediction in AF. The CRAMB score was developed from a large-sample dataset, with XGBoost models used for predictor screening. The simplicity of the CRAMB score makes it user-friendly and applicable to a broader and more heterogeneous AF population.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work was supported by the Real World Study Project of Hainan Boao Lecheng Pilot Zone (Real World Study Base of NMPA) (No. HNLC2022RWS017).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Approval for the MIMIC-IV database was granted by the Massachusetts Institute of Technology (Cambridge, MA) and Beth Israel Deaconess Medical Center (Boston, MA), with consent obtained for the initial data collection.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The data used in this study can be accessed on PhysioNet's website at https://physionet.org/content/mimiciv/2.2/.
List of abbreviations
- ABC-death: Age, biomarkers (N-terminal pro B-type natriuretic peptide, troponin T, growth differentiation factor-15), and clinical history of heart failure
- AF: Atrial fibrillation
- AUC: Area under the curve
- BASIC-AF risk score: Biomarkers, age, ultrasound, intraventricular conduction delay, and clinical history
- BUN: Blood urea nitrogen
- CCI: Charlson Comorbidity Index
- CHA2DS2-VASc: Congestive heart failure, hypertension, age ≥75 years, diabetes mellitus, prior stroke or transient ischemic attack (TIA) or thromboembolism, vascular disease, age 65 to 74 years, sex category
- CI: Confidence interval
- CRAMB: Charlson comorbidity index, readmission, age, metastatic solid tumor, and maximum blood urea nitrogen
- DCA: Decision curve analysis
- HR: Hazard ratio
- ICU: Intensive care unit
- MIMIC-IV: Medical Information Mart for Intensive Care-IV
- OR: Odds ratio
- ROC: Receiver operating characteristic