Abstract
Background Atrial fibrillation (AF) ablation is an effective treatment for reducing episodes and improving quality of life in patients with AF. However, in some patients there are only modest long-term AF-free rates after AF ablation. There is a need to address the limited benefits some patients experience by developing predictive algorithms to improve AF ablation outcomes.
Objective The authors aim to use machine learning models on claims data to explore whether innovative coding models predict patient outcomes after AF ablation better than traditional stroke risk scores.
Methods Merative MarketScan® Research Medicare data were used to examine claims for AF ablation. Logistic regression and XGBoost models were used to predict 1-year AF-free outcomes after AF ablation. Model predictions were compared with the established risk scores CHADS2 and CHA2DS2-VASc. The models were also assessed on subgroups of patients with paroxysmal AF, persistent AF, and both AF and atrial flutter from 2015 onwards.
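As an illustration of the comparison described in Methods, the following minimal sketch computes the two risk scores from their standard published point assignments and evaluates a score against a binary outcome with a rank-based AUC. The function names, argument names, and toy data are hypothetical and are not taken from the study's code; how the authors mapped risk scores to the AF-free outcome is not stated here, so the direction of the score is an assumption.

```python
from typing import Sequence


def chads2(chf: bool, hypertension: bool, age: int,
           diabetes: bool, stroke_tia: bool) -> int:
    """CHADS2: 1 point each for CHF, hypertension, age >= 75, diabetes;
    2 points for prior stroke/TIA."""
    return (int(chf) + int(hypertension) + int(age >= 75)
            + int(diabetes) + 2 * int(stroke_tia))


def cha2ds2_vasc(chf: bool, hypertension: bool, age: int, diabetes: bool,
                 stroke_tia: bool, vascular_disease: bool, female: bool) -> int:
    """CHA2DS2-VASc: age >= 75 scores 2, age 65-74 scores 1, and vascular
    disease and female sex each add 1 point."""
    age_points = 2 if age >= 75 else (1 if age >= 65 else 0)
    return (int(chf) + int(hypertension) + age_points + int(diabetes)
            + 2 * int(stroke_tia) + int(vascular_disease) + int(female))


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical data): label 1 = AF recurrence within 1 year,
# assuming a higher risk score indicates a higher chance of recurrence.
toy_labels = [1, 0, 1, 0]
toy_scores = [
    chads2(True, True, 80, False, True),
    chads2(False, False, 60, False, False),
    chads2(False, True, 70, False, False),
    chads2(False, True, 78, False, False),
]
toy_auc = auc(toy_labels, toy_scores)
```

The same `auc` function could be applied to predicted probabilities from a logistic regression or XGBoost model, which would put the model and the risk scores on a common scale for comparison.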
Results The sample included 14,521 patients with claims for AF ablation. XGBoost achieved an area under the receiver operating characteristic curve (AUC) of 0.525, 0.521, and 0.527 for the entire AF ablation population, female patients, and male patients, respectively. Machine learning models performed best for the paroxysmal AF subgroup using ICD codes, demographic information, and comorbidity indexes, achieving an AUC of 0.546.
Conclusion Machine learning models outperform CHADS2 and CHA2DS2-VASc in all AF ablation patient groups (whole population, female, and male). Using data for patients who had their AF ablation in or after 2015, machine learning models perform best in all subgroups and in the overall population, indicating that including ICD codes in machine learning models may improve performance.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study was supported by the National Institutes of Health / National Heart, Lung, and Blood Institute award R21HL156184.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Emory University's Institutional Review Board waived ethical approval for this work.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
Funding: This work was supported by grant R21HL156184 (PI: Vicki Stover Hertzberg) from the National Institutes of Health to Emory University, Atlanta, Georgia.
Data Availability
The machine learning models produced in the present study are available upon reasonable request to the authors. The code is also publicly released in a GitHub repository.
Abbreviations
- AF: Atrial fibrillation
- AUC: Area under the curve
- CCI: Charlson comorbidity index
- Com: Comorbidity indices
- Demo: Demographic characteristic(s)
- ECG: Electrocardiogram
- ECI: Elixhauser comorbidity index
- EHR: Electronic health record(s)
- I Table: Inpatient Admission Table
- ICD: International Classification of Diseases
- ML: Machine learning
- O Table: Outpatient Services Table
- ROC: Receiver operating characteristic
- S Table: Inpatient Services Table
- SD: Standard deviation
- US: United States
- XGB: XGBoost
- XGBoost: Extreme Gradient Boosting