Abstract
Background The Gensini score (GS) provides a good assessment of the degree of coronary plate loading. However, its clinical significance has been little explored.
Methods In this retrospective cohort study, we implemented model development and performance comparison on database of The Fourth Affiliated Hospital of Zhejiang University School of Medicine (2019.1-2020.12). The patients were followed up for 2 years. Follow-up endpoint was the occurrence of MACCEs. We extracted clinical baseline data from each ACS patient within 24 hours of hospital admission and randomly divided the datasets into 70% for model training and 30% for model validation. Area under the curve (AUC) was used to compare the prediction performance of XGBoost, SGD and KNN. A decision tree model was constructed to predict the probability of MACCEs using a combination of weight features picked by XGBoost and clinical significance.
Results A total of 361 ACS patients who met the study criteria were included in this study. It could be observed that the probability of a recurrent MACCEs within 2 years was 25.2%. XGboost had the best predictive efficacy (AUC:0.97). GS has high clinical significance. Then we used GS, Age and CK-MB to construct a decision tree model to predict the probability model of MACCEs reoccurring, and the final AUC value reached 0.771.
Conclusions GS is a powerful indicator for assessing the prognosis of patients with ACS. The cut-off value of GS in the decision tree model provides a reference standard for grading the risk level of patients with ACS.
1. Introduction
Currently, the incidence of Acute Coronary Syndrome (ACS) remains high globally. As an acute episode of coronary heart disease (CAD), ACS has a relatively higher mortality and morbidity [1,2]. The Framingham Heart Study’s 10-year follow-up data revealed that the incidence of ACS events increases dramatically with age [3,4].
For ACS, percutaneous coronary intervention (PCI) has been formalized as an effective treatment [5,6]. Patients with ACS often have a combination of clinical risk factors, such as hypertension, cerebral infarction, diabetes mellitus, smoking, and so on. Therefore they have a higher probability of recurrent Major Adverse Cardiac and Cerebrovascular Events (MACCEs) after emergency PCI [7,8].
Scholars at home and abroad have launched a large number of studies to investigate the prediction of long-term MACCEs in patients with ACS. They developed indicators like TACE/ADAM17, Triglyceride-glucose index, HEART SCORE, Systemic immune-inflammation index (SII) and so on [7,9-11]. These predictors have achieved relatively satisfactory results. Unfortunately, none of them are directly linked to the ACS, but rather an indirectly mediated.
The Gensini score (GS) provides an objective assessment of the degree of coronary plaque load [12,13]. Wang KY et al. study has revealed the predictive value of GS in patients with CAD undergoing PCI [14]. However, they have not yet developed a valid GS-based predictive model for patients with CAD. Nowadays, Machine Learning (ML) models based on Artificial Intelligence (AI) theory are relatively mature in medicine [15,16]. ML has the advantage of being good at extracting non-linear relationships between different variables and showing researchers the weighting relationships by feature maps.
Adoption of the above background, we propose to apply ML approaches to elucidate the prognostic value of GS for ACS patients. Further more, to construct a GS-based prediction model the probability of a recurrent MACCEs within 2 years for ACS patients.
2. Methods
2.1 Date Source
All patients in the database of the Chest Pain Center of the Fourth Affiliated Hospital of Zhejiang University School of Medicine who were seen between 2019.1-2020.12 were included.
2.2 Ethical Considerations
The Ethics Committee of the Fourth Affiliated Hospital of Zhejiang University School of Medicine approved the study. The study adhered to the Declaration of Helsinki.
2.3 Study population
A total of 455 patients who needed medical attention for acute chest pain were exported from the database.The exclusion criteria for this study were as follows:
the patient has already received a previous coronary intervention
Chest pain of non-cardiac origin [19]
Seriously missing basic data
Abandonment of interventional treatment for their own reasons, opting for conservative treatment
Acute respiratory and cardiac arrest leading to failure of resuscitation
In this retrospective study, we examined whether MACCEs occurred within 2 years as the primary outcome. The follow-up date was up to 2023.2. We regularly followed up patients with ACS who underwent PCI at 1, 3, 6, 12, and 24 months, and if MACCEs occurred during this period, the follow-up was stopped immediately. At the duration of follow-up, the patient’s clinical information stored in the electronic medical record system was again evaluated and confirmed, including the patient’s previous underlying diseases (Hypertension, Diabetes Mellitus, Coronary Artery Disease,etc.), as well as the patient’s current clinical symptoms. The emergency coronary angiography and coronary stent implantation procedures were performed by 2 experienced coronary interventionists in our cardiology.
Routine hematology, liver and renal function, cardiac enzyme profile, NT-proBNP and troponin levels were recorded within 24 hours of the patient’s initial admission, and all of the above were available from our laboratory. All follow-up information was available in the Chest Pain Center database, and data that were not available were supplemented by telephone contact.
A total of 361 patients ultimately met the criteria for this study.
2.4 Gensini Score
For patients who met the study criteria, we performed GS calculations based on coronary angiography results. The degree of stenosis of coronary lesions was assessed using the coronary Gensini Score as follows: first, the largest stenotic lesions in each branch artery were quantitatively assessed on the basis of coronary angiography, and the degree of stenosis ≤25%, 26% to 50%, 51% to 75%, 76% to 90%, 91% to 99%, and 100% were recorded as 1, 2, 4, 8, 16 and 32 points, respectively. Then, according to the different coronary branches, the above scores were multiplied by the corresponding values:① Left Main artery (LM)×5.0. ② Left anterior descending branch (LAD): proximal LAD×2.5, middle LAD×1.5, distal LAD×1.0. ③ First diagonal branch (D1)×1.0, second diagonal branch (D2)×0.5. ④ Left levator branch (LCX): proximal LCX×2.5, middle LCX×1.5, middle LCX×1.5, and distal LCX×1.0. ⑤ Obtuse marginal branch (OM)×1.0. ⑥ Left posterior lateral branch (PL)×0.5. ⑦ Right coronary artery (RCA)×1.0. ⑧ Posterior descending branch (PDA)×1.0. Finally, the total score of each lesion was summed to obtain the patient’s coronary Gensini score. In this study, the degree of coronary stenosis evaluated with GS was classified on a 4-point scale: none (GS=0), mild (0 GS≥20), moderate (20≤GS<50) or severe (GS≥50).
2.5 Statistical analysis
All statistical analysis and calculations were performed using R software (version 4.3.1) and Python (version 3.9.0). Categorical variables are expressed as total numbers and percentages. Continuous variables are expressed as mean±standard deviation. P < 0.05 was considered statistically significant for this study.
Given the multidimensional characteristics among the variables, we first used Pearson’s heat map to observe the correlation among the variables and used the t-test method for data downscaling [20]. Three ML models -XGBoost, SGD, and KNN were used to develop predictive models [21-25]. The downscaled data is fed into a subsequent ML model. The predictive performance of each model was evaluated by the AUC. In addition, we calculated the Recall, Positive Predictive Value (PPV), Negative Predictive Value (NPV), and F1 Score. In addition, to evaluate the utility of the decision models by quantifying the net benefit at different threshold probabilities, we performed Decision Curve Analysis (DCA) [26,27]. Considering the single-center data, we also used K-fold cross-validation to check the stability of our model.
We use the optimal predictive model to observe the weighting characteristics of the variables. Selected features were combined with clinical significance to construct a Decision Tree model to predict the incidence of MACCEs in patients over a 2-year period [28]. And its decision-making performance is evaluated by AUC. If necessary, we will focus on the prognosis of this indicator using survival analysis curves.
The flow of this study is shown schematically in Figure 1.
3. Results
3.1 Patient Characteristics
As seen in Table 1, a total of 361 patients were included in the study. 91 patients (25.2%) had MACCEs events within 2 years. Among the Table 1, GS showed a statistically significant difference within the two groups mentioned above (P<0.001). To facilitate machine learning, we set the time-years in which MACCEs occur as independent labels as the annotation of the data.
(Y:Yes, N:=None)
3.2 Correlation between variables in this study
As can be seen in Figure 2, LDH was highly correlated with the ALT variable (coefficient was 0.96), Total Cholesterol was highly correlated with HDL and LDL (coefficients were 0.77 and 0.93, respectively), and NLR was highly correlated with SII (coefficient was 0.9). The above heat map provides the basis for subsequent data downscaling.
3.3 Development and validation of models
We preprocessed and cleaned the data and randomly divided the dataset into 70% training (n=252) set and 30% testing set (n=109) . Test sets are often representative of a model’s ability to generalize. From Figure 3, Figure 4 and Table 2, it can be seen that the XGBoost model has the best performance. The accuracy was 0.963, the AUC value was 0.97, and the clinical net benefit curves were higher than the remaining two models.
3.4 Observation of feature weights
As can be seen in Figure 5, the weight share of GS is the most important in the whole model. This provides a theoretical basis for selecting variables for our subsequent modeling.
3.5 Decision Tree Modeling Predicts Likelihood of MACCEs in 2 Years
Based on the weighting characteristics of the XGBoost screening variables and combining the clinical significance, we selected the following variables for model construction: GS, Age, and CK-MB. The Decision Tree (DT) represents a mapping relationship between object attributes and values, so its Cut-off value for each node contributes to the grading of the condition’s hazard level. We assessed the predictive efficacy of the DT by the ROC curve, which has a considerable AUC value of 0.771. In Figure 8, we have utilized the Cut-off values of the decision tree to plot the survival curves with respect to GS.
4. Discussions
This study utilizes advanced ML methods, precise criteria were applied to the collected clinical characterization variables. Takes advantage of the fact that ML is good at capturing the characteristics of non-linear data. The main features of XGBoost are its simplicity, robustness, and the ability to automatically fill in missing values [22]. It tends to have superior predictive performance in datasets with small samples. While the inner workings of ML remain elusive to us, the Features graph can still help us reveal the characteristics of the data variables. The graph of the weighted characteristics of the variables revealed that GS is a very valuable indicator in predicting the prognosis of ACS patients. The results of this study coincide with the research by Wang et al [14].
The advantage of GS is that it can capitalize on the experience of the coronary operator to quantitatively score the specific degree of stenosis in the coronary artery. Its scoring can be refined down to tiny coronary branches, so it provides a very objective assessment of a person’s plaque load. But this score does not address bifurcation, calcification, and tortuous lesion characteristics. Thus, GS is in a sense more appropriate for patients after emergency PCI where the resident has an approximate estimate of the patient’s prognosis.
However, some of the more established indicators were not remarkable in this study. Examples include SII, NLR, PLR, etc.[11,29]. It may be considered that some patients go directly to the hospital with chest pain and the inflammatory storm in the organism has not yet reached its peak [30,31]. Inflammatory stimuli lead to abnormal activation of platelets, increased thrombus load, and excessive and abnormal endothelial proliferation of blood vessels. The weighted significance of such advanced inflammatory indicators was not found in this small sample study. Perhaps the smaller amount of data caused it.
Acute chest pain of cardiac origin is often characterized by high lethality and their patients themselves tend to have a high number of comorbid clinical conditions. Symptoms such as coronary revascularization, or the occurrence of cerebral infarction or cardiogenic shock are therefore common within a short period of time, so predicting the occurrence of MACCEs is of particular interest. We established this prediction model for this purpose can determine the prognosis of patients more quickly, which can provide a reference for clinical decision-making. In DT, each forked path represents the value of one of the possible attributes, while each leaf node corresponds to the value of the object represented by the path traveled from the root node to that leaf node [28]. And according to the current literature research, there is no relevant study that grades the GS for coronary load with a more rigorous degree of risk. DT fit right in with the original intent of our study.
Based on the Featrues and clinical significance, we selected the above 3 indicators for the following reasons. Kolovou G’research team has demonstrated that age is strongly associated with the development of coronary heart disease [32,33]. Similarly, CK-MB is uniquely specific and sensitive in the diagnosis of ACS [34,35]. Age, CK-MB integrated with GS obtained by coronary angiography results to construct the predictive model. In the DT, the Cut-off value of each node has a more significant result. We can draw on each node value to provide a theoretical basis for grading the severity of coronary heart disease.
And the significance of this prediction model is to actively guide the treatment of patients who do not understand the concept of the condition, to improve their prognosis, and to reduce the incidence of MACCEs within 2 years. Finally, we can see from the survival curves that the lower the GS score, the better the long-term prognosis of the patient, which is consistent with clinical practice.
5. Limitations
There are inevitable limitations to this study. First, the study was single-center data. The generalizability of the model is debatable. Second, there is a flaw in the node setting for the follow-up time. It is possible that a model that records the time from admission time to the occurrence of MACCEs would work better. Third, the follow-up studies were all conducted under ideal conditions based on good patient medical compliance and regular medication intake, and did not include other confounding factors after PCI. We are still working on data collection to increase the robustness of subsequent models.
6. Conclusions
Our study shows that GS is a persuasive metric for determining the long-term prognosis of ACS patients based on machine learning theoretically investigated. And the DT model constructed using GS, Age and CK-MB indicators had good predictive efficacy for the occurrence of MACCES within 2 years in ACS patients after PCI. The Cut-Off value of GS in the DT can provide a theoretical basis for subsequent grading of coronary risk level.
Data Availability
The data that support the findings of this study are available on request from the corresponding author upon reasonable request.
7. Ethics approval and consent to participate
The Ethics Committee of the Fourth Affiliated Hospital of Zhejiang University School of Medicine approved the study.
8. Consent for publication
All authors agreed to the publication of the article.
9. Conflicts of Interest
None declared.
10. Authors’ Contributions
Lixia Chen and Sixiang Jia conceived the study. Sixiang Jia completed the dissertation. Sixiang Jia and Xuanting Mou completed the statistical analysis of the data. Yiting Tu reviewed the article and provided suggestions. Lixia Chen, Wenting Lin and Chao Feng were responsible for data organization and collection. Shudong Xia is responsible for the content of the article in its entirety.
11. Funding
This study was supported by National Natural Science Foundation of China (No.81971688)
13. Availability of data and material
The dataset cannot be shared as it relates to patient privacy.
12. Acknowledgements
Thanks to all the authors for their hard work.
Abbreviations
- ACS
- Acute Coronary Syndrome
- ALT
- Alanine Transaminase
- APTT
- Activated Partial Thromboplastin Time
- AUC
- Area Under the Curve
- CK
- Creatine Kinase
- CK-MB
- Creatine Kinase-Myocardial Band
- CRP
- C-Reactive Protein
- D-D
- D-dimer
- DT
- Decesion Tree
- GS
- Gensini Score
- K
- Kalium
- KNN
- K-Nearest Neighbor
- LDH
- Lactic Dehydrogenase;
- MACCEs
- Major Adverse Cardiac and Cerebrovascular Events
- ML
- Machine Learning
- NPV
- Negative Predictive Value
- NLR
- Neutrophil-to-Lymphocyte Ratio
- PCI
- Percutaneous Coronary Intervention
- PLR
- Platelet-to-Lymphocyte Ratio
- PPV
- Positive Predictive Value
- PT
- Prothrombin time
- ROC
- Receiver Operating Characteristic Curve
- SGD
- Stochastic Gradient Descent
- SII
- Systemic Immune-Inflammation index
- TT
- Thrombin Time