Abstract
Background Ischemic stroke (IS) is a leading cause of disability and mortality worldwide. A growing number of reports suggest that blood pressure, blood glucose, blood lipids, and their metabolic products are strongly associated with IS.
Methods We extracted genetic instruments for blood pressure, blood glucose, blood lipids, and their metabolites as instrumental variables and paired them with GWAS data on IS. Mendelian randomization (MR) analysis was then performed to assess the effect of these exposures on the disease. For exposures with positive MR results, colocalization analysis was performed to identify genes shared between the exposures and IS. We then performed differential expression analysis on GEO data to identify differentially expressed associated genes (DEAGs) among the shared genes. Four machine learning models were used to assign importance scores to these DEAGs. A nomogram was built from the genes with high importance scores to predict the risk of IS from DEAG expression.
Results Blood pressure and blood glucose were positively correlated with the risk of IS onset, whereas blood lipids and their metabolic products were either positively or negatively correlated with that risk. Sixty-four genes were shared between IS and blood pressure, blood lipids, and their metabolic products. Thirteen DEAGs were obtained, among which FURIN, MAN2A2, HDDC3, ALDH2, and TOMM40 were identified as feature genes and used to build a nomogram that quantitatively predicts the risk of IS onset from their expression. Cluster analysis indicated that DEAG expression is related to immune inflammation, angiogenesis and development, lipid metabolism, and other processes.
Conclusion This study suggests that blood pressure, blood glucose, blood lipids, and their metabolic products are significantly associated with IS, and predicts that these exposures regulate the occurrence, development, and prognosis of IS mainly through mechanisms such as DNA repair, DNA methylation, mitochondrial repair, apoptosis, and autophagy.
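As an illustration of the MR step described in Methods, the sketch below computes a fixed-effect inverse variance weighted (IVW) estimate from per-SNP summary statistics. The abstract does not state which estimator or software was used; IVW appears only in the abbreviation list, so its use here is an assumption. The function name `ivw_estimate` and the numbers in the usage example are hypothetical and are not taken from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal effect of an exposure on an outcome.

    Each argument is a sequence with one entry per SNP (instrumental variable):
    SNP-exposure effect, SNP-outcome effect, and the standard error of the
    SNP-outcome effect. Returns (estimate, standard_error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have one entry per SNP")
    if len(beta_exposure) == 0:
        raise ValueError("at least one SNP is required")

    # Weighted regression of outcome effects on exposure effects through the
    # origin, with weights 1 / se_outcome^2.
    numerator = sum(
        bx * by / se**2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx**2 / se**2 for bx, se in zip(beta_exposure, se_outcome)
    )
    if denominator == 0:
        raise ValueError("SNP-exposure effects are all zero")

    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error


if __name__ == "__main__":
    # Hypothetical summary statistics for three SNPs (illustration only).
    bx = [0.10, 0.20, 0.15]
    by = [0.021, 0.039, 0.032]
    se = [0.010, 0.012, 0.011]
    beta, beta_se = ivw_estimate(bx, by, se)
    print(f"IVW estimate (log odds per unit exposure): {beta:.4f} (SE {beta_se:.4f})")
```

In practice an analysis of this kind would also include sensitivity checks for pleiotropy and heterogeneity, which this sketch omits.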
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
All data in this paper are from public databases, so this item is not applicable.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The data underlying this article are available in the Open GWAS and GEO databases at https://gwas.mrcieu.ac.uk/ and https://www.ncbi.nlm.nih.gov/geo/.
Abbreviations
- AUC: Area under the curve
- DNA: Deoxyribonucleic acid
- DEAGs: Differentially expressed associated genes
- DEGs: Differentially expressed genes
- XGB: Extreme Gradient Boosting model
- FDR: False discovery rate
- GEO: Gene Expression Omnibus
- GO: Gene Ontology
- GSVA: Gene set variation analysis
- GL: Generalized linear
- GWAS: Genome-wide association study
- HDL: High-density lipoprotein
- IVs: Instrumental variables
- IEU: Integrated Epidemiology Unit
- IDL: Intermediate-density lipoprotein
- IVW: Inverse variance weighted
- IS: Ischemic stroke
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- LDL: Low-density lipoprotein
- MR: Mendelian randomization
- NK: Natural killer
- ANOVA: One-way analysis of variance
- PP: Posterior probability
- PCA: Principal component analysis
- RF: Random forest model
- ROC: Receiver operating characteristic curves
- SNPs: Single nucleotide polymorphisms
- ssGSEA: Single-sample gene set enrichment analysis
- SVM: Support vector machine model
- VLDL: Very low-density lipoprotein