Abstract
Stroke is the second leading cause of death and the third leading cause of long-term disability in the world. This study aimed to explore novel putative causal genetic relationships between stroke and hundreds of complex traits by leveraging genetic data. We used genome-wide association studies (GWAS) data and the latent causal variable method to identify potential causal relationships between stroke and 1,504 complex traits from the UK Biobank. We found that 262 traits were genetically correlated with stroke risk at a false discovery rate (FDR) < 0.05. Of those correlated traits, 28 showed robust evidence of partial genetic causality with stroke (genetic causality proportion, |GCP| > 0.60; FDR < 0.05). Our results showed that some conditions, including atrial fibrillation, pulmonary embolism, blood clots in the lung, platelet crit, self-reported deep venous thrombosis, and weight gain after depression, as well as the use of some medications such as insulin, pioglitazone, and gliclazide, were inferred to increase stroke risk. On the other hand, greater levels of testosterone, apolipoprotein A, SHBG, and HDL cholesterol were inferred to decrease the risk of stroke. Our results also suggest that genetic susceptibility to stroke raises the risk of neck and chest pain and loose teeth. Finally, our findings suggest that cardiac vascular disease, blood clot in lung, deep venous thrombosis, and certain anti-diabetic medications could have a causal role in increasing the risk of stroke, which could be used as novel testable hypotheses for future epidemiological studies.
Introduction
Stroke is the second largest cause of mortality and disability worldwide and a leading cause of cognitive impairment and dementia in aged people (1). Stroke affects around 13.7 million individuals, and about 5.5 million die every year (2,3). Stroke is a neurological condition characterised by blockage of a blood vessel. For instance, blood clots can form in the brain, obstructing blood flow and causing blood vessels to rupture, resulting in haemorrhage in the brain. When the brain's arteries rupture during a stroke event, the lack of oxygen causes the brain cells to die (4).
Clinical and epidemiological studies have repeatedly discovered a hereditary component to stroke vulnerability along with more typical risk factors, including smoking, hypertension, and diabetes mellitus (5,6). Studies on twins and family history also provide evidence that genetic susceptibility is significant in stroke aetiology (7). The estimated heritability was 37.9% for all ischemic strokes (8). The heritability of stroke subtypes varied, with small-vessel disease having a lower heritability (16.1%), large-vessel disease having a higher heritability (40.3%), and cardioembolic stroke having a medium heritability (32.6%) (8). In genome-wide association studies (GWAS), several significant risk loci have been identified as associated with stroke (9). For instance, cardioembolic stroke was linked to two atrial fibrillation-related loci (PITX2 and ZFHX3), whereas large-vessel stroke was linked to a locus on chromosome 9p21 that was previously associated with coronary artery disease (9-11). In addition, a multi-ancestry GWAS analysis revealed 35 risk loci that are associated with the aetiology of stroke and its subtypes (12,13). Recently, cross-ancestry meta-analyses of GWAS summary data on stroke identified a total of 89 independent loci significantly associated with stroke and stroke subtypes (14). Mendelian randomisation (MR) studies have used genetic data to understand the causal role of traits in stroke risk (15). For example, blood pressure, atrial fibrillation, venous thrombosis, circulating lipoproteins, LDL (low-density lipoprotein) cholesterol, obesity, type 2 diabetes, insulin resistance, hyperglycaemia, education, BMI, physical activity, alcohol consumption and smoking have been associated with stroke risk using MR (15,16). Both MR and latent causal variable (LCV) approaches examine the influence of genetic liability for a specific trait on the outcome (17,18).
The LCV method offers some advantages over MR methods. Firstly, in contrast to conventional MR methods, LCV can differentiate between genetic correlation and full or partial genetic causality, so positive LCV outcomes are more likely to represent true causal effects. Secondly, LCV estimates the genetic causality proportion (GCP), which quantifies the degree of causation, whereas MR methods, especially MR-Egger, can be easily confounded by genetic correlations (19).
A genetic correlation might be explained by pleiotropic effects, which can be vertical or horizontal (20). When genetic variants directly influence both trait A and trait B, the effect is called horizontal pleiotropy (Fig. 1(A)). On the other hand, vertical pleiotropy may be seen as a causal cascade in which the influence of a genetic variant on one trait is explained by its impact on another trait (Fig. 1(B)) (17). Horizontal pleiotropy can lead to a high rate of false-positive results when genetic epidemiology is used to assess causality (19,21).
Due to the massive public health concerns with serious socio-economic consequences, there is a growing interest in exploring the potential causal genetic risk factors of stroke. Therefore, in the present study, we perform a hypothesis-free exploratory large-scale genetic screening for risk factors of stroke using the LCV method, which is less susceptible to confounding by horizontal pleiotropic effects (19). Our findings support some of the causal associations hypothesized by observational studies and shed light on the relationship between stroke, lifestyle factors, diseases, and health conditions.
Materials and Methods
Stroke data
We obtained GWAS summary statistics for stroke from a meta-analysis of GWAS data by Malik et al. (12) released by the MEGASTROKE consortium (http://megastroke.org/). Briefly, these summary statistics were obtained from a fixed-effects meta-analysis of GWAS in 40,585 stroke cases and 406,111 controls of European ancestry. The details of the study have been described in the original publication (12).
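As a minimal illustration of what a fixed-effects meta-analysis does for a single variant, the sketch below pools per-study effect estimates by inverse-variance weighting. The source states only that a fixed-effects meta-analysis was used; the inverse-variance estimator shown here is the textbook form and is an assumption about the specifics, and the function name `fixed_effects_meta` and the example numbers are hypothetical.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Pool per-study effect sizes for one variant (illustrative sketch).

    Assumes the standard inverse-variance fixed-effects estimator; this is
    not taken from the MEGASTROKE pipeline itself.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect size")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z_score = pooled_beta / pooled_se
    return pooled_beta, pooled_se, z_score


# Two hypothetical studies reporting log-odds ratios with equal precision.
beta, se, z = fixed_effects_meta([0.10, 0.20], [0.05, 0.05])
```

With equal standard errors the pooled estimate is simply the mean of the study estimates, and the pooled standard error shrinks by a factor of the square root of the number of studies.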
UK biobank cohort datasets
We used 1,504 phenotypes available at the Complex Traits Genetics Virtual Lab (CTG-VL) web-based service (https://genoma.io) (22). Most of the GWAS summary statistics accessible in CTG-VL were obtained from the Neale Lab's analyses of the UK Biobank datasets (http://www.nealelab.is/uk-biobank). As a result, the majority of these GWAS summary results represent people of European ancestry, which limits potential biases due to differences in genetic ancestry (22). GWAS data from UK Biobank were adjusted for age, age-squared, inferred sex and genetic ancestry (22).
Latent causal variable
The latent causal variable (LCV) model is used to assess the causal association between two genetically correlated traits. LCV considers a latent variable (L) that has a causal effect on each trait (19). The genetic causality proportion (GCP) indicates whether the genetic correlation is due mainly to horizontal or vertical pleiotropy and is calculated from the correlations of the latent variable L with trait A and trait B, respectively (19). A GCP value of 0 indicates that the genetic correlation is mediated by horizontal pleiotropic effects, suggesting the absence of causal genetic effects. On the other hand, |GCP| = 1 denotes the presence of vertical pleiotropic effects as well as complete genetic causality. A |GCP| > 0.6 is regarded as a robust indicator of potential vertical pleiotropic effects (19). To evaluate LCV outcomes, three things must be considered: the magnitude of the genetic correlation, the magnitude of the GCP estimate, and the direction of the GCP estimate. The GCP value does not represent the magnitude of prospective causal effects; rather, it represents the proportion of a genetic correlation that could be explained by potential causal effects. For example, an association with a low genetic correlation (|rG| < 0.30) and a |GCP| value near 1 indicates that the genetic component shared by the two traits, although small, is likely due almost entirely to vertical pleiotropic effects. Furthermore, a negative GCP value between stroke and trait A indicates that trait A may have causal genetic effects on stroke, whereas a positive GCP value between stroke and trait A indicates that stroke may have causal genetic effects on trait A.
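The interpretation rules described above can be condensed into a short sketch. The Python function below is illustrative only and is not the code used in the analysis; the name `interpret_gcp`, its arguments, and the returned labels are hypothetical. It assumes stroke is the first trait, so that a negative GCP points from the other trait towards stroke, and it uses the |GCP| > 0.6 and FDR < 0.05 thresholds stated in the text.

```python
def interpret_gcp(gcp, fdr, trait="trait A", gcp_cutoff=0.60, fdr_cutoff=0.05):
    """Label a GCP estimate between stroke (first trait) and another trait.

    Hypothetical helper. Returns (strength, direction), where strength is
    "robust", "limited" or "not significant".
    """
    if not -1.0 <= gcp <= 1.0:
        raise ValueError("GCP must lie in [-1, 1]")
    if fdr >= fdr_cutoff:
        # Not significant after multiple-testing correction.
        return ("not significant", None)
    strength = "robust" if abs(gcp) > gcp_cutoff else "limited"
    if gcp < 0:
        # Negative GCP: the other trait may have causal effects on stroke.
        direction = f"{trait} -> stroke"
    elif gcp > 0:
        # Positive GCP: stroke may have causal effects on the other trait.
        direction = f"stroke -> {trait}"
    else:
        # GCP of exactly 0: correlation attributed to horizontal pleiotropy.
        direction = None
    return (strength, direction)


# Hypothetical values, for illustration only.
example = interpret_gcp(-0.8, 0.01, trait="atrial fibrillation")
```

The label says nothing about effect size: as noted above, the GCP describes the proportion of the genetic correlation attributable to potential causal effects, so it should be read together with the genetic correlation itself.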
This study estimated the GCP value between stroke and 1,504 complex traits. As outlined in previous studies (23,24), we employed the publicly available phenome-wide analysis methodology in the CTG-VL web platform. The phenome-wide analytic framework was built using an R script for the LCV approach that the original authors of the method made accessible on GitHub (19). We uploaded GWAS summary statistics for stroke to perform the analysis on the CTG-VL web platform. The phenome-wide analytic pipeline was then used to perform LD score regression (25) and LCV analyses (19) to assess genetic correlations and putative causal links, respectively. Full instructions for using and interpreting the publicly accessible phenome-wide analysis methodology can be found elsewhere (23,24). The LCV analysis was performed on all traits that showed a genetic correlation with stroke (false discovery rate, FDR < 0.05). Similarly, we considered all GCP estimates that were significant after multiple testing correction (FDR < 0.05). The overall analytical approach that we employed in this study is shown in Fig. 2.
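The two-stage screening logic described above can be sketched as follows. This is an illustration under stated assumptions, not the CTG-VL pipeline: the text does not name the FDR procedure, so the Benjamini-Hochberg adjustment is assumed here, as is applying the second correction only across the traits that passed the first. The function names, the record fields (`rg_p`, `gcp`, `gcp_p`) and the example values are all hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def screen_traits(records, fdr_cutoff=0.05, gcp_cutoff=0.60):
    """Stage 1: keep traits genetically correlated with stroke (FDR < cutoff).
    Stage 2: among those, keep traits with a significant and large |GCP|."""
    rg_q = benjamini_hochberg([r["rg_p"] for r in records])
    correlated = [r for r, q in zip(records, rg_q) if q < fdr_cutoff]
    gcp_q = benjamini_hochberg([r["gcp_p"] for r in correlated])
    return [
        r["trait"]
        for r, q in zip(correlated, gcp_q)
        if q < fdr_cutoff and abs(r["gcp"]) > gcp_cutoff
    ]


# Hypothetical records, for illustration only.
records = [
    {"trait": "A", "rg_p": 0.001, "gcp": -0.8, "gcp_p": 0.001},
    {"trait": "B", "rg_p": 0.002, "gcp": 0.3, "gcp_p": 0.001},
    {"trait": "C", "rg_p": 0.9, "gcp": -0.9, "gcp_p": 0.0001},
    {"trait": "D", "rg_p": 0.003, "gcp": 0.7, "gcp_p": 0.6},
]
robust_traits = screen_traits(records)
```

In this toy input, trait C is dropped at the first stage because it is not genetically correlated with stroke, trait D fails the second-stage significance filter, and trait B is significant but falls below the |GCP| cutoff.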
Results
Of the 1,504 complex traits, we identified 262 traits genetically correlated with stroke risk (FDR < 0.05; Supplementary Table S1). Among those genetically correlated traits, 28 could be explained via robust vertical pleiotropic effects with stroke (|GCP| > 0.60; FDR < 0.05), while seven showed limited partial genetic causality (|GCP| < 0.60; FDR < 0.05) (Supplementary Tables S2 and S3, respectively).
We also found that genetic susceptibility to a variety of diseases and the use of prescription medication might increase stroke risk (Fig. 3 and Table 1). Specifically, we observed that cardiovascular traits, including self-reported atrial fibrillation, self-reported pulmonary embolism, blood clot in the lung, atrial fibrillation and flutter (ICD10), blood clot in the leg reported by the doctor, deep venous thrombosis of lower extremities and pulmonary embolism, may increase the risk of stroke. Our study also highlighted that genetic susceptibility to stroke may raise the risk of pain in the throat and chest (ICD10), chest pain or discomfort, and loose teeth (Fig. 3 and Table 2).
Our study also found four complex traits that may decrease the risk of stroke: HDL cholesterol, apolipoprotein A, testosterone and SHBG levels, for males and females. These four traits showed negative genetic correlations and significant genetic causal proportions (Fig. 4 and Table 3).
Discussion
This study advances our understanding of stroke aetiology by shedding light on the causal architecture of stroke. We leveraged GWAS summary data to examine the potential causal association between stroke and 1,504 different complex traits. Based on genetic evidence, we discovered 28 potential causal relationships. Overall, our findings suggest that certain pathophysiological conditions and vascular disorders may increase the risk of stroke.
Our study revealed that the causal genetic effects of certain traits may influence the risk of stroke. Consistent with previous studies, our study showed that atrial fibrillation (AF) increases the risk of stroke. Recently, two-sample MR studies found bidirectional relationships between AF and cardiovascular disease, cognitive impairment, type 2 diabetes mellitus and late-onset Alzheimer’s disease, which suggests AF may have a bidirectional causal relationship with stroke and stroke subtypes (26). Previous studies reported that AF contributes to the pathogenesis of stroke by causing stasis in the left atrium and subsequent brain embolism (27). AF is the most prevalent kind of cardiac arrhythmia, and it is widely known as a significant risk factor for ischemic stroke, transient ischaemic attacks, heart attack, and dementia (28–30). Studies have shown that about 15–25% of people who suffered an ischemic stroke had AF (31–33), and that AF increases stroke risk five-fold compared with healthy people (29). However, apart from triggering stroke, AF may be associated with other risk factors that may influence stroke. For instance, age, sex, coronary artery disease, high blood pressure, diabetes, inflammatory disorders, and cardiac embolism are all potential risk factors for both stroke and AF (27). Our findings suggest that a genetic predisposition to AF contributes to stroke risk. Venous thromboembolism (VTE), which comprises deep vein thrombosis (DVT) and pulmonary embolism (PE), may have a role in circulatory collapse and short- and long-term serious effects on quality of life (34). In particular, VTE is a complex disorder, with increasing age and obesity recognised as prevalent atherosclerotic risk factors for ischemic stroke (35). The incidence of DVT is higher in people aged ≥ 65 years (36,37).
DVT happens when a thrombus (blood clot) forms in the deep veins, particularly in the lower extremities (legs), which leads to pain, cramping, and swelling in the lower extremities, and DVT is common among patients with stroke (36). DVT occurs in 10% to 75% of post-stroke patients, depending on the diagnostic method and time of evaluation (38). Our findings suggest that genetic variants of VTE and PE influence the risk of stroke. Weight change (weight gain) after depression is a serious public health concern, and depressed individuals have a 58% chance of developing obesity (39). Research conducted in 2021 revealed that patients who had a history of adolescent overweight and obesity as measured by BMI had a higher risk of developing an early-onset ischemic stroke (40). A study on obesity and cryptogenic ischemic stroke found that a high BMI was associated with a higher risk of ischemic stroke in young individuals when smoking, high blood pressure, and diabetes were taken into consideration (41). Our study found genetic evidence that weight gain after depression may increase the risk of stroke.
Previous studies reported that vascular factors such as type 2 diabetes and hypertension may increase the risk of ischemic and intracerebral haemorrhagic stroke by 80-90% (42). MR studies revealed that genetically predicted type 2 diabetes was associated with ischemic heart disease, stroke, and cardiovascular diseases (43). In our study, cardiac arrhythmias and a blood clot in the leg diagnosed by a doctor increased stroke risk. Similarly, the use of medication for type 2 diabetes increases the risk of stroke. For instance, pioglitazone is used as a cardioprotective drug for type 2 diabetes patients. This drug is used as a proxy for type 2 diabetes. Pioglitazone is used to reduce myocardial infarction, stroke, and cardiovascular mortality by retarding the atherosclerotic process (44). Insulin, for example, is used for anti-hyperglycaemic therapy (45), so it could be used as a proxy for type 2 diabetes. Insulin use was linked to type 2 diabetes, indicating a shared genetic architecture of type 2 diabetes and insulin. Therefore, in this case, type 2 diabetes would be the risk factor for stroke. In addition, gliclazide is a member of the sulphonylurea drug group commonly used in type 2 diabetes treatment to control blood glucose (46). A population-wide study in Korea observed that gliclazide monotherapy increases all-cause mortality and the risk of acute myocardial infarction and stroke (46). Consistently, in the present study, several cardiovascular phenotypes such as atrial fibrillation and flutter, self-reported pulmonary embolism, a blood clot in the leg diagnosed by a doctor, deep venous thrombosis of lower extremities, pulmonary embolism, and weight change during the worst episode of depression (gaining weight) are significantly genetically causally associated with stroke risk. These observations might be used to develop testable hypotheses for future studies.
Neck and chest pain are common consequences among patients with stroke. Chest pain and discomfort occur due to a lack of oxygen-rich blood in the heart muscle, usually when one or more arteries in the heart become blocked or narrowed. Previous studies reported that angina pectoris (chest pain) enhances the risk of ischemic stroke (47), and patients with acute stroke may also develop different medical complications, among which chest pain is common (48). Our results suggest an inferred causal genetic relationship between stroke and chest pain. In addition, stroke survivors have a lower oral health-related quality of life (49). A recent study found that tooth loss was significantly associated with vascular cognitive impairment (VCI) that causes ischemic stroke (50,51). A study of dental health and cognitive function in 161 people who had acute ischemic strokes found that those with lower cognitive function test scores and higher levels of VCI tended to lose more teeth (52). We presume that the relationship between stroke and tooth loss could be mediated by potential vertical pleiotropic effects and may potentially be influenced by genetic predisposition and oral health susceptibility.
Our results suggest that certain hormones and proteins play a significant role in stroke prevention. Apolipoprotein A1 (ApoA1) is a protein component of high-density lipoprotein (HDL) and is involved in the transfer of excessive cholesterol from peripheral cells to the liver. Aside from its atheroprotective properties, ApoA1 also has anti-inflammatory and antioxidant properties (53). Previous studies found that low amounts of apolipoprotein A1 and high amounts of apolipoprotein B were associated with an increased risk of cerebrovascular stroke and cardiovascular diseases (53–56). High levels of HDL cholesterol significantly reduce the risk of stroke (57). On the other hand, high levels of LDL (low-density lipoprotein) enhance the risk of ischemic stroke, and low levels of LDL are linked to the risk of intracerebral haemorrhagic stroke (58). Genome-wide association studies revealed that a 1-SD genetically elevated LDL cholesterol level enhanced the risk of ischemic stroke and large artery atherosclerosis stroke (59), while a 1-SD genetically elevated HDL cholesterol level decreased the risk of small vessel stroke (60). In addition, sex hormone-binding globulin (SHBG) is a protein produced by the liver that usually binds to sex hormones such as estrogen, dihydrotestosterone, and testosterone. Previous studies demonstrated that elevated levels of SHBG protein decrease the risk of diabetes, stroke, and vascular diseases, while low levels of SHBG increase the risk of those diseases (61,62). Thus, SHBG might be useful as a marker for predicting stroke risk.
We acknowledge several limitations in the present study. Firstly, the GWAS summary statistics used in this study were obtained from the MEGASTROKE and UK Biobank cohorts, where the participants are all of European ancestry. Thus, our findings should not be generalised to other ancestries unless they are verified using data from diverse populations. Secondly, our study included 1,504 traits, but causal relationships may still exist with other traits. Thirdly, by definition, the LCV method evaluates the causal relationship between two traits and assumes no bidirectional causality (19). Consequently, it is still debatable whether the LCV method is able to detect bidirectional causality between traits. This assumption must be systematically evaluated in the future. Null findings in this study might mean either that there is no causal association between the phenotypes or that bidirectional causality exists. Caution must therefore be taken when interpreting null findings in the current study.
Conclusions
In summary, our study provides evidence for potential causal genetic relationships between stroke and other complex traits. Our findings suggest that several cardiovascular traits, such as atrial fibrillation and cardiac arrhythmias, increase the risk of stroke. We also observed that different physiological conditions, such as blood clots in the lung and legs and sudden weight gain after depression, may increase the risk of stroke. Overall, our findings are supported by evidence from previous epidemiological studies and provide novel insights into the causal genetic architecture of stroke, which in turn could be used as testable hypotheses to improve the development of future studies, treatments, and preventative strategies.
Data Availability
All data analysed here are publicly available: the stroke summary statistics from the MEGASTROKE consortium and the UK Biobank summary data on CTG-VL.
Conflict of Interest
The authors declare no conflict of interest.
Author contributions
Tania Islam and Mohammad Ali Moni contributed to the study conception and design. Data collection and analysis were performed by Tania Islam and Luis M García-Marín. The first draft of the manuscript was written by Tania Islam and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Acknowledgments
“The MEGASTROKE project received funding from sources specified at http://www.megastroke.org/acknowledgments.html”. The author list of MEGASTROKE is available in Supplementary Table 4. T.I. is supported by a Research Training scholarship from The University of Queensland, Australia.