The causes and consequences of Alzheimer’s disease: A Mendelian randomization analysis

Roxanna Korologou-Linden; Emma L Anderson; Laura D Howe; Louise A C Millard; Yoav Ben-Shlomo; Dylan M Williams; George Davey Smith; Evie Stergiakouli; Neil M Davies

doi:10.1101/2019.12.18.19013847

ABSTRACT

Objective To identify causal risk factors for Alzheimer’s disease and clarify which may instead be modified by emerging Alzheimer’s disease pathophysiology.

Method We performed a phenome-wide association study (PheWAS) of a polygenic risk score (p≤5×10⁻⁸) for Alzheimer’s disease with a wide range of phenotypes in the UK Biobank, stratified by age tertiles. We also investigated the association between the polygenic risk score for Alzheimer’s disease and previously implicated risk factors. Using two-sample bidirectional Mendelian randomization, we then estimated the size of causal effects of both previously implicated risk factors and those identified by the PheWAS on the risk of Alzheimer’s disease.

Results Genetic liability for Alzheimer’s disease was associated with red blood cell indices and cognitive measures in the youngest age tertile. In the middle and older age tertiles, higher genetic liability for Alzheimer’s disease was associated with medical history (e.g. atherosclerosis, use of cholesterol-lowering medications), physical measures (e.g. body fat measures), blood cell indices (e.g. red blood cell distribution width), cognition (e.g. fluid intelligence score) and lifestyle (e.g. self-reported moderate activity and daytime napping). In follow-up analyses using Mendelian randomization, we replicated established risk factors for Alzheimer’s disease (e.g. fluid intelligence score, education) and identified several novel risk factors (e.g. forced vital capacity, self-reported moderate physical activity and daytime napping).

Conclusion Genetic liability for Alzheimer’s disease is associated with over 160 phenotypes. However, findings from Mendelian randomization analyses imply that most of these associations are likely to be caused by increased genetic risk for Alzheimer’s disease or selection, rather than a cause of the disease.

INTRODUCTION

Alzheimer’s disease is a late-onset irreversible neurodegenerative disorder, constituting the majority of dementia cases [1], which affects 47 million people worldwide [2]. Genetic, molecular and clinical evidence suggests that pathophysiological changes occur two to three decades prior to the manifestation of clinical symptoms [3,4]. There are currently no disease-modifying therapeutics or preventative treatment for Alzheimer’s disease. Educational attainment is one of the few modifiable risk factors with evidence of a causal effect on Alzheimer’s disease [5]. Several quasi-experimental studies have shown that higher educational attainment reduces risk of Alzheimer’s disease [6–8]. Observational studies have reported conflicting evidence for the association of cardiovascular risk factors such as obesity, blood pressure, lipids [9–15] with incident Alzheimer’s disease across different age groups.

However, observational associations may be biased by confounding and reverse causation, and Mendelian randomization can potentially overcome these issues. Mendelian randomization [16] is a form of instrumental variable analysis which uses genetic variants as proxies for environmental exposures. It can provide evidence about lifetime effects of factors on risk of Alzheimer’s disease and is robust to many forms of bias that can affect other observational study designs. In two-sample Mendelian randomization, the effects of the genetic instrumental variables on the exposure and on the outcome are estimated in two separate samples. Mendelian randomization is based on three key assumptions [17]: (i) the genetic variant is strongly associated with the exposure, (ii) there are no confounders of the genetic variant-outcome association, and (iii) the effects of the genetic variant on the outcome are mediated entirely by the exposure. To date, hypothesis-driven Mendelian randomization studies have found mixed evidence for a causal role of cardiovascular risk factors in the development of Alzheimer’s disease [8,18,19].

Linkage studies identified that the ε4 allele of the apolipoprotein E (APOE) gene increases the risk of Alzheimer’s disease up to twelvefold [20,21]. More recently, large genetic consortia have identified common genetic variants associated with late-onset Alzheimer’s disease [22]. While non-APOE genetic variants are more weakly associated with disease (odds ratio<1.2), they can be aggregated to generate a polygenic risk score for Alzheimer’s disease [23]. Previous work suggested that a polygenic risk score including single nucleotide polymorphisms (SNPs) at a p-value threshold≤0.5 had a better predictive accuracy (78.2%), explaining 2% of the variance in Alzheimer’s disease, compared with polygenic risk scores at lower p-value thresholds [24]. Polygenic risk scores indicate genetic liability for Alzheimer’s disease (regardless of whether an individual will develop or has developed Alzheimer’s disease). They can be used to investigate the association between genetic liability for Alzheimer’s disease and other diseases or traits, to identify or confirm traits that modify disease risk, to establish protopathic effects of disease, and to identify biomarkers that predict disease.

Phenome-wide association studies (PheWAS) are based on a hypothesis-free approach, similar to the one used for genome-wide association studies (GWAS), and estimate the association between a genotype or polygenic risk score and a large array of phenotypes [25]. In contrast to hypothesis-driven analyses, the use of an agnostic approach in analyses allows for the discovery of novel associations with no prior belief of an association, and minimises publication bias, as all the findings are published [26,27]. Consequently, this method may provide new evidence about the etiology of Alzheimer’s disease but to date it has not been investigated in this manner.

In our study, we divided the UK Biobank participants (N=334,968) into three equal subsamples and conducted phenome-wide analysis within each tertile to investigate whether the association of genetic liability for Alzheimer’s disease with various phenotypes differed by age (Fig 1a). To delineate whether associated phenotypes could be causing the disease or be a consequence of the disease process, we also estimated the effect of the phenotypes identified from the PheWAS on Alzheimer’s disease using Mendelian randomization (Fig 1b).

Fig 1.

Diagram (A) describes our study design when conducting a phenome-wide association study (PheWAS) and diagram (B) describes our study design when using Mendelian randomization. In (A), the polygenic risk score for Alzheimer’s disease may either have a downstream causal effect on the trait (e.g. body mass index), or it may affect the trait through pathways other than through Alzheimer’s disease (i.e. pleiotropic effects). Diagram (B) describes our follow-up analysis using Mendelian randomization to establish causality and directionality of the observed associations. In (B), we test the hypothesis that the trait (e.g. body mass index) is causally associated with Alzheimer’s disease, provided that the polygenic risk score for the trait of interest is a valid instrument, in that (i) it is robustly associated with the trait it proxies, (ii) it is not associated with measured or unmeasured confounders of the trait, and (iii) it is associated with the outcome only through the trait.

METHODS

Study design

Our analysis proceeded in two steps. First, we ran a PheWAS of the Alzheimer’s disease polygenic risk score and all available phenotypes in UK Biobank, stratifying the sample by age. Second, we followed up all phenotypes associated with the polygenic risk score using two-sample Mendelian randomization. We outline the different research questions answered by the PheWAS and the Mendelian randomization approach in Fig 1.

Sample description

UK Biobank is a population-based health research resource consisting of 503,325 people, aged between 38 years and 73 years, who were recruited between the years 2006 and 2010 from across England, Wales, and Scotland [28]. The study was designed to identify determinants of human diseases. A full description of the study design, participants and quality control (QC) methods has been published previously [29]. UK Biobank received ethical approval from the Research Ethics Committee (REC reference for UK Biobank is 11/NW/0382). Of the 463,010 participants with genetic data (already quality checked and excluding participants with sex mismatch or sex aneuploidy), 54,757 were of non-white British ancestry, 73,277 participants had a kinship coefficient denoting a third-degree relatedness, and eight participants withdrew consent (Fig 2). In total, a sample of 334,968 remained after QC. This work was done under application number 16729 (version 2 genetic data [500 K with HRC imputation] and phenotype dataset 21753).

Fig 2.

UK Biobank participant flow diagram

Polygenic risk score

We constructed a polygenic risk score including SNPs associated with Alzheimer’s disease at genome-wide significance (p≤5×10⁻⁸) for UK Biobank participants based on the summary statistics from a meta-analysis of the IGAP consortium [30], ADSP [31] and PGC [32], totalling 24,087 individuals with a clinical diagnosis of Alzheimer’s disease, paired with 55,058 controls. Further details on the samples used can be found in the Supplementary material. SNPs were clumped using r²>0.001 and a physical distance for clumping of 10,000 kb (Supplementary Fig 1). A polygenic risk score was calculated for each participant with genetic data using PLINK (version 1.9). Each score was calculated from the effect size (logarithm (log) odds)-weighted sum of associated alleles within each participant. A list of 18 SNPs used to generate the polygenic risk score is provided in Supplementary table 1. Our main analysis is based on the polygenic risk score with the APOE region. The polygenic risk score was standardised by subtracting the mean and dividing by the standard deviation of the polygenic risk score.
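The score construction described above (a log odds-weighted sum of associated alleles, followed by standardisation) can be illustrated with a minimal sketch. This is not the PLINK pipeline used in the study: the function name, the toy dosages and weights, and the use of the population standard deviation are all assumptions made for illustration.

```python
import numpy as np

def polygenic_risk_score(dosages, log_odds):
    """Return a standardised, log odds-weighted allele score.

    dosages:  (n_participants, n_snps) counts of the effect allele (0 to 2).
    log_odds: (n_snps,) per-allele log odds ratios from the discovery GWAS.
    """
    dosages = np.asarray(dosages, dtype=float)
    log_odds = np.asarray(log_odds, dtype=float)
    raw = dosages @ log_odds  # weighted sum of alleles per participant
    # Standardise: subtract the mean and divide by the standard deviation.
    # Assumption: population SD (ddof=0); the source does not specify.
    return (raw - raw.mean()) / raw.std()

# Hypothetical toy data: 4 participants, 3 SNPs.
dosages = np.array([[0, 1, 2],
                    [1, 1, 0],
                    [2, 0, 1],
                    [0, 0, 0]])
log_odds = np.array([0.10, 0.25, -0.05])
scores = polygenic_risk_score(dosages, log_odds)
```

After standardisation, regression coefficients for the score can be read as the change in the outcome per 1 standard deviation increase in the score.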

Main analysis

The full UK Biobank sample was divided into three equal subsamples (n=111,656 in each tertile). We performed PheWAS within each tertile. Age and sex, which were reported at recruitment and confirmed by genetic data, were included as covariates in the models to reduce variation in the outcomes. We adjusted for the first 10 genetic principal components to control for confounding due to population stratification.

Outcomes

The Biobank data showcase enables researchers to identify variables based on the field type (http://biobank.ctsu.ox.ac.uk/showcase/list.cgi). At the time of data usage, there were 2,655 fields of the following types: integer, continuous, categorical (single) and categorical (multiple). We excluded 55 fields (Supplementary table 2) a priori because: a) 3 fields described age, b) 17 fields described the genetic data, c) 17 fields denoted assessment centre environment variables, and d) 18 categorical (single) fields reported more than one value per individual.

STATISTICAL ANALYSES

Phenome-wide association study

We estimated the association of an Alzheimer’s disease polygenic risk score with each phenotype in the three age strata using the PHESANT package (version 14). A detailed description of PHESANT’s automated rule-based method is published elsewhere [27] and a brief description can be found in the Supplementary material. The polygenic risk score for Alzheimer’s disease and the phenotypes are the independent (exposure) and dependent (outcome) variables in the regression model, respectively. Outcome variables with continuous, binary, ordered categorical, and unordered categorical data types were tested using linear, logistic, ordered logistic, and multinomial logistic regression, respectively. Before testing, an inverse normal rank transformation was applied to continuous variables to ensure a normal distribution. We accounted for the multiple tests performed by generating a 5% false discovery rate adjusted p-value threshold. After ranking the results by p-value, we identified the largest rank position with a p-value less than P_threshold=0.05×rank/n, where n is the total number of tests in the phenome scan. P_threshold is the p-value threshold controlling the false discovery rate at 5% [33] and was used as a heuristic to identify phenotypes to follow up in the Mendelian randomization analysis and not as an indicator of significance [34,35]. Categories for the ordered categorical variables are in Supplementary table 3.
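The false discovery rate threshold described above can be sketched as follows. This is an illustrative implementation of the ranking rule as stated in the text (largest rank whose p-value is less than 0.05×rank/n), not the PHESANT code; the function name and the toy p-values are hypothetical.

```python
import numpy as np

def fdr_threshold(p_values, q=0.05):
    """Return the p-value threshold controlling the FDR at level q.

    Ranks the p-values, finds the largest rank whose p-value is less
    than q * rank / n, and returns q * rank / n for that rank.
    Returns None when no test passes.
    """
    p = np.sort(np.asarray(p_values, dtype=float))
    n = p.size
    ranks = np.arange(1, n + 1)
    passing = np.nonzero(p < q * ranks / n)[0]
    if passing.size == 0:
        return None
    largest_rank = passing[-1] + 1  # convert 0-based index to rank
    return q * largest_rank / n

# Hypothetical p-values from a scan of five phenotypes.
toy_p = [0.6, 0.001, 0.041, 0.008, 0.039]
threshold = fdr_threshold(toy_p)
```

In this toy scan only the two smallest p-values fall below their rank-specific cut-offs, so the threshold is 0.05×2/5. Every phenotype with a p-value below the returned threshold would be carried forward, including any lower-ranked phenotype that missed its own rank-specific cut-off.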

Risk factors implicated in Alzheimer’s disease in previous research

We investigated whether previously implicated risk factors for Alzheimer’s disease that were not detected in the PheWAS were associated with the polygenic score. We extracted a list of factors from the Global Burden of Disease Study for Alzheimer’s disease [36] and a literature review of the evidence on modifiable risk factors for cognitive decline and dementia from observational studies and randomised controlled trials [37]. We selected four factors from the Global Burden of Disease Study (high BMI, high fasting plasma glucose, smoking, and a high intake of sugar-sweetened beverages) that contributed to metrics for deaths, prevalence, years of life lost, years of life lived with disability, and disability-adjusted life-years due to Alzheimer’s disease. The review identified the following as potentially modifiable risk factors for dementia: less education, midlife hypertension, obesity and hearing loss, as well as later life smoking, depression, physical inactivity, social isolation, and diabetes. Furthermore, a meta-analysis of case-control and population-based studies showed that rheumatoid arthritis is associated with lower incidence of Alzheimer’s disease [38]. The relationship between Alzheimer’s disease and rheumatoid arthritis has been studied before using genetic-based methods such as Mendelian randomization [38], hence it is not examined here. We examined the use of methotrexate (an anti-inflammatory drug for rheumatoid arthritis) because observational studies [39,40] suggest that anti-inflammatory medicines for rheumatoid arthritis reduce the risk of Alzheimer’s disease. At the time of the analysis, plasma glucose was not available and was not investigated.

Sensitivity analysis

We repeated the PheWAS for the entire sample, irrespective of age. This was performed to maximise power to detect associations. Furthermore, to examine if any detected associations could be attributed to the variants in or near the APOE gene (Chr 19: 44,400 kb-46,500 kb) which is known to have widespread physiological effects (i.e. highly pleiotropic), we repeated the PheWAS on the entire sample using the polygenic risk score excluding SNPs in the APOE region.

Follow-up using Mendelian randomization

We investigated whether the phenotypes identified in our PheWAS caused or were caused by Alzheimer’s disease using bidirectional two-sample Mendelian randomization. Of the 177 phenotypes comprising those identified in the PheWAS and previously implicated risk factors not identified in the PheWAS, we followed up 87 using two-sample bidirectional Mendelian randomization. We did not follow up the remaining 90 phenotypes because of low prevalence, a lack of genetic instruments, or because they indicated the participant’s own diagnosis or family history of Alzheimer’s disease. For wheeze/whistling, we also examined the measured phenotype of forced vital capacity as a better measure of respiratory function. For spherical power, we derived four binary variables to indicate myopia (spherical power<-0.5) and hypermetropia (spherical power>0.5) in each eye.

Data sources for the Mendelian randomization analyses

For each risk factor identified by the PheWAS and literature reviews, we performed GWAS to identify SNPs that are strongly associated (p≤5×10⁻⁸) with each trait. Exposure GWASs were based on summary statistics from UK Biobank, and were performed with the BOLT-LMM software package [41] using a published pipeline [42], described in detail in the Supplementary material. For body mass index, hip and waist circumference, we used summary statistics from the GIANT consortium as it had larger sample sizes than UK Biobank alone [43,44]. SNPs in the APOE region, defined as the region between positions 44,400 kb-46,500 kb on chromosome 19 [45], were removed from instruments proxying the exposure, due to the highly pleiotropic effects of the APOE gene. We retained independent genome-wide significant SNPs that had an r²<0.001 with another variant within a 10 Mb window using the 1000 genomes panel [46] and removed any ambiguous palindromic SNPs. Simulation studies [47] have shown inflation in BOLT-LMM test statistics for unbalanced case-control analyses. Based on the results of these simulations, the authors reported that BOLT-LMM p-values are well-calibrated for a prevalence>10% and minor allele frequency>0.1%, and that for lower prevalence of cases, BOLT-LMM only results in inflated statistics for rare variants. We retained phenotypes using the lowest examined prevalence in BOLT-LMM simulation studies (0.1%). The number of cases and controls for each binary phenotype is in Supplementary table 4.
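Two of the instrument filters described above, removal of SNPs in the APOE region and removal of ambiguous palindromic SNPs, can be sketched as below. The region boundaries come from the text; everything else is an assumption for illustration, including the `Snp` record, the SNP identifiers, and the minor allele frequency cut-off used to call a palindromic SNP ambiguous, which the source does not state.

```python
from dataclasses import dataclass

# APOE region as defined in the text: chromosome 19, 44,400 kb to 46,500 kb.
APOE_CHROM, APOE_START, APOE_END = 19, 44_400_000, 46_500_000
PALINDROMIC_PAIRS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}

@dataclass
class Snp:
    snp_id: str          # hypothetical identifier
    chrom: int
    pos: int             # base-pair position
    effect_allele: str
    other_allele: str
    eaf: float           # effect allele frequency

def in_apoe_region(snp):
    return snp.chrom == APOE_CHROM and APOE_START <= snp.pos <= APOE_END

def is_ambiguous_palindrome(snp, maf_cutoff=0.42):
    # Assumption: a palindromic SNP is "ambiguous" when its minor allele
    # frequency is close to 0.5, so the strand cannot be inferred.
    maf = min(snp.eaf, 1.0 - snp.eaf)
    is_palindromic = (snp.effect_allele, snp.other_allele) in PALINDROMIC_PAIRS
    return is_palindromic and maf > maf_cutoff

def keep_instruments(snps):
    return [s for s in snps
            if not in_apoe_region(s) and not is_ambiguous_palindrome(s)]

candidates = [
    Snp("snp_a", 1, 1_000_000, "A", "G", 0.30),    # kept
    Snp("snp_b", 19, 45_000_000, "C", "T", 0.15),  # APOE region, removed
    Snp("snp_c", 2, 5_000_000, "A", "T", 0.48),    # ambiguous palindrome, removed
    Snp("snp_d", 3, 7_000_000, "C", "G", 0.10),    # palindromic but unambiguous, kept
]
kept_ids = [s.snp_id for s in keep_instruments(candidates)]
```

Linkage disequilibrium clumping (r²<0.001 within 10 Mb) is not shown because it requires a reference panel.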

Alzheimer’s disease GWAS

We used the same sample for the two-sample Mendelian randomization analyses as for the construction of the polygenic risk score for Alzheimer’s disease [32]. For these analyses, the APOE region was retained as Alzheimer’s disease is investigated as an outcome in this analysis.

Estimating the association between risk factors and Alzheimer’s disease

We harmonized the exposure and outcome GWAS; see the Supplementary material for details. We employed univariable Mendelian randomization to estimate the effect of each exposure on Alzheimer’s disease, using inverse-variance weighted regression analysis; this estimator assumes no directional horizontal pleiotropy [16]. We used the F-statistic as a measure of instrument strength [48]. The F-statistic is a function of R² (the amount of variance explained by the genetic variants) and the sample size. We interpreted estimates derived from linear and ordered logistic models as the odds ratio for Alzheimer’s disease per 1 standard deviation increase in the exposure, or per category increase for ordered categorical variables. We present adjusted p-values for inverse variance weighted regression accounting for the number of results in the follow-up using the false discovery rate method.
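As a rough illustration of the two quantities named above, the sketch below computes a fixed-effect inverse-variance weighted estimate from per-SNP summary statistics and a commonly used form of the F-statistic, F = (R²/(1−R²))×((n−k−1)/k) for k variants and sample size n. The function names and toy numbers are hypothetical, and the exact variant of each formula used in the study is not stated in the text.

```python
import numpy as np

def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate and its standard error.

    Equivalent to a weighted regression of SNP-outcome on SNP-exposure
    effects through the origin, with weights 1 / se_out**2.
    """
    bx = np.asarray(beta_exp, dtype=float)
    by = np.asarray(beta_out, dtype=float)
    se = np.asarray(se_out, dtype=float)
    weights = bx**2 / se**2
    estimate = np.sum(weights * (by / bx)) / np.sum(weights)
    std_error = np.sqrt(1.0 / np.sum(weights))
    return estimate, std_error

def f_statistic(r2, n, k):
    """F-statistic from variance explained r2, sample size n, k variants."""
    return (r2 / (1.0 - r2)) * ((n - k - 1) / k)

# Hypothetical summary statistics for three SNPs.
est, se_est = ivw_estimate([0.1, 0.2, 0.3], [0.05, 0.10, 0.15], [0.01, 0.01, 0.01])
odds_ratio = np.exp(est)  # outcome effects are on the log odds scale
```

An F-statistic above 10 is a conventional rule of thumb for adequate instrument strength, although the text does not state which cut-off, if any, was applied.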

Assessing pleiotropy

We investigated whether the SNPs had pleiotropic effects on the outcome other than through the exposure. We compared our results obtained from inverse variance weighted regression, where the intercept is constrained to zero, to those obtained with MR-Egger regression, where the intercept is not constrained to zero [49,50]. MR-Egger regression allows for pleiotropic effects that are independent of the effect on the exposure of interest [49,51,52]. The MR-Egger intercept estimates the average pleiotropic effect across the genetic variants. A non-zero intercept suggests the presence of directional horizontal pleiotropy, and the estimated effect of the exposure obtained from MR-Egger regression allows for horizontal pleiotropy provided that the instrument strength independent of direct effect (InSIDE) assumption holds. We also report the I²_GX statistic [53], a measure for MR-Egger regression analogous to the F-statistic in inverse variance weighted regression. The MR-Egger estimate is biased towards the null when the no measurement error assumption is violated, and the stronger the violation, the larger the dilution (as indicated by lower I²_GX statistics).
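The idea behind MR-Egger regression can be sketched as a weighted regression of SNP-outcome effects on SNP-exposure effects with a free intercept. This is a minimal illustration only: the function name and toy data are hypothetical, standard errors and the I²_GX statistic are omitted, and it is not the software used in the study.

```python
import numpy as np

def mr_egger(beta_exp, beta_out, se_out):
    """Return (intercept, slope) from a weighted MR-Egger regression.

    SNPs are first oriented so that all SNP-exposure effects are positive.
    Weights are 1 / se_out**2, applied by scaling each row by 1 / se_out.
    """
    bx = np.asarray(beta_exp, dtype=float)
    by = np.asarray(beta_out, dtype=float)
    se = np.asarray(se_out, dtype=float)
    sign = np.where(bx < 0, -1.0, 1.0)
    bx, by = bx * sign, by * sign
    root_w = 1.0 / se
    design = np.column_stack([np.ones_like(bx), bx])
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], by * root_w, rcond=None)
    intercept, slope = coef
    return intercept, slope

# Hypothetical SNP effects built as beta_out = 0.02 + 0.5 * beta_exp,
# with the second SNP coded on the opposite allele.
intercept, slope = mr_egger(
    beta_exp=[0.1, -0.2, 0.3, 0.4],
    beta_out=[0.07, -0.12, 0.17, 0.22],
    se_out=[0.01, 0.02, 0.01, 0.02],
)
```

Here the intercept of 0.02 plays the role of the average directional pleiotropic effect, and the slope of 0.5 is the exposure effect after allowing for it.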

Assessing causal direction

We used Steiger filtering to investigate the direction of causation between Alzheimer’s disease and the phenotypes [54]. Steiger filtering examines whether the SNPs for each of the phenotypes used in the two-sample Mendelian randomization explain more variance in the phenotypes than in Alzheimer’s disease (which should be true if the hypothesised direction from phenotype to Alzheimer’s disease is valid). We repeated Mendelian randomization analyses removing SNPs which explained more variance in the outcome than in the exposure.
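The filtering rule above can be illustrated with a simplified sketch that approximates the variance explained by each SNP from its summary statistics and keeps only SNPs that explain more variance in the exposure than in the outcome. The approximation r² ≈ t²/(t² + n − 2) with t = beta/se is an assumption chosen for illustration; it ignores the liability-scale conversion normally needed for a binary outcome such as Alzheimer’s disease and the formal test of the difference, and the function names and numbers are hypothetical.

```python
import numpy as np

def variance_explained(beta, se, n):
    """Approximate per-SNP r2 from effect size, standard error, sample size."""
    t2 = (np.asarray(beta, dtype=float) / np.asarray(se, dtype=float)) ** 2
    return t2 / (t2 + n - 2)

def steiger_keep(beta_exp, se_exp, n_exp, beta_out, se_out, n_out):
    """Boolean mask: True where a SNP explains more variance in the
    exposure than in the outcome, i.e. is consistent with the
    hypothesised exposure-to-outcome direction."""
    r2_exp = variance_explained(beta_exp, se_exp, n_exp)
    r2_out = variance_explained(beta_out, se_out, n_out)
    return r2_exp > r2_out

# Hypothetical summary statistics for two SNPs.
keep = steiger_keep(
    beta_exp=[0.10, 0.01], se_exp=[0.01, 0.01], n_exp=10_000,
    beta_out=[0.02, 0.05], se_out=[0.01, 0.01], n_out=10_000,
)
```

In this toy example the first SNP is retained and the second is dropped, after which the Mendelian randomization estimate would be recomputed on the retained SNPs only.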

RESULTS

Sample characteristics

The UK Biobank sample is 55% female (age range=39 to 53 years, mean=47.2 years, standard deviation=3.8 years) in tertile 1, 55% female (age range=53 to 62 years, mean=58.03 years, standard deviation=2.4 years) in tertile 2 and 49% female (age range=62 to 72 years, mean=65.3 years, standard deviation=2.7 years) in tertile 3. In the whole UK Biobank sample, the Alzheimer’s disease polygenic risk score was associated with a lower age at recruitment (β: −0.006 years; 95% CI: −0.01, −0.0002; p=0.007). The mean standardised polygenic risk score (95% CI) in each age tertile, from youngest to oldest, was 0.006 (−0.0003, 0.01), 0.001 (−0.01, 0.009), and −0.007 (−0.02, 0.002) (P for trend=0.01).

Main Results

Selected PheWAS hits are presented in graphs and full results are in Supplementary file 2. PheWAS showed that the polygenic risk score for Alzheimer’s disease was associated with outcomes broadly categorised as dementia-associated medical history, medical history, physical measures, parental health factors, cognitive test and brain-related measures, biological sample measures, and lifestyle and dietary factors. Where we report results for continuous outcomes, these are in terms of a 1 standard deviation (SD) change of the inverse rank normal transformed outcome (indicated by ‘ΔSD’). Where we report results for binary or categorical outcomes, these are in terms of log odds or odds (indicated by ‘logOR’ and ‘OR’).

Dementia-associated medical history

A higher polygenic risk score for Alzheimer’s disease was associated with higher odds of being diagnosed with unspecified Alzheimer’s disease (OR:2.39; 95% CI:1.83,3.11), vascular dementia (OR:1.92; 95% CI:1.65,2.24), or other forms of Alzheimer’s disease (atypical/mixed type) (OR:2.74; 95% CI:1.84,4.09), as well as higher odds of death from Alzheimer’s disease in the oldest age tertile (aged 62 to 72 years) (OR:2.50; 95% CI:1.86,3.35) (Supplementary Fig 2). A higher polygenic risk score was also associated with having a maternal and paternal history of Alzheimer’s disease/dementia for participants in all age tertiles examined, as well as a sibling history of Alzheimer’s disease/dementia for the two older age tertiles. Furthermore, we found strong evidence of a higher polygenic risk score being associated with higher odds of amnesia, disorientation, and symptoms involving cognitive function and awareness for the two older age groups.

Medical history

Participants with a higher polygenic risk score for Alzheimer’s disease were on average more likely to have been diagnosed with angina (OR: 1.05; 95% CI: 1.03, 1.08) and atherosclerotic heart disease (OR: 1.05; 95% CI: 1.03, 1.08) and to have used beta-blockers (e.g. atenolol) (OR: 1.04; 95% CI: 1.02, 1.07), as well as aspirin in all age tertiles (Supplementary Fig 2). In the two older tertiles (ages 53 to 72 years), participants with higher polygenic risk score were on average less likely to have had a cholecystectomy (gallbladder removal). Additionally, a higher polygenic risk score was associated with higher odds of having an aortocoronary bypass graft. At all ages, a higher polygenic risk score was associated with being more likely to have self-reported a history of high cholesterol (OR: 1.16; 95% CI:1.14, 1.18), a diagnosis of pure hypercholesterolaemia (OR:1.10; 95% CI: 1.09, 1.12), and using cholesterol-lowering drugs such as ezetimibe (OR:1.20; 95% CI:1.14, 1.27) or statins (OR: 1.11; 95% CI: 1.09, 1.13). For participants of ages 62 to 72 years, those with a higher polygenic risk score were more likely to use thrombin injections. This outcome was not observed in the younger age tertiles examined (39 to 53 and 53 to 62 years).

Physical measures

There was evidence that a higher polygenic risk score was associated with lower basal metabolic rate (ΔSD: −0.01; 95% CI: −0.01, −0.01), body fat percentage (ΔSD: −0.02; 95% CI: −0.02, −0.01), body mass index (ΔSD: −0.02; 95% CI: −0.03, −0.02), pulse rate (ΔSD: −0.02; 95% CI: −0.02, −0.01), waist circumference (ΔSD: −0.02; 95% CI: −0.03, −0.02), trunk fat percentage (ΔSD: −0.02; 95% CI: −0.02, −0.02), whole body fat (ΔSD: −0.02; 95% CI: −0.03, −0.02) and fat-free mass (ΔSD: −0.01; 95% CI: −0.01, −0.004), as well as whole body water mass (ΔSD: −0.01; 95% CI: −0.01, −0.004) in the two older age tertiles examined (Fig 3). There was weak evidence of such effects for the youngest participants. A higher polygenic risk score was associated with higher hip circumference in the youngest participants (ages 39 to 53 years) (ΔSD: 0.01; 95% CI: 0.002, 0.01), but lower hip circumference (ΔSD: −0.02; 95% CI: −0.02, −0.01) for the two older age groups (ages 53 to 72 years). We also found evidence that a higher polygenic risk score was associated with higher spherical power (i.e. strength of lens needed to correct focus) in the oldest participants. There was strong evidence that a higher polygenic risk score was associated with a lower diastolic blood pressure (ΔSD: −0.01; 95% CI: −0.02, −0.01) in the oldest age tertile.

Fig 3.

Forest plot showing the effect estimates for the association between the polygenic score for Alzheimer’s disease (including the apolipoprotein E region) and physical measures. Legends in the right of each graph indicate age tertiles. Effect estimates are shown by box markers and confidence bands represent 95% confidence intervals. There is evidence that the polygenic risk score for Alzheimer’s disease is related to physical measures in older, but not younger participants. This suggests that Alzheimer’s disease causes these changes rather than vice versa.

Parental health factors

On average, the parents of participants with a higher polygenic risk score for Alzheimer’s disease died at a younger age (ΔSD: −0.01; 95% CI: −0.02, −0.01), and these participants had lower odds of their mother (OR: 0.91; 95% CI: 0.89, 0.93) and father (OR: 0.93; 95% CI: 0.89, 0.96) still being alive. Furthermore, a higher polygenic risk score was associated with lower odds of a paternal history of chronic bronchitis/emphysema (OR: 0.96; 95% CI: 0.94, 0.98), as well as lower odds of maternal history of high blood pressure (OR: 0.96; 95% CI: 0.95, 0.98) (Supplementary Fig 3). There was an age-dependent increase in effect size, and for some outcomes, the 95% CIs included the null value for the younger participants.

Cognitive test measures

Participants with higher polygenic risk scores on average took longer to enter values and complete cognitive tests in all ages examined (39 to 72 years) (Fig 4). Furthermore, a higher polygenic risk score was associated with lower log odds of being in a higher scoring category for number of correct matches in pairs matching round (logOR: −0.07; 95% CI: −0.10, −0.04) and being in a higher category for fluid intelligence score (logOR: −0.04; 95% CI: −0.07, −0.02) for participants of ages 53 to 62 and 62 to 72 years. Additionally, a higher polygenic risk score was associated with a higher weighted-mean mode of anisotropy (MO) in the left inferior fronto-occipital fasciculus (ΔSD: 0.08; 95% CI: 0.04, 0.13). There is an age-dependent increase in effect size for all these factors, with the effect being weaker for the younger participants of ages 39 to 53 years.

Fig 4.

Forest plot showing the effect estimates for the association between the polygenic score for Alzheimer’s disease (including the apolipoprotein E region), cognitive, and brain-related measures. Legends in the right of each graph indicate age tertiles. Effect estimates are shown by box markers and confidence bands represent 95% confidence intervals. *Effect estimates were derived from ordered logistic models and effect estimates are on log odds scale. We found evidence that the polygenic risk score for Alzheimer’s disease is related to some cognitive measures in all age ranges examined. This may suggest a bidirectional relationship between cognitive measures and Alzheimer’s disease.

Biological measures

A higher polygenic risk score was associated with lower red blood cell count (ΔSD: −0.01; 95% CI: −0.02, −0.01), red blood cell distribution width (ΔSD: −0.04; 95% CI: −0.05, −0.03), haematocrit percentage (ΔSD: −0.01; 95% CI: −0.02, −0.01), haemoglobin concentration (ΔSD: −0.01; 95% CI: −0.01, −0.004), monocyte count (ΔSD: −0.02; 95% CI: −0.02, −0.01), platelet count (ΔSD: −0.01; 95% CI: −0.02, −0.01) and plateletcrit (ΔSD: −0.02; 95% CI: −0.02, −0.01) (Fig 5). These effects increased with age, except for platelet count and plateletcrit which showed similar estimates for all tertiles. Furthermore, although the trend was consistent for all listed factors, for red blood cell count and haemoglobin concentration, the 95% CIs included the null for the polygenic risk score of the younger participants (ages 39 to 53 years).

Fig 5.

Forest plot showing the effect estimates for the association between the polygenic score for Alzheimer’s disease (including the apolipoprotein E region) and biological measures. Legends in the right of each graph indicate age tertiles. Effect estimates are shown by box markers and confidence bands represent 95% confidence intervals. There is an age-dependent increase in the effect of the polygenic risk score on blood-based measures. This may indicate that blood-based markers may be causal in the development of Alzheimer’s disease.

Dietary measures

There was strong evidence that a higher polygenic risk score was associated with dietary factors (Supplementary Fig 3). A higher polygenic risk score for Alzheimer’s disease in the oldest participants (ages 62 to 72 years) was associated with lower odds of eating eggs and dairy products, and higher odds of having a greater intake of salad or raw vegetables. Evidence of such dietary choices was weaker in the younger participants (ages 39 to 62 years). Furthermore, a higher polygenic risk score for Alzheimer’s disease was also associated with increased odds of eating more non-oily fish and having a frequent variation in diet. There was evidence that, in all age tertiles, a higher polygenic risk score was associated with reduced odds of being in the category for always adding salt to food, lower use of butter/spreadable butter, and higher odds of using cholesterol-lowering margarines such as Flora pro-active or Benecol. Furthermore, we observed that a higher polygenic risk score was associated with lower odds of being in a higher category for pork and lamb/mutton intake in all age tertiles.

Lifestyle measures

There was strong evidence that a higher polygenic risk score was associated with a greater frequency of walking for pleasure in the last 4 weeks (logOR: 0.02; 95% CI: 0.01, 0.04), reporting more days per week including moderate (logOR: 0.03; 95% CI: 0.02, 0.04) and vigorous activity (logOR: 0.02; 95% CI: 0.01, 0.03) and finding it easier to get up in the morning (logOR: 0.03; 95% CI: 0.02, 0.04) for participants of ages 53 to 72 years (Fig 6). Furthermore, a higher polygenic risk score was also associated with higher odds of making dietary changes due to illness in this age range. There was weak evidence of association for these factors in the youngest participants examined (age 39 to 53 years). For participants in the youngest and oldest tertiles (ages 39 to 53 and 62 to 72 years), those with a higher polygenic risk score were less likely to experience sleeplessness/insomnia (logOR: −0.03; 95% CI: −0.04, −0.02). For the oldest participants, those with a higher polygenic risk score reported having less sleep (logOR: −0.02; 95% CI: −0.03, −0.01) and being less likely to nap during the day (logOR: −0.03; 95% CI: −0.04, −0.02). The evidence for these factors was weaker for participants in the younger age ranges.

Fig 6.

Forest plot showing the effect estimates for the association between the polygenic score for Alzheimer’s disease (including the apolipoprotein E region) and lifestyle measures. Legends in the right of each graph indicate age tertiles. Effect estimates are shown by box markers and confidence bands represent 95% confidence intervals. All effect estimates were derived from ordered logistic models. There is evidence of an association between the polygenic risk score and lifestyle in the two older age groups examined (ages 53-72 years), suggesting these lifestyles are an effect of the disease process.

Previously implicated risk factors for Alzheimer’s disease

For previously implicated factors in Alzheimer’s disease, a higher polygenic risk score was associated with higher systolic blood pressure in the youngest participants (ages 39-53 years) (ΔSD: 0.01; 95% CI: 0.003, 0.01) but not in the oldest participants (ages 62-72 years) (ΔSD: 0.005; 95% CI: −0.001, 0.01). Furthermore, a higher polygenic risk score was associated with a higher pulse pressure for participants in all age ranges (ΔSD: 0.02; 95% CI: 0.01, 0.02). We found little evidence of an association between the polygenic risk score and qualifications attained, social activities, anti-inflammatory treatment for rheumatoid arthritis, and sweetened beverages. There was some evidence of an association between the polygenic risk score and lower number of pack years of smoking for the older participants (ΔSD: −0.01; 95% CI: −0.02, −0.003) (Fig 7).

Fig 7.

Forest plot showing the effect estimates for the association between the polygenic score for Alzheimer’s disease (including the apolipoprotein E region) and previously implicated risk factors for Alzheimer’s disease that did not pass the corrected p-value threshold for multiple testing. Legends in the right of each graph indicate age tertiles. Effect estimates are shown by box markers and confidence bands represent 95% confidence intervals. *Effect estimates were derived from ordered logistic models and are on the log odds scale. †Effect estimates were derived from linear regression models and are on the standard deviation scale. ‡Effect estimates were derived from binary logistic regression models and are on the log odds scale. There is little evidence of an association between the polygenic risk score for Alzheimer’s disease and previously implicated risk factors for Alzheimer’s disease, except for pulse pressure, pack years of smoking, use of a hearing aid, and going to the pub or social club. Most of these associations appear in the older age group, suggesting that the associations of these previously implicated factors with Alzheimer’s disease may be attributed to the disease process or frailty.