Abstract
Polygenic risk scores (PRS) have the potential to serve as a low-cost, non-invasive screening method for Alzheimer’s disease (AD). However, the extent to which age and the Apolipoprotein E-ε4 (APOE4) risk allele influence the effect of PRS is underexplored. In a cohort of 346 superager controls (age ≥ 90 years), 2,930 controls (age 60-89) and 1,760 AD cases, we computed APOE-independent PRS for AD. When using superager controls, subjects with PRS in the top decile had nearly five times greater odds of having AD than subjects in the lowest decile (OR=4.91, P=2.24×10−6). In our cross-sectional cohort, PRS modifies the age of onset for AD among APOE4 carriers, but not among non-carriers. Among APOE4 carriers, PRS in the top decile was associated with AD onset five years earlier than in the lowest decile (70.0 vs 75.0 years; t-test P=2.4×10−5). These findings suggest that APOE-independent genetic risk disproportionately affects younger APOE4 carriers, leading to earlier disease onset, while older controls carry less genetic risk.
1. Introduction
Alzheimer’s disease (AD) has an estimated heritability of 59-79% (Gatz et al., 2006) and a decades-long pre-clinical stage, during which AD-specific pathological changes develop without clinical signs of cognitive decline (Bateman et al., 2012; Jack et al., 2010). Low-cost non-invasive testing would ideally identify at-risk individuals in the pre-clinical stage to enable preventive intervention. To this end, genetic risk factors have been proposed as useful predictors. APOE is a major risk gene for AD with dose-dependent risk effects. Compared to non-carriers of the APOEε4 allele (APOE4), heterozygous carriers of APOE4 have a three- to fourfold and homozygous carriers up to a 14-fold increased AD risk (Farrer et al., 1997; Genin et al., 2011). The APOEε2 allele (APOE2) is protective (odds ratio, OR=0.6) (Bertram et al., 2008; Bickeböller et al., 1997; Corder et al., 1994; Genin et al., 2011; Kunkle et al., 2019a). The effects of these variants are also age-dependent (Rasmussen et al., 2018). In addition to the APOE gene, genome-wide association studies (GWAS) and meta-analyses of GWAS have identified 39 risk loci in subjects of European ancestry (Jansen et al., 2019; Kunkle et al., 2019b; Lambert et al., 2013b; Marioni et al., 2018). The individual genetic risk can be aggregated into a polygenic risk score (PRS) by summing the number of risk alleles, each weighted by its effect size derived from independent GWAS summary statistics (Euesden et al., 2015; Khera et al., 2018; Purcell et al., 2009). PRS has been reported to be useful for identifying individuals at high risk for several complex and common diseases. For example, subjects whose PRS for coronary artery disease (CAD) were in the top 8% of the distribution had a threefold increased risk for CAD (Khera et al., 2018).
For AD, the predictive power of PRS derived from the GWAS meta-analysis (Lambert et al., 2013a) reached an area under the curve (AUC) of 0.78 in clinically defined AD when genetic risk factors were combined with demographic factors, and 0.84 in pathologically confirmed AD (Escott-Price et al., 2017; Escott-Price et al., 2015). The APOE gene confers most of the AD genetic risk. However, the predictive performance of PRS generally varies depending on the prevalence of the disease in a specific population (Gibson et al., 2019). Because the prevalence of AD increases with age (Brookmeyer et al., 2011; Mayeux and Stern, 2012), individuals who are ascertained as younger controls may develop dementia at an older age. In a prospective study of non-demented subjects in their 70s followed over 14 years, only 31% were still alive, of whom 71% had dementia (Kuller et al., 2016). Furthermore, the prevalence of dementia increases from around 5% in individuals in their 70s to 37% among individuals aged 90 and older (Plassman et al., 2007). Thus, nonagenarians and centenarians who remain unaffected by dementia can be viewed as aging exceptionally well cognitively, i.e. “superagers”, but they have not been systematically evaluated in studies of AD genetics.
Here, we investigate PRS in a sample that is enriched for subjects aged 90 years or older without dementia (90+; range 90-109 years) along with control subjects aged 89 or younger (89-; range 60-89 years) and AD cases at any age (range 60-99 years).
Given the known associations between age and AD risk as well as between APOE genotype and AD risk, we investigated the interactions between age, APOE and non-APOE genetic risk for AD. Specifically, we tested how the age of controls influences the proportion of AD phenotypic variance explained by non-APOE genetic risk. Because APOE affects both longevity (Sebastiani et al., 2019) and age at onset of AD (Olarte et al., 2006), we expected that both APOE genotype and age would influence the predictive power of non-APOE PRS. We hypothesized that the non-APOE PRS for AD contributes to longevity and is inversely correlated with age. We tested whether non-APOE PRS modifies the effect of the APOE4 risk genotype.
2. Methods
2.1 Participants
We selected 13 cohorts consisting of a total of 1,733 cases and 3,121 controls from the Alzheimer’s Disease Genetics Consortium (ADGC) that are independent from the 2013 International Genomics of Alzheimer’s Project (IGAP) meta-analysis (Lambert et al., 2013b) (Supplementary Table 1; Supplementary Methods). Additionally, whole genome sequences of 182 subjects of Ashkenazi ancestry from the Litwin Zucker Alzheimer’s Research Center at the Feinstein Institutes for Medical Research (LZ) were included. The sample ascertainment and assessment were described previously (Adelson et al., 2019; Freudenberg-Hua et al., 2016; Koppel et al., 2018) (Supplementary Table 1).
The enrollment of study participants in the LZ cohort was approved by the Institutional Review Board (IRB) of Northwell Health as well as the Montefiore Medical Center and the Committee on Clinical Investigation at the Albert Einstein College of Medicine. Written informed consent was obtained from all subjects or their Legally Authorized Representatives prior to publication. The ADGC data and information regarding study approval at each contributing institution or organization have been previously published (Kunkle et al., 2019a). For the ADGC cohorts, written informed consent was obtained from study participants or, for those with substantial cognitive impairment, from a caregiver, legal guardian, or other proxy, and the study protocols for all populations were reviewed and approved by the appropriate IRBs at all institutions.
Age for ADGC samples is defined differently for AD cases and controls. For controls, age at death (AAD) is used when available (n=126), otherwise the age at last examination (AAE) is used (n=2,995). For cases, age at onset (AAO), as reported by the study, is used if available (n=1,574), otherwise imputed by subtracting 10 years from AAD if available (n=102). For the remaining cases, AAE is used (n = 57). For the LZ cases (n=27), age is reported as age at enrollment, and as AAE for controls (89-: n=35, 90+: n=120).
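As an illustration, the age-assignment rules for ADGC samples described above can be sketched in Python. This is a minimal sketch and not the code used in the study; the function name and parameter names are hypothetical, and missing values are assumed to be represented as None.

```python
from typing import Optional


def assign_age(is_case: bool,
               age_at_onset: Optional[float] = None,
               age_at_death: Optional[float] = None,
               age_at_exam: Optional[float] = None) -> Optional[float]:
    """Return the analysis age for one ADGC subject (hypothetical helper).

    Cases: reported age at onset, else age at death minus 10 years,
    else age at last examination.
    Controls: age at death, else age at last examination.
    """
    if is_case:
        if age_at_onset is not None:
            return age_at_onset
        if age_at_death is not None:
            # Onset imputed as 10 years before death, as stated in the text.
            return age_at_death - 10
        return age_at_exam
    if age_at_death is not None:
        return age_at_death
    return age_at_exam
```

Under these rules, a case with no reported onset who died at 85 would enter the analysis with an age of 75.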
The total sample utilized in this study includes 1,760 AD cases (60-99 years), 2,930 89-controls (60-89 years) and 346 90+ controls.
2.2 Genotype Data Quality Control
Stringent sample quality control (QC) was performed on each cohort, removing samples with poor genotyping quality, subjects of non-European ancestry, and samples with discordant sex (Supplementary Methods).
Principal component analysis (PCA) was performed using PLINK 1.90 (Chang et al., 2015; Shaun Purcell) on linkage disequilibrium-pruned variants with minor allele frequency ≥5%. PCA was used to remove any samples more than six standard deviations from the mean eigenvector for any of the first ten principal components (PCs), using the 1000 Genomes Project Phase 3 European superpopulation as a reference (Auton et al., 2015).
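The outlier filter described above can be sketched as follows. This is a sketch under stated assumptions rather than the study pipeline: it assumes that the mean and standard deviation of each PC are computed on the reference samples, and that PC coordinates have already been exported (for example from PLINK) into plain Python lists. The function and argument names are hypothetical.

```python
from statistics import mean, stdev


def pca_outliers(sample_pcs, reference_pcs, n_pcs=10, max_sd=6.0):
    """Return IDs of samples lying more than max_sd SDs from the
    reference mean on any of the first n_pcs principal components.

    sample_pcs:    dict mapping sample ID -> list of PC coordinates
    reference_pcs: list of PC-coordinate lists for the reference samples
    """
    n = min(n_pcs, len(reference_pcs[0]))
    # Per-PC centre and spread, estimated on the reference panel (assumption).
    centers = [mean(ref[k] for ref in reference_pcs) for k in range(n)]
    spreads = [stdev(ref[k] for ref in reference_pcs) for k in range(n)]
    outliers = []
    for sample_id, pcs in sample_pcs.items():
        if any(abs(pcs[k] - centers[k]) > max_sd * spreads[k] for k in range(n)):
            outliers.append(sample_id)
    return outliers
```

Samples returned by this function would be the ones dropped before PRS calculation.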
The demographics of the sample set following merging and QC are shown in Supplementary Table 1. The distribution of APOE genotypes among cases and controls is described in Supplementary Table 2.
2.3 Polygenic Risk Score Estimation
A PRS is a sum of genetic variants, weighted by variant effect sizes for a given trait. The effect sizes are estimated from independent GWAS summary statistics. PRS were calculated at various P-value thresholds (PT) in the GWAS (Supplementary Methods). We estimated the best-fit PRS using two different control sets. One control set consisted of ADGC controls without age restriction, and the second control set consisted of controls with age ≥90 years from both ADGC and LZ. We separately optimized PRS using these two control sets (Supplementary Figure 1).
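To make the definition concrete, the weighted sum at a single P-value threshold can be sketched as below. This is a simplified illustration, not the software used in the study: it assumes that the effect size is the natural log of the GWAS odds ratio, that variants with P at or below the threshold are included, and that linkage disequilibrium clumping and exclusion of the APOE region have already been applied upstream. All names and the example variant IDs are hypothetical.

```python
import math


def polygenic_risk_score(dosages, summary_stats, p_threshold=1e-5):
    """Weighted sum of effect-allele counts for one subject.

    dosages:       dict variant ID -> effect-allele count (0, 1 or 2)
    summary_stats: dict variant ID -> (odds_ratio, p_value) from an
                   independent GWAS
    """
    score = 0.0
    for variant, (odds_ratio, p_value) in summary_stats.items():
        # Skip variants above the threshold or not genotyped in this subject.
        if p_value > p_threshold or variant not in dosages:
            continue
        # Weight assumed to be the log odds ratio (illustrative choice).
        score += dosages[variant] * math.log(odds_ratio)
    return score


# Hypothetical example: only rs_a and rs_b pass the 1e-5 threshold.
stats = {"rs_a": (1.2, 1e-8), "rs_b": (0.9, 1e-6), "rs_c": (1.5, 0.01)}
subject = {"rs_a": 2, "rs_b": 1, "rs_c": 2}
example_score = polygenic_risk_score(subject, stats)
```

Repeating the calculation over a grid of thresholds and keeping the threshold with the best model fit corresponds to the optimization step described above.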
2.4 Statistical Analysis
Statistical analysis was performed using the R language version 3.5.3 (R Core Team, 2019). Linear or logistic regression models included covariates for sex and the top ten PCs, as well as the variables of interest. All plots and statistical tests except for Figure 1a used Z-standardized PRS.
To compare PRS distributions across cases, 89-controls, and 90+ controls, we performed a one-way ANOVA and pairwise Student’s t-tests. In order to ensure that observed differences were not due to ethnicity or sex, we tested the residuals of a linear model with sex and 10 PCs as predictors and PRS as the response variable.
We evaluated the association between PRS and AD status in two datasets: 1) all cases and ADGC controls; and 2) all cases and superager (90+) controls. To determine the effect of a one standard deviation (SD) increase in PRS, we performed logistic regressions with PRS as the predictor and AD status as the response (continuous model). To determine the difference in AD OR between the extremes of PRS, we selected the bottom and top deciles of the PRS distribution and performed logistic regressions between PRS decile and AD status.
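The extremes comparison can be illustrated with a simplified calculation. The sketch below computes an unadjusted odds ratio from the 2×2 table of AD status by top versus bottom PRS decile, with a Wald 95% confidence interval. It is not the covariate-adjusted logistic regression used in the analysis (sex and PCs are omitted), so its output would not be expected to match the reported ORs exactly. Function and variable names are hypothetical.

```python
import math


def extremes_odds_ratio(prs, is_case, fraction=0.10):
    """Unadjusted OR of AD for the top vs bottom PRS fraction.

    prs:     list of PRS values, one per subject
    is_case: list of booleans (True = AD case), same order as prs
    Returns (odds_ratio, (ci_low, ci_high)) with a Wald 95% CI.
    """
    order = sorted(range(len(prs)), key=lambda i: prs[i])
    k = max(1, int(len(prs) * fraction))
    bottom, top = order[:k], order[-k:]
    a = sum(is_case[i] for i in top)      # cases in top group
    b = len(top) - a                      # controls in top group
    c = sum(is_case[i] for i in bottom)   # cases in bottom group
    d = len(bottom) - c                   # controls in bottom group
    if 0 in (a, b, c, d):
        # Haldane-Anscombe correction to avoid division by zero.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - 1.96 * se)
    high = math.exp(math.log(odds_ratio) + 1.96 * se)
    return odds_ratio, (low, high)
```

For example, with 6 cases among 10 subjects in the top decile and 2 cases among 10 in the bottom decile, the unadjusted OR is (6×8)/(4×2) = 6.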
We investigated the relationship between APOE4 genotype, age and PRS. We modeled the association between age, APOE genotype (dummy-coded with APOE3/3 as the reference level) and PRS using linear regression, with covariates for confounding variables as before. To determine if the effects were independent of AD status, we performed an additional linear regression controlling for AD status. We also tested for collinearity by calculating the variance inflation factor (VIF) with the “car” package.
To test the interaction between age and APOE genotype on PRS, we stratified by AD case-control status, categorized samples as APOE4 carriers and non-carriers, and then tested for interactions. An APOE4 carrier is defined as having APOE3/4 or 4/4, and a non-carrier as APOE2/2, 2/3 or 3/3 (excluding 2/4). We first performed linear regressions separately in cases and controls, with APOE4 carrier status and age as predictors and PRS as the response. We then performed a second linear regression stratified by AD status, adding the interaction between APOE4 carrier status and age as a predictor.
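The carrier definition above maps directly onto a small lookup. The following sketch assumes genotypes are given as strings such as "3/4" or "APOE3/4"; the function name is hypothetical and None is used here to mark the excluded 2/4 genotype.

```python
from typing import Optional


def apoe4_carrier_status(genotype: str) -> Optional[bool]:
    """Classify an APOE genotype string as carrier (True),
    non-carrier (False) or excluded (None, for APOE2/4)."""
    alleles = tuple(sorted(genotype.replace("APOE", "").strip().split("/")))
    if alleles in {("3", "4"), ("4", "4")}:
        return True
    if alleles in {("2", "2"), ("2", "3"), ("3", "3")}:
        return False
    # APOE2/4 (or an unrecognized genotype) is left out of the stratified analysis.
    return None
```

Sorting the alleles makes the classification independent of the order in which they are written, so "4/3" and "3/4" are treated identically.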
Finally, to test for the effect of these interactions on case-control status, we performed logistic regressions for the relationship between AD case-control status, APOE4 carrier status, age and PRS, with and without interactions. We also tested for collinearity by calculating VIF.
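For orientation, one way to write the full interaction model is given below. The exact parameterization is not spelled out in the text, so this form is an assumption: E4 denotes APOE4 carrier status (0/1), c is the covariate vector (sex and the top ten PCs), and the coefficient subscripts are notation introduced here for illustration. The model without interactions corresponds to setting all product-term coefficients to zero.

```latex
\operatorname{logit} P(\mathrm{AD}=1) =
  \beta_0
  + \beta_{A}\,\mathrm{age}
  + \beta_{E}\,\mathrm{E4}
  + \beta_{P}\,\mathrm{PRS}
  + \beta_{AE}\,(\mathrm{age}\times\mathrm{E4})
  + \beta_{PE}\,(\mathrm{PRS}\times\mathrm{E4})
  + \beta_{AP}\,(\mathrm{age}\times\mathrm{PRS})
  + \beta_{APE}\,(\mathrm{age}\times\mathrm{PRS}\times\mathrm{E4})
  + \boldsymbol{\gamma}^{\top}\mathbf{c}
```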
3. Results
Calculating PRS using the two control sets (all controls and 90+ controls; see 2.1, 2.3, 2.4, and Supplementary Methods) resulted in identical optimal P-value thresholds for the most predictive PRS at PT =1×10−5. The PRS at this PT is based on 92 single nucleotide polymorphisms that are associated with AD at a genome-wide significant level or suggestive association level in the IGAP summary statistics (Lambert et al., 2013b). This PRS model explains 1.5% of the phenotypic variation (R2) in our cohort using either control set, based on the Nagelkerke pseudo-R2 estimate. This PRS was significantly associated with AD status when compared with the 90+ control-set (P=1.3 × 10−6) as well as all ADGC controls (P=2.5 × 10−14). Although both control-sets resulted in the same optimal P-value thresholds for PRS, the predictive performance of variants with weak association signals was reduced when 90+ controls were used (Supplementary Figure 1).
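For readers unfamiliar with the Nagelkerke pseudo-R2, the standard formula can be sketched as follows. The function takes the log-likelihoods of an intercept-only model and of a fitted logistic model; how the PRS-specific R2 was derived in this study (for example, as the difference between models with and without PRS) is described in the Supplementary Methods and is not reproduced here. The function name is hypothetical.

```python
import math


def nagelkerke_r2(loglik_null, loglik_full, n):
    """Nagelkerke pseudo-R2 for a logistic model.

    loglik_null: log-likelihood of the intercept-only model
    loglik_full: log-likelihood of the fitted model
    n:           number of subjects
    """
    # Cox-Snell R2, then rescaled by its maximum attainable value.
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_full) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell / max_cox_snell
```

The rescaling ensures that a perfectly fitting model attains a value of 1, which the unscaled Cox-Snell statistic cannot reach for binary outcomes.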
There was a significant difference in PRS between cases, 89-controls, and 90+ controls, both without controlling for covariates and after residualizing to control for age and PCs (ANOVA P<2×10−16; F=38.9; residualized F=54.8) (Figure 1a). AD subjects had higher PRS compared to 89-controls (Bonferroni adj. post-hoc t-test P=3.6×10−14; residualized P=1.2×10−5) and the 89-controls had higher PRS than 90+ controls (t-test P=0.027; residualized P=4.4×10−16). The difference between the PRS for 90+ controls and cases was also significant (t-test P = 2.8×10−10; residualized P < 2×10−16).
We assessed the effect of PRS on AD risk by separately comparing cases with all controls and with 90+ controls (Figure 1b). We calculated “Extremes ORs” between subjects in the bottom and top deciles of PRS, and “Continuous ORs” for a one standard deviation increase in PRS. Between the 90+ controls and AD cases, subjects with PRS in the top decile had nearly five times greater odds of having AD than subjects with PRS in the lowest decile (Extremes OR=4.91, CI: 2.54-9.50, P=2.24×10−6). The ORs were also significant, but less pronounced, both for the Continuous OR with 90+ controls (Continuous OR=1.56, CI: 1.30-1.87, P=1.46×10−6) and when all controls were used (Extremes OR=2.50, CI: 1.88-3.32, P=2.54×10−10; Continuous OR=1.30, CI: 1.20-1.41, P=1.02×10−10). We evaluated whether AD PRS was more predictive when using 90+ controls by calculating the AUC for the receiver operating characteristic (ROC) curve. We found that PRS is marginally more predictive (P=0.07) when using 90+ controls (AUC=0.60; Supplementary Figure 2) than when using controls of all ages (AUC=0.57). Thus, PRS is not significantly more predictive of AD when using superager controls.
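The AUC reported above has a simple interpretation: it is the probability that a randomly chosen case has a higher PRS than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition. It is illustrative only, is not the routine used in the study, and does not implement the statistical comparison between the two AUCs. Names and the example values are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of case-control pairs in which the case
    has the higher score (ties count as 0.5).

    scores: list of PRS values
    labels: list of truthy (case) / falsy (control) flags
    """
    cases = [s for s, y in zip(scores, labels) if y]
    controls = [s for s, y in zip(scores, labels) if not y]
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy example: three of the four case-control pairs are ranked correctly.
example_auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUC of 0.5 corresponds to no discrimination, which puts the reported values of 0.57 and 0.60 in context.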
Given that superagers had particularly low AD PRS, we investigated the potential correlation of AD PRS with age (Figure 2). Age was negatively associated with PRS in the case-control design, even when AD status was accounted for (Figure 2a). In a linear regression controlling for sex, APOE and PCs, age was a significant predictor of PRS (P=2.8×10−5, β=-0.007). This effect remained after accounting for case-control status (P=1.6×10−4, β=-0.006). No APOE genotype was directly associated with PRS.
To investigate whether APOE4 carrier status modifies the correlation between PRS and age, we stratified the subjects into APOE4 carriers and non-carriers. We found that the correlation between PRS and age depended on both case-control status and APOE genotype (Figure 2a). Therefore, we investigated the interactions between APOE4 carrier status and age on PRS separately in cases and controls (Figure 2b-c). We observed that the association between age and PRS depended on APOE4 carrier status in AD cases, but not in controls.
In AD cases, age was weakly correlated with PRS (P=0.03, β=-0.007), and APOE4 carrier status was not associated with PRS (P=0.55). However, when the interaction between age and APOE (P=5.9×10−3, β=-0.017) was accounted for, APOE became a significant predictor of PRS (P=7.5×10−3, β=1.3), whereas age was no longer significant (P=0.73). Among AD cases who were APOE4 carriers, those with PRS in the top decile had a significantly earlier age at onset (70.0 ± 6.4 years) than those with PRS in the lowest decile (75.0 ± 7.3 years; P=2.4×10−5). This difference in age at onset remained significant even after removing APOE4/4 homozygous cases (71.1 vs 76.7 years; P=2.1×10−5). This indicates that PRS was associated with age at onset in AD patients, but only in APOE4 carriers (Figure 2b).
In controls, the negative correlation between age and PRS (P=0.02, β=-0.005) did not depend on APOE4 carrier status (Figure 2c), and APOE was not associated with PRS (P=0.72). When accounting for interactions, the age and APOE associations remained unchanged, and there was no interaction between the two (P=0.81).
Because the interaction between APOE genotype and age predicted PRS in AD cases, we tested whether the interactions between APOE, age and PRS predict AD status. Without modeling for interactions, age (P=4.5×10−6, β=-0.003), APOE4 carrier status (P<2.0×10−16, β=0.29) and PRS (P=2.03×10−12, β=0.045) were all predictors of AD status. When all interactions between these three predictors were included in the model, the main effects of age (P=0.93) and PRS (P=0.68) were no longer significant, indicating that the effects of age and PRS on AD status are dependent on APOE. APOE4 carrier status remained a significant predictor (P<2.0×10−16, β=1.16). The interaction between age and APOE showed a small but significant effect (P=4.15×10−12, β=-0.012) on AD status, whereas the interaction between PRS and APOE (P=0.054, β=0.228) and the three-way interaction between age, PRS and APOE (P=0.07, β=-0.003) were suggestive but did not significantly predict AD status, further supporting that the effects of PRS and age depend on APOE.
4. Discussion
We show that non-APOE PRS is inversely correlated with dementia-free survival. Among AD cases who are APOE4 carriers, a higher PRS is associated with a younger age of AD onset. Among AD cases with APOE4, those with PRS in the top decile had, on average, a disease onset five years earlier than those with PRS in the lowest decile. In contrast, no correlation between PRS and age-of-onset was observed among APOE4 non-carriers. This suggests that high PRS is particularly detrimental in APOE4 carriers, and conversely lower PRS delays dementia onset. In controls, the negative correlation between age and PRS is independent of APOE4 carrier status. Thus, superager controls are depleted of AD risk variants and, as a result, are protected from dementia at higher age.
When comparing extremes in the PRS distribution, the individuals with the top 10% of PRS are nearly five times more likely to develop AD than those in the bottom 10% when superager controls are used. This effect size is comparable to the reported findings in CAD (Khera et al., 2018). This indicates that many younger unaffected controls will develop AD later in life, which is consistent with the long pre-clinical stage of 20-25 years during which a subject does not display dementia symptoms (Bateman et al., 2012). Our observation is consistent with the higher effect sizes of GWAS risk alleles reported when superagers are used as controls (Tesi et al., 2019). The lifetime risk of AD for healthy females and males at 75 years old is estimated to be 26.2% and 19.9%, respectively (Brookmeyer et al., 2018). As the majority of controls enrolled in AD genetic studies are in their 70s and 80s, a considerable proportion of them will convert to AD at a later age unless they die from other causes. These cryptic preclinical AD cases reduce the power of case-control studies. Thus, increasing the number of superagers in case-control studies for AD may increase the power to detect additional risk variants as well as protective variants.
In this study, employing superagers did not significantly improve the predictive power of PRS compared to standard controls. A possible explanation might be the inclusion of the APOE genotypes as covariates, which have been shown to be associated with longevity, thus reducing the power of using superagers. Escott-Price et al. (2017) showed that PRS had higher predictive value than in our data, reaching an AUC of 0.74. This is likely due to the fact that APOE genotype along with age and sex were included in their models. In contrast, we see that non-APOE common genetic risks explain very little of AD risk using current models, even when superager controls are used. Thus, non-APOE AD PRS is not useful as a stand-alone test to predict individual risk of future disease. However, the strong interaction between PRS and APOE4 carrier status in AD cases suggests a greater impact of non-APOE genetic risks in younger APOE4 carriers. This has implications for the design of future association studies. Enrolling younger APOE4 carrier cases and older controls, regardless of APOE status, would yield optimal power for signal detection. However, for APOE4-negative cases, enrollment does not require age consideration.
In addition to genetic factors, novel non-genetic protective factors may best be found among healthy elderly APOE4 carriers with high PRS based on known GWAS risk loci, whereas novel risk factors may best be found among AD patients with both low PRS and no APOE4 allele.
5. Conclusions
In summary, we show that APOE-independent polygenic risk is negatively correlated with dementia-free aging and that there is a complex interaction between APOE, age and PRS. Higher polygenic risk is especially detrimental for people carrying the APOE4 risk allele. Future studies are necessary to evaluate whether interactions between PRS, APOE and other AD risk factors, such as health data and biomarkers, can improve AD risk prediction.
Data Availability
All genetic data referred to in the manuscript will be made available in a suitable repository, and are available upon reasonable request made to the corresponding author following peer-reviewed publication of the manuscript.
Funding
This work was supported by the National Institutes of Health, National Institute on Aging (NIH-NIA) (ADGC and grant numbers U01 AG032984 and RC2 AG036528). Samples from the National Cell Repository for Alzheimer’s Disease (NCRAD), which receives government support under a cooperative agreement awarded by the NIH-NIA (grant number U24 AG21886), were used in this study. The NACC database is funded by the NIH-NIA (grant number U01 AG016976). Data for this study were prepared, archived, and distributed by the National Institute on Aging Alzheimer’s Disease Data Storage Site (NIAGADS) at the University of Pennsylvania (grant number U24-AG041689-01). In addition, this work was funded by the NIH (grant numbers U01AG052411, U01AG058635, K08AG054727, R01 AG618381, and R01 AG046949), and the Einstein Nathan Shock Center (grant number P30 AG038072). Additional support was provided by the Mildred and Frank Feinberg Family Foundation, the Advancing Women in Science and Medicine Foundation, and the JPB Foundation.
Acknowledgements
The authors would like to thank the contributors who collected samples used in this study during a period of 30 years, as well as patients and their families, whose help and participation made this work possible. A list of ADGC members and their affiliations is included in the Appendix within the Supplementary Data.
Footnotes
Disclosure Statement
AMG has consulted for Eisai, Biogen, Pfizer, AbbVie, Cognition Therapeutics and GSK. She also served on the SAB at Denali Therapeutics from 2015-2018. YFH co-owns stock and stock options of Regeneron Pharmaceuticals. All other authors have no interests to declare.