Abstract
Primary open-angle glaucoma (POAG) is a complex eye disease characterized by progressive loss of optic nerve function that, if untreated, ultimately leads to irreversible blindness. To date, the biological mechanisms causing POAG are still unclear. There is disparity in POAG prevalence, clinical presentations and outcomes across ancestries. Here, we aim to identify unique genetics that underlie risk to POAG and evaluate the potential connection with vascular mechanisms. We performed POAG meta-analysis across 15 biobanks that are part of the Global Biobank Meta-analysis Initiative, with two previously published multi-ancestry analyses for a total of 1,478,037 individuals from six ancestries (46,325 cases and 1,431,712 controls). A total of 109 genome-wide significantly associated loci (p<5e-8) were identified, 18 of which were novel. Three of these novel loci are ancestry-specific, two African-specific and the third specific to northern Europeans. We also identified five sex-specific novel loci, four of which are African-specific and one European-specific. To explore biological implications underlying these variant-trait associations, we performed gene enrichment analysis, gene prioritization analysis and transcriptome-wide association studies (TWAS) implicating genes related to vascular-related functions, blood vessels, angiogenesis, and cancer. A fifth of TWAS-prioritized genes with vascular-related and/or cell senescence/proliferation functional roles or have been implicated in vascular or neoplastic diseases are primary ciliary related genes. We further performed extensive statistical validation analysis of genes in the SIX6 and well-known CDKN2B-AS1 loci, previously implicated in POAG, cardiovascular diseases, and cancers across multiple ancestries. We found evidence of significant interaction between SIX6 rs33912345 and causal variants in chr9p21.3, with concomitant effect on expression of a primary cilia gene CDKN2A and CDKN2B at the CDKN2B-AS1 locus. Phenome-wide association analysis of POAG genetic risk burden across five biobanks and genetic correlation analysis also show the shared biology between POAG and vascular and neoplastic traits. In summary, our findings suggest that some POAG risk variants may be ancestry-specific, sex-specific, or both. Our findings further support the contribution of vascular and proliferation genes in POAG and suggest potential involvement of primary cilia in POAG pathogenesis.
Introduction
Glaucoma is a complex eye disease characterized by a progressive loss of optic nerve (ON), fibers, which manifests initially as visual field loss, and if untreated ultimately leads to irreversible blindness.1 Primary open-angle glaucoma (POAG) represents the most prevalent type of glaucoma. POAG affects the trabecular meshwork (TM), the inner retina with retinal ganglion cell axons forming the optic nerve, and, presumably due to transsynaptic degeneration, the visual pathways including the visual cortex.2, 3 High intraocular pressure (IOP) is a major risk factor identified in POAG patients.4 Other risk factors are advanced age and positive family history, and difference in prevalence has been shown for POAG across ethnicities and gender.5–7 Moreover, there is disparity in POAG clinical presentations and outcome across ancestries.8, 9
Through genome-wide association studies (GWASs), significant progress has been made in understanding the genetic pathophysiology of glaucoma in humans. In addition, animal models have also provided a valuable resource for understanding the biological mechanisms.10 However, there is still a lack of understanding of the underlying pathologic mechanisms in POAG, limiting the development of specific interventions in patients.
Three hypotheses have been suggested to explain the mechanisms causing the optic nerve damage (OND) that leads to glaucoma: vascular, biomechanical, and biochemical. The vascular theory posits that OND is due to unstable ocular perfusion which leads to ischemia of the optic disk and retina. In the biomechanical theory, elevation of IOP, due to an increase in TM outflow resistance, impinges on the ON fibers and thus causes subsequent retinal ganglion cell death.11–13 The biochemical theory attributes OND to neurotoxic biological molecules, such as excitatory amino acids, caspases, protein kinases, oxygen free radicals, nitric oxide, and tumor necrosis factor-alpha.12 However, the etiology that underlies POAG may actually involve interactions among the three theories.
Several studies have found that POAG is associated with a variety of cardiovascular diseases and vascular risk factors.11, 14–23 In a previous large multi-ancestry GWAS,24 results from gene-enrichment analysis have implicated perturbation of molecular mechanisms in the vascular system that contribute to blood vessel morphogenesis, vasculature development, and regulation of endothelial cell proliferation. Analysis in large electronic health records (EHR), coupled with a zebrafish-model system, showed association of reduced genetically predicted expression of a gene that encodes for glutamate receptor GRIK5, which potentially determines blood vessel numbers, integrity in the eye, and increased vascular permeability, with comorbid vascular and eye diseases.25 However, no study has previously performed detailed exploration of the genetics that underlies the potential vascular connection with POAG across ancestries. Large population-based and clinical based biobanks offer an opportunity to do large-scale in silico investigations to elucidate common molecular systemic pathways between POAG and vascular systems.
The Global Biobank Meta-analysis Initiative (GBMI) is a collaborative group that currently involves 19 biobanks from 12 countries spanning four different continents: North America (Canada, USA), East Asia (China, Japan), Europe (Iceland, Estonia, Finland, Netherlands, Norway, Scotland, UK) and Australia. Each GBMI-affiliated biobank has paired genetic and phenotypic data collated through different types of electronic health data, such as self-report data from questionnaires, billing codes, doctors’ narrative notes, and death registry for >2.1 million individuals representing diverse ancestries: African ancestry individuals (AFR), Admixed Americans (AMR), Central or South Asians (CSA), East Asians (EAS), Europeans (EUR) and Middle Eastern (MID). Detailed description of each biobank is found in Zhou et al., 2021.26 We conducted a large-scale meta-analysis of POAG GWASs from 15 GBMI biobanks from six ancestries (n=1,487,447). We leveraged the largest, most diverse data to date combined with sophisticated statistical methods to identify unique molecular actors across ancestries and to determine the biological connections of POAG with vascular dysregulation.
Materials and Methods
Association testing
The overall design of this study is reported in Figure 1. As part of the GBMI, a large-scale multi-ancestry meta-analysis of genome-wide association studies was conducted including a total of 26,848 cases and 1,460,599 controls from 15 global biobanks across six ancestries (Suppl. Table S1, Suppl. Figure S1-2). Phenotype definition, biobank specific quality control and standardized GWAS was performed by each contributing biobank. POAG phenotyping was done using any of the three approaches: phecode mapping (BioVU, UKBB, HUNT, MGI and CCPM), ICD9/ICD10 codes (BioMe, FinnGen, EstBB, MGB, deCODE) and either one or a combination of physicians’ diagnosis, glaucoma medications or self-reporting (BBJ, TWB, Lifelines, GS and QSkin) (Suppl. Table S1).27 Potential effect of heterogeneity of phenotyping was checked by comparing a subset of manually reviewed and phecode defined patients in BioVU with an independent traditionally phenotyped cohort (Supplementary information, Suppl. Table S2).
Meta-analysis was performed followed by the variant-level quality control for each biobank by flagging markers with different allele frequencies compared to gnomAD and excluding markers with imputation quality score < 0.3 (Suppl. Table S3).26, 28 Gender-specific association analysis was also conducted across 9 biobanks in 7916 cases and 269,105 controls (males) and across 10 biobanks in 9538 cases and 342,870 controls (females) (Suppl. Table S4-5, Suppl. Figure S3-4). The number of independent loci and number of top hits for each biobank contributing to GBMI are reported in Suppl. Figure S5.
To increase the power to discover additional variants associated with POAG, a meta-analysis was performed combining GBMI (n=1,259,040), International Glaucoma Genetics Consortium (IGGC) of European ancestry (n=192,702) and Genetics of Glaucoma in People of African Descent (GGLAD) (n=26,295) summary statistics, excluding any biobank cohorts from GBMI that overlap with the two other datasets in our analysis (FinGenn, BioME and UKBB Africans). This represents the largest and most diverse GWAS in POAG up to date (number of cases=46,325, number of controls=1,431,712) (Suppl. Table S7, Suppl. Figure S9-10).24, 29 Fixed-effect meta-analyses based on inverse-variance weighting were performed. We further performed ancestry specific meta-analysis of the data. We defined genome-wide significant loci by iteratively spanning the ± 500 kb region around the most significant variant and merging overlapping regions until no genome-wide significant variants were detected within ± 500 kb. The most significant variant in each locus was selected as the index variant. To identify independent variants, clumping was performed in PLINK ( http://pngu.mgh.harvard.edu/purcell/plink/) using r2 < 0.05 as linkage disequilibrium (LD) metric threshold, physical distance of 500 kb, 1000genomes Europeans as LD reference panel and the significance threshold for index SNPs was set to P<1×10-5.30 Quantile-quantile (Q-Q) and Manhattan plots were generated to visualize the results.
POAG prevalence in European and African American were estimated in 3,414,079 BioVU patients that are ≥40 years older and self-identify as European Americans (EA, n=1,218,124, 555,132 males 662,825 females) or African Americans (AA, n=154,273, 66,691 males 87,541 females). Comorbidity with circulatory related phecodes (phecodes=394-459) in a total of 1,968,903 individuals with ≥2 instances of mention of the relevant phecode excluding individuals with any mention of eye phecodes (phecodes=360-379) (AA, n= 273379, 1549 cases 271830 controls, and EA, n= 1695524, 5446 cases 1,690,078 controls) was inferred by performing logistic regression analysis conditioned on gender and age as covariates. Odd ratio comparison was done using epitools in R.
SNPs and gene annotations
Polymorphisms associated at a genome-wide significant level (p<5e-8) LD clumped as above ( http://pngu.mgh.harvard.edu/purcell/plink/).30 Significant polymorphisms were annotated with the gene inside whose transcript-coding region they are located, or alternatively, the nearest gene. In addition, the polymorphic sites were functionally annotated using ANNOVAR.31 Exonic SNPs (single nucleotide polymorphism) were investigated further using SNPnexus to uncover non-synonymous variants.32 The possible damaging effects of non-synonymous SNPs on protein structure and function were predicted using the Sorting Intolerant From Tolerant (SIFT) and Polymorphism Phenotyping (PolyPhen) scoring tools.33, 34
Enrichment analysis
To identify the functional roles and tissue specificity of the associated variants, we performed gene prioritization and tissue- and gene-set enrichment analyses using DEPICT (Suppl. Table S8-10) in which we prioritized POAG-associated genes using a co-regulation-based method across multiple different tissues.35 The tool assesses the potential role of genes independent of the presence of an eQTL, making it possible to test more genes.
For the gene-set enrichment analysis, gene expression data from 77,840 samples was used to predict gene function for all genes in the genome based on similarities in gene expression. In DEPICT, the probability of a gene being a member of a gene set was estimated based on their co-functionality to prioritize the most likely causal genes. A total of 14,461 reconstituted gene sets was generated which represent a set of biological annotations: Gene Ontology (GO) gene sets, REACTOME gene sets, Kyoto Encyclopedia of Genes and Genomes (KEGG) gene sets, InWeb protein-protein interactions, and Mouse Genetics Initiative gene sets (MP). Bonferroni correction was applied for multiple comparisons of 14,461 independent tests (p <0.05/14,461).
For tissue enrichment, microarray data from 37,427 human tissues of 209 Medical Subject Heading (MeSH) annotations from Affymetrix HGU133a2.0 platform microarrays was used to identify genes with high expression in different cells and tissues.
TWAS and fine-mapping analyses
We used three gene-based methods, PrediXcan, joint tissue imputation (JTI) and unified test for molecular signatures (UTMOST) for correlating the genetic component of gene expression with phenotype.36–38 PrediXcan estimates gene expression weights by training a linear prediction model in a reference sample with both gene expression and SNP genotype data.39 UTMOST and JTI methods borrow information across transcriptomes of different tissues, leveraging shared genetic regulation, to improve prediction performance in a tissue-dependent manner.37
There is no ocular tissue in GTEx data, so we used proxy tissues that have biological functions that we inferred to be crucial in visual perception. These include: vascular tissues (artery and heart tissues) that are crucial in production and drainage of aqueous humor (defects in this system can cause glaucoma), tissues of the nervous systems, which the eye is considered its extension, and the liver tissue because it is a pivotal metabolic center.40–42 We consider the brain cortex as the most relevant GTEx tissue since the visual cortex is located there.
For the three models, gene expression prediction models were trained for 23 different human nervous, vascular and liver tissues (Suppl. Table S10-11) using GTEx v8 data.43 The corresponding genotype data were imputed using the University of Michigan Imputation Server, with 1000Genomes (phase 3 ver. 5) as a reference panel.44, 45 Models with non-zero weights that met set significance criteria (FDR<0.05 from the cross-validation in each tissue) were retained in the database. For each model, there was also a corresponding file with covariance data for the SNPs in each model. The three models were applied to the GBMI-IGGC-GGLAD GWAS meta-analysis summary results (n=1,478,037: 46,325 cases, 1,431,712 controls) (Suppl. Table S10-11).
Fine-mapping of TWAS association signals was done using FOCUS, a probabilistic gene-level fine-mapping method, to define credible sets of genes that explain the expression-trait signal at any given locus.46 The default non-informative priors implemented in FOCUS were used to estimate the posterior inclusion probability and a 90% credible set of genes at a given locus.
Polygenic risk scores (PRS) and pleiotropy
PRS for POAG were constructed from the leave-biobank-out GBMI-IGGC-GGLAD meta-analysis summary statistics in six different biobanks, BioVU (n=85,615), UK Biobank (n=370,088), Biobank Japan, BBJ (n=178,726), Lifelines (n=14,930), Estonian Biobank (n=53,821), and glaucoma cohort GLGS47 (n=3,739), with PRS-CS-auto option, and the best performing European reference panel (based on R2) of either 1KG Phase 3 or UK Biobank was used to estimate LD.48 In the target samples, genotypes were filtered for SNPs using the following criteria: minor allele frequency< 0.01, missing genotype rate<0.05, filters out variants which have Hardy-Weinberg equilibrium exact test p-value < 1e-6, and exclude individuals with missing data < 0.1. Only unrelated individuals with genetic relatedness less than 0.05 were retained. We used EHR/or self-reported health conditions (depending on the data availability in each biobank) in a total of 617,565 individuals across five biobanks (excluding GLGS) to explore pleiotropy of genetic risk for POAG. We then evaluated the predictive performance of the PRSs generated using leave-biobank-out meta-analysis results in the five biobanks. PRSs were then tested for association with 17 disease categories using a phenome-wide association study (PheWAS).
Genetic correlation analysis
To evaluate genetic correlations between POAG and vascular traits, cross-trait LD score regression analysis was performed using LDSC software 49 with GBMI-IGGC-GGLAD meta-analysis summary statistics and, 13 well powered published summary statistics for vascular and neoplastic traits, respectively, obtained from GWAS catalog.50 For this we used default LD scores estimated for the European ancestry populations.49
Expression effects of missense variants (male-specific rs74113753 & trans-ancestry rs33912345)
Variants in SIX6 and chr9p21.3 CDKN2B-AS1 loci have been associated with POAG 51–58 and related risk factors and endophenotypes, such as peripapillary retinal nerve fiber layer,59 optic nerve degeneration,55, 60 and vertical cup-disc ratio (VCDR)61–67 across all ancestries. The rs33912345 missense variant (p.His141Asn) is the lead variant in SIX6 locus, and is outside the DNA binding site and is speculated to affect the ability of the SIX6 gene to interact with other transcription factors and cofactors.68 rs33912345t is also a GTEx eQTL for several genes within the SIX6 locus.43 The SIX6 locus and the CDKN2B-AS1 locus on chr9p21.3 are known to be associated with POAG51–58 and cardiovascular disease69. Studies in both glaucoma models and cell lines indicated that SIX6 missense variants interact with genes in the chr9p21.3 locus.68, 70 However, no comprehensive analysis of this interaction has been done in human genetic and transcriptomic data.
The effect of the missense rs33912345 variant on expression patterns of genes SIX6 and SALRNA1 in the SIX6 locus and on genes CDKN2A and CDKN2B in the CDKN2B-AS1 locus were determined by performing regression analysis of residuals of the GTEx normalized gene expression levels. For each tissue, the possible confounders (gender, platform, the first five principal components, and Probabilistic Estimation of Expression Residuals (PEER) factors)37 have been accounted for in the analysis. Since the missense variant did not pass the QC filter, we used a proxy SNP, rs7493429 which was in high LD with the missense variant (r2=0.724, D’=0.99) determined using R package LDlinkR (https://github.com/CBIIT/LDlinkR).71 We also evaluated the effect of African specific missense variant rs74113753 using proxy rs17641032 variant (−862bp, r2=0.699, D’=1, 1000 genomes)71 on measured gene expression residuals for a total of 24 genes in a 500 kb window either side of the variant (HENMT1-AMPD2) that passed quality filters in GTEx muscle skeletal tissues (n=706) in all the samples and sex stratified set (males=469, females=237).
We further evaluated the effect of the variant rs33912345 in the SIX6 locus on trait association to variants in CDKN2B-AS1 locus in four steps. We first checked the differences in the gene expressions of SIX6 and SALRNA1 measured in skeletal muscle tissues (one of the only tissues with GTEx expression data remaining after correcting for confounders) between the haplotype background that has the reference and derived allele. Secondly, we rebuilt the muscle skeletal genes expression model while excluding any variants that were in LD with the rs33912345 missense variant (r2>0.1). The PrediXcan and UTMOST muscle tissue models were the best performing gene models for SIX6 (r2=0.131, compared to JTI r2=0.120) and SALRNA1 (r2=0.024 compared to PrediXcan r2=0.016), respectively. Thirdly, we evaluated the Pearson correlations in gene expressions between genes in the SIX6 and CDKN2B-AS1 loci. We then checked for statistical interaction between the SIX6 sentinel SNP and two genome-wide significant variants from the CDKN2B-AS1 locus with traits that represent groups in LD with glaucoma association in the loci. The chr9p21.3 POAG associated GWAS variants are in LD in European ancestry individuals (r2>0.1) with more than 140 GWAS catalog variants in the ∼260 kb cluster associated with vascular69, carcinoma, and neurological traits.50 We used two variants in the locus that are associated with a whole range of the cardiovascular and cancer related traits; rs2891168 (coronary artery disease, myocardial infarction, and beta blocking agent use measurement),72–77 and rs10811650 (breast cancer, melanoma, and hair color).78–81 We finally determined if these interactions had effects on expression patterns of CDKN2A & CDKN2B in GTEx data brain cortex tissue. Mean expression difference between allele combinations was assessed using Student’s t-test and ANOVA (Suppl. Table S20).
To determine phenotypic consequences of the SIX-CDKN2B-AS1 loci interaction and African male-specific missense variant, we further performed TWAS-PheWAS for the two chr9p21.3 genes and CELSR2 (that have male-specific expression association with the missense variant) gene in summary statistics from the UKBB (n=396,618) and BioVU (n=59,805), followed by meta-analysis of the two PheWAS (n=456,423) across 731 traits and diseases grouped into 17 categories (Suppl. Table S21).82, 83
Results
Discovery of novel ancestry-specific POAG loci
We report here a multi-ancestry genome-wide association study of 26,848 primary open-angle glaucoma cases and 1,460,593 controls from 19 biobanks. By analyzing all the biobanks together, a total of 62 loci that reach the genome-wide significant threshold were identified (Figure 2A, Suppl. Table S3; Suppl. Figure 5). Six loci were not previously reported for POAG, and encompass the genes F5, RPL37A-LINC01280, ZFP91-CNTF-GLYAT, KALRN, CCDC13 and MIR2054-INTU (Table 1). Of these six loci, the latter four were replicated in an independent glaucoma cohort (Figure 2B, correlation between the effect sizes r = 0.71).24 A newly defined ANGPTL7-MTOR locus encompasses previously reported rare variants.84 A Bayesian cross-ancestry meta-analysis identified three additional loci, one of which is specific to African ancestry and appears to be a novel locus (rs77136907_MYO1B;NABP1, p=2.74e-08), using Meta-Regression of Multi-AncEstry Genetic Association (MR-MEGA)85 (Suppl. Table S7).26 Three previously reported loci and identified here might be due to other glaucoma subtypes (Supplementary information). Therefore, we cannot rule out potential effect of non-POAG glaucoma signals i.e., especially due to primary angle closure in EAS populations where the glaucoma subtype is common,6,86–88 or from biobanks where phenotyping was based on self-reporting (Supplementary information).
To evaluate the cross-ancestry consistency in effect sizes, we estimated the correlation between betas of independent genome-wide significant SNPs between European, African and Asian populations in the GBMI dataset. The results showed the highest correlation between the European and Asian ancestry (Pearson correlation coefficient (r) = 0.87), while correlation between African with to other continental populations was much lower: European and African ancestry (r = 0.34), and African and Asian ancestry (r = 0.25) (Suppl. Figures S6-S8). However, three loci showed statistical heterogeneity in effect between ancestry and biobanks: novel rs12476634_LINC01280 (in African and Hispanic Americans - African-Specific henceforth), well known rs74315329_MYOC (Europeans, European-specific henceforth), novel rs1469837390_F5 (FinnGen & EstBB) and newly defined rs147660927_ANGPTL7-MTOR84(HUNT, FinnGen, EstBB, Lifelines, QSKIN) (Northern European-specific henceforth; Suppl. Table S3, S7).26 The well-known MYOC p.Gln368Ter variant, which hitherto had not been defined as ancestry specific, shows association in GBMI European-ancestry biobanks only.89 Additionally, >92% (289/314) carriers of this allele in the latest version of gnomAD database are of European ancestry, and the few that are reported in populations of other ancestries, have history of admixture with European populations. Furthermore, in 1000 genomes project this variant is present in Europeans and an individual of South Asian ancestry sampled in the UK.28, 48 The three novel loci and newly defined locus that we classify as either African- or northern European-specific have similar patterns that are highly-specific to the two ancestries described here.28, 48 The lead variants in northern-European-specific loci are only reported in Finnish and non-Finnish European individuals in gnomAD, while the lead variant in African-specific loci are observed in this study and the gnomAD/1000 genome databases in individuals of African and Hispanic Americans, who are generally African-admixed.90–92 In addition, 14 loci show significant effect size differences between ancestries based on Cochran heterogeneity test (Suppl. Table S7).
These ancestry specific loci and those that show heterogeneity between the ancestry loci might be driving most of the overall heterogeneity observed between the biobanks (Suppl. Figure S1)26 Phenotypic heterogeneity might also be a contributing factor with all self-reporting biobanks clustering together (Suppl. Figure S1) relative to those that used ICD-codes to identify cases. We did not find any EAS (BBJ and TWB) specific loci that demonstrated significant evidence for heterogeneity between the ancestries (Suppl. Table S7).
Identification of novel sex-specific loci
We performed analyses separately by sex to identify any variants that demonstrated sex-specific association, or effect size heterogeneity by sex for the aforementioned index variants. Sex-stratified GWAS identified one low-frequency African-specific association in females rs116625313_PRKG2;RASGEF1B (females p=2.85e-8, beta=1.52 vs males p=0.35, beta=-0.59) (Suppl. Table S4). In addition, four novel loci with POAG-association specific to males were identified (Suppl. Table S5) with low-frequency lead variants, three of which are African-specific (2-6% in population frequencies): rs111739240_TMEM167B-C1orf194 on chromosome 1, rs17057880_MIR3142HG-ATP10B on chromosome 5 and rs114598725_ARMC4 on chromosome 10, and one is European-specific, rs150385013_LINC02024-LOC105374060 on chromosome 3 (>0.1% population frequency) (Suppl. Table S4-S5).
We next searched for coding variants at associated loci that explain the association signal and therefore implicate a highly-likely functional gene. The novel rs111739240_TMEM167B-C1orf194 variant is in perfect LD (r2=1) in African ancestry populations with rs74113753 C1orf194 missense variant that has also significant but slightly weaker association signal in our study and considered deleterious and possibly damaging by the SIFT and LoFtool, respectively.93 A proxy variant, rs17641032, that tags the missense variant (r2=0.7 in 1000genomes Africans) is at 7% and 3% frequencies in European and East-Asian population, respectively. However, the proxy variant is associated with POAG in African ancestry BioVU male subjects (AFR n=69 p=5.05e-4) but not in European ancestry subjects (EUR, n=213, p=0.892), confirming African-specific association of these locus. The proxy rs17641032 variant is a GTEx eQTL for TMEM167B and AMIGO1 in eight tissues and one tissue, respectively.94 In GTEx tissue with the largest sample size, muscle skeletal (but not among the eight tissues reported in GTEx), the variant is associated with expression changes for three genes (TMEM167B p=0.00027, CELSR2 p=0.0086 and AMPD2 p=0.041). However, only CELSR2 has male-specific expression changes at this variant (males p=0.0063, females p=0.38) in GTEx muscle skeletal tissue (Suppl. Figure S10). While the other two genes have weakened signals in both genders as expected due to reduced sample sizes in sex-stratified analysis in muscle skeletal and five more tissues out of seven, which had signals with nominal significant association out of all the 49 GTEx tissues (Suppl. Table S6). In addition, TWAS-PheWAS analysis revealed associations of the CELSR2 with traits/phenotypes within the endocrine/metabolic (lipid traits) and circulatory groups with hyperlipidemia and angina pectoris, respectively, as the top phenotype associations (Suppl. Table S22).
In addition, 14 of the loci that have association signals in combined both-sex meta-analysis have significant albeit attenuated signals, only in either male (3 loci) or females (11 loci) in sex stratified analysis (Suppl. Table S4, S5). Four of these loci (near CADM2, DGKG, KALRN and ARHGEF12) show significant effect size differences between males and females based on Cochran heterogeneity test (Suppl. Table S4, S5, Suppl. Figure S9). All four genes prioritized using TWAS that are near lead variants for the four loci that show significant heterogeneity between males and females (CADM2, DGKG, KALRN and ARGHEF2), have also been shown to have sex-biased expression patterns.95, 96
We examined differences in POAG prevalence between genders in BioVU subjects who are ≥ 40yrs and self-identify as European Americans (EA) or African (AA) ancestry. In 12,755 POAG cases in overall total of 1,372,397 BioVU subjects, there was higher odds of POAG in AA males relative to females (OR=1.15, CI 1.06-1.24, p=3.85e-4) while the risk was marginally lower in males than females in EA (OR=0.96, CI 0.92-1 (p=0.048).
We further performed meta-analysis of GBMI and public datasets from IGGC and GGLAD, generating the largest and most diverse GWAS to date, to identify additional loci that are associated with POAG. Ten additional novel loci that encompass the following prioritized genes LOC654841, KBTBD8, ADGRL3, DDIT4L, HMGXB3, KCNK5, MAD1L1, APPL2, CATSPERB, and FENDRR attained genome-wide significance, all of which were associated with POAG at nominal significance in the GBMI and IGGC data (Suppl. Table S7, Suppl. Figure S11, S12). Five of these ten genes in the novel loci are involved in cardiovascular conditions and six in cancer processes.77, 97–108 Three loci that were novel in GBMI, ANGPTL7, CCDC13, and LHPP, were attenuated to sub-genome wide significance level in GBMI-IGGC-GGLAD meta-analysis (p-values of 0.0043, 6.202e-08 and 2.506e-07, respectively, Suppl. Table S3, S7, S12).
Enrichment analyses prioritize vascular and cell proliferation mechanisms
To identify the functional roles of POAG-associated variants and which tissues are mediating the genetic effects, we performed enrichment analyses with DEPICT using meta-analysis of GBMI-IGGC-GGLAD summary statistics (see methods section). DEPICT prioritized 101 co-regulated genes, of which 43 (43%) have been previously reported for POAG and 58 (57%) that were novel (Suppl. Table S8). Nearly 50% (49 genes) of the genes identified here were reported to be associated with vascular traits including cardiac disease, and arterial stiffness measurement while nearly a third (34 genes) were associated with cancer, including cutaneous melanoma, keratinocyte carcinoma, breast carcinoma, prostate carcinoma, and lung carcinoma (Suppl. Table S8). The gene-enrichment analysis was performed using the identified 110 biological pathways, reaching the Bonferroni corrected level of significance (p< 3.45e-6) (Suppl. Table S9). Among the top and common enrichment signals were biological pathways involved in vascular related blood vessel development/morphogenesis, angiogenesis and epithelial cell proliferation (Suppl. Table S8). In addition, several biological pathways crucial in cell development and proliferation like the IGF1 subnetwork, insulin-like growth factor binding and Wnt signaling were significantly enriched (Suppl. Table S9). Tissue/cell type enrichment analysis prioritized 10 significant tissues and cell types, with musculoskeletal and cardiovascular systems being the most represented (Suppl. Table S10).
TWAS Analyses identify novel associations
TWAS of the GBMI-IGGC-GGLAD meta-analysis summary statistics identified 19 gene-trait associations (p<2.5e-6, Bonferroni correction for mean 20,000 genes tested) in GTEx tissue biologically most relevant to ocular traits, brain cortex tissue (Suppl. Table S11). Most of these genes (16/19) showed associations in at least one other tissue among the additional 22 other GTEx tissues potentially relevant to ocular diseases (Suppl. Table S11-S13). The JTI model generated most genes and gene-trait associations among the three cis prediction models (Suppl. Table S12, S13, Suppl. Figure S13). However, the majority of the gene-trait associations in JTI overlap with what was obtained using PrediXcan and UTMOST models (Suppl. Figure S13). Using the three cis models made it possible to have a comprehensive TWAS prioritization of gene-trait associations, for example one additional gene-trait association that transect novel GWAS loci identified in this study was prioritized using PrediXcan and UTMOST models but not the most robust JTI (Suppl. Table S13).
Overall, 221 out of 271 gene-trait associations across the 23 tissues were estimated to define the 90% credible set at the locus via probabilistic fine-mapping. 116 of the gene-trait associations in the 90% credible set intersect with 57 out of the 103 GWAS loci identified. A total of 14 of these gene-trait associations were novel, nine of which were unique to GBMI (Suppl. Table S12, S13). 156 gene-trait associations intersect 64 previously known loci, while 86 did not intersect with any of the genome-wide significant risk loci and correspond to 67 loci with subgenome-wide GWAS signals (TWAS ‘loci’) that will potentially attain significance in a more powered GWAS (Suppl. Figure S13, Suppl. Table S12, S13). For example, SEC31B (SEC31 Homolog B, COPII Coat Complex Component), a gene on chromosome 10, which is a novel significant TWAS association signal in three GTEx brain tissues using the JTI model (Suppl. Table S12). The SNP located near this gene, rs11190559, was the strongest SNP (p=2.4e-5) with 47 other variants in the locus with association signal below 1e-4 (Suppl. Table S12, S13). The SEC31B gene has been reported in the context of two anomalous pigmentary syndromes with ocular manifestations; retinitis pigmentosa 37 and Hermansky-Pudlak syndrome 1.109 A GWAS variant in the vicinity of the gene is associated with heart rate.110 Moreover, loss-of-function mutations in SEC31B promote colorectal cancer metastasis and a rare form of anemia.111, 112
Seven of the novel TWAS ‘loci’ were unique to PrediXcan and/or UTMOST models (Suppl. Figure S13, Suppl. Table S13). Overall, nearly two thirds of the genes cumulatively prioritized using the three cis TWAS models (181/271) have vascular-related and/or cell senescence/proliferation functional roles, or have been implicated in vascular or neoplastic diseases. A fifth of these vascular and cancer related genes are cilia-related genes (Suppl. Table S14).
We confirmed that all but four TWAS-prioritized genes (14/18) that transect novel loci associated with POAG are robustly expressed in all eye tissues based on publicly available Ocular Tissue Database (OTD).113, 114 However, the four genes missing in OTD: ABHD18, ACKR2, LAMTOR3 and ALDH1L2 have been shown to be expressed in mouse and pig retina.115, 116
PRS prediction performance and effect of POAG liability across HER
Prediction performance of the leave-biobank-out GBMI-IGGC-GGLAD POAG meta-analysis as discovery GWAS in four biobanks with European ancestry subjects as estimated by R2 on the liability scale ranged from 0.0166 (95% CI 0.01, 0.025) for Lifelines to 0.0484 (95% CI 0.042, 0.056) for UKBB (Figure 3A). However, consistent with previous findings, performance was lower for the two non-European ancestry populations (Figure 3A).117 Results for the more balanced case-control ratio of ∼1:4 in BioVU and Lifelines showed an improvement in the R2 (Suppl. Figure S14).118–121 Similarly, European ancestry subjects with polygenic risk scores (PRSs) in the highest risk decile of the PRS distribution had 2.1 to 4-fold higher odds of POAG compared with those in the mid decile (95% CI=1.8,2.24 and 3.34,4.79). In African ancestry individuals the odds ratio between the highest and mid decile polygenic risk was slightly lower but significant [2.36-fold (95% CI=1.21,4.97)] and in East-Asian samples 1.72-fold (95% CI=1.59,1.85) (Figure 3B). Further, the PRSs were robustly associated with POAG phecode across all biobanks (Suppl. Table S15-21).
In general, summary statistics from IGGC yielded better predictions and odds ratios across the five biobanks and the GLGS cohort, relative to GBMI data (Figure 3A, 3B, Suppl. Figure S14).117 Moreover, there was no significant improvement in predictions when using GBMI-IGGC-GGLAD meta-analysis summary data as a source relative to IGGC (Figure 3A). This could be due to phenotype heterogeneity and potentially unbalanced case-control ratio in GBMI relative to IGGC. These were exemplified by elevated prediction observed for the more accurately phenotyped GLGS cohort with balanced case-control ratio compared to equally balanced target data in BioVU and Lifelines (Supplementary Information, Suppl. Figure S14).
As expected, the genetic risk for POAG was associated with ocular traits, but also with other phenotypes, like circulatory, neoplasm and musculoskeletal traits, across the five biobanks (Figure 3C, Suppl. Table S15-S21). Additional association signals with circulatory traits and neoplasms were observed in African (e.g., UKBB, Fisher’s test p=0.0376) and European ancestry (e.g., BioVU p=0.0002) cohorts, respectively (Figure 3C, Suppl. Table S15, S18). We further explored POAG comorbidity pattern across circulatory phecodes (n=171, ≥ 100 cases) in a total of 1,968,903 BioVU subjects, African ancestry (n=273,379) and European ancestry (n=1,695,524), after excluding all individuals with other eye phecodes (phecode 360-379) and correcting for age and sex. African ancestry individuals who have POAG have significantly higher odds of comorbidity with a circulatory phecode relative to European ancestry individuals (OR=2.63, CI=1.66-4.16, p=3.4e-5).
Genetic correlation analysis
We observed negative genetic correlation between POAG and all but one of the 11 vascular traits tested, atherosclerosis (Suppl. Table S23). However, only pulmonary heart diseases had statistically significant genetic correlation (Suppl. Table S23). The two neoplastic traits that were tested, prostate and breast carcinomas were positively correlated but not statistically significant.
Interaction between the SIX6 and CDKN2B-AS1 loci
We confirmed that the haplotype containing the rs33912345 risk allele has reduced SALRNA1 and SIX6 expression relative to the haplotype containing the reference allele (Figure 4A-B), potentially indicating that the missense rs33912345 variant contributes to etiology of POAG by downregulating the expression of the genes in the SIX6 locus.
To further study if there is an association signal in SIX6 locus independent of rs33912345 missense variant, we rebuilt the prediction model in skeletal muscle tissue by excluding all variants in LD with the missense variant (r2<0.1). We found weak or no association signals with either SIX6 (p=0.041 vs p=5.93e-15 for original model) or SALRNA1 (p=0.777 vs p=1.96e-32 for the original model) with LD-constrained rebuilt PrediXcan and UTMOST models, respectively, indicating that all the association signals detected in the locus in the original gene models are attributable to the missense variant. Similarly, all the chr9p21.3 association signals observed with POAG are mainly attributable to exonic CDKN2B-AS1 rs1008878 variant (Suppl. Figure S15).
We next explored interaction between rs33912345 proxy variant and CDKN2 genes, and potential consequences on the expression pattern of these chr9p21.3 genes. We first confirmed that there was correlation between expression levels of SIX6 and chr9p21.3 loci TWAS prioritized genes in GTEx data in muscle skeletal tissue (Figure 4A, Suppl. Figure S16-18). We also observed interaction between the two loci for both POAG and vascular related traits (Phecodes 411, 411.1 - 411.9) in BioVU (Suppl. Table S24). We further found that the proxy variant has significant effect on expression of CDKN2A and CDKN2B via interaction with chr9p21.3 top GWAS variants that are associated with vascular and neoplastic traits in GWAS catalog (rs10811650 and rs2891168). The eQTL effect of the chr9p21.3 variants disappear in the presence of rs33912345 proxy causal allele (Figure 4B-C). However, in the presence of homozygous rs33912345 proxy, the eQTL effect is observed for those with homozygous causal alleles for the local variants (Figure 4B-C, Suppl. Figure S16-18).
We explored potential biological consequences of the interaction between the two loci by performing TWAS-PheWAS for the two CDKN2B-AS1 loci genes in the UKBB (n=396,618) and BioVU (n=59,805) cohorts. We applied the best performing model with the highest prediction ability estimated from the cross-validation for each gene among PrediXcan, UTMOST, and JTI.36–38 For the two chr9p21.3 genes the best performing models in brain cortex tissue were JTI (CDKN2A, r2=0.0257) and UTMOST (CDKN2B, r2=0.0413). We performed GReX-PheWAS for the two chr9p21.3 genes in the same two cohorts, followed by meta-analysis of the two PheWAS (n=456,423) across 731 traits and diseases, grouped into 17 categories. The GReX-PheWAS analysis revealed associations for CDKN2A/B in the CDKN2B-AS1 locus with glaucoma and enrichment for phenotypes of the circulatory and neoplasm groups with coronary atherosclerosis and skin cancers as the top phenotype associations (Figure 4D). These phenotypes include traits that are a result of senescence (upregulation) and cell proliferations (downregulation). We also detected associations with lipid disorders, which have been implicated in both these two groups, consistent with the defined functions of the two genes, suggesting that an etiology that connects these traits is a balance between cell proliferation and senescence.122, 123
Discussion
In this study, we aimed to determine the genetic mechanisms underlying POAG. We leveraged a suite of statistical tools to analyze a combination of massive GBMI genome-wide discovery resources, previously published GWAS and publicly available GTEx data. In a large-scale GBMI GWAS meta-analysis for POAG across 15 biobanks (n=1,487,447) we identified a total of 62 risk loci, of which six were novel and four of these were replicated in IGGC.24 The replicated associations encompass loci near genes INTU-MIR2054, KALRN, and ZFP91-GLYAT-CNTF, CCDC13, with only the latter not confirmed by TWAS.124–127 A rare missense and nonsense variants in a loci we defined in this study (ANGPTL7-MTOR) had been identified in individuals with European ancestries and reported to be protective against glaucoma.84 Meta-analysis of GBMI, IGGC and GGLAD cohorts revealed additional novel associations in the loci that encompass KBTBD8, LOC654841, ADGRL3, DDIT4L, HMGXB3, KCNK5, MAD1L1, APPL2-KCCAT198, CATSPERB and FENDRR genes. TWAS prioritized six novel POAG susceptibility genes that correspond to five of the ten additional novel loci, and all have functional roles or have previously been implicated in vascular and/or neoplastic related pathologies: COL4A4 128, MFF 129 (LOC654841), LAMTOR3 130, 131 (DDIT4L), HMGXB3132, 133 (HMGXB3), ALDH1L2134 (APPL2-, KCCAT198) and CCDC88C135–137 (CATSPERB). By virtue of being gene-based and a more well powered analysis than GWAS, TWAS identified potentially 67 additional novel ‘loci’ that are at subgenome-wide GWAS association. 111, 112 Overall, we leveraged multiple GWAS and TWAS methods in a massive multi-ancestry study, cumulatively identifying 176 POAG risk loci. Review of literature on functional roles of the genes that fall within these loci coupled with enrichment, genetic correlation and PheWAS analyses all point POAG risk to potential link to vascular and proliferation mechanism.
Two thirds of all the TWAS-prioritized POAG associated genes identified in this study have previously been associated with vascular related traits and/or implicated in carcinoma and act as tumor suppressors. In addition, all the genes that are near or regulated by the sex associated novel loci are also implicated with risk of vascular related diseases, serum cholesterol metabolism and neoplasms.96, 138–144 We performed enrichment analyses in which cell proliferation and angiogenesis pathways were significantly represented. In summary, enrichment analysis of the largest GWAS for POAG reported reinforced the role of vascular processes in glaucoma. The results also highlight pathways related to cell proliferation mechanisms that have not been reported in previous GWAS studies. Therefore, these results add more insights into understanding glaucoma pathogenesis. We further performed genetic correlation with vascular related GWAS traits and explored pleiotropy of genetic risk for POAG across the phenome in five biobanks, and confirmed association of POAG genetic risk burden with vascular and neoplastic related traits. Interestingly, consistent with our findings, an accompanying manuscript in this issue -- on genetic drug discovery/repurposing for the GBMI POAG GWAS -- identified four molecules approved for treatment of vascular-related conditions and two additional molecules approved for neoplasm related conditions.145 In addition, an anti-hyperlipidemic drug initially developed for the treatment of coronary artery disease, probucol, was prioritized.145
Previous studies showed disparity in POAG prevalence across ancestries and sexes. While non-genetic factors are likely to drive many of the observed health disparities characterized to date, differences in some underlying genetics including differences in frequency among populations, unique unobserved rare variants and/or difference in effect sizes at common SNPs play a significant role in differential disease etiology.146 In the meta-analysis on both sexes we observed three novel loci and one newly defined locus that are ancestry-specific, and a well-known POAG locus that showed ancestry-specific effects. Sex-stratified association analyses identified additional three African-specific novel loci that were associated with POAG in males only: ATPB10, TMEM167B-C1orf194 and ARMC4, and one African-specific novel locus PRKG2;RASGEF1B that is associated with POAG in females only. PRKG2, the closest gene to the lead female associated variant, is induced by estrogen and progesterone.147 Interestingly, estrogen has been reported as having protective effects in glaucoma.148 14 additional loci show significant effect size differences between ancestries, three of which also have significant difference in effect between males and females. Genes prioritized using TWAS that are near the lead variants for the loci these at show effect difference between the sexes have sex-biased gene expression patterns.95, 96 We further performed in-silico validation of the novel male associated African-specific TMEM167B-C1orf194 locus and showed that the variant is a male-specific regulator of CELSR2. CELSR2 has previously been reported to be differentially expressed between males and females and associated with cardiovascular traits.96 ARMC4 has also been shown to be differentially expressed between males and females.96, 139, 140
Results from polygenic risk score PheWAS across the BioVU, UKBB and EstBB also suggest that genetic risk burdens of POAG are associated with vascular related traits, especially in African ancestry, and neoplasms in European ancestry populations. Moreover, we observed higher incidence of POAG comorbidity with vascular related traits in African ancestry relative to individuals of European ancestry. These potentially suggest unique shared biology between POAG and circulatory traits in African ancestry populations. It is not clear if our findings are linked to other reports that show higher genetic burden of systemic vascular complications in individuals of African ancestry compared to individuals of European ancestry.149–151 Consistent with previous findings,152 our study also shows gender related disparity in POAG in African ancestry. A larger epidemiological and genetic study in individuals of different ancestries will clarify the role of GWAS signals identified here or other additional signals in differential risk burden between ancestries and genders, distinct from what has been reported globally.153
We further performed extensive in-silico validation of the SIX6 and CDKN2B-AS1 loci, using GTEx and EHR data across BioVU and UKBB. In our study we found evidence of significant interaction between SIX6 rs33912345 and causal variants in chr9p21.3, with concomitant effect on expression of a primary cilia gene CDKN2A, and CDKN2B in the CDKN2B-AS1 locus. These effects are particularly enhanced between homozygous rs33912345 and homozygous causal variants in chr9p21.3. Our results highlight a regulatory mechanism for rs33912345, which aligns to previously reported regulatory roles for this variant. For example, a Brazilian population study reported additive association effect of an independent chromosome 16 homozygous rs1362756 in SALL1 gene in the presence of rs33912345 CC genotype, but not with heterozygous or wildtype genotypes.154 Furthermore, Carnes and co-authors reported that patients who have rs33912345 CC genotype have a significantly thinner retinal nerve fiber layer, (a risk factor for glaucoma) than patients homozygous with non-risk allele.155 This suggest that trans-effect of the rs33912345 variant in POAG risk is complex and potentially involve multiple loci and genes. In mouse model with elevated IOP, the SIX6 missense variant was shown to increase the risk of glaucoma-associated vision loss by disrupting the development of the neural retina by upregulating the CDKN2A expression, leading to a reduced number of retinal ganglion cells (RGCs)68, 70, 155, 156 In addition, TWAS-PheWAS analysis for the two chr9p21.3 locus genes revealed associations with circulatory and neoplasm traits. Thus, the SIX6/CDKN2A/B association pattern observed here potentially reflects the balance between tumor suppression, senescence (apoptosis, autophagic) which favors angiogenesis, and cell proliferation or tumorigenesis. Future analyses will clarify on the trans effect of the rs33912345 with other POAG loci in the genome in the context of glaucoma etiology.
By integrating evidence from vascular and cell proliferation mechanisms that emerged from our results, we suggest that primary cilia may potentially be a common link between vascular diseases and glaucoma pathophysiology. The novel associated genes identified in the GBMI GWAS, CCDC13, CNTF and INTU, the gene with male-specific association in Africans, CELSR2, and more than 40 other genes prioritized using TWAS, have been found to be linked to primary cilia, a microtubule-based cellular structure located on the surfaces of vertebrate cells.157–159 In addition, the other male specific associations were in ATP10B and ARMC4 genes, which are cilia-related genes implicated in ocular, vascular, and neoplastic traits.81, 139, 160–164 Primary cilia cells are crucial in normal function of the smooth muscles and play important roles in cell proliferation mechanisms.165 In fact, primary cilia, through their dysfunction, contribute to cancer via interference in cancer signaling pathways, such as the Wnt signaling pathway.165, 166 Primary cilia on vascular endothelium have been proposed to play a critical role in the regulation of vascular barrier.165 The vascular barrier controls the exchange of molecules and ions between blood and tissues and prevents dangerous substances from entering the tissue and causing damage.166, 167 Primary cilia are present in the eye, i.e. retinal pigment epithelium, and they carry out sensory function through various signaling pathways.167, 168 Loss of primary cilia is associated with several pathologies that have anomalies in these mechanisms, such as retinopathy and Leber congenital amaurosis.169, 170
In the eye, the TM also contains primary cilia and these structures change with IOP. When the IOP is elevated, primary cilia shorten, promoting the expression of tumor necrosis factor α and transforming growth factor β (TGF-β), perhaps initiating mechanisms leading to glaucoma.171 Some investigations have suggested a direct link between Wnt signaling and glaucoma, since Wnt signaling is involved in the regeneration of the optic nerve after injury and also in IOP regulation.135, 172, 173 Recent studies have shown that one of the first genes linked to glaucoma, MYOC producing myocilin, is present in the base of primary cilia and its expression affects pathways related to ciliary signaling, such as TGF-β.174, 175 This suggests that ciliary pathways are associated with the secretion of myocilin, of which accumulation in the TM in mutant MYOC causing several open-angle glaucoma subtypes.176 Investigating the mechanisms that influence specific primary cilia functionality will contribute to understanding the role of this structure in the pathogenesis of glaucoma, and eventually to develop novel drugs and therapies.
Strengths and limitations of the study
In this study, there are several strong methodological points as well as limitations. The study represents the largest multi-ancestry genetic study conducted in POAG. The large sample size provided great statistical power to detect novel associations that were also replicated, with the same direction of effect, in the independent IGGC data. Furthermore, the meta-analysis of two datasets and vast resources from different biobanks afforded an opportunity to explore genetic interactions between likely causal variants of glaucoma, and to conduct in silico investigation of the potential shared biology between POAG, cancer, vascular traits and POAG.
Whereas our sample size is larger compared with other POAG studies, several issues that are inherent in a biobank collaborative context like GBMI might lead to a conservative signal detection. POAG phenotyping methods in GBMI were different across the biobanks. In addition, most of the phenotyping was for glaucoma and not specifically POAG, potentially introducing biological heterogeneity. Even though this is not a major issue in biobanks with a majority of European and African ancestry subjects where POAG is the predominant glaucoma subtype, the signals detected from Asians will also probably be from primary angle-closure glaucoma, where this subtype predominates.6,86–88 In addition, signals from other minor glaucoma subtypes (such as exfoliation glaucoma) might also introduce noise. Moreover, despite using a powerful approach to mitigate the effect of unbalanced case-control sampling inherent in a biobank context, the attenuating effect of inappropriate controls will persist, making our signal detection conservative.
Furthermore, despite the fact that this analysis used multi-ancestry meta-analysis, the subjects are still predominantly of European ancestry. We believe that recruitment of cohorts that include ethnic minorities will improve knowledge transferability and health equity.
Finally, even though we did extensive in-silico validation using available data and gleaned broad biological pathways that might be implicated in POAG, we did not perform functional validation of the novel genes identified in this study. Therefore, further in vitro studies using relevant human tissues and in vivo using model organisms are still necessary to define the biological role of these genes in POAG and potential link to vascular and neoplastic traits.
Conclusion and future directions
Our study, the largest and most diverse GWAS in POAG up to date, identified ten ancestry-specific loci, five of which were associated with POAG in either male or female only. A larger genetic study in individuals of diverse ancestries will clarify the role of GWAS signals identified here or other additional signals in differential risk burden between different continental populations, and males and females for this trait. Moreover, there is a need for further studies to ascertain the magnitude of inter-loci interactions and the roles of the SIX6 missense variant in the overall interaction landscape. Further, the role of these interactions in difference in burden for POAG and vascular/cancer related traits between the ancestries need to be elucidated. In summary, by integrating evidence from vascular and cell proliferation mechanisms that emerged from our results, we suggest that primary cilia may potentially be involved in glaucoma pathophysiology. Further investigations are needed to elucidate their role in glaucoma.
Author contributions
Study design: V.L.F, W.Z, N.J.C, J.H.
Data collection/contribution: V.L.F, S.S., N.J.C., J.H., Y.W., E.L., K.L., M.K., Y.O., P.S., P.P., A.R.M., N.M.J., GBMI
Data analysis: V.L.F, A.B., W.Z, D.Z., P.S, J.H.
Writing: V.L.F, J.H
Revision: V.L.F., A.B., W.Z., D.Z., Y.W., K.L, M.K, E.L., P.S., P.P., R.T., X.Z., S.N., S.S., I.M.N., H.S., I.S., A.R.M., M.A.B., C.W., S.M., N.I, E.R.G., N.M.J., K.J., N.J.C., J.H.
Important Links
https://github.com/gamazonlab/MR-JTI
https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/
https://ldlink.nci.nih.gov/?tab=home
Supplementary Information
Checking for Potential for phenotype heterogeneity in BioVU Samples
POAG was ascertained in the BioVU dataset using phecodes in VUMC electronic health records. All subjects (n=144,017) with an ophthalmology examination code (CPT = 92002, 92004, 92012, or 92014) were selected using at least two instances of POAG phecode that covers ICD-9 and/or ICD-10 codes 365.11, H40.11XX, 365.12, H40.12XX (n=17,824) while controls exclude those with codes for glaucoma or optic nerve disease (365.XX, 377.XX, H40.XXXX, H47.XXXX (n=53,919). The groups were then filtered to include only subjects with genotyping data from Illumina MEGA-Array.
The de-identified electronic health records of the glaucoma subjects were then reviewed by their BioVU number to confirm the diagnoses. The patients’ names, addresses, other forms of identification, and the treating physicians’ names were removed. Records which were available included the problem list stating POAG or NTG, medication lists, clinical notes including detailed ophthalmic examinations, ophthalmic operative notes, correspondence letters and discharge summaries. Additional high-level confirmation included an anterior segment examination without secondary-glaucoma evidence, and open angles by gonioscopy.
Only those subjects (n=1040) who had POAG reported in any of the above categories by the treating physicians were included as cases in the manual review. Any subject, who was coded for POAG without supporting confirmation or who had contradictory information in the medical record, was excluded to minimize inclusion errors. Supporting information including: reported peripheral vision loss, intraocular pressure, glaucomatous optic nerve appearance, glaucoma surgeries and use of glaucoma medications was reviewed and noted. A total of 716 individuals who had MEGA array genotyped have their EHR manually curated. A total of 220 out of the 716 individuals conflicted with manually curated records of which 172 are those with only a single ICD9+ICD10 mention of POAG. A total of 48 phecode assignations conflicted with manually curated records: 5 cases were missed by phecode phenotyping, 4 were pigmentary glaucoma, 7 POAG suspect, 9 pseudoexfoliation syndromes (XFS) and 23 primary angle closure glaucoma (PACG) (Suppl. Table S2). Using any mention of ICD9 or ICD10 POAG had a far much higher number of conflicts (114) with manually curated information. Thus conservatively, we estimate that ∼1.2% (∼320) and ∼3.2% (∼560) might potentially be XFS and PACG cases, respectively, among the total GBMI 26,848 POAG cases.
Therefore, as further quality check step, we compared our GBMI-IGGC-GGLAD meta-analysis signals with well powered GWASs for XFS and PACG, and determined that three of the loci might be attributed to these two glaucoma subtypes: 1) The rs3825942_LOXL1 loci that is main XFS signal and previously reported in GWAS of glaucoma or recent POAG study that include Biobank level.1, 2 2) The rs11024102_PLEKHA7 has been associated only with PACG and glaucoma in East Asian ancestry individuals, while the lead variant in rs58812088_ FNDC3B in our study is just ∼12kb away from and in high LD (r2=0.7) with PACG lead variant in rs16856870_FNDC3B locus.3, 4 However, three loci: rs993471_COL11A1, rs2276035_ARHGEF12, rs12150284_GAS7 have previously been shown to be POAG-PACG shared loci.4, 5
Groningen Longitudinal Glaucoma Study (GLGS)
The GLGS study consists of cases from the Groningen Longitudinal Glaucoma Study (GLGS). The GLGS consisted of Dutch individuals with a diagnosis of POAG from the hospital-based Groningen Longitudinal Glaucoma Study (GLGS). The original GLGS cohort has been described in detail by Heeg et al. (2005).6 After the inclusion of the initial cohort in 2000-2001, the GLGS continued as a dynamic population, that is, new participants were added during follow-up. We included glaucoma patients who visited the outpatient department of the University Medical Center Groningen in 2015 and who gave written informed consent for a blood sample being taken for genetic analyses. In the GLGS, glaucoma patients had to show glaucomatous visual field loss in at least one eye. For glaucomatous visual field loss, two consecutive tests had to be abnormal in at least one eye, after an initial test that was discarded to reduce the influence of learning. Defects had to be compatible with glaucoma and without any other explanation. Those with pseudoexfoliative or pigment dispersion glaucoma or a history of angle-closure or secondary glaucoma were excluded, leaving only POAG patients.7 In GLGS, genomic DNA was extracted from the peripheral blood, using Gentra Systems Purogene chemistry. The genotyping was done using the Illumina Infinium Global Screening Array® (GSA) MultiEthnic Disease beadchip version, which contains 692,367 markers.8, 9
Data Availability
All data produced in the present study are available upon reasonable request to the authors
Disclosures
Eric R. Gamazon receives an honorarium from the journal Circulation Research of the American Heart Association as a member of the Editorial Board.
Biobanks that contributed GWAS summary statistics to POAG study
Biobank Japan Project
The BioBank Japan Project was supported by the Tailor-Made Medical Treatment program of the Ministry of Education, Culture, Sports, Science, and Technology (MEXT), the Japan Agency for Medical Research and Development (AMED).
S.N. was supported by Takeda Science Foundation. Y.O. was supported by JSPS KAKENHI (19H01021, 20K21834), and AMED (JP21km0405211, JP21ek0109413, JP21ek0410075, JP21gm4010006, and JP21km0405217), JST Moonshot R&D (JPMJMS2021, JPMJMS2024), Takeda Science Foundation, and Bioinformatics Initiative of Osaka University Graduate School of Medicine, Osaka University.
BioMe - The Mount Sinai BioMe Biobank
The Mount Sinai BioMe Biobank has been supported by The Andrea and Charles Bronfman Philanthropies and in part by Federal funds from the NHLBI and NHGRI (U01HG00638001; U01HG007417; X01HL134588). We thank all participants in the Mount Sinai Biobank. We also thank all our recruiters who have assisted and continue to assist in data collection and management and are grateful for the computational resources and staff expertise provided by Scientific Computing at the Icahn School of Medicine at Mount Sinai.
BioVU
The BioVU projects at Vanderbilt University Medical Center are supported by numerous sources: institutional funding, private agencies, and federal grants. These include the NIH-funded Shared Instrumentation Grant S10OD017985 and S10RR025141; CTSA grants UL1TR002243, UL1TR000445, and UL1RR024975 from the National Center for Advancing Translational Sciences. Its contents are solely the responsibility of the authors and do not necessarily represent official views of the National Center for Advancing Translational Sciences or the National Institutes of Health. Genomic data are also supported by investigator-led projects that include U01HG004798, R01NS032830, RC2GM092618, P50GM115305, U01HG006378, U19HL065962, R01HD074711; and additional funding sources listed at https://victr. vumc.org/biovu-funding/.
Colorado Center for Personalized Medicine (CCPM)
The Colorado Center for Personalized Medicine (CCPM) would like to thank Richard Zane, Steve Hess, Sarah White, Emily Hearst, Emily Roberts and the entire Health Data Compass team. CCPM was developed with support from UCHealth, Children’s Hospital Colorado, CU Medicine, CU Department of Medicine and CU School of Medicine.
deCODE Genetics
We thank participants in deCODE cardiovascular and obesity studies and collaborators for their cooperation.
Estonian Biobank
This research was supported by the European Union through Horizon 2020 research and innovation programme under grant no 810645 and through the European Regional Development Fund project no. MOBEC008, by the Estonian Research Council grant PUT (PRG1291, PRG687 and PRG184) and by the European Union through the European Regional Development Fund project no. MOBERA21 (ERA-CVD project DETECT ARRHYTHMIAS, GA no JTC2018-009), Project No. 2014-2020.4.01.15-0012 and Project No. 2014-2020.4.01.16-0125. We would like to acknowledge Dr. Tõnu Esko; Dr. Lili Milani; Dr. Reedik Mägi, Dr. Mari Nelis and Dr. Andres Metspalu, all from the Institute of Genomics, University of Tartu, Tartu, Estonia.
FinnGen
The FinnGen project is funded by two grants from Business Finland (HUS 4685/31/2016 and UH 4386/31/2016) and the following industry partners: AbbVie Inc., AstraZeneca UK Ltd, Biogen MA Inc., Bristol Myers Squibb (and Celgene Corporation & Celgene International II Sàrl), Genentech Inc., Merck Sharp & Dohme Corp, Pfizer Inc., GlaxoSmithKline Intellectual Property Development Ltd., Sanofi US Services Inc., Maze Therapeutics Inc., Janssen Biotech Inc, and Novartis AG. Following biobanks are acknowledged for delivering biobank samples to FinnGen: Auria Biobank (www.auria.fi/biopankki), THL Biobank (www.thl.fi/biobank), Helsinki Biobank (www.helsinginbiopankki.fi), Biobank Borealis of Northern Finland (https://www.ppshp.fi/Tutkimus-ja-opetus/Biopankki/Pages/Biobank-Borealis-briefly-in-English.aspx),Finnish Clinical Biobank Tampere (www.tays.fi/en-US/Research_and_development/Finnish_Clinical_Biobank_Tampere), Biobank of Eastern Finland (www.ita-suomenbiopankki.fi/en), Central Finland Biobank (www.ksshp.fi/fi-FI/Potilaalle/Biopankki), Finnish Red Cross Blood Service Biobank (www.veripalvelu.fi/verenluovutus/biopankkitoiminta) and Terveystalo Biobank (www.terveystalo.com/fi/Yritystietoa/Terveystalo-Biopankki/Biopankki/). All Finnish Biobanks are members of BBMRI.fi infrastructure (www.bbmri.fi). Finnish Biobank Cooperative -FINBB (https://finbb.fi/) is the coordinator of BBMRI-ERIC operations in Finland. The Finnish biobank data can be accessed through the Fingenious® services (https://site.fingenious.fi/en/) managed by FINBB.
Generation Scotland
Generation Scotland received core support from the Chief Scientist Office of the Scottish Government Health Directorates [CZD/16/6] and the Scottish Funding Council [HR03006] and is currently supported by the Wellcome Trust [216767/Z/19/Z]. Genotyping of the GS:SFHS samples was carried out by the Genetics Core Laboratory at the Edinburgh Clinical Research Facility, University of Edinburgh, Scotland and was funded by the Medical Research Council UK and the Wellcome Trust (Wellcome Trust Strategic Award “STratifying Resilience and Depression Longitudinally” (STRADL) Reference 104036/Z/14/Z).”
The HUNT Study
A special thanks to all the HUNT participants for donating their time, samples and information to help others. The Trøndelag Health Study (HUNT) is a collaboration between HUNT Research Centre (Faculty of Medicine and Health Sciences, NTNU, Norwegian University of Science and Technology), Trøndelag County Council, Central Norway Regional Health Authority, and the Norwegian Institute of Public Health. The genotyping in HUNT was financed by the National Institutes of Health; University of Michigan; the Research Council of Norway; the Liaison Committee for Education, Research and Innovation in Central Norway; and the Joint Research Committee between St Olavs hospital and the Faculty of Medicine and Health Sciences, NTNU. The genetic investigations of the HUNT Study, is a collaboration between researchers from the K.G. Jebsen Center for Genetic Epidemiology, NTNU and the University of Michigan Medical School and the University of Michigan School of Public Health. The K.G. Jebsen Center for Genetic Epidemiology is financed by Stiftelsen Kristian Gerhard Jebsen; Faculty of Medicine and Health Sciences, NTNU, Norway. We want to thank clinicians and other employees at Nord-Trøndelag Hospital Trust for their support and for contributing to data collection in this research project.
We also acknowledge; HUNT-MI Leadership: Kristian Hveem, Cristen Willer, Oddgeir Lingaas Holmen, Mike Boehnke, Goncalo Abecasis, Bjorn Olav Åsvold, Ben Brumpton; Scientific Advisory Committee: Ele Zeggini, Mark Daly, Bjørn Pasternak; HUNT Research Centre: Jørn Søberg Fenstad, Anne Jorunn Vikdal, Marit Næss; HUNT Cloud: Oddgeir Lingaas Holmen, Sandor Zeestraten, Tom Erik Røberg; Data applications and registry linkages: Maiken E. Gabrielsen, Anne Heidi Skogholt; Low-pass whole sequencing genome bioinformatics and statistical analysis: He Zhang, Hyun Min Kang, Jin Chen; Array genotyping: Sten Even Erlandsen, Vidar Beisvåg; GWAS bioinformatics, QC, imputation and statistical analysis: Wei Zhou, Jonas Nielsen, Lars Fritsche, Hyun Min Kang, Oddgeir Holmen, Ben Brumpton, Laurent Thomas; CNV calling: Ellen Schmidt, Ryan Mills; Statistical methods development for analyzing HUNT data: Wei Zhou, Shawn Lee
Funding
The K. G. Jebsen Centre for Genetic Epidemiology is financed by Stiftelsen Kristian Gerhard Jebsen. The genotyping in HUNT was financed by the National Institutes of Health; University of Michigan; the Research Council of Norway; Stiftelsen Kristian Gerhard Jebsen; the Liaison Committee for Education, Research and Innovation in Central Norway; and the Joint Research Committee between St Olav’s hospital and the Faculty of Medicine and Health Sciences,NTNU.
LifeLines
The Lifelines Biobank initiative has been made possible by funding from the Dutch Ministry of Health, Welfare and Sport, the Dutch Ministry of Economic Affairs, the University Medical Center Groningen (UMCG the Netherlands), University of Groningen and the Northern Provinces of the Netherlands. The generation and management of GWAS genotype data for the Lifelines Cohort Study is supported by the UMCG Genetics Lifelines Initiative (UGLI). UGLI is partly supported by a Spinoza Grant from NWO, awarded to Cisca Wijmenga. The authors wish to acknowledge the services of the Lifelines Cohort Study, the contributing research centers delivering data to Lifelines, and all the study participants.
UMCG Genetics Lifelines Initiative (UGLI) team: Raul Aguirre-Gamboa (1), Patrick Deelen (1), Lude Franke (1), Jan A Kuivenhoven (2), Esteban A Lopera Maya (1), Ilja M Nolte (3), Serena Sanna (1), Harold Snieder (3), Morris A Swertz (1), Peter M. Visscher (3,4), Judith M Vonk (3), Cisca Wijmenga (1)
Department of Genetics, University of Groningen, University Medical Center Groningen,The Netherlands
Department of Pediatrics, University of Groningen, University Medical Center Groningen, The Netherlands
Department of Epidemiology, University of Groningen, University Medical Center Groningen, The Netherlands
Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia.
Mass General Brigham (MGB) Biobank
Samples, genomic data, and health information were obtained from the Mass General Brigham Biobank, a biorepository of consented patients samples at Mass General Brigham (parent organization of Massachusetts General Hospital and Brigham and Women’s Hospital). We are grateful to all of the participants and clinical and research teams who made this work possible. Support for genotyping was provided through MGB Personalized Medicine. MGB Biobank Leadership: Elizabeth W. Karlson, MD; Shawn N. Murphy, MD, PhD; Susan A Slaugenhaupt, PhD; Jordan W. Smoller, MD, ScD; Scott T. Weiss, MD, MSc
Michigan Genomics Initiative
The authors acknowledge the Michigan Genomics Initiative participants, Precision Health at the University of Michigan, the University of Michigan Medical School Central Biorepository, and the University of Michigan Advanced Genomics Core for providing data and specimen storage, management, processing, and distribution services, and the Center for Statistical Genetics in the Department of Biostatistics at the School of Public Health for genotype data curation, imputation, and management in support of the research reported in this publication.
QIMR Berghofer Biobank (QSKIN)
This work is supported by a project grant (APP1063061) and a Program grant (APP1073898) from the Australian National Health and Medical Research Council (NHMRC). SM is supported by a Research Fellowship from the NHMRC (Aust). NI received scholarship support from the University of Queensland and QIMR Berghofer Medical Research Institute. We thank research staff (David Whiteman, Catherine Olsen, Rachel Neale) and participants from the Australian QSKIN study (https://www.qimrberghofer.edu.au/study/qskin/). SEM is supported in part by APP1172917 from the Australian National Health and Medical Research Council (NHMRC).
Taiwan Biobank
This research has been conducted using the Taiwan Biobank resource. We thank all the participants and investigators of the Taiwan Biobank. We thank the National Center for Genome Medicine of Taiwan for the technical support in genotyping. We thank the National Core Facility for Biopharmaceuticals (NCFB, MOST 106-2319-B-492-002) and National Center for High-performance Computing (NCHC) of National Applied Research Laboratories (NARLabs) of Taiwan for providing computational and storage resources.
UK Biobank
Access to data from the UK BioBank was obtained through Application #31063 PI: Ben Neale, Claire Churchhouse Overview: “Methodological extensions to estimate genetic heritability and shared risk factors for phenotypes of the UK Biobank”. Website for Pan-UKBB results can be found: https://pan.ukbb.broadinstitute.org/
Supplementary Figures
Acknowledgements
We would like to acknowledge Melinda Aldrich for critical review on an earlier version of the manuscript. We would like to acknowledge the organizing committee of the International Common Disease Alliance for intellectual contributions on the set up of the GBMI as a nascent activity to the larger effort. We also thank Daniel King from the Hail team and Sam Bryant from the Stanley Center Data Management team for helping with the Google bucket set up and data sharing. Eric R. Gamazon is supported by the National Institutes of Health (NIH) Awards R35HG010718, R01HG011138, R01GM140287, and NIH/NIA AG068026. Valeria Lo Faro was supported by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No.675033 (EGRET plus). Additional funding was provided by the Rotterdamse Stichting Blindenbelangen (grant ID B20150036). Karen Joos was supported by the Joseph Ellis Family and William Black Research Funds, and an Unrestricted Departmental Grant to the Vanderbilt Eye Institute from Research to Prevent Blindness, Inc., NY. We acknowledge all the participants in the study and biobanks that were involved in generating POAG summary data: BioBank Japan (Yukinori Okada, Koichi Matsua, and Masahiro Kanai), BioMe (Ruth Loos, Judy Cho, Eimear Kenny, Michael Preuss, and Simon Lee), BioVU (Nancy Cox and Jibril Hirbo), Colorado Center for Personalized Medicine (Kathleen Barnes, Michelle Daya, and Chris Gignoux), deCODE Genetics (Kári Stefánsson and Unnur Þorsteinsdóttir), Estonian Biobank (Andres Metspalu, Reedik Mägi, Tõnu Esko, and Priit Palta), FinnGen (Aarno Palotie, Mark Daly, Samuli Ripatti, Mitja Kurki, and Juha Karjalainen), Generation Scotland (Caroline Hayward and Riccardo Marioni), HUNT (Kristian Hveem, Cristen Willer, and Sarah Graham, Ben Brumpton, and Brooke Wolford), Lifelines (Esteban Lopera, Serena Sanna, Harold Snieder), Michigan Genomics Initiative (Sebastian Zoellner, Michael Boehnke, Lars Fritsche, and Anita Pandit), Taiwan Biobank (Yen-Feng Lin, Yen-Chen Feng, and Hailiang Huang), and UK Biobank (Konrad Karczewski and Alicia Martin).
REFERENCES
REFERENCES
- 1.↵
- 2.↵
- 3.↵
- 4.↵
- 5.↵
- 6.↵
- 7.↵
- 8.↵
- 9.↵
- 10.
- 11.
- 12.
- 13.
- 14.
- 15.
- 16.
- 17.
- 18.
- 19.
- 20.
- 21.
- 22.
- 23.
- 24.↵
- 25.
- 26.↵
- 27.
- 28.↵
- 29.
- 30.
- 31.
- 32.
- 33.
- 34.
- 35.
- 36.↵
- 37.↵
- 38.↵
- 39.
- 40.
- 41.
- 42.
- 43.↵
- 44.
- 45.
- 46.
- 47.
- 48.↵
- 49.
- 50.↵
- 51.↵
- 52.
- 53.
- 54.
- 55.↵
- 56.
- 57.
- 58.↵
- 59.↵
- 60.↵
- 61.↵
- 62.
- 63.
- 64.
- 65.
- 66.
- 67.↵
- 68.↵
- 69.↵
- 70.↵
- 71.↵
- 72.↵
- 73.
- 74.
- 75.
- 76.
- 77.↵
- 78.↵
- 79.
- 80.
- 81.↵
- 82.↵
- 83.↵
- 84.↵
- 85.↵
- 86.↵
- 87.
- 88.↵
- 89.↵
- 90.↵
- 91.
- 92.↵
- 93.↵
- 94.↵
- 95.↵
- 96.↵
- 97.↵
- 98.
- 99.
- 100.
- 101.
- 102.
- 103.
- 104.
- 105.
- 106.
- 107.
- 108.↵
- 109.↵
- 110.↵
- 111.↵
- 112.↵
- 113.↵
- 114.↵
- 115.↵
- 116.↵
- 117.↵
- 118.↵
- 119.
- 120.
- 121.↵
- 122.↵
- 123.↵
- 124.↵
- 125.
- 126.
- 127.↵
- 128.↵
- 129.↵
- 130.↵
- 131.↵
- 132.↵
- 133.↵
- 134.↵
- 135.↵
- 136.
- 137.↵
- 138.↵
- 139.↵
- 140.↵
- 141.
- 142.
- 143.
- 144.↵
- 145.↵
- 146.↵
- 147.↵
- 148.↵
- 149.↵
- 150.
- 151.↵
- 152.↵
- 153.↵
- 154.↵
- 155.↵
- 156.↵
- 157.↵
- 158.
- 159.↵
- 160.
- 161.
- 162.
- 163.
- 164.↵
- 165.↵
- 166.↵
- 167.↵
- 168.↵
- 169.↵
- 170.↵
- 171.↵
- 172.↵
- 173.↵
- 174.↵
- 175.↵
- 176.↵