Novel Aggregative trans-eQTL Association Analysis of Known Genetic Variants Detect Trait-specific Target Gene-sets

Diptavo Dutta; Yuan He; Ashis Saha; Marios Arvanitis; Alexis Battle; Nilanjan Chatterjee

doi:10.1101/2020.09.29.20204388

Abstract

Large scale genetic association studies have identified many trait-associated variants and understanding the role of these variants in downstream regulation of gene-expressions can uncover important mediating biological mechanisms. In this study, we propose Aggregative tRans assoCiation to detect pHenotype specIfic gEne-sets (ARCHIE), as a method to establish links between sets of known genetic variants associated with a trait and sets of co-regulated gene-expressions through trans associations. ARCHIE employs sparse canonical correlation analysis based on summary statistics from trans-eQTL mapping and genotype and expression correlation matrices constructed from external data sources. We propose a resampling based procedure to test for significant trait-specific trans-association patterns in the background of highly polygenic regulation of gene-expression. By applying ARCHIE to available trans-eQTL summary statistics reported by the eQTLGen consortium, we identify 71 gene networks which have significant evidence of trans-association with groups of known genetic variants across 29 complex traits. A majority (50.7%) of the genes do not have any strong trans-associations and could not have been detected by standard trans-eQTL mapping. We provide further evidence for causal basis of the target genes through a series of follow-up analyses. These results show ARCHIE is a powerful tool for identifying sets of genes whose trans regulation may be related to specific complex traits.

Introduction

Genome-wide association studies (GWAS) have identified tens of thousands of common variants associated with a variety of complex traits ¹, and a majority of these trait-related variants are in the non-coding regions of the genome^2–4. It has been shown that these GWAS-identified variants overlap substantially with variants that are associated with the expression levels of genes (eQTL) ^5–7. A number of tools ^8–11 have been developed to identify potential target genes through which genetic associations may be mediated by investigating the effect of variants on local genes (cis-eQTL), typically within a 1Mb region around the variant, but the underlying causal interpretation remains complicated due to linkage disequilibrium and pleiotropy. A recent study has shown that a modest fraction of trait heritability can be explained by cis-mediated bulk gene-expressions ¹², but future studies with more cell-type specific information have the potential to explain more.

Compared to cis-eQTL, studies of trans-eQTL have received much less attention for illuminating causal mechanisms underlying GWAS identified loci. However, they have the potential to illuminate downstream genes and pathways that would shed light on disease mechanism. A major challenge has been the limited statistical power for detection of trans-eQTL effects due to much weaker effects of SNPs on expressions of distal genes compared to those in cis-regions and a very large burden of multiple testing. However, trans-effects, when detected, have been shown to be more likely to have tissue-specific effects^13,14 and are more enriched than cis-eQTLs among disease loci¹⁵. Trans-eQTLs are, in general, known to act on regulatory circuits governing broader groups of genes¹⁶ and thus have the potential to uncover gene networks and pathways consequential to complex traits ^17,18. Limited studies of trans-eQTL effects of known GWAS loci have identified complex downstream effects on known consequential genes for diseases^15,19. In fact, an “omnigenic” model of complex traits has been hypothesized under which a large majority of genetic associations are mediated by cascading trans-effects on a few “core genes” ^20,21.

In this article, we propose a novel method based on a sparse canonical correlation analysis (sCCA)^22–24 framework, termed Aggregative tRans assoCiation to detect pHenotype specIfic gEne-sets (ARCHIE), for detecting trans-effects of groups of GWAS SNPs associated with a trait on sets of genetically co-regulated genes. Using summary statistics from standard SNP-gene expression trans-eQTL mapping, estimates of linkage disequilibrium (LD) between the variants and co-expression between genes, we select sets of distal target genes (termed gene component) that are associated with a group of the trait-related variants (termed variant component). Together, the selected variants and genes (jointly termed ARCHIE components) reflect significant trait-specific patterns of trans-association (Figure 1A illustrates how ARCHIE works).
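As a rough illustration of the aggregation idea, the sketch below runs a soft-thresholded alternating update (in the style of penalized matrix decomposition) on a matrix of trans-eQTL Z-scores, scaling the weight vectors by the LD and co-expression matrices. It is not the ARCHIE implementation: the function names, the penalty parameters lam_u and lam_v, and the update rule are assumptions made for illustration only.

```python
import numpy as np

def soft_threshold(x, lam):
    """Shrink each entry of x toward zero by lam; entries below lam become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def normalize(w, sigma):
    """Scale w so that w' sigma w = 1 (an all-zero vector is returned unchanged)."""
    scale = np.sqrt(w @ sigma @ w)
    return w / scale if scale > 0 else w

def sparse_cca(z, sigma_gg, sigma_ee, lam_u=0.5, lam_v=0.5, n_iter=100):
    """Return sparse variant weights u, gene weights v and the value u' z v.

    z        : (P variants x G genes) matrix of trans-eQTL Z-scores
    sigma_gg : (P x P) LD matrix, sigma_ee : (G x G) co-expression matrix
    lam_u, lam_v : hypothetical sparsity penalties (assumed, not from the paper)
    """
    # Start from the leading singular vectors of the Z-score matrix.
    u_svd, _, vt_svd = np.linalg.svd(z, full_matrices=False)
    u, v = u_svd[:, 0], vt_svd[0]
    for _ in range(n_iter):
        u = normalize(soft_threshold(z @ v, lam_u), sigma_gg)
        v = normalize(soft_threshold(z.T @ u, lam_v), sigma_ee)
    return u, v, float(u @ z @ v)
```

In this toy formulation, the non-zero entries of u play the role of the variant component and the non-zero entries of v the role of the gene component.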

Figure 1. Overview of ARCHIE and eQTLGen data analysis.

(A) An illustrative example for ARCHIE. Association statistics (-log₁₀ p-value) from trans-eQTL mapping for P variants and G genes are shown in the left panel heatmap. Using LD and co-expression estimated from reference datasets, ARCHIE aggregates multiple weaker trans-eQTL associations to select subsets of variants and genes which capture phenotype specific trans-association patterns (right panel heatmap). (B) Overview of workflow for eQTLGen data analysis (See Results and Methods). (C) Number of significant ARCHIE components that capture phenotype-specific trans-eQTL association patterns for the 29 phenotypes analyzed in eQTLGen.

Compared to standard trans-eQTL mapping, the proposed method improves power for detection of signals by aggregating multiple trans-association signals across GWAS loci and genes. We propose a resampling-based method to assess the statistical significance of the top components (top sparse canonical correlation values) resulting from sCCA for testing enrichment of trait-specific signals in the background of broader genome-wide trans-associations. If multiple ARCHIE components are significant, they reflect approximately orthogonal patterns of trans-associations for the trait-related variants, with the selected target genes pertaining to distinct downstream mechanisms of trans-regulation. Finally, we propose a novel way of independently supporting the results based on an analysis of trait heritability explained by the identified target genes compared to random genes. We apply the proposed method to analyze large-scale trans-association summary statistics for SNPs associated with 29 traits reported by the eQTLGen consortium¹⁹. The results show that ARCHIE can identify trait-specific patterns of trans-associations and relevant sets of variants and co-regulated target genes. The majority (50.7%) of the detected target genes are novel, meaning they would not have been identified by standard trans-eQTL mapping alone. We provide independent evidence supporting our results, using a series of downstream analyses to show that the selected target genes are enriched in known trait-related pathways and define directions of associations for the SNPs that are more enriched for underlying trait heritability than expected by chance. Further, we show by example that the identified trans-associations can also be explained by biological cis-mediation mechanisms.
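The logic of a resampling-based test against a competitive null can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used by ARCHIE: the null is built here by drawing random variant sets of the same size from a background Z-score matrix, and the top singular value stands in for the top sparse canonical correlation value. All function names are hypothetical.

```python
import numpy as np

def top_singular_value(z):
    """Stand-in statistic: largest singular value of a Z-score matrix."""
    return float(np.linalg.svd(z, compute_uv=False)[0])

def competitive_null(z_background, n_variants, statistic, n_resamples=200, seed=0):
    """Statistic values for random variant sets drawn from the background.

    z_background : (all tested variants x genes) Z-score matrix
    n_variants   : size of the trait-related variant set being tested
    """
    rng = np.random.default_rng(seed)
    null = np.empty(n_resamples)
    for b in range(n_resamples):
        rows = rng.choice(z_background.shape[0], size=n_variants, replace=False)
        null[b] = statistic(z_background[rows])
    return null

def empirical_p_value(observed, null):
    """One-sided empirical p-value with the usual +1 correction."""
    null = np.asarray(null)
    return (1 + np.sum(null >= observed)) / (1 + len(null))
```

An observed value that exceeds every resampled value yields the smallest attainable p-value, 1 / (1 + n_resamples), so the number of resamples bounds the resolution of the test.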

Results

Trait-specific patterns of trans-associations in Whole Blood

For each of 29 traits, we applied ARCHIE to the trans-eQTL summary statistics for the set of GWAS loci identified for that trait and tested in eQTLGen, across all genes, and selected the trait-specific target genes via the significant gene components (See Methods and Figure 1B for analysis details). On average, across these traits, we detect 2 (max = 7 for “Height”) significant sets of variant and gene components capturing phenotype-specific trans-association patterns (Figure 1C). Of the target genes selected by ARCHIE in the significant gene-components for each trait, only 49.3% displayed a strong association in standard analysis (trans-eQTL p-value < 1 × 10^-06 reported in eQTLGen) with any variant associated with that trait. The remaining 50.7% of genes (termed “novel genes”) harbor only weaker (0.05 > p-value > 1 × 10^-06) associations and hence cannot be detected by standard trans-eQTL mapping alone; these genes display a similar pattern of trans-association with the corresponding selected trait-related variants and are detectable via the significant ARCHIE components. We made the list of target genes and variants selected by ARCHIE for each phenotype publicly available through an openly accessible database (See URL). Here, we focus on results for three different phenotypes: their corresponding trans-association patterns, the selected target gene-sets and the novel genes detected by ARCHIE.
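The distinction between strongly associated and "novel" target genes can be expressed as a simple rule on the reported p-values. The sketch below assumes a hypothetical input layout (a dictionary mapping each selected gene to its trans-eQTL p-values with the trait-related variants); the gene labels in the usage example are placeholders, not results from the analysis.

```python
STRONG_THRESHOLD = 1e-6  # strong trans-association cut-off quoted in the text

def split_novel_genes(pvalues_by_gene, threshold=STRONG_THRESHOLD):
    """Split selected genes into those with at least one strong association
    and 'novel' genes whose smallest p-value does not pass the threshold."""
    strong, novel = [], []
    for gene, pvals in pvalues_by_gene.items():
        if min(pvals) < threshold:
            strong.append(gene)
        else:
            novel.append(gene)
    return strong, novel

def novel_fraction(pvalues_by_gene, threshold=STRONG_THRESHOLD):
    """Fraction of selected genes that are novel under the rule above."""
    _, novel = split_novel_genes(pvalues_by_gene, threshold)
    return len(novel) / len(pvalues_by_gene)

# Toy usage with placeholder gene labels and made-up p-values.
example = {"GENE_A": [1e-9, 0.2], "GENE_B": [0.01, 0.04]}
strong_genes, novel_genes = split_novel_genes(example)
```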

Schizophrenia

Schizophrenia is a neuropsychiatric disorder that affects perception and cognition. The eQTLGen consortium reports complete (non-missing) trans-association statistics for 218 SNPs, curated from multiple large-scale GWAS, associated with Schizophrenia (SCZ) across 7,756 genes. Of these, 7,047 genes were expressed in whole blood of Genotype-Tissue Expression (GTEx)²⁵ v8 individuals. We identified one significant ARCHIE component capturing trans-association patterns significantly related to SCZ (Figure 2A-B) consisting of 27 variants and 75 genes. Of the selected genes, only 16 (21.4%) had evidence of at least one strong association (p-value < 1 × 10^-06) and possibly multiple weaker (0.05 > p-value > 1 × 10^-06) associations as reported by eQTLGen. The remaining 59 genes (78.6%) only had weaker trans-associations with SCZ-related variants and could not have been identified using traditional trans-eQTL mapping (Table 1). Using an expression imputation approach (See Methods for details), we found that the target genes mediate significantly more trait heritability (p-value < 0.001) than expected by chance (Figure 2D).

Table 1:

The number of variants, genes and novel genes selected by ARCHIE for each of the significant components in the analysis of SCZ, UC and PC. For a full list of selected variants and genes see Supplementary Table 1.

Table 2:

Pathway enrichment results for the target genes selected for SCZ. Several selected top pathways containing novel genes across different categories are shown. See Supplementary Tables 2 and 3 for results on all significant pathways and transcription factor target gene-sets for the selected genes.

Figure 2. Trans-association pattern and properties of selected target gene-set for SCZ.

(A) Top 10 sparse canonical correlation values (cc-values; red) and the corresponding competitive null distributions (black box plot) for ARCHIE analysis of SCZ. Top 1 ARCHIE component is significant. (B) Reported -log₁₀ (p-values) for trans-eQTL association between variants and genes selected in ARCHIE component 1. Any association p-value < 10^-08 is collapsed to 10^-08 for the ease of viewing. (C) Pathway enrichment results for some selected top pathways for the target genes selected in gene-component 1. For full pathway enrichment results for the target genes selected see Supplementary Table 6. The dashed vertical line corresponds to a suggested FDR threshold of 0.001. (D) trans-heritability enrichment analysis for target genes selected for SCZ. The boxplots represent the distribution of pseudo-r² for a random gene-set of the same size and the box-points represent the observed pseudo-r² for the target genes (See Results and Methods for details).

Several of the 59 identified novel genes have previously been reported to be associated with neurological functions. For example, chemokine receptor 4 (CXCR4), a gene that underlies interneuron migration and several neurodegenerative diseases²⁶, was identified by aggregating weaker associations from 20 SCZ-related SNPs in the variant component, but does not have any significant trans-associations. Similarly, caveolin-1 (CAV1), which is a known regulator of a SCZ risk gene (DISC1)²⁷, aggregates 13 weaker associations with SCZ-related variants in the variant component. Notably, the target genes identified by ARCHIE include genes such as HSPA5 and AP5S1, which not only harbor multiple trans-associations from SCZ-related variants but have also been reported to have cis-variants associated with psychiatric disorders^28,29. We investigated whether, in general, the genes selected by ARCHIE had evidence of association with SCZ through cis variants. Aggregating results from several large-scale cis-eQTL studies across tissues^9,30, we found that 12 of the 59 novel genes (enrichment p-value = 2.8×10^-05) have nominally significant (p-value < 1×10^-04) evidence of cis-regulatory SNPs being associated with SCZ or other neuropsychiatric diseases.

By performing pathway enrichment analysis of the target genes we investigated whether the selected genes represented known SCZ-related biological mechanisms (See Supplementary Methods for details). Among the significantly enriched pathways, the majority (51.3%) were immune related. In particular, we identified 42 GO pathways³¹, 36 canonical pathways^32–35 and 4 hallmark pathways³⁶ to be strongly enriched (FDR adjusted p-value < 0.05) for the selected genes, with 73 (89.0%) of them containing at least one novel gene (Figure 2C and Supplementary Table 2). Several pathways previously reported in connection with SCZ are enriched (FDR adjusted p-value < 0.05) for the selected genes (Figure 2C). For example, among the enriched gene ontology (GO) terms, GO-0034976: response to endoplasmic reticulum stress³⁷ (FDR adjusted p-value = 0.013), GO-0055065: metal ion homeostasis³⁸ (adjusted p-value = 0.029), GO-0006915: apoptotic process³⁹ (adjusted p-value = 0.029) and GO-0043005: neuron projection (adjusted p-value = 0.021) have previously been suggested to be linked to SCZ. Four hallmark gene-sets are also found to be significantly enriched for the selected genes, including glycolysis⁴⁰, hypoxia⁴¹, mTORC1 signaling⁴² and unfolded protein response⁴³, all of which have suggestive evidence of being associated with SCZ. Using numerous TF databases^44,45, we found that the selected target genes were enriched (adjusted p-value < 0.05) for targets of 10 TFs (Supplementary Table 3), several of which have been previously reported to be associated with neuropsychiatric disorders^46,47.
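For readers unfamiliar with over-representation testing, the sketch below shows one common way to obtain an enrichment p-value (an upper-tail hypergeometric test) and FDR-adjusted p-values (the Benjamini-Hochberg step-up procedure). The enrichment tools actually used in the analysis may differ in their background gene sets and corrections, so this is an assumed, generic formulation with hypothetical function names.

```python
from math import comb

def hypergeom_upper_tail(k, n_selected, n_pathway, n_background):
    """P(overlap >= k) when n_selected genes are drawn at random from
    n_background genes, of which n_pathway belong to the pathway."""
    total = comb(n_background, n_selected)
    upper = min(n_selected, n_pathway)
    tail = sum(
        comb(n_pathway, i) * comb(n_background - n_pathway, n_selected - i)
        for i in range(k, upper + 1)
    )
    return tail / total

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A pathway would then be called enriched when its adjusted p-value falls below the chosen FDR threshold, for example 0.05 as quoted above.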

Protein-protein interaction (PPI) enrichment analysis using STRING (v11.0)⁴⁸ showed a significant enrichment (p-value = 1.1×10^-03), indicating that the corresponding proteins may physically interact. Next, we performed a differential expression enrichment analysis to investigate whether the target genes were differentially expressed in any of the 54 tissues in the GTEx v8 dataset. For each tissue, we curated lists of differentially expressed genes across the genome. We defined a gene to be differentially expressed in a tissue if the corresponding gene expression level in that tissue was significantly different from that across the rest of the tissues (See Supplementary Methods for details). Using such pre-computed lists of differentially expressed genes for each tissue, we found that the target genes selected by ARCHIE were enriched within the set of differentially expressed genes in 12 different tissues, including 4 brain tissues in GTEx v8 (Supplementary Figure 1). For example, 3 novel genes (PADI2, KCNJ10, MLC1) were highly differentially expressed in several brain tissues (Supplementary Figure 2), in comparison to their expression in the rest of the tissues.

Ulcerative Colitis

Ulcerative colitis (UC) is a form of inflammatory bowel disease that affects the innermost lining of the colon and rectum, causes inflammation and sores in the digestive tract, and can lead to several colon-related symptoms and complications including colon cancer^49–51. The eQTLGen consortium reports complete (non-missing) trans-association summary statistics for 163 SNPs associated with Ulcerative Colitis, curated from multiple large-scale GWAS, across 12,010 genes. Of these, 10,307 genes were expressed in Whole Blood from GTEx v8 individuals. Using ARCHIE, we detected two significant variant-gene components comprising 74 SNPs and 148 genes in total (Figure 3A and Supplementary Figure 3; Supplementary Table 1) that reflect trans-association patterns specific to UC. Of the selected genes, 68 genes (45.9%) were novel, meaning they did not have any strong trans-association (Table 1, Supplementary Table 1) with the variants related to UC. Further, similar to SCZ, we found that the associations of the SNPs with target genes were more strongly enriched (p-value < 0.001) for heritability of UC than expected by chance alone (Figure 3D).

Figure 3. Analysis of selected target gene-set for UC.

(A) Top 10 cc-values (red) and the corresponding competitive null distributions (black box plot) for ARCHIE analysis of UC. Top 2 ARCHIE components are significant. (B) Pathway enrichment results for some selected top pathways for the selected target genes. The dashed vertical line corresponds to a suggested FDR threshold of 0.001. For full pathway enrichment results for the target genes selected for both the components see Supplementary Tables 3 and 4. (C) PPI enrichment for the two significant ARCHIE components for UC. (D) trans-heritability enrichment analysis for target genes selected for UC. The boxplots represent the distribution of pseudo-r² for a random gene-set of the same size and the box-points represent the observed pseudo-r² for the target genes (See Results and Methods for details).

Several of the novel target genes detected have been previously linked to intestinal inflammation and disease. For example, glycoprotein A33 (GPA33) is known to impact intestinal permeability⁵² and is an established colon cancer antigen⁵³. Recent research using mouse models has reported a connection between the regulation of GPA33 and the development of colitis and other colon-related inflammatory syndromes⁵⁴. We also identify spermine oxidase (SMOX) through its weaker associations with 9 UC-related variants. SMOX is significantly upregulated in individuals with inflammatory bowel diseases⁵⁵ and has been implicated in gastric and colon inflammation as well as carcinogenesis⁵⁶.

Using a series of follow-up analyses, we identify several pathways to be enriched (FDR adjusted p-value < 0.05) for the selected target genes (Supplementary Tables 4-5), the majority of them (59.6%) being immune related. Among others, the hallmark interleukin-2-STAT5 signaling pathway (FDR adjusted p-value = 1.6 × 10^-08) has previously been reported to be associated with the development of UC via suppression of immune response ⁵⁷. Various GO pathways related to endocytosis, lymphocyte activation and T-cell activation are found to be overrepresented in the selected target genes as well (Figure 3B and Table 3). In further enrichment analysis using broad TF databases, we found that the selected target genes across both gene-components are enriched (adjusted p-value < 0.01) for targets of 18 different TFs, the majority of which have been previously reported to be involved in mucosal inflammation, inflammation of the intestine and epithelial cells, and immune-related responses (Supplementary Table 6).

Table 3:

Pathway enrichment results for the target genes selected for UC. Several selected top pathways containing novel genes across different categories are shown here. See Supplementary Tables 3 and 4 for results on all the significant pathways for the genes selected in ARCHIE components 1 and 2 respectively.

Protein-protein interaction (PPI) enrichment analysis shows that the resultant proteins interact more often than expected at random (p-value = 8.8×10^-03 and 1.3×10^-03 respectively for the two significant ARCHIE components) (Figure 3C). Additionally, the selected genes were found to be enriched for genes significantly differentially expressed in several relevant tissues, such as colon-sigmoid and small-intestine ileum among others (Supplementary Figure 4). We further investigated whether any known mechanism can explain how the selected genes are associated with the selected variants, including mechanisms reflecting cis mediation¹⁵. In one example from our analysis, we observe that, among the 41 variants selected by variant-component 1, one UC-related variant, rs3774959, is a cis-eQTL of NFKB1 (p-value = 6.2 × 10^-41 in eQTLGen and 6.3 × 10^-05 in GTEx in whole blood). The Nuclear factor κB (NF-κB) family of transcription factors (TF), including NFKB1, has been extensively reported to be involved in immune⁵⁸ and inflammatory responses⁵⁹. In particular, mutations in the promoter region of Nuclear factor κB1 (NFKB1) have been strongly implicated in UC⁶⁰, although the downstream target genes of NFKB1 that are associated with UC are largely unknown. Among the 106 target genes selected in the first gene component, there are 6 genes (CD74, CD83, IL1B, IL2RA, PTPN6, FOXP3) that are reported targets of NFKB1 (adjusted enrichment p-value = 7.5 × 10^-03) in TRRUST v2.0⁴⁴. Thus, it is plausible that the selected UC-related variant regulates the expression levels of the 6 selected targets of NFKB1 via cis-regulation of NFKB1 expression levels, influencing UC status downstream.

Prostate Cancer

Prostate cancer (PC) is one of the most common types of cancer in middle-aged and older men, having a high public health burden with more than 3 million new cases in the USA per year. The eQTLGen consortium reports complete (non-missing) trans-association summary statistics for 122 SNPs associated with prostate cancer, curated from multiple large-scale GWAS, across 12,951 genes. Of these, 11,385 genes were expressed in Whole Blood from GTEx v8 individuals. Using ARCHIE, we detected two significant variant-gene components comprising 33 SNPs and 53 genes in total (Figure 4A; Supplementary Table 1) that reflect trans-association patterns specific to PC, of which 44 genes (83.1%) were novel (Table 1, Supplementary Table 1). Additionally, similar to SCZ and UC, we found evidence of enrichment of trans-heritability of PC that can be mediated by the target genes (Figure 4D), but the level of significance achieved was relatively weaker (p-value = 0.002 and 0.008; See Methods for details).

Figure 4. Analysis of selected target gene-set for PC.

(A) Top 10 cc-values (red) and the corresponding competitive null distributions (black box plot) for ARCHIE analysis of PC. Top 2 ARCHIE components are significant. (B, C) Reported p-values for trans-eQTL association between variants and genes selected in ARCHIE component 1 (B) and component 2 (C). TP53 and SMAD3 are highlighted. Any association p-value < 10^-08 is collapsed to 10^-08 for the ease of viewing. (D) trans-heritability enrichment analysis for target genes selected for PC. The boxplots represent the distribution of pseudo-r² for a random gene-set of the same size and the box-points represent the observed pseudo-r² for the target genes (See Results and Methods for details).

Among the novel genes, we identified several key genes that are generally implicated in different types of cancers. For example, TP53 aggregates weaker trans-associations with 9 PC-related variants in ARCHIE component 1 (Figure 4B). The TP53 gene encodes tumor protein p53 which acts as a key tumor suppressor and regulates cell division in general. TP53 is implicated in a large spectrum of cancer phenotypes and has been considered to be one of the most important cancer genes studied⁶¹. Further, genes associated with the second gene component included SMAD3 (Figure 4C) which is also a well-known tumor suppressor gene that plays a key role in transforming growth factor β (TGF-β) mediated immune suppression and also in regulating transcriptional responses suitable for metastasis^62–64. TP53 and SMAD3 belong to two different ARCHIE components meaning that they might pertain to two relatively distinct biological processes that are independently affected by different sets of PC related variants. Additionally, the second gene-component included EEA1 which is reported to have significantly altered expression levels in prostate cancer patients⁶⁵.

Using enrichment analyses, we found several pathways, including broadly ubiquitous pathways such as regulation of intracellular transport (adjusted p-value = 0.017) and mRNA 3’-UTR binding (adjusted p-value = 0.008), to be significantly overrepresented in the selected genes for both gene-components. Notably, we found the selected genes to be enriched for targets of several transcription factors, many of which have been associated with different types and subtypes of cancer. For example, we found a TF target enrichment for SPAG9 (adjusted p-value = 0.016), which has been identified to be associated with breast cancer, ovarian cancer, colorectal cancer and others ⁶⁶. We also found enrichment for targets of SSRP1 (adjusted p-value = 0.016), which is differentially regulated in a wide spectrum of malignant tumors ⁶⁷. However, we did not find any evidence of significant enrichment of PPI among the identified genes.

The downstream analysis suggests that a majority of the pathways (78.1%) enriched for the selected genes are immune related as observed in the previous examples as well. This might have been driven by the fact that eQTLGen reports summary trans-associations in whole blood. In general, whole blood might not be the ideal candidate tissue to identify trans-associations pertaining to PC. It is conceivable that relevant tissue-specific analysis for PC could have illuminated further trans-association patterns and identified key tissue-specific target genes. Despite that, we can identify several genes which have been elaborately reported to be key target genes for various cancers. This underlines the utility of our aggregative approach and that it can illuminate important target genes pertaining to a trait.

Conclusion

In this article, we have proposed ARCHIE, a novel method for identifying groups of trait-associated genetic variants whose effects may be mediated through trans-effects on groups of coregulated genes. By aggregating association signals across multiple SNP-gene expression pairs, the method improves power for the detection of patterns of weak trans-effects. Further, we develop a resampling-based method to allow testing for the statistical significance of trait-specific enrichment patterns in the background of expected highly polygenic broad trans-association signals. By applying the method to summary-level trans-association statistics available from the eQTLGen consortium, we identify novel target gene sets across a wide variety of traits. We explore the results in depth specifically for three complex diseases and provide further validation of, and insight into, the causal basis of the identified target genes. The method can be applied to investigate complex mediatory effects of GWAS variants on traits through networks of other molecular traits, such as proteins and metabolites.

While modern genome-wide association studies have been successful in identifying a large number of genetic variants associated with complex traits, the underlying biological mechanisms by which these associations arise have remained elusive. While trans genetic regulations, mediated through cis effects or otherwise, have been proposed for detecting important target genes for GWAS variants, identification of trans-associations using standard univariate SNP vs gene-expression association analysis is notoriously difficult due to weak effect sizes and large multiple testing burdens. Also, due to pleiotropy, there could be abundant associations between genetic variants and distal gene expression, but these associations may not reflect any trait-specific patterns. ARCHIE addresses these limitations of traditional trans-mapping of GWAS variants. Application of the method to eQTLGen consortium trans-eQTL statistics not only identified many novel trans-associations for trait-related variants, but also helped to contextualize the individual associations in terms of broader patterns that were detected by underlying correlation components and were shown to be highly trait-specific.

The set of selected target genes in the gene component is one of the key outputs of ARCHIE. Using a series of follow-up analyses for three different types of traits, we showed that the selected genes are often overrepresented in known disease-relevant pathways, are enriched in protein-protein interaction networks, show co-regulation across tissues and contain targets of known transcription factors implicated in the disease (SCZ and UC) and key tumor suppressor genes (PC). Further, using a trans-expression imputation approach, we demonstrated that the selected genes can significantly mediate heritability associated with trait-related variants. All of these analyses indicate that the trans-association patterns we detect are likely to have a trait-specific biological basis.

There are several limitations of the proposed method and current analysis. First, in the current version of ARCHIE, we begin with a set of genetic variants associated with a trait, but we do not incorporate the underlying association directions and effect sizes in the analysis. This approach allowed us to independently investigate identified target genes by testing, through the trans-heritability enrichment analysis, whether the directions of association of the SNPs with the trait are consistent with those of their associations with the expression of the target genes. However, it is likely that incorporating the direction of trait association for the SNPs in the sCCA analysis itself will lead to improved power for detection of the trait-specific target genes. Further, incorporating known functional annotation of genetic variants and other prior information regarding the relationship between genes can improve the power of the analysis as well. Currently, the resampling-based testing method we used to test for trait-specific enrichment patterns for trans-associations is computationally intensive. In the future, further research is merited to develop analytical approximation techniques to reduce the computational burden of ARCHIE.

In this article we have analyzed summary statistics reported by eQTLGen in whole blood. This is primarily because of the substantial effective sample size of eQTLGen. While the approach can be applied to eQTL results from other tissues, the underlying sample sizes may be too limited to yield sufficient power. Although blood might not be the most relevant tissue for a number of traits, our analysis did detect trans-association patterns that appear to have a broader biological basis in the disease genetics, from multiple independent lines of evidence. Nevertheless, it is likely that our analysis has missed many trans-association patterns that will be present only in specific disease-relevant tissues, cell types and/or dynamic stages ⁶⁸. In the future, we will seek applications for ARCHIE in various types of emerging eQTL databases to provide a more complete map of networks of genetic variants and trans-regulated gene expressions and relevant contexts.

Overall, in this article we have developed a novel summary-based method, ARCHIE, to detect trait-specific gene-sets by aggregating trans-associations from multiple trait-related variants. ARCHIE is a powerful tool for identifying target gene sets through which the effect of genetic variants on a complex trait may be mediated. Applications of the method to a variety of existing data on associations between genetic variants and high-throughput molecular traits can provide insights into biological mechanisms underlying the genetic basis of complex traits.

URL

eQTLGen (trans-eQTL summary statistics): https://www.eqtlgen.org/trans-eqtls.html

GTEx: https://www.gtexportal.org/home/

1000 Genomes: https://www.internationalgenome.org/data/

UKBiobank: https://www.ukbiobank.ac.uk/

FUMA: https://fuma.ctglab.nl/

ShinyGO: http://bioinformatics.sdstate.edu/go/

STRING: https://string-db.org/cgi/

GitHub: https://github.com/diptavo/ARCHIE (initial release)

Data Availability

The results from the analysis have been, and will continue to be, updated on GitHub.

Sample Description

eQTLGen

The eQTLGen consortium¹⁹ is a large-scale multi-study effort to study the downstream effects of trait-related variants via their effects on gene-expression in whole blood. The consortium consists of 37 individual studies with a collective sample size of 31,684 participants. With this sample size, the study has relatively high power to detect moderate to weak effects of variants on gene-expression. In total, 10,317 variants related to complex traits, compiled from several GWAS databases, were tested for trans-associations with the expression levels of 19,964 genes in whole blood. The authors have made summary statistics (Z-score, p-value) for these trans-eQTL mapping analyses freely available to the public.

GTEx

The Genotype-Tissue Expression (GTEx) project²⁵ aims to study tissue-specific gene expression and regulation. We used individual-level data from GTEx (v8) whole blood to construct the co-expression matrix (Σ_EE) and for downstream validation of the gene-sets selected using ARCHIE. The latest version (v8) of GTEx, which we used in our analysis, has gene-expression and genotype data with samples from 54 different tissues. In particular, 755 individuals had expression data on 20,315 genes for whole blood.

UK Biobank

UK Biobank is a large biobank study with more than 500,000 participants. The data resources available include genotype data, hospitalization records and health records. We used individual-level genotype data from UK Biobank to construct the LD matrix (Σ_GG) and for downstream analysis of the selected target genes.

The phenotype data constructed from hospitalization and health records were used in the quantification and testing of enrichment in trans-heritability explained by the selected target genes (see Methods). We included individuals of European ancestry in the analysis. For example, in the analysis of schizophrenia (SCZ), we used a sample of 366,326 participants from UK Biobank to construct the imputed gene-expression levels and evaluate the corresponding regression r² as an estimate of trans-heritability, treating SCZ as a binary phenotype.

Methods

Estimating trait-specific pattern of trans-associations

Our proposed method, Aggregative tRans assoCiation to detect pHenotype specIfic gEne-sets (ARCHIE), selects target genes trans-associated with trait-related variants using summary statistics in a sparse canonical correlation framework. To apply ARCHIE, we start with the summary statistics from trans-eQTL mapping (Z-value, p-value). Given the trans-association summary statistics across the variants related to the trait and all the corresponding distant genes (variant > 5Mb away from the transcription start site of the gene), we first adjust for the correlation among the variants and among the genes through appropriate linkage disequilibrium and co-expression matrices, respectively, to obtain a correlation-adjusted matrix W. Here, Σ_GG and Σ_EE are estimates of the LD matrix and co-expression matrix (see Supplementary Methods), and Σ_GE is a matrix of Z-values obtained from standard trans-eQTL mapping across all pairs of variants and gene-expressions.
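
As an illustration of this adjustment step, the following Python sketch whitens a matrix of trans-eQTL Z-values with the inverse symmetric square roots of the LD and co-expression matrices. The form W = Σ_GG^(-1/2) Σ_GE Σ_EE^(-1/2) is the usual canonical-correlation adjustment and is assumed here, not taken from the text. The function names, the variants-by-genes orientation of Σ_GE and the eigenvalue floor `eps` are likewise illustrative choices.

```python
import numpy as np

def inv_sqrt_psd(mat, eps=1e-8):
    """Inverse symmetric square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, eps, None)  # floor tiny or negative eigenvalues (assumed safeguard)
    # vecs / sqrt(vals) scales column j of vecs by 1 / sqrt(vals[j])
    return (vecs / np.sqrt(vals)) @ vecs.T

def adjust_trans_z(sigma_ge, sigma_gg, sigma_ee):
    """Assumed adjustment: W = Sigma_GG^(-1/2) @ Sigma_GE @ Sigma_EE^(-1/2).

    sigma_ge: (P variants x G genes) matrix of trans-eQTL Z-values
    sigma_gg: (P x P) LD matrix; sigma_ee: (G x G) co-expression matrix
    """
    return inv_sqrt_psd(sigma_gg) @ sigma_ge @ inv_sqrt_psd(sigma_ee)
```

With this orientation W has variants in rows and genes in columns; it can be transposed if genes are stored in rows.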

Using W, the correlation-adjusted matrix of trans-associations, ARCHIE employs sparse canonical correlation analysis^22,24 (sCCA), which produces a sparse linear combination of the variants (u; termed the variant-component) that is strongly correlated with a sparse linear combination of genes (v; termed the gene-component) by solving a penalized optimization problem. In this problem, ‖x‖_h denotes the L_h norm of a vector x, and c_u (or c_v) is the sparsity parameter on the variant (or gene) component for the lasso-type L₁ penalty. Sparsity aids interpretation, since each non-zero element of a variant or gene component indicates that the respective variant (or gene) is selected in that component. Thus, (u, v), the resultant variant and gene components (jointly termed ARCHIE components), can be interpreted as the sparse latent factors that explain the majority of the aggregated association between all the trait-related variants and all the genes. The corresponding sparse canonical correlation (cc-value) between each pair of variant and gene components, defined as q² = (v^T W u)², would ideally represent the cumulative correlation between the selected sets of variants and genes by aggregating multiple (possibly weaker) associations (Figure 1A shows an illustration using P variants and G genes). Multiple such components (u, v) can be extracted to reflect approximately orthogonal latent factors of the aggregative correlation, corresponding to possibly distinct mechanisms of trans-regulation (see Supplementary Methods).
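
A minimal sketch of how one pair of sparse components might be extracted is given below. It follows the rank-one penalized matrix decomposition of the cited sCCA references^22,24 (alternating soft-thresholded updates with unit L₂ norm and an L₁ bound) and is not the authors' implementation. The function names, the bisection search, the iteration limits and the variants-by-genes orientation of W are assumptions, and the sparsity parameters are assumed to satisfy c_u ≥ 1 and c_v ≥ 1.

```python
import numpy as np

def soft_threshold(x, d):
    """Shrink each entry of x toward zero by d."""
    return np.sign(x) * np.maximum(np.abs(x) - d, 0.0)

def _l1_l2_ratio(s):
    n = np.linalg.norm(s)
    return np.abs(s).sum() / n if n > 0 else 0.0

def l1_constrained_unit(x, c, n_bisect=40):
    """Unit-L2 vector proportional to soft_threshold(x, d), with L1 norm <= c."""
    if not np.any(x):
        return x
    if _l1_l2_ratio(x) <= c:  # constraint already satisfied without shrinkage
        return x / np.linalg.norm(x)
    lo, hi = 0.0, float(np.abs(x).max())
    for _ in range(n_bisect):  # bisection on the threshold d
        mid = 0.5 * (lo + hi)
        if _l1_l2_ratio(soft_threshold(x, mid)) > c:
            lo = mid
        else:
            hi = mid
    s = soft_threshold(x, hi)
    if not np.any(s):  # can happen when the largest entries are tied
        s = soft_threshold(x, lo)
    return s / np.linalg.norm(s)

def sparse_cca_rank1(W, c_u, c_v, n_iter=100, tol=1e-8):
    """One sparse pair (u, v) for W (variants x genes) and the value q = u^T W v."""
    v = np.linalg.svd(W, full_matrices=False)[2][0]  # leading right singular vector
    for _ in range(n_iter):
        u = l1_constrained_unit(W @ v, c_u)
        v_new = l1_constrained_unit(W.T @ u, c_v)
        converged = np.linalg.norm(v_new - v) < tol
        v = v_new
        if converged:
            break
    return u, v, float(u @ W @ v)
```

Non-zero entries of u and v mark the selected variants and genes, and the squared value q² plays the role of the cc-value. Further components could be obtained by repeating the procedure on a deflated matrix such as W − q·u·vᵀ, which is again an assumption borrowed from the cited decomposition.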

At suitable levels of sparsity (see Supplementary Methods), ARCHIE components produce a much smaller number of selected target genes that harbor multiple moderate to weak trans-associations with a selected set of trait-associated variants, thus reflecting a trait-specific pattern of trans-association. A detailed algorithm for the estimation of the ARCHIE components is provided in Supplementary Section A.

Testing Hypothesis of Enrichment of Trait-Specific trans-Association using a Competitive Null Hypothesis Framework

To test which ARCHIE components significantly capture the phenotype-specific trans-association pattern, we evaluated the results from the original analysis against a competitive null hypothesis. Since trait-related variants are expected to be enriched for trans-eQTLs in general, we test whether the cc-values obtained in the original analysis are higher than those obtained using the trans-summary statistics between a random set of GWAS-identified variants and a random set of genes of similar size, which do not reflect any trait-specific pattern. For this, we first construct a null matrix by taking a random sample of p variants from the pool of all variants available and extracting the corresponding trans-summary statistics for another set of randomly chosen g genes. Since eQTLGen reports the trans-summary statistics across about 10,000 variants associated with different traits, we can construct the null matrix using the trans-summary statistics from variants that are associated with other traits and not with the trait of interest. This matrix of trans-associations, by design, should not reflect phenotype-specific patterns. For example, in the analysis for schizophrenia (SCZ) using summary statistics across 218 variants and 7,047 genes, we construct the null matrix by selecting one variant at random from each of 218 randomly chosen traits and extracting their corresponding trans-summary statistics across 7,047 randomly chosen genes.

Then we use ARCHIE with the same sparsity levels as the original analysis to extract the gene and variant components and calculate the corresponding cc-values. We repeat this step multiple (M) times to generate a competitive null distribution of cc-values. We then evaluate the observed cc-values from the original analysis against the corresponding competitive null distributions to calculate the p-values. In particular, the p-value of the k^th ARCHIE component is obtained by comparing the k^th cc-value in the original analysis with the elements of the null distribution of the k^th cc-value. We declare that the top L components significantly capture phenotype-specific trans-association patterns when their p-values meet the significance criterion. The random set of p variants should be carefully chosen so that it includes none of the variants associated with the phenotype under consideration or with any phenotype sharing substantial genetic correlation with it. Further, the set should not include a large fraction of variants from any single phenotype (different from the original phenotype), which may bias the competitive null distribution towards the trans-association cc-values for that phenotype.
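
The resampling logic can be sketched as follows. The empirical p-value shown, with a +1 correction in numerator and denominator, is one common convention and is assumed here; the exact formula and significance criterion of ARCHIE are not reproduced. The function names and the `variants_by_trait` dictionary are hypothetical.

```python
import numpy as np

def sample_null_variants(variants_by_trait, excluded_traits, p, rng):
    """Draw one variant at random from each of p randomly chosen traits.

    variants_by_trait: hypothetical dict mapping trait name -> list of variant IDs
    excluded_traits: the trait of interest and any genetically correlated traits
    """
    eligible = sorted(t for t in variants_by_trait if t not in excluded_traits)
    traits = rng.choice(eligible, size=p, replace=False)
    return [str(rng.choice(variants_by_trait[str(t)])) for t in traits]

def competitive_null_pvalues(q_obs, q_null):
    """Empirical p-value per component (assumed form with +1 correction).

    q_obs: length-K observed cc-values; q_null: (M x K) cc-values from M null draws
    """
    q_obs = np.asarray(q_obs, dtype=float)
    q_null = np.asarray(q_null, dtype=float)
    exceed = (q_null >= q_obs[None, :]).sum(axis=0)  # null values at least as large
    return (exceed + 1) / (q_null.shape[0] + 1)
```

A variant shared by several traits could be drawn more than once by this sampler, so in practice duplicates would need to be removed or redrawn.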

We evaluated the statistical properties of the proposed testing procedure, including its type-I error, using extensive resampling experiments across four different traits. Our results show that the test against the competitive null hypothesis preserves the correct type-I error rate at a significance threshold of 0.001. Further, it can detect the presence of trait-specific trans-associations with high probability (Supplementary Section B).

Analysis of eQTLGen data

To identify phenotype-specific trans-associations, we applied ARCHIE to the trans-association summary statistics for 10,317 trait-related variants across 19,942 genes reported by the eQTLGen consortium¹⁹ (see Sample Description for details on the study). In line with the consortium, we defined a gene to be trans to a variant if the variant was located at least 5Mb from the transcription start site of the gene or on another chromosome. The data contain multiple variants associated with the same trait that were analyzed for trans-eQTL mapping. Our analysis was restricted to phenotypes that had at least 100 associated variants tested for trans-mapping in the consortium, producing 29 phenotypes. Figure 1B shows a graphical representation of the major steps of our workflow. Briefly, for each phenotype, we extracted the summary trans-eQTL association statistics (Z-score, p-value) and removed all genes that were within 5Mb of any of the trait-related variants. In the preprocessing step, we filtered out any missing data and retained the genes that were also expressed in GTEx (v8)²⁵ whole blood. This produced, on average per phenotype, a list of approximately 129 (min: 112; max: 533) variants and 10,219 (min: 3,426; max: 13,910) genes. ARCHIE requires two additional matrices representing the correlation among the variants themselves (a linkage-disequilibrium matrix) and among the gene-expression levels (a co-expression matrix), which can be estimated using reference data. We constructed the LD matrix for the variants from individual-level genotype information using 5,000 randomly selected, unrelated European samples in UK Biobank⁶⁹. For the correlation between gene-expressions, we used a penalized co-expression matrix⁷⁰ of the corresponding genes constructed from the covariate-adjusted gene-expression levels for individuals in GTEx v8 data (see Supplementary Methods). Subsequently, for the given trait, we extracted the selected variants and genes using the significant components and evaluated them for the presence of false positives due to cross-mapping.
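
The trans definition and gene filter described above can be sketched as follows. The 5Mb threshold and the "other chromosome" rule come from the text; the function names and the simple tuple and dictionary layout of the inputs are hypothetical.

```python
TRANS_MIN_DISTANCE = 5_000_000  # 5Mb, as in the trans definition above

def is_trans(variant_chrom, variant_pos, gene_chrom, gene_tss,
             min_distance=TRANS_MIN_DISTANCE):
    """True if the gene is on another chromosome or its TSS is at least 5Mb away."""
    return (variant_chrom != gene_chrom
            or abs(variant_pos - gene_tss) >= min_distance)

def genes_trans_to_all(variants, genes):
    """Keep genes that are trans to every trait-related variant.

    variants: list of (chromosome, position) tuples
    genes: dict mapping gene ID -> (chromosome, TSS position)
    """
    return [gene for gene, (chrom, tss) in genes.items()
            if all(is_trans(vc, vp, chrom, tss) for vc, vp in variants)]
```

Any gene lying within 5Mb of at least one trait-related variant is dropped, so the retained matrix of summary statistics contains only trans pairs.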

Cross-Mappability

Alignment errors due to similarity in sequenced reads can lead to a substantial rise in false positives when detecting trans-eQTL associations ⁷¹. With the selected ARCHIE components, we extracted the nearby genes expressed in GTEx v8 whole blood for the selected variants (TSS within ±500 kb of the variant) and evaluated the cross-mapping scores for these genes with the selected target genes. Across the 3 traits analyzed in this article, we found that nearly all such gene pairs were not cross-mappable (SCZ: 99.98%, UC: 99.17%, PC: 99.93%), indicating that the trans-association patterns were unlikely to be affected by false positives arising from alignment errors.

Follow-up Analysis

Quantifying and Testing for Enrichment for Trait Heritability Explained by Identified Target Genes

In the following, we propose a method for quantifying the trait heritability explained by the GWAS variants that would be mediated by the identified target genes, and develop a corresponding test for enrichment by comparing such estimates of mediated heritability with those obtained from random genes. For a particular trait of interest, we start with the Z-scores from regression-based trans-eQTL mapping for a set of underlying p variants and g genes. We will assume that, using ARCHIE, we have identified G target genes that capture trait-specific trans-association patterns. To perform the test as proposed above, we require individual-level phenotype and genotype data independent of the samples used in the original analysis. Given genotypes (or dosages) at the p variant sites for an individual k, for each target gene we define the trans-imputed expression score (TIES) as the predicted expression value for the j^th target gene. The score is computed from Z_ij, the z-score for the effect of the i^th trait-related variant on the j^th gene; x_ik, the genotype or dosage for the k^th individual at the i^th variant; and m_i, the minor allele frequency of the i^th variant. We construct the TIES under two different schemes:

Using all the trait-related variants with complete trans-association statistics reported in eQTLGen
Using only the trait-related variants selected in the significant components
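
For illustration only, a score of this kind could be computed as below. The specific formula, a Z-score-weighted sum of genotypes centered at 2·m_i and scaled by sqrt(2·m_i·(1 − m_i)), is an assumption in the style of standard polygenic scores and may differ from the definition used in ARCHIE. The function name and array layout are hypothetical.

```python
import numpy as np

def trans_imputed_scores(genotypes, z_scores, maf):
    """Assumed TIES form: T[k, j] = sum_i z[i, j] * (x[k, i] - 2 m_i) / sqrt(2 m_i (1 - m_i)).

    genotypes: (n individuals x p variants) genotypes or dosages x_ik
    z_scores:  (p variants x G genes) trans-eQTL z-scores Z_ij
    maf:       length-p minor allele frequencies m_i
    """
    genotypes = np.asarray(genotypes, dtype=float)
    maf = np.asarray(maf, dtype=float)
    standardized = (genotypes - 2.0 * maf) / np.sqrt(2.0 * maf * (1.0 - maf))
    return standardized @ np.asarray(z_scores, dtype=float)
```

Under the second scheme, the rows of `z_scores` and the columns of `genotypes` would simply be restricted to the variants selected in the significant components.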

To evaluate how strongly the TIES for the G target genes are associated with the phenotype levels, we fit a multiple regression model of the phenotype on the TIES, in which y_k is the phenotype value (e.g. disease status) for the k^th individual and g[.] is a canonical link function that can be set to the identity function for continuous phenotypes or the logistic function for binary (disease status) phenotypes. We record the pseudo-r² from this regression model as a measure of association between the TIES and the phenotype value. The pseudo-r² provides an estimate of trans-heritability, meaning that it quantifies the heritability explained by the trait-related variants that is expected to be mediated via the selected target genes in the context of the trans-associations reported. To test whether the observed r² is significant in comparison to what is expected at random, we adopt a resampling-based approach. We sampled G genes (excluding the originally selected target genes) from the genome, constructed the corresponding TIES for the individuals and recorded the r² for the regression model. We repeated the resampling 1000 times to generate a control (null) distribution of r² that reflects the associations expected from random genes. We then calculated the p-value of the observed r², obtained using the originally selected G target genes, from this control distribution to evaluate whether the TIES have any significant association with the phenotype.
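
The resampling test can be sketched for a continuous phenotype as follows. Ordinary least squares with the usual r² stands in for the general link-function model; a binary phenotype would instead need logistic regression and a pseudo-r², which is not shown. The function names, the precomputed `ties_all` matrix and the +1 correction in the empirical p-value are assumptions.

```python
import numpy as np

def r_squared(ties, y):
    """r^2 of an ordinary least-squares fit of y on the TIES columns plus an intercept."""
    X = np.column_stack([np.ones(len(y)), ties])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def enrichment_pvalue(ties_all, y, target_idx, n_resamples=1000, rng=None):
    """Compare r^2 of the target genes with r^2 of equally sized random gene sets.

    ties_all: (n individuals x all genes) matrix of TIES, assumed precomputed
    target_idx: column indices of the selected target genes
    """
    rng = np.random.default_rng() if rng is None else rng
    target_idx = np.asarray(target_idx)
    r2_obs = r_squared(ties_all[:, target_idx], y)
    # Random gene sets are drawn from genes other than the selected targets
    pool = np.setdiff1d(np.arange(ties_all.shape[1]), target_idx)
    r2_null = np.array([
        r_squared(ties_all[:, rng.choice(pool, size=len(target_idx), replace=False)], y)
        for _ in range(n_resamples)
    ])
    p_value = (np.sum(r2_null >= r2_obs) + 1) / (n_resamples + 1)
    return r2_obs, p_value
```

A small p-value indicates that the selected target genes explain more trait variance through their TIES than random gene sets of the same size.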

Approximately, the observed r² reflects the proportion of trait variance explained by the TIES. Thus, a significantly higher r² would imply that the selected genes harbor several trans-associations and mediate the effects of the trait-related variants more than a random set of genes would. As the analysis of association between TIES and trait (for both the selected trans genes and random genes) is performed in an independent dataset, and no information on the directions or magnitudes of trait association for the SNPs is used in the original ARCHIE analysis, the test for heritability enrichment provides independent validation of the relevance of the selected target genes in explaining genetic associations for the trait. In our application, we used individual-level phenotype and genotype data from UK Biobank participants to estimate the association between TIES and traits.

We also performed several other follow-up analyses, including PPI enrichment, pathway enrichment and enrichment of differentially expressed genes. These analyses were carried out using pre-established standard pipelines. For full details, see Supplementary Section C.

References

1. Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
2. Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
3. Visscher, P. M. et al. 10 Years of GWAS Discovery: Biology, Function, and Translation. Am. J. Hum. Genet. 101, 5–22 (2017).
4. Eicher, J. D. et al. GRASP v2.0: An update on the Genome-Wide Repository of Associations between SNPs and Phenotypes. Nucleic Acids Res. (2015). doi:10.1093/nar/gku1202
5. Schaub, M. A., Boyle, A. P., Kundaje, A., Batzoglou, S. & Snyder, M. Linking disease associations with regulatory information in the human genome. Genome Res. (2012). doi:10.1101/gr.136127.111
6. Maurano, M. T. et al. Systematic Localization of Common Disease-Associated Variation in Regulatory DNA. Science 337, 1190–1195 (2012).
7. Nicolae, D. L. et al. Trait-associated SNPs are more likely to be eQTLs: Annotation to enhance discovery from GWAS. PLoS Genet. (2010). doi:10.1371/journal.pgen.1000888
8. Gamazon, E. R. et al. A gene-based association method for mapping traits using reference transcriptome data. Nat. Genet. 47(9), 1091–1098 (2015).
9. Gusev, A. et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat. Genet. (2016). doi:10.1038/ng.3506
10. Giambartolomei, C. et al. Bayesian Test for Colocalisation between Pairs of Genetic Association Studies Using Summary Statistics. PLoS Genet. 10, e1004383 (2014).
11. Barbeira, A. N. et al. Integrating predicted transcriptome from multiple tissues improves association detection. PLOS Genet. 15, e1007889 (2019).
12. Yao, D. W., O’Connor, L. J., Price, A. L. & Gusev, A. Quantifying genetic effects on disease mediated by assayed gene expression levels. Nat. Genet. (2020). doi:10.1038/s41588-020-0625-2
13. Aguet, F. et al. Genetic effects on gene expression across human tissues. Nature (2017). doi:10.1038/nature24277
14. Nica, A. C. et al. The architecture of gene regulatory variation across multiple human tissues: The muTHER study. PLoS Genet. (2011). doi:10.1371/journal.pgen.1002003
15. Westra, H. J. et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet. (2013). doi:10.1038/ng.2756
16. Yao, C. et al. Dynamic Role of trans Regulation of Gene Expression in Relation to Complex Traits. Am. J. Hum. Genet. (2017). doi:10.1016/j.ajhg.2017.02.003
17. Brynedal, B. et al. Large-Scale trans-eQTLs Affect Hundreds of Transcripts and Mediate Patterns of Transcriptional Co-regulation. Am. J. Hum. Genet. (2017). doi:10.1016/j.ajhg.2017.02.004
18. Rockman, M. V. & Kruglyak, L. Genetics of global gene expression. Nature Reviews Genetics (2006). doi:10.1038/nrg1964
19. Võsa, U. et al. Unraveling the polygenic architecture of complex traits using blood eQTL metaanalysis. bioRxiv (2018). doi:10.1101/447367
20. Boyle, E. A., Li, Y. I. & Pritchard, J. K. An Expanded View of Complex Traits: From Polygenic to Omnigenic. Cell 169, 1177–1186 (2017).
21. Liu, X., Li, Y. I. & Pritchard, J. K. Trans Effects on Gene Expression Can Drive Omnigenic Inheritance. Cell (2019). doi:10.1016/j.cell.2019.04.014
22. Witten, D. M., Tibshirani, R. & Hastie, T. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics (2009). doi:10.1093/biostatistics/kxp008
23. Hardoon, D. R. & Shawe-Taylor, J. Sparse canonical correlation analysis. Mach. Learn. (2011). doi:10.1007/s10994-010-5222-7
24. Witten, D. M. & Tibshirani, R. J. Extensions of sparse canonical correlation analysis with applications to genomic data. Stat. Appl. Genet. Mol. Biol. (2009). doi:10.2202/1544-6115.1470
25. Aguet, F. et al. The GTEx Consortium atlas of genetic regulatory effects across human tissues. bioRxiv (2019). doi:10.1101/787903
26. Volk, D. W., Chitrapu, A., Edelson, J. R. & Lewis, D. A. Chemokine receptors and cortical interneuron dysfunction in schizophrenia. Schizophr. Res. (2015). doi:10.1016/j.schres.2014.10.031
27. Kassan, A. et al. Caveolin-1 regulation of disrupted-in-schizophrenia-1 as a potential therapeutic target for schizophrenia. J. Neurophysiol. (2017). doi:10.1152/jn.00481.2016
28. Kakiuchi, C. et al. Functional polymorphisms of HSPA5: Possible association with bipolar disorder. Biochem. Biophys. Res. Commun. (2005). doi:10.1016/j.bbrc.2005.08.248
29. Martin, J. et al. A brief report: de novo copy number variants in children with attention deficit hyperactivity disorder. Transl. Psychiatry (2019). doi:10.1101/2019.12.12.19014555
30. Mancuso, N. et al. Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits. Am. J. Hum. Genet. (2017). doi:10.1016/j.ajhg.2017.01.031
31. Hill, D. P., Smith, B., McAndrews-Hill, M. S. & Blake, J. A. Gene Ontology annotations: What they mean and where they come from. in BMC Bioinformatics (2008). doi:10.1186/1471-2105-9-S5-S2
32. Kanehisa, M. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. (2000). doi:10.1093/nar/28.1.27
33. Croft, D. et al. Reactome: A database of reactions, pathways and biological processes. Nucleic Acids Res. (2011). doi:10.1093/nar/gkq1018
34. Schaefer, C. F. et al. PID: The pathway interaction database. Nucleic Acids Res. (2009). doi:10.1093/nar/gkn653
35. Adriaens, M. E. et al. The public road to high-quality curated biological pathways. Drug Discovery Today (2008). doi:10.1016/j.drudis.2008.06.013
36. Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst 1(6), 417–425 (2015).
37. Patel, S., Sharma, D., Kalia, K. & Tiwari, V. Crosstalk between endoplasmic reticulum stress and oxidative stress in schizophrenia: The dawn of new therapeutic approaches. Neuroscience and Biobehavioral Reviews (2017). doi:10.1016/j.neubiorev.2017.08.025
38. Landek-Salgado, M. A., Faust, T. E. & Sawa, A. Molecular substrates of schizophrenia: Homeostatic signaling to connectivity. Molecular Psychiatry (2016). doi:10.1038/mp.2015.141
39. Chen, X. et al. Apoptotic engulfment pathway and schizophrenia. PLoS One (2009). doi:10.1371/journal.pone.0006875
40. Liu, M.-L. et al. Severe disturbance of glucose metabolism in peripheral blood mononuclear cells of schizophrenia patients: a targeted metabolomic study. J. Transl. Med. 13, 226 (2015).
41. Cannon, T. D., Yolken, R., Buka, S. & Torrey, E. F. Decreased Neurotrophic Response to Birth Hypoxia in the Etiology of Schizophrenia. Biol. Psychiatry 64, 797–802 (2008).
42. Ryskalin, L., Limanaqi, F., Frati, A., Busceti, C. & Fornai, F. mTOR-Related Brain Dysfunctions in Neuropsychiatric Disorders. Int. J. Mol. Sci. 19, 2226 (2018).
43. Kim, P., Scott, M. R. & Meador-Woodruff, J. H. Dysregulation of the unfolded protein response (UPR) in the dorsolateral prefrontal cortex in elderly patients with schizophrenia. Mol. Psychiatry (2019). doi:10.1038/s41380-019-0537-7
44. Han, H. et al. TRRUST v2: An expanded reference database of human and mouse transcriptional regulatory interactions. Nucleic Acids Res. (2018). doi:10.1093/nar/gkx1013
45. Zheng, G. et al. ITFP: An integrated platform of mammalian transcription factors. Bioinformatics (2008). doi:10.1093/bioinformatics/btn439
46. Aberg, K. A. et al. Methylome-Wide Association Study of Schizophrenia. JAMA Psychiatry 71, 255 (2014).
47. Martínez, G. et al. Regulation of Memory Formation by the Transcription Factor XBP1. Cell Rep. 14, 1382–1394 (2016).
48. Szklarczyk, D. et al. STRING v11: Protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. (2019). doi:10.1093/nar/gky1131
49. Bassotti, G. et al. Gastrointestinal motility disorders in inflammatory bowel diseases. World J. Gastroenterol. (2014). doi:10.3748/wjg.v20.i1.37
50. Farrokhyar, F., Marshall, J. K., Easterbrook, B. & Irvine, E. J. Functional gastrointestinal disorders and mood disorders in patients with inactive inflammatory bowel disease: Prevalence and impact on health. Inflamm. Bowel Dis. (2006). doi:10.1097/01.MIB.0000195391.49762.89
51. Lakatos, P. L. & Lakatos, L. Risk for colorectal cancer in ulcerative colitis: Changes, causes and management strategies. World Journal of Gastroenterology (2008). doi:10.3748/wjg.14.3937
52. Van Der Post, S. et al. Structural weakening of the colonic mucus barrier is an early event in ulcerative colitis pathogenesis. Gut (2019). doi:10.1136/gutjnl-2018-317571
53. Rageul, J. et al. KLF4-dependent, PPARγ-induced expression of GPA33 in colon cancer cell lines. Int. J. Cancer (2009). doi:10.1002/ijc.24683
54. Williams, B. B. et al. Glycoprotein A33 deficiency: A new mouse model of impaired intestinal epithelial barrier function and inflammatory disease. DMM Dis. Model. Mech. (2015). doi:10.1242/dmm.019935
55. Gobert, A. P. et al. Distinct immunomodulatory effects of spermine oxidase in colitis induced by epithelial injury or infection. Front. Immunol. (2018). doi:10.3389/fimmu.2018.01242
56. Hu, T. et al. Spermine oxidase is upregulated and promotes tumor growth in hepatocellular carcinoma. Hepatol. Res. (2018). doi:10.1111/hepr.13206
57. Sadlack, B. et al. Ulcerative colitis-like disease in mice with a disrupted interleukin-2 gene. Cell 75, 253–261 (1993).
58. Hayden, M. S. & Ghosh, S. NF-κB in immunobiology. Cell Research (2011). doi:10.1038/cr.2011.13
59. Liu, T., Zhang, L., Joo, D. & Sun, S. C. NF-κB signaling in inflammation. Signal Transduction and Targeted Therapy (2017). doi:10.1038/sigtrans.2017.23
60. Borm, M. E. A., Van Bodegraven, A. A., Mulder, C. J. J., Kraal, G. & Bouma, G. A NFKB1 promoter polymorphism is involved in susceptibility to ulcerative colitis. Int. J. Immunogenet. (2005). doi:10.1111/j.1744-313X.2005.00546.x
61. Wang, X. & Sun, Q. TP53 mutations, expression and interaction networks in human cancers. Oncotarget (2017). doi:10.18632/oncotarget.13483
62. Millet, C. & Zhang, Y. E. Roles of Smad3 in TGF-β signaling during carcinogenesis. Critical Reviews in Eukaryotic Gene Expression (2007). doi:10.1615/CritRevEukarGeneExpr.v17.i4.30
63. Tang, P. M.-K. et al. Smad3 promotes cancer progression by inhibiting E4BP4-mediated NK cell development. Nat. Commun. 8, 14677 (2017).
64. Lu, S., Lee, J., Revelo, M., Wang, X. & Dong, Z. Smad3 is overexpressed in advanced human prostate cancer and necessary for progressive growth of prostate cancer cells in nude mice. Clin. Cancer Res. (2007). doi:10.1158/1078-0432.CCR-07-1078
65. Johnson, I. R. D. et al. Endosomal gene expression: a new indicator for prostate cancer patient prognosis? Oncotarget 6, 37919–37929 (2015).
66. Kanojia, D., Garg, M., Gupta, S., Gupta, A. & Suri, A. Sperm-Associated Antigen 9 Is a Novel Biomarker for Colorectal Cancer and Is Involved in Tumor Growth and Tumorigenicity. Am. J. Pathol. 178, 1009–1020 (2011).
67. Garcia, H. et al. Facilitates Chromatin Transcription Complex Is an “Accelerator” of Tumor Transformation and Potential Marker and Target of Aggressive Cancers. Cell Rep. 4, 159–173 (2013).
68. Umans, B. D., Battle, A. & Gilad, Y. Where Are the Disease-Associated eQTLs? Trends Genet. 1–16 (2020). doi:10.1016/j.tig.2020.08.009
69. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
70. Schäfer, J. & Strimmer, K. A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Stat. Appl. Genet. Mol. Biol. (2005). doi:10.2202/1544-6115.1175
71. Saha, A. & Battle, A. False positives in trans-eQTL and co-expression analyses arising from RNA-sequencing alignment errors. F1000Research (2018). doi:10.12688/f1000research.17145.1

Posted October 06, 2020.

Subject Area

Genetic and Genomic Medicine


[1] 1.↵
Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
OpenUrl CrossRef PubMed

[2] 2.↵
Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
OpenUrl CrossRef PubMed

[3] 3.
Visscher, P. M. et al. 10 Years of GWAS Discovery: Biology, Function, and Translation. Am. J. Hum. Genet. 101, 5–22 (2017).
OpenUrl CrossRef PubMed

[4] 4.↵
Eicher, J. D. et al. GRASP v2.0: An update on the Genome-Wide Repository of Associations between SNPs and Phenotypes. Nucleic Acids Res. (2015). doi:10.1093/nar/gku1202
OpenUrl CrossRef PubMed

[5] 5.↵
Schaub, M. A., Boyle, A. P., Kundaje, A., Batzoglou, S. & Snyder, M. Linking disease associations with regulatory information in the human genome. Genome Res. (2012). doi:10.1101/gr.136127.111
OpenUrl Abstract/FREE Full Text

[6] 6.
Maurano, M. T. et al. Systematic Localization of Common Disease-Associated Variation in Regulatory DNA. Science (80-.). 337, 1190–1195 (2012).
OpenUrl Abstract/FREE Full Text

[7] Nicolae, D. L. et al. Trait-associated SNPs are more likely to be eQTLs: Annotation to enhance discovery from GWAS. PLoS Genet. (2010). doi:10.1371/journal.pgen.1000888

[8] Gamazon, E. R. et al. A gene-based association method for mapping traits using reference transcriptome data. Nat. Genet. 47(9), 1091–1098 (2015).

[9] Gusev, A. et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat. Genet. (2016). doi:10.1038/ng.3506

[10] Giambartolomei, C. et al. Bayesian Test for Colocalisation between Pairs of Genetic Association Studies Using Summary Statistics. PLoS Genet. 10, e1004383 (2014).

[11] Barbeira, A. N. et al. Integrating predicted transcriptome from multiple tissues improves association detection. PLoS Genet. 15, e1007889 (2019).

[12] Yao, D. W., O’Connor, L. J., Price, A. L. & Gusev, A. Quantifying genetic effects on disease mediated by assayed gene expression levels. Nat. Genet. (2020). doi:10.1038/s41588-020-0625-2

[13] Aguet, F. et al. Genetic effects on gene expression across human tissues. Nature (2017). doi:10.1038/nature24277

[14] Nica, A. C. et al. The architecture of gene regulatory variation across multiple human tissues: The muTHER study. PLoS Genet. (2011). doi:10.1371/journal.pgen.1002003

[15] Westra, H. J. et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet. (2013). doi:10.1038/ng.2756

[16] Yao, C. et al. Dynamic Role of trans Regulation of Gene Expression in Relation to Complex Traits. Am. J. Hum. Genet. (2017). doi:10.1016/j.ajhg.2017.02.003

[17] Brynedal, B. et al. Large-Scale trans-eQTLs Affect Hundreds of Transcripts and Mediate Patterns of Transcriptional Co-regulation. Am. J. Hum. Genet. (2017). doi:10.1016/j.ajhg.2017.02.004

[18] Rockman, M. V. & Kruglyak, L. Genetics of global gene expression. Nat. Rev. Genet. (2006). doi:10.1038/nrg1964

[19] Võsa, U. et al. Unraveling the polygenic architecture of complex traits using blood eQTL metaanalysis. bioRxiv (2018). doi:10.1101/447367

[20] Boyle, E. A., Li, Y. I. & Pritchard, J. K. An Expanded View of Complex Traits: From Polygenic to Omnigenic. Cell 169, 1177–1186 (2017).

[21] Liu, X., Li, Y. I. & Pritchard, J. K. Trans Effects on Gene Expression Can Drive Omnigenic Inheritance. Cell (2019). doi:10.1016/j.cell.2019.04.014

[22] Witten, D. M., Tibshirani, R. & Hastie, T. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics (2009). doi:10.1093/biostatistics/kxp008

[23] Hardoon, D. R. & Shawe-Taylor, J. Sparse canonical correlation analysis. Mach. Learn. (2011). doi:10.1007/s10994-010-5222-7

[24] Witten, D. M. & Tibshirani, R. J. Extensions of sparse canonical correlation analysis with applications to genomic data. Stat. Appl. Genet. Mol. Biol. (2009). doi:10.2202/1544-6115.1470

[25] Aguet, F. et al. The GTEx Consortium atlas of genetic regulatory effects across human tissues. bioRxiv (2019). doi:10.1101/787903

[26] Volk, D. W., Chitrapu, A., Edelson, J. R. & Lewis, D. A. Chemokine receptors and cortical interneuron dysfunction in schizophrenia. Schizophr. Res. (2015). doi:10.1016/j.schres.2014.10.031

[27] Kassan, A. et al. Caveolin-1 regulation of disrupted-in-schizophrenia-1 as a potential therapeutic target for schizophrenia. J. Neurophysiol. (2017). doi:10.1152/jn.00481.2016

[28] Kakiuchi, C. et al. Functional polymorphisms of HSPA5: Possible association with bipolar disorder. Biochem. Biophys. Res. Commun. (2005). doi:10.1016/j.bbrc.2005.08.248

[29] Martin, J. et al. A brief report: de novo copy number variants in children with attention deficit hyperactivity disorder. Transl. Psychiatry (2019). doi:10.1101/2019.12.12.19014555

[30] Mancuso, N. et al. Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits. Am. J. Hum. Genet. (2017). doi:10.1016/j.ajhg.2017.01.031

[31] Hill, D. P., Smith, B., McAndrews-Hill, M. S. & Blake, J. A. Gene Ontology annotations: What they mean and where they come from. BMC Bioinformatics (2008). doi:10.1186/1471-2105-9-S5-S2

[32] Kanehisa, M. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. (2000). doi:10.1093/nar/28.1.27

[33] Croft, D. et al. Reactome: A database of reactions, pathways and biological processes. Nucleic Acids Res. (2011). doi:10.1093/nar/gkq1018

[34] Schaefer, C. F. et al. PID: The pathway interaction database. Nucleic Acids Res. (2009). doi:10.1093/nar/gkn653

[35] Adriaens, M. E. et al. The public road to high-quality curated biological pathways. Drug Discovery Today (2008). doi:10.1016/j.drudis.2008.06.013

[36] Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1(6), 417–425 (2015).

[37] Patel, S., Sharma, D., Kalia, K. & Tiwari, V. Crosstalk between endoplasmic reticulum stress and oxidative stress in schizophrenia: The dawn of new therapeutic approaches. Neuroscience and Biobehavioral Reviews (2017). doi:10.1016/j.neubiorev.2017.08.025

[38] Landek-Salgado, M. A., Faust, T. E. & Sawa, A. Molecular substrates of schizophrenia: Homeostatic signaling to connectivity. Molecular Psychiatry (2016). doi:10.1038/mp.2015.141

[39] Chen, X. et al. Apoptotic engulfment pathway and schizophrenia. PLoS One (2009). doi:10.1371/journal.pone.0006875

[40] Liu, M.-L. et al. Severe disturbance of glucose metabolism in peripheral blood mononuclear cells of schizophrenia patients: a targeted metabolomic study. J. Transl. Med. 13, 226 (2015).

[41] Cannon, T. D., Yolken, R., Buka, S. & Torrey, E. F. Decreased Neurotrophic Response to Birth Hypoxia in the Etiology of Schizophrenia. Biol. Psychiatry 64, 797–802 (2008).

[42] Ryskalin, L., Limanaqi, F., Frati, A., Busceti, C. & Fornai, F. mTOR-Related Brain Dysfunctions in Neuropsychiatric Disorders. Int. J. Mol. Sci. 19, 2226 (2018).

[43] Kim, P., Scott, M. R. & Meador-Woodruff, J. H. Dysregulation of the unfolded protein response (UPR) in the dorsolateral prefrontal cortex in elderly patients with schizophrenia. Mol. Psychiatry (2019). doi:10.1038/s41380-019-0537-7

[44] Han, H. et al. TRRUST v2: An expanded reference database of human and mouse transcriptional regulatory interactions. Nucleic Acids Res. (2018). doi:10.1093/nar/gkx1013

[45] Zheng, G. et al. ITFP: An integrated platform of mammalian transcription factors. Bioinformatics (2008). doi:10.1093/bioinformatics/btn439

[46] Aberg, K. A. et al. Methylome-Wide Association Study of Schizophrenia. JAMA Psychiatry 71, 255 (2014).

[47] Martínez, G. et al. Regulation of Memory Formation by the Transcription Factor XBP1. Cell Rep. 14, 1382–1394 (2016).

[48] Szklarczyk, D. et al. STRING v11: Protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. (2019). doi:10.1093/nar/gky1131

[49] Bassotti, G. et al. Gastrointestinal motility disorders in inflammatory bowel diseases. World J. Gastroenterol. (2014). doi:10.3748/wjg.v20.i1.37

[50] Farrokhyar, F., Marshall, J. K., Easterbrook, B. & Irvine, E. J. Functional gastrointestinal disorders and mood disorders in patients with inactive inflammatory bowel disease: Prevalence and impact on health. Inflamm. Bowel Dis. (2006). doi:10.1097/01.MIB.0000195391.49762.89

[51] Lakatos, P. L. & Lakatos, L. Risk for colorectal cancer in ulcerative colitis: Changes, causes and management strategies. World J. Gastroenterol. (2008). doi:10.3748/wjg.14.3937

[52] Van Der Post, S. et al. Structural weakening of the colonic mucus barrier is an early event in ulcerative colitis pathogenesis. Gut (2019). doi:10.1136/gutjnl-2018-317571

[53] Rageul, J. et al. KLF4-dependent, PPARγ-induced expression of GPA33 in colon cancer cell lines. Int. J. Cancer (2009). doi:10.1002/ijc.24683

[54] Williams, B. B. et al. Glycoprotein A33 deficiency: A new mouse model of impaired intestinal epithelial barrier function and inflammatory disease. Dis. Model. Mech. (2015). doi:10.1242/dmm.019935

[55] Gobert, A. P. et al. Distinct immunomodulatory effects of spermine oxidase in colitis induced by epithelial injury or infection. Front. Immunol. (2018). doi:10.3389/fimmu.2018.01242

[56] Hu, T. et al. Spermine oxidase is upregulated and promotes tumor growth in hepatocellular carcinoma. Hepatol. Res. (2018). doi:10.1111/hepr.13206

[57] Sadlack, B. et al. Ulcerative colitis-like disease in mice with a disrupted interleukin-2 gene. Cell 75, 253–261 (1993).

[58] Hayden, M. S. & Ghosh, S. NF-κB in immunobiology. Cell Research (2011). doi:10.1038/cr.2011.13

[59] Liu, T., Zhang, L., Joo, D. & Sun, S. C. NF-κB signaling in inflammation. Signal Transduction and Targeted Therapy (2017). doi:10.1038/sigtrans.2017.23

[60] Borm, M. E. A., Van Bodegraven, A. A., Mulder, C. J. J., Kraal, G. & Bouma, G. A NFKB1 promoter polymorphism is involved in susceptibility to ulcerative colitis. Int. J. Immunogenet. (2005). doi:10.1111/j.1744-313X.2005.00546.x

[61] Wang, X. & Sun, Q. TP53 mutations, expression and interaction networks in human cancers. Oncotarget (2017). doi:10.18632/oncotarget.13483

[62] Millet, C. & Zhang, Y. E. Roles of Smad3 in TGF-β signaling during carcinogenesis. Critical Reviews in Eukaryotic Gene Expression (2007). doi:10.1615/CritRevEukarGeneExpr.v17.i4.30

[63] Tang, P. M.-K. et al. Smad3 promotes cancer progression by inhibiting E4BP4-mediated NK cell development. Nat. Commun. 8, 14677 (2017).

[64] Lu, S., Lee, J., Revelo, M., Wang, X. & Dong, Z. Smad3 is overexpressed in advanced human prostate cancer and necessary for progressive growth of prostate cancer cells in nude mice. Clin. Cancer Res. (2007). doi:10.1158/1078-0432.CCR-07-1078

[65] Johnson, I. R. D. et al. Endosomal gene expression: a new indicator for prostate cancer patient prognosis? Oncotarget 6, 37919–37929 (2015).

[66] Kanojia, D., Garg, M., Gupta, S., Gupta, A. & Suri, A. Sperm-Associated Antigen 9 Is a Novel Biomarker for Colorectal Cancer and Is Involved in Tumor Growth and Tumorigenicity. Am. J. Pathol. 178, 1009–1020 (2011).

[67] Garcia, H. et al. Facilitates Chromatin Transcription Complex Is an “Accelerator” of Tumor Transformation and Potential Marker and Target of Aggressive Cancers. Cell Rep. 4, 159–173 (2013).

[68] Umans, B. D., Battle, A. & Gilad, Y. Where Are the Disease-Associated eQTLs? Trends Genet. 1–16 (2020). doi:10.1016/j.tig.2020.08.009

[69] Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).

[70] Schäfer, J. & Strimmer, K. A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Stat. Appl. Genet. Mol. Biol. (2005). doi:10.2202/1544-6115.1175

[71] Saha, A. & Battle, A. False positives in trans-eQTL and co-expression analyses arising from RNA-sequencing alignment errors. F1000Research (2018). doi:10.12688/f1000research.17145.1
