Abstract
Heritable diseases often manifest in a highly tissue-specific manner, with different disease loci mediated by genes in distinct tissues or cell types. We propose Tissue-Gene Fine-Mapping (TGFM), a fine-mapping method that infers the posterior inclusion probability (PIP) that each gene-tissue pair mediates a disease locus by analyzing GWAS summary statistics (and in-sample LD) and leveraging eQTL data from diverse tissues to build cis-predicted expression models; TGFM also assigns PIPs to causal variants that are not mediated by gene expression in assayed genes and tissues. TGFM accounts for both co-regulation across genes and tissues and LD between SNPs (generalizing existing fine-mapping methods), and incorporates genome-wide estimates of each tissue's contribution to disease as tissue-level priors. TGFM was well-calibrated and moderately well-powered in simulations; unlike previous methods, TGFM attained correct calibration by modeling uncertainty in cis-predicted expression models. We applied TGFM to 45 UK Biobank diseases/traits (average N = 316K) using eQTL data from 38 GTEx tissues. TGFM identified an average of 147 causal genetic elements at PIP > 0.5 per disease/trait, of which 11% were gene-tissue pairs. Implicated gene-tissue pairs were concentrated in known disease-critical tissues, and causal genes were strongly enriched in disease-relevant gene sets. Causal gene-tissue pairs identified by TGFM recapitulated known biology (e.g., TPO-thyroid for Hypothyroidism), but also included biologically plausible novel findings (e.g., SLC20A2-artery aorta for Diastolic blood pressure). Further application of TGFM to single-cell eQTL data from 9 cell types in peripheral blood mononuclear cells (PBMC), analyzed jointly with GTEx tissues, identified 30 additional causal gene-PBMC cell type pairs at PIP > 0.5, primarily for autoimmune disease and blood cell traits, including the well-established role of CTLA4 in CD8+ T cells for All autoimmune disease.
In conclusion, TGFM is a robust and powerful method for fine-mapping causal tissues and genes at disease-associated loci.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This research was funded by National Institutes of Health (NIH) grants R01 MH101244, R37 MH107649, R01 HG006399, R01 MH115676, U01 HG012009, and F32 HG012889. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study used only openly available human data that were originally located at http://www.ukbiobank.ac.uk/ and https://gtexportal.org/home/datasets.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
A new supplementary figure (Supplementary Figure 18) and a Supplementary Note have been added to describe 6 additional examples of fine-mapped disease loci.
Data Availability
We have made the following resources publicly available: TGFM PIPs for gene-tissue pairs, gene-PBMC cell type pairs, genes, and non-mediated variants across 45 diseases/traits (for both the analysis of 38 GTEx tissues and the joint analysis of 38 GTEx tissues and 9 PBMC cell types) at https://doi.org/10.7910/DVN/S26PFI; GTEx cis-predicted expression models for all gene-tissue pairs at https://doi.org/10.7910/DVN/8IPOPK; PBMC cis-predicted expression models for all gene-PBMC cell type pairs at https://doi.org/10.7910/DVN/A6K9QW; and GWAS summary statistics for all 45 diseases/traits at https://doi.org/10.7910/DVN/GTEGPE. To limit the use of computational resources, we refer the reader to the UK Biobank in-sample LD (337K unrelated British-ancestry samples) from ref. 36, which is publicly available at https://registry.opendata.aws/ukbb-ld/. The UK Biobank resource is publicly available via application (http://www.ukbiobank.ac.uk/).
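As an illustration of how released PIPs might be summarized at the PIP > 0.5 threshold used in the abstract, the following is a minimal Python sketch. It does not reflect the actual layout of the deposited files: the record type `FineMappedElement`, its field names, the class labels, and the toy PIP values are all hypothetical placeholders, and real files would first need to be parsed into this shape.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Tuple


class FineMappedElement(NamedTuple):
    """Hypothetical record for one fine-mapped genetic element (not the released schema)."""
    trait: str          # disease/trait label
    element: str        # element identifier, e.g. a gene-tissue pair or a variant
    element_class: str  # assumed labels: "gene_tissue", "gene_celltype", "variant"
    pip: float          # posterior inclusion probability


def summarize_pips(
    records: Iterable[FineMappedElement], threshold: float = 0.5
) -> Tuple[int, Counter, Dict[str, float]]:
    """Count elements with PIP strictly above `threshold`, overall and per class."""
    selected = [r for r in records if r.pip > threshold]
    counts = Counter(r.element_class for r in selected)
    total = len(selected)
    # Fraction of selected elements in each class (empty if nothing passes).
    fractions = {cls: n / total for cls, n in counts.items()} if total else {}
    return total, counts, fractions


# Toy records with made-up identifiers and PIPs, for illustration only.
toy_records = [
    FineMappedElement("trait_1", "GENE_A|tissue_1", "gene_tissue", 0.90),
    FineMappedElement("trait_1", "variant_1", "variant", 0.75),
    FineMappedElement("trait_1", "variant_2", "variant", 0.60),
    FineMappedElement("trait_1", "GENE_B|tissue_2", "gene_tissue", 0.30),
    FineMappedElement("trait_1", "variant_3", "variant", 0.50),  # not strictly > 0.5
]

total, counts, fractions = summarize_pips(toy_records)
print(total, dict(counts), fractions)
```

Applied per disease/trait and averaged, a summary of this kind corresponds to the quantities reported in the abstract (the average number of PIP > 0.5 elements and the share that are gene-tissue pairs), though the exact counting rules used there are not specified here.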