Abstract
Genome-wide association studies have unearthed a wealth of genetic associations across many complex diseases. However, translating these associations into biological mechanisms contributing to disease etiology and heterogeneity has been challenging. Here, we hypothesize that the effects of disease-associated genetic variants converge onto distinct cell-type-specific molecular pathways within distinct subgroups of patients. To test this hypothesis, we develop the CASTom-iGEx pipeline to operationalize individual-level genotype data to interpret personal polygenic risk and identify the genetic basis of clinical heterogeneity. The paradigmatic application of this approach to coronary artery disease and schizophrenia reveals a convergence of disease-associated variant effects onto known and novel genes, pathways, and biological processes. The biological-process-specific genetic liabilities are not equally distributed across patients. Instead, they define genetically distinct groups of patients, characterized by different profiles across pathways, endophenotypes, and disease severity. These results provide further evidence for a genetic contribution to clinical heterogeneity and point to the existence of partially distinct pathomechanisms across patient subgroups. Thus, the universally applicable approach presented here has the potential to constitute an important component of future personalized medicine concepts.
Competing Interest Statement
F.I. receives funding from Open Targets, a public-private initiative involving academia and industry, and performs consultancy for the joint AstraZeneca-CRUK functional genomics centre and for Mosaic Therapeutics. TFMA is a salaried employee of Boehringer Ingelheim Pharma outside the submitted work.
Funding Statement
This work was supported by grants from the BMBF eMed program grant 01ZX1504 to MZ, the Max-Planck-Society and BMBF eMed program grant 01ZX1706 to MZ, HS, and JG. TGS and PF are supported by the Deutsche Forschungsgemeinschaft (German Research Foundation; DFG) within the framework of the projects http://www.kfo241.de and http://www.PsyCourse.de (SCHU 1603/4-1, 5-1, 7-1; FA241/16-1).
TGS received additional support from the German Federal Ministry of Education and Research (BMBF) within the framework of the BipoLife network (01EE1404H), IntegraMent (01ZX1614K), e:Med Program (01ZX1614K) and the Dr. Lisa Oehler Foundation (Kassel, Germany). TGS was further supported by the grants GWPI-BIOPSY (01EW 2005) and MulioBio (01EW 2009) from ERA-NET Neuron (BMBF). UH was supported by the European Union Horizon 2020 Research and Innovation Programme (PSY-PGx, grant agreement No 945151). SP received support from a NARSAD Young Investigator Grant.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study uses only existing genetic/phenotypic data from repositories that can be accessed in the following manner: The UKBB data are privacy protected and access can be requested through the UKBB data portal. The GTEx data are available through dbGaP accession number phs000424.v7.p2. The PGC data are privacy protected and can be accessed through a secondary analysis proposal, sponsored by a PI member of the PGC-SCZ working group, that needs to be approved by the working group. The data from the German cohorts of the CARDIoGRAM consortium are privacy protected and can only be accessed through collaboration with PIs of the consortium, e.g. HS. The PsyCourse Study data are privacy protected but can be accessed by submitting a research proposal (see http://www.psycourse.de/openscience-en.html). The genotype and gene expression data from the CommonMind consortium are privacy protected and can be accessed via the CommonMind knowledge portal: http://dx.doi.org/10.7303/syn2759792. The SHIP-Trend study genotype data are privacy protected and can be accessed through the study PIs: https://www.maelstrom-research.org/study/ship. The PsyCourse data are privacy protected and can be accessed via an analysis request proposal through the study website: http://psycourse.de/openscience-de.html.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The software pipeline is based on R and is available at https://gitlab.mpcdf.mpg.de/luciat/castom-igex. The tissue-specific PriLer models trained on the GTEx v6p and CMC release 1 reference panels are available at https://doi.org/10.6084/m9.figshare.22347574.v2. TWAS and PALAS summary statistics for CAD and SCZ can be found at https://doi.org/10.6084/m9.figshare.22495561.v1.