Abstract
The combinatorial effect of genetic variants is often assumed to be additive. Although genetic variation can clearly interact non-additively, methods to uncover epistatic relationships remain in their infancy. We develop low-signal signed iterative random forests to elucidate the complex genetic architecture of cardiac hypertrophy. We derive deep learning-based estimates of left ventricular mass from the cardiac MRI scans of 29,661 individuals enrolled in the UK Biobank. We report epistatic genetic variation including variants close to CCDC141, IGF1R, TTN, and TNKS. We also identify several loci whose variants were deemed insignificant in univariate genome-wide association analyses. Functional genomic and integrative enrichment analyses reveal a complex gene regulatory network in which genes mapped from these loci share biological processes and myogenic regulatory factors. Through a network analysis of transcriptomic data from 313 explanted human hearts, we find strong gene co-expression correlations between these statistical epistasis contributors in healthy hearts and a significant connectivity decrease in failing hearts. We assess causality of epistatic effects via RNA silencing of gene-gene interactions in human induced pluripotent stem cell-derived cardiomyocytes. Finally, single-cell morphology analysis using a novel high-throughput microfluidic system shows that cardiomyocyte hypertrophy is non-additively modifiable by specific pairwise interactions between CCDC141 and both TTN and IGF1R. Our results expand the scope of genetic regulation of cardiac structure to epistasis.
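To make the notion of non-additivity concrete, the following is a minimal sketch of the standard interaction contrast for a two-gene silencing design. It is not the lo-siRF method and not the authors' analysis code; the function name, variable names, and all numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean


def interaction_contrast(control, kd_a, kd_b, kd_ab):
    """Deviation from additivity in a 2x2 knockdown design.

    Each argument is a list of measurements (for example, cell size)
    under one condition. Under a purely additive model, the effect of the
    double knockdown equals the sum of the two single-knockdown effects,
    so the returned contrast is zero.
    """
    effect_a = mean(kd_a) - mean(control)
    effect_b = mean(kd_b) - mean(control)
    effect_ab = mean(kd_ab) - mean(control)
    return effect_ab - (effect_a + effect_b)


# Hypothetical measurements, for illustration only (not study data).
control = [100.0, 102.0, 98.0]
kd_a = [90.0, 92.0, 88.0]          # silencing gene A alone
kd_b = [95.0, 97.0, 93.0]          # silencing gene B alone
kd_additive = [85.0, 87.0, 83.0]   # double knockdown, additive case
kd_epistatic = [70.0, 72.0, 68.0]  # double knockdown, non-additive case

additive_contrast = interaction_contrast(control, kd_a, kd_b, kd_additive)
epistatic_contrast = interaction_contrast(control, kd_a, kd_b, kd_epistatic)
```

A contrast near zero is consistent with additive effects, whereas a contrast far from zero (relative to measurement noise) suggests an interaction. In practice the contrast would be accompanied by a formal statistical test rather than read off directly.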
Competing Interest Statement
E.A.A. is a Founder of Personalis, Deepcell, Svexa, RCD Co, and Parameter Health; Advisor to Oxford Nanopore, SequenceBio, and Pacific Biosciences; and a non-executive director for AstraZeneca. C.S.W. is a consultant for Tensixteen Bio and Renovacor. V.N.P. is an SAB member for and receives research support from BioMarin, Inc, and is a consultant for Constantiam, Inc. and viz.ai. The remaining authors declare no competing interests.
Funding Statement
This work was supported by the Chan Zuckerberg Biohub - San Francisco through the Intercampus Research Awards (2019 - 2022) to R.A., J.R.P., J.B.B., A.J.B., E.A.A., and B.Y. E.A.A. received funding from the National Institutes of Health (NIH) through grant number 1R01HL144843. B.Y. received support from the National Science Foundation (NSF) through grants DMS-1613002 and IIS-1741340, NIH grant R01GM152718, and a Weill Neurohub grant. V.N.P. received funding through grant number K08HL143185. T.M.T. was supported by the NSF Graduate Research Fellowship Program through grant DGE-2146752. Q.W. received funding from an American Heart Association Postdoctoral Fellowship through grant number 23POST1023278. C.S.W. received support from the NIH through grants F32HL160067 and L30HL159413.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The use of human subjects (IRB - 4237) and human-derived induced pluripotent stem cells (SCRO - 568) in this study has been approved by the Stanford Research Compliance Office. The UK Biobank received ethical approval from the North West - Haydock Research Ethics Committee (21/NW/0157).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
We have made the following key changes to improve our study from both computational and experimental perspectives:
● A broader comparison with various commonly used machine learning methods, polygenic risk scores, regression-based pairwise interaction scans, MAPIT, and marginal gene-based methods (e.g., SKAT-O and MAGMA), all of which consistently validate the superior prediction accuracy and prioritizations of lo-siRF.
● Further analyses that underscore the necessity and importance of the binarization step in lo-siRF.
● An additional non-hypertensive cohort analysis, which reveals weak pleiotropic effects of the three identified epistatic interactions on hypertension, strengthening the evidence for epistasis.
● New results from additional co-expression network analyses, which reinforce our findings of a strong molecular connectivity between statistical epistasis contributors compared to random gene pairs in healthy hearts and a significantly weakened connectivity in failing hearts at the transcriptomic level.
● Additional assessment of the non-additive (epistatic) effects in the gene-silencing experiments, which confirms the consistency between predicted and biologically observed epistases and demonstrates the existence, strength, and directions of epistatic effects.
● Expanded evaluation of gene knockdown effects on cell size on different scales, which confirms a stable epistatic effect across different choices of scale and statistical evaluation method.
● A new simulation study, which confirms that variations in gene-silencing efficiencies across experimental batches are unlikely to produce spurious epistasis signals, particularly in the signal-to-noise ratio regimes relevant to this study.
These additional results and an extensive justification of our modeling have been incorporated into the revised manuscript and into the interactive HTML webpage available at: https://yu-group.github.io/epistasis-cardiac-hypertrophy/
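The choice of measurement scale mentioned in the footnote above matters because an interaction on one scale can vanish on another. The sketch below, which uses hypothetical condition means rather than study data and is not the authors' evaluation procedure, shows that purely multiplicative knockdown effects yield a non-zero interaction contrast on the raw scale but a zero contrast on the log scale. The helper name `contrast` and its parameters are assumptions made for illustration.

```python
import math


def contrast(y0, ya, yb, yab, transform=lambda v: v):
    """Interaction contrast y_ab - y_a - y_b + y_0 after applying `transform`."""
    t0, ta, tb, tab = (transform(v) for v in (y0, ya, yb, yab))
    return tab - ta - tb + t0


# Hypothetical condition means (not study data): each knockdown scales
# cell size by a fixed factor, and the double knockdown applies both factors.
y0 = 100.0               # control
ya = y0 * 0.8            # knockdown of gene A
yb = y0 * 0.9            # knockdown of gene B
yab = y0 * 0.8 * 0.9     # double knockdown, purely multiplicative

raw_contrast = contrast(y0, ya, yb, yab)                      # raw scale
log_contrast = contrast(y0, ya, yb, yab, transform=math.log)  # log scale
```

Here the raw-scale contrast is about 2 while the log-scale contrast is about 0, so an apparent interaction on one scale is an artifact of the scale. An epistatic effect that persists across such transformations is therefore more convincing than one seen on a single scale.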
Data Availability
All genotype and cardiac MRI data used as input to the lo-siRF pipeline are available from the UK Biobank (https://www.ukbiobank.ac.uk/). This work was conducted under UK Biobank application 22282. SNVs filtered by GWAS using PLINK and BOLT-LMM are summarized in Extended Data 2. Data for the gene co-expression networks from 313 explanted human hearts are available at https://doi.org/10.5281/zenodo.2600420. All other data produced in this study are available upon reasonable request to the authors.