ABSTRACT
Genome-wide association studies (GWAS) have identified numerous genetic loci associated with breast and prostate cancer risk, suggesting that germline genetic dysregulation influences tumorigenesis. However, the biological function underlying many genetic associations is not well understood. Previous efforts to annotate loci have focused on protein-coding genes (pcGenes) and largely ignored non-coding RNAs (ncRNAs), which account for most transcriptional output in human cells and can regulate transcription of both pcGenes and other ncRNAs. Though the biological roles of most ncRNAs are not well defined, many ncRNAs are involved in cancer development. Here, we explore one regulatory hypothesis: ncRNAs as trans-acting mediators of gene expression regulation in non-cancerous and tumor breast and prostate tissue. Using germline genetics as a causal anchor, we categorize distal (>1 Megabase) expression quantitative trait loci (eQTLs) of pcGenes that are significantly mediated by local eQTLs of ncRNAs (within 1 Megabase). We find over 300 mediating ncRNAs and show that the linked pcGenes are enriched for immunoregulatory and cellular organization pathways. By integrating eQTL and cancer GWAS results through colocalization and genetically regulated expression analyses, we detect overlapping signals in nine known breast cancer loci and one known prostate cancer locus, as well as multiple novel genetic associations. Our results suggest a strong transcriptional impact of ncRNAs in breast and prostate tissue, with implications for cancer etiology. More broadly, our framework can be systematically applied to functional genomic features to characterize genetic variants that distally regulate transcription through trans-mechanisms.
SIGNIFICANCE This study identifies non-coding RNAs that potentially regulate gene expression in trans-pathways and overlap with genetic signals for breast and prostate cancer susceptibility, with implications for interpretation of cancer genome-wide association studies.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
TS was supported by the National Institute of Neurological Disorders and Stroke of the National Institutes of Health under Award Number T32NS048004. This research was supported by the National Institute of Mental Health of the National Institutes of Health under Award Number 5R01MH115676-04. AG was partially supported by R01 CA227237. SL was partially supported by NIH award R01 CA194393. BP was partially supported by NIH awards R01 HG009120, R01 MH115676, R01 CA251555, R01 AI153827, R01 HG006399, R01 CA244670, and U01 HG011715. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study was approved by the Office of Human Research Ethics at the University of California, Los Angeles, and written informed consent was obtained from each participant.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Tighten language
Data Availability
GTEx v8 genotype, expression, and covariate data were obtained through dbGaP Study Accession phs000424.v8.p2. TCGA genotype data were obtained through dbGaP Study Accession phs000178.v11.p8, and expression and covariate data were obtained from the Broad GDAC Firehose repository (https://gdac.broadinstitute.org). Prostate cancer risk summary statistics were obtained from the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) Consortium: http://practical.icr.ac.uk/blog/wp-content/uploads/uploadedfiles/oncoarray/MetaSummaryData/meta_v3_onco_euro_overall_ChrAll_1_release.zip. Breast cancer risk summary statistics were obtained from the Breast Cancer Association Consortium (BCAC): https://bcac.ccge.medschl.cam.ac.uk/bcacdata/oncoarray/oncoarray-and-combined-summary-result/gwas-summary-associations-breast-cancer-risk-2020/. Sample code for this analysis is available at https://github.com/ColetheStatistician/ncRNAInBreastCancer/.
https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000424.v8.p2
https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000178.v11.p8