Abstract
Background Efforts to address the poor prognosis associated with esophageal adenocarcinoma (EAC) have been hampered by a lack of both biomarkers to identify early disease and therapeutic targets. Despite extensive efforts over the past decade to understand the somatic mutations associated with EAC, it remains unclear how the atlas of genomic aberrations in this cancer impacts the proteome. Differences between transcript abundance and the corresponding protein abundance remain under-explored, limiting our understanding of the mechanisms underlying the disease.
Methods We performed a quantitative proteomic analysis of 23 EACs and matched adjacent normal esophageal and gastric tissues. We explored the correlation of transcript and protein abundance using tissue-matched RNA-seq and proteomic data from 7 patients, and further integrated these data with a cohort of EAC RNA-seq data (n=264 patients), whole-genome sequencing data (n=454 patients) and external published datasets.
Results We quantified protein expression from 5897 genes in EAC and patient-matched normal tissues. Several biomarker candidates with EAC-specific expression were identified, including the transmembrane protein GPA33. We further verified the EAC-enriched expression of GPA33 in an external cohort of 115 patients and confirmed it as an attractive diagnostic and therapeutic target. To extend the insights gained from our proteomic data, we performed an integrated analysis of protein and RNA expression in EAC and normal tissues, which revealed several genes with poorly correlated protein and RNA abundance, suggesting post-transcriptional regulation of protein expression. These outlier genes, including SLC25A30, TAOK2, and AGMAT, only rarely demonstrated somatic mutation, suggesting post-transcriptional drivers for this EAC-specific phenotype. AGMAT was over-expressed at the protein level in EAC compared with adjacent normal tissues, and we propose an EAC-specific post-transcriptional mechanism regulating its protein expression.
Conclusions By quantitative proteomic analysis, we have identified GPA33 as an EAC-specific biomarker. Integrated analysis of the proteome, transcriptome, and genome in EAC has revealed several genes with tumor-specific post-transcriptional regulation of protein expression, which may be an exploitable vulnerability.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
OCCAMS was funded by a Programme Grant from Cancer Research UK (RG66287). SA is supported by the Deanship of Scientific Research, The Hashemite University: grant nos. 785/48/2022 and 738/54/2022. BV and LH were supported by the European Regional Development Fund - Project ENOCH (CZ.02.1.01/0.0/0.0/16_019/0000868) and the Ministry of Health, Czech Republic - conceptual development of research organization (MMCI, 00209805). The study was supported by the project: International Center for Cancer Vaccine Science, which is carried out within the International Agendas Programme of the Foundation for Polish Science co-financed by the European Union under the European Regional Development Fund. RON received support from the CRUK Cambridge Centre Thoracic Cancer Programme (CTRQQR-2021\100012).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Institutional Review Board Approval. All patients gave prospective written informed consent to the use of tissue, clinical data, and publication, with ethical and research governance approvals in place from the Lothian Local Research Ethics Committee (UK, REC reference 06/S1101/16), Tayside Committee on Medical Research Ethics (UK, REC 10/S1402/33), Cambridgeshire 4 Research Ethics Committee (UK, REC 10/H0305/1) and the NHS Lothian Research and Development Office (UK, R&D ID 2006/W/PA/01, R&D ID 2011/W/ON.27). Patients were de-identified at the time of consent to the use of tissue and clinical data, and therefore no patient-identifiable data have been included. Consent to publication of results arising from the use of tissue and data was prospectively obtained.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
* Co-first authors contributed equally and can prioritize their names when adding this paper’s reference to résumés.
Data Availability
All data produced in the present study are available upon reasonable request to the authors.
List of abbreviations
- EAC: Esophageal adenocarcinoma
- TMT: Tandem Mass Tag
- RT: Room temperature
- MS: Mass spectrometry
- AGC: Automatic gain control
- FDR: False discovery rate
- TvE: Tumor vs normal esophagus
- TvG: Tumor vs normal gastric tissue
- TMM: Trimmed mean of M-values
- IHC: Immunohistochemistry
- TMAs: Tissue microarrays
- TPM: Transcripts per million
- GTEx consortium: Genotype-Tissue Expression consortium dataset
- ICGC ESAD: International Cancer Genome Consortium Esophageal Adenocarcinoma cancer genome project
- hg38: Human reference genome version 38
- GATK: Genome Analysis Toolkit
- ITLN1: Intelectin-1
- VIL1: Villin
- RBM: RNA binding motif
- IGF2BP1: Insulin-like growth factor 2 mRNA-binding protein 1
- GPA33: Glycoprotein A33