ABSTRACT
Background and Aims The microbiome has long been suspected of playing a role in colorectal cancer (CRC) tumorigenesis. The mutational signature SBS88 mechanistically links CRC development with strains of Escherichia coli that harbor the pks island and produce the genotoxin colibactin, but the genomic, pathological and survival characteristics associated with SBS88-positive tumors are unknown.
Methods SBS88-positive CRCs were identified in targeted sequencing data for 5,292 CRCs from 17 studies and tested for association with clinico-pathological features, oncogenic pathways, genomic characteristics and survival.
Results In total, 7.5% (398/5,292) of the CRCs were SBS88-positive, of which 98.7% (392/398) were microsatellite stable/microsatellite instability low (MSS/MSI-L), compared with 80% (3,916/4,894) of SBS88-negative tumors (p=1.5×10⁻²⁸). Analysis of MSS/MSI-L CRCs demonstrated that SBS88-positive CRCs were associated with the distal colon (OR=1.84, 95% CI=1.40-2.42, p=1×10⁻⁵) and rectum (OR=1.90, 95% CI=1.44-2.51, p=6×10⁻⁶) tumor sites compared with the proximal colon. The top seven recurrent somatic mutations associated with SBS88-positive CRCs demonstrated mutational contexts associated with colibactin-induced DNA damage, the strongest of which was the APC:c.835-8A>G mutation (OR=65.5, 95% CI=39.0-110.0, p=3×10⁻⁸⁰). Large copy number alterations (CNAs), including loss on 14q and gains on 13q, 16q and 20p, were significantly enriched in SBS88-positive CRCs. SBS88-positive CRCs were associated with better CRC-specific survival (hazard ratio=0.69, 95% CI=0.52-0.90, p=0.007) when stratified by age, sex, study and stage.
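As a reading aid for the ratio estimates above, the following minimal sketch back-calculates a standard error, z statistic and two-sided p-value from a reported estimate and its 95% confidence interval. It assumes a Wald-type interval that is symmetric on the log scale, which is not necessarily the test the authors used; the function name `wald_check` and the `reported` dictionary are hypothetical and exist only for illustration.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_check(estimate, lower, upper, z_crit=Z_95):
    """Back-calculate (SE, z, two-sided p) from a ratio estimate and its CI.

    Assumes the interval is exp(log(estimate) +/- z_crit * SE), i.e. a
    Wald interval on the log scale. This is an assumption, not a method
    stated in the abstract.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / se
    # erfc(|z| / sqrt(2)) equals the two-sided tail area of a standard normal
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Estimates and 95% CIs copied from the Results paragraph above.
reported = {
    "distal colon vs proximal (OR)": (1.84, 1.40, 2.42),
    "rectum vs proximal (OR)": (1.90, 1.44, 2.51),
    "CRC-specific survival (HR)": (0.69, 0.52, 0.90),
}

for label, (est, lo, hi) in reported.items():
    se, z, p = wald_check(est, lo, hi)
    print(f"{label}: SE(log)={se:.3f}, z={z:.2f}, p={p:.2g}")
```

Under this assumption the back-calculated p-values fall in the same range as those reported for the tumor-site and survival associations; small differences are expected because the published estimates and interval limits are rounded.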
Conclusion SBS88-positivity, a biomarker of colibactin-induced DNA damage, can identify a novel subtype of CRC characterized by recurrent somatic mutations, copy number alterations and better survival. These findings provide new insights for treatment and prevention strategies for this subtype of CRC.
Competing Interest Statement
Dr. Marios Giannakis received research funding from Servier and Janssen, unrelated to this study. Dr. Stephen B. Gruber co-founded Brogent International LLC, unrelated to this study. Dr. Jonathan A. Nowak received research support from Akoya Biosciences, Illumina, and NanoString, unrelated to this study. Dr. Rish K. Pai received consultant income from Alimentiv Inc., Allergan, Eli Lilly, and AbbVie, unrelated to this study. Dr. Robert E. Schoen received research support from Freenome, Immunovia, and Exact Sciences, unrelated to this study. All other authors declare no competing interests.
Funding Statement
Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO): National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services (U01 CA137088, U01 CA164930, R21 CA191312, R01 CA176272). Genotyping/Sequencing services were provided by the Center for Inherited Disease Research (CIDR) contract numbers HHSN268201700006I and HHSN268201200008I. This research was funded in part through the NIH/NCI Cancer Center Support Grant P30 CA015704. Scientific Computing Infrastructure at Fred Hutch funded by ORIP grant S10OD028685. Dr. Peter Georgeson was supported by a Cancer Council Victoria research grant.
CORSA: The CORSA study was funded by Austrian Research Funding Agency (FFG) BRIDGE (grant 829675, to Andrea Gsur), the "Herzfelder'sche Familienstiftung" (grant to Andrea Gsur) and was supported by COST Action BM1206.
CPS-II: The American Cancer Society funds the creation, maintenance, and updating of the Cancer Prevention Study-II (CPS-II) cohort. The study protocol was approved by the institutional review boards of Emory University, and those of participating registries as required.
CRA: This work was supported by National Institutes of Health grant R01 CA68535.
CRCGEN: Colorectal Cancer Genetics & Genomics, Spanish study was supported by Instituto de Salud Carlos III, co-funded by FEDER funds -a way to build Europe- (grants PI14-613 and PI09-1286), Agency for Management of University and Research Grants (AGAUR) of the Catalan Government (grant 2017SGR723), and Junta de Castilla y León (grant LE22A10-2). Sample collection of this work was supported by the Xarxa de Bancs de Tumors de Catalunya sponsored by Pla Director d'Oncología de Catalunya (XBTC), Plataforma Biobancos PT13/0010/0013 and ICOBIOBANC, sponsored by the Catalan Institute of Oncology.
DACHS: This work was supported by the German Research Council (BR 1704/6-1, BR 1704/6-3, BR 1704/6-4, CH 117/1-1, HO 5117/2-1, HE 5998/2-1, KL 2354/3-1, RO 2270/8-1 and BR 1704/17-1), the Interdisciplinary Research Program of the National Center for Tumor Diseases (NCT), Germany, and the German Federal Ministry of Education and Research (01KH0404, 01ER0814, 01ER0815, 01ER1505A and 01ER1505B).
HCCS: This work was supported by the National Institutes of Health (grant numbers R01 CA155101, U01 HG004726, R01 CA140561, T32 ES013678, U19 CA148107, P30 CA014089).
Harvard cohorts (HPFS, NHS): HPFS is supported by the National Institutes of Health (P01 CA055075, UM1 CA167552, U01 CA167552, R01 CA137178, R01 CA151993, and R35 CA197735), and NHS by the National Institutes of Health (R01 CA137178, P01 CA087969, UM1 CA186107, R01 CA151993, and R35 CA197735). S. Ogino was supported in part by the Cancer Research UK Grand Challenge Award (C10674/A27140).
IWHS: This study was supported by NIH grants CA107333 (R01 grant awarded to P.J. Limburg) and HHSN261201000032C (N01 contract awarded to the University of Iowa).
MCCS: Cohort recruitment was funded by VicHealth and Cancer Council Victoria. The MCCS was further supported by Australian NHMRC grants 509348, 209057, 251553 and 504711 and by infrastructure provided by Cancer Council Victoria. Cases and their vital status were ascertained through the Victorian Cancer Registry (VCR) and the Australian Institute of Health and Welfare (AIHW), including the National Death Index and the Australian Cancer Database.
PLCO: Intramural Research Program of the Division of Cancer Epidemiology and Genetics and supported by contracts from the Division of Cancer Prevention, National Cancer Institute, NIH, DHHS. Funding was provided by National Institutes of Health (NIH), Genes, Environment and Health Initiative (GEI) Z01 CP 010200, NIH U01 HG004446, and NIH GEI U01 HG 004438.
SFCCR: The Seattle site of the Colon CFR Cohort (www.coloncfr.org) is supported in part by the National Cancer Institute (NCI) of the National Institutes of Health (NIH) Award U01 CA167551. Additional support for the SFCCR and the SFCCR Illumina HumanCytoSNP array were through NCI/NIH awards U01 CA074794 (to JDP) and U24 CA074794 and R01 CA076366 (to PAN). Support for case ascertainment was provided from the Surveillance, Epidemiology and End Results (SEER) Program of the NCI. The content of this manuscript does not necessarily reflect the views or policies of the NIH or SFCCR, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government, the SEER Program, or the CCFR.
WHI: The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services through contracts HHSN268201100046C, HHSN268201100001C, HHSN268201100002C, HHSN268201100003C, HHSN268201100004C, and HHSN271201100004C.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study used openly available human data. The panel-sequenced data used in this study are available at the database of Genotypes and Phenotypes (dbGaP). The Ontario Institute of Cancer Research (OICR) data is available under accession code phs002050.v1.p1. The Center for Inherited Disease Research (CIDR) data is available under accession code phs001905.v1.p1. Mutational signature definitions were downloaded from the COSMIC website at https://cancer.sanger.ac.uk/signatures/downloads/.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
# joint senior authors
Funding statement updated.
Data Availability
The panel-sequenced data used in this study are available at the database of Genotypes and Phenotypes (dbGaP). The Ontario Institute of Cancer Research (OICR) data is available under accession code phs002050.v1.p1. The Center for Inherited Disease Research (CIDR) data is available under accession code phs001905.v1.p1. Mutational signature definitions were downloaded from the COSMIC website at https://cancer.sanger.ac.uk/signatures/downloads/.
https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs002050.v1.p1
https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001905.v1.p1