Abstract
PURPOSE Tissue-agnostic biomarkers that capture commonalities in cancer biology may provide a new avenue for treatment development and optimization across cancer types. Here, we aimed to evaluate and validate, in pan-gastrointestinal (pan-GI) precancerous lesions and cancers, the clinical value of a tissue-agnostic cellular morphometrics biomarker (CMB) signature that was discovered by artificial intelligence (AI) from H&E-stained whole-slide images (WSI) of diagnostic slides of colon cancers.
METHODS We discovered CMBs from WSI using our well-established CMB-ML pipeline and established a CMB risk score (CMBRS) using multivariate regression models. Based on CMBRS, we assigned individual patients from The Cancer Genome Atlas Colon Adenocarcinoma cohort (TCGA-COAD) (n=430) to CMB risk groups (CMBRG). We then extensively evaluated the tissue-agnostic clinical value of the CMB signature, CMBRS and CMBRG in multiple cohorts covering different types of GI cancer (n=2,219) and in risk assessment of precancerous lesions (n=1,016). We characterized the biological function associated with each CMB using bulk RNA sequencing, single-cell RNA sequencing (scRNA-seq) and Opal multiplex immunohistochemistry (IHC).
RESULTS From the TCGA-COAD cohort, we developed a 13-CMB signature and constructed a CMBRS and CMBRG that predict the prognosis of colon cancer patients. Importantly, this 13-CMB signature showed prognostic and predictive value for TCGA patients with rectal, gastric and esophageal cancer, independent of traditional clinical factors. These findings were independently validated using multiple cohorts from Drum Tower Hospital. Moreover, the 13-CMB signature stratified patients with colon adenoma and early esophageal neoplastic lesions by risk of progression to cancer. In addition, we demonstrated and validated that gene signatures and CMB signatures have independent prognostic impact, and that integrating the CMB signature, gene signature and clinical factors significantly increases predictive power. Correlations between CMBs and gene expression levels revealed the association of each CMB with biological functions including cell proliferation, epithelial-to-mesenchymal transition and the immune microenvironment. The association of CMBs with the immune microenvironment was prospectively validated by scRNA-seq and further confirmed by Opal multiplex IHC staining in colon cancer.
CONCLUSION This study demonstrates the clinical value of a tissue-agnostic, AI-empowered CMB signature derived from WSI and linked to defined biological functions, which can be used in clinical settings to assess risk, diagnose disease, and guide clinical interventions. Tissue-agnostic CMBs potentially provide a new avenue for rapid, robust and cost-effective cross-cancer prediction, which is essential for developing common treatment strategies for multiple cancers.
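As an illustration of the scoring step described in METHODS, the following is a minimal sketch of how a regression-based risk score and risk groups could be computed. It assumes the score is a linear combination of CMB feature values weighted by regression coefficients and that patients are split at the cohort median; neither detail is stated in the abstract. The function names, feature names, coefficients and patient values are all hypothetical.

```python
from statistics import median


def cmb_risk_score(features, coefficients):
    """Linear risk score: sum of coefficient * feature value over the signature.

    Assumption: a linear predictor; the abstract only says
    "multivariate regression models".
    """
    missing = set(coefficients) - set(features)
    if missing:
        raise ValueError(f"missing CMB features: {sorted(missing)}")
    return sum(coefficients[name] * features[name] for name in coefficients)


def assign_risk_groups(scores, cutoff=None):
    """Label each patient 'high' or 'low' relative to a cutoff.

    Assumption: the cohort median is used when no cutoff is given.
    """
    if cutoff is None:
        cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical two-feature signature and three patients (illustrative only).
coefficients = {"CMB_1": 0.5, "CMB_2": -0.25}
patients = {
    "P1": {"CMB_1": 2.0, "CMB_2": 4.0},
    "P2": {"CMB_1": 4.0, "CMB_2": 0.0},
    "P3": {"CMB_1": 0.0, "CMB_2": 4.0},
}

scores = {pid: cmb_risk_score(f, coefficients) for pid, f in patients.items()}
groups = assign_risk_groups(scores)
```

In this sketch a cutoff derived from a training cohort could be passed explicitly through the `cutoff` argument when scoring an external cohort, so that group assignment does not depend on the validation data.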
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work was supported by the National Natural Science Foundation of China (ID Number: 82272952), the Natural Science Foundation of Jiangsu Province for Excellent Young Scholars (ID Number: BK20220094), the China Postdoctoral Science Foundation (ID Number: 2022M721579) and funds for Clinical Trials from the Affiliated Drum Tower Hospital, Medical School of Nanjing University (ID Number: 2021-LCYJ-PY-21).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The hospital validation study was approved by the Institutional Review Board (IRB) at the participating hospital and was independently carried out at Nanjing Drum Tower Hospital.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
Whole-slide images and clinical data of the TCGA cohorts were downloaded from the TCGA GDC portal (https://portal.gdc.cancer.gov/). Processed data related to the TCGA and Drum Tower cohorts have been provided with the manuscript. Raw data from Nanjing Drum Tower Hospital cannot currently be deposited in public repositories because the ethical and legal implications are still being discussed at the institutional level.
Abbreviations
- H&E: hematoxylin and eosin
- AI: artificial intelligence
- GI: gastrointestinal
- CMB: cellular morphometrics biomarker
- WSI: whole-slide images
- CMBRS: CMB risk score
- CMBRG: CMB risk group
- OS: overall survival
- scRNA-seq: single-cell RNA sequencing
- IHC: immunohistochemistry
- TCGA-COAD: The Cancer Genome Atlas - Colon Adenocarcinoma
- TCGA-STAD: The Cancer Genome Atlas - Stomach Adenocarcinoma
- TCGA-READ: The Cancer Genome Atlas - Rectum Adenocarcinoma
- TCGA-ESCA: The Cancer Genome Atlas - Esophageal Carcinoma
- LGIN: Low-Grade Intraepithelial Neoplasia
- HGIN: High-Grade Intraepithelial Neoplasia
- CAP: Colon Adenomatous Polyps
- EEL: Early Esophageal Lesion
- CRC: Colorectal Cancer
- DT: Drum Tower Hospital