Summary
Background Valid stratification factors for patients with epithelial ovarian cancer (EOC) are still lacking, and individualisation of care remains an unmet need. Radiomics from routine contrast-enhanced computed tomography (CE-CT) is an emerging and promising approach towards more accurate prognostic models for better preoperative stratification of the subset of patients with high-grade serous ovarian cancer (HGSOC). However, its requirement for fine manual segmentation limits its use. To enable broader implementation, we developed an end-to-end model that automates both segmentation and prognostic evaluation in HGSOC.
Methods We retrospectively collected and segmented 607 CE-CT scans from Europe and the United States. The development cohort comprised patients from Hammersmith Hospital (HH) (n=211) and was split 7:3 into training and validation sets. Data from The Cancer Imaging Archive (TCIA) (United States, n=73) and Kliniken Essen-Mitte (KEM) (Germany, n=323) were used as test sets. We developed an automated segmentation model for primary ovarian cancer lesions in CE-CT scans using U-Net-based architectures. Radiomics data were computed from the CE-CT scans. For overall survival (OS) prediction, combinations of 13 feature reduction methods and 12 machine learning algorithms were developed on the radiomics data and compared with convolutional neural network models trained on CE-CT scans. In addition, we compared our model with a published radiomics model for HGSOC prognosis, the radiomics prognostic vector. In the HH and TCIA cohorts, additional histological diagnosis, transcriptomics, proteomics, and copy number alteration data were collected, and correlations with the best performing OS model were identified. Predicted probabilities of the best performing OS model were dichotomised using k-means clustering to define high- and low-risk groups.
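The dichotomisation step described in the Methods can be illustrated with a minimal sketch. It assumes that the model output is a single risk score per patient and that a plain two-cluster k-means on that one-dimensional score is sufficient. The function name `dichotomise_kmeans` and the example scores are hypothetical and are not taken from the study's code.

```python
def dichotomise_kmeans(scores, max_iter=100):
    """Split 1-D risk scores into low (0) and high (1) groups with 2-means.

    Illustrative only: the study's actual clustering settings are not
    given in this summary.
    """
    if len(set(scores)) < 2:
        raise ValueError("need at least two distinct scores")
    # Initialise the two centroids at the extremes of the score range.
    lo, hi = min(scores), max(scores)
    labels = []
    for _ in range(max_iter):
        # Assign each score to the nearer centroid (ties go to the low group).
        labels = [1 if abs(s - hi) < abs(s - lo) else 0 for s in scores]
        low = [s for s, lab in zip(scores, labels) if lab == 0]
        high = [s for s, lab in zip(scores, labels) if lab == 1]
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if new_lo == lo and new_hi == hi:
            break  # centroids stopped moving: converged
        lo, hi = new_lo, new_hi
    # The midpoint between centroids acts as the risk cut-off.
    return labels, (lo + hi) / 2


# Hypothetical predicted probabilities for five patients.
example_scores = [0.10, 0.15, 0.20, 0.80, 0.90]
example_labels, example_cutoff = dichotomise_kmeans(example_scores)
```

With two clusters in one dimension, the result is equivalent to choosing a single cut-off on the score, which can then be applied unchanged to the external test sets.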
Findings Using the combination of segmentation and radiomics as an end-to-end framework, the prognostic model improved risk stratification of HGSOC over CA-125, residual disease, FIGO staging and the previously reported radiomics prognostic vector. Comparing predicted with manual segmentations, our automated segmentation model achieved Dice scores of 0.90, 0.88, and 0.80 for the HH validation, TCIA test and KEM test sets, respectively. The top performing radiomics model of OS achieved a concordance index (C-index) of 0.66 ± 0.06 (HH validation), 0.72 ± 0.05 (TCIA), and 0.60 ± 0.01 (KEM). In a multivariable model combining this radiomics model with age, residual disease, and stage, the C-index values were 0.71 ± 0.06, 0.73 ± 0.06, and 0.73 ± 0.03 for the HH validation, TCIA and KEM datasets, respectively. High-risk groups were associated with poor prognosis (OS): the hazard ratios (CI) were 4.81 (1.61-14.35), 6.34 (2.08-19.34), and 1.71 (1.10-2.65) after adjusting for stage, age, performance status and residual disease. We show that these risk groups are associated with an invasive phenotype involving soluble N-ethylmaleimide sensitive fusion protein attachment receptor (SNARE) interactions in vesicular transport and activation of mitogen-activated protein kinase (MAPK) pathways.
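For readers less familiar with the two metrics reported in the Findings, the following is a minimal sketch of how a Dice score and Harrell's C-index are commonly computed. It assumes flattened binary masks and right-censored survival data, and it ignores pairs with tied survival times for simplicity. The function names and example values are hypothetical and do not reproduce the study's evaluation code.

```python
def dice_score(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if censored
    risks  : predicted risk (higher means shorter expected survival)
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy inputs.
example_dice = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
example_c = concordance_index([5, 10, 15], [1, 1, 0], [0.9, 0.5, 0.1])
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the reported values of 0.60 to 0.73.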
Funding This article represents independent research funded by 1) the Medical Research Council (#2290879), 2) the Imperial STRATiGRAD PhD programme, 3) CRUK Clinical PhD Grant C309/A31316, 4) the National Institute for Health Research (NIHR) Biomedical Research Centre at Imperial College, London, and 5) the National Institute for Health Research (NIHR) Biomedical Research Centre at the Royal Marsden NHS Foundation Trust and The Institute of Cancer Research, London.
Evidence before this study Epithelial ovarian cancer (EOC) is the deadliest of all gynaecological cancers, causing 4% of all cancer deaths in women. The most prevalent subtype (70% of EOC patients), high-grade serous ovarian cancer (HGSOC), has the highest mortality rate of all histological subtypes. Radiomics is a non-invasive strategy that has been used to guide cancer management, including diagnosis, prognosis prediction, tumour staging, and treatment response evaluation. To the best of our knowledge, the radiomics prognostic vector of Lu and colleagues was the first radiomics model developed and validated to predict overall survival (OS) in individuals with HGSOC from contrast-enhanced computed tomography (CE-CT) scans. Both that study and subsequent studies used manual segmentations, which add to the radiologist's or clinician's workload and limit widespread use. Additionally, while the models by Lu and co-workers were validated in additional datasets, they were neither harmonised through image resampling (a current requirement for radiomics analysis outlined by the Image Biomarker Standardisation Initiative) nor compared across machine learning and deep learning models, which could potentially improve predictive performance.
Added value of this study Relying on manually delineated adnexal lesion segmentations to predict outcome is considered demanding and impractical for routine use. With an automated primary ovarian lesion segmentation model, our radiomics-based prognostic model could be integrated into the routine ovarian cancer diagnostic workflow, offering risk stratification and personalised surveillance at the time of treatment planning. Our study is the first to develop an end-to-end pipeline for primary pre-treatment HGSOC prognosis prediction. Several deep learning and machine learning models were compared for prognosis prediction from CE-CT scans, radiomics and clinical data to improve model performance.
Implications of all the available evidence Our research demonstrates the first end-to-end HGSOC OS prediction pipeline from CE-CT scans, evaluated on two external test datasets. As part of this, we present the first primary ovarian cancer segmentation model, as well as the largest comparative radiomics study using machine learning and deep learning approaches for OS prediction in HGSOC. Our study shows that physicians and other clinical practitioners with little experience in image segmentation can obtain quantitative imaging features from CE-CT for risk stratification. Furthermore, using our prognostic model to stratify patients by risk has revealed subgroups with distinct transcriptomic and proteomic biology. This work lays the foundations for future experimental work and prospective clinical trials of quantitative, personalised risk stratification with therapeutic intent in patients with HGSOC.
Competing Interest Statement
The authors have declared no competing interest.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This retrospective cohort study was conducted with ethical approval for the analysis of human data, obtained from the Hammersmith and Queen Charlotte's & Chelsea Research Ethics Committee (approval 05/QO406/178) and the Kliniken Essen-Mitte Research Ethics Committee (informed consent was waived).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data produced in the present study are available upon reasonable request to the authors.