Computational prediction of therapeutic response and cancer outcomes
====================================================================

* Matthew Griffiths
* Amanzhol Kubeyev
* Jordan Laurie
* Andrea Giorni
* Luiz A. Zillmann da Silva
* Prabu Sivasubramaniam
* Matthew Foster
* Andrew V. Biankin
* Uzma Asghar

## Abstract

Oncology therapeutic development continues to be plagued by high failure rates, leading to substantial costs with only incremental improvements in overall benefit and survival. Advances in technology, including the molecular characterisation of cancer and computational power, provide the opportunity to better model therapeutic response and resistance. Here we use a novel approach that applies Bayesian statistical principles, used by astrophysicists to measure the mass of dark matter, to predict therapeutic response. We construct “Digital Twins” of individual cancer patients and predict response for cancer treatments. We validate the approach by predicting the results of clinical trials. Better prediction of therapeutic response would improve current clinical decision-making and oncology therapeutic development.

## Introduction

Therapeutic development in oncology continues to be challenging. Whilst significant advances have been made in some instances, progress continues to be slow and incremental. The vast majority of candidate therapies fail, and the failure rate in advanced phase 3 clinical trials remains high. This inefficiency costs over $50 billion per annum, which is unsustainable for most health systems and economies.

Advances in the molecular profiling of cancer, coupled with accelerated computing power, provide the promise of moving away from a “trial and error” approach to cancer treatment and therapeutic development, to one where we can predict therapeutic efficacy prior to treatment. “Digital Twins”, in-silico virtual replicas of cancer patients, offer enticing possibilities for improving cancer treatment. The benefits of accurate prediction of therapeutic response and patient outcome can be applied at many points in therapeutic development, from early candidate drug selection through to late phase clinical trials and routine cancer care.

Here, we present a machine learning approach which simulates treatment with cytotoxic and small molecule therapies, with applicability across numerous cancer types. We demonstrate how these models can predict overall response rates (ORR) for a range of cancer types and treatments. The digital twins predict drug efficacy for single agents or drug combinations, and can predict whether treatment A will perform better than treatment B in individual patients and in virtual clinical trials. The prediction accuracies were tested against actual response rates and overall survival metrics in historical clinical trials. Synthetic controls for comparator arms of clinical trials were constructed to enable benchmarking of the predicted clinical efficacy of investigational drugs versus standard of care. The simulated clinical trial can then predict survival. Finally, we demonstrate how this approach can be used for patient cohort enrichment for an investigational drug of interest, and calculate the predicted increase in response rates achieved through such enrichment strategies.

## Results

### Constructing Digital Twins to simulate therapeutic response and clinical trials

The modelling approach we used arose out of a collaboration with astrophysicists1 to develop advanced Bayesian inference software that enables integrative modelling of gravitational lensing and cancer biology. These partnerships motivated a transfer learning approach where detailed molecular and therapeutic data generated from biological experimentation was used to build generalisable Bayesian models that can be applied to predict treatment efficacy for single agents and combinations.

We created a computational framework that could predict in vitro therapeutic response and clinical response and survival using multi-dimensional data that included molecular profiles, predominantly genomic and transcriptomic. Digital twins were created to address specific questions using 3 distinct models: 1) Drug Efficacy Model 2) Treatment Response Model and 3) Overall Survival Model (Figure 1).

![Figure 1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/01/18/2024.01.17.24301444/F1.medium.gif)

[Figure 1:](http://medrxiv.org/content/early/2024/01/18/2024.01.17.24301444/F1)

Figure 1: Schematic of the Digital Twin Simulator designed to model in silico therapeutic response and clinical trials.
The components that underpin the digital twin are the Drug Efficacy, the Treatment Response and the Overall Survival models. The **Drug Efficacy Model** can ingest pre-clinical and/or clinical data. It uses the molecular profiles of tumours or preclinical models, such as gene expression and mutation profiles, and a drug’s molecular fingerprint derived from the compound structure. The Drug Efficacy Model generates a perturbation kernel, which calculates the similarity of the effect of a drug perturbation between two samples treated with two drugs. Gaussian process regression using this kernel can produce multiple types of treatment response prediction, such as *in vitro* IC50 and drug synergy scores, and provides inputs for the Treatment Response Model. The **Treatment Response Model** predicts patient response as one of two states: either response (partial response or complete response using RECIST) or no response (stable disease or progressive disease). The Treatment Response Model provides input for the Overall Survival Model. The **Overall Survival Model** integrates inputs from individual patient clinical data, treatment response, pathology and, if available, molecular profiles (gene expression +/- mutation profiles +/- copy number alterations). The Overall Survival Model predicts overall survival for individual patients given a specific treatment regimen and can be modified to consider alternative endpoints such as disease-free survival (DFS) to reflect clinical trial study endpoints.

The perturbation kernel is derived from the drug-efficacy model and is leveraged by multiple Bayesian inference models to transfer understanding about shared molecular mechanisms across *in vitro* combination screens and clinical treatment settings. It defines the similarity in the molecular mechanisms between every pair of datapoints (for example a patient treated with a taxane such as docetaxel vs. a different patient treated with an anthracycline such as doxorubicin), which can be used to make predictions of effect using Gaussian processes2.
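As a rough illustration of how a pairwise similarity kernel can drive Gaussian process prediction, the sketch below scores every pair of (sample, drug) datapoints and uses those scores to compute a GP posterior mean. The toy feature vectors, the product-of-RBF kernel form, the noise value and all function names are assumptions made for illustration; the actual perturbation kernel is learned by the drug efficacy model and its form is not given here.

```python
import math

def rbf(u, v, length_scale=1.0):
    """Squared-exponential similarity between two feature vectors."""
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-0.5 * sq / length_scale ** 2)

def pair_kernel(p, q):
    """Similarity of two (sample, drug) datapoints: product of a
    sample-profile kernel and a drug-descriptor kernel (assumed form)."""
    (sample_p, drug_p), (sample_q, drug_q) = p, q
    return rbf(sample_p, sample_q) * rbf(drug_p, drug_q)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def gp_predict(train_x, train_y, test_x, noise=1e-2):
    """Posterior mean of a zero-mean GP at each test point."""
    n = len(train_x)
    k = [[pair_kernel(train_x[i], train_x[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(k, train_y)  # (K + noise * I)^-1 y
    return [sum(pair_kernel(t, x) * a for x, a in zip(train_x, alpha))
            for t in test_x]

# Toy datapoints: (sample profile, drug descriptor) -> response value
train_x = [((0.0, 1.0), (1.0, 0.0)),
           ((1.0, 0.0), (1.0, 0.0)),
           ((0.0, 1.0), (0.0, 1.0))]
train_y = [1.0, -1.0, 0.5]
pred_train = gp_predict(train_x, train_y, train_x)
```

Because the kernel is defined on pairs of datapoints rather than on a single feature space, information can flow between a sample treated with one drug and a different sample treated with another, which is the property the text relies on for transfer between screens and clinical settings.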

The drug efficacy model and perturbation kernel were built using *in-vitro* dose-response data from the Cancer Therapeutic Response Portal (CTRP)3–5. The CTRP dataset consists of 481 anti-cancer compounds, which include chemotherapy and targeted small molecules. These were dosed against 860 cancer cell lines. The molecular data for the cell lines was obtained from the Cancer Cell Line Encyclopedia6. This dataset was used to train the perturbation kernel to predict IC50 for the compounds in the dataset using a Sparse Gaussian Process. The perturbation kernel can also accurately predict synergy scores from the NCI-ALMANAC dataset (unpublished data) for combination treatments. In this study, the model was tested using the perturbation kernel to predict treatment response in clinical data using the TCGA dataset located at the NCI Genomic Data Commons7. A summary of the datasets used and abbreviations can be found in Table 1 and Table 2. A detailed breakdown of the cohorts used in this study can be found in Extended Data Figure 1.

![Extended Data Figure 1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/01/18/2024.01.17.24301444/F5.medium.gif)

[Extended Data Figure 1:](http://medrxiv.org/content/early/2024/01/18/2024.01.17.24301444/F5)

Extended Data Figure 1: Digital Twin input data description
Description of the input data used to train the Drug Efficacy Model (a), Treatment Response Model (b) and Overall Survival Model (c). The distribution of clinical features across the event type (died/censored) (c1). The distribution of time-to-event data across cancer types (c2).

![Extended Data Figure 2](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/01/18/2024.01.17.24301444/F6.medium.gif)

[Extended Data Figure 2](http://medrxiv.org/content/early/2024/01/18/2024.01.17.24301444/F6)

Extended Data Figure 2 Drug Efficacy Model Performance
We compare observed vs predicted IC50 to evaluate the performance of the Drug Efficacy Model. The root mean squared error (RMSE) of the log10 IC50 was 0.47 and the coefficient of determination was 0.61.

![Extended Data Figure 3:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/01/18/2024.01.17.24301444/F7.medium.gif)

[Extended Data Figure 3:](http://medrxiv.org/content/early/2024/01/18/2024.01.17.24301444/F7)

Extended Data Figure 3: Feature importance
Feature importance: permutation-based importance (lower x-axis) and Cox hazard ratio-based importance (upper x-axis). Results indicate that the “disease (no response)” feature, which is inferred from the DEM, plays a significant role in model performance. Results are based on a single split.

[Table 1.](http://medrxiv.org/content/early/2024/01/18/2024.01.17.24301444/T1)

Table 1. Datasets used as inputs and dates accessed.

[Table 2:](http://medrxiv.org/content/early/2024/01/18/2024.01.17.24301444/T2)

Table 2: Input data for predictive modelling.
WXS, whole exome DNA sequencing; CR, complete response; PR, partial response; SD, stable disease; PD, progressive disease; RECIST - Response Evaluation Criteria in Solid Tumours49.

To evaluate the performance of the model across the TCGA dataset, we split the dataset into 5 cross-validation folds, stratified by cancer type and overall survival. We then trained the models on 4 of the folds and predicted outcomes for the remaining (held-out) fold. All accuracy metrics reported are averages of the metrics calculated for each fold in turn. Many patients had missing information; these missing variables were imputed either by the mean value of that column for that patient’s cancer type or by the mean of the entire cohort. These mean values were calculated only from the training folds when imputing for the validation fold. The treatment response model was used to calculate treatment response probabilities for all the patients using the training folds. If a patient received no treatment, that patient was assigned a treatment response probability of 0.
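The leakage-free imputation step described above can be sketched as follows. This is a minimal illustration, not the study's pipeline: the row layout, the column names and the rule "use the cancer-type mean when one exists, otherwise fall back to the cohort mean" are assumptions, since the text does not say how the choice between the two means was made.

```python
from collections import defaultdict

def fit_means(train_rows, column):
    """Per-cancer-type and cohort-wide means, computed from training rows only."""
    by_type = defaultdict(list)
    for row in train_rows:
        if row[column] is not None:
            by_type[row["cancer_type"]].append(row[column])
    all_values = [v for values in by_type.values() for v in values]
    type_means = {t: sum(v) / len(v) for t, v in by_type.items()}
    cohort_mean = sum(all_values) / len(all_values)
    return type_means, cohort_mean

def impute(rows, column, type_means, cohort_mean):
    """Fill missing values with the cancer-type mean, else the cohort mean."""
    filled = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        if new_row[column] is None:
            new_row[column] = type_means.get(new_row["cancer_type"], cohort_mean)
        filled.append(new_row)
    return filled

# Hypothetical training folds and held-out fold
train = [
    {"cancer_type": "BRCA", "age": 50.0},
    {"cancer_type": "BRCA", "age": 60.0},
    {"cancer_type": "PAAD", "age": 70.0},
    {"cancer_type": "PAAD", "age": None},
]
held_out = [
    {"cancer_type": "BRCA", "age": None},
    {"cancer_type": "OV", "age": None},
]
type_means, cohort_mean = fit_means(train, "age")
held_out_filled = impute(held_out, "age", type_means, cohort_mean)
```

Fitting the means on the training folds and only then applying them to the held-out fold keeps information about validation patients out of the model inputs.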

The model used molecular fingerprints generated by the CDK8–11 and Cinfony12 chemoinformatics libraries from canonical SMILES structures obtained from PubChem13 to incorporate structural information about each therapeutic. This process currently restricts the treatment response predictions to small molecule therapies, and hence we focus on chemotherapy drugs as monotherapy and in drug combinations.
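To make the idea of a molecular fingerprint concrete, the sketch below treats a fingerprint as the set of bit positions that are switched on and compares two drugs with the Tanimoto coefficient. The text does not state how fingerprints are compared inside the model; Tanimoto similarity is a common choice and is used here purely as an assumption. The bit positions are invented rather than computed by CDK or Cinfony.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bits."""
    if not fp_a and not fp_b:
        return 1.0  # two empty fingerprints are treated as identical
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)

# Hypothetical 'on' bits; real fingerprints would come from a chemoinformatics toolkit
fp_drug_a = {1, 4, 9, 16, 25}
fp_drug_b = {1, 4, 9, 36}
fp_drug_c = {2, 3, 5}

similarity_ab = tanimoto(fp_drug_a, fp_drug_b)  # structurally related drugs
similarity_ac = tanimoto(fp_drug_a, fp_drug_c)  # no shared substructure bits
```

A similarity of this kind can only be computed for compounds with a defined small-molecule structure, which is consistent with the restriction to small molecule therapies noted above.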

### Validation of Digital Twin predictions using clinical trial simulations

We simulated digital trial arms for single chemotherapy drugs and combinations to predict treatment response in cancer patients, with the goal of assessing the accuracy of digital twin predictions through comparison to historical clinical trial results. TCGA data was used as input; original individual participant data from the clinical trials was not available and was not used for predictions. Initially, the digital twin predictions were evaluated using an unblinded approach, where our technology team was aware of the results (Figure 2). The model treatment predictions were compared to the results of four historical phase 2 and phase 3 clinical studies (1997 - 2018). These were trials in metastatic pancreatic cancer (Burris, et al.50), advanced breast cancer (Chan, et al.51 and Tutt, et al.52) and platinum-sensitive recurrent ovarian cancer (Cantù, et al.53). For each treatment comparison tested in the clinical study, we calculated the predicted log odds ratio (log OR; Figure 2a) for Overall Response Rate (ORR) generated by the digital twin model, and compared it against the log OR reported from the historical trial, the ground truth (Figure 2b). We started with single-agent predictions, then progressively increased the complexity through combinations and heterogeneous treatments in more sophisticated clinical trial designs (Table 3).
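For readers unfamiliar with the comparison metric, the sketch below computes a log odds ratio for response between two trial arms together with a Wald 95% confidence interval. It is an illustration under stated assumptions: the natural logarithm, the Wald interval, the continuity correction and the arm counts are all choices made here and are not taken from the study's Statistics section. The sign convention follows Figure 2, where values above zero favour the control arm and values below zero favour the investigational arm.

```python
import math

def log_odds_ratio(resp_control, n_control, resp_invest, n_invest, z=1.96):
    """Log odds ratio of response (control vs investigational) with a Wald CI.

    Positive values favour the control arm; negative values favour the
    investigational arm.
    """
    a, b = resp_control, n_control - resp_control  # control: responders, non-responders
    c, d = resp_invest, n_invest - resp_invest     # investigational arm
    if min(a, b, c, d) == 0:
        # Haldane-Anscombe correction avoids division by zero
        a, b, c, d = (v + 0.5 for v in (a, b, c, d))
    log_or = math.log((a / b) / (c / d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, (log_or - z * se, log_or + z * se)

# Hypothetical arms: 10/50 responders on control, 20/50 on the investigational drug
log_or, (ci_low, ci_high) = log_odds_ratio(10, 50, 20, 50)
```

In this toy case the whole interval lies below zero, which under the Figure 2 convention would be read as the investigational arm having the higher response rate.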

![Figure 2:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/01/18/2024.01.17.24301444/F2.medium.gif)

[Figure 2:](http://medrxiv.org/content/early/2024/01/18/2024.01.17.24301444/F2)

Figure 2: Clinical Trials simulations by Digital Twin model (Unblinded and blinded).
The predictions from simulations of eight clinical trials by Digital Twins are shown by comparing the control arm and investigational arm, and predicting the difference in drug efficacy. Model accuracy was tested by comparing the log odds ratio, log(OR), for Overall Response Rate (ORR) predicted by the model against the log(OR) reported from clinical trials50–53. For metastatic/advanced cancer studies, the drug response rate was calculated using complete response + partial response. For adjuvant cancer studies, in the absence of in situ primary cancer, drug response was calculated by defining clinical response as the absence of disease relapse at a specified time point; lack of response is equivalent to cancer relapse. 95% confidence intervals are shown for each log(OR) value: Digital Twin predictions (purple) and actual reported clinical trial outcomes (orange). The threshold is set at zero, where > 0 suggests the control arm has a better response and < 0 suggests that the investigational arm is better. The number of patients used to generate predictions or recruited into the study is reported on the right, with significance and statistics; see the Statistics section in the Online Methods for details on how these values were calculated. The numbers used to make these plots are shown in Table 3.

[Table 3](http://medrxiv.org/content/early/2024/01/18/2024.01.17.24301444/T3)

Table 3 Summary of clinical trials compared to predictions.
Abbreviations: ORR=Overall Response Rate; OS=Overall survival; mOS= median Overall Survival; DFS=Disease free survival; CAP= Cyclophosphamide, Doxorubicin and Cisplatin, ER+=Estrogen receptor positive, HER2-ve=human epidermal growth factor receptor 2 negative

### Unblinded validation

#### Single-agent chemotherapy arms

##### Study 1

The first clinical trial we simulated, reported by Burris et al. in 1997, was a prospective, randomised clinical trial in advanced pancreatic cancer50 (n=126; 17 sites in USA & Canada), which randomised participants to either single-agent 5-fluorouracil (5-FU; n=63) or gemcitabine (n=63). Clinical benefit was 23.8% for gemcitabine compared with 4.8% for 5-FU (P = 0.002). Median survival was 5.65 months for gemcitabine compared to 4.41 months for 5-FU (P = 0.0025). Survival at 12 months was 18% for gemcitabine and 2% for 5-FU. Both chemotherapy drugs are considered to be anti-metabolites; this experiment therefore tested the model’s ability to detect the difference in drug efficacy for two drugs belonging to the same drug class. The Digital Twin drug response predictions were based upon either no response (stable disease or disease progression) or response (partial response or complete response). The model correctly predicted that gemcitabine chemotherapy would have greater clinical benefit than 5-FU (predicted log odds ratio −0.10, P < 0.0001) (Figure 2).

##### Study 2

Two metastatic breast cancer studies were simulated, both designed to prospectively compare single-agent therapeutic arms. These studies considered either an anthracycline, a taxane or a platinum. Chan et al.51 reported a prospectively randomised phase 3 study comparing taxane monotherapy (docetaxel; n=161) vs. anthracycline monotherapy (doxorubicin; n=164) in metastatic breast cancer (n=326) previously treated with an anthracycline-containing regimen (UK & Europe). The Digital Twin model predicted that single-agent docetaxel would be a better treatment than single-agent doxorubicin using a relatively small dataset of n=21 (predicted log odds ratio −0.09, P = 0.07) (Figure 2 and Table 3). The borderline P value reflects the small number of patients available for prediction, which widens the confidence interval. The actual trial (n=326) showed that docetaxel had a higher objective response rate than doxorubicin (47.8% vs. 33.3%; P = 0.008).

##### Study 3

The other Phase 3 clinical study in breast cancer, reported by Tutt et al.52 (TNT; 17 sites, UK), compared carboplatin (n=188) vs. a taxane, docetaxel (n=188), as first-line treatment in metastatic triple negative breast cancer (n=376). This study showed that carboplatin was no more active than docetaxel in the BRCA wild-type subpopulation of patients (ORR, 31.4% vs. 34.0%, respectively; P = 0.66). The Digital Twin model predicted, in alignment with the trial results, that neither carboplatin nor docetaxel would have superior efficacy in the BRCA wild-type population. In summary, in these simulations the Digital Twin model predicted chemotherapy responses for different drug classes and detected the difference in drug activity, where present, for single-agent treatments.

#### Single-agent *vs.* combination chemotherapy in relapsed disease

##### Study 4

To further probe the limitations of the Digital Twin model, we then added complexity by including combination treatments, using a study that compared a single drug against a drug combination. The Digital Twin model virtually simulated a prospective randomised study in ovarian cancer reported by Cantù et al.53 (2002) (n=97; Italy), which allocated participants to either single-agent paclitaxel (taxane; n=50) or a combination of cyclophosphamide (alkylating agent), doxorubicin (anthracycline) and cisplatin (platinum) (CAP; n=47). This study recruited patients with recurrent ovarian cancer who had achieved complete remission with previous platinum-based regimens, and whose disease recurred after a progression-free interval of more than 12 months. The Digital Twin model used data inputs from the TCGA cohort to predict that the cisplatin-based combination (CAP) would have a higher response rate than paclitaxel. The Digital Twin prediction reflected the published higher overall treatment response rate for CAP (55% vs. 44.7% for CAP vs. paclitaxel; P = 0.062). The predicted ORR log odds ratio was −0.24 (P < 0.001; n=251), vs. a log odds ratio of −0.044 calculated from the clinical trial data (P = 0.133).

### Blinded validation

As part of the validation process, the technology team were blinded to the published outcomes of an additional four phase 3 clinical trials. Three clinical studies in early breast cancer evaluated adjuvant chemotherapy regimens54–56 and one evaluated first-line metastatic pancreatic cancer57 (Figure 2). The Digital Twin model correctly predicted drug efficacy and the clinical trial result for all four clinical studies.

##### Study 5 (USA, Europe & Australia)

The phase III Metastatic Pancreatic Adenocarcinoma Clinical Trial (MPACT), reported by Von Hoff et al.57, compared the combination of nab-paclitaxel plus gemcitabine vs. gemcitabine alone as first-line therapy in metastatic pancreatic cancer. The study randomised 861 previously untreated metastatic pancreatic cancer patients between these treatment arms. The response rates were 23% for nab-paclitaxel plus gemcitabine versus 7% for gemcitabine alone (P < 0.001). Median overall survival was 8.5 months in the nab-paclitaxel-gemcitabine group vs. 6.7 months with gemcitabine alone (hazard ratio 0.72, P < 0.001). Using data for a similar chemotherapy drug, paclitaxel, the Digital Twin model was able to correctly predict that nab-paclitaxel plus gemcitabine was superior to gemcitabine alone (predicted log odds ratio −0.090, P < 0.001) (Figure 2).

##### Study 6 (National Cancer Institute of Canada Clinical Trials Group)

To test the model’s limitations with regard to the number of patients it needed to train on, we designed a virtual trial with methotrexate chemotherapy, because the model had trained on only eight cancer patients treated with methotrexate. The study was a Phase 3 prospective randomised trial reported by Levine et al.55 (1998), which enrolled high-risk, node-positive pre/peri-menopausal women post mastectomy or lumpectomy and axillary dissection (n=716) and randomised them to either adjuvant CEF (cyclophosphamide, epirubicin and fluorouracil) or adjuvant CMF (cyclophosphamide, methotrexate and fluorouracil) treatment. The relapse-free survival at 5 years was 53% (95% CI, 45-58%) for CMF (control arm) and 63% (95% CI, 57-68%; P=0.009) for CEF. Although CEF was a more effective chemotherapy regimen, it was associated with significantly more acute toxicities and as a consequence is not widely used. To predict treatment response in the adjuvant setting, where the cancer has been surgically removed, clinical response was defined as the absence of disease relapse at a specified time point (see Figure 2). Virtual simulations by the Digital Twin model accurately predicted that adjuvant CEF would be superior to adjuvant CMF in early breast cancer (predicted log odds ratio −0.023, P < 0.001; Table 3).

##### Study 7 (USA)

In a phase 3 prospective randomised trial in stage 1-3 breast cancer with 1016 participants and a median follow-up of 5.5 years, Jones et al.56 (2006) reported disease-free survival at 5 years of 86% for TC (docetaxel and cyclophosphamide) vs. 80% for AC (doxorubicin and cyclophosphamide) (HR 0.67; 95% CI 0.5-0.94; P=0.015) (Figure 2). The purpose of this trial was to compare the clinical outcomes in patients treated with a standard adjuvant anthracycline regimen vs. a non-anthracycline regimen. Virtual simulations by the model correctly predicted that adjuvant TC would be superior to adjuvant AC in early breast cancer (predicted log odds ratio −0.059, P < 0.001; Table 3).

##### Study 8 (Japan & South Korea)

To further test the limits of the Digital Twin’s performance, we evaluated it on mixed populations who received different neoadjuvant chemotherapy regimens containing either an anthracycline, a taxane, or both, and who subsequently received heterogeneous adjuvant therapy. CREATE-X (Masuda et al.54 2017) was a Phase 3 prospective, randomised study (n = 910) that randomised participants with residual disease following different neoadjuvant chemotherapy regimens for breast cancer (stage I-III) to either capecitabine or no capecitabine. The study participants included both hormone-positive (ER+ve, HER2-ve) and triple-negative (ER-ve, HER2-ve) patients. For simulation purposes, we focused on the hormone-positive subpopulation only and assumed participants in the control arm would receive endocrine treatment but no capecitabine. In the CREATE-X study, the 95% confidence interval of the overall survival hazard ratio for the hormone-positive subgroup crossed 1.0 (n=601; HR 0.73; 95% CI 0.38-1.40; P = 0.41), suggesting adjuvant capecitabine was no better than control. Virtual simulations by the Digital Twin model predicted that in people with hormone-positive breast cancer (HER2-ve; stages 1-3), treatment with adjuvant capecitabine would be inferior to standard of care such as adjuvant hormone treatment with tamoxifen (log odds ratio = 0.07). Although the predicted confidence intervals implied inferiority, the prediction was within the confidence interval of the actual trial results.

### Predicting survival

An Overall Survival (OS) model was integrated into the Digital Twin clinical trial simulator using a Random Survival Forest (RSF)58, a non-parametric ensemble learning method. The learning target is time-to-event and event (censored/deceased) data, and the primary output is a curve of survival probability vs. time.
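To show what "time-to-event and event" inputs and a "survival probability vs. time" output look like in practice, the sketch below computes a Kaplan-Meier curve. This is a much simpler non-parametric estimator swapped in for illustration; it is not a Random Survival Forest and uses no patient covariates. The follow-up times and event flags are invented.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  : follow-up time per patient
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve

# Hypothetical follow-up in months; 0 marks a censored patient
times = [2, 3, 3, 5, 8, 12]
events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(times, events)
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is the same treatment of censoring that survival models such as RSF must respect.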

Clinical data from 10,913 patients was pre-processed to yield a dataset comprising 4029 patients, with ages ranging from 11 to 90 years, spanning 23 different cancer types. This dataset along with the RECIST response categories from the TRM stage was used as input for the OS model.

We analysed five different prediction accuracy metrics: 1) cumulative dynamic AUC ROC, 2) Uno’s concordance index59 (C-index), 3) time-dependent Brier score, 4) the Brier skill score, and, 5) explained variance. Table 4 shows the average outcome across 5-fold training and testing splits. A detailed explanation of the metrics is provided in the Online Methods. Both the dynamic AUC and the C-index scores of the Digital Twin were above 0.7 in a pan-cancer setting, a threshold set for a good predictive model.
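As an illustration of what a concordance index measures, the sketch below implements Harrell's C-index, the simpler of the two variants discussed in this paper. Note that the headline metric above is Uno's C-index, which additionally weights pairs by the inverse probability of censoring; that weighting is omitted here. The patient data and function name are assumptions for illustration.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by risk score.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event; ties in risk score count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical cohort: the third patient is censored at 6 months
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect_risk = [0.9, 0.7, 0.5, 0.1]  # higher risk for earlier deaths
c_perfect = harrell_c_index(times, events, perfect_risk)
c_reversed = harrell_c_index(times, events, perfect_risk[::-1])
c_uninformative = harrell_c_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

A value of 0.5 corresponds to a model that cannot rank patients at all, which is the baseline against which the 0.7 threshold quoted above should be read.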

[Table 4](http://medrxiv.org/content/early/2024/01/18/2024.01.17.24301444/T4)

Table 4 Pan-cancer Digital Twin overall survival performance evaluation metrics.
Overall, we observe relatively high accuracies when evaluating across all cancer types. The metrics shown are the average of 5-fold train and test splits.

Additionally, we evaluated the performance of our Digital Twin OS prediction in relation to existing computational models found in the literature. The results are shown in Figure 3, with further details on these benchmarks available in Extended Data Table 1. When evaluating across all tissue types, the Digital Twin model exhibited commendable performance in comparison to the mean of all 29 other model-data methods and tissue types identified in the literature60–77. Regarding Glioma, our model demonstrated favourable performance compared to XGBoost-Surv by Dal Bo et al.69 (2023) and Deep Learning with Cox proportional hazard (CPH) by Jiang et al.77 (2021) on the Udine Hospital and TCGA datasets, respectively. In Breast Cancer and Glioblastoma our model performed comparably. Our model underperformed benchmarks for Ovarian Cancers and Head and Neck Cancers. This may be attributed to the absence of crucial data inputs, specifically, TNM cancer staging data.

[Extended Data Table 1:](http://medrxiv.org/content/early/2024/01/18/2024.01.17.24301444/T5)

Extended Data Table 1: Overall survival comparisons
C-index metrics for benchmark computational models, with a focus on survival analysis in oncology. Time-dependent survival area under the curve (AUC) is infrequently reported in scientific publications. There are two concordance indices: one defined by Harrell et al. (1984)88 and one defined by Uno et al. (2011)59, which is based on the inverse probability of censoring weights. We utilise both, as the latter does not overestimate the index when there are a small number of events. However, the majority of studies use Harrell’s C-index.

[Extended Data Table 2:](http://medrxiv.org/content/early/2024/01/18/2024.01.17.24301444/T6)

Extended Data Table 2: Treatment Response Model Performance
A summary of the performance evaluation metrics of the treatment response model (TRM). The metrics were evaluated in a cross-fold validation with 5 splits. The reported value is the mean across the held-out cross-fold validation sets. The output of the TRM is a probability of response, so for the binary accuracy metrics, the prediction was thresholded such that the predicted response rates matched that of the training set.
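The thresholding rule described in this caption can be sketched as picking the cut-off that makes the fraction of predicted responders equal to the training-set response rate. The function names, the rounding rule and the example probabilities are assumptions; how ties and rounding were handled in the study is not stated.

```python
def rate_matching_threshold(probabilities, target_rate):
    """Threshold at which roughly target_rate of patients are called responders."""
    ranked = sorted(probabilities, reverse=True)
    n_positive = round(target_rate * len(ranked))
    if n_positive == 0:
        return float("inf")  # nobody is called a responder
    return ranked[n_positive - 1]

def classify(probabilities, threshold):
    """Binary responder calls; tied probabilities at the threshold are all positive."""
    return [p >= threshold for p in probabilities]

train_response_rate = 0.3  # hypothetical response rate in the training folds
predicted = [0.10, 0.80, 0.35, 0.60, 0.20, 0.05, 0.45, 0.15, 0.70, 0.25]
threshold = rate_matching_threshold(predicted, train_response_rate)
calls = classify(predicted, threshold)
```

Matching the predicted rate to the observed rate avoids relying on a fixed 0.5 cut-off, which can be a poor choice when responders are a minority of the cohort.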

![Figure 3:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/01/18/2024.01.17.24301444/F3.medium.gif)

[Figure 3:](http://medrxiv.org/content/early/2024/01/18/2024.01.17.24301444/F3)

Figure 3: Benchmarking accuracy of overall survival method against existing methods
(a) A comparison of the C-index for 29 method-data benchmark approaches in modelling survival within the field of oncology, compared against our Digital Twin OS model for pan-cancer. (b-f) A similar comparison per specific cancer type. Details can be found in Extended Data Table 1.

However, benchmark studies typically do not address the prediction of survival curves for various drugs, including novel ones, and primarily focus on predicting survival curves for specific cancer types. In contrast, our model possesses the capability to address hypothetical scenarios, offering insights into questions such as the projected survival curve when a specific patient undergoes treatment with a novel drug.

### Cohort Enrichment

The FDA defines cohort enrichment as the “prospective use of any patient characteristic to select a study population in which detection of a drug effect (if one is in fact present) is more likely than it would be in an unselected population.” Enrichment strategies should accelerate drug development, increase the magnitude of drug responses and therefore accelerate the path to drug approval.

Here we evaluated the effectiveness of using the predicted response score to segment cohorts into responder and non-responder cohorts. For each cohort we chose two thresholds to split the cohort into positive, intermediate and negative groups and evaluated the log odds ratio of overall response rate to assess the potential for cohort enrichment.

The cohorts were split into 3 groups to assess the interpretability and quality of risk stratification of the response scores predicted by the model. This cohort molecular enrichment approach was tested in 19 different solid tumour types and 17 different cytotoxic drugs (Figure 4). The Digital Twin output data suggest that 16 solid tumour types, except ovarian cancer, would benefit from a molecular predictive enrichment strategy integrated into clinical trial designs. These data also suggest that machine learning approaches can successfully identify molecular patterns and enrich drug response for commonly used chemotherapy drugs such as docetaxel or cisplatin, which currently do not have robust predictive biomarkers and are used in unselected cancer populations.
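The two-threshold segmentation and the resulting enrichment estimate can be sketched as below. The threshold values, the scores, the observed responses and the 0.5 continuity correction are all assumptions for illustration; the study's thresholds were chosen per cohort and are not given here. In this sketch a positive log odds ratio means the biomarker-positive group responded more often than the biomarker-negative group.

```python
import math

def split_cohort(scores, low, high):
    """Assign each patient to 'negative', 'intermediate' or 'positive'."""
    groups = []
    for s in scores:
        if s >= high:
            groups.append("positive")
        elif s < low:
            groups.append("negative")
        else:
            groups.append("intermediate")
    return groups

def enrichment_log_or(scores, responded, low, high):
    """Log odds ratio of observed response, positive vs negative group."""
    # [responders, non-responders]; 0.5 is always added to avoid empty cells
    counts = {"positive": [0.5, 0.5], "negative": [0.5, 0.5]}
    for group, r in zip(split_cohort(scores, low, high), responded):
        if group in counts:  # the intermediate group is left out of the ratio
            counts[group][0 if r else 1] += 1
    pos_r, pos_n = counts["positive"]
    neg_r, neg_n = counts["negative"]
    return math.log((pos_r / pos_n) / (neg_r / neg_n))

# Hypothetical predicted response scores and observed responses (1 = responder)
scores = [0.9, 0.8, 0.7, 0.5, 0.5, 0.3, 0.2, 0.1]
responded = [1, 1, 0, 1, 0, 0, 1, 0]
groups = split_cohort(scores, low=0.4, high=0.6)
enrichment = enrichment_log_or(scores, responded, low=0.4, high=0.6)
```

Leaving an intermediate band between the two thresholds keeps borderline scores out of the comparison, so the contrast reflects patients the model is relatively confident about.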

![Figure 4.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/01/18/2024.01.17.24301444/F4.medium.gif)

[Figure 4.](http://medrxiv.org/content/early/2024/01/18/2024.01.17.24301444/F4)

Figure 4. Assessing log odds increase in response rate through cohort enrichment
Cohort enrichment improves predicted therapeutic response rates, showing that all cancers, perhaps with the exception of ovarian cancer, would benefit from a molecular enrichment strategy to select responders. Similarly, with the exception of irinotecan, pemetrexed and capecitabine, the model identified a significant benefit from using the molecular data to enrich for responders, given the available patient numbers. The plot shows the odds ratio in observed response rates between the biomarker-positive and biomarker-negative cohorts, as segmented by treatment response predictions. The 95% confidence intervals are shown for each drug and cancer type.

## Discussion

Predicting therapeutic response has the potential to transform cancer care and oncology therapeutic development. We developed a Bayesian statistical approach similar to that used by astrophysicists to model gravitational lensing and predict the mass of dark matter, where the complexity and interactivity of multiple data points must be accounted for. We show that this approach predicts, with reasonable accuracy, the response to therapeutics preclinically and clinically, which we validated through comparison to clinical trials.

This approach can be applied at various points in therapeutic development. These include, but are not limited to:

1.  Predicting response in preclinical models to inform decisions on which cancer type and specific indication are more likely to respond;

2.  Informing combination strategies;

3.  Selecting which therapeutics to advance through early clinical development;

4.  Selecting which therapeutic to advance to late-stage development;

5.  Improved patient selection for clinical development;

6.  Construction of synthetic controls for clinical trials.

In addition, other potential applications, not tested here, include prediction of toxicity.

One feature of this approach is that the model can predict likely response for individual patients as well as for cohorts. Predicting an individual’s response to a specific treatment ahead of time has the potential to substantially change routine cancer care. It would better inform clinical decision-making for an individual patient, helping to avoid therapies that are likely to be ineffective and to select the better option or a clinical trial. Moreover, increasing the accuracy of prediction for an individual would mean that they could potentially serve as their own control in a clinical trial. If the survival of an individual patient on standard of care could be predicted with a known level of accuracy, more meaningful information could be drawn from that individual’s response, or lack of response, to a novel therapy.

An important factor to consider is how accurate a prediction needs to be in order to be useful. Given the high failure rate of oncology therapeutic development, even an incremental increase in predictive accuracy for critical decisions could have a significant impact on the probability of success.

An important current limitation that needs to be addressed is the variability in predictive accuracy between different classes of therapeutics and different cancer types. Whilst this may simply reflect the amount and quality of data ingested, adjustments to the model may be needed to reflect the mechanism of action of therapeutics, where known. Biological inferences from the model as it stands need to be developed further to better define candidate biomarkers that could be rapidly translated into the clinic.

## Data Availability

All data produced in the present study are available upon reasonable request to the authors.

## Online Methods

Here, we describe a Digital Twin, an integrated machine-learning model for simulating clinical trials and estimating clinical trial endpoints for various treatment scenarios, including novel drugs. It consists of three main models, the Drug Efficacy Model (DEM), the Treatment Response Model (TRM) and the Overall Survival (OS) model, and uses multi-modal input data, as illustrated in Figure 1. The DEM is trained to estimate the impact of treatment. It provides an estimate of a patient’s treatment response in the prospective cohort under the prospective treatment, entirely from preclinical data. Given the estimated treatment response, the overall survival model predicts the clinical trial endpoints.

### Drug Efficacy Model (DEM)

The DEM learns to estimate the half-maximal inhibitory concentration (IC50) from pre-clinical information obtained in cell-line studies. The model takes a set of features for each drug-patient pair: Whole Exome Sequencing (WXS), RNA sequencing (RNAseq), drug structures, and dose-response curves. The input data include 545 compounds described by SMILES (simplified molecular-input line-entry system) strings, which encode the structure of chemical species, 1343 genes in RNAseq, 1342 copy number variations (CNVs) from the WXS data and 130,000 Hill parameters for the dose–response curves. The drug structures were encoded as Chemistry Development Kit (CDK)8–10 molecular descriptors from their SMILES strings using cinfony12. The input data for the Digital Twin DEM are described in Extended Data Figure 1a.
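To make the relationship between Hill parameters and IC50 concrete, the sketch below evaluates a two-parameter Hill curve. The parameterisation (viability falling from 1 to 0, with `ic50` and `hill_slope` as the only parameters) is an assumption chosen for illustration; the text does not state which Hill form was fitted, and the compound values are hypothetical.

```python
import math


def hill_viability(conc, ic50, hill_slope):
    """Fraction of viable cells at concentration `conc`.

    Assumed two-parameter Hill form: viability is 1 at zero dose,
    0.5 at conc == ic50, and tends to 0 at high dose.
    """
    return 1.0 / (1.0 + (conc / ic50) ** hill_slope)


# Hypothetical compound: IC50 of 2 micromolar, Hill slope of 1.5.
ic50_um = 2.0
slope = 1.5

# The regression target used for error reporting is log10(IC50).
log10_ic50 = math.log10(ic50_um)

for conc in (0.2, 2.0, 20.0):
    print(f"conc={conc:>5} uM  viability={hill_viability(conc, ic50_um, slope):.3f}")
```

By construction the curve passes through 0.5 at the IC50, which is why a model that predicts IC50 (or its logarithm) summarises where on the concentration axis the drug becomes effective.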

Here, we deliberately opted for Random Forest78,79 (RF) because it allowed us to examine the leaf assignments that produce the IC50 estimates. These leaf assignments are later used to estimate treatment response categories in the next stages of the Digital Twin. In addition, RF tends to outperform other models in prognostic value80. The DEM produces three types of output: 1) an estimate of the IC50 of a given drug in the given patient’s tissue, 2) drug combination synergy curves (synergy being the phenomenon where the combined effect of two or more drugs is greater than the sum of their individual effects), and 3) Perturbation Kernel81 leaf assignments. The last of these is the main output and serves as the input to the next stage of the Digital Twin, the Treatment Response Model.

To evaluate the performance of the DEM, i.e. how good the model’s predictions are on unseen validation data, we compare predicted against observed IC50 in Extended Data Figure 2. The root mean squared error (RMSE) of the log10 IC50 was 0.47 and the coefficient of determination was 0.61.
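As an illustration of the two metrics quoted above, the following minimal sketch computes RMSE and the coefficient of determination from paired observed and predicted log10 IC50 values. The numbers are made up for the example and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out log10(IC50) values.
observed_log_ic50 = [-1.0, 0.0, 0.5, 1.0, 2.0]
predicted_log_ic50 = [-0.5, 0.0, 0.5, 1.5, 1.5]

print("RMSE:", rmse(observed_log_ic50, predicted_log_ic50))
print("R^2 :", r_squared(observed_log_ic50, predicted_log_ic50))
```

Because the error is computed on log10 IC50, an RMSE of 0.47 corresponds to a typical multiplicative error of roughly a factor of three (10^0.47) in the IC50 itself.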

### Perturbation Kernel

Here, a Perturbation Kernel is a method based on a Random Forest Kernel (RFK)81 that helps find patterns and relationships between data points in a more complex space where those patterns may be more apparent. It is constructed using the random partition sampling scheme, generating several RF decision trees, each trained on a subset of features. In a trained RF, each leaf in each tree is considered a partition. In learning algorithms, a kernel is a function that calculates the similarity or distance between pairs of data points. Here, the kernel value for a pair of data points is the fraction of trees in which the two points share a partition. Thus, the more often two points share a partition, the smaller the distance between them and the more similar they are.
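The shared-partition count can be sketched directly from a table of leaf assignments. The example below assumes each sample is represented by the index of the leaf it reaches in every tree (with scikit-learn, such indices can be obtained from a fitted forest's `apply` method); the leaf indices shown are hypothetical, and this is an illustration of the idea rather than the study's code.

```python
def leaf_kernel(leaves_a, leaves_b):
    """Fraction of trees in which two samples land in the same leaf."""
    if len(leaves_a) != len(leaves_b):
        raise ValueError("both samples need one leaf index per tree")
    shared = sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)
    return shared / len(leaves_a)


def kernel_matrix(leaf_assignments):
    """Pairwise similarity matrix for a list of per-sample leaf vectors."""
    return [[leaf_kernel(r, s) for s in leaf_assignments] for r in leaf_assignments]


# Hypothetical leaf indices: 3 samples (rows) across 4 trees (columns).
leaf_assignments = [
    [3, 7, 1, 4],
    [3, 7, 2, 4],
    [5, 0, 2, 9],
]

K = kernel_matrix(leaf_assignments)
for row in K:
    print(row)
```

The resulting matrix is symmetric with ones on the diagonal, so it can be passed to any method that accepts a precomputed kernel.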

The Perturbation Kernel leaf assignments are an output of the DEM and are used to transfer knowledge about shared molecular mechanisms across in vitro studies, combination screens, and clinical treatments. The kernel describes the similarity in molecular mechanisms between pairs of patients, for example, a patient treated with a taxane and another patient treated with an anthracycline. The leaf assignments are subsequently used to train the Treatment Response Model to predict treatment response categories.

### Treatment Response Model (TRM)

The TRM learns to predict four RECIST treatment response categories49 from the output of the DEM: 1) clinical progressive disease, 2) stable disease, 3) partial response and 4) complete response. The inputs to the model are the leaf assignments from the Perturbation Kernel, that is, the output of the DEM. The Random Forest in the DEM allows the leaf assignments to be used to form a kernel function between inputs. Assuming that similarity in IC50 estimates correlates with similarity in treatment response, it was possible to conduct kernel regression to estimate treatment response categories. To perform kernel regression we use a Gaussian Process2.
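The kernel regression step can be illustrated with the standard Gaussian Process posterior mean, `k_* (K + noise * I)^-1 y`, evaluated on a precomputed kernel. The sketch below is a simplification under stated assumptions: a zero prior mean, a fixed noise term, responses encoded as 1 (responder) or 0 (non-responder), and hypothetical kernel values. It is not the study's implementation.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gp_posterior_mean(k_train, y_train, k_test_train, noise=0.1):
    """Posterior mean of a zero-mean GP given a precomputed kernel."""
    n = len(y_train)
    k_noisy = [
        [k_train[i][j] + (noise if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    alpha = solve(k_noisy, y_train)  # (K + noise * I)^-1 y
    return [sum(k * a for k, a in zip(row, alpha)) for row in k_test_train]


# Hypothetical shared-leaf kernel between three training patients.
k_train = [
    [1.00, 0.75, 0.00],
    [0.75, 1.00, 0.25],
    [0.00, 0.25, 1.00],
]
responses = [1.0, 1.0, 0.0]  # assumed encoding: 1 = responder, 0 = non-responder
k_new = [[0.75, 0.75, 0.00]]  # similarity of one new patient to each training patient

print(gp_posterior_mean(k_train, responses, k_new))
```

A new patient who shares many partitions with responders receives a higher predicted score, which is the assumption stated above: similarity in the kernel space is taken to imply similarity in treatment response.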

The performance of the TRM was evaluated based on its ability to classify the response categories in the overall pan-cancer setting, as well as for individual cancer types and distinct cancer treatments (see Extended Data Table 2). The following grouped RECIST categories were analysed: 1) Disease Control (combined complete response, partial response and stable disease), 2) Response (combined complete response and partial response) and 3) Complete Response. Overall, the TRM achieved accuracies of 0.75, 0.74 and 0.69 in the respective grouped categories, along with Area Under the Curve (AUC) values of 0.63, 0.73, and 0.72. AUC is the area under the Receiver Operating Characteristic curve for the non-thresholded prediction, and can be interpreted as the probability that a responder had a higher predicted probability of response than a non-responder. These metrics were evaluated in a 5-fold cross-validation, and each reported value is the mean across the held-out validation folds.
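The probabilistic interpretation of AUC can be checked by direct pairwise counting. This minimal sketch compares every responder with every non-responder and counts how often the responder has the higher score, with ties counted as one half; the labels and scores are invented for the example.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (responder, non-responder) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one responder and one non-responder")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = responder, 0 = non-responder.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]

print(pairwise_auc(labels, scores))
```

Here eight of the nine responder/non-responder pairs are ordered correctly, giving an AUC of 8/9. A value of 0.5 would correspond to random ranking.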

In addition, Extended Data Table 2 shows a detailed evaluation of model performance. Specifically, it displays accuracy, AUC, precision, recall and F1 score across a) grouped RECIST categories, b) 9 cancer types and c) 12 cancer drugs. The Weighted Average provides scores weighted by the support for positive and negative responses. Average precision is the area under the precision-recall curve; it summarises precision across all classification thresholds as a function of recall, with higher values being preferable.

Significant variations in performance are evident across treatments and tissue types, and much of this variability is likely attributable to the limited availability of response data for many cohorts.

### Statistics

Confidence intervals for the clinical trials are calculated according to Altman82 and Sheskin83. To calculate the predicted log odds ratio, log(OR), the log(OR) was calculated for each individual patient in the dataset based on the treatment response model’s prediction, and the mean log(OR) and its standard error were calculated directly. Because the log(OR) can be calculated for each individual patient, the confidence intervals are much smaller than for an equivalent clinical study, which measures the difference in response between two *populations*. When calculating the log odds of the Overall Response Rate (ORR) for a combination therapy, the highest log odds among all treatments in the combination was taken.
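The three calculations described above can be sketched as follows. The trial-level interval uses the standard large-sample standard error for a log odds ratio from a 2x2 table, which we assume matches the cited references; how the per-patient log(OR) values are derived from model predictions is not specified in the text, so the sketch simply takes them as given. All counts and values are hypothetical.

```python
import math


def log_odds_ratio_ci(a, b, c, d, z=1.96):
    """Log odds ratio and 95% CI from a 2x2 table.

    a, b: responders and non-responders in group 1
    c, d: responders and non-responders in group 2
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, log_or - z * se, log_or + z * se


def mean_with_standard_error(values):
    """Mean and standard error of the mean of per-patient values."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


def combination_log_odds(per_drug_log_odds):
    """Log odds for a combination: the highest among its component drugs."""
    return max(per_drug_log_odds)


# Hypothetical trial arms: 20/30 responders versus 10/30 responders.
print(log_odds_ratio_ci(20, 10, 10, 20))

# Hypothetical per-patient predicted log(OR) values.
print(mean_with_standard_error([0.8, 1.1, 0.9, 1.3, 1.0]))

# Hypothetical log odds for the three drugs in a combination.
print(combination_log_odds([-0.4, 0.3, 0.1]))
```

The contrast in interval width follows from the formulas: the trial-level standard error depends on the four cell counts, whereas the standard error of the per-patient mean shrinks with the square root of the number of patients.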

### Overall Survival

We built an Overall Survival (OS) model as an integral component of the Digital Twin clinical trial simulator. A survival model is a statistical method for predicting the time until an event, such as death. It handles censored data, in which the event of interest was not observed during the study period, and produces curves of survival probability against time. We employed a Random Survival Forest (RSF), a non-parametric ensemble learning method that can incorporate censored and time-to-event data58. The learning process involves the creation of multiple decision trees, and the model is selected based on the accuracy of predictions on unseen data.

The inputs for the OS model are: 1) clinical records data, 2) cancer tissue type and 3) RECIST response categories from the TRM stage of the Digital Twin. Pre-processing reduced an initial set of 10,967 patients with survival data to 4,029 patients with tumour stages 0-4, lymph node stages 0-3, metastasis status (yes/no), and ages ranging from 11 to 90 years, across 23 cancer tissue types. Cancer types with small numbers of patients were removed from the analysis.

For patients with missing data, missing values were imputed with the mean values. The imputation was performed separately in the training set and the validation set. Here, we group RECIST response categories into two classes: 1) Disease (clinical progressive disease and stable disease) and 2) Response (partial response and complete response). A descriptive summary of the data is shown in Extended Data Figure 1.

To evaluate the performance of the model, we analysed five prediction accuracy metrics across the solid tumours:

1.  Area under the receiver operating characteristic curve (AUC ROC) averaged over all times, also known as cumulative dynamic AUC84;

2.  Uno’s concordance index (C-index), based on the inverse probability of censoring weights59. The C-index is a goodness-of-fit measure for models that produce risk scores and is commonly used in survival analysis. Intuitively, it measures how often, when two patients are compared, the patient with the higher risk score has the shorter time-to-event;

3.  The integrated time-dependent Brier score85, which summarises model performance over all available times. Smaller values represent higher prediction accuracy (0 is the best achievable score, corresponding to perfect accuracy, and 1 is the worst);

4.  The Brier skill score, which is the difference between the Brier score of the reference model and the Brier score of the forecast model, divided by the Brier score of the reference model (1 is the best achievable score);

5.  Explained variance, the proportion of the variation in a given data set that the model accounts for.

Time-dependent metrics are integrated over time. We used scikit-survival86 to calculate the majority of the metrics.
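The pairwise intuition behind the concordance index and the definition of the Brier skill score can be illustrated in a few lines. For simplicity the sketch implements Harrell's C-index, not Uno's inverse-probability-of-censoring-weighted variant used for the reported results, and the survival times, event flags, risk scores and Brier scores are all hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event
    before patient j's recorded time. It is concordant when patient i
    also has the higher risk score; ties in risk count as one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def brier_skill_score(brier_model, brier_reference):
    """(reference - model) / reference; 1 is best, 0 means no gain over the reference."""
    return (brier_reference - brier_model) / brier_reference


# Hypothetical cohort: time in months, event flag (1 = death observed), risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]

print(harrell_c_index(times, events, risk))
print(brier_skill_score(0.15, 0.25))
```

In this toy cohort four of the five comparable pairs are ordered correctly, so the C-index is 0.8; the one discordant pair is the patient who died at 4 months despite a lower risk score than the patient censored at 6 months.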

We conducted the performance evaluation using a 5-fold split (train/test splits), and all reported survival metrics are averaged across the folds. In the pan-cancer setting, the model achieved the following: AUC ROC = 0.78, C-index = 0.71, Brier score = 0.168, Brier skill = 0.42, and Explained Variance = 0.44. The performance of the OS model per cancer is shown in Table 5 of the main body.

For the comparison with the benchmark studies (Figure 3 in the main body), we conducted a literature survey, selecting studies that focus on survival analysis modelling within oncology. The complete list of studies is available in Extended Data Table 1. Given that the majority of studies use Harrell’s C-index88, we computed and compared it on a per-cancer basis.

### Feature Importance

We performed a feature importance analysis, which showed that the predicted output from the DEM modulates Overall Survival and thereby influences clinical trial outcomes.

We used two methods: permutation-based importance and the Cox proportional hazards model. Permutation-based importance is calculated by randomly permuting a feature and measuring the difference between the model prediction score with and without the permutation. The Cox model, a standard survival analysis model in oncology87, estimates the impact of individual covariates on the hazard ratio (HR), allowing us to quantify how changes in specific features affect the risk of an event occurring over time.
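Permutation-based importance can be sketched generically: shuffle one feature column, re-score the model, and record the drop relative to the unpermuted baseline. The function below is a minimal illustration; the `predict` and `score` callables, the toy model and the data are hypothetical stand-ins, not the survival model or scoring metric used in the study.

```python
import random


def permutation_importance(predict, score, rows, targets, feature_index,
                           n_repeats=20, seed=0):
    """Mean drop in score when one feature column is randomly shuffled."""
    rng = random.Random(seed)
    baseline = score(targets, [predict(r) for r in rows])
    drops = []
    for _ in range(n_repeats):
        column = [r[feature_index] for r in rows]
        rng.shuffle(column)
        # Rebuild each row with the shuffled value in place of the original.
        permuted = [
            r[:feature_index] + [v] + r[feature_index + 1:]
            for r, v in zip(rows, column)
        ]
        drops.append(baseline - score(targets, [predict(r) for r in permuted]))
    return sum(drops) / len(drops)


def negative_mse(targets, predictions):
    """Higher is better, so a drop in this score means worse predictions."""
    return -sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def toy_model(row):
    """Toy stand-in model that uses only the first feature."""
    return row[0]


rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 9.0], [4.0, 1.0]]
targets = [1.0, 2.0, 3.0, 4.0]

for index in (0, 1):
    importance = permutation_importance(toy_model, negative_mse, rows, targets, index)
    print(f"feature {index}: importance = {importance:.3f}")
```

Shuffling the feature the model ignores leaves the score unchanged, so its importance is exactly zero, whereas shuffling the feature the model relies on degrades the score.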

Results are shown in Extended Data Fig 3. They indicate that the “Disease (No response)” feature, which is inferred from the DEM, plays a significant role in model performance. A stronger “disease” signal results in a positive log(HR) in the Cox model, with the highest HR among all features. The permutation importance method shows that this feature has high importance, almost comparable to the patient’s age, which is considered one of the most important factors in cancer survival.

### Integration

Finally, we integrate DEM, TRM and OS models in order to be able to simulate clinical trials for both existing and novel cancer therapies. We call it a Digital Twin, a virtual model designed to accurately reflect clinical trials of cancer treatments with cytotoxic and small molecule therapies across various cancer types. The simulation output can then be any desired endpoint as predicted by the overall survival model.

The integration of distinct models involves utilising outputs from the preceding model as inputs for the subsequent model. DEM incorporates pre-clinical data, including gene expression and mutation profiles, drug response curves and drug compounds and produces a perturbation kernel. This perturbation kernel is subsequently employed in the TRM for predicting RECIST response categories. These predicted response categories are then incorporated into the OS model, along with clinical records data, to forecast overall patient survival over time.

## Acknowledgements

This work was supported by Innovate UK [grant number 50074]. The data used in this study is in whole or part based upon data generated by the TCGA Research Network.

*   Received January 17, 2024.
*   Revision received January 17, 2024.
*   Accepted January 18, 2024.


*   © 2024, Posted by Cold Spring Harbor Laboratory

The copyright holder for this pre-print is the author. All rights reserved. The material may not be redistributed, re-used or adapted without the author's permission.

## References

1.  Nightingale, J. W., Hayes, R. G. & Griffiths, M. ‘PyAutoFit’: A Classy Probabilistic Programming Language for Model Composition and Fitting. J. Open Source Softw. 6, 2550 (2021).

2.  Rasmussen, C. E. & Williams, C. K. I. Gaussian processes for machine learning. (MIT Press, 2006).

3.  Rees, M. G. et al. Correlating chemical sensitivity and basal gene expression reveals mechanism of action. Nat. Chem. Biol. 12, 109–116 (2016).

4.  Seashore-Ludlow, B. et al. Harnessing Connectivity in a Large-Scale Small-Molecule Sensitivity Dataset. Cancer Discov. 5, 1210–1223 (2015).

5.  Basu, A. et al. An interactive resource to identify cancer genetic and lineage dependencies targeted by small molecules. Cell 154, 1151–1161 (2013).

6.  Ghandi, M. et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature 569, 503–508 (2019).

7.  Grossman, R. L. et al. Toward a Shared Vision for Cancer Genomic Data. N. Engl. J. Med. 375, 1109–1112 (2016).

8.  Willighagen, E. L. et al. The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching. J. Cheminformatics 9, 33 (2017).

9.  Steinbeck, C. et al. Recent developments of the chemistry development kit (CDK) - an open-source java library for chemo- and bioinformatics. Curr. Pharm. Des. 12, 2111–2120 (2006).

10. May, J. W. & Steinbeck, C. Efficient ring perception for the Chemistry Development Kit. J. Cheminformatics 6, 3 (2014).

11. Steinbeck, C. et al. The Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and Bioinformatics. J. Chem. Inf. Comput. Sci. 43, 493–500 (2003).

12. O’Boyle, N. M. & Hutchison, G. R. Cinfony – combining Open Source cheminformatics toolkits behind a common interface. Chem. Cent. J. 2, 24 (2008).

13. Kim, S. et al. PubChem 2023 update. Nucleic Acids Res. 51, D1373–D1380 (2023).

14. Abeshouse, A. et al. Comprehensive and Integrated Genomic Characterization of Adult Soft Tissue Sarcomas. Cell 171, 950–965.e28 (2017).

15. Ally, A. et al. Comprehensive and Integrative Genomic Characterization of Hepatocellular Carcinoma. Cell 169, 1327–1341.e23 (2017).

16. The Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455, 1061–1068 (2008).

17. The Cancer Genome Atlas Network. Comprehensive genomic characterization of head and neck squamous cell carcinomas. Nature 517, 576–582 (2015).

18. The Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature 489, 519–525 (2012).

19. The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of clear cell renal cell carcinoma. Nature 499, 43–49 (2013).

20. The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of gastric adenocarcinoma. Nature 513, 202–209 (2014).

21. The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330–337 (2012).

22. Robertson, A. G. et al. Comprehensive Molecular Characterization of Muscle-Invasive Bladder Cancer. Cell 171, 540–556.e25 (2017).

23. The Cancer Genome Atlas Research Network. Comprehensive Molecular Characterization of Papillary Renal-Cell Carcinoma. N. Engl. J. Med. 374, 135–145 (2016).

24. Fishbein, L. et al. Comprehensive Molecular Characterization of Pheochromocytoma and Paraganglioma. Cancer Cell 31, 181–193 (2017).

25. The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of urothelial bladder carcinoma. Nature 507, 315–322 (2014).

26. The Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature 490, 61–70 (2012).

27. Ciriello, G. et al. Comprehensive Molecular Portraits of Invasive Lobular Breast Cancer. Cell 163, 506–519 (2015).

28. The Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature 511, 543–550 (2014).

29. Zheng, S. et al. Comprehensive Pan-Genomic Characterization of Adrenocortical Carcinoma. Cancer Cell 29, 723–736 (2016).

30. The Cancer Genome Atlas Research Network. Comprehensive, Integrative Genomic Analysis of Diffuse Lower-Grade Gliomas. N. Engl. J. Med. 372, 2481–2498 (2015).

31. Cancer Genome Atlas Research Network et al. Distinct patterns of somatic genome alterations in lung adenocarcinomas and squamous cell carcinomas. Nat. Genet. 48, 607–616 (2016).

32. The Cancer Genome Atlas Research Network. Genomic and Epigenomic Landscapes of Adult De Novo Acute Myeloid Leukemia. N. Engl. J. Med. 368, 2059–2074 (2013).

33. Akbani, R. et al. Genomic Classification of Cutaneous Melanoma. Cell 161, 1681–1696 (2015).

34. The Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature 474, 609–615 (2011).

35. The Cancer Genome Atlas Research Network. Integrated genomic and molecular characterization of cervical cancer. Nature 543, 378–384 (2017).

36. The Cancer Genome Atlas Research Network & Levine, D. A. Integrated genomic characterization of endometrial carcinoma. Nature 497, 67–73 (2013).

37. The Cancer Genome Atlas Research Network. Integrated genomic characterization of oesophageal carcinoma. Nature 541, 169–175 (2017).

38. Raphael, B. J. et al. Integrated Genomic Characterization of Pancreatic Ductal Adenocarcinoma. Cancer Cell 32, 185–203.e13 (2017).

39. Agrawal, N. et al. Integrated Genomic Characterization of Papillary Thyroid Carcinoma. Cell 159, 676–690 (2014).

40. 40.Shen, H. et al. Integrated Molecular Characterization of Testicular Germ Cell Tumors. Cell Rep. 23, 3392–3406 (2018).
    
    
41. 41.Cherniack, A. D. et al. Integrated Molecular Characterization of Uterine Carcinosarcoma. Cancer Cell 31, 411–423 (2017).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.ccell.2017.02.010. PubMed PMID: 28292439; PubMed Central PMCID: PMC5599133&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=28292439&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 

42. 42.Robertson, A. G. et al. Integrative Analysis Identifies Four Molecular and Clinical Subsets in Uveal Melanoma. Cancer Cell 32, 204–220.e15 (2017).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.ccell.2017.07.003&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=28810145&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 

43. Farshidfar, F. et al. Integrative Genomic Analysis of Cholangiocarcinoma Identifies Distinct IDH-Mutant Molecular Profiles. Cell Rep. 18, 2780–2794 (2017).
    

44. Hmeljak, J. et al. Integrative Molecular Characterization of Malignant Pleural Mesothelioma. Cancer Discov. 8, 1548–1565 (2018).
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoiY2FuZGlzYyI7czo1OiJyZXNpZCI7czo5OiI4LzEyLzE1NDgiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyNC8wMS8xOC8yMDI0LjAxLjE3LjI0MzAxNDQ0LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 

45. Radovich, M. et al. The Integrated Genomic Landscape of Thymic Epithelial Tumors. Cancer Cell 33, 244–258.e10 (2018).
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=29438696&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 

46. Abeshouse, A. et al. The Molecular Taxonomy of Primary Prostate Cancer. Cell 163, 1011–1025 (2015).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.cell.2015.10.025&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26544944&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 

47. Davis, C. F. et al. The Somatic Genomic Landscape of Chromophobe Renal Cell Carcinoma. Cancer Cell 26, 319–330 (2014).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.ccr.2014.07.014&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25155756&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000341873800006&link_type=ISI) 

48. Brennan, C. W. et al. The Somatic Genomic Landscape of Glioblastoma. Cell 155, 462–477 (2013).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.cell.2013.09.034&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24120142&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000325719800021&link_type=ISI) 

49. Eisenhauer, E. A. et al. New response evaluation criteria in solid tumours: Revised RECIST guideline (version 1.1). Eur. J. Cancer 45, 228–247 (2009).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.ejca.2008.10.026&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19097774&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000262948300002&link_type=ISI) 

50. Burris, H. A. et al. Improvements in survival and clinical benefit with gemcitabine as first-line therapy for patients with advanced pancreas cancer: a randomized trial. J. Clin. Oncol. 15, 2403–2413 (1997).
    
    
51. Chan, S. et al. Prospective Randomized Trial of Docetaxel Versus Doxorubicin in Patients With Metastatic Breast Cancer. J. Clin. Oncol. 17, 2341–2354 (1999).
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiamNvIjtzOjU6InJlc2lkIjtzOjk6IjE3LzgvMjM0MSI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDI0LzAxLzE4LzIwMjQuMDEuMTcuMjQzMDE0NDQuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

52. Tutt, A. et al. A randomised phase III trial of carboplatin compared with docetaxel in BRCA1/2 mutated and pre-specified triple negative breast cancer “BRCAness” subgroups: the TNT Trial. Nat. Med. 24, 628–637 (2018).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41591-018-0009-7&link_type=DOI) 
    

53. Cantù, M. G. et al. Randomized controlled trial of single-agent paclitaxel versus cyclophosphamide, doxorubicin, and cisplatin in patients with recurrent ovarian cancer who responded to first-line platinum-based regimens. J. Clin. Oncol. 20, 1232–1237 (2002).
    
    
54. Masuda, N. et al. Adjuvant Capecitabine for Breast Cancer after Preoperative Chemotherapy. N. Engl. J. Med. 376, 2147–2159 (2017).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1056/NEJMoa1612645&link_type=DOI) 
    

55. Levine, M. N. et al. Randomized trial of intensive cyclophosphamide, epirubicin, and fluorouracil chemotherapy compared with cyclophosphamide, methotrexate, and fluorouracil in premenopausal women with node-positive breast cancer. National Cancer Institute of Canada Clinical Trials Group. J. Clin. Oncol. 16, 2651–2658 (1998).
    
    [Abstract](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiamNvIjtzOjU6InJlc2lkIjtzOjk6IjE2LzgvMjY1MSI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDI0LzAxLzE4LzIwMjQuMDEuMTcuMjQzMDE0NDQuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

56. Jones, S. E. et al. Phase III Trial Comparing Doxorubicin Plus Cyclophosphamide With Docetaxel Plus Cyclophosphamide As Adjuvant Therapy for Operable Breast Cancer. J. Clin. Oncol. 24, 5381–5387 (2006).
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiamNvIjtzOjU6InJlc2lkIjtzOjEwOiIyNC8zNC81MzgxIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjQvMDEvMTgvMjAyNC4wMS4xNy4yNDMwMTQ0NC5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 

57. Von Hoff, D. D. et al. Increased Survival in Pancreatic Cancer with nab-Paclitaxel plus Gemcitabine. N. Engl. J. Med. 369, 1691–1703 (2013).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1056/NEJMoa1304369&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24131140&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000326311300007&link_type=ISI) 

58. Ishwaran, H., Kogalur, U. B., Blackstone, E. H. & Lauer, M. S. Random survival forests. Ann. Appl. Stat. 2, 841–860 (2008).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1214/08-AOAS169&link_type=DOI) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000261057900003&link_type=ISI) 

59. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for Evaluating Overall Adequacy of Risk Prediction Procedures with Censored Survival Data. Stat. Med. 30, 1105–1117 (2011).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.4154&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=21484848&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 

60. Jing, B. et al. A deep survival analysis method based on ranking. Artif. Intell. Med. 98, 1–9 (2019).
    
    
61. Marcinak, C. T. et al. Accuracy of models to prognosticate survival after surgery for pancreatic cancer in the era of neoadjuvant therapy. J. Surg. Oncol. (2023).
    
    
62. Huang, Z. et al. Deep learning-based cancer survival prognosis from RNA-seq data: approaches and evaluations. BMC Med. Genomics 13, 1–12 (2020).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/s12920-020-00814-w&link_type=DOI) 

63. Courtiol, P. et al. Deep learning-based classification of mesothelioma improves prediction of patient outcome. Nat. Med. 25, 1519–1525 (2019).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41591-019-0583-3&link_type=DOI) 
    

64. Katzman, J. L. et al. DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med. Res. Methodol. 18, 1–12 (2018).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/s12874-018-0495-9&link_type=DOI) 

65. Royston, P. & Altman, D. G. External validation of a Cox prognostic model: principles and methods. BMC Med. Res. Methodol. 13, 1–15 (2013).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/1471-2288-13-1&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23297754&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 

66. Hao, J., Kim, Y., Mallavarapu, T., Oh, J. H. & Kang, M. Interpretable deep neural network for cancer survival analysis by integrating genomic and clinical data. BMC Med. Genomics 12, 1–13 (2019).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/s12920-019-0514-7&link_type=DOI) 

67. Koo, K. C. et al. Long short-term memory artificial neural network model for prediction of prostate cancer survival outcomes according to initial treatment strategy: development of an online decision-making support system. World J. Urol. 38, 2469–2476 (2020).
    
    
68. Starke, S. et al. Longitudinal and multimodal radiomics models for head and neck cancer outcome prediction. Cancers 15, 673 (2023).
    
    
69. Dal Bo, M. et al. Machine learning to improve interpretability of clinical, radiological and panel-based genomic data of glioma grade 4 patients undergoing surgical resection. J. Transl. Med. 21, 450 (2023).
    
    
70. Andrearczyk, V. et al. Multi-task Deep Segmentation and Radiomics for Automatic Prognosis in Head and Neck Cancer. in Predictive Intelligence in Medicine (eds. Rekik, I., Adeli, E., Park, S. H. & Schnabel, J.) 147–156 (Springer International Publishing, 2021). doi:10.1007/978-3-030-87602-9_14.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/978-3-030-87602-9_14&link_type=DOI) 

71. Boehm, K. M. et al. Multimodal data integration using machine learning improves risk stratification of high-grade serous ovarian cancer. Nat. Cancer 3, 723–733 (2022).
    
    
72. Ueno, H. et al. New criteria for histologic grading of colorectal cancer. Am. J. Surg. Pathol. 36, 193–201 (2012).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/PAS.0b013e318235edee&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22251938&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 

73. Kawai, K. et al. Nomograms for colorectal cancer: A systematic review. World J. Gastroenterol. 21, 11877–11886 (2015).
    

74. Schumacher, M. et al. Randomized 2 x 2 trial evaluating hormonal treatment and the duration of chemotherapy in node-positive breast cancer patients. German Breast Cancer Study Group. J. Clin. Oncol. 12, 2086–2093 (1994).
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiamNvIjtzOjU6InJlc2lkIjtzOjEwOiIxMi8xMC8yMDg2IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjQvMDEvMTgvMjAyNC4wMS4xNy4yNDMwMTQ0NC5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 

75. Curtis, C. et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 486, 346–352 (2012).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/nature10983&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22522925&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000305466800033&link_type=ISI) 

76. Knaus, W. A. et al. The SUPPORT prognostic model: Objective estimates of survival for seriously ill hospitalized adults. Ann. Intern. Med. 122, 191–203 (1995).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.7326/0003-4819-122-3-199502010-00007&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=7810938&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1995QD73300007&link_type=ISI) 

77. Jiang, S., Zanazzi, G. J. & Hassanpour, S. Predicting prognosis and IDH mutation status for patients with lower-grade gliomas using whole slide images. Sci. Rep. 11, 16849 (2021).
    
    
78. Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1023/A:1010933404324&link_type=DOI) 
    
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000170489900001&link_type=ISI) 

79. Chen, T. & Guestrin, C. XGBoost: A scalable tree boosting system. in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (2016).
    
    
80. Peng, J. et al. The prognostic value of machine learning techniques versus Cox regression model for head and neck cancer. Methods 205, 123–132 (2022).
    
    
81. Davies, A. & Ghahramani, Z. The Random Forest Kernel and other kernels for big data from random partitions. Preprint at doi:10.48550/arXiv.1402.4293 (2014).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.48550/arXiv.1402.4293&link_type=DOI) 

82. Altman, D. G. Practical statistics for medical research. (CRC Press, 1990).
    
    
83. Sheskin, D. J. Handbook of parametric and nonparametric statistical procedures. (Chapman and Hall/CRC, 2003).
    
    
84. Hung, H. & Chiang, C.-T. Estimation methods for time-dependent AUC models with survival data. Can. J. Stat. 38, 8–26 (2010).
    
    
85. Graf, E., Schmoor, C., Sauerbrei, W. & Schumacher, M. Assessment and comparison of prognostic classification schemes for survival data. Stat. Med. 18, 2529–2545 (1999).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/(SICI)1097-0258(19990915/30)18:17/18<2529::AID-SIM274>3.0.CO;2-5&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=10474158&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000082507800027&link_type=ISI) 

86. Pölsterl, S. scikit-survival: A Library for Time-to-Event Analysis Built on Top of scikit-learn. J. Mach. Learn. Res. 21, 1–6 (2020).
    
    
87. Moncada-Torres, A., van Maaren, M. C., Hendriks, M. P., Siesling, S. & Geleijnse, G. Explainable machine learning can outperform Cox regression predictions and provide insights in breast cancer survival. Sci. Rep. 11, 6968 (2021).
    

88. Harrell Jr, F. E., Lee, K. L., Califf, R. M., Pryor, D. B. & Rosati, R. A. Regression modelling strategies for improved prognostic prediction. Stat. Med. 3, 143–152 (1984).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.4780030207&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=6463451&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F01%2F18%2F2024.01.17.24301444.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1984SW54000004&link_type=ISI)