Association of Graph-based Spatial Features with Overall Survival Status of Glioblastoma Patients
=================================================================================================

* Joonsang Lee
* Shivali Narang
* Juan Martinez
* Ganesh Rao
* Arvind Rao

## Abstract

**Background and purpose** Glioblastoma multiforme (GBM) is the most common malignant brain tumor, with a median survival of less than 15 months. To aid prognosis, there is a need for decision tools that leverage diagnostic modalities such as MRI to inform survival. In this study, we examine higher-order spatial proximity characteristics of habitats and propose two graph-based methods, the minimum spanning tree (MST) and the graph run-length matrix (GRLM), to characterize spatial heterogeneity over tumor MRI-derived intensity habitats and to assess their relationships with overall survival as well as immune signature status of patients with GBM.

**Material and methods** A data set of 74 patients was studied based on the availability of post-contrast T1-weighted and T2-weighted fluid attenuated inversion recovery (FLAIR) image data in The Cancer Imaging Archive (TCIA). We assessed the predictive value of MST- and GRLM-derived features from 2D images for prediction of 12-month survival status and immune signature status of patients with GBM via a receiver operating characteristic curve analysis.

**Results** For 12-month survival prediction using the MST-based method, sensitivity and specificity were 0.82 and 0.79, respectively. For the GRLM-based method, sensitivity and specificity were 0.73 and 0.77, respectively. For immune status, sensitivity and specificity were 0.91 and 0.69, respectively, for the GRLM-based method with the immune effector signature.

**Conclusion** Our results show that the proposed MST- and GRLM-derived features are predictive of 12-month survival status as well as the immune signature status of patients with GBM.
To our knowledge, this is the first application of MST- and GRLM-based proximity analyses for the study of radiologically-defined tumor habitats in GBM.

Keywords

* MRI
* Glioblastoma multiforme
* MST
* Graph RLM
* Habitat
* Overall survival
* Immune status

## Introduction

Glioblastoma multiforme (GBM) is a common primary brain tumor known for its aggressive malignant behavior. Prognosis for patients with GBM remains very poor, with a median overall survival of 12 to 15 months despite multimodality treatments such as surgical resection followed by a combination of radiation therapy and chemotherapy (temozolomide) 1,2. Several approaches have been proposed to improve the diagnostic performance of MRI for cancer, including computer-based image analyses 3,4, imaging feature analysis 5-7, machine learning techniques 8, and imaging-genomics analysis 9,10.

MRI analysis studies of GBM patients have suggested that intensity-level heterogeneity within the tumor is indicative of multiple tumor regions (habitats) with distinct MRI intensity characteristics that might respond differently to treatment regimens 11. This has implications for the assessment of patient prognosis in GBM 12,13. Based on multiparametric measurements of different MRI sequences, these habitats characterize regional variations in blood flow, cell density, and necrosis 11. Apart from the abundance of these habitats, their spatial extents and proximity have physiologic and clinical relevance for the assessment of treatment response 11,14.

In this study, we identified four distinct groups of pixel intensities (habitats) within the tumor ROI across different MR sequences and characterized the spatial relationships of these derived habitats with graph-based methods such as minimum spanning tree (MST) construction and graph run-length matrices (GRLM).
Previously, graph (MST-derived) features have been used to understand the proximity relationships between distinct immunohistochemical entities (cell types) within hematoxylin and eosin (H&E) pathology slides 15. These MST-based features successfully distinguished samples of high and low lymphocytic infiltration extent with a classification accuracy greater than 90% 15. Based on the success of this approach, we hypothesized that characterizing the spatial relationship between radiologically-distinct habitats of a tumor using MST or GRLM approaches might have predictive value for underlying clinical outcome in GBM.

Another graph-based characterization called GRLM 16, which builds on the idea of gray-level run-length matrices, was also used to characterize the spatial heterogeneity of a tumor to predict clinical outcome in GBM. The gray-level run-length method is one of the popular methods for extracting higher-order statistical features in texture analysis 17. In this study, instead of counting runs of pixel intensities, we used GRLM to compute the runs of radiologically defined tumor habitats on a region of interest (ROI) image to obtain a run-length matrix.

The purpose of this study is to evaluate the prognostic significance of higher-order spatial proximity characteristics of habitats using quantitative metrics derived from graph-based methods. We aim to investigate the association of MST- and GRLM-derived spatial proximity features of these tumor habitats with overall survival of GBM patients, as well as their ability to predict immune signature status. In the broader context, this study aims to investigate relationships between imaging and phenotypic characterizations of the tumor, thereby augmenting the foundation for population-based correlation studies in GBM.

## Materials and methods

### Data

A data set of 74 patients was studied based on the availability of post-contrast T1-weighted and T2-weighted fluid attenuated inversion recovery (FLAIR) image data in The Cancer Genome Atlas (TCGA) and The Cancer Imaging Archive (TCIA, [http://www.cancerimagingarchive.net/](http://www.cancerimagingarchive.net/)). The dataset consisted of 25 female and 49 male patients with de-novo (primary) GBM. The patient demographics are summarized in Table 1. The mRNA expression data and clinical data such as survival information for these cases 18 were obtained from the cBioPortal for Cancer Genomics ([http://www.cbioportal.org](http://www.cbioportal.org)). In this study, the previously defined immune effector and immune suppressor responses 19 were used to derive the immune gene signature status for each patient using single-sample gene set enrichment analysis (ssGSEA) 20-23. For predicting immune signature status, 34 of the 74 patients were used, based on data availability.

The MR images were preprocessed with registration, non-uniformity correction using N3 24, pixel reslicing, and intensity normalization 25 before subsequent analysis. Registration of the T1 post-contrast image and T2 FLAIR image, along with non-uniformity correction for MRI artifacts, was performed using the Medical Image Processing, Analysis, and Visualization (MIPAV) software 26. The FLAIR MR image was registered to the T1 post-contrast image using an affine transformation with 12 degrees of freedom along with trilinear interpolation. Segmentation was a semi-automated process for which the MITK3M3 Image analysis toolkit was used. The clinicians used this tool to contour the tumor region on multiple slices, with interpolation performed to obtain a 3D volumetric tumor mask. This step was performed independently on T1-post contrast as well as T2-FLAIR. In all the processes, readers were blinded to clinical/molecular characteristics.
Pixel reslicing was performed using the NIFTI toolbox in MATLAB to make pixel sizes isotropic (1 mm). The resulting T1 post-contrast image and T2 FLAIR image after preprocessing are shown in Figure 1.

[Table 1.](http://medrxiv.org/content/early/2022/12/18/2022.12.16.22283587/T1) Patient demographics

![Figure 1.](https://www.medrxiv.org/content/medrxiv/early/2022/12/18/2022.12.16.22283587/F1.medium.gif)

Figure 1. T1 post-contrast image (left) and T2 FLAIR image (right) after preprocessing. Arrows point to the enhanced tumor areas.

### Region of interest (ROI) Delineation

A tumor ROI was segmented semi-automatically by two experts (radiologists) using the Medical Imaging Interaction Toolkit (MITK). The extent of tumor was defined using the contrast enhancing tumor region within the T1-post contrast image, and the areas of solid tumor, infiltrating tumor and edema regions within the T2-FLAIR image. The slice with the maximum tumor area in the T1 post-contrast image and the corresponding slice from the T2 FLAIR image were selected for both MST and GRLM analysis. The pixel intensity values within the ROI for each patient were scaled to lie between zero and one by linear transformation, and then fitted using a two-component Gaussian mixture model (GMM) 27,28. The threshold between the two Gaussian groups was taken as the average of the means of the two Gaussian populations underlying the low intensity pixel group and the high intensity pixel group within the tumor, following a process similar to prior work 28.

### Minimum spanning tree (MST) and graph run-length matrices (GRLM)

A spanning tree of a graph is a tree that connects all the vertices. Each edge has an associated weight (or length), and the weight of a tree is the sum of the weights of its edges.
The MST is then the spanning tree whose weight is minimal among all spanning trees.

The gray-level run-length matrix *M*(*i, j*) is a popular method in texture analysis that measures the variation of the pixel intensities to quantify intuitive qualities such as smoothness, coarseness, and roughness. Each entry *M*(*i, j*) is the number of runs with pixels of gray level *i* and run length *j* in a given direction, so the run-length matrix captures the coarseness of a texture in a specified direction. A run is a sequence of consecutive pixels that share the same gray-level intensity. In general, fine (high frequency) texture tends to have more short runs of similar gray-level intensities, and coarse (low frequency) texture tends to have longer runs of similar gray-level intensities.

### Feature extraction with minimum spanning tree (MST)

In this study, we used two different MR sequence images (T1 post-contrast image and T2 FLAIR image). For each MRI sequence, the pixels within the tumor ROI were separated into a low and a high intensity group (habitat) using the two-component GMM. Four binary masks were prepared from these groups (one for every combination of MR sequence and intensity group). A two-dimensional grid was overlaid on each binary mask. The grid lines were equally spaced with a distance of 8 pixels (8 mm × 8 mm), chosen empirically. Next, within each small bounding grid box, we computed the coordinates of the centroid (center of mass) of the habitat region inside that box 29. Coordinates from all grid boxes in each habitat were combined into one map. An MST was constructed with all these coordinates as vertices. Thus we have four MSTs (one for each habitat: T1 high intensity group, T1 low, T2 high, or T2 low intensity group, respectively). Figure 2 shows the MST of each of the four groups on an ROI map.
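
The construction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the helper names (`grid_centroids`, `mst_branch_lengths`, `mst_features`), the toy mask, and the use of SciPy are assumptions introduced here. The last helper summarizes the branch lengths with the seven statistics defined in the following paragraph, assuming population (1/n) moments.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def grid_centroids(mask, step=8):
    """Centroid (row, col) of the habitat pixels inside each step x step grid box."""
    points = []
    for r0 in range(0, mask.shape[0], step):
        for c0 in range(0, mask.shape[1], step):
            rows, cols = np.nonzero(mask[r0:r0 + step, c0:c0 + step])
            if rows.size:  # boxes without habitat pixels contribute no vertex
                points.append((r0 + rows.mean(), c0 + cols.mean()))
    return np.array(points, dtype=float)


def mst_branch_lengths(points):
    """Branch lengths of the Euclidean MST over the points (n - 1 values)."""
    dist = squareform(pdist(points))    # dense pairwise distance matrix
    tree = minimum_spanning_tree(dist)  # sparse matrix holding the kept edges
    return tree.data


def mst_features(w):
    """Seven summary statistics of the branch lengths (population moments assumed)."""
    w = np.asarray(w, dtype=float)
    mu = w.mean()
    centered = w - mu
    sigma = np.sqrt(np.mean(centered ** 2))
    return {
        "mean": mu,
        "median": np.median(w),
        "std": sigma,
        "skewness": np.mean(centered ** 3) / sigma ** 3,
        "kurtosis": np.mean(centered ** 4) / sigma ** 4,
        "ratio": w.max() / w.min(),  # maximum divided by minimum, as in the text
        "disorder": sigma / mu,
    }


# Toy habitat mask (hypothetical): a filled 16 x 16 square yields four grid boxes.
mask = np.ones((16, 16), dtype=bool)
points = grid_centroids(mask)
lengths = mst_branch_lengths(points)
features = mst_features([1.0, 2.0, 3.0, 4.0])
```

In practice one such mask would be built per habitat (T1 high, T1 low, T2 high, T2 low), giving four sets of branch lengths and hence 4 × 7 features per patient.
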
The mean, median, standard deviation, skewness, kurtosis, min/max ratio, and disorder of the branch lengths in the MST were computed to obtain a set of seven features 15 for each habitat. Expressions for the features are listed below.

Mean edge weight, $f_{\mu}$:

$$f_{\mu} = \frac{1}{n}\sum_{i=1}^{n} w_i$$

where $w_i$ is an individual weight or branch length in the MST and $n$ is the number of branches.

![Figure 2.](https://www.medrxiv.org/content/medrxiv/early/2022/12/18/2022.12.16.22283587/F2.medium.gif)

Figure 2. MSTs of the four habitats within the tumor ROI: (a) T1 post-contrast high pixel intensity group, (b) T1 post-contrast low pixel intensity group, (c) T2 FLAIR high intensity pixel group, and (d) T2 FLAIR low intensity pixel group, respectively. Different gray levels within the ROI represent different habitats and their overlapping areas.

Median edge weight, $f_{median}$, is the value separating the higher half of branch lengths from the lower half.

Standard deviation of the distribution of edge weights, $f_{\alpha}$:

$$f_{\alpha} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(w_i - f_{\mu}\right)^{2}}$$

Skewness of the distribution of edge weights, $f_{skewness}$:

$$f_{skewness} = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(w_i - f_{\mu}\right)^{3}}{f_{\alpha}^{3}}$$

Kurtosis of the distribution of edge weights, $f_{kurtosis}$:

$$f_{kurtosis} = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(w_i - f_{\mu}\right)^{4}}{f_{\alpha}^{4}}$$

Finally, the min/max ratio, $f_r$, is the maximum of $\mathbf{W}$ divided by the minimum of $\mathbf{W}$, where $\mathbf{W} = \{w_1, w_2, \ldots, w_n\}$ is the set of weights (branch lengths). 'Disorder' is the standard deviation $f_{\alpha}$ divided by the mean of $\mathbf{W}$, $f_{\mu}$.

### Feature extraction with graph run-length matrices (GRLM)

Similar to the construction of the MST, we computed the coordinates of the centroid of each habitat inside each rectangular grid box. For the extraction of GRLM features, we combined all coordinates from all four habitats into one set (map). Again, these four habitat groups represent the T1 high- and low-intensity pixel groups and the T2 high- and low-intensity pixel groups.
Figure 3 shows an ROI map with all coordinates (across all four habitats) as vertices. We then constructed a Delaunay triangulation 30 to connect these vertices. There are several ways to triangulate a given set of points, and the Delaunay triangulation is one of the most widely used in scientific computing. In a Delaunay triangulation, the circumscribed circle of every triangle contains no other point of the set 30.

![Figure 3.](https://www.medrxiv.org/content/medrxiv/early/2022/12/18/2022.12.16.22283587/F3.medium.gif)

Figure 3. An example of an ROI map with a Delaunay triangulation across all vertices. Different gray levels within the ROI represent different habitats and their overlapping areas.

After constructing the ROI map with the Delaunay triangulation, a run-length matrix was computed. In this study, we used the GRLM method in the manner proposed by Tosun et al. for histopathological image segmentation 16, because it aims to represent the spatial separation between point set entities. The GRLM *G*(*t, l*) can be defined as the number of graph-edge runs with an edge type *t* and a path length *l* for a single node. Starting from the initial node, the algorithm follows a path to the furthermost node in the path within a circular window. Figure 4 shows the calculation of a graph run-length matrix for a single node. For the entire region, the algorithm accumulates the run-length matrices of the nodes in an ROI.

![Figure 4.](https://www.medrxiv.org/content/medrxiv/early/2022/12/18/2022.12.16.22283587/F4.medium.gif)

Figure 4. (a) Illustration of a single initial node located at the center and (b) a graph run-length matrix for this single initial node. This method was adapted from Tosun et al. 16.
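
To make the accumulation step concrete, the following Python sketch builds a Delaunay graph over labeled points and fills a run-length matrix. It is a simplified, hypothetical reading of the procedure, not the algorithm of Tosun et al. or the authors' code: an edge type is taken to be the unordered pair of habitat labels at its endpoints (four habitats give ten types), a run is found by breadth-first search along edges of one type inside a circular window of radius `radius`, and its length is the hop count to the furthermost node reached. The function names, the toy points, and the Galloway-style feature formulas in `grlm_features` are assumptions.

```python
from collections import deque
from itertools import combinations_with_replacement

import numpy as np
from scipy.spatial import Delaunay


def delaunay_adjacency(points):
    """Neighbour sets of each vertex in the Delaunay triangulation."""
    adj = {i: set() for i in range(len(points))}
    for a, b, c in Delaunay(points).simplices.tolist():
        adj[a].update((b, c))
        adj[b].update((a, c))
        adj[c].update((a, b))
    return adj


def graph_run_length_matrix(points, labels, n_types, radius, max_len):
    """Accumulate G(t, l) over all nodes (simplified, hypothetical variant)."""
    points = np.asarray(points, dtype=float)
    labels = [int(x) for x in labels]
    pairs = list(combinations_with_replacement(range(n_types), 2))
    adj = delaunay_adjacency(points)
    G = np.zeros((len(pairs), max_len), dtype=int)
    for start in range(len(points)):
        for t, pair in enumerate(pairs):
            depth = {start: 0}
            queue = deque([start])
            while queue:  # breadth-first search along edges of type t only
                u = queue.popleft()
                for v in adj[u]:
                    same_type = tuple(sorted((labels[u], labels[v]))) == pair
                    inside = np.linalg.norm(points[v] - points[start]) <= radius
                    if v not in depth and same_type and inside:
                        depth[v] = depth[u] + 1
                        queue.append(v)
            longest = max(depth.values())  # hops to the furthermost node reached
            if longest > 0:
                # column l - 1 stores runs of length l; longer runs are clipped
                G[t, min(longest, max_len) - 1] += 1
    return G


def grlm_features(G):
    """Galloway-style emphasis and nonuniformity measures (assumed definitions)."""
    lengths = np.arange(1, G.shape[1] + 1)
    n_runs = G.sum()
    return {
        "SPE": (G / lengths ** 2).sum() / n_runs,
        "LPE": (G * lengths ** 2).sum() / n_runs,
        "ETN": (G.sum(axis=1) ** 2).sum() / n_runs,
        "PLN": (G.sum(axis=0) ** 2).sum() / n_runs,
    }


# Toy point set (hypothetical): three triangle corners plus one interior point.
pts = [(0, 0), (4, 0), (0, 3), (1, 1)]
G_one = graph_run_length_matrix(pts, [0, 0, 0, 0], n_types=1, radius=10, max_len=5)
G_two = graph_run_length_matrix(pts, [0, 1, 1, 1], n_types=2, radius=10, max_len=5)
feats = grlm_features(G_one)
```

With a single label every node reaches its neighbours in one hop, so all runs have length 1. With two labels, runs of the mixed edge type can extend to length 2 by passing through the single node of the other label.
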
For extracting GRLM-based features, we used six measurements: short path emphasis (SPE), long path emphasis (LPE), their per-edge-type variants SPE(*t*) and LPE(*t*), edge type nonuniformity (ETN), and path length nonuniformity (PLN). The expressions for these features are listed below.

Short path emphasis (SPE) and SPE(*t*) for each edge type *t*:

$$\mathrm{SPE} = \frac{1}{n_r}\sum_{t}\sum_{l}\frac{G(t, l)}{l^{2}}$$

$$\mathrm{SPE}(t) = \frac{1}{n_r(t)}\sum_{l}\frac{G(t, l)}{l^{2}}$$

where $n_r$ is the total number of runs in the GRLM and $n_r(t)$ is the total number of runs corresponding to edge type *t*.

Long path emphasis (LPE) and LPE(*t*) for each edge type *t*:

$$\mathrm{LPE} = \frac{1}{n_r}\sum_{t}\sum_{l} l^{2}\, G(t, l)$$

$$\mathrm{LPE}(t) = \frac{1}{n_r(t)}\sum_{l} l^{2}\, G(t, l)$$

Edge type nonuniformity (ETN) and path length nonuniformity (PLN):

$$\mathrm{ETN} = \frac{1}{n_r}\sum_{t}\left(\sum_{l} G(t, l)\right)^{2}$$

$$\mathrm{PLN} = \frac{1}{n_r}\sum_{l}\left(\sum_{t} G(t, l)\right)^{2}$$

### Classification of Immune signature status

The immune signature scores from the TCGA cohort were dichotomized at the median value: scores above the median were labeled up-regulated, and scores at or below the median were labeled down-regulated. This binary designation is used as the class label in the classification task. These gene signatures associated with immune status in GBM, immune effector response and immune suppression response, have been previously validated within GBM and were evaluated in every patient in our dataset 19.

### Statistical analysis

A total of 28 MST-based features (seven MST features from each of the four habitats) and 24 GRLM-based features (SPE, LPE, ETN, PLN, and 10 SPE(*t*) and 10 LPE(*t*) features for the 10 edge types) were extracted and analyzed in this study. Survival was dichotomized at the 12-month time point (i.e. ≤ or > 12 months) based on the median overall survival duration 1,2, and imbalance in sample size between the two survival groups was controlled using class-proportional sampling 31 in the Waikato Environment for Knowledge Analysis (WEKA v3.7.12) 32. Classification of the survival labels using all of the MST features was performed using random forest (RF) classification 33 (10,000 trees, using the random forest classifier within WEKA).
This performance was evaluated using the receiver operating characteristic (ROC) curve. The ranking of the features was also obtained using a classifier-based attribute evaluator (where RF was set as the classifier) within WEKA.

## Results

Classifier models were obtained using the random forest classification approach 33 with 5-fold cross-validation for the prediction of 12-month overall survival status based on all of the features derived from the MST- and GRLM-based methods, respectively. Figure 5 shows the ROC curves for the MST and GRLM classifiers. The optimal cutoff point was determined by maximizing the sum of sensitivity and specificity. The area under the ROC curve, the true positive rate, and the true negative rate are summarized in Table 2. The area under the ROC curve was 0.832 for MST and 0.773 for GRLM. The accuracy was 0.807 for MST and 0.747 for GRLM, computed using Eq. (11):

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{11}$$

where TP, FP, TN, and FN represent true positive, false positive, true negative, and false negative, respectively. For MST, the five top-ranked features were the ratio of the T1-low MST, the ratio of the T2-high MST, the ratio of the T1-high MST, the disorder of the T1-low MST, and the standard deviation of the T1-high MST. For GRLM, they were LPE1, SPE4, LPE4, LPE10, and SPE10.

[Table 2.](http://medrxiv.org/content/early/2022/12/18/2022.12.16.22283587/T2) The results of the ROC analysis for the 12-month survival prediction

![Figure 5.](https://www.medrxiv.org/content/medrxiv/early/2022/12/18/2022.12.16.22283587/F5.medium.gif)

Figure 5. ROC curve for prediction of 12-month survival status. The x-axis is the false positive rate (or 1 – specificity); the y-axis is the true positive rate (or sensitivity). The area under the ROC curve is 0.832 for MST and 0.773 for GRLM.
The optimal cutoff points are (0.23, 0.73) and (0.21, 0.82) for GRLM and MST, respectively.

For classification of immune signature status, Table 3 summarizes the true positive rate, true negative rate, and classification error for MST and GRLM, and Figure 6 shows the ROC curves for the MST and GRLM classifiers, respectively. The optimal cutoff points were determined by maximizing the sum of sensitivity and specificity.

[Table 3.](http://medrxiv.org/content/early/2022/12/18/2022.12.16.22283587/T3) The results of the ROC analysis for the immune status

![Figure 6.](https://www.medrxiv.org/content/medrxiv/early/2022/12/18/2022.12.16.22283587/F6.medium.gif)

Figure 6. ROC curve for prediction of immune signature status. The x-axis is the false positive rate (or 1 – specificity); the y-axis is the true positive rate (or sensitivity). (a) The areas under the ROC curves for MST are 0.747 and 0.743 for immune effector (IE) and immune suppressor (IS), respectively, and (b) the areas under the ROC curves for GRLM are 0.681 and 0.789 for IE and IS, respectively.

## Discussion

In this study, we identified four distinct groups of pixel intensities (habitats) within the tumor ROI obtained from different MR sequences, separating each sequence into high and low intensities using a Gaussian mixture model. All analyses were performed on the 2D slice with the maximum tumor area in both the T1 post-contrast image and the T2 FLAIR image. These four habitats are considered distinct entities within an ROI, and their spatial relationships are characterized using the MST and GRLM approaches. The MST is one of the most common characterizations of the spatial proximity of points distributed topologically in space 34. The gray-level run-length matrix, introduced by Galloway 17, is also a widely used method in texture analysis that characterizes image texture based on the gray-level run lengths of an image.
In this work, we applied these two methods to radiologically defined regions in GBM tumors and investigated the association between graph-based features of habitat proximity from each method (MST or GRLM) and 12-month overall survival status in GBM patients.

We used random forest classification for the following reasons: (i) this classifier can handle a large number of features, (ii) it works efficiently in scenarios where the number of instances is smaller than the number of features used for classification, (iii) it is capable of performing cross-validation intrinsically, and (iv) it gives estimates of which variables are important in the classification 33. The proposed MST and GRLM features, combined with existing methods and features, could assist in the assessment of overall survival and serve as a prognostic tool based on routine MRI scans obtained in these patients. To our knowledge, this is the first investigation of both of these habitat proximity characterizations (MST- and GRLM-based analyses) in the context of multiparametric MRI data and of their application to survival prognostication in GBM.

The graph-based methods used in this study could help connect radiologic imaging with cellular evolution within tumors 28. These results point to clinically relevant relationships of tumor-derived phenotype with overall survival. Such data could enable generation of valuable hypotheses for the investigation of phenotype relationships based on public domain datasets like TCGA. Further evaluation on clinically-matched patient cohorts with standardized imaging protocols is essential to strengthen evidence for the clinical translation of this finding. We also showed, through ROC curve analysis, that these graph-based features are associated with immune signature status in GBM patients, especially for the immune suppressor signature with GRLM features (AUC of 0.789).
One potential limitation is that this is a retrospective analysis performed using a publicly available database encompassing various scanning protocols and MRI systems, resulting in differences in pixel resolution (256×256 or 512×512), slice thickness (1.4 to 5.0 mm), repetition time (4.9 to 3285.6 ms for T1 and 400 to 1100 ms for T2), and echo time (2.1 to 20 ms for T1 and 14 to 155 ms for T2). We performed image preprocessing steps such as pixel reslicing and intensity normalization to make the MR images comparable across patients. However, the effect of these variations in MR images on both MST- and GRLM-based features needs to be examined more systematically.

In this study, we presented graph-based methods for characterizing the spatial proximity of radiologically-defined habitats in GBM tumors, using a minimum spanning tree and graph run-length matrix construction. According to our results, the features from both the MST- and GRLM-based methods provide quantitative metrics of image heterogeneity that have prognostic value for patient survival. We surmise that the spatial proximity features of habitats based on the MST and GRLM approaches may offer a promising clinical prognostic tool in glioblastomas. However, this approach needs to be validated further in an independent patient cohort to confirm its predictive potential.

## Data Availability

All data produced in the present study are available upon reasonable request to the authors.

## Author Contributions

Project conception and design were by J.L. and A.R. The data collection and preprocessing were performed by J.L., S.N., J.M., and G.R. The software programming, statistical analysis, and interpretation were performed by J.L. The manuscript was written by J.L. and all authors reviewed the manuscript.

## Additional Information

### Competing Interests

The authors declare no competing interests.

## Acknowledgements

We would like to thank scientific editors, Markeda Wade, Tamara Locke, and Arthur Gelmis, for their help with manuscript editing and suggestions. The authors acknowledge the support of NCI P30 CA016672.

* Received December 16, 2022.
* Revision received December 16, 2022.
* Accepted December 18, 2022.
* © 2022, Posted by Cold Spring Harbor Laboratory

The copyright holder for this pre-print is the author. All rights reserved. The material may not be redistributed, re-used or adapted without the author's permission.

## References

1. Johnson, D. R. & O’Neill, B. P. Glioblastoma survival in the United States before and during the temozolomide era. J Neuro-Oncol 107, 359–364, doi:10.1007/s11060-011-0749-4 (2012).
2. Sathornsumetee, S. et al. Molecularly targeted therapy for malignant glioma. Cancer 110, 13–24, doi:10.1002/cncr.22741 (2007).
3. Assefa, D. et al. Robust texture features for response monitoring of glioblastoma multiforme on T1-weighted and T2-FLAIR MR images: a preliminary investigation in terms of identification and segmentation. Medical physics 37, 1722–1736 (2010).
4. Agner, S. C. et al. Textural kinetics: a novel dynamic contrast-enhanced (DCE)-MRI feature for breast lesion classification. Journal of digital imaging 24, 446–463, doi:10.1007/s10278-010-9298-1 (2011).
5. Pope, W. B. et al. MR imaging correlates of survival in patients with high-grade gliomas. AJNR. American journal of neuroradiology 26, 2466–2474 (2005).
6. Mazurowski, M. A., Desjardins, A. & Malof, J. M. Imaging descriptors improve the predictive power of survival models for glioblastoma patients. Neuro Oncol 15, 1389–1394, doi:10.1093/neuonc/nos335 (2013).
7. Itakura, H. et al. Magnetic resonance image features identify glioblastoma phenotypic subtypes with distinct molecular pathway activities. Science translational medicine 7, 303ra138, doi:10.1126/scitranslmed.aaa7582 (2015).
8. Macyszyn, L. et al. Imaging patterns predict patient survival and molecular subtype in glioblastoma via machine learning techniques. Neuro Oncol 18, 417–425, doi:10.1093/neuonc/nov127 (2016).
9. Gutman, D. A. et al. MR imaging predictors of molecular profile and survival: multi-institutional study of the TCGA glioblastoma data set. Radiology 267, 560–569, doi:10.1148/radiol.13120118 (2013).
10. Nicolasjilwan, M. et al. Addition of MR imaging features and genetic biomarkers strengthens glioblastoma survival prediction in TCGA patients. Journal of neuroradiology. Journal de neuroradiologie 42, 212–221, doi:10.1016/j.neurad.2014.02.006 (2015).
11. Gatenby, R. A., Grove, O. & Gillies, R. J. Quantitative imaging in cancer evolution and ecology.
Radiology 269, 8–15, doi:10.1148/radiol.13122697 (2013).
12. Miles, K. A., Ganeshan, B., Griffiths, M. R., Young, R. C. & Chatwin, C. R. Colorectal cancer: texture analysis of portal phase hepatic CT images as a potential marker of survival. Radiology 250, 444–452, doi:10.1148/radiol.2502071879 (2009).
13. Cheng, N. M. et al. Textural features of pretreatment 18F-FDG PET/CT images: prognostic significance in patients with advanced T-stage oropharyngeal squamous cell carcinoma. Journal of nuclear medicine : official publication, Society of Nuclear Medicine 54, 1703–1709, doi:10.2967/jnumed.112.119289 (2013).
14. Podlaha, O., Riester, M., De, S. & Michor, F. Evolution of the cancer genome. Trends in genetics : TIG 28, 155–163, doi:10.1016/j.tig.2012.01.003 (2012).
15. Basavanhally, A. N. et al. Computerized image-based detection and grading of lymphocytic infiltration in HER2+ breast cancer histopathology. IEEE transactions on bio-medical engineering 57, 642–653, doi:10.1109/TBME.2009.2035305 (2010).
16. Tosun, A. B. & Gunduz-Demir, C. Graph run-length matrices for histopathological image segmentation. IEEE transactions on medical imaging 30, 721–732, doi:10.1109/TMI.2010.2094200 (2011).
17. Galloway, M. M. Texture analysis using gray level run lengths. Computer graphics and image processing 4, 172–179 (1975).
18. Verhaak, R. G. et al. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer cell 17, 98–110, doi:10.1016/j.ccr.2009.12.020 (2010).
19. Doucette, T. et al. Immune heterogeneity of glioblastoma subtypes: extrapolation from the cancer genome atlas. Cancer immunology research 1, 112–122, doi:10.1158/2326-6066.CIR-13-0028 (2013).
20. Barbie, D. A. et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature 462, 108–112, doi:10.1038/nature08460 (2009).
21. Cho, Y. J. et al. Integrative genomic analysis of medulloblastoma identifies a molecular subgroup that drives poor clinical outcome. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 29, 1424–1430, doi:10.1200/JCO.2010.28.5148 (2011).
22. Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102, 15545–15550, doi:10.1073/pnas.0506580102 (2005).
23. Tamayo, P. et al. Predicting relapse in patients with medulloblastoma by integrating evidence from clinical and genomic features. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 29, 1415–1423, doi:10.1200/JCO.2010.28.1675 (2011).
24. Sled, J. G., Zijdenbos, A. P. & Evans, A. C. A nonparametric method for automatic correction of intensity nonuniformity in MRI data. IEEE transactions on medical imaging 17, 87–97, doi:10.1109/42.668698 (1998).
25. Shah, M. et al. Evaluating intensity normalization on MRIs of human brain with multiple sclerosis. Medical image analysis 15, 267–282, doi:10.1016/j.media.2010.12.003 (2011).
26. McAuliffe, M. J. et al. Medical Image Processing, Analysis & Visualization in clinical research. Comp Med Sy, 381–386 (2001).
27. Reynolds, D. Gaussian mixture models. Encyclopedia of Biometrics, 659–663 (2009).
28. Zhou, M. et al. Radiologically defined ecological dynamics and clinical outcomes in glioblastoma multiforme: preliminary results. Translational oncology 7, 5–13 (2014).
29. Lee, J., Narang, S., Martinez, J. J., Rao, G. & Rao, A. Associating spatial diversity features of radiologically defined tumor habitats with epidermal growth factor receptor driver status and 12-month survival in glioblastoma: methods and preliminary investigation. Journal of Medical Imaging 2, 041006–041006 (2015).
30. Delaunay, B. (Izv. Acad. Nauk. SSSR, 1934).
31. Chawla, N. V., Bowyer, K. W., Hall, L. O. & Kegelmeyer, W. P. SMOTE: synthetic minority over-sampling technique. Journal of artificial intelligence research, 321–357 (2002).
32. Hall, M. et al. The WEKA data mining software: an update. ACM SIGKDD explorations newsletter 11, 10–18 (2009).
33. Breiman, L. Random forests. Mach Learn 45, 5–32, doi:10.1023/A:1010933404324 (2001).
34. Arkadʹev, A. G. & Braverman, Ė. M. Computers and pattern recognition. (Thompson Book Co., 1967).

[1]: /embed/graphic-3.gif
[2]: /embed/graphic-5.gif
[3]: /embed/graphic-6.gif
[4]: /embed/graphic-7.gif
[5]: /embed/graphic-10.gif
[6]: /embed/graphic-11.gif
[7]: /embed/graphic-12.gif
[8]: /embed/graphic-13.gif
[9]: /embed/graphic-14.gif
[10]: /embed/graphic-15.gif
[11]: /embed/graphic-17.gif