PT - JOURNAL ARTICLE
AU - Xu, Zhuoran
AU - Verma, Akanksha
AU - Naveed, Uska
AU - Bakhoum, Samuel
AU - Khosravi, Pegah
AU - Elemento, Olivier
TI - Using Histopathology Images to Predict Chromosomal Instability in Breast Cancer: A Deep Learning Approach
AID - 10.1101/2020.09.23.20200139
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.09.23.20200139
4099 - http://medrxiv.org/content/early/2020/09/24/2020.09.23.20200139.short
4100 - http://medrxiv.org/content/early/2020/09/24/2020.09.23.20200139.full
AB - Chromosomal instability (CIN) is a hallmark of human cancer that involves mis-segregation of chromosomes during mitosis, leading to aneuploidy and genomic copy number heterogeneity. CIN is a prognostic marker in a variety of cancers, yet gold-standard experimental assessment of chromosome mis-segregation is difficult in the routine clinical setting. As a result, CIN status is not readily testable for cancer patients in such settings. By contrast, histopathological examination, the gold standard for cancer diagnosis and grading, is ubiquitously available. In this study, we sought to explore whether CIN status can be predicted using hematoxylin and eosin (H&E) histology in breast cancer patients. Specifically, we examined whether CIN, defined using a genomic aneuploidy burden approach, can be predicted using a deep learning-based model. We applied transfer learning to convolutional neural network (CNN) models to extract histological features and trained a multilayer perceptron (MLP) after aggregating patch features obtained from whole slide images. When applied to a breast cancer cohort of 1,010 patients (training set: n=858 patients; test set: n=152 patients) from The Cancer Genome Atlas (TCGA), of whom 485 have high CIN status, our model accurately classified CIN status, achieving an area under the curve (AUC) of 0.822 with 81.2% sensitivity and 68.7% specificity in the test set.
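The test-set figures quoted above (AUC, sensitivity, specificity) follow standard definitions for a binary classifier. The following is a minimal sketch of how such metrics can be computed from per-patient labels and predicted scores. The function names, the 0.5 decision threshold, and the toy data are assumptions made for illustration; they are not the authors' code and not TCGA data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity); label 1 = high CIN, 0 = low CIN.

    The threshold is a hypothetical choice; the abstract does not state one.
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented toy data: four high-CIN and four low-CIN patients.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

sens, spec = sensitivity_specificity(toy_labels, toy_scores)
auc = roc_auc(toy_labels, toy_scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports all three.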
Patch-level predictions of CIN status suggested intra-tumor spatial heterogeneity within slides. Moreover, the presence of patches with a high predicted CIN score within a slide was more predictive of clinical outcome than the slide's average CIN score, underscoring the clinical importance of spatial heterogeneity. Overall, we demonstrated the ability of deep learning methods to predict CIN status based on histopathology slide images. Our model is not specific to any breast cancer subtype, and the method could potentially be extended to other cancer types.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: No external funding has been received for this work.
Author Declarations:
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: This research involves publicly available de-identified patient data collected by The Cancer Genome Atlas. The Weill Cornell Medicine Institutional Review Board found this research exempt since it does not involve identifiable private information.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes
Whole slide images along with clinical and genomic data can be downloaded from The Cancer Genome Atlas (TCGA), project TCGA-BRCA. The source code and guidelines will be publicly available at https://github.com/eipm/CIN.
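The abstract contrasts two slide-level summaries of patch-level CIN predictions: the slide's average score and the presence of high-scoring patches. The sketch below illustrates why the two can disagree on a spatially heterogeneous slide. The abstract does not say how "high predicted CIN score" was defined, so the 0.5 cutoff, the function name, and the toy patch scores are all assumptions, not the authors' method.

```python
from statistics import mean
from typing import Dict, List


def slide_summaries(
    patch_scores: List[float], high_cutoff: float = 0.5
) -> Dict[str, object]:
    """Summarize per-patch CIN scores for one slide.

    high_cutoff is a hypothetical threshold for calling a patch "high CIN".
    """
    if not patch_scores:
        raise ValueError("a slide needs at least one patch score")
    n_high = sum(1 for s in patch_scores if s >= high_cutoff)
    return {
        "mean_score": mean(patch_scores),
        "max_score": max(patch_scores),
        "has_high_patch": n_high > 0,
        "high_fraction": n_high / len(patch_scores),
    }


# Invented toy slides: one uniform, one with a single high-scoring region.
homogeneous = slide_summaries([0.4, 0.4, 0.4, 0.4])
heterogeneous = slide_summaries([0.1, 0.1, 0.1, 0.9])

for name, summary in [("homogeneous", homogeneous),
                      ("heterogeneous", heterogeneous)]:
    print(name, summary)
```

In this toy case the uniform slide has the higher mean, yet only the heterogeneous slide contains a high-scoring patch, which is the kind of signal an average would wash out.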