PT - JOURNAL ARTICLE
AU - Lavinia Loeffler, Chiara Maria
AU - El Nahhas, Omar S.M.
AU - Muti, Hannah Sophie
AU - Seibel, Tobias
AU - Cifci, Didem
AU - Treeck, Marko van
AU - Gustav, Marco
AU - Carrero, Zunamys I.
AU - Gaisa, Nadine T.
AU - Lehmann, Kjong-Van
AU - Leary, Alexandra
AU - Selenica, Pier
AU - Reis-Filho, Jorge S.
AU - Bruechle, Nadina Ortiz
AU - Kather, Jakob Nikolas
TI - Direct prediction of Homologous Recombination Deficiency from routine histology in ten different tumor types with attention-based Multiple Instance Learning: a development and validation study
AID - 10.1101/2023.03.08.23286975
DP - 2023 Jan 01
TA - medRxiv
PG - 2023.03.08.23286975
4099 - http://medrxiv.org/content/early/2023/03/10/2023.03.08.23286975.short
4100 - http://medrxiv.org/content/early/2023/03/10/2023.03.08.23286975.full
AB - Background: Homologous Recombination Deficiency (HRD) is a pan-cancer predictive biomarker that identifies patients who benefit from therapy with PARP inhibitors (PARPi). However, testing for HRD is highly complex. Here, we investigated whether Deep Learning can predict HRD status solely based on routine Hematoxylin & Eosin (H&E) histology images in ten cancer types.
Methods: We developed a fully automated deep learning pipeline with attention-weighted multiple instance learning (attMIL) to predict HRD status from histology images. A combined genomic scar HRD score, which integrated loss of heterozygosity (LOH), telomeric allelic imbalance (TAI) and large-scale state transitions (LST), was calculated from whole genome sequencing data for n=4,565 patients from two independent cohorts. The primary statistical endpoint was the Area Under the Receiver Operating Characteristic curve (AUROC) for the prediction of genomic scar HRD with a clinically used cutoff value.
Results: We found that HRD status is predictable in tumors of the endometrium, pancreas and lung, reaching cross-validated AUROCs of 0.79, 0.58 and 0.66.
Predictions generalized well to an external cohort with AUROCs of 0.93, 0.81 and 0.73, respectively. Additionally, an HRD classifier trained on breast cancer yielded an AUROC of 0.78 in internal validation and was able to predict HRD in endometrial, prostate and pancreatic cancer with AUROCs of 0.87, 0.84 and 0.67, indicating a shared HRD-like phenotype across tumor entities.
Conclusion: In this study, we show that HRD is directly predictable from H&E slides using attMIL within and across ten different tumor types.
Competing Interest Statement: JNK reports consulting services for Owkin, France, Panakeia, UK and DoMore Diagnostics, Norway and has received honoraria for lectures by MSD, Eisai and Fresenius. JSRF reports a leadership (board of directors) role at Grupo Oncoclinicas, stock or other ownership interests at Repare Therapeutics and Paige.AI, and a consulting or advisory role at Genentech/Roche, Invicro, Ventana Medical Systems, Volition RX, Paige.AI, Goldman Sachs, Bain Capital, Novartis, Repare Therapeutics, Lilly, Saga Diagnostics, Swarm and Personalis. No other potential conflicts of interest are reported by any of the authors.
Funding Statement: JNK is supported by the German Federal Ministry of Health (DEEP LIVER, ZMVI1-2520DAT111) and the Max-Eder-Programme of the German Cancer Aid (grant #70113864), the German Federal Ministry of Education and Research (PEARL, 01KD2104C), and the German Academic Exchange Service (SECAI, 57616814). This research was supported by the National Institute for Health and Care Research (NIHR, NIHR213331) Leeds Biomedical Research Centre. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care.
JSRF is funded in part by the Breast Cancer Research Foundation, a Susan G Komen Leadership Grant, the NIH/NCI P50 CA247749 01 grant and by the NIH/NCI Cancer Center Core Grant P30-CA008748.
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: https://portal.gdc.cancer.gov/ https://www.cbioportal.org/ https://www.cancerimagingarchive.net/
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes
The WSI, molecular and clinical data for TCGA and CPTAC cohorts are publicly accessible at https://portal.gdc.cancer.gov/ and https://www.cbioportal.org/ (accessed 08 March 2022). The script for calculating the HRD score is available at https://github.com/sztup/scarHRD (accessed 06 June 2022). All other source code can be downloaded from https://github.com/KatherLab/marugoto.
Our calculated HRD score is publicly available in Supplementary Table 2. Moreover, our custom TCGA-BRCA HRD-H and HRD-L group can be accessed for the PanCancer Atlas cohort at https://www.cbioportal.org/ (Supplementary 3).
Abbreviations:
AI: artificial intelligence
ASCAT: Allele-Specific Copy number Analysis of Tumors
attMIL: attention-weighted multiple instance learning
AUROC: Area Under the Receiver Operating Characteristic curve
BRCA: breast invasive carcinoma
BRCA1/2: Breast Cancer genes 1 and 2
CI: confidence interval
CIOMS: Council for International Organizations of Medical Sciences
CPTAC: Clinical Proteomic Tumor Analysis Consortium
CRC: colorectal cancer
DL: Deep Learning
DSB: DNA double-strand breaks
ER-: estrogen receptor negative
ER+: estrogen receptor positive
FDA: U.S. Food and Drug Administration
GBM: glioblastoma
GDC: Genomic Data Commons
GIS: genomic instability score
H&E: Hematoxylin & Eosin
HR: Homologous recombination
HRD-H: HRD high
HRD-L: HRD low
HRD: Homologous Recombination Deficiency
HRR: Homologous recombination repair
LIHC: liver hepatocellular carcinoma
LOH: loss of heterozygosity
LSCC: squamous cell carcinoma of the lung
LST: large-scale state transitions
LUAD: adenocarcinoma of the lung
LUSC: squamous cell carcinoma of the lung
OV: ovarian cancer
PAAD: pancreatic adenocarcinoma
PDA: pancreatic adenocarcinoma
PARP: Poly(ADP-Ribose)-polymerase
PARPi: Poly(ADP-Ribose)-polymerase inhibitor
PRAD: prostate adenocarcinoma
PRC: precision recall curve
ROC: receiver operating characteristic curve
SBS3: single base substitution 3
SNP: single nucleotide polymorphism
SSDBs: single strand DNA breaks
SSL: self-supervised learning
TAI: telomeric allelic imbalance
TCGA: The Cancer Genome Atlas
TRIPOD: Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis
UCEC: endometrial carcinoma
WSI: whole slide images