Abstract
When designing individualized treatment protocols for cancer patients, clinicians must synthesize the information from multiple data modalities into a single parsimonious description of the patient’s personal disease. However, such a description of a patient is never observed. In this work, we propose to model these patient descriptions as latent discriminative subtypes—sample representations which can be learned from one data modality and used to contextualize predictions based on another data modality. We apply contextual deep learning to learn these sample-specific discriminative subtypes from lung cancer histopathology imagery. Based on these subtypes, we produce sample-specific transcriptomic models which accurately classify samples as adenocarcinoma, squamous cell carcinoma, or healthy tissue (F1 score of 0.97, outperforming previous state-of-the-art multimodal approaches). Combining these data modalities in a single pipeline not only improves the predictive accuracy, but also gives biological interpretations of the discriminative subtypes and ties the phenotypic patterns present in histopathology images to biological processes.
Competing Interest Statement
EX is a founder of Petuum, Inc.
Funding Statement
This work was supported in part by NIH R01GM114311. B.L. was supported in part by the CMLH Fellowship. M.A. was supported in part by the Google PhD Fellowship. A.A. was supported in part by NIH grants 1R01GM122096 and OT2OD026682. J.W. was supported by NIH T32 training grant T32 EB009403 as part of the HHMI-NIBIB Interfaces Initiative.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
None required.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
TCGA data is publicly available at the following link.