Abstract
Lymphomas vary in clinical behavior, morphology, and response to therapy, so accurate classification is essential for appropriate patient management. In this study, using a set of 670 lymphoma cases obtained from a center in Guatemala City, we propose an interpretable machine learning method, LymphoML, for lymphoma subtyping into eight diagnostic categories. LymphoML sequentially applies (1) object segmentation to extract nuclei, cells, and cytoplasm from hematoxylin and eosin (H&E)-stained tissue microarray (TMA) cores, (2) extraction of morphological, textural, and architectural features, and (3) aggregation of per-object features to create patch-level feature vectors for lymphoma classification. LymphoML achieves a diagnostic accuracy of 64.3% (AUROC: 85.9%, specificity: 88.7%, sensitivity: 66.9%) among eight lymphoma subtypes using only H&E-stained TMA core sections, a level similar to that of experienced hematopathologists. We find that the best model’s set of nuclear and cytoplasmic morphological, textural, and architectural features is most discriminative for diffuse large B-cell lymphoma (F1: 78.7%), classic Hodgkin lymphoma (F1: 74.5%), and mantle cell lymphoma (F1: 71.0%). Nuclear shape features provide the highest diagnostic yield, with nuclear texture, cytoplasmic, and architectural features providing smaller gains in accuracy. Finally, combining information from the H&E-based model with the results of a limited set of immunohistochemical (IHC) stains yielded a diagnostic accuracy (accuracy: 85.3%, AUROC: 95.7%, sensitivity: 84.5%, specificity: 93.5%) similar to that obtained with a much larger set of IHC stains (accuracy: 86.1%, AUROC: 96.7%, specificity: 93.2%, sensitivity: 86.0%). Our work suggests a potential way to incorporate machine learning tools into clinical practice to reduce the number of expensive IHC stains while achieving a similar level of diagnostic accuracy.
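As an illustration of step (3), the sketch below shows one way per-object measurements could be collapsed into a single patch-level feature vector. It is not the LymphoML implementation: the function name `aggregate_patch_features`, the feature names, the sample values, and the choice of mean and standard deviation as summary statistics are all assumptions made for the example.

```python
from statistics import mean, pstdev


def aggregate_patch_features(objects):
    """Collapse per-object feature dicts into one patch-level feature vector.

    `objects` is a list of dicts, one per segmented object (for example a
    nucleus), all sharing the same feature names. Summarizing each feature
    by its mean and population standard deviation is an assumption of this
    sketch, not a detail taken from the abstract.
    """
    if not objects:
        raise ValueError("patch contains no segmented objects")
    patch_vector = {}
    for name in sorted(objects[0]):
        values = [obj[name] for obj in objects]
        patch_vector[f"{name}_mean"] = mean(values)
        patch_vector[f"{name}_std"] = pstdev(values)
    return patch_vector


# Hypothetical per-nucleus measurements for a single patch.
nuclei = [
    {"area": 40.0, "eccentricity": 0.5},
    {"area": 60.0, "eccentricity": 0.7},
]
patch = aggregate_patch_features(nuclei)
```

Each patch then maps to a fixed-length vector regardless of how many objects it contains, which is what allows a standard classifier to consume the result and keeps each input traceable to a named morphological measurement.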
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study did not receive any funding.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The ethics committees/IRBs of Stanford University and La Liga Nacional Contra El Cancer gave ethical approval for this work.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
Competing Interests: The authors declare no competing financial interests.
Data Availability
All data produced in the present study are available upon reasonable request to the authors.