Abstract
Tumor-Infiltrating Lymphocytes (TILs) have strong prognostic and predictive value in breast cancer, but their visual assessment is subjective. To improve reproducibility, the International Immuno-oncology Working Group recently released recommendations for the computational assessment of TILs that build on visual scoring guidelines. However, existing resources do not adequately address these recommendations because no annotation datasets enable joint, panoptic segmentation of tissue regions and cells. Moreover, existing deep-learning methods focus exclusively on either tissue segmentation or cell nuclei detection, which complicates TILs assessment by requiring multiple models and the reconciliation of inconsistent predictions. We introduce PanopTILs, a region and cell-level annotation dataset containing 814,886 nuclei from 151 patients, openly accessible at: sites.google.com/view/panoptils. Using PanopTILs, we developed MuTILs, a neural network optimized for assessing TILs in accordance with clinical recommendations. MuTILs is a concept bottleneck model designed to be interpretable and to encourage sensible predictions at multiple resolutions. Using a rigorous internal-external cross-validation procedure, MuTILs achieves an AUROC of 0.93 for lymphocyte detection and a Dice coefficient of 0.81 for tumor-associated stroma segmentation. Our computational score closely matched visual scores from two pathologists (Spearman R=0.58-0.61, p<0.001). Moreover, computational TILs scores had a higher prognostic value than visual scores, independent of TNM stage and patient age. In conclusion, we introduce a comprehensive open data resource and a novel modeling approach for detailed mapping of the breast tumor microenvironment.
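For readers unfamiliar with the segmentation metric reported above, the following is a minimal sketch of how a Dice coefficient between a predicted and a reference binary mask can be computed, together with a simple ratio of TILs to total cells. It is an illustration only and is not taken from the MuTILs code: the function names `dice_coefficient` and `til_cell_fraction`, the flat-list mask representation, and the handling of empty inputs are all assumptions made for this example.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as flat sequences of 0/1.

    Hypothetical helper for illustration; masks are assumed to be flattened
    to equal-length sequences.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labeled positive in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total positive pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumption: two empty masks are treated as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def til_cell_fraction(n_tils, n_cells):
    """Ratio of the number of TILs to the total number of cells.

    Hypothetical helper; returns 0.0 when no cells were detected.
    """
    if n_cells == 0:
        return 0.0
    return n_tils / n_cells


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_coefficient(predicted, reference))
    print(til_cell_fraction(3, 100))
```

A Dice value of 1 indicates perfect overlap and 0 indicates none, so the reported 0.81 for tumor-associated stroma reflects substantial but imperfect agreement with the reference annotations.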
Competing Interest Statement
R.S. has received research support from Merck, Roche, and Puma, and travel/congress support from AstraZeneca, Roche, and Merck. He has served as an advisory board member for BMS and Roche and consults for BMS.
Funding Statement
This work was supported by the U.S. National Institutes of Health National Cancer Institute grants U01CA220401 and U24CA19436201. We acknowledge support from Dr. David Gutman and the American Cancer Society, including Dr. Mia M. Gaudet, Dr. Samantha Puvanesarajah, Dr. Lauren Teras, James Hodge, and Elizabeth Bain.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
We relied on whole-slide images and clinical data from The Cancer Genome Atlas dataset, which is publicly available at: https://portal.gdc.cancer.gov/. Additionally, we used data from the Cancer Prevention Study-II (CPS-II). Approval for access and use of the CPS-II dataset was granted via an institutional data-sharing agreement between the American Cancer Society and Northwestern University. All data were shared with Northwestern in de-identified form. The Institutional Review Board of the American Cancer Society provided the initial approval for the collection of data, and all participants provided written informed consent for research use of their data. Further details on the ethics oversight can be found in the original CPS-II publication: Calle EE, Rodriguez C, Jacobs EJ, Almon ML, Chao A, McCullough ML, Feigelson HS, Thun MJ. The American Cancer Society Cancer Prevention Study II Nutrition Cohort: rationale, study design, and baseline characteristics. Cancer. 2002 May 1;94(9):2490-501. For questions, contact the American Cancer Society at: https://www.cancer.org/
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
† Co-first authors.
- More detailed evaluation with F1 score, MCC, specificity/sensitivity, and precision/recall.
- Additional comparisons to alternative models trained on region-only or nucleus-only datasets.
- A detailed performance analysis of the impact of the region constraint.
- Added rationale for using 3% as the threshold for the ratio of the number of TILs to the total number of cells.
- Additional manual score for comparison to the computational score.
- Enriched discussion and additional results elsewhere.
Data Availability
The PanopTILs dataset is publicly available at: https://sites.google.com/view/panoptils/. The BCSS and NuCLS datasets used to construct PanopTILs are also publicly available, as are the TCGA clinical data. The Cancer Prevention Study-II data are available via the American Cancer Society (https://www.cancer.org/).