ABSTRACT
Aims Histological assessment is essential for the diagnosis and management of celiac disease. Current scoring systems, including the modified Marsh (Marsh–Oberhuber) score, lack inter-pathologist agreement. To address this unmet need, we aimed to develop a fully automated, quantitative approach for histology characterisation of celiac disease.
Methods Convolutional neural network models were trained using pathologist annotations of haematoxylin and eosin-stained biopsies of celiac disease mucosa and normal duodenum to identify cells, tissue and artifact regions. Human-interpretable features were extracted, and the strength of their correlation with Marsh scores was calculated using Spearman rank correlations.
Results Our model accurately identified cells, tissue regions and artifacts, including distinguishing intraepithelial lymphocytes and differentiating villous epithelium from crypt epithelium. Proportional area measurements representing villous atrophy negatively correlated with Marsh scores (r=−0.79), while measurements indicative of crypt hyperplasia and intraepithelial lymphocytosis positively correlated (r=0.71 and r=0.44, respectively). Furthermore, features distinguishing celiac disease from normal duodenum were identified.
Conclusions Our novel model provides an explainable and fully automated approach for histology characterisation of celiac disease that correlates with modified Marsh scores, facilitating diagnosis, prognosis, clinical trials and treatment response monitoring.
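As an illustration of the correlational analysis named in the Methods, the following is a minimal sketch of a Spearman rank correlation between one slide-level feature and Marsh scores. It is not the study's code: the function names, the integer encoding of Marsh scores and all data values are hypothetical and chosen only to show the calculation.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical example: Marsh scores encoded as ordinal integers, and a
# made-up proportional villous area per slide (not data from the study).
marsh_scores = [0, 0, 1, 2, 3, 3]
villous_area_fraction = [0.62, 0.58, 0.50, 0.41, 0.22, 0.18]

rho = spearman(villous_area_fraction, marsh_scores)
print(f"Spearman rho = {rho:.2f}")
```

Because the calculation uses ranks only, it needs Marsh scores to be ordered but not evenly spaced, which suits an ordinal grading scale. A negative value here corresponds to villous area shrinking as the Marsh score rises.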
What is already known on this topic
➢ Prior research has utilised machine learning (ML) techniques to detect celiac disease and evaluate disease severity based on Marsh scores.
➢ However, existing approaches lack the capability to provide fully explainable tissue segmentation and cell classifications across whole slide images in celiac disease histology.
➢ The need for a more comprehensive and interpretable ML-based method for celiac disease diagnosis and characterisation is evident from the limitations of currently available scoring systems as well as inter-pathologist variability.
What this study adds
➢ This study is the first to introduce an explainable ML-based approach that provides comprehensive, objective celiac disease histology characterisation, overcoming inter-observer variability and offering a scalable tool for assessing disease severity and monitoring treatment response.
How this study might affect research, practice or policy
➢ This study’s fully automated, ML-based histological analysis, including its correlation with Marsh scores, has the potential to enable more precise disease severity measurement, risk assessment and clinical trial endpoint evaluation, ultimately improving patient care.
Competing Interest Statement
AMG, ADF, KMC and KG are employees and shareholders of Eli Lilly and Company. MG, CS, QS, DF, AK, CK, DB, JABC, CJ and FN are employees of PathAI.
Funding Statement
This study was sponsored by Eli Lilly and Company. Medical writing assistance was provided by Jason Vuong, BPharm, and Clare Weston, MSc, of ProScribe Envision Pharma Group, and was funded by Eli Lilly and Company. ProScribe's services complied with international guidelines for Good Publication Practice.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
WCG IRB protocol number: 1316112
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
Model parameters for cell and tissue models, and code for model training, inference and feature extraction, are not disclosed. Access requests for such code will not be considered, to safeguard PathAI's intellectual property. All feature tables, as well as source code for reproducing correlational analyses, will be deposited to GitHub prior to publication, and the link will be provided at that time. Access to cell- and tissue-type heatmaps, as well as use of cell- and tissue-type classification models, is available on reasonable request to academic investigators without relevant conflicts of interest, for non-commercial use, who agree not to distribute the data. Access requests can be made to: publications@pathai.com.