RT Journal Article
SR Electronic
T1 Fully automated histological classification of cell types and tissue regions of celiac disease is feasible and correlates with the Marsh score
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2023.12.11.23299520
DO 10.1101/2023.12.11.23299520
A1 Griffin, Michael
A1 Gruver, Aaron M.
A1 Shah, Chintan
A1 Wani, Qasim
A1 Fahy, Darren
A1 Khosla, Archit
A1 Kirkup, Christian
A1 Borders, Daniel
A1 Brosnan-Cashman, Jacqueline A.
A1 Fulford, Angie D.
A1 Credille, Kelly M.
A1 Jayson, Christina
A1 Najdawi, Fedaa
A1 Gottlieb, Klaus
YR 2023
UL http://medrxiv.org/content/early/2023/12/11/2023.12.11.23299520.abstract
AB Aims: Histological assessment is essential for the diagnosis and management of celiac disease. Current scoring systems, including the modified Marsh (Marsh–Oberhuber) score, lack inter-pathologist agreement. To address this unmet need, we aimed to develop a fully automated, quantitative approach for histology characterisation of celiac disease. Methods: Convolutional neural network models were trained using pathologist annotations of haematoxylin and eosin-stained biopsies of celiac disease mucosa and normal duodenum to identify cells, tissue and artifact regions. Human interpretable features were extracted and the strength of their correlation with Marsh scores was calculated using Spearman rank correlations. Results: Our model accurately identified cells, tissue regions and artifacts, including distinguishing intraepithelial lymphocytes and differentiating villous epithelium from crypt epithelium. Proportional area measurements representing villous atrophy negatively correlated with Marsh scores (r=−0.79), while measurements indicative of crypt hyperplasia and intraepithelial lymphocytosis positively correlated (r=0.71 and r=0.44, respectively).
Furthermore, features distinguishing celiac disease from normal colon were identified. Conclusions: Our novel model provides an explainable and fully automated approach for histology characterisation of celiac disease that correlates with modified Marsh scores, facilitating diagnosis, prognosis, clinical trials and treatment response monitoring. What is already known on this topic: ➢ Prior research has utilised machine learning (ML) techniques to detect celiac disease and evaluate disease severity based on Marsh scores. ➢ However, existing approaches lack the capability to provide fully explainable tissue segmentation and cell classifications across whole slide images in celiac disease histology. ➢ The need for a more comprehensive and interpretable ML-based method for celiac disease diagnosis and characterisation is evident from the limitations of currently available scoring systems as well as inter-pathologist variability. What this study adds: ➢ This study is the first to introduce an explainable ML-based approach that provides comprehensive, objective celiac disease histology characterisation, overcoming inter-observer variability and offering a scalable tool for assessing disease severity and monitoring treatment response. How this study might affect research, practice or policy: ➢ This study’s fully automated and ML-based histological analysis, including the correlation of Marsh scores, has the potential to enable more precise disease severity measurement, risk assessment and clinical trial endpoint evaluation, ultimately improving patient care. Competing Interest Statement: AMG, ADF, KMC and KG are employees and shareholders of Eli Lilly and Company. MG, CS, QS, DF, AK, CK, DB, JABC, CJ and FN are employees of PathAI. Funding Statement: This study was sponsored by Eli Lilly and Company. Medical writing assistance was provided by Jason Vuong, BPharm, and Clare Weston, MSc, of ProScribe Envision Pharma Group, and was funded by Eli Lilly and Company.
ProScribe's services complied with international guidelines for Good Publication Practice. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: WCG IRB protocol number: 1316112. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes. Model parameters for the cell and tissue models, and code for model training, inference and feature extraction, are not disclosed. To safeguard PathAI's intellectual property, access requests for such code will not be considered. All feature tables, as well as source code for reproducing the correlational analyses, will be deposited to GitHub prior to publication, and the link will be provided at that time.
Access to cell- and tissue-type heatmaps, as well as usage of the cell- and tissue-type classification models, is available on reasonable request, for non-commercial use, to academic investigators without relevant conflicts of interest who agree not to distribute the data. Access requests can be made to: publications@pathai.com.