Abstract
Background and Aims Diagnostic differentiation between Crohn’s disease (CD) and ulcerative colitis (UC) is crucial for timely and suitable therapeutic measures. The current gold standard for differentiating between CD and UC involves endoscopy and histology, which are invasive and costly. We aimed to identify blood plasma proteomic signatures using a Protein-Wide Association Study (PWAS) approach to differentiate CD from UC and evaluate the efficacy of these signatures as features in machine learning (ML) classifiers.
Methods Among participants (n=1,106; nCD=636; nUC=470) of the Study of a Prospective Adult Research Cohort with IBD (SPARC), plasma protein (n=2,920) levels were estimated using Olink proteomics. A PWAS with Bonferroni correction for multiple testing was used to identify proteins associated with disease state after controlling for age, sex, and disease severity. ML classifiers were then used to assess the diagnostic utility of the identified proteins. Feature importance was determined via SHapley Additive exPlanations (SHAP) analysis.
Results Thirteen proteins were significantly differentially abundant in CD vs UC (all |β|s > 0.22, all adjusted p values < 8.42E-06). Random forest models differentiated between CD and UC, with models trained only on PWAS-identified proteins (average ROC-AUC 0.73) outperforming models trained on the full proteome (average ROC-AUC 0.62). SHAP analysis revealed that Granzyme B, insulin-like peptide 5 (INSL5), and interleukin-12 subunit beta (IL-12B) were the most important features.
Conclusions Our findings demonstrate that PWAS-based feature selection is a powerful approach for identifying informative features in complex, noisy datasets. Importantly, we have identified novel peptide-based biomarkers, such as INSL5, that could potentially be used to complement existing strategies to differentiate between CD and UC.
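The screening step described in the Methods can be illustrated with a minimal sketch on synthetic data. It assumes a per-protein ordinary least squares model of protein level on diagnosis plus covariates and a normal approximation for the p-values; the abstract does not state the model form, so this is an illustration of the general idea rather than the study's pipeline. The function name `pwas_select`, the synthetic cohort, and the planted effect sizes are all hypothetical.

```python
import math
import numpy as np


def pwas_select(protein_levels, is_cd, covariates, alpha=0.05):
    """Screen proteins with per-protein OLS and a Bonferroni threshold.

    protein_levels: (n_samples, n_proteins) array
    is_cd:          (n_samples,) indicator, 1 = CD, 0 = UC
    covariates:     (n_samples, n_covariates) array (e.g. age, sex, severity)
    Returns (betas, p_values, selected_indices) for the diagnosis term.
    """
    n, m = protein_levels.shape
    # Design matrix: intercept, diagnosis indicator, then covariates.
    X = np.column_stack([np.ones(n), is_cd, covariates])
    dof = n - X.shape[1]
    xtx_inv = np.linalg.inv(X.T @ X)
    # Fit all proteins at once: one column of coefficients per protein.
    coef = xtx_inv @ X.T @ protein_levels
    resid = protein_levels - X @ coef
    sigma2 = (resid ** 2).sum(axis=0) / dof
    # Standard error of the diagnosis coefficient (row/column 1 of the design).
    se = np.sqrt(sigma2 * xtx_inv[1, 1])
    z = coef[1] / se
    # Two-sided p-value under a normal approximation (an assumption that is
    # reasonable only when the sample size is large).
    p = np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z])
    # Bonferroni: compare each p-value against alpha divided by the test count.
    selected = np.flatnonzero(p < alpha / m)
    return coef[1], p, selected


# Synthetic cohort (hypothetical): 400 participants, 50 proteins,
# with a diagnosis effect planted in the first three proteins only.
rng = np.random.default_rng(0)
n, m = 400, 50
is_cd = rng.integers(0, 2, size=n).astype(float)
covariates = np.column_stack([
    rng.normal(50, 15, n),      # age
    rng.integers(0, 2, n),      # sex
    rng.normal(0, 1, n),        # severity score
])
levels = rng.normal(size=(n, m))
levels[:, :3] += 1.0 * is_cd[:, None]

betas, pvals, selected = pwas_select(levels, is_cd, covariates)
print("Bonferroni threshold:", 0.05 / m)
print("Selected protein indices:", selected.tolist())
```

In a workflow like the one summarized above, the columns in `selected` would then serve as the reduced feature set for a downstream classifier, in place of the full protein matrix.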
Competing Interest Statement
PD has received research support under a sponsored research agreement unrelated to the data in the paper and/or consulting from AbbVie, Arena Pharmaceuticals, Boehringer Ingelheim, Bristol Myers Squibb, Janssen, Pfizer, Prometheus Biosciences, Takeda Pharmaceuticals, Roche Genentech, Scipher Medicine, Fresenius Kabi, Teva Pharmaceuticals, Landos Pharmaceuticals, Iterative Scopes, and CorEvitas, LLC. U.J. has received research support from Boehringer Ingelheim.
Funding Statement
This work is supported by a SPARC IBD PLEXUS Grant from CCFA and NIH Washington University-DDRCC Grant Number P30 DK052574 to U.J. M.G.G. and S.R.F. are supported by awards from the Pediatric Gastroenterology Research Training Program (T32 DK077653). A.J.G. is supported by NSF (DGE-213989). PD is supported by a Junior Faculty Development Award from the American College of Gastroenterology and IBD Plexus of the Crohn's & Colitis Foundation.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study protocol was approved by the Institutional Review Board (IRB) at the University of Pennsylvania which is the single IRB for the SPARC IBD study.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The SPARC IBD data are available upon approved application to the Crohn's & Colitis Foundation IBD Plexus.
Abbreviations
- IBD: Inflammatory bowel diseases
- PWAS: Protein-Wide Association Study
- CD: Crohn’s disease
- UC: Ulcerative colitis