Development of a machine-learning model to assess terminal ileum Endoscopic healing in pediatric Crohn’s disease from Magnetic Resonance Enterography data
============================================================================================================================================================

* Itai Guez
* Gili Focht
* Mary-Louise C. Greer
* Ruth Cytter-Kuint
* Li-tal Pratt
* Denise A. Castro
* Dan Turner
* Anne M. Griffiths
* Moti Freiman

## Abstract

**Background and Aims** Endoscopic healing (EH) is a major treatment goal for Crohn’s disease (CD). However, terminal ileum (TI) intubation failure is common, especially in children. We evaluated the added value of machine-learning models in imputing a TI Simple Endoscopic Score for CD (SES-CD) from Magnetic Resonance Enterography (MRE) data of pediatric CD patients.

**Methods** This is a sub-study of the prospective ImageKids study. We developed machine-learning and baseline linear-regression models to predict TI SES-CD score from the Magnetic Resonance Index of Activity (MaRIA) and the Pediatric Inflammatory Crohn’s MRE Index (PICMI) variables. We assessed the accuracy of the TI SES-CD predictions for intubated patients with a stratified 2-fold validation experimental setup, repeated 50 times. We determined clinical impact by imputing TI SES-CD in patients with ileal intubation failure during ileocolonoscopy.

**Results** A total of 223 children were included (mean age 14.1±2.5 years), of whom 132 had all relevant variables (107 with TI intubation and 25 with TI intubation failure). The combination of a machine-learning model with the PICMI variables achieved the lowest SES-CD prediction error compared to a baseline MaRIA-based linear regression model for the intubated patients (N=107, 11.7 (10.5-12.5) vs. 12.1 (11.4-12.9), p<0.05). The PICMI-based models suggested a higher rate of patients with TI disease among the non-intubated patients compared to a baseline MaRIA-based linear regression model (N=25, up to 25/25 (100%) vs. 23/25 (92%)).

**Conclusions** Machine-learning models with clinically-relevant variables as input are more accurate than linear-regression models in predicting TI SES-CD and EH when using the same MRE-based variables.

Keywords
*   Imaging
*   Endoscopy
*   Pediatrics

## 1. Introduction

Treatment goals for Crohn’s disease (CD) have shifted in the past two decades from symptom control towards more objective goals in accordance with a “treat-to-target” philosophy.1 Specifically, healing of inflammatory lesions,2 defined as endoscopic healing (EH), is currently considered a primary objective treatment goal in clinical trials.3 The common term “mucosal healing” has recently been replaced with “endoscopic healing (EH)” (to differentiate it from “histological healing”).

The Simple Endoscopic Score for CD (SES-CD)4 assessed by ileocolonoscopy is considered the most established endoscopic score for EH evaluation.5–7 However, ileal intubation failure during ileocolonoscopy occurs in 13% of adult CD patients8 and at considerably higher rates of up to 20-25% in pediatric CD patients.3,9 This is especially concerning as CD is mostly prevalent in the small bowel, with more than 50% of CD patients having terminal ileum (TI) involvement.10 Consequently, a patient with ileal intubation failure during ileocolonoscopy can be falsely classified as achieving EH, posing a significant challenge for clinical trials.

Magnetic resonance enterography (MRE) has emerged as an imaging modality for evaluating transmural healing.2,11 The most common MRE index of CD activity in adults is the Magnetic Resonance Index of Activity (MaRIA).12 Recently, the Pediatric Inflammatory Crohn’s MRE Index (PICMI) has been developed and validated specifically to assess inflammatory activity in pediatric CD patients. It does not require rectal enema or gadolinium-based contrast agents, and it uses the sum of all affected segments in the entire small and large bowel, thereby reflecting the more extensive nature of disease in pediatric CD patients.13

Imputation of the TI SES-CD subscore from MRE data has the potential to improve overall EH assessment in non-intubated patients. Weiss et al previously developed a linear regression model for imputing the TI SES-CD subscore from MRE features with data collected as part of the ImageKids study.3,9 However, the model had only a moderate correlation with the TI subscore.9,14 In recent years, non-linear machine-learning models demonstrated their potential in improving inflammatory bowel disease (IBD) assessment in various clinical applications by leveraging complex non-linear correlations between the input variables and outcomes.12,15,16

The goals of this study were to evaluate the added-value of non-linear machine-learning models in imputing TI SES-CD from MRE data of pediatric CD patients compared to linear-regression models and to determine the most predictive set of MRE-based variables for EH prediction. The machine-learning-based models will have the potential to aid physicians in avoiding misdiagnoses, improving assessment in clinical trials, and ultimately guiding appropriate treatment options for CD patients. The presented models to impute TI SES-CD from MRE variables are available to the community through a dedicated website at: [https://tcml-bme.github.io/ML\_SESCD.html](https://tcml-bme.github.io/ML_SESCD.html)

## 2. Methods

### 2.1 Study Design

This is a sub-study of the multicenter ImageKids study ([NCT01881490](http://medrxiv.org/lookup/external-ref?link_type=CLINTRIALGOV&access_num=NCT01881490&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F06%2F2021.08.29.21262424.atom)17) in which a total of 240 children who fulfilled the eligibility criteria of age 6 to 18 years and an established diagnosis of CD with ileocolonoscopy followed by an MRE within 14 days without any change in treatment were enrolled. MRE sequences and their scoring system were standardized across centers.18 Each protocol included a localizer sequence, a motility sequence in the coronal plane, followed by a series of coronal and/or axial rapid T2-weighted, T1-weighted gradient echo sequences and diffusion-weighted imaging (DWI). T1-weighted sequences were performed pre- and post-intravenous gadolinium-based contrast agent (GBCA) injection. An intravenous antispasmodic agent (glucagon or hyoscine butylbromide) was administered following the motility sequence. MRE was performed without enema as it is less feasible in pediatrics. Data were collected from 22 pediatric IBD centers in North America, Europe, Australia and Israel.18

### 2.2 Endoscopic report and gastroenterologist assessment

The SES-CD evaluates four items (ulcer size, proportion of the surface area that is ulcerated, proportion of the surface area affected, narrowing) by scoring each item on an ordinal scale from 0 to 3 in five bowel segments (TI, right colon, transverse colon, left colon, rectum), where higher score indicates a more severe disease defined by higher percentage of mucosal inflammation. The maximum score in a passable segment is 11 while a score of 12 can be obtained only in the last evaluated segment when there is impassable stenosis. The total SES-CD score, which is the sum of the five segments scores, ranges from 0 to 56. No adjustment is made to account for non-visualized segments (due to resection, technical difficulty or stenosis).4 The SES-CD items were scored by a gastroenterologist during ileocolonoscopy for each bowel segment. Gastroenterologist global assessment (GGA) of the degree of inflammation was scored on a 100-mm visual analogue score (VAS) and on a Likert scale from 1 (i.e. Complete deep remission) to 5 (i.e. Severe disease activity).
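The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the scoring software used in ImageKids: the item and segment names are hypothetical labels chosen for readability, and the rule that a score of 12 is only reachable in the last evaluated segment is not enforced here.

```python
from typing import Dict, Optional

# Hypothetical item and segment labels; the source only names them in prose.
ITEMS = ("ulcer_size", "ulcerated_surface", "affected_surface", "narrowing")
SEGMENTS = ("TI", "right_colon", "transverse_colon", "left_colon", "rectum")


def segment_score(items: Dict[str, int]) -> int:
    """Sum the four SES-CD items (each scored 0-3) for one bowel segment."""
    for name in ITEMS:
        value = items[name]
        if not 0 <= value <= 3:
            raise ValueError(f"{name} must be between 0 and 3, got {value}")
    return sum(items[name] for name in ITEMS)


def total_ses_cd(segments: Dict[str, Optional[Dict[str, int]]]) -> int:
    """Sum the segment scores.

    A non-visualized segment is passed as None and contributes 0, mirroring
    the rule that no adjustment is made for non-visualized segments.
    """
    return sum(segment_score(s) for s in segments.values() if s is not None)
```

With this sketch, a patient whose colon segments all score 0 and whose TI could not be intubated (`"TI": None`) receives a total of 0, which is exactly the situation in which active ileal disease can be missed.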

### 2.3 MRE and radiologist assessment

Each MRE was read three times by independent radiologists highly experienced in pediatric IBD (an on-site radiologist and two central radiologists using an explicit case report form). The radiologists were blinded to the clinical and endoscopic data. Radiologists completed, among others, the Radiologist global assessment (RGA) as well as the MaRIA variables,19 the PICMI variables,13 and the length of the diseased segments. The TI was defined in the ImageKids study as the first 10 cm segment of the ileum measured from the ileocecal valve. The PICMI variables, the MaRIA variables and the RGA score were read by two independent radiologists and, in case of a large discrepancy between readings, a third reading was added. Table supp1 in the supplementary materials summarizes the variables for each index.

### 2.4 Development of ML algorithms

We developed two non-linear machine-learning-based models using the random-forest algorithm20 to predict EH from MRE each using a different set of variables. The first model used the MaRIA variables, and the second used the PICMI variables to predict EH. In addition, we constructed two multiple-linear-regression models each using the same two sets of variables as input features, and a linear-regression model based solely on the original MaRIA index to serve as baseline methods.

For development and validation purposes we used only patients with an available TI SES-CD score who had all MRE variables of both indices. The segment length was not used as a variable for the development as it is not part of the MaRIA and PICMI indices.

Table supp4 summarizes the full parameter list used for configuration and optimization of the machine learning algorithms.

### 2.5 Statistics

We used a total of 100 folds (stratified 2-fold validation, repeated 50 times) so that each algorithm version was trained and validated on random subsets of patients. We assessed the added value of the machine-learning-based models in imputing TI SES-CD values by comparing the accuracy of the TI SES-CD values predicted by the machine-learning-based models based on the MaRIA and PICMI variables with the predictions made by their counterpart multiple-linear-regression models and by the linear-regression model with the MaRIA index. We used the mean-squared-error (MSE) relative to the reference endoscopic TI SES-CD as the measure of accuracy. We also determined which set of variables (PICMI or MaRIA) provides the most accurate predictions in comparison to the reference endoscopic TI SES-CD measurements.
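A repeated stratified 2-fold split of this kind can be sketched as follows. The code is an illustration under stated assumptions rather than the study code: the text does not say which variable was used for stratification, so the `strata` argument (for example, reference EH status) is an assumption, and the function names are hypothetical.

```python
import random
from collections import defaultdict
from typing import Iterator, List, Sequence, Tuple


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between reference and predicted scores."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def repeated_stratified_2fold(
    strata: Sequence[int], repeats: int, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, valid_idx) pairs, two per repeat.

    Within each stratum the patients are shuffled and dealt alternately
    into two halves, so both halves keep a similar stratum mix. Each half
    then serves once for training and once for validation.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for idx, stratum in enumerate(strata):
        by_stratum[stratum].append(idx)
    for _ in range(repeats):
        halves: Tuple[List[int], List[int]] = ([], [])
        for members in by_stratum.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            for pos, idx in enumerate(shuffled):
                halves[pos % 2].append(idx)
        yield sorted(halves[0]), sorted(halves[1])
        yield sorted(halves[1]), sorted(halves[0])
```

Calling `repeated_stratified_2fold(strata, repeats=50)` yields 100 train/validation pairs, matching the 100 folds described above; a model would be fit on each training half and its validation MSE collected with `mse`.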

We used the Wilcoxon non-parametric test21 with Bonferroni correction to control the family-wise error rate (FWER)22 to determine whether the median validation MSE differed between two given models. In addition, we used receiver operating characteristic (ROC) curves to evaluate the capacity of the different models’ predictions of TI SES-CD to distinguish between patients with and without EH in comparison with reference EH assessments based on endoscopic TI SES-CD. We defined reference EH as SES-CD<3.23 The predicted SES-CD was normalized to a 0-1 range by dividing it by the maximum possible SES-CD value of 12. We determined whether the difference in area under the ROC curve (AUC) between models was statistically significant with DeLong’s test.24 The models were written in Python [version 3.6.4] using the open-source SciPy [version 1.5.4] library and the open-source Statsmodels [version 0.11.0] library. Models ran on an Intel i3 CPU.
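The EH threshold, the normalization step, and the Bonferroni adjustment can be illustrated with a short standard-library sketch. It is not the authors' implementation: the function names are hypothetical, treating patients without EH as the positive class is an assumption, and DeLong’s test is not reproduced here. The AUC is computed in its rank-based form, i.e. the probability that a randomly chosen patient without EH receives a higher predicted score than a randomly chosen patient with EH.

```python
from typing import List, Sequence

EH_THRESHOLD = 3     # reference EH is defined in the text as SES-CD < 3
MAX_TI_SES_CD = 12   # maximum segment score used for normalization in the text


def normalize(predicted: Sequence[float]) -> List[float]:
    """Scale predicted TI SES-CD values to the 0-1 range."""
    return [p / MAX_TI_SES_CD for p in predicted]


def auc_active_disease(reference: Sequence[float], predicted: Sequence[float]) -> float:
    """Rank-based AUC for separating patients without EH from patients with EH.

    Assumption: patients without EH (reference SES-CD >= 3) are the positive class.
    Ties between a positive and a negative score count as half a win.
    """
    scores = normalize(predicted)
    pos = [s for r, s in zip(reference, scores) if r >= EH_THRESHOLD]
    neg = [s for r, s in zip(reference, scores) if r < EH_THRESHOLD]
    if not pos or not neg:
        raise ValueError("both EH and non-EH patients are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Bonferroni-adjusted p-values: each p is multiplied by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

Because dividing by 12 is a monotone transformation, the normalization changes the scale of the ROC thresholds but not the AUC itself.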

Finally, we assessed the clinical added-value of the models in imputing SES-CD for non-intubated patients. We trained the different models on all available data from patients with endoscopic TI SES-CD and imputed SES-CD from the MRE data for the non-intubated patients.
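The imputation step can be sketched for the simplest baseline, a single-variable linear regression. This is an illustration only: the least-squares fit is written out by hand instead of using Statsmodels, clipping the imputed value to the 0-12 range is an added assumption, and counting a patient as having TI disease when the imputed score is at least 3 assumes the same threshold as the reference EH definition.

```python
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def impute_ti_ses_cd(x_new: Sequence[float], slope: float, intercept: float) -> List[float]:
    """Predict TI SES-CD for non-intubated patients.

    Assumption: predictions are clipped to the valid 0-12 segment range.
    """
    return [min(12.0, max(0.0, slope * xi + intercept)) for xi in x_new]


def count_with_ti_disease(imputed: Sequence[float], threshold: float = 3.0) -> int:
    """Count patients whose imputed score is at or above the assumed threshold."""
    return sum(1 for score in imputed if score >= threshold)
```

In use, `fit_line` would be called on the MRE index and endoscopic TI SES-CD of the intubated patients, and the resulting coefficients applied to the MRE index of the non-intubated patients.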

## 3. Results

### 3.1 Study population

Of the 240 children included in the ImageKids study, 17 were excluded (5 due to incomplete colonoscopy and 12 due to prior bowel surgery). The ileal intubation failure rate was 15% (n=34) among the 223 included children and 18% (n=43) among the total cohort. There was no statistically significant difference (Mann-Whitney non-parametric test) in any variable between the intubated and non-intubated groups except for C-Reactive Protein (CRP). GGA-VAS and the proportion of patients with disease limited to the TI were higher in the non-intubated group, but the differences were not statistically significant (Table supp2 in supplementary materials).

### 3.2 Patient-level indices: MRE report

A total of 132 children who had all relevant PICMI and MaRIA variables were included in this sub-study. Of these, 107 had a TI SES-CD score (intubated group), while 25 did not have TI SES-CD scores due to ileal intubation failure during ileocolonoscopy (non-intubated group). Figure 1 summarizes the patient selection process for the study. Table supp3 summarizes the demographics and variable values for the patients included in this sub-study.

[Figure 1](http://medrxiv.org/content/early/2021/09/06/2021.08.29.21262424/F1): Patient selection from the entire cohort. MRE: Magnetic Resonance Enterography.

There were no differences in the population basic characteristics between the intubated and non-intubated groups except for the PICMI score (21.5±10.43 vs. 26.52±7.63, p=0.015). Although not statistically significant, GGA-VAS and CRP had a higher mean score in the group with no ileal intubation (GGA-VAS: 37 (20-60) vs. 40 (22-70), p=0.22, and CRP: 7.3 (1.8-25.8) vs. 12.65 (4.98-24.68), p=0.14).

### 3.3 SES-CD prediction from MRE on intubated patients

The machine-learning-based models reduced the MSE of the SES-CD predictions compared to their multiple-linear-regression counterparts. The PICMI-based machine-learning model (RF-PICMI) achieved MSE of 11.8±1.5 while its multiple-linear-regression counterpart yielded a higher MSE of 12.1±1.5. Similarly, the MaRIA-based machine-learning model (RF-MaRIA) achieved an MSE of 12.3±1.6 while its multiple-linear-regression counterpart yielded a higher MSE of 12.5±1.6 (table 1).

[Table 1](http://medrxiv.org/content/early/2021/09/06/2021.08.29.21262424/T1): Validation mean-squared-error (MSE) distribution statistics over the folds per model. MSE: mean-squared-error, LR: linear regression, RF: random forest, MLR: multiple linear regression.

[Table 2](http://medrxiv.org/content/early/2021/09/06/2021.08.29.21262424/T2): Imputed SES-CD scores for the non-intubated patients. MSE: mean-squared-error, LR: linear regression, RF: random forest, MLR: multiple linear regression.

The PICMI-based models reduced the MSE relative to the MaRIA-based models for both the machine-learning-based models (RF-PICMI 11.7 (10.5-12.5) vs. RF-MaRIA 12.1 (11.1-13.3), p<1e-5) and the multiple-linear-regression models (MLR-PICMI 11.7 (11.1-13) vs. MLR-MaRIA 12.4 (11.6-13), p<0.05). Finally, the PICMI-based machine-learning model (RF-PICMI) achieved a lower MSE than the baseline linear-regression model with the MaRIA score (RF-PICMI 11.7 (10.5-12.5) vs. LR-MaRIA 12.1 (11.4-12.9), p<0.05) (Table 1).

Fig. 2 depicts the MSE distribution of the validation group over the different folds.

[Figure 2](http://medrxiv.org/content/early/2021/09/06/2021.08.29.21262424/F2): Mean-squared-error (MSE) distribution over the validation folds.

Figure 3 presents the averaged ROC curves of the different models’ SES-CD predictions in differentiating between patients with and without EH according to their reference ileocolonoscopy-based SES-CD score. The machine-learning-based model with the PICMI variables (RF-PICMI) achieved a performance similar to that of the linear-regression model with the MaRIA index (AUC 0.77±0.05 vs. 0.77±0.04, p=0.22, DeLong’s test). Unlike RF-PICMI, the MaRIA-based linear-regression model depends on a post-gadolinium T1-weighted sequence.

[Figure 3](http://medrxiv.org/content/early/2021/09/06/2021.08.29.21262424/F3): Average ROC curves over the different folds for all models.

[Figure 4](http://medrxiv.org/content/early/2021/09/06/2021.08.29.21262424/F4): Imputed SES-CD distribution on the non-intubated patients.

### 3.4 SES-CD imputation from MRE on the non-intubated patients

We derived final models on the entire development set of 107 patients and imputed the SES-CD scores for the 25 patients with ileal intubation failure who had all MRE variables. The PICMI-based models predicted a higher percentage of patients with ileal disease compared to the baseline linear-regression model with the MaRIA index (up to 25 of 25 (100%) vs. 23 of 25 (92%)). Without imputing TI SES-CD, such non-intubated patients would have been falsely classified as patients without terminal-ileum disease based on colonoscopy only.

## 4. Discussion

Our results demonstrate the potential of machine-learning-based methods in improving MRE-based indices for EH assessment in pediatric CD patients. The combination of the most informative MRE variables (PICMI) and the machine-learning-based model achieved the lowest SES-CD prediction error compared to baseline linear and multiple linear regression models. Further, imputing SES-CD from MRE data for patients with TI intubation failure during ileocolonoscopy using these models suggested a higher rate of patients with TI disease who otherwise were classified as patients without TI disease. Such information might impact interpretation of results in clinical trials.

Recently, Weiss et al imputed TI SES-CD from MRE data on pediatric CD patients with a simple linear regression model with the MRE-based MaRIA index as input variable with moderate correlation to the ileo-colonoscopy-based SES-CD score.14,23

Recently, non-linear machine-learning models such as the Random-Forest model20 demonstrated their ability to capture complex, non-linear correlations between the input variables and the expected outcomes in multiple domains including in IBD.12,15,16,26–28 Specifically, Dhaliwal et al. used a machine-learning model to distinguish pediatric colonic IBD subtypes based on clinical, endoscopic, radiologic, and histologic data29 and Waljee et al. demonstrated the added value of a machine-learning model to predict disease course from demographic data and clinical records compared to a linear-regression model.30

In this study we hypothesized that the moderate correlation to the TI SES-CD achieved by MRE indices may be attributed, at least in part, to the utilization of classical linear models to characterize the correlation between the MRE-based variables and EH which limited their ability to determine the potentially complex, non-linear correlation. Our results suggest that non-linear machine-learning models can improve the characterization of the complex non-linear correlation between the MRE-based variables and TI SES-CD compared to linear models.

The better performance of the machine-learning-based models compared to their multiple-linear-regression counterpart was expected as the machine-learning-based models have a better capability to characterize complex, non-linear, correlations between input variables and outcomes compared to linear models. In addition, machine-learning-based models are known to have an improved ability to successfully account for noisy data entries that commonly present in medical applications.28 It is important to note that machine-learning-based models lack the ability to extrapolate predictions beyond the data range used for their development compared to their linear counterparts.31 However, the lack of extrapolation ability is not a limitation specifically for imputing TI SES-CD since the dataset had rich enough values to cover the entire breadth of scores.
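The extrapolation point can be made concrete with a toy example. The sketch below fits a one-split regression tree (a "stump") on hypothetical data; it is not one of the study models, but a random forest averages many such piecewise-constant trees and therefore shares the same property: its predictions never leave the range of target values seen during training, whereas a fitted line keeps growing outside that range.

```python
from typing import Callable, Sequence


def fit_stump(x: Sequence[float], y: Sequence[float]) -> Callable[[float], float]:
    """Fit a one-split regression tree by minimizing the squared error.

    The returned predictor outputs the mean target of the left or right
    leaf, so every prediction lies between min(y) and max(y).
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((t - left_mean) ** 2 for t in left) + sum(
            (t - right_mean) ** 2 for t in right
        )
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct x values")
    _, threshold, left_mean, right_mean = best
    return lambda value: left_mean if value <= threshold else right_mean
```

For training points `x = [1, 2, 3, 4]` and `y = [2, 4, 6, 8]`, the stump predicts the same leaf mean for `x = 100` as for `x = 4`, while the line `y = 2x` would predict 200.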

The better performance of the linear-regression model with the MaRIA index compared to its more complex counterpart, the multiple-linear-regression with the MaRIA variables might be explained by the fact that the original MaRIA index coefficients were derived in a more robust way compared to our stratified k-fold experiments.

Finally, the importance of developing pediatric specific MRE-indices was demonstrated by the improved performance of the PICMI-based models compared to the MaRIA-based models. Though previous research questioned the need to develop a specific MRE index of activity for the pediatric population,2 our results indicate its importance.

The machine-learning-based model with PICMI variables used a DWI sequence as a replacement for the post-gadolinium T1-weighted sequence. This is an advantage given some evidence that the post-gadolinium sequence can be omitted on MRE in children to reduce the use of GBCA, although this is not yet standard practice.25 The better performance of the PICMI-based models compared to that of the MaRIA-based models suggests that the diagnostic accuracy of DWI, which is included in PICMI and not in MaRIA, is not overestimated as was previously hypothesized.32

There are several limitations to our study. First, the machine-learning models in this study were developed with PICMI and MaRIA variables obtained by expert central reading. Further, the PICMI and MaRIA variables were obtained from MRE data acquired with an optimized and standardized acquisition protocol. Models’ predictions using PICMI and MaRIA variables obtained by radiologists’ readings of MRE data that were not necessarily standardized according to the ImageKids MRE protocol should be interpreted with caution. Second, the development of machine-learning models is assumed to require more data compared to the development of linear models due to their complexity. Therefore, the presented models’ performance might underestimate the added value of the machine-learning models. Third, CD is a transmural disease rather than a mucosal disease. It is not clear yet whether mucosal or transmural healing should be the primary treatment goal.33 While MRE has the ability to detect CD lesions beyond the mucosa, its ability to assess low-severity mucosal lesions is much more limited.2 Therefore, combined assessment with MRE and ileocolonoscopy is ideally required to fully assess the extent of the disease.

Last, the models developed in this research use a set of MRE variables intended to assess transmural healing and not EH. Mesenteric variables, for example, are indicative of transmural healing but not EH, as they measure inflammation in tissues far from the mucosa. The length of an inflamed segment, which has been shown to be an important indicator of the total burden of the disease in a given bowel segment, is included in the magnetic resonance enterography global score (MEGS) variables but not in the MaRIA and PICMI variables.34 A future study is required to optimally select MRE variables for the sole purpose of mucosal assessment. Tailored mucosal variable selection may further improve SES-CD prediction accuracy.

In summary, our study demonstrates the important role of the combination of non-linear machine-learning models and the selection of specific MRE-based variables to improve our ability to impute TI SES-CD score from MRE data. Such machine-learning methods have the potential to improve MRE-based IBD assessment in additional applications.

## Data Availability

The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to restrictions, e.g., they contain information that could compromise the privacy of research participants.

## 7. Supplementary material

[Table supp1](http://medrxiv.org/content/early/2021/09/06/2021.08.29.21262424/T3): Variables of the MRE-based indices.

[Table supp2](http://medrxiv.org/content/early/2021/09/06/2021.08.29.21262424/T4): Basic characteristics of the entire cohort. Mean ± standard deviation, median (interquartile range), and frequency (%) are displayed as appropriate. GGA = gastroenterologist global assessment; VAS = visual analogue scale; TI SES-CD = terminal ileum simple endoscopic score of Crohn’s disease; * according to Paris classification; ** clinically and also endoscopically (as much as is known).

[Table supp3](http://medrxiv.org/content/early/2021/09/06/2021.08.29.21262424/T5): Basic characteristics of the group with all relevant MRE variables and ileocolonoscopy. Mean ± standard deviation, median (interquartile range), and frequency (%) are displayed as appropriate. GGA = gastroenterologist global assessment; VAS = visual analogue scale; TI SES-CD = terminal ileum simple endoscopic score of Crohn’s disease; * according to Montreal classification; ** clinically and also endoscopically (as much as is known).

[Table supp4](http://medrxiv.org/content/early/2021/09/06/2021.08.29.21262424/T6): Random forest algorithm parameters.

## 5. Acknowledgements

The ImageKids study was supported by an educational grant from AbbVie who were not involved in any part of the study design, conduct, analysis or writing.

## Footnotes

*   Conflicts of interest: MCG – AbbVie research grant and honoraria (disclosure, no conflict)

*   Funding: None.

*   Ethics statement: The ImageKids study was approved by the Helsinki committee of the Israeli Ministry of Health. Since all identifying elements related to patient privacy have been removed from the database used in the research presented in this manuscript, no additional ethics committee approval was required.

*   * Taub fellow (supported by the Taub Family Foundation, Technion’s program for leaders in Science and Technology)

## Abbreviations

EH
:   Endoscopic healing
CD
:   Crohn’s disease
TI
:   terminal ileum
MRE
:   Magnetic Resonance Enterography
SES-CD
:   Simple Endoscopic Score for Crohn’s Disease
MSE
:   mean-squared-error
AUC
:   area under the curve
MaRIA
:   Magnetic Resonance Index of Activity
PICMI
:   Pediatric Inflammatory Crohn’s MRE Index
IBD
:   inflammatory bowel disease
DWI
:   diffusion-weighted imaging
GGA
:   gastroenterologist global assessment
VAS
:   visual analogue score
RCE
:   relative contrast enhancement
CRP
:   C-Reactive Protein
DT
:   decision tree
FWER
:   family-wise error rate
ROC
:   receiver operating characteristic

*   Received August 29, 2021.
*   Revision received August 29, 2021.
*   Accepted September 6, 2021.


*   © 2021, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/)

## 6. References

1.  1. L. Peyrin-Biroulet  W. Sandborn Bes et al. Selecting Therapeutic Targets in Inflammatory Bowel Disease (STRIDE): determining therapeutic goals for treat-to-target. Am J Gastroenterol. 2015;110(1324–1338).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/ajg.2015.233&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26303131&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F06%2F2021.08.29.21262424.atom) 

2.  2. J. Rimola Jp. Is the Objective of Treatment for Crohn’s Disease Mucosal or Transmural Healing? Clin Gastroenterol Hepatol. 2018;16(7):1037–1039.
    
    
3.  3.Turner D, Griffiths AM, Wilson D, et al. Designing clinical trials in paediatric inflammatory bowel diseases: a PIBDnet commentary. Gut. 2020;69(1):32–41.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NjoiZ3V0am5sIjtzOjU6InJlc2lkIjtzOjc6IjY5LzEvMzIiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMS8wOS8wNi8yMDIxLjA4LjI5LjIxMjYyNDI0LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 

4.  4.Daperno M D’Haens G VAG et al. Development and validation of a new, simplified endoscopic activity score for Crohn’s disease: the SES-CD. Gastrointest Endosc. 2004;60(505–512).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/S0016-5107(04)01878-4&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=15472670&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F06%2F2021.08.29.21262424.atom) 

5.  5.et al. DZS. Effect of Standardised Scoring Conventions on Inter-rater Reliability in the Endoscopic Evaluation of Crohn’s Disease. J Crohn’s Colitis. 2016;1006(10–14).
    
    
6.  6.Khanna R Bouguen G FBG et al. A systematic review of measurement of endoscopic disease activity and mucosal healing in Crohn’s disease: recommendations for clinical trial design. Inflamm Bowel Dis. 2014;20:1850–1861.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/MIB.0000000000000131&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25029615&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F06%2F2021.08.29.21262424.atom) 

7.  7.Khanna R  Zou G Dg et al. Reliability among central readers in the evaluation of endoscopic findings from patients with Crohn’s disease. Gut. 65(1119–1125, year={2016},).
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NjoiZ3V0am5sIjtzOjU6InJlc2lkIjtzOjk6IjY1LzcvMTExOSI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIxLzA5LzA2LzIwMjEuMDguMjkuMjEyNjI0MjQuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

8.  8.Sh J, Kj L, Yb K, Hc K, Sj S, Jy C. Diagnostic value of terminal ileum intubation during colonoscopy. J Gastroenterol Hepatol. 2008;23(1):51-55. doi:10.1111/J.1440-1746.2007.05151.X
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1111/J.1440-1746.2007.05151.X&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=18171342&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F06%2F2021.08.29.21262424.atom) 

9.  9.Weiss B, Turner D, Griffiths A, et al. Simple Endoscopic Score of Crohn Disease and Magnetic Resonance Enterography in Children: Report from ImageKids Study. J Pediatr Gastroenterol Nutr. 2019;69(4):461-465. doi:10.1097/MPG.0000000000002404
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/MPG.0000000000002404&link_type=DOI) 

10. 10.Gajendran M Loganathan P CAPHJG. A comprehensive review and update on Crohn’s disease. Dis Mon. 2018;64(2)(20–57).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.disamonth.2017.07.001&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=28826742&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F06%2F2021.08.29.21262424.atom) 

11. 11.Walsh AJ Br V, SP. T. Current best practice for disease activity assessment in IBD. Nat Rev Gastroenterol Hepatol. 2016;13(1):567–579.
    
    
12. 12.Olivera P, Danese S, Jay N, Natoli G, Peyrin-Biroulet L. Big data in IBD: a look into the future. Nat Rev Gastroenterol & Hepatol. 2019;16(5):312–321.
    
    
13. 13.Turner D Gavish M FG et al. Development of the Pediatric Inflammatory Crohn’s MRE Index (PICMI)-results from the ImageKids Study. J Crohn’s Colitis. 2018;12(1):158–159.
    
    
14. 14.Rozendorn N. AMMERAKUKE. A review of magnetic resonance enterography-based indices for quantification of Crohn’s disease inflammation. Ther Adv Gastroenterol. 2018;11(1):1–21.
    
    
15. Le Berre C, Sandborn WJ, Aridhi S, et al. Application of artificial intelligence to gastroenterology and hepatology. Gastroenterology. 2020;158(1):76–94.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1053/j.gastro.2019.08.058&link_type=DOI) 
    

16. Seyed Tabib NS, Madgwick M, Sudhakar P, Verstockt B, Korcsmaros T, Vermeire S. Big data in IBD: big progress for clinical practice. Gut. 2020;69(8):1520–1532.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NjoiZ3V0am5sIjtzOjU6InJlc2lkIjtzOjk6IjY5LzgvMTUyMCI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIxLzA5LzA2LzIwMjEuMDguMjkuMjEyNjI0MjQuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

17. [https://clinicaltrials.gov/ct2/show/study/NCT01881490](https://clinicaltrials.gov/ct2/show/study/NCT01881490).
    
    
18. Weinstein-Nakar I, et al.; ImageKids study group. Associations Among Mucosal and Transmural Healing and Fecal Level of Calprotectin in Children With Crohn’s Disease. Clin Gastroenterol Hepatol. 2018;16(7):1089–1097.
    
    
19. Rimola J, Ordás I, Rodríguez S, et al. Magnetic resonance imaging for evaluation of Crohn’s disease. Inflamm Bowel Dis. 2011;17:759–68.
    
    
20. Breiman L. Random Forests. Mach Learn. 2001;45:5–32.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1023/A:1010933404324&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=28752533&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F06%2F2021.08.29.21262424.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000170489900001&link_type=ISI) 

21. Wilcoxon F. Individual Comparisons by Ranking Methods. Biometrics Bull. 1945;1(6):80–83.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2307/3001968&link_type=DOI) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1945UG23300002&link_type=ISI) 

22. Hochberg Y. A Sharper Bonferroni Procedure for Multiple Tests of Significance. Biometrika. 1988;75(4):800–802.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/biomet/75.4.800&link_type=DOI) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1988R296800023&link_type=ISI) 

23. Weiss B, Turner D, Griffiths A, et al. Simple Endoscopic Score of Crohn Disease and Magnetic Resonance Enterography in Children: Report from ImageKids Study. J Pediatr Gastroenterol Nutr. 2019;69(4):461–465.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/MPG.0000000000002404&link_type=DOI) 

24. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837–845.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2307/2531595&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=3203132&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F09%2F06%2F2021.08.29.21262424.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1988Q069100016&link_type=ISI) 

25. Moy MP, Sauk J, Gee MS. The role of MR enterography in assessing Crohn’s disease activity and treatment response. Gastroenterol Res Pract. 2016;2016. doi:10.1155/2016/8168695
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1155/2016/8168695&link_type=DOI) 

26. Klang E, Grinman A, Soffer S, et al. Automated Detection of Crohn’s Disease Intestinal Strictures on Capsule Endoscopy Images Using Deep Neural Networks. J Crohn’s Colitis. 2020;15(5):749–756.
    
    
27. Klang E, Barash Y, Margalit RY, et al. Deep learning algorithms for automated detection of Crohn’s disease ulcers by video capsule endoscopy. Gastrointest Endosc. 2020;91:606–613.
    
    
28. Qi Y. Random Forest for Bioinformatics. Ensemble Mach Learn. Published online 2012.
    
    
29. Dhaliwal J, Erdman L, Drysdale E, et al. Accurate Classification of Pediatric Colonic Inflammatory Bowel Disease Subtype Using a Random Forest Machine Learning Classifier. J Pediatr Gastroenterol Nutr. 2021;72(2):262–269.
    
    
30. Waljee AK, Lipson R, Wiitala WL, et al. Predicting Hospitalization and Outpatient Corticosteroid Use in Inflammatory Bowel Disease Patients Using Machine Learning. Inflamm Bowel Dis. 2017;24(1):45–53.
    
    
31. Hengl T, Nussbaum M, Wright M, Heuvelink G, Graeler B. Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables. PeerJ. 2018;6:e5518.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.7717/peerj.5518&link_type=DOI) 
    

32. Rimola J, et al. Development and Validation of a Simplified Magnetic Resonance Index of Activity for Crohn’s Disease. Gastroenterology. 2019;157(2):432–439.e1.
    
    
33. Serban ED. Treat-to-target in Crohn’s disease: Will transmural healing become a therapeutic endpoint? World J Clin Cases. 2018;6(12):501–513.
    
    
34. Zheng X, Li M, Wu Y, et al. Assessment of pediatric Crohn’s disease activity: validation of the magnetic resonance enterography global score (MEGS) against endoscopic activity score (SES-CD). Abdom Radiol. 2020;45(11):3653–3661. doi:10.1007/s00261-020-02590-8
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/s00261-020-02590-8&link_type=DOI)