COVID-19 Prognostic Modeling Using CT Radiomic Features and Machine Learning Algorithms: Analysis of a Multi-Institutional Dataset of 14,339 Patients

Isaac Shiri; Yazdan Salimi; Masoumeh Pakbin; Ghasem Hajianfar; Atlas Haddadi Avval; Amirhossein Sanaat; Shayan Mostafaei; Azadeh Akhavanallaf; Abdollah Saberi; Zahra Mansouri; Dariush Askari; Mohammadreza Ghasemian; Ehsan Sharifipour; Saleh Sandoughdaran; Ahmad Sohrabi; Elham Sadati; Somayeh Livani; Pooya Iranpour; Shahriar Kolahi; Maziar Khateri; Salar Bijari; Mohammad Reza Atashzar; Sajad P. Shayesteh; Bardia Khosravi; Mohammad Reza Babaei; Elnaz Jenabi; Mohammad Hasanian; Alireza Shahhamzeh; Seyed Yaser Foroghi Gholami; Abolfazl Mozafari; Arash Teimouri; Fatemeh Movaseghi; Azin Ahmari; Neda Goharpey; Rama Bozorgmehr; Hesamaddin Shirzad-Aski; Rozbeh Mortazavi; Jalal Karimi; Nazanin Mortazavi; Sima Besharat; Mandana Afsharpad; Hamid Abdollahi; Parham Geramifar; Amir Reza Radmard; Hossein Arabi; Kiara Rezaei-Kalantari; Mehrdad Oveisi; Arman Rahmim; Habib Zaidi

doi:10.1101/2021.12.07.21267364

Abstract

Objective In this large multi-institutional study, we aimed to analyze the prognostic power of computed tomography (CT)-based radiomics models in COVID-19 patients.

Methods CT images of 14,339 COVID-19 patients with overall survival outcome were collected from 19 medical centers. Whole lung segmentations were performed automatically using a previously validated deep learning-based model, and regions of interest were further evaluated and modified by a human observer. All images were resampled to an isotropic voxel size, intensities were discretized into 64 bins, and 105 radiomics features, including shape, intensity, and texture features, were extracted from the lung mask. Radiomics features were normalized using Z-score normalization. Highly correlated features (Pearson R²>0.99) were eliminated. We applied the Synthetic Minority Oversampling Technique (SMOTE) algorithm to the training set only, for the different models, to overcome class imbalance. We used 4 feature selection algorithms, namely Analysis of Variance (ANOVA), Kruskal-Wallis (KW), Recursive Feature Elimination (RFE), and Relief. For the classification task, we used seven classifiers, including Logistic Regression (LR), Least Absolute Shrinkage and Selection Operator (LASSO), Linear Discriminant Analysis (LDA), Random Forest (RF), AdaBoost (AB), Naïve Bayes (NB), and Multilayer Perceptron (MLP). The models were built and evaluated using training and testing sets, respectively. Specifically, we evaluated the models using 10 different splitting and cross-validation strategies, including different types of test datasets (e.g. non-harmonized vs. ComBat-harmonized datasets). The sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve (AUC) were reported for model evaluation.

Results In the test dataset (4301) consisting of CT and/or RT-PCR positive cases, AUC, sensitivity, and specificity of 0.83±0.01 (CI95%: 0.81-0.85), 0.81, and 0.72, respectively, were obtained by the ANOVA feature selector + RF classifier. In the RT-PCR-only positive test sets (3644), similar results were achieved, and there was no statistically significant difference. In the ComBat-harmonized dataset, the Relief feature selector + RF classifier resulted in the highest AUC, reaching 0.83±0.01 (CI95%: 0.81-0.85), with sensitivity and specificity of 0.77 and 0.74, respectively. At the same time, ComBat harmonization did not yield a statistically significant improvement relative to the non-harmonized dataset. In leave-one-center-out, the combination of the ANOVA feature selector and LR classifier resulted in the highest AUC (0.80±0.084), with sensitivity and specificity of 0.77 ± 0.11 and 0.76 ± 0.075, respectively.

Conclusion Lung CT radiomics features can be used towards robust prognostic modeling of COVID-19 in large heterogeneous datasets gathered from multiple centers. As such, a CT radiomics-based model has significant potential for use in prospective clinical settings towards improved management of COVID-19 patients.

INTRODUCTION

The novel coronavirus disease which emerged in 2019 (COVID-19) is now a major cause of death worldwide ¹. This highly contagious virus can cause a spectrum of pulmonary, hematological, neurological, and systemic complications, making it a highly lethal pathogen ². As of August 23, 2021, there have been >200 million globally confirmed cases of COVID-19, including >4 million deaths, and >4 billion vaccinations reported to the World Health Organization (WHO) [https://covid19.who.int/]. There remains an urgent need for addressing issues such as diagnosis, prognosis, and treatment options ³.

Diagnostic tools for COVID-19, such as reverse transcription polymerase chain reaction (RT-PCR), help to distinguish between negative and positive cases ⁴. Prognostic tools, on the other hand, provide clinicians with insights to optimize treatment strategies, manage the hospitalization of patients both in the wards and intensive care units (ICU), and better handle patient follow-up plans ⁵. Different studies have evaluated clinical and/or non-clinical features for determining the diagnosis and prognosis of patients with COVID-19. Yan et al. ⁶ used only clinical features to classify patients into different categories, ranging from mild to critical conditions. Zhou et al. ⁷ also aimed at establishing a prognostic model for outcome prediction of patients with COVID-19, utilizing their clinical data.

Computed Tomography (CT) plays a pivotal role in the management of a wide variety of diseases as a fast and non-invasive imaging modality. In the case of COVID-19, CT is used for both diagnostic (e.g. in case of limited access to RT-PCR) and prognostic purposes ⁸. The clinical value of CT imaging relies mainly on the early detection of lung infections and high accuracy in quantifying the disease progression and severity ⁹. Francone et al. ¹⁰ assessed the correlation of CT scores with COVID-19 pneumonia severity and outcome. In another study, Zhao et al. ¹¹ specified pulmonary involvement severity by measuring the extent of pneumonia and consolidation that appear on CT images. Li et al. ¹² concluded that high scores of CT images are associated with severe COVID-19 pneumonia.

In spite of previously conducted research, there is still a need for studies with more accurate and comprehensive analyses ¹³. In conventional analyses, CT features are visually and subjectively defined, while machine learning (ML) and/or deep learning (DL) models have the potential to provide more comprehensive and objective assessment of images. Towards modeling of outcomes in COVID-19 patients, several ML and DL algorithms have been utilized to assess severity and to predict outcome of patients using CT imaging ^14–17. To evaluate the sensitivity and specificity of CT for COVID-19 diagnostic purposes, Harmon et al. ¹⁴ achieved a sensitivity of 84% and specificity of 93% in an independent test set of 1337 patients applying DL on CT images. In another machine learning study, Mei et al. ¹⁵ reported a sensitivity of 84.3% and specificity of 82.8% based on combined CT and clinical data. Cai et al. ¹⁶ utilized a random forest model to assess the severity of COVID-19 disease in 99 patients and their need for a longer hospital or ICU stay. Another study by Lessman et al. ¹⁷ reported the model performance of three DL models for severity assessment in COVID-19. In another multi-center study, Meng et al. ¹⁸ differentiated patients with high-risk of mortality versus low-risk ones using a convolutional neural network named De-COVID-Net. Ning et al. ¹⁹ used their pre-trained DL model on a dataset consisting of 351 patients that was capable of distinguishing between non-coronavirus pneumonia, mild coronavirus pneumonia, and severe forms of COVID-19 disease.

Various studies also reported remarkable prediction accuracies utilizing radiomics approaches using CT and chest x-ray imaging modalities ^20–23. Medical images could be converted into high-dimensional data by means of radiomics, wherein radiomics features are selected from the images and combined using machine learning algorithms to arrive at radiomics signatures as biomarkers of disease^24–34. In addition to wide usage in several oncologic ^35–37 as well as non-oncologic diseases ^{27, 38}, radiomics studies have indicated that imaging features extracted from CT or chest X-ray images could be used as parameters for outcome prediction of patients with COVID-19 pneumonia. Radiomics analyses have been applied to different aspects of COVID-19, including diagnosis, severity scoring, prognosis, hospital/ICU stay prediction, and survival analysis ^20–23. In a retrospective study, Fu et al. ³⁹ constructed a predictive model based on CT radiomics, clinical and laboratory features. This signature could classify COVID-19 patients into stable and unstable (i.e. progressive phenotype). Homayounieh et al. ⁴⁰ aimed to predict the severity of pneumonia in patients with COVID-19 using a radiomics model that outperformed models consisting of clinical-only features. Another study by Li et al. ⁴¹ analyzed a radiomics/DL model that distinguished severe from critical COVID-19 pneumonia patients. Cai et al. ⁴² developed a model by means of combining CT radiomics features and clinical data to predict RT-PCR negativity during admission. Yue et al. ⁴³ conducted a multicentric radiomics study on 52 patients to differentiate whether an individual needs a short-term or long-term hospital stay. Another study by Bae et al. ⁴⁴ predicted the mortality of patients with COVID-19 using chest x-ray radiomics. Their model could identify whether a patient needs mechanical ventilation or not. 
Artificial intelligence (AI) has been widely used in radiology to provide diagnostic and prognostic tools to help clinicians during the pandemic. However, owing to the lack of standardization in AI studies in terms of data collection, methodology, and evaluation, most of these studies were not pragmatic when it comes to clinical adoption ^{13, 45, 46}. In a recent study by Roberts et al. ¹³, possible sources of bias in more than 2000 AI articles in COVID-19 were evaluated in both deep learning and traditional machine learning-based studies. This review showed that bias does exist in most, if not all, of the studies in different domains, including dataset and methodology. In the dataset domain, several articles used public datasets which can contain duplicates, low-quality images, false demographics, or unknown clinical/lab data of patients. These public datasets can also induce bias in the outcome domain, as they may fail to supply sufficient information about how exactly a patient was proven to be COVID-19 positive or how imaging data were acquired in terms of image acquisition and reconstruction. The reviews also noted the use of small datasets, Frankenstein datasets ¹³, and Toy datasets ⁴⁵ in several articles. In the methodology domain, most of the studies did not provide all methodological information or did not perform a standard AI analysis based on guidelines ^{47, 48}.

Overall, Roberts et al. ¹³ reviewed 69 traditional machine learning/radiomics studies and reported that 44 were excluded because of Radiomics Quality Score (RQS) ⁴⁷ of less than six or not describing the datasets appropriately. From the remaining 25, six articles performed model evaluation using external validation sets and only four papers reported the significance of their model along with the statistical parameters (agreement level). They also assessed bias in the prognostication studies in the four areas of prediction model risk of bias assessment tool (PROBAST) guide and reported high bias in participants, predictors, outcomes, and analysis areas. Overall, there are several radiomics studies targeting improved COVID-19 diagnosis or prognosis. However, owing to the limited sample size, single-centered nature of most of the databases, and variability in data acquisition and image reconstruction parameters, the models tend to overfit ¹³. Providing a generalizable model which is reproducible on unseen datasets of other centers is highly desired. In this context, we designed a large multi-institutional study to build and evaluate a radiomics model based on a large-scale CT imaging dataset aimed at the prediction of survival (alive or deceased) in COVID-19 infected patients. We built and evaluated our model based on different guidelines and tested different machine learning algorithms in different strategies to evaluate model reproducibility and repeatability in a large dataset.

MATERIALS AND METHODS

Figure 1 summarizes the different steps adopted in this study. To provide a standard and reproducible study, we completed different checklists/guidelines concerning predictive modeling, radiomics studies, and artificial intelligence studies. The Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) [40] checklist is provided in supplemental Table 1. We also reported the Radiomics Quality Score (RQS) based on Lambin et al. ⁴⁷ and the Checklist for Artificial Intelligence in Medical imaging (CLAIM) ⁴⁸ in supplemental material. These checklists were filled out by two individuals (with consensus) who are experts in the radiomics field and are not co-authors of this study.

Figure 1:

The flow chart of our study represents the different radiomics steps.

Table 1.

Demographics and data acquisition parameters across different centers.

Patient Population

This study was approved by our local institutional review board (IRB), and written informed consent of patients was waived by the ethics committee as anonymized data were used without any interventional effect on diagnosis, treatment or management of COVID-19 patients.

In the first step, 24,448 patients from 19 medical centers in Iran who were suspected of COVID-19 and had undergone chest CT were included. Several exclusion criteria were applied to provide a reliable dataset. We excluded: (i) patients without follow-up information or clear evidence of clinical endpoint, or who were transferred to another medical center (3519 patients); (ii) patients with negative RT-PCR (1860 patients); (iii) patients with laboratory-confirmed pneumonia of other types (1606 patients); (iv) patients with confirmed lung cancer or metastases from other origins to the lungs (1400 patients); (v) patients with atypical CT findings for other abnormalities (850 patients); (vi) CT images acquired with contrast media administration (58 patients); (vii) CT images with severe motion or bulk motion artifacts, which were carefully checked (515 patients); (viii) CT images with extremely inappropriate positioning that resulted in missing the upper and lower bounds of the lungs (121 patients); and (ix) CT images with extremely low quality or signal-to-noise ratio (SNR) (210 patients).

Considering these criteria, we excluded 10,149 patients from further analysis (Figure 2). Hence, 14,339 chest CT scans (with one scan per patient whose COVID-19 was confirmed either by RT-PCR or CT imaging) were included in this study. Common symptoms of COVID-19, including fever, respiratory symptoms, shortness of breath, dry cough, and tiredness, were recorded, and contact history with COVID-19 patients was also assessed. In each center, CT images were evaluated by at least two radiologists and, in case of discrepancy, a third radiologist was involved to settle the disagreement. As defined in the COVID-19 Reporting and Data System (CO-RADS) ⁴⁹, typical manifestations of COVID-19, such as ground-glass opacity, consolidation, crazy-paving pattern, or dominant peripheral distribution of parenchymal abnormalities, were considered diagnostic for COVID-19 in CT images.

Figure 2:

Inclusion and exclusion criteria in this study. Fraction of deceased patients is overrepresented in our data due to our exclusion criteria.

Among these studies, 13,731 CT images were collected from 18 centers in Iran (1560 deceased, 12,171 alive; the fraction of deceased cases is significantly overrepresented due to our exclusion criteria), and 608 images were gathered from an online open-access database from China (Center 9: 18 deceased, 590 alive) ⁵⁰. All patients from Iran received standard treatment regimens according to the interim national COVID-19 treatment guideline [corona.behdasht.gov.ir]. Only one center (Center 10) included outpatient studies; the rest were inpatient-only studies from hospitalized patients. In the outpatient cases, follow-up was performed 3-4 months after the initial CT scan. For admitted patients (inpatient), follow-up was performed until discharge from the hospital, which was decided after careful evaluation of patients by the attending physician based on several criteria, including stable hemodynamic state (BP>90/60, HR<120), absence of fever for > 2 days, absence of any respiratory distress, blood oxygen saturation >93% in ambient air without the need for supplementary oxygen, and no need for hospitalization for any other pathology.

CT Image Acquisition

All chest CT images from the Iranian centers were acquired according to an institutional variation of the national society of radiology COVID-19 imaging guidelines ⁵¹. Image acquisition was performed during breath-hold to reduce motion artifacts. Variations in CT imaging protocols among centers were observed which led to considerable variability in image quality and radiation dose. Volumetric CT Dose Index (CTDIvol), as a parameter representing vendor-free information on radiation exposure, was reported to better reflect intra/inter institutional variability of our dataset. Table 1 summarizes the image acquisition characteristics of each center, including the number of images, acquisition parameters (slice thickness, tube current), and CTDIvol.

Image Segmentation and Image Preprocessing

The lungs were automatically segmented using our DL-based algorithm named COLI-NET, which we previously proposed and evaluated ⁵². To reduce feature extraction time, all images were first cropped to the lung region and then resized to 296×216. After the segmentations were reviewed, the images were resampled to an isotropic voxel size of 1×1×1 mm³, and intensities were discretized into 64 bins ⁵³.
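As an illustration of the discretization step, the following minimal sketch maps intensities inside a mask to 64 gray levels. It assumes a fixed-bin-number scheme in the style of the IBSI guidelines; the text does not state which binning scheme was used, and the function name and sample Hounsfield unit values are hypothetical.

```python
import math


def discretize_fixed_bin_number(intensities, n_bins=64):
    """Map intensities to integer gray levels 1..n_bins (fixed bin number)."""
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        # A constant region has no intensity range; put every voxel in bin 1.
        return [1 for _ in intensities]
    levels = []
    for x in intensities:
        level = math.floor(n_bins * (x - lo) / (hi - lo)) + 1
        # The maximum intensity would land in bin n_bins + 1, so clamp it.
        levels.append(min(level, n_bins))
    return levels


# Hypothetical Hounsfield unit values sampled from a lung mask.
hu_values = [-1000.0, -850.0, -700.0, -400.0, -100.0, 50.0]
gray_levels = discretize_fixed_bin_number(hu_values, n_bins=64)
print(gray_levels)
```

Texture matrices such as the GLCM are then computed on these integer gray levels rather than on the raw intensities.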

Radiomics Feature Extraction and Harmonization

After image preprocessing, radiomics feature extraction was performed using the PyRadiomics Python library ⁵⁴. Radiomics features, including morphological (n=16), intensity (n=17), and texture features including second-order features, such as Gray Level Co-occurrence Matrix (GLCM, n=24), higher-order features namely Gray Level Size Zone Matrix (GLSZM, n=16), Neighboring Gray Tone Difference Matrix (NGTDM, n=5), Gray Level Run Length Matrix (GLRLM, n=16), and Gray Level Dependence Matrix (GLDM, n=14) were extracted in compliance with the Image Biomarker Standardization Initiative (IBSI) guidelines ⁵³.

Feature Preprocessing

Each feature was normalized using Z-score normalization, which consists of subtracting the mean from each feature value followed by division by the standard deviation. The mean and standard deviation were calculated on the training set and then applied to the test set. Correlation between features was evaluated using Pearson correlation, and features with high correlation (R²>0.99) were eliminated. Owing to class imbalance in the training and test sets, we applied the Synthetic Minority Oversampling Technique (SMOTE) algorithm to the training set only for the different models.
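The normalization and correlation filtering described above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the study's implementation: the helper names and toy feature values are hypothetical, and the rule for choosing which feature of a correlated pair to drop (keep the first one encountered) is an assumption. SMOTE is not shown.

```python
import math


def fit_zscore(train_rows):
    """Estimate per-feature (mean, std) on the training rows only."""
    n = len(train_rows)
    stats = []
    for column in zip(*train_rows):
        mean = sum(column) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
        stats.append((mean, std or 1.0))  # guard against zero variance
    return stats


def apply_zscore(rows, stats):
    """Normalize rows with statistics that were fitted on the training set."""
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def pearson_r(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 0.0 if va == 0 or vb == 0 else cov / math.sqrt(va * vb)


def select_uncorrelated(train_rows, threshold=0.99):
    """Keep a feature only if its R^2 with every kept feature is <= threshold."""
    columns = list(zip(*train_rows))
    kept = []
    for j, column in enumerate(columns):
        if all(pearson_r(column, columns[k]) ** 2 <= threshold for k in kept):
            kept.append(j)
    return kept


# Toy data: feature 1 is exactly twice feature 0, so one of the two is redundant.
train = [[1.0, 2.0, 5.0], [2.0, 4.0, 3.0], [3.0, 6.0, 8.0], [4.0, 8.0, 1.0]]
test = [[2.5, 5.0, 4.25]]

stats = fit_zscore(train)
kept = select_uncorrelated(train)
train_z = [[row[j] for j in kept] for row in apply_zscore(train, stats)]
test_z = [[row[j] for j in kept] for row in apply_zscore(test, stats)]
print(kept, test_z)
```

Fitting the statistics and the list of kept features on the training rows and reusing them on the test rows keeps the test set unseen during preprocessing.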

Feature Selection and Classification

In this study, we used 4 feature selection algorithms, including Analysis of Variance (ANOVA), Kruskal-Wallis (KW), Recursive Feature Elimination (RFE), and Relief. Feature preprocessing and selection were performed on training sets and then applied to test sets. All test and external validation sets were unseen during feature preprocessing, feature selection, and model building. For the classification task, we used seven classifiers, including Logistic Regression (LR), Least Absolute Shrinkage and Selection Operator (LASSO), Linear Discriminant Analysis (LDA), Random Forest (RF), AdaBoost (AB), Naïve Bayes (NB), and Multilayer Perceptron (MLP). By cross-combination of the four feature selectors and seven classifiers, we tested twenty-eight different combinations.
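The twenty-eight combinations follow from crossing the four feature selectors with the seven classifiers. A short sketch of that enumeration, using the abbreviations from the text as plain labels (the `FS+classifier` naming is only an illustrative convention):

```python
from itertools import product

FEATURE_SELECTORS = ["ANOVA", "KW", "RFE", "Relief"]
CLASSIFIERS = ["LR", "LASSO", "LDA", "RF", "AB", "NB", "MLP"]

# Every feature selector is paired with every classifier: 4 x 7 = 28 pipelines.
combinations = [f"{fs}+{clf}" for fs, clf in product(FEATURE_SELECTORS, CLASSIFIERS)]
print(len(combinations))
```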

Evaluation

For thorough assessment, we trained and evaluated our models using 10 different strategies, as summarized in Figure 3. To evaluate the models on the whole dataset without considering data variability in each center, we divided the whole dataset into 70% training and 30% test sets, resulting in the following two strategies (1 and 2):

Figure 3:

Different strategies implemented in this study for model evaluation

- Random Splitting method #1: Non-harmonized datasets were randomly split into 70% (10,038 patients) and 30% (4301 patients) for training and test sets, respectively, without considering centers. The data included patients whose COVID-19 was confirmed using RT-PCR and patients confirmed only by imaging. This test dataset included both populations.

- Random Splitting method #2: Non-harmonized datasets were randomly split into 70% (8503 patients) and 30% (3644 patients) for the training and test sets, respectively, without considering centers. The train and test sets consisted of only patients with positive RT-PCR.

To evaluate the models on the whole dataset considering data variability in each center, we divided the dataset of each center into 70% training and 30% test sets, resulting in the following two strategies (3 and 4):

- Random Splitting method #3: Data from each center (non-harmonized) were randomly split into 70% (10,048 patients) and 30% (4291 patients) for the training and test sets, respectively. As our data included patients whose COVID-19 was confirmed using RT-PCR and patients confirmed only by imaging, this test dataset included both populations.

- Random Splitting method #4: Data from each center were randomly split into 70% (10,704 patients) and 30% (3635 patients) for the training and test sets, respectively. The train and test sets consist of only patients with positive RT-PCR.
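The difference between splitting the pooled dataset and splitting within each center can be sketched as follows. This is a minimal illustration with hypothetical patient identifiers and a hypothetical helper name. It assumes a plain random split inside each center, with no stratification by outcome, since the text does not describe one.

```python
import random
from collections import defaultdict


def split_within_centers(center_of_patient, train_fraction=0.7, seed=0):
    """Split patient IDs inside each center so every center feeds both sets."""
    by_center = defaultdict(list)
    for patient_id, center in center_of_patient.items():
        by_center[center].append(patient_id)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_ids, test_ids = [], []
    for center in sorted(by_center):
        ids = sorted(by_center[center])
        rng.shuffle(ids)
        cut = round(train_fraction * len(ids))
        train_ids.extend(ids[:cut])
        test_ids.extend(ids[cut:])
    return train_ids, test_ids


# Toy cohort: 3 centers with 10 patients each.
toy_cohort = {f"c{c}_p{i}": c for c in (1, 2, 3) for i in range(10)}
train_ids, test_ids = split_within_centers(toy_cohort)
print(len(train_ids), len(test_ids))
```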

To evaluate the models on the whole dataset while removing data variability due to acquisition/reconstruction differences between centers, ComBat harmonization, proposed by Johnson et al. ⁵⁵, was applied to the extracted features to tackle the effect of center-based imaging variability. The impact of ComBat harmonization on radiomics features was assessed by the Kruskal-Wallis test. After applying ComBat harmonization, we divided the dataset of each center into 70/30% train/test sets, resulting in the following two strategies (5 and 6):

- Random Splitting method #5: Data from each center (ComBat harmonization) were randomly split into 70% (10,048 patients) and 30% (4291 patients) for the training and test sets, respectively. As our data included patients whose COVID-19 was confirmed using RT-PCR and patients confirmed only by imaging, this test dataset included both populations.

- Random Splitting method #6: Data from each center (ComBat harmonization) were randomly split into 70% (10,704 patients) and 30% (3635 patients) for the training and test sets, respectively. The train and test sets consisted of only patients with positive RT-PCR.

To evaluate model generalizability and sensitivity to datasets, we performed model assessment using the following strategies (7 to 9) on the external validation sets:

- Random Splitting method #7: Data (non-harmonized) were randomly split by center into 70% (10,655 patients) and 30% (3684 patients) for the training and external validation sets, respectively. Each center appears in either the training set or the external validation set, but not in both.

- Center-based model evaluation #8: we built models on one center’s dataset (non-harmonized) and then evaluated on 18 remaining centers (external validation set), and then repeated this process for all datasets.

- Leave-one-center-out (LOCO) #9: On each of the 19 iterations, 18 centers were used as the training set, and one as the external validation set (unseen data during training). We repeated this process for all center datasets (non-harmonized).
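The leave-one-center-out scheme can be sketched as a generator that holds out one center per iteration. The identifiers and function name below are hypothetical; the sketch only shows how the folds are formed, not how the models are trained.

```python
def leave_one_center_out(center_of_patient):
    """Yield (held_out_center, train_ids, validation_ids) once per center."""
    centers = sorted(set(center_of_patient.values()))
    for held_out in centers:
        train_ids = [p for p, c in center_of_patient.items() if c != held_out]
        valid_ids = [p for p, c in center_of_patient.items() if c == held_out]
        yield held_out, train_ids, valid_ids


# Toy cohort: 19 centers with 3 patients each.
loco_cohort = {f"c{c}_p{i}": c for c in range(1, 20) for i in range(3)}
folds = list(leave_one_center_out(loco_cohort))
print(len(folds))
```

With 19 centers this yields 19 folds, and per-fold metrics can then be summarized as mean ± standard deviation across the held-out centers.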

To evaluate model sensitivity to each dataset, we trained and tested the models on each center's dataset separately using the following strategy:

- Random Splitting method #10: Data from each center (non-harmonized) were randomly split into 70% and 30% for the training and test sets, respectively. The models were built and evaluated on each center separately.

All multivariate steps, including feature preprocessing, feature selection, and classification, were performed separately for each strategy. Classification algorithms were optimized during training using grid search. The best models were chosen by the one-standard-deviation rule in 10-fold cross-validation and then evaluated on test or external validation sets. The accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were reported for the test or external validation sets (unseen during training). Statistical comparison of AUCs (with 10,000 bootstrap samples) between models was performed using the DeLong test ⁵⁶. The significance level was set at 0.05. All multivariate analysis steps were performed using the open-source Python library Scikit-Learn.
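For reference, the reported metrics can be computed as in the minimal sketch below. It assumes labels coded as 1 = deceased and 0 = alive and a hypothetical decision threshold of 0.5; the toy scores are invented for illustration. The AUC is computed through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half). The DeLong test and bootstrapping are not shown.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 deceased (1) and 5 alive (0) patients with predicted risks.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(round(sens, 3), round(spec, 3), round(auc, 3))
```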

RESULTS

Figure 4 depicts the hierarchical clustering heat map of the radiomics feature distribution in the alive and deceased groups for the whole dataset prior to ComBat harmonization. Supplemental Figure 1 shows the cluster heat map of radiomics features in the non-harmonized data set. Figure 5 shows the correlation of radiomics features in the whole dataset, whereas Supplemental Figure 2 shows the same for ComBat-harmonized features. The statistical differences calculated using the Kruskal-Wallis test before and after ComBat harmonization are presented in Supplemental Table 1. The results showed that ComBat harmonization eliminated the center effect for most radiomics features. ComBat-harmonized data were used only for strategies 5 and 6. Figures 6-8 provide the classification performance indices of AUC, sensitivity, and specificity, respectively, for strategies 1-10. More detailed results are presented in Supplemental Tables 2-11 for the different strategies.

Supplemental Figure 1.

Cluster heat map of radiomics features in the non-harmonized data set.

Supplemental Figure 2.

Pearson correlation of ComBat-harmonized radiomics features.

Supplemental Table 1.

P-values in Kruskal-Wallis analysis of radiomics features before and after ComBat harmonization.

Supplemental Table 2.

Classification performance indices for different feature selectors (FS) and classifiers in Strategy 1.

Supplemental Table 3.

Classification performance indices for different feature selectors (FS) and classifiers in Strategy 2.

Supplemental Table 4.

Classification performance indices for different feature selectors (FS) and classifiers in Strategy 3.

Supplemental Table 5.

Classification performance indices for different feature selectors (FS) and classifiers in Strategy 4.

Supplemental Table 6.

Classification performance indices for different feature selectors (FS) and classifiers in Strategy 5.

Supplemental Table 7.

Classification performance indices for different feature selectors (FS) and classifiers in Strategy 6.

Supplemental Table 8.

Classification performance indices for different feature selectors (FS) and classifiers in Strategy 7.

Supplemental Table 9.

Classification performance indices for different feature selectors (FS) and classifiers in Strategy 8.

Supplemental Table 10.

Classification performance indices for different feature selectors (FS) and classifiers in Strategy 9.

Supplemental Table 11.

Classification performance indices for different feature selectors (FS) and classifiers in Strategy 10.

Figure 4:

Cluster heat map of radiomics features in non-harmonized data set

Figure 5.

Radiomics feature correlation using Pearson correlation in non-harmonized data set

Figure 6:

Heat map of AUC for cross combination of feature selectors and classifiers in different ten strategies

Figure 7:

Heat map of Sensitivity for cross combination of feature selectors and classifiers in different ten strategies

Figure 8:

Heat map of Specificity for cross combination of feature selectors and classifiers in different ten strategies

In strategy 1, where the data were randomly split into train and test sets (without considering centers), the RFE feature selector and RF classifier combination resulted in the highest performance, with an AUC of 0.84±0.01 (CI95%: 0.82-0.85) and sensitivity and specificity of 0.78 and 0.76, respectively. In strategy 2, where only PCR positive studies were randomly split into train and test sets (without considering centers), the KW feature selector and RF classifier combination resulted in the highest performance, with an AUC of 0.84±0.01 (CI95%: 0.82-0.86) and sensitivity and specificity of 0.81 and 0.76, respectively. There was no statistically significant difference between strategies 1 and 2, the main difference being the inclusion of CT and PCR positive studies in strategy 1 and only PCR positive studies in strategy 2.

In strategy 3, where the whole dataset was split into train and test sets within each center separately, the ANOVA feature selector and RF classifier combination resulted in the highest performance, with an AUC of 0.83±0.01 (CI95%: 0.81-0.85) and sensitivity and specificity of 0.81 and 0.72, respectively. Similar results were achieved for strategy 4, where the PCR positive dataset was split into train and test sets within each center separately. There was no statistically significant difference between strategies 3 and 4, the main difference being the inclusion of CT and PCR positive studies in strategy 3 and only PCR positive studies in strategy 4.

In strategy 5, where the ComBat-harmonized whole dataset was split into train and test sets within each center separately, the Relief feature selector and RF classifier combination resulted in the highest performance, with an AUC of 0.83±0.01 (CI95%: 0.81-0.85) and sensitivity and specificity of 0.77 and 0.74, respectively. In strategy 6, where the ComBat-harmonized data of PCR positive studies were split into train and test sets within each center separately, the Relief feature selector and RF classifier combination resulted in the highest performance, with an AUC of 0.83±0.01 (CI95%: 0.81-0.84) and sensitivity and specificity of 0.79 and 0.72, respectively. There were no statistically significant differences between strategies 5 and 6. The statistical comparison of AUCs between the ComBat harmonization strategies (5 and 6) and the corresponding non-harmonized splits (strategies 3 and 4) using the DeLong test did not reveal any statistically significant differences.

In strategy 7, where the splitting into train and test sets was performed based on centers (each center appears in only one of the training and test sets), the RFE feature selector and RF classifier combination resulted in the highest performance, with an AUC of 0.79±0.01 (CI95%: 0.76-0.81) and sensitivity and specificity of 0.73 and 0.71, respectively. Figure 9 depicts the ROC curves of the test set for strategies 1-7 as well as the comparison of the different strategies. In strategy 8, where the model was built on one center's dataset (non-harmonized) and then evaluated on the 18 remaining centers (external validation set), the ANOVA feature selector and NB classifier combination resulted in the highest performance, with an AUC of 0.74±0.034 and sensitivity and specificity of 0.71 ± 0.026 and 0.69 ± 0.033, respectively. The results of each center are presented in Supplemental Table 12. To evaluate the model on external validation sets, we reported the results of each center in the LOCO strategy 9. The ANOVA feature selector and LR classifier combination resulted in the highest performance, with an AUC of 0.80±0.084 and sensitivity and specificity of 0.77 ± 0.11 and 0.76 ± 0.075, respectively. In strategy 10, the data from each center (non-harmonized) were randomly split into 70% and 30% for training and test sets, respectively, and the models were built and evaluated on each center separately. The ANOVA feature selector and LR classifier combination resulted in the highest performance, with an AUC of 0.82±0.10 and sensitivity and specificity of 0.84 ± 0.12 and 0.77 ± 0.09, respectively. The results of each center are presented in Supplemental Tables 13 and 14 for strategies 9 and 10, respectively.

Supplemental Table 12.

Classification performance indices of different feature selectors and classifiers in Strategy 8.

Supplemental Table 13.

Classification performance indices of different feature selectors and classifiers in Strategy 9.

Supplemental Table 14.

Classification performance indices of different feature selectors and classifiers in Strategy 10.

Figure 9:

ROC curve for test sets in strategies 1-7. Strategy 1: AUC 0.84±0.01 (a), Strategy 2: AUC 0.84±0.01 (b), Strategy 3: AUC 0.83±0.01 (c), Strategy 4: AUC 0.83±0.01 (d), Strategy 5: AUC 0.83±0.01 (e), Strategy 6: AUC 0.83±0.01 (f), Strategy 7: AUC 0.79±0.01 (g) and different strategies comparison (k)

DISCUSSION

In this multi-centric study, we conducted a CT-based radiomics analysis to assess the ability of our model to predict the overall survival of patients with COVID-19 using a large multi-institutional dataset. We included 14,339 patients along with their CT images, segmented the lungs, and extracted distinct radiomics features. We evaluated different combinations of feature selectors and classifiers in different strategies. Since the dataset was gathered from different centers, we applied the ComBat harmonization algorithm, which has been successfully applied in radiomics studies, to the extracted features ⁵⁷. As our dataset consisted of imbalanced classes, we first applied the Synthetic Minority Oversampling Technique (SMOTE) algorithm to the training sets only. Our model was then trained, and the results of 3 different testing methods were reported.
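The idea of oversampling the minority class in the training set only can be illustrated with a simplified sketch. The code below is a minimal pure-Python rendering of the basic SMOTE interpolation step, written for illustration and not taken from the study. The function names, the choice of `k`, the coding of the minority class as label 1, and the target of a fully balanced training set are all assumptions.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between minority neighbours."""
    rng = random.Random(seed)

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority samples, excluding the base sample
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: dist2(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def oversample_training_set(X_train, y_train, minority_label=1, k=5, seed=0):
    """Balance the training set; the test set is never passed to this function."""
    minority = [x for x, y in zip(X_train, y_train) if y == minority_label]
    n_majority = len(y_train) - len(minority)
    n_new = max(0, n_majority - len(minority))
    if len(minority) < 2:
        n_new = 0  # interpolation needs at least two minority samples
    new_rows = smote(minority, n_new, k=k, seed=seed) if n_new else []
    return X_train + new_rows, y_train + [minority_label] * len(new_rows)
```

Restricting the oversampling to the training data keeps synthetic samples out of the test set, so the reported test metrics reflect the original class distribution.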

Prognostic modeling can be regarded as an important framework towards better understanding of disease, its management, monitoring, and identification of the best treatment options. A number of reports have shown the effectiveness of image-based, laboratory-based, or combined models in outcome prediction of COVID-19 infected patients ^{58, 59}. Qiu et al. ⁶⁰ constructed a radiomics model trained to classify the severity of COVID-19 lesions (mild vs severe) using CT images. Their study included a medium-to-large number of patients (n=1160) and achieved an AUC of 0.87 in the test dataset. They showed that the radiomics signature is potent in aiding physicians to manage patients in a more precise way. Fu et al. ³⁹ conducted a similar experiment with a radiomics-based model using CT images and applied it to data from 64 patients to classify them into progressive and stable groups. Their model could accurately perform the given task (AUC=0.83). While the results were promising, their study did not include a large cohort.

A study by Chao et al. ⁵⁹ included different types of information, such as CT-based radiomics features, clinical, and demographic data, to build a holistic prognostic model. Their model could predict whether patients would require ICU admission with an AUC of 0.88. Tang et al. ⁶¹ also assessed a random forest model for classifying patients into severe and non-severe categories based on CT imaging radiomics features along with laboratory test results. The model performed well (AUC=0.98) on their dataset consisting of 118 patients. In a study by Wu et al. ⁶², the authors assessed the predictive power of a radiomic signature for poor patient outcomes, defined as ICU admission, need for mechanical ventilation, or death. Their model reached an AUC of 0.97 in the prediction of 28-day outcomes after CT images were taken. This highly promising result was achieved with the help of clinical data and harmonization of the features. In our study, by contrast, ComBat harmonization did not appear to impact outcome prediction.

One should note that both clinical-only and radiomics-only survival prediction models have advantages. However, studies have shown that radiomics features yield superior accuracy in most cases. In a study by Homayounieh et al. ⁶³, the authors developed a radiomics-based signature and compared it with a clinical-only signature in terms of mortality prediction. They concluded that the radiomics-based model can outperform the clinical-only model by a wide margin (AUC of 0.81 versus 0.68). Their study included 315 adults and was applied to other clinical outcomes as well, such as the prediction of outpatient/inpatient care and ICU admission. In addition, other reports indicated that adding clinical features to the radiomics model only slightly improved the results ⁶⁴. In a recent study, Shiri et al. ⁶⁴ performed a radiomics study for prognostication (alive or deceased) of COVID-19 patients using clinical features (demographic, laboratory, and radiological scoring), COVID-19 pneumonia lesion radiomics features, and whole lung radiomics features, separately and in combination. They trained a machine learning pipeline, with Maximum Relevance Minimum Redundancy (MRMR) as the feature selector and XGBoost as the classifier, on 106 patients and evaluated it on a test set of 46 patients. They reported an AUC of 0.87 ± 0.04 for clinical-only, 0.92 ± 0.03 for whole lung radiomics, 0.92 ± 0.03 for lesion radiomics, 0.91 ± 0.04 for lung + lesion radiomics, 0.92 ± 0.03 for lung radiomics + clinical data, 0.94 ± 0.03 for lesion radiomics + clinical data and 0.95 ± 0.03 for lung + lesion radiomics + clinical data. The lung and lesion radiomics-only models showed similar performance, while the integration of features resulted in the highest accuracy.

Lassau et al. ⁶⁵ combined CT-based DL models with biological and clinical features for severity prediction in 1003 COVID-19 patients, confirmed by either CT or RT-PCR. They showed that clinical and biological features correlate with CT markers. Zhang et al. ⁵⁸ conducted a diagnostic and prognostic study using 3777 COVID-19 patients. They reported high positive and negative correlations of lung-lesion CT manifestations with a number of clinical and laboratory tests. They also reported that their diagnostic model (COVID-19 versus common pneumonia and normal controls) can improve radiologists' performance from junior to senior level (AUC = 0.98). For progression to severe/critical disease, their prognostic model achieved an AUC of 0.90 with sensitivity and specificity of 0.80 and 0.86, respectively. Feng et al. ⁶⁶ built a machine learning prognostic model using a multicenter COVID-19 dataset. They reported a high correlation of CT features with clinical findings and evaluated a multivariable model in a validation set consisting of 106 patients, where the AUC was 0.89 (95% CI: 0.81–0.98). Recently, Xu et al. ⁶⁷ conducted a multicentric study for the prediction of ICU admission, mechanical ventilation, and mortality of hospitalized patients with COVID-19. CT radiomics features were integrated with demographic and laboratory tests. The evaluation was performed in 1362 patients from nine hospitals, yielding AUCs of 0.916, 0.919 and 0.853 for ICU admission, mechanical ventilation, and mortality of hospitalized patients, respectively. For the radiomics-only model, they reached AUCs of 0.86, 0.80, and 0.66 for these three outcomes, respectively.

Most previous studies suffered from a common limitation of COVID-19 RT-PCR not being available for the entire dataset when using multicentric data. In our study, COVID-19 positivity was confirmed by either RT-PCR or CT images, and different strategies were adopted to evaluate the models, including random splits and leave-one-center-out. We randomly split the data to train and test sets containing both CT positive and RT-PCR positive patients. Furthermore, to ensure the reproducibility of our results on RT-PCR positive patients, we split the dataset in a way that the test set consisted only of RT-PCR positive patients. To maximize the generalizability of the model and avoid overfitting on training sets, owing to variability in acquisition and reconstruction protocols, our model was developed on multicentric datasets with a wide variety of acquisition and reconstruction parameters. To test the generalizability of our model, we repeated the evaluation of our model using leave-one-center-out cross-validation. The results were reported for 10 different strategies of splitting and cross-validation scenarios.

Several studies reported on the use of CT radiomics or DL algorithms for diagnostic and prognostic purposes in patients with COVID-19 ^{58, 59, 68}. However, most studies were performed using a small sample size. Overall, establishing evidence that radiomics features can help prioritize patients based on the severity of their disease and/or predicting their survival requires assessment using larger cohorts for a more generalizable model because of the wide variability in COVID-19 manifestations in different patients. In this study, we provided a large multinational multicentric dataset and evaluated our model in different scenarios to ensure model reproducibility, robustness and generalizability.

Although we attempted to address bias and limitations to create a generalizable model, the results should be interpreted considering some issues. First, motion artifacts were unavoidable in some COVID-19 patient scans, which resulted in overlapping pneumonia regions. We removed patients with severe motion artifacts to limit this effect on model generalization. Second, we enrolled patients with common symptoms of COVID-19 whose infection was confirmed by either RT-PCR or CT imaging (typical manifestation of COVID-19 defined by interim guidelines). We handled this issue by testing different scenarios, including training a model using RT-PCR or CT positive patients while holding out only RT-PCR positive patients in the test set, and obtained reproducible and repeatable results. Third, we did not include comorbidities (increased risk of adverse outcome), clinical or laboratory data during modeling. However, previous studies showed high correlation of lung features with these findings ^{58, 65, 66}. Future studies combining various information to build a holistic model using a large dataset could improve the model’s performance. Fourth, we built a prognostic model based on all lung radiomics features. However, COVID-19 can result in imaging manifestations in other organs, such as the heart. Including features from different organs has the potential to improve prognostic performance ⁶⁹. Fifth, therapeutic regimens for different patients were not considered during modeling, although providing this information may help improve the accuracy of the model. Sixth, only binary classification was considered for the prognostic model in this study. Future studies should perform survival analysis using time-to-event models to account for the time of adverse event. Lastly, we did not evaluate the impact of image acquisition or reconstruction parameters on radiomics features. Instead, we applied the ComBat harmonization algorithm to eliminate center-specific parameter effects on CT radiomics features.

CONCLUSION

A very large heterogeneous COVID-19 database was gathered from multiple centers and a predictive model of survival outcome derived and extensively tested to evaluate its reproducibility and generalizability. We demonstrated that lung CT radiomics features could be used as biomarkers for prognostic modeling in COVID-19. Through the use of a large imaging dataset, the predictive power of the proposed CT radiomics model is more reliable and may be prospectively used in clinical setting to manage COVID-19 patients.

Data Availability

All data produced in the present work are contained in the manuscript

Data and code availability

Radiomics features and code will be made available upon request after publication.

Conflict of Interest statement

The authors declare that they have no conflict of interest.

ACKNOWLEDGMENTS

This work was supported by the Swiss National Science Foundation under grant SNRF 320030_176052.

Footnotes

First Author: Isaac Shiri, MSc, Geneva University Hospital, Division of Nuclear Medicine and Molecular Imaging, CH-1211 Geneva, Switzerland. Email: Isaac.shirilord@unige.ch

REFERENCES

1. Woolf, S.H., Chapman, D.A. & Lee, J.H. COVID-19 as the Leading Cause of Death in the United States. JAMA 325, 123–124 (2021).
2. Lai, C.-C., Shih, T.-P., Ko, W.-C., Tang, H.-J. & Hsueh, P.-R. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and coronavirus disease-2019 (COVID-19): The epidemic and the challenges. Int J Antimicrob Agents 55, 105924–105924 (2020).
3. Lai, C.-C., Ko, W.-C., Lee, P.-I., Jean, S.-S. & Hsueh, P.-R. Extra-respiratory manifestations of COVID-19. Int J Antimicrob Agents 56, 106024–106024 (2020).
4. Afzal, A. Molecular diagnostic technologies for COVID-19: Limitations and challenges. Journal of advanced research 26, 149–159 (2020).
5. Gill, T.M. The central role of prognosis in clinical decision making. JAMA 307, 199–200 (2012).
6. Yan, X., et al. Clinical Characteristics and Prognosis of 218 Patients With COVID-19: A Retrospective Study Based on Clinical Classification. Frontiers in medicine 7, 485–485 (2020).
7. Zhou, W., Qin, X., Hu, X., Lu, Y. & Pan, J. Prognosis models for severe and critical COVID-19 based on the Charlson and Elixhauser comorbidity indices. International journal of medical sciences 17, 2257–2263 (2020).
8. Pontone, G., et al. Role of computed tomography in COVID-19. J Cardiovasc Comput Tomog (2020).
9. Yang, R., et al. Chest CT Severity Score: An Imaging Tool for Assessing Severe COVID-19. Radiology: Cardiothoracic Imaging 2, e200047 (2020).
10. Francone, M., et al. Chest CT score in COVID-19 patients: correlation with disease severity and short-term prognosis. Eur Radiol 30, 6808–6817 (2020).
11. Zhao, W., Zhong, Z., Xie, X., Yu, Q. & Liu, J. Relation Between Chest CT Findings and Clinical Conditions of Coronavirus Disease (COVID-19) Pneumonia: A Multicenter Study. American Journal of Roentgenology 214, 1072–1077 (2020).
12. Li, K., et al. The Clinical and Chest CT Features Associated With Severe and Critical COVID-19 Pneumonia. Investigative radiology 55, 327–331 (2020).
13. Roberts, M., et al. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nature Machine Intelligence 3, 199–217 (2021).
14. Harmon, S.A., et al. Artificial intelligence for the detection of COVID-19 pneumonia on chest CT using multinational datasets. Nature communications 11, 4080 (2020).
15. Mei, X., et al. Artificial intelligence-enabled rapid diagnosis of patients with COVID-19. Nat Med 26, 1224–1228 (2020).
16. Cai, W., et al. CT Quantification and Machine-learning Models for Assessment of Disease Severity and Prognosis of COVID-19 Patients. Academic radiology 27, 1665–1678 (2020).
17. Lessmann, N., et al. Automated Assessment of CO-RADS and Chest CT Severity Scores in Patients with Suspected COVID-19 Using Artificial Intelligence. Radiology, 202439 (2020).
18. Meng, L., et al. A Deep Learning Prognosis Model Help Alert for COVID-19 Patients at High-Risk of Death: A Multi-center Study. IEEE journal of biomedical and health informatics Pp(2020).
19. Ning, W., et al. Open resource of clinical data from patients with pneumonia for the prediction of COVID-19 outcomes via deep learning. Nature biomedical engineering (2020).
20. Fang, M., et al. CT radiomics can help screen the coronavirus disease 2019 (COVID-19): a preliminary study. Science China Information Sciences 63, 1–8 (2020).
21. Wu, Q., et al. Radiomics Analysis of Computed Tomography helps predict poor prognostic outcome in COVID-19. Theranostics 10, 7231 (2020).
22. Homayounieh, F., et al. CT Radiomics, Radiologists and Clinical Information in Predicting Outcome of Patients with COVID-19 Pneumonia. Radiology: Cardiothoracic Imaging 2, e200322 (2020).
23. Wang, H., et al. Decoding COVID-19 pneumonia: comparison of deep learning and radiomics CT image signatures. European journal of nuclear medicine and molecular imaging, 1–9 (2020).
24. Abdollahi, H., Shiri, I. & Heydari, M. Medical Imaging Technologists in Radiomics Era: An Alice in Wonderland Problem. Iran J Public Health 48, 184–186 (2019).
25. Amini, M., et al. Multi-level multi-modality (PET and CT) fusion radiomics: prognostic modeling for non-small cell lung carcinoma. Phys Med Biol 66 (2021).
26. Bouchareb, Y., et al. Artificial intelligence-driven assessment of radiological images for COVID-19. Comput Biol Med 136, 104665 (2021).
27. Edalat-Javid, M., et al. Cardiac SPECT radiomic features repeatability and reproducibility: A multi-scanner phantom study. J Nucl Cardiol (2020).
28. Khodabakhshi, Z., et al. Overall Survival Prediction in Renal Cell Carcinoma Patients Using Computed Tomography Radiomic and Clinical Information. J Digit Imaging 34, 1086–1098 (2021).
29. Khodabakhshi, Z., et al. Non-small cell lung carcinoma histopathological subtype phenotyping using high-dimensional multinomial multiclass CT radiomics signature. Comput Biol Med 136, 104752 (2021).
30. Nazari, M., Shiri, I. & Zaidi, H. Radiomics-based machine learning model to predict risk of death within 5-years in clear cell renal cell carcinoma patients. Comput Biol Med 129, 104135 (2021).
31. Shayesteh, S., et al. Treatment response prediction using MRI-based pre-, post-, and delta-radiomic features and machine learning algorithms in colorectal cancer. Med Phys 48, 3691–3701 (2021).
32. Shiri, I., Abdollahi, H., Shaysteh, S. & Mahdavi, S.R. Test-retest reproducibility and robustness analysis of recurrent glioblastoma MRI radiomics texture features. Iranian Journal of Radiology (2017).
33. Shiri, I., et al. Machine learning-based prognostic modeling using clinical data and quantitative radiomic features from chest CT images in COVID-19 patients. Comput Biol Med 132, 104304 (2021).
34. Amini, M., et al. Overall Survival Prognostic Modelling of Non-small Cell Lung Cancer Patients Using Positron Emission Tomography/Computed Tomography Harmonised Radiomics Features: The Quest for the Optimal Machine Learning Algorithm. Clinical Oncology.
35. Mostafaei, S., et al. CT imaging markers to improve radiation toxicity prediction in prostate cancer radiotherapy by stacking regression algorithm. La radiologia medica 125, 87–97 (2020).
36. Nazari, M., Shiri, I. & Zaidi, H. Radiomics-based machine learning model to predict risk of death within 5-years in clear cell renal cell carcinoma patients. Comput Biol Med 129, 104135 (2020).
37. Akhavanallaf, A., Shiri, I., Arabi, H. & Zaidi, H. Whole-body voxel-based internal dosimetry using deep learning. Eur J Nucl Med Mol Imaging (2020).
38. Shiri, I., et al. Diagnosis of COVID-19 Using CT image Radiomics Features: A Comprehensive Machine Learning Study Involving 26,307 Patients. medRxiv (2021).
39. Fu, L., Li, Y., Cheng, A., Pang, P. & Shu, Z. A Novel Machine Learning-derived Radiomic Signature of the Whole Lung Differentiates Stable From Progressive COVID-19 Infection: A Retrospective Cohort Study. Journal of thoracic imaging (2020).
40. Homayounieh, F., et al. Computed Tomography Radiomics Can Predict Disease Severity and Outcome in Coronavirus Disease 2019 Pneumonia. Journal of computer assisted tomography 44, 640–646 (2020).
41. Li, C., et al. Classification of Severe and Critical COVID-19 Using Deep Learning and Radiomics. IEEE journal of biomedical and health informatics Pp(2020).
42. Cai, Q., et al. A model based on CT radiomic features for predicting RT-PCR becoming negative in coronavirus disease 2019 (COVID-19) patients. BMC medical imaging 20, 118 (2020).
43. Yue, H., et al. Machine learning-based CT radiomics method for predicting hospital stay in patients with pneumonia associated with SARS-CoV-2 infection: a multicenter study. Annals of translational medicine 8, 859 (2020).
44. Bae, J., et al. Predicting Mechanical Ventilation Requirement and Mortality in COVID-19 using Radiomics and Deep Learning on Chest Radiographs: A Multi-Institutional Study. ArXiv (2020).
45. Tizhoosh, H.R. & Fratesi, J. COVID-19, AI enthusiasts, and toy datasets: radiology without radiologists. Eur Radiol 31, 3553–3554 (2021).
46. Summers, R.M. Artificial Intelligence of COVID-19 Imaging: A Hammer in Search of a Nail. Radiology 298, E162–e164 (2021).
47. Lambin, P., et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol 14, 749–762 (2017).
48. Mongan, J., Moy, L. & Charles E. Kahn, J. Checklist for Artificial Intelligence in Medical Imaging (CLAIM): A Guide for Authors and Reviewers. Radiology: Artificial Intelligence 2, e200029 (2020).
49. Prokop, M., et al. CO-RADS: A Categorical CT Assessment Scheme for Patients Suspected of Having COVID-19-Definition and Evaluation. Radiology 296, E97–e104 (2020).
50. Ning, W., et al. Open resource of clinical data from patients with pneumonia for the prediction of COVID-19 outcomes via deep learning. Nat Biomed Eng 4, 1197–1207 (2020).
51. Radpour, A., et al. COVID-19 evaluation by low-dose high resolution CT scans protocol. Academic radiology 27, 901 (2020).
52. Shiri, I., et al. COLI-NET: Fully Automated COVID-19 Lung and Infection Pneumonia Lesion Detection and Segmentation from Chest CT Images. medRxiv, 2021.2004.2008.21255163 (2021).
53. Zwanenburg, A., et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology 295, 328–338 (2020).
54. van Griethuysen, J.J.M., et al. Computational Radiomics System to Decode the Radiographic Phenotype. Cancer research 77, e104–e107 (2017).
55. Johnson, W.E., Li, C. & Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127 (2006).
56. Robin, X., et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC bioinformatics 12, 1–8 (2011).
57. Da-Ano, R., Visvikis, D. & Hatt, M. Harmonization strategies for multicenter radiomics investigations. Phys Med Biol 65, 24tr02 (2020).
58. Zhang, K., et al. Clinically Applicable AI System for Accurate Diagnosis, Quantitative Measurements, and Prognosis of COVID-19 Pneumonia Using Computed Tomography. Cell 181, 1423–1433.e1411 (2020).
59. Chao, H., et al. Integrative analysis for COVID-19 patient outcome prediction. Medical image analysis 67, 101844 (2020).
60. Qiu, J., et al. A Radiomics Signature to Quantitatively Analyze COVID-19-Infected Pulmonary Lesions. Interdisciplinary sciences, computational life sciences, 1–12 (2021).
61. Tang, Z., et al. Severity assessment of COVID-19 using CT image features and laboratory indices. Physics in medicine and biology (2020).
62. Wu, Q., et al. Radiomics Analysis of Computed Tomography helps predict poor prognostic outcome in COVID-19. Theranostics 10, 7231–7244 (2020).
63. Homayounieh, F., et al. CT Radiomics, Radiologists, and Clinical Information in Predicting Outcome of Patients with COVID-19 Pneumonia. Radiology: Cardiothoracic Imaging 2, e200322 (2020).
64. Shiri, I., et al. Machine Learning-based Prognostic Modeling using Clinical Data and Quantitative Radiomic Features from Chest CT Images in COVID-19 Patients. Computers in Biology and Medicine, 104304 (2021).
65. Lassau, N., et al. Integrating deep learning CT-scan model, biological and clinical variables to predict severity of COVID-19 patients. Nature communications 12, 1–11 (2021).
66. Feng, Z., et al. Early prediction of disease progression in COVID-19 pneumonia patients with chest CT and clinical characteristics. Nature communications 11, 4968 (2020).
67. Xu, Q., et al. CT-based Rapid Triage of COVID-19 Patients: Risk Prediction and Progression Estimation of ICU Admission, Mechanical Ventilation, and Death of Hospitalized Patients. medRxiv : the preprint server for health sciences (2020).
68. Chassagnon, G., et al. AI-driven quantification, staging and outcome prediction of COVID-19 pneumonia. Medical image analysis 67, 101860 (2020).
69. Chassagnon, G., et al. AI-driven quantification, staging and outcome prediction of COVID-19 pneumonia. Medical image analysis 67, 101860 (2021).


Posted December 07, 2021.

Download PDF

Data/Code

Citation Tools

Subject Area

Radiology and Imaging

Subject Areas

All Articles

Addiction Medicine (412)
Allergy and Immunology (726)
Anesthesia (214)
Cardiovascular Medicine (3107)
Dentistry and Oral Medicine (349)
Dermatology (263)
Emergency Medicine (463)
Endocrinology (including Diabetes Mellitus and Metabolic Disease) (1100)
Epidemiology (13046)
Forensic Medicine (13)
Gastroenterology (862)
Genetic and Genomic Medicine (4866)
Geriatric Medicine (449)
Health Economics (751)
Health Informatics (3068)
Health Policy (1108)
Health Systems and Quality Improvement (1135)
Hematology (410)
HIV/AIDS (962)
Infectious Diseases (except HIV/AIDS) (14351)
Intensive Care and Critical Care Medicine (885)
Medical Education (453)
Medical Ethics (120)
Nephrology (502)
Neurology (4631)
Nursing (247)
Nutrition (689)
Obstetrics and Gynecology (847)
Occupational and Environmental Health (764)
Oncology (2393)
Ophthalmology (677)
Orthopedics (270)
Otolaryngology (333)
Pain Medicine (306)
Palliative Medicine (88)
Pathology (516)
Pediatrics (1243)
Pharmacology and Therapeutics (521)
Primary Care Research (522)
Psychiatry and Clinical Psychology (3976)
Public and Global Health (7201)
Radiology and Imaging (1606)
Rehabilitation Medicine and Physical Therapy (958)
Respiratory Medicine (944)
Rheumatology (460)
Sexual and Reproductive Health (478)
Sports Medicine (403)
Surgery (514)
Toxicology (65)
Transplantation (222)
Urology (190)

[1] 1.↵
Woolf, S.H., Chapman, D.A. & Lee, J.H. COVID-19 as the Leading Cause of Death in the United States. JAMA 325, 123–124 (2021).
OpenUrl CrossRef PubMed

[2] 2.↵
Lai, C.-C., Shih, T.-P., Ko, W.-C., Tang, H.-J. & Hsueh, P.-R. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and coronavirus disease-2019 (COVID-19): The epidemic and the challenges. Int J Antimicrob Agents 55, 105924–105924 (2020).
OpenUrl CrossRef PubMed

[3] 3.↵
Lai, C.-C., Ko, W.-C., Lee, P.-I., Jean, S.-S. & Hsueh, P.-R. Extra-respiratory manifestations of COVID-19. Int J Antimicrob Agents 56, 106024–106024 (2020).
OpenUrl

[4] 4.↵
Afzal, A. Molecular diagnostic technologies for COVID-19: Limitations and challenges. Journal of advanced research 26, 149–159 (2020).
OpenUrl

[5] 5.↵
Gill, T.M. The central role of prognosis in clinical decision making. JAMA 307, 199–200 (2012).
OpenUrl CrossRef PubMed Web of Science

[6] 6.↵
Yan, X., et al. Clinical Characteristics and Prognosis of 218 Patients With COVID-19: A Retrospective Study Based on Clinical Classification. Frontiers in medicine 7, 485–485 (2020).
OpenUrl

[7] 7.↵
Zhou, W., Qin, X., Hu, X., Lu, Y. & Pan, J. Prognosis models for severe and critical COVID-19 based on the Charlson and Elixhauser comorbidity indices. International journal of medical sciences 17, 2257–2263 (2020).
OpenUrl

[8] 8.↵
Pontone, G., et al. Role of computed tomography in COVID-19. J Cardiovasc Comput Tomog (2020).

[9] 9.↵
Yang, R., et al. Chest CT Severity Score: An Imaging Tool for Assessing Severe COVID-19. Radiology: Cardiothoracic Imaging 2, e200047 (2020).
OpenUrl

[10] 10.↵
Francone, M., et al. Chest CT score in COVID-19 patients: correlation with disease severity and short-term prognosis. Eur Radiol 30, 6808–6817 (2020).
OpenUrl PubMed

[11] 11.↵
Zhao, W., Zhong, Z., Xie, X., Yu, Q. & Liu, J. Relation Between Chest CT Findings and Clinical Conditions of Coronavirus Disease (COVID-19) Pneumonia: A Multicenter Study. American Journal of Roentgenology 214, 1072–1077 (2020).
OpenUrl PubMed

[12] 12.↵
Li, K., et al. The Clinical and Chest CT Features Associated With Severe and Critical COVID-19 Pneumonia. Investigative radiology 55, 327–331 (2020).
OpenUrl PubMed

[13] 13.↵
Roberts, M., et al. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nature Machine Intelligence 3, 199–217 (2021).
OpenUrl

[14] 14.↵
Harmon, S.A., et al. Artificial intelligence for the detection of COVID-19 pneumonia on chest CT using multinational datasets. Nature communications 11, 4080 (2020).
OpenUrl

[15] 15.↵
Mei, X., et al. Artificial intelligence-enabled rapid diagnosis of patients with COVID-19. Nat Med 26, 1224–1228 (2020).
OpenUrl

[16] 16.↵
Cai, W., et al. CT Quantification and Machine-learning Models for Assessment of Disease Severity and Prognosis of COVID-19 Patients. Academic radiology 27, 1665–1678 (2020).
OpenUrl CrossRef PubMed

[17] 17.↵
Lessmann, N., et al. Automated Assessment of CO-RADS and Chest CT Severity Scores in Patients with Suspected COVID-19 Using Artificial Intelligence. Radiology, 202439 (2020).

[18] 18.↵
Meng, L., et al. A Deep Learning Prognosis Model Help Alert for COVID-19 Patients at High- Risk of Death: A Multi-center Study. IEEE journal of biomedical and health informatics Pp(2020).

[19] 19.↵
Ning, W., et al. Open resource of clinical data from patients with pneumonia for the prediction of COVID-19 outcomes via deep learning. Nature biomedical engineering (2020).

[20] 20.↵
Fang, M., et al. CT radiomics can help screen the coronavirus disease 2019 (COVID-19): a preliminary study. Science China Information Sciences 63, 1–8 (2020).
OpenUrl

[21] 21.
Wu, Q., et al. Radiomics Analysis of Computed Tomography helps predict poor prognostic outcome in COVID-19. Theranostics 10, 7231 (2020).
OpenUrl CrossRef

[22] 22.
Homayounieh, F., et al. CT Radiomics, Radiologists and Clinical Information in Predicting Outcome of Patients with COVID-19 Pneumonia. Radiology: Cardiothoracic Imaging 2, e200322 (2020).
OpenUrl

[23] 23.↵
Wang, H., et al. Decoding COVID-19 pneumonia: comparison of deep learning and radiomics CT image signatures. European journal of nuclear medicine and molecular imaging, 1–9 (2020).

[24] 24.↵
Abdollahi, H., Shiri, I. & Heydari, M. Medical Imaging Technologists in Radiomics Era: An Alice in Wonderland Problem. Iran J Public Health 48, 184–186 (2019).
OpenUrl

[25] 25.
Amini, M., et al. Multi-level multi-modality (PET and CT) fusion radiomics: prognostic modeling for non-small cell lung carcinoma. Phys Med Biol 66(2021).

[26] 26.
Bouchareb, Y., et al. Artificial intelligence-driven assessment of radiological images for COVID- 19. Comput Biol Med 136, 104665 (2021).

[27] 27.↵
Edalat-Javid, M., et al. Cardiac SPECT radiomic features repeatability and reproducibility: A multi-scanner phantom study. J Nucl Cardiol (2020).

[28] 28.
Khodabakhshi, Z., et al. Overall Survival Prediction in Renal Cell Carcinoma Patients Using Computed Tomography Radiomic and Clinical Information. J Digit Imaging 34, 1086–1098 (2021).
OpenUrl

[29] 29.
Khodabakhshi, Z., et al. Non-small cell lung carcinoma histopathological subtype phenotyping using high-dimensional multinomial multiclass CT radiomics signature. Comput Biol Med 136, 104752 (2021).

[30] 30.
Nazari, M., Shiri, I. & Zaidi, H. Radiomics-based machine learning model to predict risk of death within 5-years in clear cell renal cell carcinoma patients. Comput Biol Med 129, 104135 (2021).

[31] 31.
Shayesteh, S., et al. Treatment response prediction using MRI-based pre-, post-, and delta-radiomic features and machine learning algorithms in colorectal cancer. Med Phys 48, 3691–3701 (2021).
OpenUrl

[32] 32.
Shiri, I., Abdollahi, H., Shaysteh, S. & Mahdavi, S.R. Test-retest reproducibility and robustness analysis of recurrent glioblastoma MRI radiomics texture features. Iranian Journal of Radiology (2017).

[33]
Shiri, I., et al. Machine learning-based prognostic modeling using clinical data and quantitative radiomic features from chest CT images in COVID-19 patients. Comput Biol Med 132, 104304 (2021).

[34]
Amini, M., et al. Overall Survival Prognostic Modelling of Non-small Cell Lung Cancer Patients Using Positron Emission Tomography/Computed Tomography Harmonised Radiomics Features: The Quest for the Optimal Machine Learning Algorithm. Clinical Oncology.

[35]
Mostafaei, S., et al. CT imaging markers to improve radiation toxicity prediction in prostate cancer radiotherapy by stacking regression algorithm. La radiologia medica 125, 87–97 (2020).

[36]
Nazari, M., Shiri, I. & Zaidi, H. Radiomics-based machine learning model to predict risk of death within 5-years in clear cell renal cell carcinoma patients. Comput Biol Med 129, 104135 (2020).

[37]
Akhavanallaf, A., Shiri, I., Arabi, H. & Zaidi, H. Whole-body voxel-based internal dosimetry using deep learning. Eur J Nucl Med Mol Imaging (2020).

[38]
Shiri, I., et al. Diagnosis of COVID-19 Using CT image Radiomics Features: A Comprehensive Machine Learning Study Involving 26,307 Patients. medRxiv (2021).

[39]
Fu, L., Li, Y., Cheng, A., Pang, P. & Shu, Z. A Novel Machine Learning-derived Radiomic Signature of the Whole Lung Differentiates Stable From Progressive COVID-19 Infection: A Retrospective Cohort Study. Journal of thoracic imaging (2020).

[40]
Homayounieh, F., et al. Computed Tomography Radiomics Can Predict Disease Severity and Outcome in Coronavirus Disease 2019 Pneumonia. Journal of computer assisted tomography 44, 640–646 (2020).

[41]
Li, C., et al. Classification of Severe and Critical COVID-19 Using Deep Learning and Radiomics. IEEE journal of biomedical and health informatics PP (2020).

[42]
Cai, Q., et al. A model based on CT radiomic features for predicting RT-PCR becoming negative in coronavirus disease 2019 (COVID-19) patients. BMC medical imaging 20, 118 (2020).

[43]
Yue, H., et al. Machine learning-based CT radiomics method for predicting hospital stay in patients with pneumonia associated with SARS-CoV-2 infection: a multicenter study. Annals of translational medicine 8, 859 (2020).

[44]
Bae, J., et al. Predicting Mechanical Ventilation Requirement and Mortality in COVID-19 using Radiomics and Deep Learning on Chest Radiographs: A Multi-Institutional Study. ArXiv (2020).

[45]
Tizhoosh, H.R. & Fratesi, J. COVID-19, AI enthusiasts, and toy datasets: radiology without radiologists. Eur Radiol 31, 3553–3554 (2021).

[46]
Summers, R.M. Artificial Intelligence of COVID-19 Imaging: A Hammer in Search of a Nail. Radiology 298, E162–E164 (2021).

[47]
Lambin, P., et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol 14, 749–762 (2017).

[48]
Mongan, J., Moy, L. & Kahn, C.E., Jr. Checklist for Artificial Intelligence in Medical Imaging (CLAIM): A Guide for Authors and Reviewers. Radiology: Artificial Intelligence 2, e200029 (2020).

[49]
Prokop, M., et al. CO-RADS: A Categorical CT Assessment Scheme for Patients Suspected of Having COVID-19-Definition and Evaluation. Radiology 296, E97–E104 (2020).

[50]
Ning, W., et al. Open resource of clinical data from patients with pneumonia for the prediction of COVID-19 outcomes via deep learning. Nat Biomed Eng 4, 1197–1207 (2020).

[51]
Radpour, A., et al. COVID-19 evaluation by low-dose high resolution CT scans protocol. Academic radiology 27, 901 (2020).

[52]
Shiri, I., et al. COLI-NET: Fully Automated COVID-19 Lung and Infection Pneumonia Lesion Detection and Segmentation from Chest CT Images. medRxiv, 2021.04.08.21255163 (2021).

[53]
Zwanenburg, A., et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology 295, 328–338 (2020).

[54]
van Griethuysen, J.J.M., et al. Computational Radiomics System to Decode the Radiographic Phenotype. Cancer research 77, e104–e107 (2017).

[55]
Johnson, W.E., Li, C. & Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127 (2006).

[56]
Robin, X., et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC bioinformatics 12, 1–8 (2011).

[57]
Da-Ano, R., Visvikis, D. & Hatt, M. Harmonization strategies for multicenter radiomics investigations. Phys Med Biol 65, 24TR02 (2020).

[58]
Zhang, K., et al. Clinically Applicable AI System for Accurate Diagnosis, Quantitative Measurements, and Prognosis of COVID-19 Pneumonia Using Computed Tomography. Cell 181, 1423–1433.e11 (2020).

[59]
Chao, H., et al. Integrative analysis for COVID-19 patient outcome prediction. Medical image analysis 67, 101844 (2020).

[60]
Qiu, J., et al. A Radiomics Signature to Quantitatively Analyze COVID-19-Infected Pulmonary Lesions. Interdisciplinary sciences, computational life sciences, 1–12 (2021).

[61]
Tang, Z., et al. Severity assessment of COVID-19 using CT image features and laboratory indices. Physics in medicine and biology (2020).

[62]
Wu, Q., et al. Radiomics Analysis of Computed Tomography helps predict poor prognostic outcome in COVID-19. Theranostics 10, 7231–7244 (2020).

[63]
Homayounieh, F., et al. CT Radiomics, Radiologists, and Clinical Information in Predicting Outcome of Patients with COVID-19 Pneumonia. Radiology: Cardiothoracic Imaging 2, e200322 (2020).

[64]
Shiri, I., et al. Machine Learning-based Prognostic Modeling using Clinical Data and Quantitative Radiomic Features from Chest CT Images in COVID-19 Patients. Computers in Biology and Medicine, 104304 (2021).

[65]
Lassau, N., et al. Integrating deep learning CT-scan model, biological and clinical variables to predict severity of COVID-19 patients. Nature communications 12, 1–11 (2021).

[66]
Feng, Z., et al. Early prediction of disease progression in COVID-19 pneumonia patients with chest CT and clinical characteristics. Nature communications 11, 4968 (2020).

[67]
Xu, Q., et al. CT-based Rapid Triage of COVID-19 Patients: Risk Prediction and Progression Estimation of ICU Admission, Mechanical Ventilation, and Death of Hospitalized Patients. medRxiv: the preprint server for health sciences (2020).

[68]
Chassagnon, G., et al. AI-driven quantification, staging and outcome prediction of COVID-19 pneumonia. Medical image analysis 67, 101860 (2020).

[69]
Chassagnon, G., et al. AI-driven quantification, staging and outcome prediction of COVID-19 pneumonia. Medical image analysis 67, 101860 (2021).
