PT - JOURNAL ARTICLE
AU - Belkhatir, Zehor
AU - Estépar, Raúl San José
AU - Tannenbaum, Allen R.
TI - Supervised Image Classification Algorithm Using Representative Spatial Texture Features: Application to COVID-19 Diagnosis Using CT Images
AID - 10.1101/2020.12.03.20243493
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.12.03.20243493
4099 - http://medrxiv.org/content/early/2020/12/04/2020.12.03.20243493.short
4100 - http://medrxiv.org/content/early/2020/12/04/2020.12.03.20243493.full
AB - Although texture has no universal definition, the concept is widely used in various forms and is a key element of visual perception for analyzing images in different fields. The main idea of the present work relies on the assumption that there exist representative samples, which we also call references, i.e., “good or bad” samples that represent a given dataset investigated in a particular data analysis problem. These representative samples need to be accounted for when designing predictive models in order to improve their performance. In particular, based on a selected subset of texture gray-level co-occurrence matrices (GLCMs) from the training cohort, we propose new representative spatial texture features, which we incorporate into a supervised image classification pipeline. The pipeline relies on the support vector machine (SVM) algorithm along with Bayesian optimization and the Wasserstein metric from optimal mass transport (OMT) theory. The best “good and bad” GLCM references are selected for each classification label during the training phase of the SVM classifier using a Bayesian optimizer. We assume that sample fitness is defined by closeness (in the sense of the Wasserstein metric) and high correlation (in the sense of Spearman’s rank) with other samples in the same class.
Moreover, the newly defined spatial texture features consist of the Wasserstein distances between the optimally selected references and the remaining samples. We assessed the performance of the proposed classification pipeline in diagnosing coronavirus disease 2019 (COVID-19) from computed tomography (CT) images.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: AFOSR grants (FA9550-17-1-0435, FA9550-20-1-0029), NIH grant (R01-AG048769), MSK Cancer Center Support Grant/Core Grant (P30 CA008748), and a grant from Breast Cancer Research Foundation (grant BCRF-17-193).
Author Declarations:
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Not needed as the data is public
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes
Data availability: The data is publicly available
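The texture features described in the abstract (Wasserstein distances between each sample's GLCM and a set of selected reference GLCMs) can be illustrated with a minimal sketch. This is not the authors' implementation. Several parts are assumptions made for illustration: the helper names `glcm`, `glcm_distance`, and `reference_features` are hypothetical; the distance is simplified to the 1-D Wasserstein-1 distance between gray-level marginals of the GLCMs rather than a full 2-D optimal mass transport problem; and the reference indices are picked by hand, whereas the abstract selects them with a Bayesian optimizer during SVM training.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def glcm(image, levels, offset=(0, 1)):
    """Normalized, symmetric gray-level co-occurrence matrix.

    `image` holds integer gray levels in [0, levels). `offset` is a
    (row, column) displacement; only non-negative offsets are handled here.
    """
    dr, dc = offset
    rows, cols = image.shape
    # Pair every pixel with its neighbor at the given offset.
    src = image[: rows - dr, : cols - dc]
    dst = image[dr:, dc:]
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (src.ravel(), dst.ravel()), 1.0)
    counts += counts.T  # count each pair in both directions
    return counts / counts.sum()


def glcm_distance(p, q):
    """Assumed simplification: 1-D Wasserstein-1 distance between the
    gray-level marginals of two GLCMs (not the full 2-D transport problem)."""
    gray_levels = np.arange(p.shape[0])
    return wasserstein_distance(
        gray_levels, gray_levels,
        u_weights=p.sum(axis=1), v_weights=q.sum(axis=1),
    )


def reference_features(glcms, reference_idx):
    """One feature per reference: distance from each sample to that reference."""
    refs = [glcms[i] for i in reference_idx]
    return np.array([[glcm_distance(g, r) for r in refs] for g in glcms])


# Toy usage on random 8-level images; the reference indices are hypothetical.
rng = np.random.default_rng(0)
images = [rng.integers(0, 8, size=(16, 16)) for _ in range(5)]
glcms = [glcm(img, levels=8) for img in images]
features = reference_features(glcms, reference_idx=[0, 3])
print(features.shape)
```

The resulting feature matrix has one row per sample and one column per reference, so it could be passed to any standard SVM implementation. Replacing `glcm_distance` with an exact 2-D transport solver would bring the sketch closer to the OMT formulation named in the abstract.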