PT - JOURNAL ARTICLE
AU - Xu, Ming
AU - Ouyang, Liu
AU - Gao, Yan
AU - Chen, Yuanfang
AU - Yu, Tingting
AU - Li, Qian
AU - Sun, Kai
AU - Bao, Forrest Sheng
AU - Safarnejad, Lida
AU - Wen, Jing
AU - Jiang, Chao
AU - Chen, Tianyang
AU - Han, Lei
AU - Zhang, Hengdong
AU - Gao, Yue
AU - Yu, Zhengmin
AU - Liu, Xiaowen
AU - Yan, Tianyu
AU - Li, Hebi
AU - Robinson, Patrick
AU - Zhu, Baoli
AU - Liu, Jie
AU - Liu, Yang
AU - Zhang, Zengli
AU - Ge, Yaorong
AU - Chen, Shi
TI - Accurately Differentiating COVID-19, Other Viral Infection, and Healthy Individuals Using Multimodal Features via Late Fusion Learning
AID - 10.1101/2020.08.18.20176776
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.08.18.20176776
4099 - http://medrxiv.org/content/early/2020/08/21/2020.08.18.20176776.short
4100 - http://medrxiv.org/content/early/2020/08/21/2020.08.18.20176776.full
AB - Effectively identifying COVID-19 patients using non-PCR clinical data is critical for optimal clinical outcomes. Currently, there is a lack of comprehensive understanding of the various biomedical features and of appropriate technical approaches for accurately detecting COVID-19 patients. In this study, we recruited 214 confirmed COVID-19 patients of the non-severe (NS) clinical type and 148 of the severe (S) clinical type, 198 non-infected healthy (H) participants, and 129 non-COVID viral pneumonia (V) patients. The participants' clinical information (23 features), lab testing results (10 features), and thoracic CT scans upon admission were acquired as three input feature modalities. To enable late fusion of the multimodal data, we developed a deep learning model to extract a 10-feature high-level representation of the CT scans. Exploratory analyses showed substantial differences in all features among the four classes. Three machine learning models (k-nearest neighbor, kNN; random forest, RF; and support vector machine, SVM) were developed based on the 43 features combined from all three modalities to differentiate the four classes (NS, S, V, and H) at once.
All three models had high accuracy in differentiating the four classes overall (95.4%-97.7%) and each individual class (90.6%-99.9%). Multimodal features provided a substantial performance gain over using any single feature modality. Compared to existing binary classification benchmarks, which often focus on a single feature modality, this study provided a novel and effective breakthrough for clinical applications. The findings and the analytical workflow can be used as clinical decision support for current COVID-19 and for other clinical applications with high-dimensional multimodal biomedical features.

One sentence summary: We trained and validated late fusion deep learning-machine learning models to predict non-severe COVID-19, severe COVID-19, non-COVID viral infection, and healthy classes from clinical, lab testing, and CT scan features extracted by a convolutional neural network, and achieved a predictive accuracy of > 96% in differentiating all four classes at once based on a large dataset of 689 participants.

Competing Interest Statement: The authors have declared no competing interest.

Funding Statement: This study is supported by the North Carolina Biotechnology Center Flash Grant on COVID-19 Clinical Research (2020-FLG-3898), the National Science Foundation for Young Scientists of China (81703201), the Natural Science Foundation for Young Scientists of Jiangsu Province (BK20171076), the Jiangsu Provincial Medical Innovation Team (CXTDA2017029), the Jiangsu Provincial Medical Youth Talent program (QNRC2016548), the Jiangsu Preventive Medicine Association program (Y2018086), the Lifting Program of Jiangsu Provincial Scientific and Technological Association, and the Jiangsu Government Scholarship for Overseas Studies.

Author Declarations:

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below: This study was rigorously evaluated and approved by both IRB committees of Wuhan Union Hospital, Huazhong University of Science and Technology (approval number 2020-IEC-J-345) and Kunshan People Hospital, Jiangsu Provincial Center for Disease Control and Prevention (approval number JSJK2020-8003-01).

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes

All code and de-identified data files are freely available on GitHub.
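
The feature-combination step described in the abstract (23 clinical, 10 lab, and 10 CT-derived features joined into 43 inputs for a four-class classifier) can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic placeholders, the CT features are random numbers rather than the output of a deep learning model, and a hand-written k-nearest-neighbor vote stands in for the kNN, RF, and SVM models. All function names (`fuse`, `knn_predict`, `synthetic_participant`) are hypothetical.

```python
import math
import random
from collections import Counter

CLASSES = ["NS", "S", "V", "H"]        # class labels named in the abstract
N_CLINICAL, N_LAB, N_CT = 23, 10, 10   # feature counts named in the abstract


def fuse(clinical, lab, ct):
    """Combine the three modalities by concatenation: 43 values per participant."""
    return list(clinical) + list(lab) + list(ct)


def knn_predict(train_x, train_y, query, k=5):
    """Plain k-nearest-neighbor majority vote using Euclidean distance."""
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def synthetic_participant(rng, label):
    """ASSUMPTION: placeholder data only; each class gets a shifted Gaussian mean."""
    shift = CLASSES.index(label) * 3.0

    def draw(n):
        return [rng.gauss(shift, 1.0) for _ in range(n)]

    return fuse(draw(N_CLINICAL), draw(N_LAB), draw(N_CT))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train_y = [label for label in CLASSES for _ in range(20)]
train_x = [synthetic_participant(rng, label) for label in train_y]

# Classify one new synthetic participant drawn from the "V" class.
query = synthetic_participant(rng, "V")
predicted = knn_predict(train_x, train_y, query)
print(len(query), predicted)
```

Because the synthetic classes are deliberately well separated, the sketch says nothing about the accuracy reported in the abstract; it only shows the shape of the fused input and the single four-class prediction step.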