PT - JOURNAL ARTICLE
AU - Bridge, Joshua
AU - Meng, Yanda
AU - Zhu, Wenyue
AU - Fitzmaurice, Thomas
AU - McCann, Caroline
AU - Addison, Cliff
AU - Wang, Manhui
AU - Merritt, Cristin
AU - Franks, Stu
AU - Mackey, Maria
AU - Messenger, Steve
AU - Sun, Renrong
AU - Zhao, Yitian
AU - Zheng, Yalin
TI - Development and External Validation of a Mixed-Effects Deep Learning Model to Diagnose COVID-19 from CT Imaging
AID - 10.1101/2022.01.28.22270005
DP - 2022 Jan 01
TA - medRxiv
PG - 2022.01.28.22270005
4099 - http://medrxiv.org/content/early/2022/02/25/2022.01.28.22270005.short
4100 - http://medrxiv.org/content/early/2022/02/25/2022.01.28.22270005.full
AB - Objectives: To develop and externally geographically validate a mixed-effects deep learning model to diagnose COVID-19 from computed tomography (CT) imaging following best practice guidelines, and to assess the strengths and weaknesses of deep learning COVID-19 diagnosis. Design: Model development and external validation with retrospectively collected data from two countries. Setting: Hospitals in Moscow, Russia (data collected between March 1, 2020, and April 25, 2020), and the China Consortium of Chest CT Image Investigation (CC-CCII; data collected between January 25, 2020, and March 27, 2020). Participants: 1,110 and 796 patients with either COVID-19 or healthy CT volumes from Moscow, Russia, and China, respectively. Main outcome measures: We developed a deep learning model with a novel mixed-effects layer to model the relationship between slices in CT imaging. The model was trained on a dataset from hospitals in Moscow, Russia, and externally geographically validated on a dataset from a consortium of Chinese hospitals. Discriminative performance was evaluated using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).
In addition, calibration performance was assessed using calibration curves, and clinical benefit was assessed using decision curve analysis. Finally, the model’s decisions were assessed visually using saliency maps. Results: External validation on the large Chinese dataset showed excellent performance with an AUROC of 0.936 (95% CI: 0.910, 0.961). Using a probability threshold of 0.5, the sensitivity, specificity, NPV, and PPV were 0.753 (0.647, 0.840), 0.909 (0.869, 0.940), 0.711 (0.606, 0.802), and 0.925 (0.888, 0.953), respectively. Conclusions: Deep learning can reduce stress on healthcare systems by automatically screening CT imaging for COVID-19. However, deep learning models must be robustly assessed using various performance measures and externally validated in each setting. In addition, best practice guidelines for developing and reporting predictive models are vital for the safe adoption of such models. Statements: The authors do not own any of the patient data, and ethics approval was not needed. The lead author affirms that this manuscript is an honest, accurate, and transparent account of the study being reported, that no important aspects of the study have been omitted, and that any discrepancies from the study as planned (and, if relevant, registered) have been explained. Patients and the public were not involved in the study. Funding: This study was funded by EPSRC studentship (No.
2110275), EPSRC Impact Acceleration Account (IAA) funding, and Amazon Web Services. What is already known on this topic: Deep learning can diagnose diseases from imaging data automatically. Many studies using deep learning are of poor quality and fail to follow current best practice guidelines for the development and reporting of predictive models. Current methods do not adequately model the relationship between slices in CT volumetric data. What this study adds: A novel method to analyse volumetric imaging data composed of slices, such as CT images, using deep learning. A model developed following current best-practice guidelines for the development and reporting of prediction models. Competing Interest Statement: This study was funded by EPSRC studentship (No. 2110275), EPSRC Impact Acceleration Account (IAA) funding, and Amazon Web Services. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: All data used is publicly available from the following links: https://mosmed.ai/datasets/covid19_1110/ http://ncov-ai.big.ac.cn/download?lang=en I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes.