Abstract
Objectives To develop and externally geographically validate a mixed-effects deep learning model to diagnose COVID-19 from computed tomography (CT) imaging following best practice guidelines and assess the strengths and weaknesses of deep learning COVID-19 diagnosis.
Design Model development and external validation with retrospectively collected data from two countries.
Setting Data from hospitals in Moscow, Russia, collected between March 1, 2020, and April 25, 2020, and from the China Consortium of Chest CT Image Investigation (CC-CCII), collected between January 25, 2020, and March 27, 2020.
Participants 1,110 patients from Moscow, Russia, and 796 patients from China, with CT volumes classed as either COVID-19 or healthy.
Main outcome measures We developed a deep learning model with a novel mixed-effects layer to model the relationship between slices in CT imaging. The model was trained on a dataset from hospitals in Moscow, Russia, and externally geographically validated on a dataset from a consortium of Chinese hospitals. Discriminative performance was evaluated using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). In addition, calibration was assessed using calibration curves, and clinical benefit was assessed using decision curve analysis. Finally, the model's decisions were assessed visually using saliency maps.
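As a rough illustration of the decision curve analysis mentioned above, the sketch below computes net benefit at a few probability thresholds using the standard formula, net benefit = TP/n - (FP/n) * pt/(1 - pt). It is not the study's code: the function names `net_benefit` and `net_benefit_treat_all`, and the toy labels and probabilities, are assumptions made purely for illustration.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted probability >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * (threshold / (1.0 - threshold))


# Hypothetical toy data (1 = COVID-19, 0 = healthy); not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

for pt in (0.25, 0.5, 0.75):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(f"threshold={pt:.2f} model={model_nb:.3f} treat_all={all_nb:.3f}")
```

A decision curve plots these values across a range of thresholds; a model offers clinical benefit at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy (net benefit 0).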
Results External validation on the large Chinese dataset showed excellent performance with an AUROC of 0.936 (95% CI: 0.910, 0.961). Using a probability threshold of 0.5, the sensitivity, specificity, NPV, and PPV were 0.753 (0.647, 0.840), 0.909 (0.869, 0.940), 0.711 (0.606, 0.802), and 0.925 (0.888, 0.953), respectively.
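The threshold-based measures reported above follow from a confusion matrix at the chosen probability threshold. The sketch below shows one way to compute them; the helper name `threshold_metrics` and the toy inputs are hypothetical and do not reproduce the study's data or results.

```python
def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Sensitivity, specificity, PPV and NPV at a given probability threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        # Return NaN rather than raising when a denominator is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical toy data (1 = COVID-19, 0 = healthy); not taken from the study.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
probabilities = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

metrics = threshold_metrics(labels, probabilities, threshold=0.5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity depend only on the model and threshold, whereas PPV and NPV also depend on disease prevalence in the evaluated population, which is one reason to validate in each setting.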
Conclusions Deep learning can reduce stress on healthcare systems by automatically screening CT imaging for COVID-19. However, deep learning models must be robustly assessed using various performance measures and externally validated in each setting. In addition, best practice guidelines for developing and reporting predictive models are vital for the safe adoption of such models.
Statements The authors do not own any of the patient data, and ethics approval was not needed. The lead author affirms that this manuscript is an honest, accurate, and transparent account of the study being reported, that no important aspects of the study have been omitted, and that any discrepancies from the study as planned (and, if relevant, registered) have been explained. Patients and the public were not involved in the study.
Funding This study was funded by EPSRC studentship (No. 2110275), EPSRC Impact Acceleration Account (IAA) funding, and Amazon Web Services.
What is already known on this topic
Deep learning can diagnose diseases from imaging data automatically
Many studies using deep learning are of poor quality and fail to follow current best practice guidelines for the development and reporting of predictive models
Current methods do not adequately model the relationship between slices in CT volumetric data
What this study adds
A novel method to analyse volumetric imaging data composed of slices such as CT images using deep learning
Model developed following current best-practice guidelines for the development and reporting of prediction models
Competing Interest Statement
This study was funded by EPSRC studentship (No. 2110275), EPSRC Impact Acceleration Account (IAA) funding, and Amazon Web Services.
Footnotes
Reported performance at various probability thresholds instead of just one, using the Jeffreys prior. Corrected the data augmentation hyperparameters. Corrected some formatting. Corrected a programming error; we believe all code implementations of previous models now match the implementations described in their papers. Fixed Figure 10, in which the values were shifted over one place.
Data Availability
All data used is publicly available from the following links: https://mosmed.ai/datasets/covid19_1110/ http://ncov-ai.big.ac.cn/download?lang=en