RT Journal Article SR Electronic T1 Deep learning-based computer-aided diagnostic models versus other methods for predicting malignancy risk in CT-detected pulmonary nodules JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2023.06.06.23291012 DO 10.1101/2023.06.06.23291012 A1 Wulaningsih, Wahyu A1 Akram, Abdullah A1 Benemile, Janella A1 Kathyrn, Ruth A1 Croce, Filippo A1 Watkins, Johnathan YR 2023 UL http://medrxiv.org/content/early/2023/06/08/2023.06.06.23291012.abstract AB Importance: There has been growing interest in the use of artificial intelligence (deep learning) to help achieve early diagnosis of prevalent diseases. Nowhere is this more true than in lung cancer, where a combination of factors, including the high prevalence of nodules, the low prevalence of malignant nodules, and the indeterminacy of many nodules, makes it fertile ground for the deployment of accurate, high-throughput deep learning (DL)-based tools. Objective: To survey the landscape of externally validated DL-based computer-aided diagnostic (CADx) models, and assess their diagnostic performance for predicting the risk of malignancy in computed tomography (CT)-detected pulmonary nodules. Data sources: An electronic search was performed in the MEDLINE (PubMed), EMBASE, Science Citation Index, and Cochrane Library databases (from inception to 10 April 2023). Study selection: Studies were deemed eligible if they were peer-reviewed experimental or observational articles that analysed the diagnostic performance of externally validated DL-based CADx models for the prediction of malignancy risk, with a direct comparison to models widely used in clinical practice. Data extraction and synthesis: PRISMA guidelines were followed for the identification, screening, and selection process. A bivariate random-effects approach was used for the meta-analysis of the included studies.
Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) was used to assess risk of bias and applicability. Main outcomes and measures: Main outcomes included sensitivity, specificity, and area under the curve (AUC). Results: After screening, 20 studies were included, comprising 7,664 participants and 10,128 nodules, of which 2,126 were malignant. DL-based CADx models were 15.8% more sensitive than physician judgement alone, and 35.4% more sensitive than clinical risk models alone. Their pooled specificity was similar to that of physician judgement alone (0.77 [95% CI: 0.69–0.84] v 0.80 [95% CI: 0.71–0.86], respectively), but they were 5.5% more specific than clinical risk models alone. Accounting for threshold effects, DL-based CADx models had superior summary areas under the receiver operating characteristic curve (sAUROC), with relative sAUROCs of 1.06 (95% CI: 1.03–1.08) and 1.22 (95% CI: 1.19–1.24), as compared to physician judgement and clinical risk models alone, respectively. Conclusions and relevance: DL-based models show superior or comparable diagnostic performance when externally validated against widely used methods, such as the Brock and Mayo models. They have the potential to fulfil an unmet clinical-management need alongside experienced physician image readers. The included studies showed a high degree of heterogeneity, with threshold effects particularly prominent.
Future research may consider more prospective studies and human-experimental studies. Question: How effective are image-based, computer-aided diagnostic models that use deep learning methods to predict the malignancy risk of pulmonary nodules as compared with other methods used in clinical practice? Findings: This systematic review and meta-analysis identified 20 observational studies (7,664 participants; 10,128 pulmonary nodules) from which pooled analyses found deep learning-based models to have a sensitivity of 0.88, specificity of 0.77, and summary area under the curve of 0.90 in predicting malignancy in pulmonary nodules. This was superior or comparable to other methods routinely used in clinical practice. Meaning: Deep learning-based models are already being used in clinical practice in certain settings for nodule management. The results show their diagnostic performance justifies wider and more routine deployment. Competing Interest Statement: JW is an employee of Optellum Ltd; Optellum holds some patents in the area. Funding Statement: This study was funded by Optellum Ltd, Oxford, United Kingdom. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study used only openly available human data that are located on the MEDLINE (PubMed), EMBASE, Science Citation Index, and Cochrane Library databases. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered
with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes. All data produced in the present study are available upon reasonable request to the authors.