Abstract
Background and Purpose Few clinically applicable optimizations exist for reducing missed diagnoses, misdiagnoses, and inter-reader variability in pulmonary nodule diagnosis. We aimed to propose a deep learning-based algorithm and a practical strategy to better stratify the risk of pulmonary nodules, thus reducing medical errors and optimizing the clinical workflow.
Materials and Methods A total of 2,348 pulmonary nodules (1,215 with lung cancer), comprising screening-detected nodules from the National Lung Screening Trial (NLST) and incidentally detected nodules from Jinling Hospital (JLH), were used to train and evaluate a deep learning algorithm, the Filter-guided pyramid network (FGP-NET). Internal and external testing of FGP-NET was performed on two independent datasets (n=542). The performance of FGP-NET at the Youden point, which maximizes the Youden index, was compared with that of 126 board-certified radiologists. We further proposed the Hierarchical Ordered Network ORiented Strategy (HONORS), which shifts the emphasis toward either sensitivity or specificity to target risk-stratified clinical scenarios, directly making decisions for some patients.
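The Youden point mentioned above is the decision threshold that maximizes the Youden index, J = sensitivity + specificity - 1. A minimal sketch of how such a threshold could be located is given below; the function name `youden_point` and the toy labels and scores are illustrative assumptions, not taken from the study.

```python
def youden_point(labels, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    labels: 1 = malignant, 0 = benign (hypothetical encoding).
    scores: predicted malignancy probabilities.
    A nodule is called positive when score >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best_threshold, best_j = None, -1.0
    # Every distinct score is a candidate threshold.
    for threshold in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
        j = tp / positives + tn / negatives - 1.0
        # Strict comparison keeps the lowest threshold when several tie.
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Toy data, invented purely for illustration.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.05, 0.10, 0.30, 0.35, 0.40, 0.70, 0.80, 0.95]
threshold, j = youden_point(labels, scores)
print(threshold, j)
```

On this toy data two thresholds tie at the maximal J, and the sketch keeps the lower one, which favors sensitivity.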
Results FGP-NET achieved a high area under the curve (AUC) of 0.969 and 0.855 for internal and external testing, respectively, and was comparable to or even outperformed the radiologists in terms of sensitivity. HONORS-guided FGP-NET identified benign nodules with a high sensitivity (95.5%) in the screening scenario and demonstrated satisfactory performance for the remaining ambiguous nodules, with an AUC of 0.945 at the Youden point. FGP-NET also detected lung cancer with a high specificity of 94.5% in the routine diagnostic scenario; an AUC of 0.809 was achieved for the remaining nodules.
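One way to read the two-stage behavior described here is as a scenario-dependent triage rule: a first cutoff tuned for high sensitivity (screening) or high specificity (diagnosis) settles the confident cases directly, and the remaining ambiguous nodules are judged at a Youden-style cutoff. The sketch below illustrates that reading only. The function name `stratify`, the label strings, and all cutoff values are hypothetical, and the actual HONORS procedure may differ.

```python
def stratify(score, scenario, confident_cut, youden_cut):
    """Two-stage triage of a single nodule score (illustrative only).

    scenario "screening": confident_cut is a low, high-sensitivity cutoff;
        scores below it are called benign directly.
    scenario "diagnostic": confident_cut is a high, high-specificity cutoff;
        scores at or above it are called malignant directly.
    All other nodules are treated as ambiguous and decided at youden_cut.
    """
    if scenario == "screening":
        if score < confident_cut:
            return "benign (direct)"
    elif scenario == "diagnostic":
        if score >= confident_cut:
            return "malignant (direct)"
    else:
        raise ValueError("scenario must be 'screening' or 'diagnostic'")

    # Second stage: ambiguous nodules are decided at the Youden-style cutoff.
    if score >= youden_cut:
        return "malignant (second stage)"
    return "benign (second stage)"


# Hypothetical cutoffs, not values reported in the study.
print(stratify(0.02, "screening", confident_cut=0.05, youden_cut=0.50))
print(stratify(0.60, "screening", confident_cut=0.05, youden_cut=0.50))
print(stratify(0.97, "diagnostic", confident_cut=0.90, youden_cut=0.40))
print(stratify(0.20, "diagnostic", confident_cut=0.90, youden_cut=0.40))
```

In practice both cutoffs would be estimated on validation data rather than fixed by hand, for example by choosing the first-stage cutoff to reach a target sensitivity or specificity.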
Conclusion The combination of HONORS and FGP-NET provides well-organized stratification for pulmonary nodules and also offers the potential for reducing medical errors.
Highlights
Pulmonary nodules were managed for both screening and diagnostic scenarios
Proposal of a hierarchical strategy for targeting risk-stratified clinical scenarios
A large-scale human versus deep learning contest for reliable performance evaluation
Competing Interest Statement
Xiuli Li and Haoliang Lin are employees of Deepwise Inc. The other authors have no conflicts of interest to declare.
Funding Statement
This work was supported by the National Key Research and Development Program of China (2017YFC0113400).
Author Declarations
All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript.
Yes
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
We used one publicly accessible dataset, NLST (https://biometry.nci.nih.gov/cdas/learn/nlst/images/). Retrospective data from clinical hospitals used in this study cannot be released under the terms of our Institutional Review Board approval, in order to protect patient confidentiality.