PT - JOURNAL ARTICLE
AU - Pham, Hieu H.
AU - Nguyen, Ha Q.
AU - Lam, Khanh
AU - Le, Linh T.
AU - Nguyen, Dung B.
AU - Nguyen, Hieu T.
AU - Le, Tung T.
AU - Nguyen, Thang V.
AU - Dao, Minh
AU - Vu, Van
TI - An Accurate and Explainable Deep Learning System Improves Interobserver Agreement in the Interpretation of Chest Radiograph
AID - 10.1101/2021.09.28.21264286
DP - 2021 Jan 01
TA - medRxiv
PG - 2021.09.28.21264286
4099 - http://medrxiv.org/content/early/2021/09/30/2021.09.28.21264286.short
4100 - http://medrxiv.org/content/early/2021/09/30/2021.09.28.21264286.full
AB - Interpretation of chest radiographs (CXR) is a difficult but essential task for detecting thoracic abnormalities. Recent artificial intelligence (AI) algorithms have achieved radiologist-level performance on various medical classification tasks. However, only a few studies addressed the localization of abnormal findings from CXR scans, which is essential in explaining the image-level classification to radiologists. Additionally, the actual impact of AI algorithms on the diagnostic performance of radiologists in clinical practice remains relatively unclear. To bridge these gaps, we developed an explainable deep learning system called VinDr-CXR that can classify a CXR scan into multiple thoracic diseases and, at the same time, localize most types of critical findings on the image. VinDr-CXR was trained on 51,485 CXR scans with radiologist-provided bounding box annotations. It demonstrated a comparable performance to experienced radiologists in classifying 6 common thoracic diseases on a retrospective validation set of 3,000 CXR scans, with a mean area under the receiver operating characteristic curve (AUROC) of 0.967 (95% confidence interval [CI]: 0.958–0.975).
The sensitivity, specificity, F1-score, false-positive rate (FPR), and false-negative rate (FNR) of the system at the optimal cutoff value were 0.933 (0.898–0.964), 0.900 (0.887–0.911), 0.631 (0.589–0.672), 0.101 (0.089–0.114), and 0.067 (0.057–0.102), respectively. For the localization task with 14 types of lesions, our free-response receiver operating characteristic (FROC) analysis showed that VinDr-CXR achieved a sensitivity of 80.2% at a rate of 1.0 false-positive lesion identified per scan. A prospective study was also conducted to measure the clinical impact of VinDr-CXR in assisting six experienced radiologists. The results indicated that the proposed system, when used as a diagnosis-support tool, significantly improved agreement among the radiologists, with an increase of 1.5% in mean Fleiss’ Kappa. We also observed that, after the radiologists consulted VinDr-CXR’s suggestions, the agreement between each of them and the system increased by 3.3% in mean Cohen’s Kappa. Altogether, our results highlight the potential of the proposed deep learning system as an effective assistant to radiologists in clinical practice.
Part of the dataset used for developing the VinDr-CXR system has been made publicly available at https://physionet.org/content/vindr-cxr/1.0.0/. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This research was supported by Vingroup Big Data Institute. Author Declarations: This study was approved by the Institutional Review Boards (IRB) of the Hanoi Medical University Hospital and Hospital 108. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
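The abstract reports agreement as mean Cohen's Kappa (each radiologist versus the system) and mean Fleiss' Kappa (among the six radiologists) but does not say how these were computed. The following is a minimal sketch of the two textbook statistics, not the authors' code. The function names and the toy labels are illustrative assumptions, and no values from the study are reproduced.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i] holds one label per rater for item i."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    categories = sorted({label for item in ratings for label in item})
    # counts[i][j]: number of raters who assigned category j to item i.
    counts = [[item.count(c) for c in categories] for item in ratings]
    # Per-item agreement: proportion of agreeing rater pairs.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_item) / n_items
    # Chance agreement from the pooled category proportions.
    pooled = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    expected = sum(p * p for p in pooled)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Toy labels (made up): 1 = abnormal, 0 = normal.
reader = [1, 1, 0, 0]
system = [1, 0, 1, 0]
panel = [[1, 1, 1], [0, 0, 0]]  # two scans, three readers each
print(cohen_kappa(reader, system), fleiss_kappa(panel))
```

Both statistics have the form (observed - expected) / (1 - expected), so a value of 0 means agreement no better than chance and 1 means perfect agreement. The "mean" values in the abstract would then be averages of such scores, presumably over reader pairs or disease labels, which is an assumption since the abstract does not specify.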