Abstract
Multiple attempts to detect intracranial hemorrhage (ICH) with deep-learning techniques have been made, but they have been plagued by clinical failures. Most ICH detection studies have relied on insufficient data or weak annotations. We sought to determine whether a deep-learning algorithm for ICH detection trained on a strongly annotated dataset outperforms one trained on a weakly annotated dataset, and whether a weighted ensemble model that integrates separate models trained on datasets with different ICH subtypes is more accurate. We used publicly available brain CT scans from the Radiological Society of North America (27,861 CT scans, 3,528 ICHs) and AI-Hub (53,045 CT scans, 7,013 ICHs) as training datasets. For external testing, we used 600 CT scans (327 with ICH) from Dongguk University Medical Center and 386 CT scans (160 with ICH) from Qure.ai. DenseNet121, InceptionResNetV2, MobileNetV2, and VGG19 were each trained on strongly and weakly annotated datasets and compared. We then developed a weighted ensemble model combining separate models trained on all ICH, subdural hemorrhage (SDH), subarachnoid hemorrhage (SAH), and small-lesion ICH cases. The final weighted ensemble model was compared with four well-known deep-learning models. Six neurologists reviewed difficult ICH cases after external testing. InceptionResNetV2, MobileNetV2, and VGG19 performed better when trained on strongly annotated datasets than when trained on weakly annotated datasets. A weighted ensemble model combining models trained on SDH, SAH, and small-lesion ICH had a higher area under the receiver operating characteristic curve (AUC) than a model trained only on all ICH cases. The ensemble model also outperformed four well-known deep-learning models in terms of sensitivity, specificity, and AUC. Strongly annotated data are superior to weakly annotated data for training deep-learning algorithms. Since no single model can capture all aspects of a complex task well, we developed a weighted ensemble model for ICH detection after training with large-scale strongly annotated CT scans. We also showed that a better understanding and management of cases that are challenging for both AI and humans is required to facilitate clinical use of ICH detection algorithms.
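The weighted ensemble idea described above can be illustrated with a minimal sketch. The abstract does not state how the weights were chosen or how the sub-model outputs were combined, so the normalized weighted average below, the function names `weighted_ensemble` and `sensitivity_specificity`, the 0.5 decision threshold, and all numbers in the usage example are assumptions made for illustration only, not the authors' implementation.

```python
def weighted_ensemble(probs_per_model, weights):
    """Combine per-model ICH probabilities with a normalized weighted average.

    probs_per_model: one list of per-scan probabilities for each sub-model
                     (hypothetically: all-ICH, SDH, SAH, small-lesion).
    weights:         one non-negative weight per sub-model (assumed values).
    """
    if not weights or len(probs_per_model) != len(weights):
        raise ValueError("need exactly one weight per model")
    if any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("weights must be non-negative and not all zero")
    n_scans = len(probs_per_model[0])
    if any(len(p) != n_scans for p in probs_per_model):
        raise ValueError("every model must score the same scans")
    total = sum(weights)
    return [
        sum(w * p[i] for w, p in zip(weights, probs_per_model)) / total
        for i in range(n_scans)
    ]


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity) for binary labels at a threshold."""
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label and predicted_positive:
            tp += 1
        elif label:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Made-up probabilities for four scans from four hypothetical sub-models.
    probs = [
        [0.90, 0.20, 0.45, 0.10],  # all-ICH model
        [0.80, 0.10, 0.70, 0.05],  # SDH model
        [0.60, 0.15, 0.30, 0.20],  # SAH model
        [0.70, 0.05, 0.65, 0.15],  # small-lesion model
    ]
    weights = [0.4, 0.2, 0.2, 0.2]  # assumed, not taken from the study
    labels = [1, 0, 1, 0]           # 1 = ICH present
    ensemble_scores = weighted_ensemble(probs, weights)
    print(ensemble_scores)
    print(sensitivity_specificity(labels, ensemble_scores))
```

In this sketch a subtype-specific model can pull a borderline scan above the threshold even when the all-ICH model alone would miss it, which is one plausible reading of why combining subtype models might raise AUC.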
Key Points
Question: Can a weighted ensemble method and strongly annotated training datasets yield a deep-learning model that detects intracranial hemorrhage (ICH) with high accuracy?
Findings: A deep-learning algorithm for detecting ICH trained with a strongly annotated dataset outperformed models trained with a weakly annotated dataset. A weighted ensemble model that combined separate models trained on only SDH, SAH, and small-lesion ICH had a higher AUC than a model trained only on all ICH cases.
Meaning: This study suggests that, to enhance the performance of deep-learning models, researchers should consider the distinct imaging characteristics of each hemorrhage subtype and use strongly annotated training datasets.
Competing Interest Statement
Wi-Sun Ryu and Gi-Hun Park are employed by JLK Inc. Dong-Eog Kim is a stockholder of JLK Inc.
Funding Statement
This research was supported by the National Research Foundation of Korea and funded by the Ministry of Science and ICT (Grant NRF‐2020M3E5D9079768).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study was approved by the institutional review boards of DUMC and JLK Inc. (No. DUIH 2018-03-018 and 20220407-01).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data produced in the present study are available upon reasonable request to the authors.