Abstract
Ischemic stroke, a leading global cause of death and disability, can be caused by atherosclerosis of the carotid arteries. Carotid calcifications are classically detected by ultrasound screening. In recent years it was shown that these calcifications can also be inferred from routine panoramic dental radiographs. In this work, we used panoramic dental radiographs taken from 500 patients and manually labelled each side separately (each radiograph was treated as two sides). These labelled images were used to develop an artificial intelligence (AI)-based algorithm that automatically detects carotid calcifications. The algorithm uses deep learning convolutional neural networks (CNN) with a transfer learning (TL) approach, followed by the eXtreme Gradient Boosting (XGBoost) algorithm, to predict a label for each corner. It reaches a sensitivity (recall) of 0.82 and a specificity of 0.93 for an individual artery, and a recall of 0.88 and a specificity of 0.86 for an individual patient. Integrating the algorithm into healthcare units and dental clinics has the potential to reduce stroke events and their associated mortality and morbidity.
Author summary
Stroke is a leading global cause of death and disability. One major cause of stroke is carotid artery calcification (CAC). Traditional approaches for CAC detection are Doppler ultrasound screening and angiography computerized tomography (CT), medical procedures that are costly, time consuming and uncomfortable for the patient. Of note, angiography CT involves the injection of contrast material and exposure to ionizing X-ray radiation. In recent years researchers have shown that CAC can also be detected in routine panoramic dental radiographs, a non-invasive, cheap and easily accessible procedure. This study goes one step further by developing artificial intelligence (AI)-based algorithms trained to detect such calcifications in panoramic dental radiographs. The models are based on deep learning convolutional neural networks, transfer learning, and the XGBoost algorithm, and enable accurate automated detection of carotid calcifications, with a recall of 0.82 and a specificity of 0.93. Statistical approaches for assessing predictions per individual (i.e., predicting the risk of calcification in at least one artery) were also developed, showing a recall of 0.88 and a specificity of 0.86. Integrating this approach into healthcare units may significantly contribute to identifying at-risk patients.
Introduction
Stroke is the third leading cause of death and the leading cause of disability in the Western world. Ischemic stroke is caused by carotid artery atherosclerosis, small intracranial vessel disease, or emboli from the heart and aorta [1,2]. The lifetime risk of stroke in adult men and women (age 25 and older) is about 25 percent [3]. Of all strokes, 87% are ischemic and 10% are caused by intracerebral hemorrhage [2]. Several studies showed that patients aged 60-96 with carotid artery calcification (CAC) found on panoramic radiographs are 2.4-fold more likely to suffer from vascular events, including stroke and/or ischemic heart disease [4,5].
Standard tests for detecting CAC are Doppler ultrasound (US) and angiography computerized tomography (CT). However, there is evidence that calcification can be detected in panoramic dental X-rays (dental radiographs). These X-rays are routinely performed in daily practice by dentists and oral and maxillofacial surgeons [6]. A panoramic radiograph is a two-dimensional interpretation of tomographic images of curved anatomic structures. Panoramic radiography images serve as a diagnostic tool, and the image encompasses the teeth, the maxillary and mandibular bones, the temporomandibular joints, and the maxillary sinus. Nonetheless, most dental professionals, dentists as well as specialists, are not trained to detect and diagnose CAC in panoramic X-rays. Several studies focused on evaluating the ability of panoramic radiographs to detect CAC [5]. Recent meta-analyses of these studies revealed that the level of agreement between panoramic radiography and the above standard methods is 50% [7]. However, even with this limitation, panoramic radiography is more prevalent by far than US or coronary angiography (CAG). CAG is more available, but its use as a diagnostic tool is mostly overlooked. Therefore, panoramic radiography may play an important role in the screening and detection of non-symptomatic CAC patients in the population.
Deep neural networks are a branch of machine learning (ML) and artificial intelligence (AI). These networks were developed to tackle complex challenges such as speech recognition, natural language processing, image classification and object recognition [8]. Deep learning algorithms use a multilayer artificial neural network (NN) architecture. A major class of deep learning algorithms is the convolutional neural network (CNN), which is widely used for image classification [9].
Major challenges in the development of an efficient CNN classifier are the requirement for large numbers of training samples (usually >1,000 for each class) and the long and comprehensive process of training a model. To cope with these challenges, the transfer learning (TL) approach was developed. In this approach, the CNN is not trained from scratch but uses an existing pre-trained model as a starting point [10,11]. The pre-trained model was previously trained on a different task using large amounts of data.
CNN and TL have been widely used in the prediction of medical conditions using different techniques (CT, MRI, panoramic images) - for example: identification of prostate cancer [12]; prediction of bladder cancer treatment response in CT [13]; detection of maxillary sinusitis on panoramic radiographs [14]; screening for osteoporosis in dental panoramic radiographs [15]; cardiac cine segmentation [16] and even COVID-19 detection from chest CT-scans [17].
In this study we aimed to develop and evaluate an image classifier for screening carotid artery calcification (CAC) in standard dental panoramic radiograph (DPR) images.
We trained and tested a convolutional neural network (CNN) followed by the eXtreme Gradient Boosting (XGBoost) algorithm for CAC detection in a single carotid (one side of the image), and then calculated the performance for full panoramic radiograph images.
Results
A total of 500 patients were included in the cohort. The average age was 67.5 ± 13.3 years; 56% were male and 44% were female. Of the patients, 19.7% were smokers, 40% had diabetes, 63.7% had hypertension and 41.6% had a history of cardiac infarction. Table 1 presents the prediction performance in the blind tests and shows that the classifier found 82% of the CAC images (recall). The prevalence of real CACs among the predicted positives (precision) was 71%.
The Recall-Precision (RP) curve for all the cross-validation folds is presented in Figure 1. An RP curve is more informative than the usual ROC curve when the test set is imbalanced and the performance on the minority class (i.e., the CAC class) is more important. The curves show the trade-off between recall and precision for each of the seven folds. Fold 3 deviated slightly from the rest of the folds; thus, the actual performance may be somewhat better than reported above. In addition, these curves show that a higher recall of 90% can be achieved at the cost of decreasing the precision to 60%.
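The recall-precision trade-off described above comes from sweeping the decision threshold over the classifier's scores. The following is a minimal sketch of that sweep in plain Python; the labels and scores are synthetic values chosen for illustration only and are not taken from the study, and the function name `precision_recall_points` is hypothetical.

```python
def precision_recall_points(labels, scores):
    """Return (threshold, recall, precision) for each distinct score threshold."""
    positives = sum(labels)
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # A sample is predicted positive when its score reaches the threshold.
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p)
        fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p)
        points.append((threshold, tp / positives, tp / (tp + fp)))
    return points

# Synthetic example: 1 = CAC, 0 = clean (illustrative values, not study data).
labels = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.2, 0.1]
points = precision_recall_points(labels, scores)
```

Lowering the threshold moves along the curve toward higher recall and, typically, lower precision, which is the kind of trade-off read off Figure 1.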
In addition, we evaluated the classifier by determining the specific image area that was important for class prediction. We employed the gradient-weighted class activation mapping technique (Grad-CAM) [24] to present the most significant regions for screening CAC, in order to verify that the classifier indeed concentrated on the calcification areas when it predicted CAC. Figure 2 presents Grad-CAM maps that highlight the important regions in the image for predicting both “CAC” and “clean”, thus providing “visual explanations” for the predictions. These maps show that the classifier indeed focused on the calcification areas of the CAC images: the region of the calcification signs is the most significant for the CAC prediction. Other Grad-CAM images can be found in supplementary Figures S3 and S4.
We used p1 (recall: the probability of predicting the actual calcified corners out of the true calcified corners) and p2 (specificity: the probability of predicting “clean” out of the true clean corners), which were calculated per corner (0.82 and 0.93, respectively) as in Table 1, in order to calculate the performance per patient.
Assuming that one third of the CAC patients are MM and the other two thirds are CM [25], and that the ratio of CAC vs. clean is 1:5, the probabilities of predicting X or C on the three types of patients, with the actual p1 and p2 from the performance on a single side, are presented in Table 2. For a dataset of 144 patients consisting of 8 MM, 16 CM and 120 CC patients, we would get the confusion matrix presented in Table 2. We can therefore use this table to calculate the performance per patient (see the methods section).
For this dataset of 144 patients (8 MM, 16 CM and 120 CC), the resulting calculated performance per patient is presented in Table 3: a recall (sensitivity) of 0.88, a precision of 0.57 and a specificity of 0.86.
The increase in the recall is due to the higher probability of correctly predicting “CAC” for the MM patients compared to the prediction for an individual corner. The decrease in the precision and specificity is due to the higher probability of mistakenly predicting “CAC” in the CC patients compared to the prediction for a single corner.
Discussion
Prediction of stroke is still one of the major challenges in western medicine. Atherosclerosis of the carotid arteries is an important etiology for ischemic stroke. The main risk factors for atherosclerosis are hypertension, diabetes, hyperlipidemia, high cholesterol levels, smoking and obesity, all of which cause endothelial cell dysfunction. Atherosclerosis tends to calcify over the years. Therefore, carotid artery calcification is a manifestation of advanced atherosclerosis in the carotid arteries as well as a marker for atherosclerosis in other blood vessels, including coronary artery disease and peripheral vascular disease in the lower extremities. Early diagnosis of carotid artery calcification (atherosclerosis) could help prevent stroke by enabling the diagnosis, monitoring and treatment of carotid artery stenoses, as well as the detection and treatment of risk factors.
Carotid calcifications can be detected by carotid ultrasound screening, but this is not a routine procedure. It is usually recommended only when a murmur is detected on auscultation, upon evidence of lower limb peripheral vascular disease, or in the presence of medical conditions that increase the risk of stroke. Periodic ultrasound screenings of the carotid arteries could detect carotid artery atherosclerosis and calcification before the appearance of clinical manifestations; however, such a policy involves a huge financial burden and is impractical. CT angiography is another test that detects atherosclerosis and calcification in the carotid arteries. It involves the injection of contrast material and exposure to ionizing X-ray radiation, in addition to significant financial expenses, which make this test inadequate for screening purposes. On the other hand, panoramic dental X-rays may provide important information on carotid artery calcification [26-29]. They are performed routinely, and the information on possible CAC can be retrieved without an additional clinical test or procedure.
In this work we developed an AI-based algorithm that can efficiently diagnose calcified atherosclerosis in the carotid arteries using routine panoramic dental X-ray images. Such a diagnosis, once available, should direct the treating physician to refer the patient for further evaluation and treatment of carotid artery narrowing and of risk factors for atherosclerosis in various blood vessels, including those causing coronary artery disease and peripheral vascular disease of the lower limbs.
The first challenge in this study is the absence of a typical, constant shape of the calcification signs: the general characteristics of CAC are shared by a wide range of shapes and orientations. Additionally, this region in the panoramic images contains background noise and other organs and bones, including the hyoid bone and various shapes of the cervical spine. Together with the relatively small number of samples, this complicates convergence and generalization. One approach we used to cope with this challenge was TL, which was successfully implemented in previous medical studies, including AI-based studies that analyzed panoramic radiographs [14,15]. We believe that a higher number of images would result in better performance.
Computer-aided screening of calcification in radiological images is not specific to the current topic. It can be adopted in other procedures, such as detecting coronary calcification in intravascular optical coherence tomography (OCT) and detecting calcifications in breast mammograms. Several studies that aimed to computationally screen calcifications using AI and CNN approaches have been published: Li et al. used a CNN to automate the segmentation and quantification of coronary calcification in intravascular OCT images, reaching an F1 score of 0.96 [30]; Fuhrman et al. developed an algorithm based on both a CNN and a support vector machine (SVM) to classify coronary artery calcifications in low-dose thoracic CT [31]. Other studies used a variety of deep neural network approaches based on CT images to predict different pathologies, such as transcatheter aortic valve replacement [32], chemotherapy response in breast cancer [33], quantitative assessment of liver trauma [34], and even the evaluation of complications associated with metastatic spine tumor surgery [35].
Panoramic radiographs are a routine part of oral and maxillofacial examinations. The high number of panoramic X-rays performed routinely in dental clinics can provide an important and efficient tool for the early detection of calcifications. Nevertheless, the inadequate training and awareness of dental personnel in detecting pathologies of the neck region, especially carotid artery calcifications, results in vast amounts of available information being ignored, although this information has a high potential for the diagnosis, prevention and monitoring of atherosclerotic changes in the carotids. We believe that the current study lays the foundation for a valuable clinical tool that helps health professionals refer patients to an appropriate specialist. This novel clinical tool may be used widely in healthcare organizations, both dental and medical.
The present study has several limitations. The manual labeling (by a physician) is challenging: CACs can be confused with other soft tissue calcifications in the same radiologic region, such as triticeous cartilage calcification. However, it is not possible to make a conclusive diagnosis without Doppler ultrasonography, which is used as the gold standard for the diagnosis of atherosclerosis [36]. Because of the retrospective design of the present study, Doppler ultrasonographic screening could not be used as a reference. In addition to the aforementioned small population, a possible limitation may be in the representation of the overall population, mostly due to the relatively high proportion of the elderly (who are in any case more susceptible to strokes), or the fact that the method of diagnosing carotid artery calcification relies on expert diagnosis and not on other laboratory examinations.
We intend to conduct further research that will compare machine diagnosis to carotid ultrasound and angio-CT. In addition, the results of the present study need to be confirmed in larger series. The prevalence of real CACs among the predicted positives is only 68%; we anticipate that a larger sample would improve this parameter. However, this study has significant strengths. First, the AI performance is good, most notably the high recall (0.82) and specificity (0.93). Second, the algorithm's performance can be assessed per patient rather than only per corner. Above all, the study has potential implications for clinics and healthcare organizations, offering a non-invasive, efficient and applicable route to the early detection of carotid calcification, both at the patient level and throughout healthcare systems.
In summary, this study shows the potential and feasibility of applying deep learning-based methods in an actual “real-world” application of automatic screening for CAC in standard panoramic dental X-rays. Applying this approach may significantly contribute to quality of life of populations and save many lives.
Materials and methods
Study population and ethical approval
This study was approved by the Poriya Medical Center Institutional Review Board (approval # POR-0008-21) and was performed in accordance with the Declaration of Helsinki, seventh revision (2013).
In total, the study included 500 patients who visited the oral and maxillofacial department at the Poriya Medical Center between 2016 and 2021 and met the criteria listed below. This retrospective research was based on data from the department’s archive, and all cases were anonymized. Informed consent was not required by the ethical committee.
Inclusion criteria were as follows: (a) patients who were 40 years old and older, and (b) had a panoramic radiograph encompassing both jaws (upper and lower), the hyoid bone, and the fourth upper cervical spine vertebrae. Exclusion criteria were as follows: (a) low quality panoramic radiographs with trimmed corners and/or blurred and spread vertebrae; (b) treatment with coumadin (warfarin); (c) diagnosis of hypomagnesemia; and (d) diagnosis of hypercalcemia due to malignancy.
The following parameters were elicited anonymously from the patients’ medical files: age, sex, smoking, alcohol and drug abuse, weight, height, physical activities and systemic medical history.
Panoramic radiographs
All the panoramic radiographs were performed on a Planmeca ProMax® 3D (Planmeca Oy, Finland). The clinical files and panoramic radiographs were anonymized.
Image labeling
The panoramic images were labeled by several physicians, including certified oral and maxillofacial physicians, using the location, texture, and morphologic features of stained areas in the images, as defined and described in previous works [18-20]. The images were labeled into two classes:
(a) carotid calcification (CAC) - non-homogeneous irregular calcifications located adjacent to the C3-C4 intervertebral space. These characteristics differentiate CAC from other calcifications such as triticeous cartilage calcification (see supplementary Figures S1 and S2, which depict various examples of CACs and triticeous cartilage calcification); and
(b) no carotid calcification (including “clean” images with no calcification and images with calcification from non-carotid sources such as triticeous cartilage calcification). Both sides (corners), the left carotid artery and the right carotid artery, were labelled individually (i.e., two labels for each panoramic image).
Data Preprocessing
The main location of CACs is adjacent to the C3-C4 vertebrae. We filtered out images with trimmed corners and/or low quality corners, such as those with very blurred and spread vertebrae in the areas of interest. The final dataset consists of 480 clean and 179 CAC corners.
Since the calcifications of the carotid artery are located adjacent to the cervical spine, we cropped each of the lower corners of the panoramic image to a size of 500 × 500 pixels (depicted in Fig 3). Each corner was analyzed separately.
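The cropping step can be sketched as simple array slicing. The example below is a minimal illustration that assumes the panoramic image is already loaded as a NumPy array with rows running top to bottom; the function name `crop_lower_corners` and the synthetic image dimensions are hypothetical and not taken from the study.

```python
import numpy as np

CORNER_SIZE = 500  # pixels, the corner size stated in the text

def crop_lower_corners(image: np.ndarray, size: int = CORNER_SIZE):
    """Return the image-left and image-right lower corners as size x size crops."""
    height, width = image.shape[:2]
    if height < size or width < size:
        raise ValueError("image is smaller than the requested corner size")
    left = image[height - size:, :size]            # bottom rows, first columns
    right = image[height - size:, width - size:]   # bottom rows, last columns
    return left, right

# Synthetic stand-in for a grayscale panoramic radiograph (assumed dimensions).
panoramic = np.zeros((1500, 2900), dtype=np.uint8)
left_corner, right_corner = crop_lower_corners(panoramic)
```

Note that "left" and "right" here refer to the image, which need not match the patient's anatomical left and right.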
Convolutional neural networks (CNN) and Transfer Learning (TL)
We trained a CNN using a TL approach based on the pre-trained InceptionResNetV2 architecture [21].
We replaced the original top layer with two dense layers of 1024 and 256 units, followed by 20% dropout. We used the cross-entropy loss function (a detailed representation of the model architecture is given in supplementary Table S1).
We carried out two training routines: an initial training of only the top layers, followed by training of the full model. We used the Keras and TensorFlow libraries (version 2.6.0) [22]. Next, we used the trained CNN (without the top layers) to extract features from the images and fed these features into an XGBoost [23] (version 1.5.2) classifier.
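The architecture and two-stage training described above can be sketched with the Keras API as follows. This is an illustration under stated assumptions, not the authors' implementation: the input shape, the ReLU activations, the single sigmoid output unit, the Adam optimizer and the learning rates in the usage comments are all assumptions, and the function names are hypothetical. Only the InceptionResNetV2 base, the 1024 and 256 dense layers, the 20% dropout and the cross-entropy loss come from the text.

```python
HEAD_UNITS = (1024, 256)      # dense layer sizes named in the text
DROPOUT_RATE = 0.2            # 20% dropout named in the text
INPUT_SHAPE = (500, 500, 3)   # assumption: corners fed as 3-channel arrays

def build_cac_model(weights="imagenet"):
    """Return (base, model): a pre-trained base and the base plus a new head."""
    import tensorflow as tf  # imported lazily because it is a heavy dependency

    base = tf.keras.applications.InceptionResNetV2(
        include_top=False, weights=weights,
        input_shape=INPUT_SHAPE, pooling="avg")
    x = base.output
    for units in HEAD_UNITS:
        x = tf.keras.layers.Dense(units, activation="relu")(x)  # assumed activation
    x = tf.keras.layers.Dropout(DROPOUT_RATE)(x)
    output = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # assumed output layer
    return base, tf.keras.Model(base.input, output)

def compile_stage(base, model, train_base, learning_rate):
    """Freeze or unfreeze the base, then compile with a cross-entropy loss."""
    import tensorflow as tf

    base.trainable = train_base
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="binary_crossentropy",
        metrics=[tf.keras.metrics.Recall()])
    return model

# Usage outline (not executed here; learning rates are assumptions):
#   base, model = build_cac_model()
#   compile_stage(base, model, train_base=False, learning_rate=1e-3)  # head only
#   model.fit(train_images, train_labels, validation_data=(val_images, val_labels))
#   compile_stage(base, model, train_base=True, learning_rate=1e-5)   # full model
#   model.fit(train_images, train_labels, validation_data=(val_images, val_labels))
#   features = base.predict(train_images)          # CNN without the top layers
#   xgboost.XGBClassifier().fit(features, train_labels)
```

Because `base` shares its layers with `model`, the features extracted in the last step reflect the fine-tuned weights rather than the original pre-trained ones.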
Algorithm evaluation
Due to the relatively small number of corners, we evaluated the algorithm performance using a 7-fold cross-validation approach. We set a CAC-to-clean ratio of 1:5 in the test and validation sets of each fold, as a reasonable balance between “real-world” occurrence rates and the requirement to emphasize carotid calcification. In each fold we split the data into three sets: (a) a test set with 24 CAC corners and 120 clean ones, (b) a validation set with 24 CAC corners and 120 clean ones, and (c) a training set with 131 CAC corners and 240 clean ones.
The performance of the network model can be evaluated by determining statistical values (recall, specificity, precision, and accuracy) and the F1-score. The recall, also known as “sensitivity”, is the proportion of actual positives that are correctly recognized; the specificity is the proportion of actual negatives that are correctly predicted; and the precision is the fraction of correct predictions among the retrieved (predicted positive) instances. The F1-score is an evaluation index that takes both precision and recall into account.
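The set sizes above can be reproduced with a simple per-fold random split. The sketch below is an assumption about the mechanics: the text does not state how corners were assigned to folds, so each fold here is drawn as an independent shuffle, and the function name `make_fold` and the use of integer indices as corner identifiers are hypothetical.

```python
import random

N_CAC, N_CLEAN = 179, 480    # dataset size stated in the text
SPLIT_SIZES = {              # (CAC, clean) counts per set, from the text
    "test": (24, 120),
    "validation": (24, 120),
    "train": (131, 240),
}

def make_fold(seed):
    """Draw one fold: disjoint test, validation and training index lists."""
    rng = random.Random(seed)
    cac = list(range(N_CAC))                       # indices of CAC corners
    clean = list(range(N_CAC, N_CAC + N_CLEAN))    # indices of clean corners
    rng.shuffle(cac)
    rng.shuffle(clean)
    fold, cac_start, clean_start = {}, 0, 0
    for name, (n_cac, n_clean) in SPLIT_SIZES.items():
        fold[name] = (cac[cac_start:cac_start + n_cac]
                      + clean[clean_start:clean_start + n_clean])
        cac_start += n_cac
        clean_start += n_clean
    return fold

# Seven folds; with independent shuffles, test sets may overlap across folds.
folds = [make_fold(seed) for seed in range(7)]
```

Within a fold the three sets are disjoint and together use all 659 corners.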
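For concreteness, the metrics named above can be computed from the four confusion-matrix counts as in the following sketch. The standard definitions are used; the function name `classification_metrics` is hypothetical, and the counts in the example are made-up values for illustration, not the study's confusion matrix.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute recall, specificity, precision, accuracy and F1 from counts."""
    recall = tp / (tp + fn)            # correctly found positives / all positives
    specificity = tn / (tn + fp)       # correctly found negatives / all negatives
    precision = tp / (tp + fp)         # true positives / all predicted positives
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }

# Hypothetical counts for a test set of 24 CAC and 120 clean corners.
metrics = classification_metrics(tp=20, fn=4, fp=8, tn=112)
```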
Calculation of prediction per patient (two sides) from prediction of the individual sides
This analysis was meant to calculate the algorithm's performance focusing on patients rather than on individual corners. The statistical calculations are based on the corner classification algorithm and its performance. As described before, each panoramic image (of a specific patient) has two corners. Therefore, there are three types of patients: (a) a patient with two clean corners (Clean-Clean, CC); (b) a patient with one carotid calcified corner and one clean corner (Calcified-Clean, CM); or (c) a patient with two carotid calcified corners (MM).
We used the performance on a single corner (presented in Table 1) to calculate:
p1 = probability of predicting the actual calcified corners out of the true calcified corners (also known as the recall).
1-p1 = probability of mistakenly predicting “clean” out of the true calcified corners.
p2 = probability of predicting “clean” out of the true clean corners (also known as the specificity).
1-p2 = probability of predicting “calcified” out of the true clean corners.
These probabilities enable calculating the probabilities of each of the following scenarios related to the aforementioned three patient types (See Table 1).
(a) The probability of predicting an MM patient (with both corners calcified) as “clean” is the probability of mistakenly predicting “clean” on both sides of the MM patient: (1-p1)^2.
(b) The probability of predicting an MM patient (with both corners calcified) as “CAC” (with at least one calcified corner) is the complementary probability: 1 minus the probability calculated in (a).
(c) The probability of predicting a CM patient (with one clean corner and one calcified corner) as “clean” is the multiplication of the specificity (actual clean) by the complement of the recall: p2(1-p1).
(d) The probability of predicting a CM patient as “CAC” is the complementary probability: 1 minus the probability in (c). 1-p2(1-p1) = 1-p2+p1p2 = p2(p1-1) + 1.
(e) The probability of predicting a CC patient (with two clean corners) as “clean” is the multiplication of p2 (specificity) of one corner by the specificity of the other corner: p2^2.
(f) The probability of predicting a CC patient (with two clean corners) as “CAC” is the complementary probability: 1 minus the probability calculated in (e).
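The six probabilities above can be combined into per-patient performance as in the following sketch. It uses the per-corner values p1 = 0.82 and p2 = 0.93 and the 8 MM, 16 CM and 120 CC patient mix given in the Results, and it relies on the same assumption the multiplications above imply, namely that the predictions for the two corners of a patient are independent. The function name `patient_level_performance` is hypothetical.

```python
def patient_level_performance(p1, p2, n_mm, n_cm, n_cc):
    """Expected per-patient recall, specificity and precision.

    A patient is predicted "CAC" when at least one corner is predicted calcified.
    """
    p_cac_mm = 1 - (1 - p1) ** 2   # MM predicted "CAC": not both corners missed
    p_cac_cm = 1 - p2 * (1 - p1)   # CM predicted "CAC"
    p_cac_cc = 1 - p2 ** 2         # CC predicted "CAC" (false alarm)

    tp = n_mm * p_cac_mm + n_cm * p_cac_cm   # expected true positives
    fn = (n_mm + n_cm) - tp                  # expected missed CAC patients
    fp = n_cc * p_cac_cc                     # expected false positives
    tn = n_cc - fp                           # expected true negatives
    return {
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }

result = patient_level_performance(p1=0.82, p2=0.93, n_mm=8, n_cm=16, n_cc=120)
rounded = {name: round(value, 2) for name, value in result.items()}
# rounded -> {'recall': 0.88, 'specificity': 0.86, 'precision': 0.57}
```

Rounded to two decimals, these values agree with the per-patient recall, specificity and precision reported in the Results.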
Data Availability
The minimal data set cannot be shared publicly as they contain potentially identifying patient information. The data are owned by Yonsei University Dental Hospital. For researchers who meet the criteria for access to confidential data, requests for these data sets can be sent to Yonsei University Dental Hospital IRB Committee via detailirb{at}yuhs.ac. The authors had no special access privileges that others would not have.
Supporting information captions
Table S1. Sequence of layers in the CNN used to predict CAC.
Fig S1. Figures A-D are panoramic CAC Corners. The yellow arrows point toward the plaque location.
Fig S2. Figures E and F are panoramic corners with Triticeous Cartilage calcification. The blue arrows point toward the calcification. Figures G and H are clean normal corners.
Fig S3. MapGrad of calcified corners.
Fig S4. MapGrad of clean corners.
Acknowledgements
We would like to thank Dr. Millie Kaplan Ben-Ari for her significant contribution to the research and to the manuscript.