ABSTRACT
Objective This study investigates the performance of a CNN algorithm on epilepsy diagnosis. Without pathology, diagnosis involves long and costly electroencephalographic (EEG) monitoring. Novel approaches may overcome this by comparing brain connectivity using graph metrics. This study, however, uses deep learning to learn connectivity patterns directly from easily acquired EEG data.
Methods A convolutional neural network (CNN) algorithm was applied to directed Granger causality (GC) connectivity measures, derived from 50 seconds of resting-state surface EEG recordings from 30 subjects with epilepsy and a 30-subject control group.
Results The learned CNN filters reflected reduced delta band connectivity in frontal regions and increased left lateralized frontal-posterior gamma band connectivity. A diagnosis accuracy of 85% (F1-score 85%) was achieved by an ensemble of CNN models, each trained on differently prepared data from different electrode combinations.
Conclusions Appropriate preparation of connectivity data enables generic CNN algorithms to be used for detection of multiple discriminative epileptic features. Differential patterns revealed in this study may help to shed light on underlying altered cognitive abilities in epilepsy patients.
Significance The accuracy achieved in this study shows that, in combination with other methods, this approach could prove a valuable clinical decision support system for epilepsy diagnosis.
1. INTRODUCTION
Non-seizure EEG recordings of epilepsy patients typically exhibit interictal epileptiform discharges (IED). Visual detection of IEDs is performed by a neurologist and forms an important aspect of the standard diagnostic procedures for epilepsy. However, for epilepsy patients without this pathology, diagnosis typically involves a diagnostic trajectory that could take years and includes costly EEG monitoring in a hospital. Moreover, the detection of IEDs by a trained epileptologist does not achieve perfect diagnostic performance, and IEDs are often judged differently by different epileptologists (Smith, 2005).
There are many causes of epilepsy, resulting in many epilepsy subtypes. Equally many mechanisms could underlie seizures; on some level, however, seizures result from disruptions in the dynamics that control inhibition and excitation of neurons. Interestingly, these disruptions in epileptic brains can be associated with altered (functional) connectivity patterns between various parts of the brain, as compared to healthy brains. Connectivity measures may therefore be used to find robust neuromarkers for epilepsy, which may be used for diagnosis. Novel diagnosis approaches have exploited this phenomenon, achieving diagnosis accuracies above 90%. Often, such studies calculate graph-theoretical characteristics of brain connectivity networks to discriminate epileptic brains from healthy brains. However, such studies usually test a group of subjects with only one or two epilepsy types, and they require costly fMRI or MRI neuroimaging. Furthermore, although their accuracy is relatively high, it is still not high enough for reliable diagnosis in a clinical setting. One reason is that, even among subjects with nearly identical anatomic locations of seizure onset, the distribution of functional connectivity in the epilepsy network is unique for each subject (Dumlu et al., 2020; Marino et al., 2019). Therefore, it is a challenge for epilepsy diagnosis to find reliable and generalizable connectivity patterns that can be used as a neuromarker for the diagnosis of epilepsy. Although graph-theoretical metrics facilitate characterization of brain connectivity networks and may be useful for differential diagnosis between different kinds of epilepsy, an algorithm for the purpose of early stage diagnosis may require more than graph-theoretical network metrics as a robust neuromarker.
The current study also exploits brain functional connectivity measures to diagnose epilepsy. However, to achieve high diagnostic accuracy it employs a deep learning algorithm, a convolutional neural network (CNN), directly on directed functional (i.e. effective) connectivity data. The use of deep learning has become possible because some large databases with brain activity recordings have recently become publicly available. Deep learning algorithms can circumvent the complexities of identifying general patterns of brain connectivity and automate the identification of brain connectivity patterns. This approach has not been studied before, although it may achieve a diagnostic accuracy similar to that of a trained epileptologist and to that of methods that require MRI-based techniques, and it can provide supportive evidence for a diagnosis. Moreover, our method processes only 50 seconds of EEG recordings of a patient during rest, requiring no cognitive tasks or tests to be performed. These simplifications may allow this method to be performed in more accessible facilities such as a local practitioner’s office.
For this purpose, convolutional neural network (CNN) algorithms are trained directly on effective connectivity matrices, which are calculated with Granger causality and will be called GC matrices henceforth. The GC matrices are derived from non-seizure (resting state) EEG brain recordings of 30 epilepsy patients and 30 control group subjects. Furthermore, the diagnostic accuracy of the CNN algorithm is analyzed for different combinations of EEG electrodes.
By training the CNN algorithm on GC matrices, it is possible to derive connectivity patterns that are associated with epilepsy, and these patterns can thus be viewed as neuromarkers for the disease. Furthermore, this approach allows for a comparison of epileptic brain connectivity patterns with connectivity patterns observed in patients with other co-occurring cognitive deficits, such as decreased memory function. Comparisons such as these may provide researchers with novel insights into the way complex phenomena such as neuroplasticity play a role in epilepsy, as will be discussed later.
2. MATERIAL AND METHODS
2.1. EEG data
We used the TUH EEG Epilepsy Database from Temple University Hospital, Philadelphia (Shah et al., 2018), which is a subset of the TUH database (Obeid and Picone, 2016). This database consists of EEG recordings of more than 200 patients and is divided into two subsets; one consists of patients with epilepsy and the other patients without epilepsy. The recordings were produced with electrodes placed in the 10-20 EEG system and each recording has a technician’s report describing the medical background of the patient, the process and the results. A certified neurologist, who reviewed medical histories, the EEG technician’s reports and the EEGs, has determined whether patients had epilepsy or not. From this database, 60 subjects were selected, of whom 30 did have epilepsy and 30 did not.
2.2. Selection of the subjects
Because the patients were from a hospital database, they had a large number of medical issues. Therefore, these 60 subjects were selected according to criteria that aim to minimize inter-group differences. Patients with recent seizures, cerebral dysfunction, coma, encephalopathy, unresponsiveness, mental retardation, high heart rate, brain tumor, resected brain area or drug overdose were excluded in the selection process. Patients were also excluded when technical difficulties with electrode connections, muscle artifacts or a deep sleep stage (stage II) were mentioned in the technical report. The remaining subjects were still affected by various diseases, as presented in Table 2. We attempted to minimize the number of diseases and the differences in average age and male-female ratio between the two subject groups. Their file numbers and detailed medical information can be found in the appendix of (Rijnders, 2021).
2.3. Selection of EEG samples and channels
For the 60 selected subjects, EEG time series data recorded by 21 scalp electrodes were included in the analysis. Only recordings with the Average Reference (AR) configuration were used. Recordings were 20 minutes long on average. Some of the subjects were drowsy (sleep stage I) at some point during the recordings, or they were undergoing photic stimulation tests. However, in most cases such events occurred only during the latter half of the recording. We therefore attempted to avoid these effects by using only the first 5 minutes of the recordings for further analysis.
2.4. Preprocessing the EEG
From these first 5 minutes of EEG data, 50 continuous seconds were selected that appeared the calmest upon visual inspection. This was done for the purpose of acquiring resting-state EEG data. An index of which 50 seconds were used for which subject can be found in the appendix of (Rijnders, 2021).
The preprocessing was kept at a minimum because Granger causality is sensitive to filtering. The selected preprocessing method involved the following steps:
The first steps of preprocessing the selected 50-second EEG data segment involved notch filtering (60, 120 and 180 Hz), trend line removal, and removal of eyeblink artifacts. This last step was performed by calculating ICA components and sorting them by their correlation with the FP1 electrode.
Bivariate Granger causality connectivity values were calculated with model order value MO=15. Both EEG preprocessing and Granger causality calculation were performed in Brainstorm version 3.200124 (Tadel et al., 2011).
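For reference, the textbook time-domain definition of bivariate Granger causality is sketched below. It is given here only as an assumption about the general form of the measure: the symbols are illustrative, and the exact estimator and its decomposition into frequency bands as configured in Brainstorm are described in (Rijnders, 2021).

```latex
% Restricted model: x is predicted from its own past only (model order p = 15)
x_t = \sum_{k=1}^{p} a_k \, x_{t-k} + \varepsilon_t

% Full model: the past of y is added as a predictor of x
x_t = \sum_{k=1}^{p} a'_k \, x_{t-k} + \sum_{k=1}^{p} b_k \, y_{t-k} + \varepsilon'_t

% Granger causality from y to x: log ratio of the residual variances
GC_{y \to x} = \ln \frac{\operatorname{var}(\varepsilon_t)}{\operatorname{var}(\varepsilon'_t)}
```

Under this definition the value is zero when the past of y does not improve the prediction of x, and in general GC from y to x differs from GC from x to y, which is what makes the measure directed.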
2.5. Design of experiment
We classified the resting state EEG data with a convolutional neural network (CNN) algorithm that was trained on Granger causality (GC) connectivity matrices. In this way, the EEG data for each subject could be predicted to be of class epileptic or not epileptic. There were four phases: pre-processing, further processing, deep learning and classification. A flowchart is presented in Figure 1.
2.6. Extraction of connectivity matrices
The preprocessed EEG data was used to estimate connectivity between different brain regions. Granger causality (GC) connectivity between multiple electrodes was calculated in Brainstorm. For a predefined selection of different electrodes (i.e. electrode combination), the calculated GC connectivity values between each electrode pair were presented in a GC connectivity adjacency matrix (i.e. GC matrix). This resulted in one GC matrix per subject, derived from 50 sec of EEG recording. However, for each subject, this was also done for several different electrode combinations, as displayed in Figure 2.
For each of the 60 subjects, and for each electrode combination, four GC matrices were calculated, one for each frequency band (delta = 1-4 Hz, theta = 5-7 Hz, alpha = 8-13 Hz, beta = 14-29 Hz, gamma = 30-55 Hz). A detailed description of how these operations were performed and configured in Brainstorm is presented in (Rijnders, 2021).
Next, for each subject, these four frequency band GC matrices were combined into one larger image, as displayed in Figure 3. The resulting 60 larger images (one for each subject) were then used as input to train the CNN deep learning algorithm.
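The combination step can be sketched as follows. This is a minimal illustration, assuming the four band matrices are tiled in a 2×2 grid; the function name `tile_bands` and the ordering of the bands within the grid are hypothetical, since the actual layout is defined by Figure 3.

```python
def tile_bands(band_matrices):
    """Tile four n x n matrices into one 2n x 2n image (2 x 2 grid).

    The order (top-left, top-right, bottom-left, bottom-right) is an
    assumption made for illustration only.
    """
    if len(band_matrices) != 4:
        raise ValueError("expected exactly four band matrices")
    top_left, top_right, bottom_left, bottom_right = band_matrices
    # Concatenate rows side by side, then stack the two halves vertically.
    top = [a + b for a, b in zip(top_left, top_right)]
    bottom = [a + b for a, b in zip(bottom_left, bottom_right)]
    return top + bottom


# Toy example: four 3 x 3 matrices filled with the constants 0, 1, 2, 3.
n = 3
bands = [[[float(b)] * n for _ in range(n)] for b in range(4)]
image = tile_bands(bands)
print(len(image), len(image[0]))
```

For three electrodes this yields a 6×6 image, and for six electrodes a 12×12 image, matching the image sizes reported in the results.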
GC values of these GC matrices were significantly lower for the high frequency bands than for the low frequency bands, as described in Figure 3. For this reason, a second set of images of the same size was created using a different data preparation method. In this procedure, each frequency band GC matrix was first separately normalized (i.e. scaled) such that, for each frequency band, the average GC matrix value is 0.5. This procedure is depicted in Figure 4. In this case too, the resulting 60 images (one for each subject) were used as input to train (another) CNN deep learning algorithm.
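A minimal sketch of such a per-band scaling is given below. It assumes a purely multiplicative rescaling and an average taken over all matrix entries, including the diagonal; both are assumptions, and the function name `scale_to_mean` is hypothetical. The procedure actually used is the one depicted in Figure 4.

```python
def scale_to_mean(matrix, target_mean=0.5):
    """Multiply all entries by one factor so that their mean equals target_mean.

    Assumption: the mean is taken over every entry of the matrix.
    """
    values = [v for row in matrix for v in row]
    mean = sum(values) / len(values)
    if mean == 0:
        raise ValueError("cannot scale a matrix whose mean is zero")
    factor = target_mean / mean
    return [[v * factor for v in row] for row in matrix]


# Toy matrix with small values, as is typical for a high frequency band.
small_band = [[0.0, 0.02], [0.04, 0.0]]
scaled = scale_to_mean(small_band)
scaled_mean = sum(v for row in scaled for v in row) / 4
print(scaled_mean)
```

Because every entry of a matrix is multiplied by the same factor, the ratios between electrode pairs within one band are preserved, while differences in absolute magnitude between bands and between subjects are removed.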
2.7. CNN training and classification
The purpose of the CNN algorithm was to recognize features of epilepsy in the GC matrices and to classify the combined GC matrix of each subject as either being epileptic or not epileptic.
The CNN algorithm used for this experiment had a minimal architecture, consisting of only one convolution layer and three linear layers, as depicted in Figure 5.
The filters of the CNN algorithm were chosen to be the same size as the (small) GC matrices, and employed a stride of that same size, so that each filter convolves over the larger image in only four steps, one step for each frequency band. As a result, the convolution layer outputs only one 2×2 image per filter. After being flattened, these 2×2 images were then used as input for the first of two dense layers. Since the filters have the same size as the original GC matrices, relevant connectivity patterns can be directly observed by looking at the learned filters, and these filters can be utilized effectively as the neuromarkers for epilepsy. In the original study (Rijnders, 2021) the CNN architecture consisted of not one but two convolution layers, which employed smaller kernel sizes and strides. That architecture was computationally more complex but resulted in performance similar to that of the simple architecture of the current study, so the architecture was simplified.
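The effect of choosing kernel size and stride equal to the GC matrix size can be illustrated with a plain Python sketch. It computes the response of a single filter without bias or activation, which is a simplification; the function name `strided_filter_response` is hypothetical, and the actual layer is the PyTorch convolution depicted in Figure 5.

```python
def strided_filter_response(image, kernel):
    """Slide an n x n kernel over the image with stride n (no overlap).

    For a 2n x 2n image this gives a 2 x 2 output: one value per tile,
    i.e. one value per frequency band.
    """
    n = len(kernel)
    steps = len(image) // n
    output = []
    for i in range(steps):
        row = []
        for j in range(steps):
            total = 0.0
            for r in range(n):
                for c in range(n):
                    total += image[i * n + r][j * n + c] * kernel[r][c]
            row.append(total)
        output.append(row)
    return output


# Toy 4 x 4 image made of four constant 2 x 2 tiles, and an all-ones kernel.
toy_image = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
toy_kernel = [[1, 1], [1, 1]]
response = strided_filter_response(toy_image, toy_kernel)
print(response)  # [[4.0, 8.0], [12.0, 16.0]]
```

Each output value depends on exactly one band tile, which is why a learned filter can be read directly as a connectivity pattern over the electrode pairs.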
For each run of the algorithm, most of the 60 subjects are used for training, a smaller number of subjects is used for the validation set, and a still smaller number is used for testing the trained network. The validation set is used during training to check how well the model generalizes. An early stopping strategy is used: a model update is saved only when the calculated loss on the validation set reaches a new minimum, and the training session is ended when no such new minimum is achieved within a predefined number of training epochs (i.e. the patience value). This prevents the model from excessive overfitting on the training data.
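The early stopping rule described above can be sketched independently of any deep learning framework. The sketch below replays a list of validation losses; the function name `early_stopping_run` and the loss values are hypothetical and serve only to illustrate the logic.

```python
def early_stopping_run(val_losses, patience):
    """Return (epoch of the saved model, epoch at which training stopped)."""
    if not val_losses:
        raise ValueError("need at least one validation loss")
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New minimum: this is where the model weights would be saved.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


# Hypothetical validation losses per epoch.
losses = [0.9, 0.7, 0.8, 0.6, 0.65, 0.7, 0.66, 0.5]
saved_epoch, stop_epoch = early_stopping_run(losses, patience=3)
print(saved_epoch, stop_epoch)  # 3 6
```

In this toy run the later minimum at the last epoch is never reached, which shows the trade-off that the patience value controls.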
For each electrode combination, different CNN layer sizes, dropout values, validation set sizes and other hyperparameters were chosen to maximize diagnostic performance.
Performance is evaluated with 10-fold cross validation. With a test set size of 6 subjects, this has the consequence that after 10 runs, each subject has been used once as a test subject and twice as a subject for a validation set. For reproducibility, the dropout of the linear layers is applied with a fixed random seed. The loss function is cross entropy, and an SGD optimizer is used for updating the weights of the network.
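One way to obtain such a split is sketched below. It assumes that the subjects are divided into ten blocks of six and that, in each run, one block is the test set and the next two blocks form the validation set; this rotation scheme and the function name `rotating_folds` are assumptions, not a description of the released code.

```python
def rotating_folds(n_subjects=60, n_folds=10, n_val_folds=2):
    """Build (train, validation, test) index lists for each run."""
    if n_subjects % n_folds != 0:
        raise ValueError("n_subjects must be divisible by n_folds")
    size = n_subjects // n_folds
    blocks = [list(range(k * size, (k + 1) * size)) for k in range(n_folds)]
    splits = []
    for k in range(n_folds):
        test = blocks[k]
        # The blocks following the test block (wrapping around) are validation.
        val = [s for j in range(1, n_val_folds + 1)
               for s in blocks[(k + j) % n_folds]]
        held_out = set(test) | set(val)
        train = [s for s in range(n_subjects) if s not in held_out]
        splits.append((train, val, test))
    return splits


splits = rotating_folds()
train, val, test = splits[0]
print(len(train), len(val), len(test))  # 42 12 6
```

In practice the subject order would first be shuffled with a fixed seed, and both groups would be balanced across blocks; those steps are omitted here for brevity.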
The CNN algorithm is implemented using PyTorch 1.1.0. The Python code will be released on https://github.com/berjor/epilepsy-cnn. The code was executed on Google Colab.
2.8. Combined classification
The resulting predictions/classifications from different methods were combined into one hybrid (ensemble) method. In this combined classification, each subject received three different CNN predictions/diagnoses by using only the three electrode combinations that achieved highest performance. Subsequently, the diagnosis label (epilepsy or no epilepsy) that received the majority of votes (at least 2 out of 3) was selected as the final diagnosis. A flowchart for this hybrid system is presented in Figure 6.
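The voting rule can be sketched in a few lines. The label strings, the function name `majority_vote` and the example predictions are hypothetical; only the rule itself (the label with at least 2 out of 3 votes wins) is taken from the description above.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the majority of an odd number of models."""
    if len(predictions) % 2 == 0:
        raise ValueError("use an odd number of predictions to avoid ties")
    label, _votes = Counter(predictions).most_common(1)[0]
    return label


# Hypothetical predictions of the three CNN models for two subjects.
subject_a = ["epilepsy", "no epilepsy", "epilepsy"]
subject_b = ["no epilepsy", "no epilepsy", "epilepsy"]
print(majority_vote(subject_a))  # epilepsy
print(majority_vote(subject_b))  # no epilepsy
```

With two labels and three voters a tie cannot occur, so no tie-breaking rule is needed.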
3. RESULTS
3.1. Results for separate electrode combinations
Highest accuracies for the different electrode combinations are presented in Table 1.
The highest accuracy with the CNN algorithm was achieved with the FP1&F3&P3 electrodes by first normalizing (scaling) each frequency band GC matrix. The maximum accuracy achieved by this method was 78%. From the confusion matrix (see Table 3), an F1-score of 79% was calculated. For the details of the specific CNN configurations, see the appendix. Brain images were obtained from Brainstorm (Tadel et al., 2011).
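For readers unfamiliar with the metrics, the sketch below shows how accuracy and F1-score follow from the four cells of a binary confusion matrix. It assumes that epilepsy is treated as the positive class; the counts in the example are hypothetical and are not the values of Table 3.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Compute accuracy and F1-score from confusion matrix counts.

    tp/fn: epilepsy subjects classified correctly/incorrectly,
    tn/fp: control subjects classified correctly/incorrectly.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    denominator = 2 * tp + fp + fn
    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1


# Hypothetical counts for 20 subjects, for illustration only.
accuracy, f1 = accuracy_and_f1(tp=8, fp=2, fn=2, tn=8)
print(accuracy, f1)  # 0.8 0.8
```

Accuracy and F1-score coincide here because the toy errors are symmetric; they diverge, as in the reported results, when false positives and false negatives are unevenly distributed.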
By observing the maximum weights in the trained network, we were able to trace back which CNN filter was maximally associated with the epilepsy label. Cross validation was performed over ten runs, so we obtained ten such trained (epilepsy) filter images in total. We averaged those ten images into one image to obtain a more reliable epilepsy filter image. This averaged trained filter image can thus be seen as a likely neuromarker for epilepsy. The filter image displayed in Figure 7 (right image) is such a neuromarker. In this image, the red color of the top right pixel indicates increased connectivity directed from the FP1 to the P3 electrode. The blue pixel at the top middle of the image indicates reduced connectivity from the FP1 electrode to the F3 electrode. The image also shows on which frequency band the filter was most effective; in this example, this was the gamma band.
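The averaging step amounts to an element-wise mean over the selected filters. The sketch below uses two toy 2×2 filters instead of ten learned filters; the function name `average_filters` and the values are hypothetical.

```python
def average_filters(filters):
    """Element-wise mean of equally sized 2-D filters (one per run)."""
    n_runs = len(filters)
    rows, cols = len(filters[0]), len(filters[0][0])
    return [
        [sum(f[r][c] for f in filters) / n_runs for c in range(cols)]
        for r in range(rows)
    ]


# Two toy filters; positive values stand for increased connectivity,
# negative values for reduced connectivity.
run_1 = [[1.0, -1.0], [0.0, 2.0]]
run_2 = [[3.0, -3.0], [0.0, 4.0]]
mean_filter = average_filters([run_1, run_2])
print(mean_filter)  # [[2.0, -2.0], [0.0, 3.0]]
```

Such an average is only meaningful when the selected filters from the different runs encode the same pattern with the same sign, which is assumed here.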
An equally high accuracy was achieved with the CNN algorithm that used only the six frontal electrodes without normalizing the four 6×6 GC matrices before combining them into one 12×12 image. The maximum accuracy achieved by this method was also 78%, and from the confusion matrix (see Table 4) an F1-score of 79% was calculated. See the appendix for the CNN configuration used and the details of the performance calculation.
The third highest accuracy was achieved with the CNN algorithm that used the twelve electrodes from the frontal and the parietal regions (F&P), by first normalizing (scaling) each frequency band GC matrix separately and subsequently combining these four scaled 12×12 GC matrices into one 24×24 image. The maximum accuracy that this CNN method achieved was 73%, and from the confusion matrix (see Table 5) an F1-score of 69% was calculated. See the appendix for the configurations used and the details of the performance calculation.
3.2. Results of the hybrid method
The purpose of this experiment was to examine the capacity of a deep learning method, that exploits GC connectivity, to diagnose epilepsy. This was done with a hybrid (ensemble) CNN method, that combines the outcomes of the three best performing CNN algorithms, each trained on another data preparation method and using different CNN layer sizes. For this hybrid method, the most accurate predictions, which were made by the three previously described best performing CNN methods, were combined and for each subject, the label that was predicted in the majority of these three methods was decisive for the final classification. The accuracy of this hybrid method was 85%. From the confusion matrix (see Table 6) an F1-score of 85% was calculated. See the appendix for details of performance calculations.
The resulting 85% accuracy is well above the 50% accuracy expected from a random binary classifier on two equally sized groups. Thus, a CNN can be used to diagnose epilepsy with an accuracy that clearly exceeds chance level.
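How far 85% lies above chance can be quantified with a one-sided binomial tail probability. The sketch below assumes 51 correct diagnoses out of 60 (85% of 60 subjects), independent predictions and a chance level of 0.5 for the two equally sized groups. It is an illustration rather than the test used in this study, and it does not correct for the fact that the three best electrode combinations were selected on the same data.

```python
from math import comb


def binomial_tail(successes, trials, chance=0.5):
    """P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance ** k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 85% of 60 subjects corresponds to 51 correct diagnoses.
p_value = binomial_tail(51, 60)
print(p_value < 1e-6)  # True
```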
4. DISCUSSION
It was found that the CNN algorithm achieved high prediction accuracy when applied to the FP1 + F3 + P3 electrode combination. Gamma band connectivity seemed to be most relevant for achieving the highest accuracy for this electrode combination. The same high accuracy was achieved by applying the CNN algorithm to GC matrices derived from frontal electrodes only. The frontal-parietal electrode combination resulted in the third highest accuracy. By combining three CNN predictions, each using a different data preparation or normalization method, we achieved a combined epilepsy diagnosis accuracy of 85%, which is significantly higher than that of a random classifier. Therefore, CNN algorithms trained on connectivity matrices can be used to support the diagnosis of epilepsy.
The epileptic subjects in this study showed various differences in connectivity compared to the non-epileptic subjects. This is not surprising, since connectivity alterations have often been reported by epilepsy researchers (Sargolzaei et al., 2015; Stam, 2014; Van Diessen et al., 2013). A direct observation of average GC values for each of the two subject groups showed various connectivity differences, as reported in the appendix of (Rijnders, 2021). These comparisons indicated that for epilepsy patients, connectivity between frontal electrodes, and also between parietal electrodes, was reduced for all frequency bands. In agreement with other studies (Jiang et al., 2018), the decrease of connectivity was particularly pronounced between the frontal electrodes. This was most clearly observable for the delta frequency band. The learned filters of the CNN in this study also reflected this feature, as was shown in Figure 8. The figure indicates that the learned filter (that correlated most with the epilepsy label) was most effective for the delta band. These observations also explain why the frontal electrodes achieved their highest accuracy of 78% by using non-scaled GC matrices: not scaling the GC matrices leaves the delta band as the most prominent source of GC connectivity, so that the discriminative power of these large GC reductions in the delta band becomes easier for the algorithm to learn.
Interestingly, this comparison of average GC values has also shown a more complex pattern in the GC connectivity in the case of the FP1+F3+P3 electrodes. For this combination, connectivity between parietal and frontal electrodes, particularly in the gamma frequency band, was not reduced but increased. These epileptic GC alterations are depicted in Figure 11.
The left image in Figure 11 shows a clear left-hemispheric pattern of increased gamma band connectivity from the FP1 to the P3 electrode, and a decreased gamma band connectivity from the FP1 to the F3 electrode, which coincides exactly with the pattern found in our learned CNN filters, which was shown in Figure 7. It should be noted that the average gamma band connectivity of non-epileptic subjects was close to zero. Therefore, these values will have a large relative increase with just a small absolute increase of gamma band connectivity. This unstable effect may have been exacerbated by the low sample rate of 250 Hz causing a lower signal-to-noise ratio at the gamma frequencies; however, we mitigated this problem by only processing the low gamma band frequencies of up to 55 Hz.
Overall, these two observations coincide with various other studies, which found similar frequency band dependent connectivity differences, where a decrease of connectivity occurred in low frequency bands and an increase of connectivity was seen in high frequency bands (Clemens et al., 2011; Wang and Meng, 2016).
For the FP1+F3+P3 electrode combination, it was important to allow our CNN algorithm to find patterns of increased gamma band connectivity, because we first needed to compensate for the fact that the absolute gamma band GC values were very low compared to delta band GC values and also for the fact that there is large inter-subject variance in those gamma band GC values. Without normalization (scaling) these gamma band values for epileptic subjects would have stayed unnoticed by the CNN. The GC matrix values were scaled according to the method in Figure 4, and as a result, epileptic patterns in gamma band became detectable for the CNN. Since we were using Granger causality, which is a directed (effective) connectivity measure, this scaling was thus also able to emphasize the directional characteristic of the connectivity. This turned out to be the most discriminative connectivity feature for this electrode combination, as we will discuss.
The electrode combinations that resulted in the highest prediction performance (i.e. frontal electrodes and parietal electrodes) cover the areas of the default mode network (DMN). The DMN can be characterized as the brain regions that are active during resting-state and become deactivated during externally directed tasks, and is often associated with cognitive processes that are directed toward the self, such as autobiographic memory, mind wandering, future thinking and introspection (Buckner & Carroll, 2007). Because surface electrodes were used without performing EEG source estimation (ESI), we cannot guarantee that the GC connectivities in the current study were derived from sources in the DMN. However, based on the fact that we used resting-state recordings, we assume that areas of the DMN were responsible for our connectivity measurements. This allowed us to speculate which mechanisms could be involved in our findings.
Between certain electrode pairs, the described connectivity patterns were not equal for opposite directions. As was shown in Figure 11, for epilepsy patients, connectivity from the anterior DMN region (i.e. the FP1 and F3 electrode) to the posterior DMN region (P3) was significantly more altered than connectivity in the opposite direction (from P3 to FP1 and F3). In fact, because brain connectivity dynamics are strongly dependent on directionality aspects, the success of a connectivity based classification also depends on which combinations of directed connectivity patterns are considered (Verhoeven et al., 2018). In the current study, it was found that this was particularly the case for the FP1 + F3 + P3 electrodes.
Though this effect is difficult to interpret, it seems plausible that the matrix scaling procedure causes only patterns of GC differences between the electrodes within one subject to remain visible, whereas the absolute GC value difference between subjects becomes unusable as a feature. It is thus plausible that for the FP1+F3+P3 electrodes, the largest discriminative power in the gamma band lies not only in the increase of absolute gamma band GC values, but also in the altered directionality of this connectivity. Most discriminative information may have been captured by the directionality of the gamma band increase, and not only by its increased magnitude.
The information captured by the absolute magnitude of the GC connectivity value is lost by performing scaling. As can be seen in the results in Table 1, the discriminative power of inter-subject absolute GC values is maximally leveraged without scaling of the GC matrices. This non-scaling method reached a prediction accuracy of 78%, but only when applied on the frontal electrodes, which has been explained by the finding that highest reduction of GC values for epilepsy patients occurred among these frontal electrodes.
Interestingly, the results showed that the GC matrix scaling method infers a different set of diagnoses as compared to unscaled GC matrices, for a relatively large number of subjects, see Figure 12. It is thus clear that the combination of different data preparation methods, as exploited by the hybrid CNN method, had the effect of increasing diagnostic performance of the ensemble CNN algorithm. It leveraged its discriminative power by exploiting multiple epileptic features. It therefore seems plausible that each data preparation method enables the extraction of a different physiological aspect that is characteristic of epilepsy.
Whether each of these different physiological characteristics can be associated with one particular subtype of epilepsy remains an open question, since we do not know the subtypes of epilepsy in this dataset. Although a neurologist determined whether each subject had epilepsy or not in the TUH EEG Epilepsy database, the epilepsy type or syndrome was not determined and therefore unknown to us. Given the fact that there are various common types of epilepsy and that there were a variety of symptoms noted for the 30 selected epilepsy patients, it seems plausible that there were different epilepsy types in the database, including patients with generalized seizures and patients with focal seizures.
One interesting observation made in the current study was that the gamma band GC increase pattern was significantly more prominent in the connectivity between the FP1+F3+P3 electrodes (left hemisphere) than between the FP2+F4+P4 electrodes (right hemisphere), as can be seen by comparing the left and right images in Figure 11. Indeed, a similar hemispherically asymmetric pattern was also found for most frequencies in another study which found that, similar to the current study, decreased connectivity was more dominant in the right hemisphere, while increased connectivity occurred mostly in the left hemisphere (Clemens et al., 2011). Such asymmetrical patterns were observed in various studies on temporal lobe epilepsy (TLE) and mTLE. Because TLE is the most common type of adult focal epilepsy, it is likely that a large number of subjects in the current study indeed have TLE, which is supported by the fact that for many subjects the technical report described symptoms associated with TLE, including visual or auditory phenomena, as summarized in the appendix of the original study (Rijnders, 2021). Interestingly, various studies reported that the asymmetrical patterns of connectivity are different for left TLE than for right TLE, indicating that different pathological mechanisms may underlie these two types of epilepsy.
A study from 2014 suggests that although right TLE patients have more reduced connectivity between DMN regions, they also showed some increases, which was a pattern thought to result from a compensatory mechanism (Haneef et al., 2014). Various other studies also suggested that such altered connectivity patterns can be associated with compensatory neuroplasticity effects that support default mode network (DMN) function (Dupont et al., 2002; Vlooswijk et al., 2010). Furthermore, such compensatory connectivity variations were found to be related to the duration of the disorder (Zhang et al., 2010).
Of particular note is that several epilepsy researchers observed altered DMN connectivity with the hippocampus. Interestingly, in one study from 2011, patients with left TLE were often found to have decreased connectivity between the posterior DMN and the hippocampus. Because the hippocampus and anterior DMN are connected via the posterior DMN, this increased anterior connectivity in left TLE was explained as a redistribution of connectivity (Liao et al., 2011). It seems possible that the increased connectivity pattern found in the current study may also be interpreted as the result of such a redistribution of connectivity, perhaps in response to pre-existing memory impairments, given the important role of the hippocampus in memory function. Weaker connectivity between posterior DMN regions and the hippocampus is also found in subjects with Alzheimer’s disease (Wu et al., 2011), and a large degree of co-occurrence between epilepsy and Alzheimer’s disease has been reported. In the current study, connectivity increase was found only during resting state and only in comparison to the control subjects. However, other studies have shown that an increase of connectivity during resting state can be associated with improved memory function during tasks. In this view, the directed connectivity differences within the DMN observed in this study may in fact result from effects that compensate for cognitive aspects such as memory dysfunction.
It is not well understood which role gamma band connectivity plays in terms of oscillatory dynamics. However, several studies have demonstrated that frequency band dependent rhythmic fluctuations link the oscillatory patterns of neuronal activity to periodic fluctuations in several cognitive processing tasks, including those related to memory (Helfrich and Knight, 2016; Uhlhaas and Singer, 2012). In this regard, viewing the posterior DMN as a dysfunctional hub for communication with the hippocampus could perhaps explain the increased gamma band connectivity towards the posterior DMN as being the result of the anterior DMN attempting and failing to complete an early phase of some cognitive processing task that requires the anterior DMN to communicate with the hippocampus.
Explaining the findings in the current study by any of these speculations remains challenging. In particular, we used no source estimation, so it is not possible to determine what occurred in deeper brain regions. Nevertheless, the findings do seem to support the literature pointing towards compensatory effects, perhaps in support of memory function. Several other studies reported similar findings, further supporting such a conclusion (Bettus et al., 2009; Doucet et al., 2013; Lv et al., 2014). However, as mentioned above, it is likely that the epileptic zones of the focal epilepsy patients in our study exhibit a large variation. The epilepsy types and etiologies can therefore be assumed to be heterogeneous in the studied cohort, which makes it difficult to verify any physiological explanation for our findings.
For the same reason, it is also not possible to determine whether the model and methods used have greater or lesser predictive value for specific subtypes of epilepsy. However, regardless of what caused the connectivity patterns identified in this study, it was shown that this simple CNN method, in combination with different GC matrix normalizations, enables the identification of various complex epileptic brain connectivity patterns. Considering that epilepsy is a heterogeneous disease (Dumlu et al., 2020), it remains to be seen whether this method can achieve higher accuracies when the model is trained on subjects with only one subtype of epilepsy. It does seem plausible, however, that a larger dataset would improve the ability of the deep learning algorithm to find more neuromarkers that are representative of a particular subtype of epilepsy. As such, the practice of diagnosing epilepsy may benefit from incorporating deep learning techniques such as the one in this study, which may provide a valuable clinical decision support system for epilepsy diagnosis.
5. CONCLUSIONS
The CNN algorithm achieved its highest diagnosis prediction accuracy on delta and gamma band directed connectivity patterns. A generic CNN algorithm trained on images that consist of different GC matrices is thus able to detect complex features, provided the input data is prepared adequately. To maximally leverage a CNN’s ability to exploit multiple epileptic features in GC connectivity matrices, predictions made by different CNN models, each with a separate data preparation and electrode combination, should be fused into a combined classification system. The accuracy of 85% achieved in this study, for the automated diagnosis of epilepsy based on GC connectivity derived from EEG periods without pathological activity, shows that this approach could prove a valuable aid for clinicians, or could provide supportive diagnostic evidence in conjunction with other automated methods. Because different electrode combinations and matrix normalization methods can elicit different physiological characteristics, it is possible that this approach could achieve a diagnostic performance beyond that of a trained neurologist. The performance was comparable to that reported by other studies, which required more expensive imaging modalities such as MRI, fMRI or ESI with high density EEG. The current study shows that these more expensive modalities may not be required for achieving high epilepsy diagnosis accuracy. The current method requires only a short, simple, noninvasive, low density scalp EEG recording. This is particularly beneficial for patients with low seizure and IED frequency. Since the method is computationally feasible on a standard laptop and requires minimal training to implement, it could also be easily used in a practitioner’s office. This, in turn, could yield a significant cost reduction for society as a whole, since automated diagnosis of epilepsy could remove the need for long-term monitoring.
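The fusion of ensemble members recommended above can be illustrated with a short sketch. It assumes soft voting, i.e. averaging each member's predicted probability for the epilepsy class and thresholding the mean; this fusion rule, the function name `fuse_predictions`, the threshold of 0.5, and the example probabilities are all illustrative assumptions and are not taken from the study.

```python
from statistics import mean


def fuse_predictions(member_probs, threshold=0.5):
    """Soft voting (an assumed fusion rule, not necessarily the study's).

    member_probs[m][s] is the probability that ensemble member m assigns
    to the "epilepsy" class for subject s.
    Returns one 0/1 label per subject (1 = epilepsy).
    """
    if not member_probs:
        raise ValueError("need at least one ensemble member")
    n_subjects = len(member_probs[0])
    if any(len(p) != n_subjects for p in member_probs):
        raise ValueError("all members must score the same subjects")
    # Average across members for each subject, then apply the threshold.
    fused = [mean(per_subject) for per_subject in zip(*member_probs)]
    return [1 if p >= threshold else 0 for p in fused]


# Hypothetical outputs of three CNN models for four subjects.
probs = [
    [0.9, 0.4, 0.2, 0.6],
    [0.7, 0.6, 0.1, 0.3],
    [0.8, 0.3, 0.3, 0.7],
]
labels = fuse_predictions(probs)
```

In this made-up example the second subject is labelled non-epileptic even though one member scored it above 0.5, which shows how averaging lets members trained on different electrode combinations and normalizations outvote a single dissenting model.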
Data Availability
This study used the openly accessible dataset from the Temple University Hospital EEG Corpus.
Funding sources
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Declarations of interest
none
Acknowledgements
The authors acknowledge Ali Uslu and Dr. Seda Dumlu for their proofreading assistance.
Appendix A: Training Configurations
Common configurations for all electrode combinations
Frequency bands used: Delta, Theta, Beta, Gamma.
CNN architecture:
One convolutional layer, 1 input channel, ReLU, no dropout, no pooling.
Three fully connected (linear) layers: ReLU, dropout, final layer has 2 nodes.
FC layer dropouts were applied with a manual seed=23.
Test set consisted of 3 epileptic and 3 non-epileptic subjects for each run.
Validation set consisted of 6 epileptic and 6 non-epileptic subjects for each run.
Loss function: cross entropy loss.
Optimizer: stochastic gradient descent.
Batch size = 6.
Learning rates were reduced by a factor of 33.3%, in 2 equal steps:
First reduction at epoch == patience, and second at epoch == 2*patience.
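The split sizes and the learning-rate schedule listed above can be sketched in plain Python. The sketch makes several assumptions that the configuration does not state: subjects are shuffled per class with a hypothetical seed of 0, the subject identifiers are invented, and "reduced by a factor of 33.3%" is read as removing one third of the current rate at each step (a `keep` multiplier of 2/3). If 33.3% is instead the fraction retained, `keep=1/3` would apply. The base learning rate and patience values are placeholders.

```python
import random


def split_subjects(epileptic, controls, n_test=3, n_val=6, seed=0):
    """Class-balanced split: n_test and n_val subjects per class, rest train.

    The seed and shuffling strategy are assumptions for illustration.
    """
    rng = random.Random(seed)
    splits = {"test": [], "val": [], "train": []}
    for group in (epileptic, controls):
        ids = list(group)
        rng.shuffle(ids)
        splits["test"] += ids[:n_test]
        splits["val"] += ids[n_test:n_test + n_val]
        splits["train"] += ids[n_test + n_val:]
    return splits


def learning_rate(epoch, base_lr, patience, keep=2 / 3):
    """Step schedule: multiply by `keep` at epoch == patience and again at
    epoch == 2 * patience; no further reductions afterwards.

    keep=2/3 is one reading of the 33.3% reduction (an assumption).
    """
    steps = min(epoch // patience, 2)
    return base_lr * keep ** steps


# Hypothetical subject identifiers: 30 per class, as in the study cohort.
epileptic = [f"E{i:02d}" for i in range(30)]
controls = [f"C{i:02d}" for i in range(30)]

splits = split_subjects(epileptic, controls)
sizes = {name: len(ids) for name, ids in splits.items()}  # 6 test, 12 val, 42 train

# Placeholder base_lr and patience; the reductions land at epochs 10 and 20.
schedule = [learning_rate(e, base_lr=0.01, patience=10) for e in (0, 10, 20, 30)]
```

With 30 subjects per class, holding out 3 + 3 for testing and 6 + 6 for validation leaves 21 + 21 subjects for training in each run.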