An automated detection of epileptic seizures EEG using CNN classifier based on feature fusion with high accuracy

Chen, Wenna; Wang, Yixing; Ren, Yuhao; Jiang, Hongwei; Du, Ganqin; Zhang, Jincan; Li, Jinghua

doi:10.1186/s12911-023-02180-w

Research
Open access
Published: 22 May 2023

BMC Medical Informatics and Decision Making volume 23, Article number: 96 (2023)

Abstract

Background

Epilepsy is a neurological disorder that is usually detected from electroencephalogram (EEG) signals. Since manual examination of epileptic seizures is laborious and time-consuming, many automatic epilepsy detection algorithms have been proposed. However, most of the available classification algorithms for epilepsy EEG signals adopt a single type of feature, which results in low classification accuracy. Although a small number of studies have carried out feature fusion, their computational efficiency is reduced by the large number of features, and some poor features also interfere with the classification results.

Methods

In order to solve the above problems, an automatic recognition method of epilepsy EEG signals based on feature fusion and selection is proposed in this paper. Firstly, the Approximate Entropy (ApEn), Fuzzy Entropy (FuzzyEn), Sample Entropy (SampEn), and Standard Deviation (STD) mixed features of the subband obtained by the Discrete Wavelet Transform (DWT) decomposition of EEG signals are extracted. Secondly, the random forest algorithm is used for feature selection. Finally, the Convolutional Neural Network (CNN) is used to classify epilepsy EEG signals.

Results

The empirical evaluation of the presented algorithm is performed on the benchmark Bonn EEG datasets and New Delhi datasets. In the interictal and ictal classification tasks of Bonn datasets, the proposed model achieves an accuracy of 99.9%, a sensitivity of 100%, a precision of 99.81%, and a specificity of 99.8%. For the interictal-ictal case of New Delhi datasets, the proposed model achieves a classification accuracy of 100%, a sensitivity of 100%, a specificity of 100%, and a precision of 100%.

Conclusion

The proposed model can effectively realize the high-precision automatic detection and classification of epilepsy EEG signals. This model can provide high-precision automatic detection capability for clinical epilepsy EEG detection. We hope to provide positive implications for the prediction of seizure EEG.

Introduction

Epilepsy is the second most common neurological disorder after stroke, according to a report from the World Health Organization [1, 2]. People with epilepsy account for about 1% of the world population. Because seizures are unpredictable, epilepsy patients need to take long-term medication, which does great harm to their bodies and minds. Therefore, the analysis and mining of epilepsy features help to achieve early warning of epileptic seizures, which can not only ensure the personal safety of patients but also remind patients to take emergency antiepileptic drugs. The development of the electroencephalogram (EEG) has prompted the emergence of low-cost, high-efficiency EEG recognition technology for epilepsy [3]. The EEG features of epileptic patients and healthy people are quite different. EEG activity in patients with epilepsy is usually divided into interictal and ictal phases, and there are significant differences in EEG features between the two. Having neurosurgeons read EEG signals to determine whether a person has epilepsy is the general approach in the medical community. However, the observation and detection of EEG signals is a time-consuming and laborious task [4]. It not only requires considerable manpower and material resources but also carries a high risk of misdiagnosis. Therefore, the need for automatic detection and classification models for EEG signals is becoming more and more urgent.

In recent years, various automatic detection and classification models have been proposed to realize the automatic diagnosis of epilepsy EEG signals. To extract the features of EEG signals effectively, the signal must first be decomposed. Since the wavelet transform can handle non-stationary and complex signals such as EEG signals, while the traditional Fourier transform used for time–frequency domain analysis can only handle stationary signals, a large number of studies have employed the Discrete Wavelet Transform (DWT) to decompose EEG signals [5,6,7]. Furthermore, analyzing and extracting effective signal features plays an important role in classification and thus in the automatic detection of epilepsy. However, only a single feature was adopted for EEG classification in most of the available studies on epilepsy EEG detection. In general, the features used to detect epilepsy fall into the following categories: the Power Spectral Density Energy Diagram (PSDED), which represents energy analysis [5]; nonlinear features such as Approximate Entropy (ApEn), Distribution Entropy (DistEn), Shannon Entropy (ShanEn), Renyi Entropy (RenEn), and Lempel-Ziv Complexity [8,9,10,11,12,13,14,15]; and Common Spatial Pattern (CSP) algorithms for the spatiotemporal domain [16]. A single EEG feature can only describe part of the EEG characteristics, resulting in poor classification accuracy. The combination of the above features, however, can better reflect the characteristics of EEG signals in epilepsy. For example, some studies combine various nonlinear features such as Hurst Exponent (HE), Kolmogorov Complexity (KC), ShanEn, and Sample Entropy (SampEn) [15, 17, 18], and a fusion of spatial and temporal features could also be performed [19].
However, if too many epileptic EEG features are extracted and fused, computational efficiency may drop and information may become redundant, and some poor features may also interfere with the classification results. Therefore, a small number of studies have performed selection on hybrid features, such as feature selection using genetic algorithms based on the Viral Swarm Particle Optimization (VSPO) technique [20], but the classification accuracy obtained by this method is not high. In addition, given the EEG characteristics of epilepsy, selecting an effective classification model is critical for the automatic detection of epilepsy. With the development of artificial intelligence, machine learning models have been widely used in automatic epilepsy detection, such as Artificial Neural Networks (ANN) [5], Random Forests (RF) [21], and Support Vector Machines (SVM). Although traditional machine learning algorithms such as SVM are widely used, they are more suitable for single-channel and small-sample datasets [13, 20, 22,23,24]. However, when larger data with multiple EEG features are analyzed, deep learning algorithms such as the Convolutional Neural Network (CNN) have obvious advantages over traditional machine learning algorithms [8, 19, 25,26,27,28].

To address the above multi-feature extraction and screening problems, and to take the performance of the classifier into account, an automatic epileptic EEG signal recognition method based on feature fusion and selection is proposed in this paper. Firstly, the EEG signal was decomposed by DWT, and Joint Time–Frequency Analysis (JTFA) and nonlinear analysis were used to extract the hybrid EEG features of epilepsy. Secondly, the random forest algorithm was used to select the most important features. Finally, a CNN was used to classify the EEG signals. The structure of this article is as follows. Previous related works are reviewed and summarized in Section II. Section III presents the datasets used in the experiments and describes the methods and algorithms used to establish the model. Section IV shows the experimental results and analysis. Finally, Section V concludes the paper by summarizing the contributions.

Literature survey

Many automated epileptic EEG signal classification systems using a single feature have emerged in recent years. In EEG signals, features can be divided into time domain, frequency domain, time–frequency domain, and nonlinear features. Nonlinear features are often used in the classification of EEG signals [8, 10, 12, 13]. G. R. Kiranmayi and Udayashankara [8] proposed a method for nonlinear analysis of EEG based on ApEn feature, and the ApEn feature was extracted from the δ, θ, α, β, and γ subbands of healthy EEG, ictal and interictal EEG. Emran Ali et al. [10] analyzed and compared the effectiveness of DistEn, ShanEn, RenEn, and LempelZiv Complexity as classification features of seizures in EEG signals. Si Thu Aung et al. [12] proposed a modified Distribution Entropy (mDistEn) for epilepsy detection and obtained 92% classification accuracy by exploring the advantages of Fuzzy Entropy (FuzzyEn) and DistEn. Deepti Tripathi et al. [13] described the classification of EEG signals into healthy, interictal, and ictal using the EMD-based FuzzyEn method.

Shasha Zhang et al. [26] presented a lightweight solution. For the first stage, Pearson correlation coefficients are computed to obtain the correlation matrix. For the second stage, a simple CNN model was used to classify the correlation matrix to distinguish pre-episode states from inter-episode states with a prediction accuracy of 89.98%.

Aayesha et al. [29] proposed a fuzzy-based seizure detection model that incorporates a new feature extraction and selection method. For the binary classification problem of interictal and ictal periods, the classification accuracy rate of 96.67% was reached.

With the study of EEG characteristics, energy analysis of EEG signals and space–time analysis have emerged [9, 16, 30]. Yunyuan Gao et al. [9] proposed a deep learning-based method for the detection of epileptic EEG signals, where the epilepsy EEG signals were converted into Power Spectral Density Energy Diagrams (PSDED), which were then fed to Deep Convolutional Neural Networks (DCNNs) with transfer learning. N. Sriraam et al. [30] utilized Teager energy features to automatically detect seizures from multichannel EEG recordings and evaluated the performance of a multilayer perceptron neural network classifier using sensitivity, specificity, and false detection rate. Turky N. Alotaiby et al. [16] used the CSP algorithm to extract spatiotemporal domain features from EEG signals for the classification of EEG signals.

Rishabh Bajpai et al. [25] applied the spectrum to convert EEG signals into the image domain. The spectral images were then applied to CNN to learn robust features, which facilitate the automatic detection of pathological and normal EEG signals with experimental accuracy, sensitivity, and specificity of 96.65%, 90.48%, and 100%, respectively.

Zhao and Wang [31] proposed SeizureNet, a CNN-based model for robust seizure detection of EEG signals. Firstly, two convolutional neural networks were employed to extract time-invariant features from single-channel EEG signals. Secondly, the fully connected layer was used to learn the high-level features. Finally, these features were fed to the softmax layer for classification. They evaluated the model on a benchmark database provided by the University of Bonn using tenfold cross-validation, obtaining up to 98.5% accuracy and 97.0% sensitivity for the binary classification task between the interictal and ictal periods.

As seen from the above experiments, the classification accuracy obtained from a single feature is low. Therefore, some other studies performed feature fusion. Many researchers choose to fuse nonlinear features with other features [15, 17, 22]. Mohd Syakir Fathillah et al. [15] combined multiple features such as HE, KC, ShanEn, and SampEn for EEG signals by studying multi-resolution analysis algorithms. Daniel Abásolo et al. [17] analyzed EEG recordings from patients with focal epilepsy using two nonlinear methods, ApEn and Lempel-Ziv complexity. Yanan Lu et al. [22] combined three features to classify single-channel EEG signals for seizure detection: the Kraskov entropy feature based on the Hilbert-Huang Transform (HHT), the instantaneous area of the analytical eigenmode function of EEG signals, and the Kraskov entropy applied to the tunable Q wavelet transform. The Least Squares Support Vector Machine (LS-SVM) classifier was used to classify the multivariate feature combination.

Sharma et al. [23] used the Empirical Mode Decomposition (EMD) method to decompose EEG signals and extracted the Intrinsic Mode Functions (IMFs). The entropy features of different IMFs for focal and nonfocal EEG signals were calculated, namely average Shannon Entropy (ShanEnAvg), average Renyi Entropy (RenEnAvg), average ApEn (ApEnAvg), average Sample Entropy (SampEnAvg) and average phase entropy (S1Avg and S2Avg). These entropies were used as input feature sets for LS-SVM classifiers to classify EEG signals into focal and nonfocal signals, and the model achieved an average classification accuracy of 87%.

In addition, some researchers integrate temporal features with frequency-domain features or spatial features [19, 21, 24]. Hisham Daoud et al. [19] used DCNN and Bi-LSTM networks to learn important spatial and temporal features from raw data, respectively, and used a semi-supervised learning method based on DCAE with transfer learning techniques for binary classification of EEG states. Xiashuang Wang et al. [21] presented an automatic seizure detection model based on multiple time–frequency analysis, which involves a new random forest model combined with grid search optimization. Abeg Kumar Jaiswal et al. [24] proposed an automatic epilepsy detection method for EEG signals based on subpattern Principal Component Analysis (SpPCA) and cross-subpattern correlation Principal Component Analysis (SubXPCA) combined with SVM.

Banupriya and Devi [20] used a genetic algorithm based on the Virus Swarm Particle Optimization (VSPO) technique for feature selection and an SVM for the classification of EEG signals. The experimental results showed that the sensitivity was 98.03%, the specificity was 98.01%, and the accuracy was 98.90%.

Deivasigamani et al. [32] presented a computer-assisted method for automatic detection and classification of focal and nonfocal EEG signals. The Dual-Tree Complex Wavelet Transform (DT-CWT) was used to decompose EEG signals and extract features from the decomposition coefficients. These features were trained and classified using the Adaptive Neuro-Fuzzy Inference System (ANFIS). The classification results reached a sensitivity of 98%, a specificity of 100% and an accuracy of 99%.

Methods and materials

Dataset

Bonn EEG dataset

The dataset used in this study is the epilepsy EEG dataset of the University of Bonn, Germany [33], which was collected from five healthy subjects and five epilepsy patients. It is a single-channel EEG dataset containing five subsets (Set A ~ Set E). Each subset contains 100 data segments of the same type, and each data segment contains 4097 sample points. Each data segment has a duration of 23.6 s at a sampling frequency of 173.61 Hz, and the artifacts have been removed by manual filtering of 0.53 ~ 40 Hz. The electrodes for Set A and Set B were located on the scalp, and these subsets contain the EEG data of 5 healthy subjects with eyes open and eyes closed, respectively. The EEG data of Set C and Set D were obtained from 5 epilepsy patients in the interictal period; the electrodes for Set C were located in the region contralateral to the lesion, and the electrodes for Set D were located in the lesion area. The electrodes for Set E were also located in the lesion area, and this subset contains the EEG data of 5 epilepsy patients during the ictal period.

Figure 1 shows the visual graphics of the data fragments of Group 1 in each subset, where the horizontal axis represents the number of samples of EEG time series and the vertical axis represents the sample value. It can be seen that there are some differences in the 5 types of EEG signal waveforms. Due to the presence of feature waves for epileptic EEG signals, the EEG signal amplitude of Set E is significantly larger than those of the other four groups.

New Delhi EEG dataset

This dataset consists of exemplary segmented EEG time series recordings of 10 epilepsy patients from the Neurology & Sleep Centre, Hauz Khas, New Delhi. The recordings were acquired using the Grass Telefactor Comet AS40 amplification system at a sampling frequency of 200 Hz. Gold-plated scalp EEG electrodes were placed according to the 10–20 electrode placement system at the time of acquisition. The acquired EEG signal was filtered by a band-pass filter from 0.5 Hz ~ 70 Hz. There are three states, preictal, interictal and ictal, which are stored as MAT files. Each EEG state contains 50 MAT files, and each MAT file consists of 1024 samples of one EEG time series with a duration of 5.12 s.
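
The segment durations quoted for both datasets follow directly from the number of samples and the sampling frequency. The short Python check below is only an illustration; the helper name `segment_duration` is hypothetical and not part of the original work.

```python
def segment_duration(n_samples: int, fs_hz: float) -> float:
    """Duration in seconds of a segment with n_samples points sampled at fs_hz."""
    return n_samples / fs_hz


# Figures taken from the dataset descriptions above.
bonn = segment_duration(4097, 173.61)       # Bonn: 4097 samples at 173.61 Hz
new_delhi = segment_duration(1024, 200.0)   # New Delhi: 1024 samples at 200 Hz

print(f"Bonn segment:      {bonn:.2f} s")
print(f"New Delhi segment: {new_delhi:.2f} s")
```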

Research methods

The study process in this paper is divided into four steps. Firstly, it is necessary to preprocess the EEG signal data, where it is filtered through the bandpass filter, and then DWT is used to decompose and reconstruct the wavelet to realize wavelet denoising. Secondly, feature extraction is performed. The reconstructed EEG signal is decomposed again, which is divided into five subbands D1, D2, D3, D4, and A4, and four types of features including time domain standard deviation (STD), ApEn, FuzzyEn, and SampEn are extracted from the above five subbands. Thirdly, feature selection is carried out. The random forest algorithm is adopted to evaluate the importance of features and select the most important 10 features. Finally, the fourth step is classification, using CNN to classify EEG signals. The method block diagram is shown in Fig. 2.

Data preprocessing

In order to improve the accuracy of subsequent feature extraction and classification, it is necessary to filter and denoise the EEG signals. The characteristic waves of epilepsy EEG signals cover the 0 ~ 80 Hz frequency band, while the sampling frequency of the experimental dataset is 173.61 Hz, so a 4th-order Butterworth bandpass filter is used to obtain an EEG signal of 0.01 Hz ~ 86.8 Hz. The filtered EEG was decomposed using the "db4" wavelet basis function, and a threshold was selected for denoising. Then the denoised subbands were reconstructed to obtain the filtered, denoised EEG.

The Fourier transform, which is traditionally used for the Joint Time–Frequency Analysis of signals, can only process stationary signals, while wavelet transforms can process non-stationary complex signals such as EEG signals. Therefore, both the preprocessing (denoising) and the decomposition of the EEG signals were realized using DWT in this paper. The EEG signal was decomposed by multi-level wavelet decomposition, and the approximation coefficients and detail coefficients of the signal at various scales were obtained.
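
To make the idea of multi-level decomposition concrete, the sketch below implements the recursion in plain Python. It deliberately uses the Haar wavelet rather than the "db4" basis used in this paper, because Haar needs only two filter taps; the function names `haar_step` and `wavedec` are illustrative choices, not the authors' code.

```python
import math


def haar_step(signal):
    """One DWT level with the Haar wavelet: returns (approximation, detail)."""
    if len(signal) % 2:                      # pad odd-length input with its last sample
        signal = signal + [signal[-1]]
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def wavedec(signal, levels):
    """Multi-level decomposition: detail bands D1..D<levels> plus A<levels>."""
    bands = {}
    approx = list(signal)
    for level in range(1, levels + 1):
        # Only the approximation is split again at the next level.
        approx, detail = haar_step(approx)
        bands[f"D{level}"] = detail
    bands[f"A{levels}"] = approx
    return bands


# Toy input: a 64-sample sine wave (not EEG data).
x = [math.sin(2 * math.pi * 3 * n / 64) for n in range(64)]
bands = wavedec(x, 4)
print({name: len(coeffs) for name, coeffs in bands.items()})
```

Each level halves the number of coefficients, and because the Haar transform is orthonormal the total energy of the coefficients equals the energy of the input.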

It is assumed that the function Ψ(t) is square-integrable, denoted as Ψ(t) ∈ L²(R), where L²(R) represents the space of square-integrable functions on the real numbers. Its Fourier transform Ψ(ω) satisfies the following admissibility condition:

$${C}_{\Psi }={\int }_{-\infty }^{+\infty }\frac{{\left|\Psi \left(\omega \right)\right|}^{2}}{|\omega |}\mathrm{d}\omega <\infty$$

(1.)

The continuous wavelet function Ψ_s,τ(t) is obtained from the fundamental wavelet Ψ(t) by scaling and translation, which is expressed as:

$${\Psi }_{s,\tau }\left(t\right)=\frac{1}{\sqrt{s}}\Psi \left(\frac{t-\tau }{s}\right),\quad s>0,\ \tau \in R$$

(2.)

where s is the scale factor, τ is the translation factor, and R represents the set of real numbers.

Next, the discretizations of the scale factor and translation factor are performed. Assuming that s = 2^−j and τ = k2^−j, where j and k are the size of the scaling and the translation scale, respectively, and the values of j and k are integers. And then, the expression of the discrete wavelet function for the Ψ(t) can be written as:

$${\Psi }_{{2}^{-j},k{2}^{-j}}\left(t\right)={2}^{j/2}\Psi \left({2}^{j}t-k\right)$$

(3.)
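
As a quick check, Eq. (3) can be recovered by substituting the dyadic scale s = 2^−j and translation τ = k2^−j into the usual dilated and translated form of the continuous wavelet. That starting form, written on the first line, is the standard one and is assumed here:

$$\begin{aligned}{\Psi }_{s,\tau }\left(t\right)&=\frac{1}{\sqrt{s}}\Psi \left(\frac{t-\tau }{s}\right)\\ {\Psi }_{{2}^{-j},k{2}^{-j}}\left(t\right)&=\frac{1}{\sqrt{{2}^{-j}}}\Psi \left(\frac{t-k{2}^{-j}}{{2}^{-j}}\right)={2}^{j/2}\Psi \left({2}^{j}t-k\right)\end{aligned}$$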

For any function f(t), the DWT can be expressed as:

$${W}_{\Psi }f(j,k)={2}^{j/2}{\int }_{-\infty }^{+\infty }f\left(t\right){\Psi }^{*}\left({2}^{j}t-k\right)\mathrm{d}t$$

(4.)

In this study, the input signal passes through the low-pass filter G(n) and the high-pass filter H(n), both of which have a cut-off frequency of one-quarter of the sampling frequency. In the first step of DWT decomposition, the low-frequency approximation coefficient A1 and the detail coefficient D1 are obtained, and the output A1 is then fed to another pair of quadrature mirror filters. By repeating the same process, the approximation and detail coefficients for the next level are obtained. Considering that the frequency band above 80 Hz may not contain the characteristic waves of epileptic EEG, the "db4" wavelet basis function was used to perform a 4-level decomposition of EEG signals. Figure 3 illustrates the 4-level decomposition of EEG signals. The subband frequencies of A1, D1, A2, D2, A3, D3, A4, and D4 are 0 ~ f_s/4, f_s/4 ~ f_s/2, 0 ~ f_s/8, f_s/8 ~ f_s/4, 0 ~ f_s/16, f_s/16 ~ f_s/8, 0 ~ f_s/32, and f_s/32 ~ f_s/16, respectively, where f_s is the sampling frequency of the data set used, 173.61 Hz.
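
The nominal band edges can be tabulated from the sampling frequency alone. The following Python sketch assumes ideal half-band filters at every level, which real "db4" filters only approximate; `dwt_subbands` is a hypothetical helper name.

```python
def dwt_subbands(fs_hz, levels):
    """Nominal frequency range (Hz) of each DWT subband, assuming ideal half-band filters."""
    bands = {}
    upper = fs_hz / 2                         # Nyquist frequency
    for level in range(1, levels + 1):
        lower = upper / 2
        bands[f"D{level}"] = (lower, upper)   # detail keeps the upper half of the band
        upper = lower                         # approximation keeps the lower half
    bands[f"A{levels}"] = (0.0, upper)
    return bands


subbands = dwt_subbands(173.61, 4)
for name, (low, high) in subbands.items():
    print(f"{name}: {low:.2f} ~ {high:.2f} Hz")
```

For the Bonn sampling frequency this gives an A4 band that ends at f_s/32, roughly 5.4 Hz, and a D1 band that ends at the Nyquist frequency of about 86.8 Hz.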

Feature extraction

Firstly, DWT wavelet decomposition is performed on the filtered denoised EEG signals. The "db4" was selected as the wavelet basis function, and the 4-stage decomposition was used to obtain five subbands of D1 ~ D4 and A4. Then, the multiple features of the STD, SampEn, FuzzyEn, and ApEn were extracted from the EEG signals of the above five subbands. The extracted 20 EEG features are shown in Table 1.

Table 1 EEG features of epilepsy

Nonlinear features

With an in-depth understanding of EEG signals, it is generally believed that human EEG signals are nonlinear random signals in the field of bioelectric signals, and their nonlinear features can better characterize EEG signals. Entropy is a physical quantity that can characterize EEG complexity. Studies have shown that the uncertainty of EEG signals during the ictal phase is significantly reduced, so it is necessary to characterize the features of EEG signals using entropy. ApEn was developed on the basis of Kolmogorov-Sinai entropy and was proposed by Pincus in 1991 [34]. ApEn predicts the amplitude of the future signal based on the known signal amplitude, which can be used to describe the uncertainty or randomness of the signal. SampEn was proposed by Richman et al. [35]. SampEn is similar to ApEn in physical meaning, but it overcomes three shortcomings of ApEn: (1) SampEn removes self-matches from the data; (2) SampEn obtains the total number of matched templates before the logarithmic operation; (3) for embedding dimension m, the reconstructed time series in SampEn has N − m rows instead of the N − m + 1 rows of ApEn, so that the numbers of patterns in embedding dimensions m and m + 1 are equal. FuzzyEn characterizes the occurrence probability of a new pattern: the larger the measured value, the greater the occurrence probability of a new pattern, that is, the greater the complexity of the sequence.
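
The differences between ApEn and SampEn described above can be seen in a direct implementation. The sketch below is a plain-Python illustration under common default assumptions that this paper does not state: embedding dimension m = 2, tolerance r = 0.2 times the standard deviation, and the Chebyshev distance between templates. All function names are illustrative.

```python
import math
import random


def _std(x):
    mean = sum(x) / len(x)
    return math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))


def _dist(a, b):
    """Chebyshev distance between two templates of equal length."""
    return max(abs(u - v) for u, v in zip(a, b))


def approximate_entropy(x, m=2, r_factor=0.2):
    """ApEn: self-matches are counted and the log is taken per template."""
    n, r = len(x), r_factor * _std(x)

    def phi(length):
        templates = [x[i:i + length] for i in range(n - length + 1)]
        logs = [
            math.log(sum(_dist(t, u) <= r for u in templates) / len(templates))
            for t in templates
        ]
        return sum(logs) / len(logs)

    return phi(m) - phi(m + 1)


def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn: no self-matches, N - m templates for both lengths, one log at the end."""
    n, r = len(x), r_factor * _std(x)

    def matches(length):
        templates = [x[i:i + length] for i in range(n - m)]
        return sum(
            _dist(templates[i], templates[j]) <= r
            for i in range(len(templates))
            for j in range(i + 1, len(templates))
        )

    b, a = matches(m), matches(m + 1)
    return float("inf") if a == 0 or b == 0 else -math.log(a / b)


# Toy signals (not EEG): a periodic sine and seeded Gaussian noise.
rng = random.Random(0)
regular = [math.sin(2 * math.pi * i / 20) for i in range(200)]
noisy = [rng.gauss(0.0, 1.0) for _ in range(200)]

print("ApEn   regular / noisy:", approximate_entropy(regular), approximate_entropy(noisy))
print("SampEn regular / noisy:", sample_entropy(regular), sample_entropy(noisy))
```

Both measures come out lower for the periodic signal than for the noise, which matches the interpretation of entropy as a measure of signal irregularity.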

Standard Deviation (STD)

The STD is a simple, easily computed time–frequency feature that achieves a good recognition effect, so it is also applied to EEG signals in this paper. The STD σ is defined as:

$$\sigma =\sqrt{\frac{{\sum_{i=1}^{N}\left({x}_{i}-\overline{x }\right)}^{2}}{N}}$$

(5.)

where x̄ represents the average of x_i, N is the total number of samples, and x_i is the i-th sample value.
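
As a small worked example of Eq. (5), the Python snippet below computes the STD with division by N (the population form used in the equation, not the N − 1 sample form). The input values are made-up numbers chosen so that the result is easy to verify by hand.

```python
import math


def population_std(x):
    """Standard deviation as in Eq. (5): divide by N, not N - 1."""
    mean = sum(x) / len(x)
    return math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))


# Mean is 5, squared deviations sum to 32, so the variance is 4.
values = [2, 4, 4, 4, 5, 5, 7, 9]
sigma = population_std(values)
print(sigma)
```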

Feature selection

In this paper, the random forest algorithm was used to evaluate the importance of the 20 extracted EEG features and to sort them in descending order. In each round, the feature ranked last by importance was removed. The resulting feature set was then re-evaluated in the same way, and the process was repeated until only the 10 features with the highest importance were left.
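
The elimination loop itself is independent of how importance is computed. The Python sketch below shows only that loop; the scoring function is a fixed toy lookup standing in for a retrained random forest, and the feature labels such as `STD_D1` are hypothetical names for the 4 feature types × 5 subbands.

```python
def backward_elimination(feature_names, importance_fn, keep):
    """Repeatedly drop the least important feature until `keep` features remain."""
    selected = list(feature_names)
    while len(selected) > keep:
        scores = importance_fn(selected)      # re-score on the current subset
        ranked = sorted(selected, key=lambda f: scores[f], reverse=True)
        selected = ranked[:-1]                # remove the lowest-ranked feature
    return selected


# Hypothetical labels for the 20 features.
kinds = ["STD", "ApEn", "FuzzyEn", "SampEn"]
subband_names = ["D1", "D2", "D3", "D4", "A4"]
features = [f"{k}_{s}" for k in kinds for s in subband_names]

# Toy stand-in for random forest importance: distinct, arbitrary scores 0..19.
toy_scores = {name: (7 * i) % 20 for i, name in enumerate(features)}


def toy_importance(selected):
    return {name: toy_scores[name] for name in selected}


top10 = backward_elimination(features, toy_importance, keep=10)
print(top10)
```

In a real pipeline `importance_fn` would retrain the forest on the remaining features in every round, so the ranking can change as features are removed.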

As an ensemble learning algorithm, Random Forests (RF) uses decision trees as the basic unit. RF builds its decision trees on the basis of Bagging and is an improved version of the Bagging algorithm. The trees in an RF are trained independently of each other, which makes training efficient. RF also retains the advantages of the Classification and Regression Tree (CART) algorithm, which uses the Gini coefficient to select the optimal features and split points, and overcomes the disadvantage of CART, which requires a fully grown tree. The operating principle of random forest is shown in Fig. 4.

For RF, K bootstrap samples are drawn from the dataset, and each sample has N features. Then a decision tree is built for each of the K samples, and the k-th decision tree is labeled T_k. The tree T_k trained on the k-th bootstrap sample is used to calculate the classification accuracy L_k^OOB on the k-th out-of-bag (OOB) data. The feature X_j (j = 1, 2, …, N) in the OOB data is then randomly permuted, and the classification accuracy L_k,j^OOB is calculated again. The above process is repeated for k = 2, 3, …, K in order. The importance P_j of the feature X_j is calculated by the following equation.

$${P}_{j}=\frac{1}{K}\sum_{k=1}^{K}\left({L}_{k}^{OOB}-{L}_{k,j}^{OOB}\right)$$

(6.)

Finally, they are ranked according to their importance and the features with the lowest importance are excluded.
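
The permutation-based importance of Eq. (6) can be sketched without any machine learning library. In the Python example below the "trees" are replaced by a trivial threshold rule on feature 0, and the out-of-bag sets are synthetic; both are assumptions made purely so the example is self-contained, and all names are illustrative.

```python
import random


def accuracy(model, X, y):
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(models, oob_sets, j, rng):
    """Eq. (6): mean drop in OOB accuracy after shuffling feature j, over K models."""
    total = 0.0
    for model, (X, y) in zip(models, oob_sets):
        baseline = accuracy(model, X, y)                 # accuracy before shuffling
        column = [row[j] for row in X]
        rng.shuffle(column)                              # permute feature j only
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        total += baseline - accuracy(model, X_perm, y)   # accuracy drop
    return total / len(models)


rng = random.Random(0)


def make_oob(n=40):
    # Feature 0 determines the label; feature 1 is pure noise.
    X = [[1.0 if i % 2 else -1.0, rng.random()] for i in range(n)]
    y = [int(row[0] > 0) for row in X]
    return X, y


def stump(row):
    """Toy stand-in for a decision tree: looks at feature 0 only."""
    return int(row[0] > 0)


models = [stump] * 3                      # K = 3
oob_sets = [make_oob() for _ in range(3)]

p0 = permutation_importance(models, oob_sets, 0, rng)
p1 = permutation_importance(models, oob_sets, 1, rng)
print("importance of feature 0:", p0)
print("importance of feature 1:", p1)
```

Shuffling the informative feature lowers accuracy, while shuffling the noise feature leaves every prediction unchanged, so its importance is exactly zero.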

Classification

In this work, the CNN takes the selected feature data as an input of size 10 × 1 × 1. The first convolutional layer has 16 filters of size 2 × 1 with a stride of 1. After the first convolutional layer, batch normalization and max-pooling are performed, the latter using a 2 × 1 window with a stride of 1. The second convolutional layer has 32 filters of size 2 × 1 with a stride of 1 and is likewise followed by batch normalization and max-pooling with a 2 × 1 window and a stride of 1. After the two convolutional layers, there are two fully connected layers that use softmax as the activation function. Adaptive Moment Estimation (Adam) is used to learn the parameters of the CNN. The dataset used in the experiment was divided into a training set and a test set with a ratio of 3:1, and the CNN classifier was used to classify the selected feature data. The CNN architecture diagram is shown in Fig. 5.
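
The layer sizes above determine how the length of the feature vector shrinks through the network. The Python sketch below walks through that arithmetic under the assumption of no padding, which the text does not specify; with "same" padding the lengths would differ. The layer names are illustrative.

```python
def out_len(n, kernel, stride=1, padding=0):
    """Output length of a 1-D convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1


# (name, kernel size, stride, number of channels after the layer)
layers = [
    ("conv1", 2, 1, 16),
    ("pool1", 2, 1, 16),
    ("conv2", 2, 1, 32),
    ("pool2", 2, 1, 32),
]

length = 10                     # the 10 selected features
shapes = []
for name, kernel, stride, channels in layers:
    length = out_len(length, kernel, stride)
    shapes.append((name, length, channels))
    print(f"{name}: {length} x 1 x {channels}")

flattened = length * layers[-1][3]
print("inputs to the first fully connected layer:", flattened)
```

Batch normalization is omitted from the walk because it does not change the tensor shape.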

The convolutional layer consists of several convolutional units, and the parameters of each convolutional unit are optimized by a backpropagation algorithm. The different features of the input are extracted by convolution, which is calculated as follows.

$${H}_{i,j}=f{\left(C{D}^{k}*x\right)}_{i,j}+{a}_{k}$$

(7.)

where f is the activation function, D^k is the k-th convolution kernel, a_k is the bias added to the output of the k-th convolution kernel, and x is the input to the convolution.
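
A one-dimensional version of this operation is easy to write out. The sketch below applies the conventional ordering, kernel response plus bias followed by the activation, and uses ReLU as the activation; both the ordering and the choice of ReLU are assumptions, since the text names neither explicitly. The input values and kernel are toy numbers.

```python
def relu(v):
    return max(0.0, v)


def conv1d(x, kernel, bias, activation=relu):
    """Valid 1-D convolution (cross-correlation form, as in most CNN libraries), stride 1."""
    k = len(kernel)
    return [
        activation(sum(kernel[t] * x[i + t] for t in range(k)) + bias)
        for i in range(len(x) - k + 1)
    ]


x = [1.0, 3.0, 2.0, 5.0, 4.0]
feature_map = conv1d(x, kernel=[1.0, -1.0], bias=0.0)
print(feature_map)   # [0.0, 1.0, 0.0, 1.0]
```

With a kernel of size 2 and stride 1, an input of length n yields n − 1 outputs, which is why each convolutional layer in this architecture shortens the feature vector by one when no padding is used.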

The pooling layer, also called the downsampling layer, mainly subsamples the feature maps learned in the convolutional layer, which reduces the input dimension of the subsequent network layers, and improves the computational accuracy.

The average pooling can be expressed as:

$$y\left(x\right)=\frac{1}{k\times k}\sum_{i={i}_{1}}^{{i}_{1}+k}\sum_{j={j}_{1}}^{{j}_{1}+k}{x}_{i,j}$$

(8.)

The max pooling is given as:

$$y\left(x\right)=\max\left({x}_{[i,i+k][j,j+k]}\right)$$

(9.)
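
Both pooling operations can be illustrated in one dimension, which matches the 2 × 1 pooling window used in this network. The Python sketch below is a simplified illustration with toy input values; `pool1d` and `mean` are hypothetical helper names.

```python
def pool1d(x, k, stride, reduce_fn):
    """Slide a window of size k over x and reduce each window with reduce_fn."""
    return [reduce_fn(x[i:i + k]) for i in range(0, len(x) - k + 1, stride)]


def mean(window):
    return sum(window) / len(window)


x = [1.0, 3.0, 2.0, 5.0, 4.0]
max_pooled = pool1d(x, k=2, stride=1, reduce_fn=max)
avg_pooled = pool1d(x, k=2, stride=1, reduce_fn=mean)
print(max_pooled)   # [3.0, 3.0, 5.0, 5.0]
print(avg_pooled)   # [2.0, 2.5, 3.5, 4.5]
```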

The fully connected layer uses softmax as its activation function. Its activation values are the features extracted by the convolutional neural network: the features learned by the convolutional and pooling layers are weighted, fused, and mapped to the sample label space.

Results and discussion

Evaluation metrics

To evaluate the performance of the model, Accuracy, Sensitivity, Specificity, and Precision metrics are used in this paper. The indicators are calculated as follows:

$$Accuracy=\frac{TP+TN}{TP+FN+FP+TN}$$

(10.)

$$Sensitivity=\frac{TP}{TP+FN}$$

(11.)

$$Specificity=\frac{TN}{TN+FP}$$

(12.)

$$Precision=\frac{TP}{TP+FP}$$

(13.)

where TN is the number of true negatives, i.e., samples that are actually negative and are predicted to be negative; FP is the number of false positives, i.e., samples that are actually negative but are predicted to be positive; FN is the number of false negatives, i.e., samples that are actually positive but are predicted to be negative; and TP is the number of true positives, i.e., samples that are actually positive and are predicted to be positive.
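
Eqs. (10) to (13) translate directly into code. The Python sketch below uses made-up confusion-matrix counts purely for illustration; they are not results from this paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and precision from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


# Hypothetical counts for a 100-sample test set.
metrics = classification_metrics(tp=48, tn=45, fp=5, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```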

Experimental results

In order to extract the features of the EEG signals effectively, wavelet decomposition was carried out for Set A, Set B, Set C, Set D, and Set E of the Bonn EEG dataset. Taking the ictal Set E subset as an example, the DWT was adopted to perform a 4-level wavelet decomposition. The subband waveforms of Set E decomposed by DWT are shown in Fig. 6, where the horizontal axis represents the sample index of the EEG time series and the vertical axis represents the sample value. The subband frequencies of A4, D4, D3, D2 and D1 are 0 ~ 5.4 Hz, 5.4 ~ 10.8 Hz, 10.8 ~ 21.7 Hz, 21.7 ~ 43.4 Hz, and 43.4 ~ 86.8 Hz, respectively. Then, the effective features for all the subbands were extracted and analyzed. For brevity, only the analysis of the ApEn, FuzzyEn, SampEn, and STD features of the D1 subband is given in this paper. The extracted features for the D1 subband are shown in Figs. 7 and 8, where the horizontal axis represents each data segment and the vertical axis represents the feature value of each data segment. For the D1 subband, there are significant differences in the amplitudes of the above four features. The amplitudes of the four features of the D1 subband in the interictal Set D are significantly lower than those in Set A and Set E, so the classification effect could be greatly improved by using the four features to separate the interictal period from the ictal period or from healthy subjects. For Set A and Set E, the FuzzyEn and STD feature amplitudes are quite different, so these two features can play very important roles in the classification of Set A and Set E. For the ApEn and SampEn features, most of the feature amplitudes for Set E are lower than those for Set A, although there is a small overlap. Therefore, it is necessary to use random forest-based feature selection to remove the poor features, and the adopted 10 features with the highest importance are shown in Table 2.

Table 2 The adopted 10 most important features

To show more intuitively that feature selection is essential, a comparative experiment before and after feature selection was carried out. The data was divided into a training set and a test set with a ratio of 3:1, and MATLAB R2019a was employed to construct and simulate the model. For the classification between Set D and Set E EEG signals, the accuracies obtained by the CNN, SVM, and genetic-algorithm-optimized BP neural network (GA-BP) classifiers without feature screening are 98.2%, 94.7%, and 97.5%, respectively. When the 10 features obtained from the screening are used for classification, the accuracies of the CNN, SVM, and GA-BP classifiers improve to 99.2%, 96%, and 97.9%, respectively. Figure 9 shows the comparison for the classification between Set D and Set E EEG signals. It can be seen that for CNN, SVM, and GA-BP, using the random forest algorithm to screen features by importance improves classification accuracy. In particular, the classification accuracy of the CNN algorithm improves to 99.2% after feature selection.
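
A 3:1 split of the Set D and Set E segments can be sketched as follows. The experiments in this paper were run in MATLAB, and the text does not say whether the split was random or stratified, so the seeded random shuffle in this Python example is an assumption; `train_test_split` here is a hypothetical local helper, not a library function.

```python
import random


def train_test_split(items, train_parts=3, test_parts=1, seed=0):
    """Shuffle the items and split them train:test in the given ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = len(items) * train_parts // (train_parts + test_parts)
    return items[:cut], items[cut:]


# 100 segments each from Set D (interictal) and Set E (ictal), identified by index.
segments = [("D", i) for i in range(100)] + [("E", i) for i in range(100)]
train, test = train_test_split(segments)
print(len(train), len(test))   # 150 50
```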

To verify the superiority of the CNN algorithm in classification applications, SVM and GA-BP classifiers were employed alongside the CNN classifier. Table 3 describes the classification tasks considered in this study, and the Accuracy, Sensitivity, Specificity, and Precision of the binary classification tasks are listed in Table 4. Moreover, to show the superiority of the CNN classifier visually, Figs. 10, 11, 12, and 13 show the Accuracy, Sensitivity, Specificity, and Precision, respectively, where the horizontal axis represents the different cases and the vertical axis represents the value of the index. The experimental results show that the classification accuracy obtained by combining RF with CNN is higher than that obtained by combining RF with SVM or GA-BP.
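For reference, the four indices reported in Table 4 can be computed from the counts of a binary confusion matrix using their standard definitions. The sketch below assumes the ictal class is treated as positive and uses made-up counts purely for illustration; the counts are not taken from the paper, and `binary_metrics` is a hypothetical helper name.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard binary indices from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. both classes are present
    and at least one sample is predicted positive.
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),    # positive predictive value
    }


# Hypothetical counts for illustration only (not results from the paper).
example = binary_metrics(tp=95, fn=5, fp=2, tn=98)
for name, value in example.items():
    print(f"{name}: {value:.2%}")
```

Note that sensitivity can reach 100% while precision and specificity stay below it, which happens whenever no seizure segment is missed but some non-seizure segments are flagged.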

Table 3 The specific classification tasks


Table 4 Accuracy, sensitivity, specificity, and precision results of SVM, CNN, and GA-BP classifiers


Discussion

To verify the advantages of the proposed model over other classification techniques, we compare the results obtained by other methods with those of our proposed method in Table 5. To make the results more comparable, only results obtained on the same dataset and for similar cases are listed. The classification model proposed in this paper achieves an accuracy of 99.9%, a sensitivity of 100%, a precision of 99.81%, and a specificity of 99.8% in the binary classification task of interictal and ictal periods on the Bonn EEG datasets. On the New Delhi EEG datasets, it achieves an accuracy of 100%, a sensitivity of 100%, a precision of 100%, and a specificity of 100% in the binary classification task of interictal and ictal periods. The RF + CNN algorithm applied to mixed features in this paper thus represents a noteworthy improvement over state-of-the-art methods. For the classification of Set D and Set E, Wang et al. [21] used Short-Time Fourier Transform (STFT), average energy, and Principal Component Analysis (PCA) features as the basis for classification, and applied random forest with grid search optimization (RF + GSO) to the extracted features. Jaiswal and Banka [24], Riaz et al. [6], and Tripathi and Agrawal [13] proposed automatic classification techniques based on SVM. Xin et al. [36] proposed an Attention Mechanism-based Wavelet Convolution Neural Network (AMWCNN) for epilepsy EEG classification. However, our proposed RF + CNN model outperforms these methods. Jiang et al. [37] used Wavelet Packet Decomposition (WPD) to extract features from EEG and adopted a Takagi-Sugeno-Kang (TSK) classifier to classify epileptic status. Lu et al. [22] proposed Kraskov entropy and instantaneous area as features to classify interictal and ictal signals using the LS-SVM classifier. Al-Hadeethi et al. [38] proposed a method in which multiple time-domain features are combined with the Kolmogorov-Smirnov Test (KST) for feature selection and AdaBoost is used for classification. Although these studies on the classification of interictal and ictal signals have yielded encouraging results, their classification accuracy is lower than that of the model we propose.

Table 5 Comparison of seizure detection methods using the benchmark Bonn EEG dataset


Conclusion

Accurate classification may reduce the damage caused by seizures. In this paper, we propose a novel method for classifying epileptic EEG signals that combines multi-feature extraction with RF and CNN to classify different epileptic states (i.e., nonictal, preictal, interictal, and ictal). The method is verified on the multichannel EEG signals in the Bonn database and the New Delhi database. The study leads to the following conclusions: (1) the proposed EEG signal classification method outperforms other benchmark models in classifying different epileptic states. For the C-E case, the proposed model achieves a classification accuracy of 99.9%, a sensitivity of 100%, a specificity of 99.80%, and a precision of 99.81%. For the interictal-ictal case of the New Delhi datasets, the proposed model achieves a classification accuracy of 100%, a sensitivity of 100%, a specificity of 100%, and a precision of 100%. (2) The proposed method can extract multiple features from EEG signals. (3) The RF + CNN model can be used to rank the extracted EEG features according to their importance and perform feature selection, so as to achieve higher classification accuracy. In medicine, the proposed EEG classification method has important practical significance for the diagnosis and treatment of epilepsy. For patients, highly accurate classification of epileptic states (i.e., interictal, ictal) from EEG signals can provide reliable and timely early warning. For doctors, it can help them understand the epileptic state of their patients, so that epilepsy can be prevented and treated more effectively.

Thus, this work addresses one important challenge: accurately classifying epileptic states from multi-feature EEG signals. In future research, we aim to improve EEG classification methods in the following ways to better serve the prevention and treatment of epilepsy: (1) the proposed EEG classification model will be used to detect seizures; (2) by exploiting the temporal correlation between EEG signal frames, the false detection of seizures may be further reduced, although further studies are needed.

Availability of data and materials

Publicly available datasets were analyzed in this study and can be freely accessed. The datasets used in our work can be found at the following link: https://github.com/RYH2077/EEG-Epilepsy-Datasets. Further inquiries can be directed to the corresponding author.

References

1. Iasemidis LD. Seizure prediction and its applications. Neurosurg Clin N Am. 2011;22:489–506.
2. Iasemidis LD. Epileptic seizure prediction and control. IEEE Trans Biomed Eng. 2003;50:549–58.
3. Kurup D, Gururangan K, Desai MJ, Markert MS, Eliashiv DS, Vespa PM, et al. Comparing seizures captured by rapid response EEG and conventional EEG recordings in a multicenter clinical study. Front Neurol. 2022;13:915385.
4. Yıldırım Ö, Baloglu UB, Acharya UR. A deep convolutional neural network model for automated identification of abnormal EEG signals. Neural Comput Appl. 2020;32:15857–68.
5. AlSharabi K, Ibrahim S, Djemal R, Alsuwailem A. A DWT-Entropy-ANN based architecture for epilepsy diagnosis using EEG signals. In: 2016 2nd International Conference on Advanced Technologies for Signal and Image Processing (ATSIP). New York: IEEE; 2016. p. 283–6.
6. Riaz F, Hassan A, Rehman S, Niazi IK, Dremstrup K. EMD-based temporal and spectral features for the classification of EEG signals using supervised learning. IEEE Trans Neural Syst Rehabil Eng. 2016;24:28–35.
7. Hemachandira VS, Viswanathan R. A framework on performance analysis of mathematical model-based classifiers in detection of epileptic seizure from EEG signals with efficient feature selection. J Healthc Eng. 2022;2022:1–12.
8. Kiranmayi GR, Udayashankara V. EEG subband analysis using approximate entropy for the detection of epilepsy. IOSR J Comput Eng. 2014;16(5):21–7.
9. Gao Y, Gao B, Chen Q, Liu J, Zhang Y. Deep convolutional neural network-based epileptic electroencephalogram (EEG) signal classification. Front Neurol. 2020;11:375.
10. Ali E, Udhayakumar RK, Angelova M, Karmakar C. Performance analysis of entropy methods in detecting epileptic seizure from surface electroencephalograms. In: 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). New York: IEEE; 2021. p. 1082–5.
11. Vavadi H, Ayatollahi A, Mirzaei A. A wavelet-approximate entropy method for epileptic activity detection from EEG and its sub-bands. J Biomed Sci Eng. 2010;13:1182–9.
12. Aung ST, Wongsawat Y. Modified-distribution entropy as the features for the detection of epileptic seizures. Front Physiol. 2020;11:607.
13. Tripathi D, Agrawal N. Epileptic seizure detection using empirical mode decomposition based fuzzy entropy and support vector machine. In: International Conference on Green and Human Information Technology. 2018.
14. Raghu S, Sriraam N, Kumar GP. Classification of epileptic seizures using wavelet packet log energy and norm entropies with recurrent Elman neural network classifier. Cogn Neurodyn. 2017;11:51–66.
15. Fathillah MS, Jaafar R, Chellappan K, Remli R, Zainal W. Multiresolution analysis on nonlinear complexity measurement of EEG signal for epileptic discharge monitoring. Malays J Fundam Appl Sci. 2018;14:219–25.
16. Alotaiby TN, Abd El-Samie FE, Alshebeili SA, Aljibreen KH, Alkhanen E. Seizure detection with common spatial pattern and support vector machines. In: 2015 International Conference on Information and Communication Technology Research (ICTRC). New York: IEEE; 2015. p. 152–5.
17. Abásolo D, James CJ, Hornero R. Non-linear analysis of intracranial electroencephalogram recordings with approximate entropy and Lempel-Ziv complexity for epileptic seizure detection. In: International Conference of the IEEE Engineering in Medicine & Biology Society. 2007. p. 1953–6.
18. Namazi H, Kulish VV, Hussaini J, Hussaini J, Delaviz A, Delaviz F, et al. A signal processing based analysis and prediction of seizure onset in patients with epilepsy. Oncotarget. 2016;7:342–50.
19. Daoud H, Bayoumi MA. Efficient epileptic seizure prediction based on deep learning. IEEE Trans Biomed Circuits Syst. 2019;13:804–13.
20. Banupriya C, Devi A. Robust optimization of electroencephalograph (EEG) signals for epilepsy seizure prediction by utilizing VSPO genetic algorithms with SVM and machine learning methods. Indian J Sci Technol. 2021;14:1250–60.
21. Wang X, Gong G, Li N, Qiu S. Detection analysis of epileptic EEG using a novel random forest model combined with grid search optimization. Front Hum Neurosci. 2019;13:52.
22. Lu Y, Ma Y, Chen C, Wang Y. Classification of single-channel EEG signals for epileptic seizures detection based on hybrid features. Technol Health Care. 2018;26:S337–46.
23. Sharma R, Pachori R, Acharya U. Application of entropy measures on intrinsic mode functions for the automated identification of focal electroencephalogram signals. Entropy. 2015;17:669–91.
24. Jaiswal AK, Banka H. Epileptic seizure detection in EEG signal using machine learning techniques. Australas Phys Eng Sci Med. 2018;41:81–94.
25. Bajpai R, Yuvaraj R, Prince AA. Automated EEG pathology detection based on different convolutional neural network models: deep learning approach. Comput Biol Med. 2021;133:104434.
26. Zhang S, Chen D, Ranjan R, Ke H, Tang Y, Zomaya AY. A lightweight solution to epileptic seizure prediction based on EEG synchronization measurement. J Supercomput. 2021;77:3914–32.
27. Wei X, Zhou L, Chen Z, Zhang L, Zhou Y. Automatic seizure detection using three-dimensional CNN based on multi-channel EEG. BMC Med Inform Decis Mak. 2018;18:111.
28. Ma M, Cheng Y, Wei X, Chen Z, Zhou Y. Research on epileptic EEG recognition based on improved residual networks of 1-D CNN and indRNN. BMC Med Inform Decis Mak. 2021;21:100.
29. Aayesha, Bilal Qureshi M, Afzaal M, Shuaib Qureshi M, Gwak J. Fuzzy-based automatic epileptic seizure detection framework. Comput Mater Contin. 2022;70:5601–30.
30. Sriraam N, Tamanna K, Narayan L, Khanum M, Raghu S, Hegde AS, et al. Multichannel EEG based inter-ictal seizures detection using Teager energy with backpropagation neural network classifier. Australas Phys Eng Sci Med. 2018;41:1047–55.
31. Zhao W, Wang W. SeizureNet: a model for robust detection of epileptic seizures based on convolutional neural network. Cogn Comput Syst. 2020;2(3):119–24.
32. Deivasigamani S, Senthilpari C, Yong WH. Classification of focal and nonfocal EEG signals using ANFIS classifier for epilepsy detection. Int J Imaging Syst Technol. 2016;26:277–83.
33. Andrzejak RG, Lehnertz K, Mormann F, Rieke C, David P, Elger CE. Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state. Phys Rev E. 2001;64:061907.
34. Pincus SM. Approximate entropy as a measure of system complexity. Proc Natl Acad Sci. 1991;88:2297–301.
35. Richman JS, Lake DE, Moorman JR. Sample entropy. In: Methods in Enzymology. Academic Press; 2004. p. 172–84.
36. Xin Q, Hu S, Liu S, Zhao L, Zhang Y-D. An attention-based wavelet convolution neural network for epilepsy EEG classification. IEEE Trans Neural Syst Rehabil Eng. 2022;30:957–66.
37. Jiang Y, Wu D, Deng Z, Qian P, Wang J, Wang G, et al. Seizure classification from EEG signals using transfer learning, semi-supervised learning and TSK fuzzy system. IEEE Trans Neural Syst Rehabil Eng. 2017;25:2270–84.
38. Al-Hadeethi H, Abdulla S, Diykh M, Green JH. Determinant of covariance matrix model coupled with AdaBoost classification algorithm for EEG seizure detection. Diagnostics. 2021;12:74.
39. Jaiswal AK, Banka H. Local pattern transformation based feature extraction techniques for classification of epileptic EEG signals. Biomed Signal Process Control. 2017;34:81–92.
40. Shoeibi A, Ghassemi N, Alizadehsani R, Rouhani M, Hosseini-Nejad H, Khosravi A, et al. A comprehensive comparison of handcrafted features and convolutional autoencoders for epileptic seizures detection in EEG signals. Expert Syst Appl. 2021;163:113788.


Acknowledgements

We would like to thank the Bonn and New Delhi groups for providing the publicly available epilepsy EEG datasets.

Funding

The design of this research was funded by the National Natural Science Foundation of China (No. 31800836) and the Major Science and Technology Projects of Henan Province (No. 221100210500). The analysis and interpretation of data were funded by the Medical and Health Research Project in Luoyang (No. 2001027A) and the Construction Project of Improving Medical Service Capacity of Provincial Medical Institutions in Henan Province (No. 2017–51).

Author information

Wenna Chen and Yixing Wang have contributed equally.

Authors and Affiliations

The First Affiliated Hospital, and College of Clinical Medicine of Henan University of Science and Technology, Luoyang, China
Wenna Chen, Hongwei Jiang & Ganqin Du
College of Information Engineering, Henan University of Science and Technology, Luoyang, China
Yixing Wang, Yuhao Ren, Jincan Zhang & Jinghua Li


Contributions

WNC, YXW and GQD conceived and designed the study. YHR, HWJ and JCZ prepared the experimental equipment and resources. YHR collected the data. WNC and YXW analyzed the data. HWJ and JHL interpreted the results. WNC and YXW wrote the manuscript. All authors have read and agreed to the published version of the manuscript.

Corresponding authors

Correspondence to Hongwei Jiang, Ganqin Du or Jincan Zhang.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Chen, W., Wang, Y., Ren, Y. et al. An automated detection of epileptic seizures EEG using CNN classifier based on feature fusion with high accuracy. BMC Med Inform Decis Mak 23, 96 (2023). https://doi.org/10.1186/s12911-023-02180-w


Received: 06 January 2023
Accepted: 21 April 2023
Published: 22 May 2023
DOI: https://doi.org/10.1186/s12911-023-02180-w
