Early prediction of sudden cardiac death risk with Nested LSTM based on electrocardiogram sequential features

Electrocardiogram (ECG) signals are very important for heart disease diagnosis. In this paper, a novel early prediction method based on Nested Long Short-Term Memory (Nested LSTM) is developed for sudden cardiac death risk detection. First, wavelet denoising and normalization techniques are utilized for reliable reconstruction of ECG signals from extreme noise conditions. Then, a nested LSTM structure is adopted, which can guide the memory forgetting and memory selection of ECG signals, so as to improve the data processing ability and prediction accuracy of ECG signals. To demonstrate the effectiveness of the proposed method, four different models with different signal prediction techniques are used for comparison. The extensive experimental results show that this method can realize an accurate prediction of the cardiac beat’s starting point and track the trend of ECG signals effectively. This study holds significant value for timely intervention for patients at risk of sudden cardiac death.


Introduction
Sudden cardiac death (SCD) is defined as death that usually occurs within an hour of the onset of symptoms and arises from an underlying cardiac disease. It is a complication of a number of cardiovascular diseases and is often unexpected [1,2]. Clinical manifestations may include chest pain, shortness of breath, fatigue, weakness, persistent angina pectoris, and arrhythmia [3]. At present, there are limited effective methods to predict the occurrence of SCD in individuals without prior cardiac issues. As a simple, easy-to-use, and reliable analysis tool, the electrocardiogram (ECG) provides abundant information for the diagnosis and treatment of cardiovascular disease. Based on ECG signals, abnormal and significant fluctuations can be detected in patients before the onset of SCD [4,5]. For example, ventricular fibrillation (VF) is an important manifestation of SCD, and the trend of VF can be obtained by monitoring ECG signals [6,7]. However, the ECG is a weak signal with strong nonlinearity, non-stationarity, and randomness, which affect the final diagnostic results. Therefore, accurate prediction of ECG signals plays a pivotal role in the early detection and prevention of SCD [8,9].
At present, traditional machine learning models, such as classification and regression models, are commonly applied for ECG prediction. These models make forecasts based on historical data [10][11][12][13]. For example, Liu et al. [14] developed a cardiac arrest classification model utilizing wavelet transform and the AdaBoost algorithm. This model distinguishes cardiac arrest from ECG signals and predicts its occurrence with an accuracy of 97.56% within 5 minutes before the event. Ebrahimzadeh et al. [15] employed a Multi-Layer Perceptron (MLP) to classify abnormal ECG signals, with the aim of predicting SCD. Their model demonstrates increasing prediction accuracy as it approaches the critical point of sudden death. Sengupta et al. [16] utilized Random Forest, Least Squares Discriminant, and Support Vector Machine in the classification of 12-lead ECG signals to predict abnormal myocardial relaxation and assess the likelihood of SCD. Hou et al. [17] presented a deep learning-based algorithm that combines an LSTM-based auto-encoder (LSTM-AE) network with a support vector machine (SVM) for ECG arrhythmia classification. This method exhibits high accuracy, sensitivity, and specificity in classifying various heartbeat types. Kaya et al. [18] proposed an approach that combines angle transform (AT) and LSTM for the automatic identification of congestive heart failure (CHF) and arrhythmia (ARR) using ECG signals. However, most of these methods classify patients' ECG signals against those of healthy individuals, and they struggle to address time-dependent dynamic system modeling problems [19,20].
With the rapid development of artificial intelligence, neural networks have been broadly applied in signal processing and achieve excellent performance. Jin et al. [21] proposed a regression model based on the Regularized Extreme Learning Machine (RELM) to predict ECG signals by analyzing the correlation between ECG and human gait. Zheng et al. [22]  In recent years, some powerful sequence models have been proposed to assist with ECG analysis, with the advantage of exploring time-frequency based features [25][26][27][28][29][30]. As one of the most commonly used sequence models, the Long Short-Term Memory (LSTM) network has been proven effective at tracking information over extended periods [31,32]. For instance, Liu et al. [33] employed LSTM to predict influenza trends and achieved better results than linear models. Balci et al. [34] presented a hybrid Attention-based LSTM-XGBoost algorithm for detecting atrial fibrillation (AF) in long-recorded ECG data. Combined with preprocessing techniques, this method achieves high accuracy, offering a reliable support system for clinicians and facilitating data tracking in long ECG record reviews.
However, traditional LSTM networks exhibit weak robustness and low prediction accuracy in complex tasks, as the memory cells store memories unrelated to the current time step [35,36]. To address this issue, we propose an integrated approach combining data preprocessing and prediction model construction to predict ECG signal trends in advance. First, data preprocessing is performed on the original ECG signals, including wavelet denoising, normalization, and phase space reconstruction. Then, the Nested LSTM model is utilized for signal prediction. At this step, an inner LSTM unit is adopted, which can guide the memory forgetting and memory selection of ECG signals, so as to improve the data processing ability and prediction accuracy of ECG signals.
This paper is organized as follows. The Introduction section gives a brief overview of existing ECG signal prediction methods. The Theory and calculation section presents the proposed method and its algorithm in detail. The Experiment and results section provides the implementation details of the experiments and verifies the effectiveness and superiority of the proposed method through analysis of the results. The Conclusion section summarizes this paper.

Model construction
The model construction proposed in this paper is presented in Fig. 1, including the data preprocessing strategy, the prediction model Nested LSTM and model evaluation.

Preprocessing methods
Considering the uncertainty and complexity of ECG signals, it is difficult to capture the trend of the data directly. Therefore, we employ a data preprocessing strategy to ensure data quality and remove unwanted noise from the ECG signals.
The preprocessing methods include signal denoising, normalization, and phase space reconstruction.
Signals denoising
Due to the movement of human limbs, breathing, and electromagnetic interference from the surrounding environment, ECG signals are accompanied by considerable noise, including baseline drift, power frequency interference, electromyographic interference, and motion artifacts, which can affect the prediction results. The frequency ranges and signal energy ranges of the four types of noise are as follows: (1) Baseline drift noise: the noise frequency is less than 5 Hz; the energy range is between 0.01 Hz and 1 Hz.
• Decompose the ECG signals into 7 layers through the wavelet transform, with DB6 selected as the wavelet function;
• Extract the wavelet coefficients of each layer, including approximation coefficients and detail coefficients;
• Obtain the threshold of each layer by using unbiased likelihood estimation; that is, give a threshold $L$ for each layer, calculate its likelihood estimate, and then minimize the likelihood of $L$ to obtain the threshold of each layer. The details of determining the threshold are as follows:
Step 1: After squaring the wavelet coefficients of each resolution level, arrange them in ascending order to obtain the vector $P = [P_1, P_2, \ldots, P_N]$, where $N$ represents the length of the wavelet coefficient vector.
Step 2: Calculate the risk vector $R$ from the vector $P$, and take the smallest element $R_i$ of the risk vector as the risk value.
Step 3: Calculate the threshold $L$ from the squared wavelet coefficient $P_i$ that corresponds to the risk value $R_i$.
• Denoise the decomposed 7-layer signals according to the selected threshold, where $X_i$ denotes the ECG signal of the i-th layer after denoising and $d_i$ denotes the ECG signal of the i-th layer before denoising;
• Reconstruct the 7-layer signals through the inverse wavelet transform to obtain the reconstructed ECG signal. We adopt wavelet reconstruction after the wavelet transform, that is, the inverse wavelet transform. The steps are as follows:
Step 1: For the highest level of detail coefficients and the lowest frequency approximation coefficients, use the inverse high pass filter and inverse low pass filter of the wavelet basis function for upsampling and convolution to obtain the reconstructed signal.
Step 2: For each level of detail and approximation coefficients, use the inverse high pass filter and inverse low pass filter of the wavelet basis function for upsampling and convolution to obtain the reconstructed signal.
Step 3: Repeat Steps 1 and 2 until all levels of reconstruction are completed.
The ECG signal in the input layer of the prediction model is reconstructed from Equations (7) and (8). The construction rule is to start from the first sampling point of the selected ECG signal and use the 1st to 99th sampling points as the input sample and the 100th sampling point as the output sample, and so on. The input sample of the i-th training pair consists of the i-th to (i+98)-th sampling points, and the output sample is the (i+99)-th sampling point, where i = 1, 2, ..., 5000 − 99. A total of 4901 input-output sample pairs are generated, which form the training data set of the model.
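Returning to the threshold selection in the denoising steps (Steps 1 to 3), the procedure matches the pattern of the widely used SURE threshold rule. The following is a minimal Python sketch under that assumption, not the exact implementation used in this study. The function names `sure_threshold` and `soft_threshold` are illustrative, the coefficients are assumed to be scaled to unit noise variance, and soft thresholding is assumed for the shrinkage of each layer.

```python
import math


def sure_threshold(coeffs):
    """Select a threshold for one layer of wavelet coefficients (assumed SURE rule)."""
    n = len(coeffs)
    # Step 1: square the coefficients and sort them in ascending order.
    p = sorted(c * c for c in coeffs)
    best_risk = float("inf")
    best_p = p[0]
    cumulative = 0.0
    # Step 2: compute the risk for every candidate and keep the smallest one.
    for i, p_i in enumerate(p, start=1):
        cumulative += p_i
        risk = (n - 2 * i + cumulative + (n - i) * p_i) / n
        if risk < best_risk:
            best_risk, best_p = risk, p_i
    # Step 3: the threshold is the square root of the selected squared coefficient.
    return math.sqrt(best_p)


def soft_threshold(coeffs, thr):
    """Shrink coefficients toward zero by thr (assumed shrinkage rule)."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]
```

In use, each detail layer of the decomposition would be passed through `sure_threshold` and then `soft_threshold` before the inverse transform; small coefficients are set to zero while large ones are kept with reduced magnitude.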

Nested LSTMs model
Information exhibits time correlation, and historical information can hold valuable clues for predicting future events. Traditional machine learning methods have only short-term memory, which limits prediction when information is limited. LSTM introduces a controllable self-loop that creates a path along which gradients can flow for a long time, which makes it especially suitable for time series tasks and for tracking information over longer periods. As a result, extended models of LSTM have received increasing attention.
Nested LSTM shares the same input layer, hidden layer, and output layer as LSTM, and its unit structure is illustrated in Fig. 3. In this figure, an inner LSTM structure replaces the memory cell of the traditional LSTM. The inner memory is accessed through gates in the same way as the outer memory. Therefore, the Nested LSTM can access the inner memory more selectively, which gives the Nested LSTM prediction model stronger processing capability for ECG signals and higher prediction accuracy. This enhancement allows the Nested LSTM to capture and utilize more intricate temporal patterns in the data, making it well-suited for tasks that require detailed information processing and precise predictions in the context of electrocardiogram signals.
Nested LSTM is divided into an inner LSTM and an outer LSTM. The gating system of both the inner and the outer LSTM is consistent with that of the traditional LSTM. This system has four components: forget gate, input gate, candidate memory cell, and output gate. The calculation equations for each gate are as follows.
The equations cover, in order, the forget gate, the input gate, the candidate memory cell, and the memory cell, followed by the input and hidden states of the inner LSTM. In these equations, $x_t$, $h_{t-1}$ and $c_{t-1}$ denote the current input, the hidden state and the memory cell of the previous round, respectively.
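For reference, a commonly used formulation of the Nested LSTM cell is sketched below. The weight and bias symbols ($W$, $U$, $b$, and their tilde counterparts for the inner LSTM) are notation assumed here for illustration and may differ from the numbered equations of this paper.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
g_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate memory cell)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{x}_t &= i_t \odot g_t, \qquad \tilde{h}_{t-1} = f_t \odot c_{t-1} && \text{(inner input and hidden state)} \\
\tilde{f}_t &= \sigma(\tilde{W}_f \tilde{x}_t + \tilde{U}_f \tilde{h}_{t-1} + \tilde{b}_f) \\
\tilde{i}_t &= \sigma(\tilde{W}_i \tilde{x}_t + \tilde{U}_i \tilde{h}_{t-1} + \tilde{b}_i) \\
\tilde{g}_t &= \tanh(\tilde{W}_c \tilde{x}_t + \tilde{U}_c \tilde{h}_{t-1} + \tilde{b}_c) \\
\tilde{o}_t &= \sigma(\tilde{W}_o \tilde{x}_t + \tilde{U}_o \tilde{h}_{t-1} + \tilde{b}_o) \\
\tilde{c}_t &= \tilde{f}_t \odot \tilde{c}_{t-1} + \tilde{i}_t \odot \tilde{g}_t \\
c_t &= \tilde{h}_t = \tilde{o}_t \odot \tanh(\tilde{c}_t) && \text{(outer memory cell)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

The key difference from a traditional LSTM is the line for $c_t$: instead of the additive update $c_t = f_t \odot c_{t-1} + i_t \odot g_t$, the two products are passed to the inner LSTM as its hidden state and input, and the inner hidden state becomes the outer memory cell.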
The output of the output layer is computed with $W_{yh}$, the weight matrix of the output layer. The Nested LSTM model is a deep neural network that incorporates both feedforward and feedback mechanisms. The feedforward mechanism completes the forward calculation of the Nested LSTM using Equations (6)-(20). In contrast, the feedback mechanism employs the error backpropagation algorithm to train and update the network parameters.
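The forward calculation can be illustrated with a small Python sketch of a Nested LSTM with a single hidden unit. This is a sketch under stated assumptions rather than the model used in the experiments: it follows the commonly used Nested LSTM formulation, every gate is parameterized by a hypothetical tuple `(w_x, w_h, b)`, the states start at zero, and the output layer is assumed to be a plain linear map with weight `w_yh` and no bias.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gate(params, x, h, activation):
    """Scalar gate: activation(w_x * x + w_h * h + b)."""
    w_x, w_h, b = params
    return activation(w_x * x + w_h * h + b)


def nested_lstm_step(x_t, h_prev, c_prev, c_inner_prev, outer, inner):
    """One time step of a single-unit Nested LSTM cell."""
    # Outer gates: forget, input, candidate memory cell, output.
    f = gate(outer["f"], x_t, h_prev, sigmoid)
    i = gate(outer["i"], x_t, h_prev, sigmoid)
    g = gate(outer["g"], x_t, h_prev, math.tanh)
    o = gate(outer["o"], x_t, h_prev, sigmoid)
    # Input and hidden state of the inner LSTM.
    x_in = i * g
    h_in = f * c_prev
    # Inner gates use the same gating scheme as the outer ones.
    f_in = gate(inner["f"], x_in, h_in, sigmoid)
    i_in = gate(inner["i"], x_in, h_in, sigmoid)
    g_in = gate(inner["g"], x_in, h_in, math.tanh)
    o_in = gate(inner["o"], x_in, h_in, sigmoid)
    c_inner = f_in * c_inner_prev + i_in * g_in
    # The inner hidden state serves as the outer memory cell.
    c_t = o_in * math.tanh(c_inner)
    h_t = o * math.tanh(c_t)
    return h_t, c_t, c_inner


def predict(window, outer, inner, w_yh):
    """Run the cell over an input window and map the last hidden state to an output."""
    h = c = c_inner = 0.0
    for x_t in window:
        h, c, c_inner = nested_lstm_step(x_t, h, c, c_inner, outer, inner)
    return w_yh * h
```

For example, `predict(window, outer, inner, w_yh)` with a 99-point window would return a one-step-ahead estimate once `outer` and `inner` hold trained parameters for the keys `"f"`, `"i"`, `"g"`, and `"o"`.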
The training process of the feedback mechanism first needs to define the loss function (Equation (18)), where $E_t$ denotes the error at time $t$, $E$ denotes the total error, $y_t$ denotes the training value, and $\hat{y}_t$ is the target value.
The weight matrix and the bias term of each gating system need to be updated according to the loss function [31]. The process is shown in Fig. 4 and the specific steps are as follows:
• Initialize the parameters of the prediction model and set the error threshold.
• Input the ECG signals training set.
• Perform forward calculation according to Equations (6)-(20) to obtain the output corresponding to the current input.
• Define the loss function.
• Solve the gradient of each weight according to the loss function, and then update the weight matrices and the bias terms according to the gradients.
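The control flow of these training steps can be sketched in Python as follows. To keep the example short and runnable, a one-weight linear predictor stands in for the Nested LSTM, a squared-error loss is assumed, and training stops once the total error falls below the error threshold; the function name `train` and all default values are illustrative assumptions.

```python
def train(samples, lr=0.05, error_threshold=1e-4, max_epochs=10000):
    """Gradient-descent loop with an error-threshold stopping rule.

    samples is a list of (input, target) pairs. A linear model y = w * x + b
    is a stand-in for the Nested LSTM forward calculation.
    """
    w, b = 0.0, 0.0  # initialize the parameters
    total_error = float("inf")
    epoch = 0
    for epoch in range(1, max_epochs + 1):
        grad_w = grad_b = 0.0
        total_error = 0.0
        for x, target in samples:
            y = w * x + b                    # forward calculation
            err = y - target
            total_error += 0.5 * err * err   # assumed squared-error loss
            grad_w += err * x                # gradient of the loss w.r.t. w
            grad_b += err                    # gradient of the loss w.r.t. b
        if total_error < error_threshold:    # stop when the error is small enough
            break
        w -= lr * grad_w                     # update the weight
        b -= lr * grad_b                     # update the bias term
    return w, b, total_error, epoch
```

With the real model, the inner loop would be replaced by the Nested LSTM forward pass and backpropagation through time, while the outer loop and the stopping rule stay the same.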

Data source
In this study, ECG signals were obtained from the Sudden Cardiac Death Holter Database on the PhysioNet website [37], a resource for complex physiological and biomedical signal research. The database features 20 patient groups who experienced actual cardiac arrest and exhibited potential sinus rhythm, persistent rhythm, and atrial fibrillation prior to the event. Medical experts  1. We randomly select a set of ECG signals as the training set, that is, randomly create a model for each subject.

Data preprocessing results
Utilizing all ECG signals as model input may introduce noise and decrease prediction accuracy.As a result, a wavelet denoising method is employed to remove noise.Denoising outcomes are displayed in Figs. 5, 6 and 7.
From Figs. 5, 6 and 7, it is evident that the signals 30.dat and 31.dat exhibit noticeable baseline drift noise before denoising, while signal 32.dat exhibits both baseline drift and EMG noise. After denoising, the signals 30.dat-32.dat become more stable (as supported by Table 2). This demonstrates that the enhanced wavelet denoising method employed in this study effectively eliminates baseline drift and EMG noise from ECG signals.
This study assesses the denoising effect through visual inspection and the signal-to-noise ratio (SNR). The denoising results are further compared using SNR. In the SNR definition, $x(n)$ denotes the original signal and $x_m(n)$ denotes the denoised signal. We can evaluate the denoising effect by comparing the SNR of the ECG signal before and after denoising. The SNR results are shown in Table 2.
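As an illustration of the SNR comparison, the sketch below uses one common definition, $\mathrm{SNR} = 10 \log_{10}\left(\sum_n x(n)^2 / \sum_n (x(n) - x_m(n))^2\right)$, which is assumed here and may differ in detail from the definition used for Table 2. The function name `snr_db` is illustrative.

```python
import math


def snr_db(original, denoised):
    """SNR in decibels between an original signal x(n) and a denoised signal x_m(n)."""
    if len(original) != len(denoised):
        raise ValueError("signals must have the same length")
    signal_power = sum(x * x for x in original)
    noise_power = sum((x - y) ** 2 for x, y in zip(original, denoised))
    if noise_power == 0.0:
        raise ValueError("signals are identical; SNR is unbounded")
    return 10.0 * math.log10(signal_power / noise_power)
```

A higher value means the removed component is small relative to the signal energy, so the metric is best read alongside the visual comparison.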
Table 2 demonstrates that a portion of the noise in the aforementioned ECG signals has been removed, as evidenced by the increase in SNR after denoising. This finding more clearly indicates the presence of a significant amount of noise in the ECG signal, which could impact prediction outcomes. Consequently, it is necessary to employ a denoising method to process the ECG signal prior to making predictions.

Prediction results and analysis
The Nested LSTM model was used to predict the risk of actual cardiac arrest for the 20 groups of ECG signals. In the experiment, we trained the model 20 times in total, and the training time was between 43 s and 58 s. A selection of prediction results is presented in Figs. 8, 9 and 10. The error is calculated as the difference between the true value and the predicted value.
Figures 8, 9, and 10 illustrate notable fluctuations in the ECG signal.The actual values are represented in blue, while the fitted predictions are shown in red.Impressively, the proposed method exhibits remarkable proficiency in capturing these variations and accurately predicting trends in the ECG signal.The experimental results highlight the potential of predicting sudden cardiac death (SCD) before its occurrence, which holds lifesaving implications for patients.
There are several techniques for classifying the risk of SCD. The Support Vector Machine (SVM) is a classical and widely used classification method. Echo State Networks (ESN) and Long Short-Term Memory (LSTM) networks are adaptive data analysis methods that have been employed in SCD detection. Bidirectional LSTM (Bi-LSTM) is an improved variant of LSTM, comprising forward and backward LSTM components, which enables the summarization of temporal information from both past and future contexts. To validate the performance of the Nested LSTM model, it is compared with the four models mentioned above.

Conclusion
In this study, we present an early prediction method for sudden cardiac death (SCD) risk using Nested LSTM based on electrocardiogram (ECG) sequential features to predict a patient's ECG signals. ECG prediction is an effective approach for the early prediction of SCD risk. One limitation of traditional prediction methods is their low prediction accuracy for strongly nonlinear ECG signals, which may make them inappropriate for practical applications. Thus, it is highly desirable to develop an optimized ECG prediction model with high prediction accuracy. With emphasis on the timeliness and accuracy of prediction, this paper exploits the nonlinear mapping capability of Nested LSTM for ECG signals. The memory cell of Nested LSTM is replaced by an inner LSTM, which has strong memory ability. To demonstrate the effectiveness and applicability of the proposed model, the ECG signals of 20 groups of actual cardiac arrest patients are used in the empirical study. The experimental results show that the proposed model achieves better performance than the other four models.
The current similar methods include classification and regression techniques.References [14][15][16] employ classification models to anticipate abnormal and nonabnormal ECG patterns, whereas references [21][22][23] utilize regression models to align with ECG signal trends and identify ECG outliers.Distinguished from conventional classification techniques, this approach excels in forecasting the onset of SCD heartbeats, effectively capturing the dynamic, nonlinear, and nonstationary nature of time series, and adeptly accommodating the irregular trends in electrocardiogram signals.Furthermore, in contrast to traditional regression methodologies, the study devises an encompassing strategy that merges data preprocessing with predictive model development for ECG prediction.Empirical findings demonstrate a notable reduction in fitting errors, specifically in terms of RMSE and MAE, underscoring the efficacy of this novel methodology.
As for future work, we will study the multi-step prediction method of ECG signal characteristics, and use a series of deep learning methods and reinforcement learning methods to reduce multi-step prediction errors.More significantly, the practical applicability of ECG signal prediction methods will be verified in SCD diagnostic applications, potentially saving patients' lives from SCD events.
(2) Power frequency interference noise: the noise frequency is 50 Hz or 60 Hz; the energy is concentrated in frequency components near the power frequency.
(3) Myoelectric interference noise: the noise frequency range is 5 Hz to 2000 Hz; the energy range is between 10 Hz and 500 Hz, depending on the frequency of muscle activity.
(4) Motion artifact noise: the noise frequency range is 3 Hz to 14 Hz, depending on the frequency and amplitude of motion; the energy range is 5 Hz to 10 Hz, reflecting the frequency components caused by the subject's movement.
Therefore, it is necessary to denoise the ECG signals. The wavelet denoising method offers multi-resolution analysis and can characterize the local characteristics of signals in both the time and frequency domains. It is very suitable for analyzing nonstationary signals such as ECG signals and extracting their local characteristics. Therefore, the wavelet denoising method is used to denoise the ECG signals in this paper. Its process is shown in Fig. 2. The specific steps are as follows:
• Input the original ECG signals containing noise;

• Output the denoised ECG signals.
Normalization
In order to obtain better fitting results and prevent the training from diverging, it is necessary to standardize the training set. The standardization equation is:
$$X^{*} = \frac{X - \bar{X}}{\sigma}$$
where $X$ denotes the training set, $\bar{X}$ denotes the average value of the training set, and $\sigma$ denotes the standard deviation of the training set.
Phase space reconstruction
In the model training, we aim to predict the ECG signals in the next time step based on the historical data, so prior data must be used as the training data. In this paper, the final training set is constructed by phase space reconstruction. If the reconstruction dimension is $m$, the time delay is $\tau$, and the ECG signals of the normalized training set are $X = \{x_1, x_2, \cdots, x_{n+1+(m-1)\tau}\}$, then the reconstructed training set is:
$$X_{\text{train}} = \begin{bmatrix} x_1 & x_{1+\tau} & \cdots & x_{1+(m-1)\tau} \\ x_2 & x_{2+\tau} & \cdots & x_{2+(m-1)\tau} \\ \vdots & \vdots & & \vdots \\ x_n & x_{n+\tau} & \cdots & x_{n+(m-1)\tau} \end{bmatrix}, \qquad Y_{\text{train}} = \begin{bmatrix} x_{2+(m-1)\tau} \\ x_{3+(m-1)\tau} \\ \vdots \\ x_{n+1+(m-1)\tau} \end{bmatrix}$$
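A minimal Python sketch of the normalization and phase space reconstruction steps is given below. It assumes the population standard deviation for $\sigma$ and a one-step-ahead target for each delay vector; the function names `standardize` and `delay_embed` are illustrative.

```python
from statistics import mean, pstdev


def standardize(series):
    """Z-score standardization: subtract the mean, divide by the standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)  # population standard deviation (an assumption)
    return [(x - mu) / sigma for x in series]


def delay_embed(series, m, tau=1):
    """Build input vectors of dimension m with delay tau and one-step-ahead targets."""
    span = (m - 1) * tau
    n = len(series) - 1 - span  # number of input-output pairs
    inputs = [[series[i + k * tau] for k in range(m)] for i in range(n)]
    targets = [series[i + span + 1] for i in range(n)]
    return inputs, targets
```

With a segment of 5000 sampling points, `delay_embed(segment, m=99, tau=1)` yields 4901 input-output pairs, each consisting of 99 consecutive points and the point that follows them.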

Fig. 4 Training flowchart of prediction model

Fig. 7 Comparison of 32.dat signal before and after denoising

Fig. 8 One-step prediction fitting result and error result of 30.dat signal

Fig. 13 MAE results (a)
• Judge whether the training error is less than the error threshold; if yes, skip to Step 7; if not, skip to Step 3.
• End of training.

Table 1 ECG signals information

Table 2 SNR results

Table 3 RMSE results

Table 5 Average results

RMSE can reflect the overall error of signal prediction. According to Figs. 11, 12 and Table 3, it can be observed that the SVM model has the highest RMSE. The SVM model, which lacks a feedback structure and cannot obtain historical information characteristics, performs poorly in ECG signal prediction. The ESN model, despite having a feedback structure, is simple and also exhibits a large error. The LSTM and Bi-LSTM models have low RMSE values, with their results being quite similar. Nested LSTM enhances the memory function of LSTM and extracts historical information features more accurately. When compared to the SVM, ESN, LSTM, and Bi-LSTM models, the RMSE of Nested LSTM is the smallest. MAE represents the average of the absolute error between the predicted value and the observed value. According to Figs. 13, 14 and Table 4, it can be seen that the SVM model also has the largest MAE, and the Nested LSTM has the smallest MAE among the five models. According to Table 5, it can be seen that the SVM model has the largest average RMSE and average MAE, and the Nested LSTM model has the smallest average RMSE and average MAE. When compared to the SVM, ESN, LSTM, and Bi-LSTM models, the average RMSE of Nested LSTM is reduced by 61.1%, 41.2%, 20.5%, and 17.9%, respectively, and the average MAE of Nested LSTM is reduced by 88.7%, 78.2%, 25.8%, and 32.6%, respectively. In conclusion, the Nested LSTM model demonstrates a strong nonlinear mapping ability for ECG signals.
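The two error metrics and the relative reductions quoted above can be computed as in the following Python sketch. The standard definitions of RMSE and MAE are used, and the reduction is assumed to be measured relative to the baseline model's error; the function names are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between observed and predicted values."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


def relative_reduction(baseline_error, proposed_error):
    """Percentage by which the proposed model lowers the baseline error."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error
```

RMSE penalizes large deviations more strongly than MAE because the errors are squared before averaging, which is why the two metrics can rank closely matched models such as LSTM and Bi-LSTM differently.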