Evidential MACE prediction of acute coronary syndrome using electronic health records

Background: Major adverse cardiac event (MACE) prediction plays a key role in providing efficient and effective treatment strategies for patients with acute coronary syndrome (ACS) during their hospitalizations. Existing prediction models are limited in their ability to cope with imprecise and ambiguous clinical information, so clinicians cannot obtain reliable MACE prediction results for individual patients.
Methods: To remedy this, this study proposes a hybrid method using Rough Set Theory (RST) and the Dempster-Shafer Theory (DST) of evidence. In detail, four state-of-the-art models, including one traditional ACS risk scoring model, i.e., GRACE, and three machine learning based models, i.e., Support Vector Machine, L1-Logistic Regression, and Classification and Regression Tree, are employed to generate initial MACE prediction results, and RST is then applied to determine the weights of the four single models. After that, the acquired prediction results are taken as basic beliefs for the problem propositions, and an evidential prediction result is generated based on DST in an integrative manner.
Results: Applied to a clinical dataset consisting of 2930 ACS patient samples, our model achieves an AUC of 0.715 with a competitive standard deviation, which is the best prediction result compared with the four single base models and two baseline ensemble models.
Conclusions: Faced with the limitations of traditional ACS risk scoring models and machine learning models and the uncertainties of EHR data, we present an ensemble approach via RST and DST to alleviate this problem. The experimental results reveal that our proposed method achieves better performance for the problem of MACE prediction when compared with the single models.


Background
Acute coronary syndrome (ACS) refers to a group of conditions due to decreased blood flow in the coronary arteries such that part of the heart muscle is unable to function properly or dies [1,2]. Major adverse cardiac events (MACE) denote a composite of adverse events related to the cardiovascular system [3,4], which may lead to severe or fatal outcomes for ACS patients. MACE prediction, a crucial and widely explored topic, plays a pivotal role in the optimal management of ACS patients at the early stage of hospitalization, e.g., in clinical decision making on care and treatment, drug development and cost estimation [4,5].
Over the past decades, a large number of studies have been conducted to facilitate risk assessment [1,4]. Many traditional ACS risk scoring tools, e.g., TIMI [5], PURSUIT [6] and GRACE [7], have been widely used in real clinical settings and have shown good discriminatory accuracy in predicting MACE for ACS patients [8,9]. However, these traditional models have several inherent limitations [10]. In particular, because they were developed using data from clinical trials and registries with strict cohort inclusion and exclusion criteria, they may not be representative of the general patient population of a department [1]. In addition, to obtain a simple and easy-to-use tool, traditional risk scoring models are built on a small set of hand-picked risk factors selected for their significant univariate relationship to the end point in univariate logistic regression, which may degrade predictive performance [4,10,11]. Moreover, it is hard to incorporate new and more discriminatory risk factors into these traditional models, which limits their extensibility [1].
Recently, with the rapid growth of electronic health record (EHR) data, a multitude of risk prediction models exploiting the potential of EHRs have become available and achieved significant improvements in this field [4,10-13]. Most of these models are built on machine learning and data mining techniques. Although valuable, they still have deficiencies when applied to mining EHRs, particularly because of the vague, imprecise and uncertain clinical information contained in EHR data. Specifically, most of these models assume that MACEs have been correctly annotated in the EHR dataset and focus on the learning capabilities of the MACE prediction scheme. However, unambiguous MACE annotation may be difficult, and the resulting labels imprecise, due to the lack of the information required to assign certain MACE labels to individual patients.
Both the traditional risk scoring models and the machine learning based models provide diverse perspectives on the problem of MACE prediction [4], so each of them yields complementary information that could be fused to produce an integrative and reliable result. With a proper strategy for constructing the ensemble, such a fusion can be applied to the MACE prediction problem with imprecise and uncertain information. The Dempster-Shafer Theory (DST) of evidence [14,15] is a general framework for reasoning with uncertainty that combines multiple pieces of evidence to obtain a more reliable result, and it has been widely employed in sensor fusion [16], financial distress detection [17], medical diagnosis [18], etc. To this end, we propose a hybrid method using Rough Set Theory (RST) [19] and the Dempster-Shafer Theory of evidence for MACE prediction. The proposed approach integrates four state-of-the-art models, including one traditional ACS risk scoring model, i.e., GRACE, and three machine learning based models, i.e., Support Vector Machine (SVM) [20], L1-Logistic Regression (L1-LR) [21], and Classification and Regression Tree (CART) [22], to generate comprehensive and reliable MACE prediction results. In particular, RST is applied to determine the weights of the four single models, and the prediction results generated by these single models are then taken as basic beliefs for the problem propositions. In this way, an ensemble MACE prediction result is generated by combining the evidence of each single model such that the overall prediction performance can be enhanced.
We comparatively evaluate the performance of the proposed model on a clinical dataset of 2930 ACS patients collected from the Cardiology Department of the Chinese PLA General Hospital. The experimental results demonstrate that, in terms of reducing the uncertainty caused by human subjective cognition in patient data recording and annotation, our proposed method performs better than the traditional single models.

Rough set theory
Rough set theory was first proposed by Pawlak [19] and is widely used to deal with problems containing uncertainty. In RST, an information system is defined as a pair $I = (U, A \cup R)$, where $U = \{u_1, u_2, \ldots, u_t\}$ is a nonempty finite set of objects, $A = \{a_1, a_2, \ldots, a_n\}$ is a nonempty finite set of attributes, and $R = \{r_1, r_2, \ldots, r_m\}$ is a nonempty finite set of results. With each subset $P \subseteq A$, there is an indiscernibility relation (also called an equivalence relation) defined as
$$IND(P) = \{(x, y) \in U^2 \mid \forall a_i \in P,\ a_i(x) = a_i(y)\}.$$
The set of objects $U$ can be partitioned based on the relation $IND(P)$; the partition is denoted by $U / IND(P)$, and an element of $U / IND(P)$ is called an equivalence class. According to the equation above, the indiscernibility relations of $A$, $R$, and $A - \{a_j\}$ are defined as
$$IND(A) = \{(x, y) \in U^2 \mid \forall a_i \in A,\ a_i(x) = a_i(y)\},$$
$$IND(R) = \{(x, y) \in U^2 \mid \forall r_i \in R,\ r_i(x) = r_i(y)\},$$
$$IND(A - \{a_j\}) = \{(x, y) \in U^2 \mid \forall a_i \in A,\ a_i \neq a_j,\ a_i(x) = a_i(y)\},\quad j = 1, 2, \ldots, n.$$
Based on the theory of entropy, the dependence of $R$ on $A$ can be defined as:
$$H(R \mid A) = -\sum_{[x] \in U/IND(A)} p[x] \sum_{[y] \in U/IND(R)} p([y]/[x]) \log p([y]/[x]) \tag{1}$$
where $p[x] = \frac{card[x]}{card[U]}$ and $p([y]/[x]) = \frac{card([y] \cap [x])}{card[x]}$. The significance of attribute $a_j$ can be defined as:
$$Sig(a_j) = H(R \mid A - \{a_j\}) - H(R \mid A) \tag{2}$$
Finally, the weight of attribute $a_j$ is defined as follows:
$$w_j = \frac{Sig(a_j)}{\sum_{i=1}^{n} Sig(a_i)} \tag{3}$$
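The weighting idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that the dependence of R on A is the conditional entropy of the result partition given the attribute partition, that the significance of an attribute is the increase in this entropy when the attribute is removed, and that the weights are the normalized significances. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, attr_idx, labels):
    """Entropy of the result partition given the partition induced by attr_idx."""
    n = len(rows)
    # Group objects into equivalence classes of IND(P) and count results per class.
    blocks = defaultdict(Counter)
    for row, label in zip(rows, labels):
        blocks[tuple(row[i] for i in attr_idx)][label] += 1
    h = 0.0
    for counts in blocks.values():
        size = sum(counts.values())          # card[x]
        for c in counts.values():
            p = c / size                     # card([y] ∩ [x]) / card[x]
            h -= (size / n) * p * log2(p)
    return h


def rst_weights(rows, labels):
    """Normalized significance of each attribute (assumed weighting scheme)."""
    m = len(rows[0])
    all_idx = list(range(m))
    base = conditional_entropy(rows, all_idx, labels)
    sig = [
        conditional_entropy(rows, [i for i in all_idx if i != j], labels) - base
        for j in range(m)
    ]
    total = sum(sig)
    if total == 0:                           # no attribute matters: fall back to equal weights
        return [1.0 / m] * m
    return [s / total for s in sig]


# Toy data: two dichotomized "models"; the result equals the first one.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 1, 1]
weights = rst_weights(rows, labels)
```

In this toy case only the first attribute determines the result, so it receives all of the weight. In the setting of this paper, each row would hold the dichotomized outputs of the four single models for one patient, and the label would be the recorded MACE outcome.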

Dempster-Shafer theory
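Let $\Theta$ be the frame of discernment, which represents all possible mutually exclusive states of a system. The power set $2^\Theta$ is the set of all subsets of $\Theta$, including the empty set $\emptyset$, and represents the propositions related to the actual state of the system. The basic probability assignment (BPA) is a function $m: 2^\Theta \to [0, 1]$ that satisfies
$$m(\emptyset) = 0, \qquad \sum_{A \subseteq \Theta} m(A) = 1.$$
Given two BPAs $m_1$ and $m_2$ obtained from independent pieces of evidence, Dempster's combination rule is
$$m(A) = \frac{1}{1-K} \sum_{B \cap C = A} m_1(B)\, m_2(C), \quad A \neq \emptyset \tag{4}$$
where $K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)$ measures the conflict between the evidences and is called the conflict probability, and the coefficient $\frac{1}{1-K}$ is a normalization factor.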
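As an illustration of Dempster's rule of combination, the sketch below combines two basic probability assignments over a two-state frame. It follows the textbook form of the rule; the function name, the representation of focal sets as `frozenset` objects, and the example masses are assumptions made for this sketch only.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two BPAs given as {frozenset: mass}; return (combined BPA, conflict K)."""
    combined = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mass_b * mass_c
        else:
            # Mass falling on the empty set is the conflict between the evidences.
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("evidences are in total conflict; the rule is undefined")
    norm = 1.0 - conflict
    return {a: v / norm for a, v in combined.items()}, conflict


MACE = frozenset({"MACE"})
NON_MACE = frozenset({"non-MACE"})
THETA = MACE | NON_MACE            # the whole frame, i.e. "unknown"

m1 = {MACE: 0.6, NON_MACE: 0.1, THETA: 0.3}
m2 = {MACE: 0.5, NON_MACE: 0.2, THETA: 0.3}
fused, K = dempster_combine(m1, m2)
```

Because the rule is associative and commutative, more than two sources can be fused by applying the function repeatedly.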

Methods
In this study, we propose an ensemble approach that integrates traditional risk scoring models and advanced machine learning based models to alleviate the limitations mentioned above. Figure 1 shows the outline of our proposed method. As depicted in Fig. 1, we first calculate the weights of the four single models, i.e., GRACE, SVM, CART, and L1-LR, based on RST. After that, we employ DST to integrate the weighted outputs of the models into our ensemble MACE prediction result.
To make the proposed method easier to follow, we use a subset of our real-world dataset to show step by step how the method is implemented. Table 1 shows 10 patient samples from the collected dataset with their corresponding outputs from models trained in our previous work.

Weights calculation using rough set theory
Before calculating the weight of each single prediction model, we need to transform the models' outputs into dichotomous variables so that RST can be applied to calculate the dependence between each model and the final prediction results. For each model, we choose as the threshold the output value whose point on the receiver operating characteristic (ROC) curve is closest to the top-left corner. Computed on all available patient samples, the thresholds are 0.2348, 0.22689, 0.2584 and 106.5 for SVM, L1-LR, CART and GRACE, respectively. We use the data obtained from our work to give a more practical description in this and the following sections. According to the dichotomized outputs, we can calculate the weight of each single model based on Eqs. (1-3). The weights are 0.5363, 0.1765, 0.1177 and 0.1696 for SVM, L1-LR, CART and GRACE, respectively. Table 2 shows the dichotomized outputs, optimal thresholds and weights of the 4 single models.
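The dichotomization step can be sketched as follows. This is an illustrative assumption about how a "closest to the top-left corner" threshold might be selected: candidate thresholds are taken to be the observed scores, a sample is predicted positive when its score is greater than or equal to the threshold, and distance is Euclidean in ROC space. The function names and toy scores are hypothetical.

```python
from math import hypot


def closest_to_top_left_threshold(scores, labels):
    """Return the candidate threshold whose ROC point is nearest to (FPR=0, TPR=1)."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_d = None, float("inf")
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        tpr, fpr = tp / pos, fp / neg
        d = hypot(fpr, 1.0 - tpr)          # distance to the top-left corner
        if d < best_d:
            best_t, best_d = t, d
    return best_t


def dichotomize(scores, threshold):
    """Map continuous model outputs to 0/1 using the chosen threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Toy example: higher scores indicate higher MACE risk.
scores = [0.1, 0.2, 0.3, 0.8, 0.9]
labels = [0, 0, 0, 1, 1]
threshold = closest_to_top_left_threshold(scores, labels)
binary = dichotomize(scores, threshold)
```

The quadratic loop is kept for readability; for a dataset of a few thousand patients it is still fast, and a library ROC routine could be substituted.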

Model fusion using Dempster-Shafer evidence theory
Before using the Dempster-Shafer Theory to combine the four models' outputs, we need to transform the models' outputs into basic probability assignments (BPAs). However, in our study, the range of GRACE's outputs is from 2 to 258, which cannot be directly used as the BPA; moreover, the four single models we employed have different optimal thresholds, which may influence the combination results. To alleviate these problems, we first normalize GRACE's outputs to between 0 and 1 by Eq. (5), and then apply Eq. (6) to adjust the threshold of each single model to the same value, i.e., 0.5, to eliminate the influence of the different optimal thresholds.
$$A_{GRACE,j} = \frac{O_{GRACE,j} - \min_{GRACE}}{\max_{GRACE} - \min_{GRACE}}, \quad j = 1, 2, \ldots, n \tag{5}$$
where $n$ is the number of patients, and $O_{GRACE,j}$ and $A_{GRACE,j}$ indicate the original and normalized output of the GRACE model for the $j$th patient, respectively. $\max_{GRACE}$ and $\min_{GRACE}$, the maximum and minimum values of the original output of GRACE, are 201 and 37 in our study, respectively.
In Eq. (6), $A^*_{i,j}$ is the adjusted output of the $i$th model for the $j$th patient with $i \in \{\text{SVM}, \text{L1-LR}, \text{CART}, \text{GRACE}\}$, and $Threshold_i$ is the $i$th model's optimal threshold utilized in the dichotomization procedure for weight calculation using RST. Table 3 shows the adjusted outputs of each single model based on Eqs. (5,6).
Fig. 1 The outline of the proposed method
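The two preprocessing steps can be sketched as below. The min-max normalization follows directly from the description of the GRACE rescaling, using 37 and 201 as the observed minimum and maximum. The threshold adjustment, however, is only a hypothetical choice: a piecewise-linear map that sends 0 to 0, the model's optimal threshold to 0.5, and 1 to 1. The exact form of the adjustment used in the study may differ, and the function names are illustrative.

```python
def min_max_normalize(value, lo, hi):
    """Rescale a raw score to [0, 1] given the observed minimum and maximum."""
    return (value - lo) / (hi - lo)


def adjust_to_half(output, threshold):
    """Hypothetical piecewise-linear map that moves `threshold` to 0.5.

    Assumes 0 <= output <= 1 and 0 < threshold < 1.
    """
    if output <= threshold:
        return 0.5 * output / threshold
    return 0.5 + 0.5 * (output - threshold) / (1.0 - threshold)


GRACE_MIN, GRACE_MAX = 37, 201           # observed extremes reported in the text
GRACE_THRESHOLD_RAW = 106.5              # optimal GRACE threshold reported in the text

# The GRACE threshold must be normalized with the same map as the scores.
grace_threshold = min_max_normalize(GRACE_THRESHOLD_RAW, GRACE_MIN, GRACE_MAX)

raw_score = 150                          # illustrative GRACE score
normalized = min_max_normalize(raw_score, GRACE_MIN, GRACE_MAX)
adjusted = adjust_to_half(normalized, grace_threshold)
```

With a map of this kind, an adjusted output above 0.5 means the same thing for every model, namely that the model's own optimal threshold has been exceeded.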
Based on the adjusted outputs, we can obtain the BPA for each patient. In our method, we incorporate the weights calculated by RST into the BPA through Eqs. (7-10), where $w_i$ is the weight of the $i$th model with $i \in \{\text{SVM}, \text{L1-LR}, \text{CART}, \text{GRACE}\}$. Given the weighted BPAs obtained by Eqs. (7-10), we can employ Dempster's combination rule, Eq. (4), to combine the four models' BPA functions. The final decision value for the $j$th patient, i.e., $R_{all,j}$, is then obtained from the combined BPA. Table 4 shows the patient samples' BPAs, the combined BPA and the final decision value. Note that the prediction results are determined by the optimal threshold of the decision value, i.e., 0.4759, which is selected using the same criterion as in the dichotomization procedure. After all the procedures above, we obtain the ensemble prediction model, which takes into account the weight of each single model calculated by RST when combining the BPAs by DST.
Table 2 The dichotomized outputs, optimal thresholds and weights of single models for 10 patient samples
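A compact sketch of the fusion step is given below. The model weights are the ones reported in the text, but the way they enter the BPA is an assumption: the sketch uses classical discounting, in which a model with weight w and adjusted output a assigns mass w·a to MACE, w·(1 − a) to non-MACE, and 1 − w to the whole frame. Taking the combined mass on MACE as the decision value is likewise an assumption, and all function names are hypothetical.

```python
from functools import reduce

# Weights reported in the text for the four single models.
WEIGHTS = {"SVM": 0.5363, "L1-LR": 0.1765, "CART": 0.1177, "GRACE": 0.1696}


def weighted_bpa(adjusted, weight):
    """Assumed discounting: returns (m(MACE), m(non-MACE), m(Theta))."""
    return (weight * adjusted, weight * (1.0 - adjusted), 1.0 - weight)


def combine(m1, m2):
    """Dempster's rule specialized to the frame {MACE, non-MACE}."""
    a1, n1, t1 = m1
    a2, n2, t2 = m2
    conflict = a1 * n2 + n1 * a2              # mass on the empty intersection
    norm = 1.0 - conflict
    mace = (a1 * a2 + a1 * t2 + t1 * a2) / norm
    non_mace = (n1 * n2 + n1 * t2 + t1 * n2) / norm
    theta = (t1 * t2) / norm
    return (mace, non_mace, theta)


def fuse(adjusted_outputs):
    """Combine the weighted BPAs of all models for one patient."""
    bpas = [weighted_bpa(adjusted_outputs[name], w) for name, w in WEIGHTS.items()]
    return reduce(combine, bpas)


# Illustrative adjusted outputs for one patient (not taken from the dataset).
patient = {"SVM": 0.8, "L1-LR": 0.6, "CART": 0.4, "GRACE": 0.7}
fused = fuse(patient)
decision_value = fused[0]                     # assumed: combined mass on MACE
```

Because every weight is below 1, each model leaves some mass on the whole frame, so the conflict can never reach 1 and the normalization is always defined. The resulting decision value would then be compared with a threshold chosen on the ROC curve, such as the 0.4759 reported above.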

Experiments and results
Based on our previous work, we have obtained the original outputs of the four single models, i.e., SVM, L1-LR, CART and GRACE, for a total of 2930 ACS patient samples collected from the Cardiology Department of the Chinese PLA General Hospital. We employed 5-fold cross validation in the experiments. Table 5 illustrates the four single models' weights in 5-fold cross validation. Tables 6 and 7 show the AUC values and accuracy of all models in our study. From Table 5, we can see that each model has different weights in each fold, which indicates that the weight calculation step in our method distinguishes the discrimination ability of each single model and affects the construction of the proposed model in each fold of the cross validation. As illustrated in Tables 6 and 7, our proposed method achieves the highest AUC value compared with the 4 single models, which means that it can combine the outputs of the single models and generate a more reliable prediction result. In addition, the accuracy of our model is competitive among all models with AUC values above 0.70. Moreover, when compared with the traditional ensemble methods, i.e., Bagging and AdaBoost, our model achieves better performance by a significant margin. Furthermore, the proposed model is the only one whose AUC values are above 0.70 in all 5 folds with a competitive standard deviation, which indicates the stability of our method. Figures 2 and 3 present a more intuitive comparison between our proposed model and the other models.

Discussion
The problem of MACE prediction plays a vital role in the optimal treatment management of ACS patients during their hospitalizations. Faced with the limitations of traditional risk scoring models and machine learning methods and the uncertainties of EHR data, we present an ensemble approach to alleviate this problem. We first employed RST to determine the weight of each single MACE prediction model. Then, DST was applied to combine all weighted single models into our ensemble model so as to enhance the performance of MACE prediction. Experiments were conducted on a clinical dataset collected from the Cardiology Department of the Chinese PLA General Hospital. The experimental results show that our proposed method achieves the best prediction performance, with an AUC of 0.715, which indicates that our model can combine the various information provided by the single models to generate more reliable and stable prediction results for the MACE prediction problem.
It should be mentioned that some problems need further exploration.
In our current work, the single models we employed are taken directly from our previous work with no further selection. However, the single models' outputs have a significant impact on the final prediction results. Thus, we need to explore which single models are the most appropriate for the proposed method to combine so as to improve the prediction performance. Furthermore, resampling, a key technique for constructing more single models, is also a potential direction for building a more powerful and robust ensemble prediction model based on the proposed method.
In our future research, we plan to develop and deploy a continuous MACE prediction service in practice. Note that the dynamic nature of a patient's status is often essential to risk stratification and the subsequent treatment interventions adopted in clinical practice. Thus, it would be valuable to provide a continuous MACE prediction service during patients' length of stay. Such a service would not only anticipate MACEs at runtime, but also monitor patient treatment processes in a continuous and predictive fashion.
Fig. 3 The average accuracy values with standard deviation

Conclusion
In this paper, we present an ensemble approach to alleviate the limitations of traditional ACS risk scoring models and machine learning models and the uncertainties of EHR data. We first employed RST to determine the weight of each single model. After that, DST was applied to combine the weighted outputs of the single models into the final prediction results. The experimental results indicate that our proposed method achieves an AUC of 0.715 with a competitive standard deviation, which is better performance on the problem of MACE prediction than that of the single models.