Research on early warning of renal damage in hypertensive patients based on the stacking strategy

Abstract

Background

Among the problems caused by hypertension, early renal damage is often overlooked. It often cannot be diagnosed until the condition is severe and irreversible damage has occurred. We therefore set out to screen and explore risk factors for early renal damage in hypertensive patients and to establish an early-warning model of renal damage based on data mining, with the aim of diagnosing renal damage in hypertensive patients early.

Methods

Using an electronic information management system for hypertensive outpatients, we collected 513 cases of newly diagnosed, untreated hypertensive patients. We recorded their demographic data, ambulatory blood pressure parameters, routine blood indices, and blood biochemical indices to establish a clinical database. We then screened risk factors for early renal damage through feature engineering and used Random Forest, Extra-Trees, and XGBoost to build three separate early-warning models. Finally, we built a new model by fusing them with the Stacking strategy. We used cross-validation to evaluate the stability and reliability of each model and to determine the best risk assessment model.

Results

In descending order of importance, the features selected by feature engineering are the drop rate of systolic blood pressure at night, the red blood cell distribution width, blood pressure circadian rhythm, the average diastolic blood pressure at daytime, body surface area, smoking, age, and HDL. The average precision of the two-dimensional fusion model based on the Stacking strategy is 0.89685 with full features and 0.93824 with the selected features, a clear improvement.

Conclusions

Through feature engineering and risk factor analysis, we select the drop rate of systolic blood pressure at night, the red blood cell distribution width, blood pressure circadian rhythm, and the average diastolic blood pressure at daytime as early-warning factors of early renal damage in patients with hypertension. On this basis, the two-dimensional fusion model based on the Stacking strategy performs better than any single model and can be used for risk assessment of early renal damage in hypertensive patients.

Background

According to the 2020 International Society of Hypertension Global Hypertension Practice Guidelines, hypertension is related to cerebrovascular disease and ischemic heart disease. It is also a major risk factor for the incidence of, and death from, chronic kidney disease [1]. Hypertension can affect the function of organs throughout the body, and the kidney is the most easily affected.

In China, the number of uremic patients caused by hypertension reaches 1.5 million every year [2]. Furthermore, among the problems caused by hypertension, early renal damage is often overlooked because its symptoms are unclear. The typical symptoms and signs of chronic renal failure appear only gradually, so the damage often cannot be diagnosed until the condition is severe and irreversible. Identifying such patients early and intervening correctly is a critical challenge for clinicians, because it delays the progress of renal damage, reduces medical expenses, and is closely related to patient prognosis. Therefore, early renal damage in hypertensive patients deserves close attention.

In clinical practice, it is hard to diagnose the population at high risk of hypertensive renal damage early and to guide each patient toward the most suitable treatment in time. Because chronic kidney disease (CKD) causes few or no symptoms in its early stage, most patients with renal damage do not get a timely diagnosis. Many hypertensive patients who look healthy may already have developed CKD, and current methods fail to identify all of them. We intend to establish an early-warning model based on data mining that evaluates the risk of early renal damage by integrating the relevant factors, including cardiovascular risk factors, blood pressure parameters, blood biochemical indicators, and related biomarkers [3,4,5]. The model can then be used to identify high-risk patients early, so that they receive a definite diagnosis and timely treatment. Beyond that, an effective management mode for early hypertensive renal damage should be explored to control the risk factors in this population and reduce the incidence and harm of CKD.

Methods

To achieve early diagnosis of the population at high risk of hypertensive renal damage, we screen the early-warning risk factors of early renal damage by feature engineering [6]. Based on these risk factors, we use a data mining approach to establish an early-warning model of renal damage, which fuses three machine learning sub-models (XGBoost, Random Forest, and Extra-Trees) with the Stacking strategy [7,8,9]. The specific steps are as follows: (1) data preparation; (2) exploratory data analysis; (3) feature construction; (4) feature selection; (5) model optimization and fusion; (6) model evaluation. The process of model construction is shown in Fig. 1.

Fig. 1 The steps of model construction

Data preparation

From November 2011 to May 2013, Beijing Anzhen Hospital of Capital Medical University (Beijing Institute of Heart Lung and Blood Vessel Diseases), Third Xiangya Hospital of Central South University (Hunan Hypertension Research Center), and Chenzhou No.1 People’s Hospital of Hunan Province (Translational Medicine Institute of University of South China) received 513 patients with newly diagnosed hypertension and no complications. They were aged between 35 and 64 and included 319 males and 194 females. None of the patients had taken any antihypertensive drugs before their visit. According to their albumin-to-creatinine ratio (ACR), the patients are divided into two groups: a positive group (30-300 mg/g), which is the early renal damage group, and a control group (< 30 mg/g), which is the normal renal function group. The two groups contain 191 and 322 patients, respectively.
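The grouping rule above can be written down as a small function. This is only a sketch: the function name `acr_group` is hypothetical, and returning `None` for values above 300 mg/g is an assumption, since the text describes only the two groups.

```python
from typing import Optional


def acr_group(acr_mg_per_g: float) -> Optional[str]:
    """Assign a study group from the albumin-to-creatinine ratio (mg/g)."""
    if acr_mg_per_g < 0:
        raise ValueError("ACR cannot be negative")
    if acr_mg_per_g < 30:
        return "control"    # normal renal function group
    if acr_mg_per_g <= 300:
        return "positive"   # early renal damage group
    # Assumption: values above 300 mg/g belong to neither group described here.
    return None


groups = [acr_group(v) for v in (12.0, 30.0, 150.0, 300.0, 450.0)]
```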

Comparing the two groups, the levels of fasting blood glucose (FBG), triglyceride (TG), uric acid (UA), and red cell distribution width (RDW) in the positive group are higher than those in the control group, and the differences are statistically significant (P < 0.05). The sex ratio and the levels of body mass index (BMI), high-density lipoprotein (HDL), low-density lipoprotein (LDL), blood urea nitrogen (BUN), and serum creatinine (Scr) are similar in the two groups, with no statistically significant differences (P > 0.05). See Table 1.

Table 1 The Comparison of clinical and biochemical data

Exploratory data analysis

Exploratory data analysis [10, 11] explores the structure of existing data under minimal prior assumptions, using plots, tabulation, equation fitting, the calculation of characteristic quantities, and similar means. It covers the statistical characteristics of data fields, missing values, distributions, correlations, and so on, and it facilitates the later feature engineering and model construction.

We conducted exploratory data analysis on the collected hypertension patient data. First, for each attribute we computed the count, number of missing values, mean, standard deviation, median, minimum, maximum, and the 25%, 50%, and 75% quantiles. Then, according to these statistics, we selected appropriate attributes for distribution statistics. Finally, we computed the P-value of each attribute with respect to ACR and the correlation coefficients between attributes. The results are shown in Tables 2, 3, 4, and 5.
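As an illustration of these per-attribute statistics, the following minimal sketch uses pandas on a few made-up records. The column names and values are hypothetical and are not taken from the study database.

```python
import numpy as np
import pandas as pd

# Hypothetical toy records; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [45, 52, 61, 38, 57, 49],
    "rdw": [12.8, 13.5, np.nan, 12.4, 14.1, 13.0],
    "night_sbp_drop": [12.0, 4.5, -1.2, 15.3, np.nan, 8.8],
    "acr_positive": [0, 1, 1, 0, 1, 0],
})

# Count, mean, standard deviation, minimum, quartiles and maximum per attribute.
summary = df.describe().T

# Number and percentage of missing values per attribute.
summary["missing"] = df.isna().sum()
summary["missing_pct"] = df.isna().mean() * 100

# Pairwise Pearson correlation between attributes (missing values are skipped).
corr = df.corr(method="pearson")

print(summary)
print(corr.round(2))
```

The 50% row of `describe()` is the median, so the quantiles listed in the text come from a single call.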

Table 2 Missing value statistics
Table 3 Data distribution statistics
Table 4 Multi-collinearity analysis
Table 5 Data statistical analysis

Features that have many missing values (> 40%) and are unimportant can be deleted. Features with fewer missing values can be filled in with a statistic such as the mean, median, or mode. For continuous values the median is recommended, because it is not influenced by a few very large or very small outliers. For discrete values, the mode can be used.
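A minimal sketch of this missing-value rule is given below, assuming a pandas DataFrame as input. The helper name `clean_missing` and the toy columns are hypothetical, and the sketch drops columns on the missing fraction alone, whereas the rule above also considers whether a feature is important.

```python
import numpy as np
import pandas as pd


def clean_missing(df: pd.DataFrame, max_missing: float = 0.40) -> pd.DataFrame:
    """Drop sparse columns, then fill the remaining gaps."""
    # Keep only features whose missing fraction does not exceed the threshold.
    keep = df.columns[df.isna().mean() <= max_missing]
    out = df[keep].copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            # Continuous values: the median is robust to extreme outliers.
            out[col] = out[col].fillna(out[col].median())
        else:
            # Discrete values: fill with the most frequent category.
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


# Hypothetical toy data: one outlier, one categorical gap, one sparse column.
raw = pd.DataFrame({
    "uric_acid": [300.0, np.nan, 420.0, 380.0, 5000.0],
    "smoking": ["no", "yes", None, "no", "no"],
    "sparse": [np.nan, np.nan, np.nan, 1.0, 2.0],
})
cleaned = clean_missing(raw)
```

In the toy data the outlier 5000.0 would pull the mean far above the typical values, while the median stays close to them.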

Feature construction

Based on the information obtained from data analysis and combined with the understanding of hypertensive renal damage, we analyze and construct the following features.

  1. Personal information features: height, weight, age, sex, BMI, smoking or not, and body surface area (BSA).

  2. Ambulatory blood pressure features: 24-h average SBP, 24-h average DBP, 24-h average heart rate, day average SBP, day average DBP, day average heart rate, night SBP drop rate, night DBP drop rate, blood pressure circadian rhythm, night average DBP, and night average SBP.

  3. Blood biochemical and routine features: HDL, TG, FBG, UA, LDL, RDW, and BUN.
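Two of the personal information features are derived from height and weight. The sketch below shows one way to compute them. The text does not state which BSA formula was used, so the Du Bois formula here is an assumption, and the function names are hypothetical.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight in kilograms divided by squared height in metres."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def bsa_du_bois(weight_kg: float, height_cm: float) -> float:
    """Body surface area in square metres.

    Assumption: Du Bois formula, BSA = 0.007184 * W^0.425 * H^0.725,
    with W in kilograms and H in centimetres.
    """
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


example_bmi = bmi(70.0, 170.0)
example_bsa = bsa_du_bois(70.0, 170.0)
```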

Feature selection

Feature selection, also called feature subset selection or attribute selection, refers to selecting a subset of all features so that the constructed model performs best [12]. In data mining applications the number of features is usually large; some features may be irrelevant, and some may depend on one another. This tends to increase model training time and can cause the curse of dimensionality [13]. In addition, the model becomes more complicated and its generalization ability declines.

For feature selection, we use the following methods:

  1. Using the variance selection method, we calculated the variance of each feature and eliminated the features whose variance was smaller than the threshold.

  2. Using the correlation coefficient method, we calculated the correlation coefficient and P-value between each feature and the target value.

  3. Using the variance inflation factor, we measured the correlation between variables to detect multicollinearity.

  4. Using a random forest as the base model, we trained it to obtain the importance of each feature for selection.
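The four steps above can be sketched on synthetic data as follows. This is an illustration under stated assumptions, not the study's pipeline: the data, the variance threshold of 0.01, and the forest settings are all made up, and the variance inflation factor is computed directly as 1 / (1 - R^2) with NumPy.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import VarianceThreshold

# Synthetic stand-in data (not the study data): two informative features,
# one feature nearly collinear with "a", and one constant feature.
rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)
b = rng.normal(size=n)
c = a + 0.1 * rng.normal(size=n)
X = np.column_stack([a, b, c, np.ones(n)])
names = ["a", "b", "c_collinear", "constant"]
y = (a + b + 0.5 * rng.normal(size=n) > 0).astype(int)

# 1. Variance selection: drop features whose variance is below the threshold.
selector = VarianceThreshold(threshold=0.01)  # threshold value is an assumption
X_sel = selector.fit_transform(X)
kept = [name for name, keep in zip(names, selector.get_support()) if keep]

# 2. Correlation coefficient and P-value between each feature and the target.
corr = {}
for i, name in enumerate(kept):
    r, p = pearsonr(X_sel[:, i], y)
    corr[name] = (r, p)


# 3. Variance inflation factor: regress feature j on the other features.
def vif(data: np.ndarray, j: int) -> float:
    target = data[:, j]
    others = np.delete(data, j, axis=1)
    design = np.column_stack([np.ones(len(others)), others])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residual = target - design @ coef
    r2 = 1.0 - residual.var() / target.var()
    return 1.0 / (1.0 - r2)


vifs = {name: vif(X_sel, i) for i, name in enumerate(kept)}

# 4. Random forest as the base model to rank feature importance.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_sel, y)
ranking = sorted(zip(kept, forest.feature_importances_),
                 key=lambda item: item[1], reverse=True)
```

A large VIF for a pair of features such as `a` and `c_collinear` signals that one of them is a candidate for removal; which one to drop remains a judgment call.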

Model optimization and fusion

We use the KFold class in scikit-learn (a Python package) to split the data into five pairs of train and test sets and perform 5-fold cross-validation [14]. This reduces the risk of overfitting caused by the limited data volume. The data distribution in each train set and test set is similar to the distribution of all data.
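A minimal sketch of such a split with scikit-learn's `KFold` is shown below. The feature matrix is a placeholder with the same number of rows as the study, and the `shuffle` and `random_state` settings are assumptions rather than values reported in the text.

```python
import numpy as np
from sklearn.model_selection import KFold

# Placeholder data with 513 rows; the labels reuse the reported group sizes.
X = np.arange(513 * 2).reshape(513, 2)
y = np.array([1] * 191 + [0] * 322)

# shuffle and random_state are assumptions made for reproducibility.
kfold = KFold(n_splits=5, shuffle=True, random_state=42)

fold_sizes = []
for train_idx, test_idx in kfold.split(X):
    # Each sample appears in exactly one test fold across the five splits.
    fold_sizes.append((len(train_idx), len(test_idx)))

print(fold_sizes)
```

If the class ratio has to be preserved in every fold, scikit-learn's `StratifiedKFold` is a drop-in alternative; the text itself names only K-Fold.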

In order to determine the best prediction model, we train Random Forest, Extra-Trees, and XGBoost on the data. During model training, we use grid search to tune the model parameters. Grid search is a parameter optimization method [15] that is essentially exhaustive. We select a small finite set of values for each parameter and take the Cartesian product of these sets to obtain the candidate parameter combinations. Grid search then trains the model with each combination and picks the best one [16].
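The idea can be sketched with scikit-learn's `GridSearchCV`. The data are synthetic, and the parameter names and candidate values in `param_grid` are illustrative assumptions, not the grid used in the study.

```python
from itertools import product

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (not the study data).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# A small finite set of values for each parameter (illustrative values).
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}

# The Cartesian product of the sets gives the candidate combinations: 2 * 3 = 6.
candidates = list(product(*param_grid.values()))

# Grid search trains and cross-validates the model for every combination.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="precision",
)
search.fit(X, y)
best_params = search.best_params_
```

Because every combination is evaluated, the cost grows multiplicatively with each added parameter, which is why each set is kept small.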

After the above models are trained, we use the Stacking method to integrate them into a new model to improve the prediction effect. The Stacking model fusion strategy is based on the idea of K-fold cross-validation. In essence, it is a hierarchical model integration framework that stacks the learning abilities of different models for different features. However, the risk of overfitting grows as the number of layers increases. Therefore, a two-layer model is usually used, which also limits how often the data are used for repeated training. The first layer comprises several base learners whose input is the original train set. The second layer uses the output of the first layer as its train set. The structure of the Stacking model fusion strategy is shown in Fig. 2.

Fig. 2 The structure of Stacking model fusion strategy

We use the two-dimensional fusion model based on the Stacking strategy. The first layer uses the combination of Random Forest, Extra-Trees, and XGBoost as base learners to train on the data, and the second layer uses XGBoost to train on the output of the first layer.
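A two-layer structure of this kind can be sketched with scikit-learn's `StackingClassifier`. To keep the example self-contained, scikit-learn's `GradientBoostingClassifier` is used as a stand-in for XGBoost in both layers; that substitution, the synthetic data, and all hyperparameters are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (not the study data).
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           random_state=0)

# First layer: three tree-based base learners.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("et", ExtraTreesClassifier(n_estimators=50, random_state=0)),
    # Stand-in for XGBoost (assumption, to avoid an extra dependency).
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
]

# Second layer: trained on out-of-fold predictions of the first layer (cv=5).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=GradientBoostingClassifier(n_estimators=50, random_state=0),
    cv=5,
)

# 5-fold cross-validated precision of the fused model.
scores = cross_val_score(stack, X, y, cv=5, scoring="precision")
print(scores.mean())
```

The inner `cv=5` is what ties Stacking to K-fold cross-validation: the second layer never sees a first-layer prediction made on a sample that the base learner was trained on.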

Result

Risk factors

The steps of feature selection are shown in Fig. 3. After feature selection, eight features remain. In descending order of importance, they are: drop rate of systolic blood pressure at night (night SBP drop rate), red blood cell distribution width (RDW), blood pressure circadian rhythm, average diastolic blood pressure at daytime (day average DBP), body surface area (BSA), smoking, age, and HDL. The importance of the features is shown in Table 6. In addition, we compared the results obtained with full features and with the selected features under the Stacking strategy; they are shown in Table 7. The comparison shows that the eight selected features are of great significance for predicting renal damage.

Fig. 3 The steps of feature selection

Table 6 The importance of features
Table 7 The comparison of fivefold cross validation for full features vs selected features

Model

The precision, recall, and F1 score of 5-fold cross-validation for each model are shown in Table 8. Among the single models, Random Forest and XGBoost perform similarly. Compared with Random Forest, Extra-Trees, and XGBoost, the two-dimensional fusion model based on the Stacking method has the highest precision, recall, and F1 score. The first layer of the fusion model consists of XGBoost, Extra-Trees, and Random Forest, and the second layer is XGBoost.
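For reference, the three metrics can be computed from the counts of true positives, false positives, and false negatives. The sketch below uses made-up labels, with 1 standing for the early renal damage group; the function name is hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, positive class = 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```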

Table 8 The results of fivefold cross validation for each model

The precision of each fold in 5-fold cross-validation for each model is shown in Fig. 4. The recall is shown in Fig. 5. The F1 score is shown in Fig. 6. The Precision-Recall curve of each model is shown in Fig. 7. From the training results of each fold, we can find that the training effect of the fusion model based on the Stacking method in each fold is in the top two. It shows that the fusion model based on the Stacking method integrates the learning ability of different models for different features to improve the prediction effect on all data. In addition, as can be seen from Fig. 7, the stacking effect is the best of all models.

Fig. 4 The precision of each fold

Fig. 5 The recall of each fold

Fig. 6 The F1 score of each fold

Fig. 7 The Precision-Recall curve of each model

Discussion

Risk factor analysis

In screening CKD patients and monitoring renal function during treatment, the main clinical index is serum creatinine. However, serum creatinine is not sensitive enough to detect early subclinical changes or to predict renal function decline after treatment. In the preclinical stage of CKD, new monitoring indicators are needed to evaluate such patients. Early renal damage can be judged by microalbuminuria and glomerular filtration rate (GFR). However, the role of urinary microalbumin has not been deemed significant because of measurement errors, and GFR is affected by many factors. Although measurement by nuclear medicine methods is the gold standard, it is seldom carried out because of its cost and operational complexity. The estimated GFR cannot reflect the real renal function, because the formulas are complicated and the results of different formulas differ considerably. This part of the study aims to understand early renal damage in untreated hypertensive patients, screen the relevant risk factors, and identify specific high-risk factors. It also provides quantitative indicators (early warning signals) that help hypertensive patients with early renal damage and cardiovascular clinicians to better prevent the progression of CKD.

Abnormal blood pressure indexes

The comparison between the two groups shows that the patients in the early renal damage group are older than those in the control group. Moreover, their HDL and BSA are lower and their blood pressure indices are higher than in the control group, especially the nighttime blood pressure level and blood pressure variability. Further analysis shows that the distribution of abnormal blood pressure rhythms differs markedly between the two groups. In the early renal damage group, the non-dipper, reverse-dipper, and deep-dipper types account for 75.9%, 14.1%, and 3.1% of patients, respectively, while fewer than 10% have a normal rhythm. In contrast, 72.4% of the patients in the control group have a normal blood pressure rhythm. The blood pressure circadian rhythm analysis indicates that the difference in nighttime blood pressure drop rate between the two groups is statistically significant: the nighttime drop in blood pressure is weakened in the early renal damage group.
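The rhythm types mentioned above are conventionally derived from the nighttime drop rate. The text does not state the cut-offs used in this study, so the sketch below assumes the commonly used thresholds (below 0% reverse-dipper, 0% to below 10% non-dipper, 10% to 20% dipper, above 20% deep dipper); the function names are hypothetical.

```python
def night_drop_rate(day_mean: float, night_mean: float) -> float:
    """Nighttime blood pressure drop rate in percent of the daytime mean."""
    return (day_mean - night_mean) / day_mean * 100.0


def rhythm_type(drop_pct: float) -> str:
    """Classify the circadian rhythm; cut-offs are the conventional ones (assumption)."""
    if drop_pct < 0:
        return "reverse-dipper"   # night pressure higher than day pressure
    if drop_pct < 10:
        return "non-dipper"       # weakened nighttime drop
    if drop_pct <= 20:
        return "dipper"           # normal rhythm
    return "deep dipper"          # excessive nighttime drop


# Made-up daytime and nighttime mean SBP values in mmHg.
examples = {
    (140.0, 119.0): rhythm_type(night_drop_rate(140.0, 119.0)),
    (140.0, 133.0): rhythm_type(night_drop_rate(140.0, 133.0)),
    (140.0, 147.0): rhythm_type(night_drop_rate(140.0, 147.0)),
    (140.0, 105.0): rhythm_type(night_drop_rate(140.0, 105.0)),
}
```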

Cheng et al. [17] found that the nighttime decline in blood pressure was a significant predictor of renal damage in hypertensive patients. Ningling et al. [18] studied the albuminuria and blood pressure levels of hypertensive patients in five different regions of China. They found that poor blood pressure control was an essential factor for proteinuria, and that effective blood pressure control was critical for reducing proteinuria, improving endothelial function, and protecting the kidney. Our study finds that SBP, DBP, and PP (clinic, 24-h, day, night) in the ACR-positive group are higher than those in the control group (P < 0.05), which indicates that the higher the blood pressure level, the more frequent ACR positivity is. In addition, our study concludes that the nighttime drop rate of systolic blood pressure and the average daytime diastolic blood pressure are risk factors for ACR positivity. This shows that controlling blood pressure levels is important for patients with hypertension.

Abnormal blood pressure rhythm

Blood pressure is normally higher during the day and lower at night. That is, it drops during sleep, is lowest in the early morning, and then starts to rise toward its first peak. In healthy people and in patients with dipper-type hypertension, sympathetic activity, cardiac output, and blood pressure decrease during sleep. Huijuan et al. [19] found that, compared with dipper hypertension, non-dipper and anti-dipper hypertension were more closely related to indicators of early renal damage, which indicated a close relationship between abnormal circadian rhythm of blood pressure and early renal damage. Zemin et al. [20] found that abnormal blood pressure circadian rhythm was an important factor causing early-stage renal damage and that early-stage renal damage was more pronounced in reverse-dipper patients than in the control group. Nighttime systolic blood pressure levels and blood pressure circadian rhythm had crucial clinical significance for early-stage renal damage in patients with hypertension.

This study suggests that, in hypertensive patients with early renal damage, the ambulatory blood pressure level rises as urinary microalbumin increases. This shows up as higher nighttime blood pressure, especially higher nighttime diastolic blood pressure. The study also finds that early hypertensive renal damage is often accompanied by abnormal blood pressure circadian rhythm, which is already present in hypertensive patients without microalbuminuria.

Compared with the control group and with patients with dipper and non-dipper rhythm, patients with anti-dipper rhythm have higher ACR, night SBP, night DBP, and night PP, and their decrease in eGFR is more prominent than in the control group. This suggests that anti-dipper rhythm plays a relevant and independent role in the occurrence and development of early renal damage in hypertension, regardless of whether the clinic blood pressure level and dynamic blood pressure level are the same. In addition, nighttime blood pressure level and circadian rhythm are positively correlated with ACR but not with eGFR, which suggests that the anti-dipper blood pressure circadian rhythm is independently correlated with microalbuminuria in patients with hypertension. Our study finds that all patients in the anti-dipper rhythm group have early renal damage, which may be due to the small sample size or to selection bias. However, it is enough to show that early renal damage in the anti-dipper rhythm group is more severe than in the control group. In the future, larger samples and more evidence are needed to confirm the causal relationship between anti-dipper rhythm and early hypertensive renal damage.

Red blood cell distribution width

A series of studies have confirmed the correlation between red cell distribution width (RDW) and hypertension. Tanindi et al. [21] found that hypertensive patients had higher RDW levels and higher systolic and diastolic blood pressure than prehypertensive patients. Perlstein et al. also found that the systolic blood pressure level and the proportion of hypertensive patients were significantly increased in people with higher RDW [22]. Formal et al. found that RDW is closely related to the delayed reduction of nighttime blood pressure in hypertensive patients and is an independent predictor of nighttime non-dipper blood pressure [23]. A correlation between RDW and renal function has also been reported. Ujszaszi et al. [24] observed that RDW was independently associated with decreased renal function in renal transplant patients and considered it a potential new auxiliary parameter for the clinical evaluation of patients with chronic kidney disease. Recently, Solak et al. found that RDW was significantly increased in patients with CKD from stage 1 to stage 5, which was closely related to endothelial dysfunction in patients with chronic kidney disease [25]. However, the above studies are limited to the CKD population, and their results may be affected by drugs and disease progression.

This study finds that RDW is associated with early renal damage in hypertensive patients, and ACR tends to increase as RDW increases. The data in this group show that hypertensive patients have different degrees of early renal damage. RDW is a sensitive indicator for the diagnosis of early renal damage in hypertensive patients, and it is a common item of routine blood examination, so the measurement is convenient, fast, and inexpensive. However, the use of RDW as a risk assessment indicator of early renal damage in hypertensive patients still needs support from future prospective studies.

Model analysis

When the Stacking method is used for model fusion, the results can differ depending on which models are combined in each layer. To determine the best combination, the first layer uses different combinations of Random Forest, Extra-Trees, and XGBoost as base learners, and the second layer uses XGBoost (the precision and recall figures show that Random Forest is unstable and generalizes weakly, so XGBoost is used). We then carry out 5-fold cross-validation on the data. The comparison shows that the two-dimensional fusion model based on Random Forest has the best average precision. However, the combination of Random Forest, Extra-Trees, and XGBoost is the most stable, and it also has the best F1 score and recall. Therefore, we consider the combination of Random Forest, Extra-Trees, and XGBoost the best. The results of 5-fold cross-validation for each model combination are shown in Table 9.
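The candidate first-layer combinations can be enumerated programmatically. The sketch below assumes that every non-empty subset of the three base learners is a candidate; the text does not list the combinations explicitly, so this reading is an assumption, and the abbreviations follow Table 9.

```python
from itertools import combinations

# Abbreviations as in Table 9: RF, ET, XGB.
base_learners = ["RF", "ET", "XGB"]

# Every non-empty subset of the base learners is one first-layer candidate.
first_layer_candidates = [
    combo
    for size in range(1, len(base_learners) + 1)
    for combo in combinations(base_learners, size)
]

for combo in first_layer_candidates:
    # Each candidate would be paired with the same second-layer model
    # and evaluated by 5-fold cross-validation.
    print(" + ".join(combo), "-> second layer: XGB")
```

With three base learners this gives seven candidates, which keeps an exhaustive comparison by cross-validation affordable.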

Table 9 The results of fivefold cross validation for each combination. ET is ExtraTrees, RF is Random Forest, and XGB is XGBoost

Limitations

This research has some limitations. Regarding the screening of risk factors for renal damage in hypertension, a case-control study has inherent limitations, so the relationship between the above risk factors and early renal damage in hypertensive patients needs to be confirmed by prospective studies with more centers and larger samples. Regarding the early-warning model of renal damage, the small sample is a severe limitation that affects the precision and generalization ability of the model. However, small samples and data imbalance are common in clinical research. How to apply the model to clinical research still needs further exploration.

To overcome these limitations, more data on hypertensive patients with early renal damage should be collected to validate and optimize the model. Moreover, the small-sample limitation might be addressed by few-shot learning. In addition, other, better-performing models could be fused to obtain a better result [26].

Conclusion

This study applies data mining, combined with routine clinical items, to the early warning of renal damage in hypertensive patients. We use feature engineering and risk factor analysis to screen risk factors such as the drop rate of systolic blood pressure at night, red blood cell distribution width, blood pressure circadian rhythm, and the average diastolic blood pressure at daytime as warning signs of early renal damage. On this basis, the early-warning model of early kidney damage constructed by the Stacking model fusion strategy performs better than any single model. This model can support the diagnosis of renal damage in hypertensive patients and is of important significance for screening high-risk populations. In the future, better models can be fused and their prediction effect tested. At the same time, the methods and ideas of this research can provide new methodological references for similar early-warning research and evaluation.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

BMI: Body mass index
BSA: Body surface area
FBG: Fasting blood glucose
2hPBG: 2-h postprandial blood glucose
TG: Triglyceride
HDL-C: High density lipoprotein cholesterol
LDL-C: Low density lipoprotein cholesterol
BUN: Blood urea nitrogen
Scr: Serum creatinine
UA: Uric acid
eGFR: Estimated glomerular filtration rate
ACR: Albumin-to-creatinine ratio
hs-CRP: High-sensitivity C-reactive protein
RDW: Red cell distribution width
OBPM: Office blood pressure measure
ABPM: Ambulatory blood pressure monitoring
SBP: Systolic blood pressure
DBP: Diastolic blood pressure
PP: Pulse pressure
CV: Coefficient of variability
CKD: Chronic kidney disease
LAD: Left atrium diameter
LVEDD: Left ventricular end diastolic diameter
LVRWT: Left ventricular relative wall thickness
MLVRWT: Maximal left ventricular relative wall thickness
LVMI: Left ventricular mass index
RF: Random forest
XGB: XGBoost
ET: Extra trees

References

  1. Unger T, Borghi C, Charchar F, et al. 2020 international society of hypertension global hypertension practice guidelines. J Hypertens. 2020;75(6):982–1004.

  2. Ruilope LM. Simultaneous cardiac and renal damage in a hypertensive population. J Clin Hypertens (Greenwich, Conn). 2009;11(6):301.

  3. Ngufor C, Van Houten H, Caffo BS, Shah ND, McCoy RG. Mixed effect machine learning: a framework for predicting longitudinal change in hemoglobin A1c. J Biomed Inform. 2019;89:56–67.

  4. Lin J, Xu R, Yun L, Hou Y, Li C, Lian Y, Zheng F. A risk prediction model for renal damage in a hypertensive Chinese Han population. Clin Exp Hypertens. 2019;41(6):552–7.

  5. Ramezankhani A, Kabir A, Pournik O, Azizi F, Hadaegh F. Classification-based data mining for identification of risk patterns associated with hypertension in Middle Eastern population: a 12-year longitudinal study. Medicine. 2016;95(35).

  6. Jeon J, Leimbigler PJ, Baruah G, Li MH, Fossat Y, Whitehead AJ. Predicting glycaemia in type 1 diabetes patients: experiments in feature engineering and data imputation. J Healthcare Inform Res. 2020;4(1):71–90.

  7. Chen J, Yin J, Zang L, Zhang T, Zhao M. Stacking machine learning model for estimating hourly PM2.5 in China based on Himawari 8 aerosol optical depth data. Sci Total Environ. 2019;697:134021.

  8. Zadeh AH, Alsabi Q, Ramirez-Vick JE, Nosoudi N. Characterizing basal-like triple negative breast cancer using gene expression analysis: a data mining approach. Expert Syst Appl. 2020;148:113253.

  9. Bian J, Abdelrahman S, Shi J, Del Fiol G. Automatic identification of recent high impact clinical articles in PubMed to support clinical decision making using time-agnostic features. J Biomed Inform. 2019;89:1–10.

  10. Dey SK, Rahman MM, Siddiqi UR, Howlader A. Analyzing the epidemiological outbreak of COVID-19: a visual exploratory data analysis approach. J Med Virol. 2020;92(6):632–8.

  11. Fotouhi S, Asadi S, Kattan MW. A comprehensive data level analysis for cancer diagnosis on imbalanced data. J Biomed Inform. 2019;90:103089.

  12. Liu H, Wang Z, Sun Y. Stacking model of multi-label classification based on pruning strategies. Neural Comput Appl. 2020;32(22):16763–74.

  13. de Lima MD, de Oliveira Roque e Lima J, Barbosa RM. Medical data set classification using a new feature selection algorithm combined with twin-bounded support vector machine. Med Biol Eng Comput. 2020;58(3):519–28.

  14. Braun T, Spiliopoulos S, Veltman C, Hergesell V, Passow A, Tenderich G, Koerner MM. Detection of myocardial ischemia due to clinically asymptomatic coronary artery stenosis at rest using supervised artificial intelligence-enabled vectorcardiography—a five-fold cross validation of accuracy. J Electrocardiol. 2020;59:100–5.

  15. Wang X, Gong G, Li N, Qiu S. Detection analysis of epileptic EEG using a novel random forest model combined with grid search optimization. Front Human Neurosci. 2019;13:52.

  16. Batten AJ, Thorpe J, Piegari RI, Rosland AM. A resampling based grid search method to improve reliability and robustness of mixture-item response theory models of multimorbid high-risk patients. IEEE J Biomed Health Inform. 2019;24(6):1780–7.

  17. Cheng D, Tang Y, Li H, Li Y, Sang H. Nighttime blood pressure decline as a predictor of renal injury in patients with hypertension: a population-based cohort study. Aging (Albany NY). 2019;11(13):4310.

  18. Ningling S, Hongyi W, Dingliang Z, Yuhua L, Shuguang L, Xiaoping C. Association between albuminuria and blood pressure level in patients with essential hypertension. Chin J Nephrol. 2010;26(10):762–5.

  19. Huijuan K, Dengfeng G, Rui M, Xin D, Yongqin L. GW26-e0487 relationship between blood pressure circadian rhythm and early renal damage in the patients with primary hypertension. J Am Coll Cardiol. 2015;66(16):192.

  20. Zemin K, Hu H, Kong Y, Zhengqiu Y, Hong Y. GW25-e1517 blood pressure circadian rhythm impact on early-stage renal damage in patients with hypertension. J Am Coll Cardiol. 2014;64(16S):C175–C175.

  21. Tanindi A, Topal FE, Topal F, Celik B. Red cell distribution width in patients with prehypertension and hypertension. Blood Press. 2012;21(3):177–81.

  22. Perlstein TS, Weuve J, Pfeffer MA, Beckman JA. Red blood cell distribution width and mortality risk in a community-based prospective cohort. Arch Intern Med. 2009;169(6):588–94.

  23. Fornal M, Wizner B, Cwynar M, et al. Association of red blood cell distribution width, inflammation markers and morphological as well as rheological erythrocyte parameters with target organ damage in hypertension. Clin Hemorheol Microcirc. 2013;56:325–35.

  24. Ujszaszi A, Molnar MZ, Czira ME, Novak M, Mucsi I. Renal function is independently associated with red cell distribution width in kidney transplant recipients: a potential new auxiliary parameter for the clinical evaluation of patients with chronic kidney disease. Br J Haematol. 2013;161(5):715–25.

  25. Solak Y, Yilmaz MI, Saglam M, et al. Red cell distribution width is independently related to endothelial dysfunction in patients with chronic kidney disease. Am J Med Sci. 2014;347(2):118–24.

  26. Topaz M, Murga L, Gaddis KM, McDonald MV, Bar-Bachar O, Goldberg Y, Bowles KH. Mining fall-related information in clinical notes: comparison of rule-based and novel word embedding-based machine learning approaches. J Biomed Inform. 2019;90:103103.

Acknowledgements

Not applicable.

Funding

This work was funded by the National Natural Science Foundation of China (No. 61902034, No. 62176026), the Beijing Natural Science Foundation (M22009), and the Engineering Research Center of Information Networks, Ministry of Education of China. The funding bodies had no role in the objective or design of the study; the collection, analysis, or interpretation of data; or the writing of the manuscript.

Author information

Contributions

HE, ZK and MS designed the research. XT and XL collected and analyzed the data. QB, HE and ZK designed the method. QB and ZK performed the experiments. QB and ZK wrote the manuscript. HE, LT and MS reviewed and edited the manuscript. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to E. Haihong.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

None of the authors have a conflict of interest to report.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Bi, Q., Kuang, Z., Haihong, E. et al. Research on early warning of renal damage in hypertensive patients based on the stacking strategy. BMC Med Inform Decis Mak 22, 212 (2022). https://doi.org/10.1186/s12911-022-01889-4

  • DOI: https://doi.org/10.1186/s12911-022-01889-4

Keywords

  • Hypertension
  • Renal damage
  • Risk assessment
  • Data mining
  • Feature engineering
  • Stacking model