Association between borderline dysnatremia and mortality: insight into a new data mining approach

Background: Even small variations in serum sodium concentration may be associated with mortality. Our objective was to confirm the impact of borderline dysnatremia on in-hospital mortality among patients admitted to hospital, using real-life care data from our electronic health record (EHR) and a phenome-wide association analysis (PheWAS).
Methods: Retrospective observational study based on data from patients admitted to Hôpital Européen Georges Pompidou between 01/01/2008 and 31/06/2014, including 45,834 patients with serum sodium determinations on admission. We analyzed the association between dysnatremia and in-hospital mortality, using a multivariate logistic regression model to adjust for classical potential confounders. We performed a PheWAS to identify new potential confounders.
Results: Hyponatremia and hypernatremia were recorded for 12.0% and 1.0% of hospital stays, respectively. Adjusted odds ratios (ORa) for severe, moderate and borderline hyponatremia were 3.44 (95% CI, 2.41–4.86), 2.48 (95% CI, 1.96–3.13) and 1.98 (95% CI, 1.73–2.28), respectively. ORa for severe, moderate and borderline hypernatremia were 4.07 (95% CI, 2.92–5.62), 4.42 (95% CI, 2.04–9.20) and 3.72 (95% CI, 1.53–8.45), respectively. Borderline hyponatremia (ORa = 1.57; 95% CI, 1.35–1.81) and borderline hypernatremia (ORa = 3.47; 95% CI, 2.43–4.90) were still associated with in-hospital mortality after adjustment for classical confounding factors and new ones identified through the PheWAS analysis.
Conclusion: Borderline dysnatremia on admission is independently associated with a higher risk of in-hospital mortality. By using medical data automatically collected in the EHR and a new data mining approach, we identified new potential confounding factors that were highly associated with both mortality and dysnatremia.
Electronic supplementary material: The online version of this article (10.1186/s12911-017-0549-7) contains supplementary material, which is available to authorized users.


Background
Dysnatremia is one of the most frequent electrolyte disorders in adult patients admitted to hospital [1]. Plasma sodium concentration is essentially determined by plasma water intake and loss (in urine, feces, and sweat) and the water content of the plasma is finely regulated by a system including sensory organs (e.g. the carotid receptor and hypothalamus), vasopressin and the kidney. Sodium is the main factor determining plasma osmolality. Even small variations of plasma sodium concentration lead to the movement of water between intracellular and extracellular spaces, with a potential clinical impact.
The consensus definition of hyponatremia is a serum sodium concentration of less than 135 mmol/L. As water moves from body compartments with a lower osmolality to those with a higher osmolality, hyponatremia induces the movement of water into cells, leading to edema of the brain and other tissues. In addition to neurological complications [2], several epidemiological studies have identified an association between hyponatremia and mortality in hospitalized patients [3-5], the general population [6,7] and patients with specific diseases, including heart failure [8,9], liver cirrhosis [10,11], pulmonary embolism [12], pulmonary hypertension [13,14], pneumonia [15,16], chronic kidney disease (CKD) [17,18] and myocardial infarction [19,20].
The consensus definition of hypernatremia is a serum sodium concentration exceeding 145 mmol/L, and this condition is far less frequent than hyponatremia [21]. Hypernatremia is caused principally by water loss leading to plasma hyperosmolality, resulting in the dehydration of cells [22]. People with altered thirst sensation (including patients with specific hypothalamic lesions, the elderly and patients with altered mental status) or with difficult access to water are particularly exposed to the risk of severe hypernatremia. The symptoms are often not specific, particularly for intermediate levels of hypernatremia (e.g. thirst, muscle weakness, anorexia, nausea, vomiting and altered consciousness) but severe hypernatremia can lead to brain shrinkage potentially associated with cerebral bleeding, subarachnoid hemorrhage and death [22].
Several recent studies have identified a potential association between mortality and dysnatremia, even for small variations of sodium concentration [3-5, 18, 23, 24]. However, most of these studies were conducted in specific cohorts and it remains unclear whether the patients concerned died due to the effects of dysnatremia or from the underlying diseases causing the dysnatremia [25,26].
Real-life care data from electronic health records (EHRs) can be used to identify or confirm unknown disease correlations [27,28]. These automatically captured health data offer new research approaches. Methods and tools, known as data mining, have been developed over the years to analyse large collections of data [29]. In 2010, a new protocol, phenome-wide association analysis (PheWAS), was developed to identify potential correlations between many phenotypes, called the phenome, and a given genetic variant [30]. After the first proof of concept, PheWAS was also used to identify new potential correlations between non-genetic targets and phenotypes [31,32]. To our knowledge, no one has used such a method to explore systematically all the potential confounders available in an EHR.
Our objectives were (i) to assess the prevalence of dysnatremia in all patients admitted to hospital, (ii) to identify new confounders through a PheWAS analysis and compare them to classical confounders identified in the literature, and (iii) to confirm the impact of dysnatremia on in-hospital mortality after adjustment for classical and new potential confounding conditions.

Methods
Data source
The Hôpital Européen Georges Pompidou (HEGP) is a 795-bed quaternary care hospital. Data were extracted from the HEGP clinical data warehouse (CDW), which prospectively collects all data from the electronic health record (EHR) established at the creation of the hospital in 2000.

Study design and definitions
We conducted a retrospective observational study, extracting data for patients admitted to the HEGP between January 1, 2008 and June 31, 2014. We did not use data from before 2008 because ICD-10 coding was not optimal in our institution before that year. We included patients with a stay fulfilling the following criteria: at least one serum sodium concentration determined on the day of admission and a stay lasting at least 2 days. We used the same categories of sodium concentration as Funk et al. [5]: normal serum sodium concentration (135 ≤ [Na] ≤ 145 mmol/L) was used as the reference category. Hypernatremia was defined as borderline (145 < [Na] ≤ 150 mmol/L), mild (150 < [Na] ≤ 155 mmol/L) or severe ([Na] > 155 mmol/L) and hyponatremia was defined as borderline (130 ≤ [Na] < 135 mmol/L), mild (125 ≤ [Na] < 130 mmol/L) or severe ([Na] < 125 mmol/L). Serum sodium determinations were not corrected for the effects of hyperglycemia.
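The category boundaries can be written as a small lookup function. This is a minimal Python sketch, not the authors' code: the function name `sodium_category` and the label strings are illustrative, and severe hypernatremia is taken as [Na] > 155 mmol/L, the cut-off that keeps the three hypernatremia categories non-overlapping.

```python
def sodium_category(na_mmol_per_l: float) -> str:
    """Map an admission serum sodium value (mmol/L) to a study category.

    Function name and labels are illustrative assumptions.
    """
    if na_mmol_per_l < 125:
        return "severe hyponatremia"
    if na_mmol_per_l < 130:
        return "mild hyponatremia"
    if na_mmol_per_l < 135:
        return "borderline hyponatremia"
    if na_mmol_per_l <= 145:
        return "normal"  # reference category, 135 <= [Na] <= 145
    if na_mmol_per_l <= 150:
        return "borderline hypernatremia"
    if na_mmol_per_l <= 155:
        return "mild hypernatremia"
    return "severe hypernatremia"  # assumed cut-off: [Na] > 155
```

Testing the bounds in ascending order means each value falls into exactly one category, including the inclusive limits at 135 and 145 mmol/L.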

Data extraction
As patients may have been hospitalized more than once, with normal, high or low sodium levels, we extracted patients as follows, to ensure that each patient corresponded to a unique observation in the dataset: 1) we first extracted patients with an encounter showing severe hypernatremia; 2) we then extracted patients with an encounter showing mild hypernatremia, excluding the patients already selected in the first step; 3) we extracted patients with an encounter showing borderline hypernatremia, excluding the patients selected in the first two steps. We then applied the same process for hyponatremia. Finally, we extracted the first encounters of patients whose sodium levels were normal. We provide the SQL extraction code in Additional file 1.
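The stepwise extraction amounts to keeping, for each patient, the encounter whose category ranks highest in a fixed priority order. The following Python sketch illustrates that logic only; it is not the SQL of Additional file 1. The `Encounter` record, the `PRIORITY` list and the function name are hypothetical, and keeping the earliest encounter within a dysnatremia category is an assumption (the text specifies "first encounters" only for normal sodium levels).

```python
from collections import namedtuple

# Hypothetical record layout; admission_date is an ISO string so it sorts correctly.
Encounter = namedtuple("Encounter", "patient_id admission_date category")

# Extraction order described in the text: hypernatremia, then hyponatremia, then normal.
PRIORITY = [
    "severe hypernatremia", "mild hypernatremia", "borderline hypernatremia",
    "severe hyponatremia", "mild hyponatremia", "borderline hyponatremia",
    "normal",
]

def one_encounter_per_patient(encounters):
    """Return {patient_id: Encounter}, one observation per patient."""
    rank = {category: i for i, category in enumerate(PRIORITY)}
    selected = {}
    # Sort by category priority, then by date (assumption: earliest wins ties).
    for enc in sorted(encounters, key=lambda e: (rank[e.category], e.admission_date)):
        selected.setdefault(enc.patient_id, enc)  # first hit per patient is kept
    return selected

# Invented example data, for illustration only.
encounters = [
    Encounter(1, "2009-03-01", "normal"),
    Encounter(1, "2010-06-12", "borderline hyponatremia"),
    Encounter(2, "2011-02-02", "normal"),
    Encounter(2, "2008-01-05", "normal"),
    Encounter(3, "2012-01-01", "mild hyponatremia"),
    Encounter(3, "2013-01-01", "borderline hypernatremia"),
]
cohort = one_encounter_per_patient(encounters)
```

Sorting once and keeping the first hit per patient gives the same result as running the seven extraction steps in sequence with exclusion of already selected patients.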

Demographic characteristics and comorbidities
We extracted the following data from the HEGP-CDW for each included encounter: age, sex, duration of hospital stay, hospital admission category (via the emergency department vs. others), any surgery during the stay, intensive care unit (ICU) stay, palliative care, dialysis procedures and ICD-10 codes. We determined the ICD-10 adaptation of the Charlson comorbidity index [33]. We recorded vital status at hospital discharge.

Statistical analysis
Binary variables are reported as numbers and frequencies (%); continuous variables are expressed as means and standard deviations and variables with skewed distributions are expressed as medians and interquartile ranges (IQRs). We used chi-squared tests or Fisher's exact test, as appropriate, to compare frequencies between sodium concentration categories. We compared medians with Wilcoxon signed-rank tests. We first assessed the association between mortality and each serum sodium concentration category at admission. We also assessed the associations between classical potential confounding factors (i.e., age, sex, duration of hospital stay, number of diagnosis codes, hospital admission via the emergency department, ICU stay during hospitalization, palliative care, dialysis, dementia and the Charlson comorbidity index) and serum sodium concentration category with ANOVA, and the associations of these factors with mortality, using a logistic regression model. We categorized age, Charlson index and serum sodium level. We retained all potential confounding factors associated with both mortality and serum sodium concentration category. We adjusted for these factors in a multivariate logistic model in which these variables and serum sodium concentration category were treated as explanatory variables and in-hospital mortality was treated as the dependent variable.
The formula used for the multivariate models was.
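A generic form consistent with the description above can be sketched as follows. It is an assumption rather than the authors' exact specification: the symbols $\beta_k$, $\gamma_j$ and $x_{ij}$ are introduced here for illustration, all covariates are assumed to enter additively on the logit scale, and the confidence interval shown is the usual Wald interval.

```latex
% i indexes patients, k the non-reference sodium categories, j the confounders
\operatorname{logit} P(\text{death}_i = 1)
  = \beta_0
  + \sum_{k} \beta_k \, \mathbb{1}\!\left[\text{Na category}_i = k\right]
  + \sum_{j} \gamma_j \, x_{ij}

% adjusted odds ratio for sodium category k versus normal sodium, with 95% CI
\mathrm{ORa}_k = \exp(\hat{\beta}_k), \qquad
\text{95\% CI} = \exp\!\left(\hat{\beta}_k \pm 1.96 \, \mathrm{SE}(\hat{\beta}_k)\right)
```

With normal serum sodium as the reference category, each $\mathrm{ORa}_k$ compares the odds of in-hospital death in sodium category $k$ with the odds in patients with normal sodium, holding the other covariates fixed.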

Phenome-wide association analysis
We performed a phenome-wide association study (PheWAS) to identify new potential confounding factors driving the association between borderline dysnatremia and mortality. PheWAS is based on the testing of associations between traits and all ICD-10 billing codes, with the aim of identifying associations that have never been described [31]. We performed three PheWAS analyses to identify new potential confounding factors: (i) one in which mortality was considered as the PheWAS trait, (ii) one in which borderline hypernatremia was the PheWAS trait and (iii) one in which borderline hyponatremia was the PheWAS trait. We considered ICD-10 codes associated with both mortality and dysnatremia as newly identified confounding factors. Correction for multiple tests was performed using a Bonferroni correction.
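The selection rule (a code is retained when it is associated with both traits after Bonferroni correction) can be sketched in Python as follows. The sketch assumes the per-code p-values have already been computed and that both dictionaries cover the same set of codes. The function names, the code labels and the p-values are invented for illustration; this is not the R code of Additional file 1.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def shared_codes(p_mortality, p_dysnatremia, alpha=0.05):
    """Codes significantly associated with both traits.

    p_mortality, p_dysnatremia: dicts mapping a diagnosis code to the p-value
    of its association with the trait (assumed to share the same keys).
    """
    threshold = bonferroni_threshold(alpha, len(p_mortality))
    return sorted(
        code for code in p_mortality
        if p_mortality[code] < threshold and p_dysnatremia[code] < threshold
    )

# Invented p-values for four placeholder codes (threshold = 0.05 / 4 = 0.0125).
p_mort = {"code_1": 1e-6, "code_2": 0.2, "code_3": 1e-4, "code_4": 0.01}
p_dysn = {"code_1": 1e-3, "code_2": 1e-9, "code_3": 0.5, "code_4": 0.012}
candidates = shared_codes(p_mort, p_dysn)
```

Requiring significance for both traits keeps the candidate list limited to codes that could plausibly confound the association between sodium category and mortality, rather than codes linked to only one of the two.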
We used these new potential confounding factors to assess the association between borderline dysnatremia and in-hospital mortality. We compared different multivariate logistic models using classical confounding factors alone, confounding factors identified through the PheWAS analysis alone, and both types of factors, and finally chose the model with the lowest Akaike Information Criterion (AIC). We retained only significantly associated variables in the final model. We provide the R code of the PheWAS analysis in Additional file 1.
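The model comparison relies on the standard definition AIC = 2k − 2 ln L, where k is the number of estimated parameters and L the maximized likelihood. A minimal Python sketch of the selection step follows. The function names are illustrative, and the log-likelihoods and parameter counts are invented placeholders, not values from this study.

```python
def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_parameters - 2 * log_likelihood

def best_model(models):
    """Return the name of the model with the lowest AIC.

    models: dict mapping a model name to (log_likelihood, n_parameters).
    """
    return min(models, key=lambda name: aic(*models[name]))

# Invented placeholder fits for the three candidate adjustment sets.
models = {
    "classical confounders": (-6200.0, 15),
    "PheWAS confounders": (-6350.0, 20),
    "classical + PheWAS": (-6100.0, 30),
}
chosen = best_model(models)
```

Because the AIC penalizes each additional parameter, the combined model is preferred only when its gain in likelihood outweighs the cost of the extra covariates.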
All tests were two-tailed and we considered P values <0.05 to be significant. All statistical analyses were performed with R version 3.1.3 (RODBC, stats, MASS, MatchIt and epitools packages). This study was approved by the institutional review board of the HEGP (IRB#00001072 Study #CDW_2015_0013).

Results
Extraction of patient data
We extracted 606,524 patients from the CDW of the HEGP who were admitted between January 1, 2008 and June 31, 2014. At least one sodium concentration determination was available for the day of admission for 45,834 of these patients with hospital stays of longer than 2 days (Fig. 1).

Study population
Hyponatremia and hypernatremia were recorded for 12.0% and 1.0% of hospital stays, respectively. Severe hyponatremia ([Na] < 125 mmol/L) was reported in 0.6% of hospital stays and severe hypernatremia was reported in 0.08% of hospital stays. In the vast majority of cases (57,462 hospital stays, 88.3%), serum sodium concentration was normal at admission. The mean age ± standard deviation (SD) of the patients included was 60.8 ± 18.5 years, and 59.3% of the patients were male. The mean ± SD serum sodium concentration was 138.3 ± 3.9 mmol/L. Emergency admissions accounted for 12.1% of cases and 20.4% of the patients spent some of their hospital stay in the ICU. Dialysis was carried out during the hospital stay for 4.4% of patients; 1.9% of the patients had been diagnosed with dementia and 1.1% were receiving palliative care. The median duration of hospital stay was 5 (3-11) days and the median number of diagnosis billing codes per stay was 4 (2-7). The median Charlson comorbidity index was 1 (0-3). The observed in-hospital mortality for the total cohort was 4.0%.
The characteristics of patients on admission to the hospital are reported in Table 1.
All the patient characteristics other than surgery were associated with both mortality and serum sodium concentration and were therefore considered to be confounding factors (Table 2).
The association between mortality and dysnatremia was highly significant, with a dose-dependent effect (Fig. 2). After adjustment for these characteristics, the association between mortality and dysnatremia remained significant (Fig. 2 and Additional file 2), with a dose-dependent effect.

Phenome-wide association analysis
Diagnoses in the study cohort were coded with 960 different three-digit ICD-10 codes. We therefore tested the association between each ICD-10 code, mortality and natremia. This PheWAS analysis (Table 3 and Additional file 3) identified 13 new ICD-10 codes associated with both mortality (either an increased or a decreased risk) and borderline hyponatremia after correction for multiple testing.
The significance threshold was set at p < 5 × 10⁻⁵, corresponding to 0.05/960. Using the same method, the PheWAS analysis identified six new ICD-10 codes associated with both mortality (either an increased or a decreased risk) and borderline hypernatremia (Table 3 and Additional file 3).
We entered these new confounders in the multivariate regression model, removed factors with a P value > 0.05 and chose the model that minimized the AIC. After this process, borderline hyponatremia (OR = 1.57; 95% CI, 1.35-1.81; P < 0.0001) and borderline hypernatremia (OR = 3.47; 95% CI, 2.43-4.90; P < 0.0001) were still associated with in-hospital mortality (Table 4 and Additional file 4).

Discussion
Dysnatremia is one of the most frequent electrolyte disorders in medical care. It is associated with well-known morbidities and mortality. The risk of mortality is well documented for moderate and severe dysnatremia, but recent studies have revealed a potential association between mortality and borderline dysnatremia [3-5, 18, 23, 24]. However, most of these studies concerned specific patients, from specific units and/or with specific diseases. We studied a large cohort of unselected hospitalized patients presenting the entire spectrum of dysnatremia. The data were extracted from the HEGP-CDW, a prospectively and automatically constituted database, representing real-life care data in a quaternary care hospital. After adjustment for several predefined covariables, including the Charlson comorbidity index, to take into account the complex condition of the patients, our results confirmed the findings of previous studies: we identified a higher risk of in-hospital mortality for patients with borderline dysnatremia. Using claims data allowed us to take an original approach, a PheWAS analysis, to account for other potential confounding factors. We thus identified new potential confounding factors without any a priori hypotheses. This approach attenuated the observed association, which supports the relevance of the newly identified confounding factors, while the association between borderline dysnatremia and in-hospital mortality persisted. To our knowledge, this is the first time such a method has been used for epidemiologic purposes.
The prevalence of hyponatremia at admission in patients hospitalized for at least 2 days was 12%. This estimate is close to the prevalence of 13% reported by Waikar et al. [3] for a comparable cohort, but lower than the 17.7% reported by Funk et al. [5] for patients admitted to the ICU. The prevalence of hypernatremia in our cohort was 1%, which is within the range of values generally reported: 0.2 to 2.5% of the general hospitalized population [34].
The patients with dysnatremia studied here were older than the general hospital population, as already reported for dysnatremia [5] and hyponatremia [3,35]. An increase in the risk of in-hospital mortality in patients with hyponatremia has already been reported in several studies. In 2008, Zilberberg et al. [35] reported a 55% increase (OR 1.64, 95% CI 1.42-1.69) in the risk of in-hospital mortality after adjusting for various confounders in a large set of 198,281 discharge summaries. In 2009, Waikar et al. [3] reported a similar risk after multivariable adjustment, even for borderline hyponatremia (…18-1.51). These previous results and the results presented here confirm the relationship between hyponatremia and in-hospital mortality even for small variations of serum sodium concentration.
The relationship between hypernatremia and in-hospital mortality has been less extensively studied. The small number of epidemiologic studies carried out may be explained by the low prevalence of hypernatremia. However, Darmon et al. [34] showed that there was a higher risk of in-hospital mortality in patients with borderline (serum …

Discovering new potential confounding factors using PheWAS analysis and CDW
The re-use of medical data automatically collected in the CDW allowed us to apply a new data mining approach, a PheWAS analysis. We thus identified new potential confounding factors, all of which were highly associated with both mortality and dysnatremia (Table 3). Most of them were not among the predefined confounders. These new confounders cannot be used alone in the multivariate model, as they do not provide optimal adjustment. Relationships between these new confounders and dysnatremia have to be confirmed in other datasets. However, used together with the predefined confounders, they provided a better adjustment in the final model (Table 4).
This is the first time, to our knowledge, that such an approach, combining expert knowledge and data mining, has been used for epidemiologic purposes.

Limitations
Use of procedures and billing codes as covariates
The limitations of using EHR data for epidemiologic purposes have been well described [36-38]. One of those, undercoding, may vary between diseases [39], depending on the severity of the disease and its classification as a principal or additional diagnosis. However, this bias decreases power, as patients may appear healthier than they really are. In our study, all the selected variables were associated with both dysnatremia and mortality. Another major concern is misclassification, which might have affected the attribution of patients to the different groups. The accuracy of ICD-10 billing codes has been confirmed in previous studies, particularly for the Charlson comorbidity index diagnosis codes [39,40]. In a 2011 study based on the population-based Danish National Registry of Patients, the positive predictive value of the different codes ranged from 82.0% for "diabetes mellitus with chronic complications" to 100% for "chronic pulmonary disease" or "hemiplegia" [41]. We therefore consider this bias to be minimal and nondifferential, although it was not possible to determine the extent to which this misclassification might have affected our results. The observation in our cohort of an association between higher Charlson comorbidity index values and higher in-hospital mortality supports this hypothesis and indicates that the re-use of EHR data for this kind of study is feasible.

Low prevalence of hypernatremia
The curve of the relationship between serum sodium concentration and mortality was U-shaped, but the U was more pronounced for hyponatremia than for hypernatremia, with a paradoxically weaker relationship for mild and severe hypernatremia than for borderline hypernatremia. This paradoxical decrease was also observed by Sakr et al., in their study of 277 patients hypernatremic on admission to a surgical ICU [23]. This observation may reflect a lack of power, due to the small numbers of patients with hypernatremia in the two cohorts. This hypothesis is supported by the absence of such a paradoxical observation in the larger cohorts of Funk et al. [5] and Darmon et al. [24].

Causal association between dysnatremia and in-hospital mortality
As in other studies, we identified a statistically significant association between dysnatremia and in-hospital mortality, even for small variations, but the direct contribution of dysnatremia to mortality remains unclear. The association between severe dysnatremia and morbidity can be explained by severe damage to the brain, but it remains unclear why small variations of serum sodium concentration would cause morbidity and mortality. It is thus possible that dysnatremia is merely a marker of the severity of an underlying disease that itself leads to death.
Several studies have investigated the causal relationship between hyponatremia and mortality and have generated several hypotheses. First, hyponatremia is known to be related to higher mortality in patients with diseases activating the renin-angiotensin system and increasing vasopressin secretion, such as heart failure [8,9], liver cirrhosis [10,11], pulmonary embolism [12], pulmonary hypertension [13,14], pneumonia [15,16], and myocardial infarction [19,20]. In this causal model, activation of the neurohumoral systems induces hyponatremia, and the associated diseases are the cause of death. However, recent studies have reconsidered this assumption: (i) Waikar et al., in their study of a prospective cohort of 98,411 patients in 2009, found that the correction of hyponatremia during hospitalization attenuated the increase in the risk of death associated with hyponatremia [3]; (ii) 2 years later, the same team showed that, even for oligoanuric patients, in whom neurohumoral responses cannot influence serum sodium levels, hyponatremia was associated with an increase in the risk of mortality [17]; (iii) this important result was confirmed in 2012 by Kovesdy et al., in a large cohort of 655,493 patients with chronic kidney disease in whom the relationship between dysnatremia and mortality was not dependent on the severity of kidney disease [18].
In this particular cohort, dysnatremia was related to mortality even after adjustment for multiple confounding factors.

Conclusion
Our results confirm, by combining real-life care data and a phenome-wide association analysis, that even small variations of sodium concentration are associated with a poor prognosis. The re-use of medical data automatically collected in the CDW allowed a new data mining approach. We thus identified new potential confounding factors, all of which were highly associated with both mortality and dysnatremia.