Skip to main content

Identifying subgroups in heart failure patients with multimorbidity by clustering and network analysis


This study presents a workflow for identifying and characterizing patients with Heart Failure (HF) and multimorbidity utilizing data from Electronic Health Records. Multimorbidity, the co-occurrence of two or more chronic conditions, poses a significant challenge on healthcare systems. Nonetheless, understanding of patients with multimorbidity, including the most common disease interactions, risk factors, and treatment responses, remains limited, particularly for complex and heterogeneous conditions like HF. We conducted a clustering analysis of 3745 HF patients using demographics, comorbidities, laboratory values, and drug prescriptions. Our analysis revealed four distinct clusters with significant differences in multimorbidity profiles showing differential prognostic implications regarding unplanned hospital admissions. These findings underscore the considerable disease heterogeneity within HF patients and emphasize the potential for improved characterization of patient subgroups for clinical risk stratification through the use of EHR data.

Peer Review reports


As life expectancy increases, the population suffering from more than one chronic condition is increasing dramatically. The co-occurrence of two or more chronic conditions, defined as multimorbidity, is estimated to affect around 50 million people in the European Union, making it one of the most challenging problems faced by the health sector at the current time [1]. Multimorbidity is a significant healthcare problem, associated with poor health outcomes, poorer quality of life, rapid disease progression, and increased healthcare costs [1, 2].

Heart Failure (HF) is estimated to affect 64.3 million people worldwide. In developed countries, its prevalence is generally estimated at 1% to 2% of the general adult population [3]. Despite significant improvements over the last years, HF prognosis remains poor and patients’ quality of life remains low [4]. The current approach to managing HF relies on phenotyping patients according to Ejection Fraction (EF), a measure of cardiac function that categorizes patients into distinct groups with prognostic and therapeutical implications [5]. This is, however, an oversimplification of a complex phenomenon. In recent years, several attempts were made to better characterize HF population regarding etiology, symptoms and comorbidities [6, 7].

The Electronic Health Record (EHR) is the tool for capturing patients’ medical history, containing structured data, such as laboratory results and diagnoses, along with unstructured data, such as radiology reports, discharge summaries, and other clinical narratives. EHRs, therefore, present a rich data source of patients with complex and heterogeneous conditions, such as HF, COPD (Chronic Obstructive Pulmonary Disease), Dementia, and Parkinson’s Disease, leading to important insights regarding disease pathophysiology [7]. The characterization of patient cohorts through their phenotypes not only better elucidates individual conditions but can also provide a better understanding of the most common associations and interactions between diseases, as well as improving clinical risk assessment [8, 9]. Disease sub-typing is also important for drug development and clinical trial recruitment strategies [10].

There are several approaches to phenotyping the EHR. In recent years, research has been shifting to machine learning methodologies, including unsupervised learning methods that require no prior classification and are truly data-driven [10]. One of the challenges of applying cluster analysis to EHR data is the mixture of categorical and continuous data [11]. Among previous methods to tackle mixed-type data is the use of hybrid distance approaches, that is, using specific distance functions prepared for mixed-type data before applying clustering, such as Gower’s Distance [12]. Other approaches include performing data transformations, such as discretization or dimensionality reduction, using Factor Analysis of Mixed Data (FAMD) [13].

Cluster interpretation in clinical context is a complex task. For instance, clustering data with a high number of features turns interpretation and visualization difficult. Graph visualizations are a simple representation of how entities connect and interact with each other and several previous works explore clustering over graph representations [14,15,16]. When applied to clinical data, graphs can provide a better understanding of patient characteristics. Phenotypic Disease Networks (PDNs) are a network representation of comorbidities that can be used to study their associations, differences in phenotypes between patients, and disease progression. In a PDN, nodes represent diseases and weighted edges represent links between diseases. The weight can be quantified using different measures, such as co-occurrence frequency or Pearson correlation. These networks have the ability to reveal non-obvious relationships between comorbidities that could bring up important information to improve patient treatment approaches [17].

Another main challenge of finding patient clusters is assessing their clinical implications, including prognostic information. Survival analysis is used for comparing the risk for an event of interest for different groups [18]. Commonly used statistical tests include the estimation of the survival curve using the Kaplan-Meier model, the statistical comparison of two groups using the log-rank test, and the possibility of incorporating additional variables through Cox’s hazards model [19].

This study presents a comprehensive workflow for the identification and characterization of HF patients by harnessing the power of EHR data and employing advanced machine learning methodologies. Our approach seeks to enhance the clinical interpretation of derived findings, enabling healthcare professionals to better understand disease associations, interactions, and progression. By leveraging the rich information contained within EHRs, we aim to uncover novel insights into disease pathophysiology and patient phenotypes, ultimately contributing to improved risk assessment, treatment strategies, and overall patient outcomes. The proposed workflow not only holds promise for advancing our understanding of HF but also serves as a valuable framework for investigating other complex and heterogeneous conditions, paving the way for more personalized and effective healthcare in the face of the growing challenge of multimorbidity.


We developed a workflow for the identification and characterization of HF patient subgroups. The workflow consists of the following steps (see Fig. 1): (1) preprocessing of the data and exploratory data analysis to characterize the dataset; (2) clustering; (3) statistical analysis for characterization and visualization of the obtained clusters, (4) survival analysis to stratify the patient clusters according to the risk for a given outcome.

Fig. 1
figure 1

Overview of the proposed approach for the identification and characterization of HF patient subgroups using data from Hospital da Luz Lisboa (HLL)


We developed the workflow using real-world data from Hospital da Luz Lisboa, Portugal (HLL), the largest private hospital in Portugal. HLL is a university hospital with availability of all medical specialties. Despite being a tertiary hospital, it also has significant primary care activity with a Family Medicine department, covering the full spectrum of care for patients with multimorbidity. The dataset used to develop the pipeline was generated from the EHR of HLL. We used an initial population of 54 827 patients, with an observation period between January 2007 and August 2021. From this initial pool, 3 745 HF patients with multimorbidity were identified using a locally developed phenotyping algorithm that used both ICD-9-CM does and free text mentions from the field “diagnostics and symptoms” from our EHR system (see Table 1). For this algorithm we used keywords that directly refer to HF name, abbreviation or New York Heart Association (NYHA) mention, which is a commonly used staging scale to classify HF symptoms

Table 1 ICD-9 Codes and keywords used to identify HF patients. ICFEp - Insuficiência Cardíaca com Fração de Ejeção Preservada, ICFEr - Insuficiência Cardíaca com Fração de Ejeção Reduzida, IC - Insuficiência Cardíaca, ICC - Insuficiência Cardíaca Congestiva, NYHA - New York Heart Association
Table 2 Feature summary of the HF dataset

For the clustering analysis, relevant features were selected based on prior literature review and clinical expertise. These features consisted of clinical variables, demographics, physical characteristics, laboratory data prescriptions, and the most common comorbidities associated with HF, amounting to a total of 35 features (see Table 2). Lifestyle-related features and gender were provided as text fields, comorbidities as binary variables based on the presence or absence of the disease, and all others as numeric. In addition to the features used for clustering, we also extracted the date of HF diagnosis and gathered data on other comorbidities, relevant drug prescriptions, and clinical outcomes. Clinical outcomes evaluated in this study were unplanned hospital admissions: hospitalizations and emergency department visits. Prescription data spans a period of 9 years and 8 months, ranging from January 2012 until August 2021. We grouped drug prescriptions into pharmacological groups that are relevant for HF treatment: ACEis/ ARBs, Beta-blockers, Diuretics, Digoxin and MRA [20]. We also analyzed other groups that are relevant for the most common comorbidities: anticoagulants, antiplatelets, statins and bronchodilators. The number of different drugs is relative to the observation period for medical prescriptions (9 years and 8 months). Percentages of drug groups represent the percentage of patients that had at least one prescription of the relevant medication group during the period analyzed.


Prior to the clustering, it was necessary to preprocess the dataset obtained from the EHRs extraction. Categorical features were converted into numeric binary features and features with a prevalence lower than 2% in the cohort were removed. Features with a percentage of missing values higher than 40% were deleted. We utilized the age of patients as recorded in their final observation. Regarding BMI and laboratory results we compiled the mean value for data analysis purposes. This decision was made to mitigate the risk of bias that might arise from relying solely on recent values, which could potentially be anomalous or not reflective of the patient’s typical health status. We employed two imputation methods that were previously shown to have the least imputation error and prediction difference when applied to laboratory data: missForest and multivariate imputation by chained equations (MICE) [21]. Missing values were imputed using Python’s function Iterative Imputer, which is based on the MICE method. The MICE method models the missing values of each feature as a function of other features [22]. To do that, at each step, one of the feature columns is designated output y and the rest of the feature columns are designated as inputs x. Comorbidities were identified using ICD-9 codes. Additionally, we used laboratory data and BMI to increase the sensitivity of anemia and obesity phenotyping, respectively. Continuous features were normalised to have a mean of 0 and a standard deviation of 1. Categorical binary features were scaled from \(\{0,1\}\rightarrow \{-0.5,0.5\}\). After preprocessing, the total number of features used for clustering was 25. The features are identified in Table 2 in bold.

Patient clustering

To apply clustering to the HF dataset, it was first necessary to determine which clustering algorithm and possible complementary techniques to use, and how to evaluate the clustering. It also was necessary to take into account that the dataset was composed of both numerical and categorical data. Several clustering algorithms were tested to understand which one would be more suitable for the HF data. Based on the literature regarding clustering mixed-type data and on HF clustering, we tested the following combinations of methods:

  1. 1

    Gower’s distance matrix [12] together with Ward’s Agglomerative Hierarchical Clustering [23],

  2. 2

    dimensionality reduction followed by Ward’s Agglomerative Hierarchical Clustering [23];

  3. 3

    dimensionality reduction followed by K-Means [24].

Gowers distance is a metric that measures the similarity of two items with mixed numeric and non-numeric data [12]. The Gower distance for instances x and y is given by:

$$\begin{aligned} d_G(x, y) = \frac{\sum _{j=1}^{m} w_j \cdot f_j(x_j, y_j)}{\sum _{j=1}^{m} w_j} \end{aligned}$$


$$\begin{aligned} f_j(x_j, y_j) = \left\{ \begin{array}{ll} \frac{|x_j - y_j|}{r_j}, &{} \text {if}\ x_j\ \text {and}\ y_j\ \text {are interval scales} \\ 1 \text { if } x_j \ne y_j, &{} \text {if}\ x_j\ \text {and}\ y_j\ \text {are categorical scales} \end{array}\right. \end{aligned}$$

where m is the number of features, wj is the feature weight and I is the indicator function, that is, I is 1 if xj and yj are equal and 0 otherwise. The weights wj were considered 1 for all features.

The dimensionality reduction method chosen was Factor Analysis of Mixed Data (FAMD), a principal component method specific to analyse quantitative and qualitative variables [13]. The FAMD algorithm can be seen as a mix between Princiapl Componente Analysis (PCA) and Mulitple Correspondence Anaoysis (MCA), as it acts as PCA quantitative variables and as MCA for qualitative variables [25].

Given that we were conducting an exploratory analysis without ground truth labels, we could rely solely on the data being clustered and compute internal validity indices for the resulting labels. The used indices were Silhouette Score, Calinski-Harabasz and Davies-Bouldin [26,27,28]. After computing the three indices for the different clustering algorithms and different values of k, we chose the best method using a majority vote (i.e. the algorithm and k that performed best in at least two of the indices). Additionally, a minimum of \(N > 375\) was also defined to promote stability and ensure that none of the clusters had less than 10% of the total population, an approach that was previously employed by others [7].

Statistical analysis and characterization of obtained clusters

Clusters were characterized according to age, gender, and comorbidities. Demographic, clinical, and laboratory characteristics were compared using Chi-squared tests for categorical variables and Kruskal-Wallis test for continuous variables. Differences were considered statistically significant whenever p-value was inferior to 0.05.

To enhance the interpretation and clinical meaning of the derived findings, we developed a visualization of the clusters’ comorbidities prevalence and associations through graph representation of each cluster. Each graph’s node represents a disease in the cluster and an edge represents a co-occurrence of the two nodes (diseases) connected by that edge. The graphs were created with Python’s NetworkX package and Gephi for visualization. The settings were adjusted so that node size was proportional to the number of connections to other nodes (node degree) and edge thickness was proportional to disease co-occurrence prevalence (edge weight). Co-occurrences (edges) with a co-occurrence prevalence lower than 2% were discarded to declutter the visualisation. Graph representation of comorbidities provides a better understanding of which diseases co-occur more frequently and which diseases have the most connections with other diseases.

Survival analysis

We conducted a survival analysis to assess outcome differences among the derived clusters. Survival curves and Hazards Ratio (HR) were computed for unplanned hospital admissions (hospitalizations and emergency department visits), for each cluster [18]. To achieve this, patients were temporally aligned at the starting point \(t_{0}\), the moment of HF diagnosis. Time intervals between HF diagnosis and outcomes were subsequently derived. Patients who did not experience the outcome were censored. Analysis was performed using Python’s package Lifelines, to create Kaplan-Meier curves and Cox proportional regression models. Kaplan-Meier curves were computed for each outcome individually and were stratified per cluster, with differences between groups tested using the log-rank test. For Cox proportional regression we adopted a previously reported strategy by using three different models for each outcome [29]: an unadjusted model that only took into account the clusters, a second model adjusted for age and gender, and a third model adjusted for age, gender and the laboratory value NT-proBNP, which is a risk marker for HF prognosis. HR from Cox regression models are presented in relation to the lowest risk cluster (determined by the lowest percentage of outcomes).


Description of heart failure population

The studied population included 3,745 patients who had HF and at least another comorbidity (i.e. multimorbidity patients with HF). The median age was 82 years (IQR 73-88), and 52.84% were women (Fig. 2). The median number of chronic diseases was 5 (IQR 3-7) with approximately 40% of the population having between 3 and 5 comorbidities and approximately 30% having 6 to 8 comorbidities.

Fig. 2
figure 2

Gender and age distribution of the dataset of HF patients with multimorbidity

The most common comorbidity in the dataset was Hypertension (56.58%), followed by Anaemia (53.75%) and Atrial fibrillation (33.78%). Other highly prevalent conditions were Chronic kidney disease (24.94%) and Coronary artery disease (24.09%) (Fig. 3). In addition to assessing the prevalence of each comorbidity, a graph representation of the comorbidities was also computed (see Fig. 4). In this analysis, Hypertension, Atrial fibrillation and Coronary artery disease are the conditions with the highest prevalence and that co-occur most frequently with other diseases.

Fig. 3
figure 3

Prevalence of comorbidities used for clustering the HF population. Abbreviations: HT for Hypertension, AFIB for Atrial Fibrillation, CKD for Chronic Kidney Disease, ICM for Ischemic Heart Disease, VD for Valvular Heart Disease, CVD for Cerebrovascular Disease, COPD for Chronic Obstructive Pulmonary Disease, and CM for Other Cardiomyopathies

Fig. 4
figure 4

Graph representation of comorbidities used for clustering. In the graph a node represents a disease and its size is proportional to the node degree. An edge represents a co-occurrence of two diseases and its width is proportional to the prevalence of the co-occurrence in the dataset. Using this visualisation, it is possible to obtain extra insights into the relationship between comorbidities, as it is possible to observe which diseases tend to co-occur more frequently. The thickness of the edges makes it possible to verify that HT and Anaemia, and HT and AFIB occur frequently together. CKD and ICM also show a high co-occurrence with HT and Anaemia. ICM–Ischaemic Cardiomyopathy, HT–Hypertension, AFIB–Atrial Fibrillation, CVD–Cerebrovascular Disease, VD-Valvular Disease, CKD–Chronic Kidney Disease, COPD–Chronic Obstructive Pulmonary Disease, CM–Other Cardiomyopathies

Clustering algorithm and choice of k

Taking into account the mixed-type nature of the data (containing both binary an continuous variables), we explored three different clustering approaches and results are summarized in Table 3. We found that the combination of Gower’s distance matrix and Ward’s Hierarchical Agglomerative Clustering was the one that produced clusters with higher scores in all clustering metrics analyzed.

Table 3 Values obtained for clustering metrics Silhouette Score. Calinski-Harabasz Index and Davies-Bouldin Score for the different clustering algorithms, namely, Gower’s Distance and Hierarchical Clustering, FAMD and Hierarchical Clustering, and FAMD and K-Means. For Silhouette Score and Calinski-Harabasz a higher value indicates a better performance, for Davies-Bouldin a lower value is best

After defining the clustering method, the choice of the number of clusters (k) was performed by applying Silhouette Score, Calinski-Harabasz, and Davies-Bouldin. We used a minimum of 375 patients to promote stability and performed metrics scores for each value of k so that a majority vote (i.e. the best value of k with the highest score in the majority of the metrics) is considered. Table 4 shows the clustering evaluation metrics for clusters with \(k \in [2,12]\). We found that \(k=4\) was the best value across clustering metrics and clinical interpretability. Although \(k=2\) had a slighter better score, it produced two asymmetric clusters where one cluster had the majority of the patients and the other a minority (under 375) that also did not differ significantly in most measured attributes. Therefore, the value chosen for the analysis was \(k=4\).

Table 4 Values obtained for clustering metrics Silhouette Score. Calinski-Harabasz Index and Davies-Bouldin Score for Hierarchical Clustering with Gower’s Distance using k=[4]. For Silhouette Score and Calinski-Harabasz a higher value indicates a better performance, for Davies-Bouldin a lower value is best

Multimorbidity characterization

Figure 5 shows a tileplot of cluster-specific percentages of comorbidities that allows to easily find the most prevalent comorbidities in each cluster while comparing the prevalence of diseases among clusters. Table 5 shows the detailed characterization of each cluster and of the HF dataset. Cluster1 included older male patients with more chronic conditions. Anemia, CKD, and Hypertension were particularly prevalent. In addition, this was the group with the highest values of NT-proBNP. Differently, Cluster2 was characterized by elder women, had highly prevalent Hypertension, AF, and Obesity, and the median NT-proBNP value was the lowest among the four clusters. Cluster3 was the largest one (n=1 231), had male predominance (59.46%), and was characterized by an almost universal prevalence of Anemia (99.35%), despite the lower number of prevalent comorbidities compared to the other clusters. Cluster4 had the lowest median age and a female preponderance (73.92%). This cluster was also the cluster with the lowest number of comorbidities.

Fig. 5
figure 5

Cluster-specific percentages of comorbidities. A darker color indicates a higher percentage of the comorbidity in the cluster. ICM-Ischaemic Cardiomyopathy, HT-Hypertension, AFIB-Atrial Fibrillation, CVD-Cerebrovascular Disease, VD-Valvular Disease, CKD-Chronic Kidney Disease, COPD-Chronic Obstructive Pulmonary Disease

Table 5 Characterisation of the four resulting clusters obtained from HF dataset. Continuous variables are described as median (inter-quartile range) and categorical variables as %. p-values for the comparison of the characteristics across clusters

Network analysis provides additional insight regarding multimorbidity and cluster complexity in HF patients, both through visual inspection of graphs and by comparing average degree (average number of other diseases that are connected to one disease) and average clustering (measure of density that indicates the degree to which nodes in a graph tend to cluster together) coefficients. The clustering coefficient is a measure that quantifies the tendency of nodes in a graph to form clusters or groups of interconnected neighbors. It helps us understand the local structure of the network by assessing how likely it is for neighboring nodes to also be connected to each other [14]. Figure 6 shows the graph representation of each cluster’s comorbidities. We can see that Cluster1 has the highest number of nodes (11) and edges (55), while Cluster4 has the lowest (7), therefore indicating that patients from the first cluster have higher complexity than patients from the former. The most common comorbidity associations are also depicted. For instance, Cluster1 has a high average degree of 10, meaning that all diseases are connected to each other, while also having strong associations between diseases. Cluster2 also has a high average degree of 8.9, whereas Cluster3 and Cluster4 have much lower average degrees (5 and 2, respectively). The fact that all nodes are of similar size means that every disease co-occurs at least once with almost all other diseases. The width of the edges is what allows us to understand which of these co-occurrences, also known as dyads, are more common in each cluster. We can witness that not only comorbidity prevalence differs between clusters, but also dyads exist differently between clusters. For instance, while in Cluster1 the strongest dyads are CKD/Anemia and Hypertension/Anemia, in Cluster2 the most significant dyads are Obesity/AF and Hypertension/AF. Cluster3 has a high number of patients with Hypertension and Anemia, some of them also showing AF, Coronary artery disease, and Valvular heart disease. Cluster4 has a common association between Hypertension and AF, while Obesity is connected to several diseases in the graph but with a lower co-occurrence. The average clustering coefficient is also highest for Cluster1 and lowest for Cluster4 (1 vs. 0.48), which suggests patients from Cluster1 have higher order interactions of diseases than patients from Cluster4.

Fig. 6
figure 6

Prevalence and graph representation of comorbidities used for clustering in the HF dataset. In the graph a node represents a disease and its size is proportional to the node degree. An edge represents a co-occurrence of two diseases and its width is proportional to the prevalence of the co-occurrence in the dataset. ICM-Ischaemic Cardiomyopathy, HT-Hypertension, AFIB-Atrial Fibrillation, CVD-Cerebrovascular Disease, VD-Valvular Disease, CKD-Chronic Kidney Disease, COPD-Chronic Obstructive Pulmonary Disease, CM-Other Cardiomyopathies

Drug prescription analysis

Table 6 shows the mean drug prescriptions of specific pharmacological groups in each cluster. As expected, patients show prescription profiles that are concordant with HF severity and comorbidity profiles. Among all clusters, anticoagulants are the most frequent drug prescription group (47.42%). Cluster1 shows the highest percentage of utilization of all drug groups, including drugs used for treating comorbidities (bronchodilators and hematinic factors), which is both consistent with it being the cluster with most advanced disease stages and the highest number of comorbidities. Patients from Clusters 3 and 4 have a lower percentage of all drug groups when compared to Cluster1 and Cluster2. In Cluster4, despite the high prevalence of Anemia, only 16.37% of patients were prescribed hematinic factors.

Table 6 Average presciptions per year and prevalence of prescribed drugs per cluster. p-values for the comparison of the characteristics across clusters. ACEi - angiotensin-converting enzyme inhibitor; MRA - Aldosterone receptor antagonists; DPP4i - Dipeptidyl peptidase-4 inhibitor

Unplanned hospital admissions

Overall, 60.91% of the patients had at least 1 hospitalization and 83.71% had at least one emergency admission during the observation period (see Table 7). Cluster1 had the highest rate of unplanned hospital admissions, both aggregated and number of admissions per year. Cluster2 and Cluster3 had lower percentages of unplanned hospital admissions, while Cluster4 had the lowest rate among the four clusters.

Table 7 Characterisations of outcomes related variables per cluster and in the entire dataset. Continuous variables are described as median (inter-quartile range) and categorical variables as %. p-values for the comparison of the characteristics across clusters

Figure 7 shows univariate Kaplan-Meier curves for the outcomes of hospitalisation and emergency admission, stratified by clusters, for a time period of 2 years post HF diagnosis. Cluster1 appears as the highest risk cluster, having the highest rate of unplanned hospital admissions at all times. Contrarily, Cluster4 shows the lowest rate of unplanned hospital admissions during the study period, while Cluster2 and Cluster3 have in-between values.

Fig. 7
figure 7

Kaplan-Meier survival curves for the outcomes Hospitalisation and Emergency admission for each cluster (within 2-years after HF diagnosis)

Table 8 shows the hazard ratios for three different Cox proportional hazard models computed: Model 1 is an unadjusted model, Model 2 is adjusted for age and gender only and Model 3 is adjusted for age, gender and NT-pro-BNP level. Hazard ratio (HR) is computed in relation to the lowest risk cluster (Cluster4). Cluster 1 shows the highest HR for both emergency department visit (5.86; CI 4.80 - 7.15) and hospital admission (2.73; CI 2.38 - 3.14). Even after adjusting for age, gender, and NT-pro-BNP, differences (Models 2 and 3) in HR among clusters remain significant.

Table 8 Risk of clinical events hospitalisation and emergency admission compared with Cluster4 (lowest risk). Hazard ratios and 95% confidence intervals computed using Cox Regression. adjusted for Age, Gender and NT-proBNP (Model3). Model1 adjusted for Age and Gender, Model 2 adjusted for Age, Gender and NT-proBNP


The results of this study add to the increasing evidence that HF is a complex syndrome with diverse phenotypes that are partly explained by patterns of multimorbidity. By utilizing clustering and network analysis on a real-world dataset, we identified four distinct clusters of HF patients with differing comorbidity profiles. These clusters exhibited varying disease prevalences and co-occurrence patterns, as well as differences in demographic characteristics and clinical severity, as measured by the risk of unplanned hospital admissions.

We encounter similarities between our findings and previous works. For example, a prior study using model-based clustering on 12 comorbidities of an HF cohort identified five clusters that differed in comorbidities, sociodemographic factors, and prognosis [4]. This study also found a cluster with a worse prognosis and the highest percentage of comorbidities, a cluster with anemia predominance, and lower-burden clusters. Another recent study specifically evaluated multimorbidity profiles in the HF population and identified six different profiles of multimorbidity using exploratory factor analysis, which also had prognostic implications [30]. In our study, Cluster1 had the highest risk profile, with older patients and a greater number of comorbidities. It represents what we call as the Cardio-renal-metabolic syndrome where patients are more likely to be men, have more advanced HF, higher levels of NT-pro-BNP, more severe CKD, and a longer history of Diabetes mellitus. These patients frequently have HF with reduced EF. Cluster2, on the other hand, was slightly younger, had a high prevalence of AF, Obesity, and HT, and showed the second highest risk for the outcome. This pattern clearly resembles one found in the literature with female predominance, obesity, Pulmonary Hypertension, lower levels of NT-pro-BNP and preserved ejection fraction [31]. Cluster3 and Cluster4 showed a better prognosis but differed in the age of patients and the prevalence of conditions.

This study has several strengths, including a novel workflow for phenotyping HF patients through EHR that can be replicated in different settings. We utilized Ward’s Hierarchical Clustering with Gower’s distance, an underexplored clustering algorithm that has advantages on multimodal data such as EHR [12]. Additionally, we employed network analysis metrics, which we believe are highly intuitive for understanding not only the prevalence of comorbidities but also the co-occurrence under different clusters. We utilized a real-world longitudinal database, which allowed us to conduct a survival analysis, presenting advantages over most similar clustering studies that often discretize time and do not account for censoring on outcome measurement [19].

We must acknowledge several limitations in this work. From a methodological perspective, although we chose the optimal number of clusters according to several described metrics, we acknowledge that they are not data-agnostic, and therefore, it may be difficult to compare what constitutes ’good’ clustering across studies [32]. Additionally, as with other studies that make secondary use of EHR data, there are important implicit biases that warrant further validation of findings in different datasets. For example, we make the assumption that the absence of prescribed medications implies that these patients are not currently undergoing any drug treatment. However, it should be noted that we cannot guarantee that they may not have prescriptions for the same drugs at other healthcare facilities, which could potentially impact our conclusions. Although we developed and internally validated a phenotyping algorithm for HF using ICD-9 codes and free text from the field “diagnostics and symptoms”, we understand both false positive and false negative cases might occur, due to wrong data entry under-reporting and and a lack of comprehensiveness in keywords. Using more data modalities and possibly statistical learning methods for phenotyping may improve case detection accuracy. While we measured unplanned hospital admissions as a surrogate for clinical severity and healthcare resource utilization, different outcomes of interest should also be considered, such as mortality and patient-reported outcomes. We also did not include EF information in our data analysis, an essential parameter for establishing the diagnosis and classifying HF. The patients included in this study were identified as having the diagnosis of HF already established but we strongly believe that it would be insightful to analyze EF range of our derived clusters in future work. We intend to develop future work to tackle these limitations.


This study developed a data workflow to identify and phenotype subgroups of HF patients with multimorbidity, using real-world data from a hospital’s EHR. We identified four clusters that differed in clinical and demographic characteristics, as well as in risk for unplanned hospital admissions. Our findings strengthen the conviction that HF is a complex syndrome with different phenotypes, and that currently available EHR can be utilized to find subgroups with prognostic implications that may be clinically useful for tailoring management. Future work should clarify the relevance of these findings on datasets from other hospitals and through the incorporation of other features extracted from clinical notes and medical imaging (e.g. EF).

Availability of data and materials

The data used in this study, while obtained from deidentified Electronic Health Records, are subject to strict confidentiality and privacy regulations. Data access requests may be considered on a case-by-case basis and will require approval from the relevant institutional review boards and data custodians. Researchers interested in obtaining access to the data for the purpose of validating or extending the findings presented in this paper should contact the corresponding author for further information on the data access process and the necessary legal and ethical requirements.



Angiotensin-converting enzyme inhibitors


Angiotensin receptor blockers


Body mass index


Chronic kidney disease


Chronic obstructive pulmonary disease


Electronic health records


Ejection fraction


Factor analysis of mixed data


Heart failure


Hospital da Luz Lisboa


Hazards ratio




Intensive care unit


Internation classification of diseases version 9


Interquartile range


Multivariate imputation by chained equations


Mineralocorticoid receptor antagonists


Phenotypic disease networks


  1. Navickas R, Petric VK, Feigl A, Seychell M. Multimorbidity: What Do We Know? What Should We Do? J Comorbidity. 2016;6:4–11.

    Article  Google Scholar 

  2. Bähler C, Huber C, Brüngger B, Reich O. Multimorbidity, Health Care Utilization and Costs in an Elderly Community-dwelling Population: A Claims Data Based Observational Study. BMC Health Serv Res. 2015;15:23.

    Article  PubMed  PubMed Central  Google Scholar 

  3. Groenewegen A, Rutten F, Mosterd A, Hoes A. Epidemiology of heart failure. Eur J Heart Fail. 2020;22.

  4. Savarese G, Lund L. Global Public Health Burden of Heart Failure. Card Fail Rev. 2017;03:7.

    Article  Google Scholar 

  5. McDonagh TA, Metra M, Adamo M, Gardner RS, Baumbach A, Böhm M, et al. 2021 ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure: Developed by the Task Force for the diagnosis and treatment of acute and chronic heart failure of the European Society of Cardiology (ESC) With the special contribution of the Heart Failure Association (HFA) of the ESC. Eur Heart J. 2021;42(36):3599–726.

    Article  CAS  PubMed  Google Scholar 

  6. Gulea C, Zakeri R, Quint J. Model-based comorbidity clusters in patients with heart failure: association with clinical outcomes and healthcare utilization. BMC Med. 2021;19.

  7. Ahmad T, Pencina M, Schulte P, O’Brien E, Whellan D, Piña I, et al. Clinical Implications of Chronic Heart Failure Phenotypes Defined by Cluster Analysis. J Am Coll Cardiol. 2014;64:1765–74.

    Article  PubMed  PubMed Central  Google Scholar 

  8. Ahmad T, Lund L, Rao P, Ghosh R, Warier P, Vaccaro B, et al. Machine Learning Methods Improve Prognostication, Identify Clinically Distinct Phenotypes, and Detect Heterogeneity in Response to Therapy in a Large Cohort of Heart Failure Patients. J Am Heart Assoc. 2018;7:e008081.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  9. Raherison C, Ouaalaya E, Bernady A, Casteigt J, Nocent-Eijnani C, Falque L, et al. Comorbidities and COPD severity in a clinic-based cohort. BMC Pulm Med. 2018;18.

  10. Shivade C, Raghavan P, Fosler-Lussier E, Embi P, Elhadad N, Johnson S, et al. A Review of Approaches to Identifying Patient Phenotype Cohorts Using Electronic Health Records. J Am Med Inform Assoc JAMIA. 2014;21.

  11. Foss AH, Markatou M, Ray B. Distance Metrics and Clustering Methods for Mixed-Type Data. Int Stat Rev. 2019;87(1):80–109.

    Article  Google Scholar 

  12. Gower J. A General Coefficient of Similarity and Some of Its Properties. Biometrics. 1971;27:857–71.

    Article  Google Scholar 

  13. Husson F, Josse J, Lê S. FactoMineR: An R Package for Multivariate Analysis. J Stat Softw. 2008;25.

  14. Cramer C, Porter M, Sayama H, Sheetz L, Uzzo S. Network Literacy: Essential Concepts and Core Ideas. 2015. NetSciEd. Available online at

  15. Hu L, Zhang J, Pan X, Yan H, You ZH. HiSCF: leveraging higher-order structures for clustering analysis in biological networks. Bioinformatics. 2021;37(4):542–50.

    Article  CAS  PubMed  Google Scholar 

  16. Hu L, Yang Y, Tang Z, He Y, Luo X. FCAN-MOPSO: An Improved Fuzzy-based Graph Clustering Algorithm for Complex Networks with Multi-objective Particle Swarm Optimization. IEEE Trans Fuzzy Syst. 2023;31(10):3470–84.

  17. Hidalgo C, Blumm N, Barabasi AL, Christakis N. A Dynamic Network Approach for the Study of Human. PLoS Comput Biol. 2009;5:e1000353.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  18. Clark T, Bradburn M, Love S, Altman D. Survival Analysis Part I: Basic Concepts and First Analyses. Br J Cancer. 2003;89:232–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  19. Bewick V, Cheek L, Ball J. Statistics review 12: Survival analysis. Crit Care (London, England). 2004;11(8):389–94.

    Article  Google Scholar 

  20. UK, NAA, Atherton JJ, Bauersachs J, UK AJC, Carerj S, Ceconi C, Coca A, UK PE, Erol Ç, Ezekowitz J, Fernández-Golfın C. 2016 ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure. Euro Heart J. 2016;37:2129–200.

  21. Waljee A, Mukherjee A, Singal A, Zhang Y, Warren J, Balis U, et al. Comparison of imputation methods for missing laboratory data in medicine. BMJ Open. 2013;3.

  22. Azur M, Stuart E, Frangakis C, Leaf P. Multiple Imputation by Chained Equations: What is it and how does it work? Int J Methods Psychiatr Res. 2011;20:40–9.

    Article  PubMed  PubMed Central  Google Scholar 

  23. Nielsen F. Hierarchical Clustering. Introduction to HPC with MPI for Data Science. Cham: Springer; 2016. pp. 195–211.

  24. Gordon AD. Classification. CRC Press; 1999.

  25. Lê S, Josse J, Husson F. FactoMineR: an R package for multivariate analysis. J Stat Softw. 2008;25:1–18.

    Article  Google Scholar 

  26. Rousseeuw P. Silhouettes: A Graphical Aid to the Interpretation and Validation of Cluster Analysis. Comput. Appl. Math. 20, 53-65. J Comput Appl Math. 1987;20:53–65.

  27. Caliński T, Harabasz J. A dendrite method for cluster analysis. Commun Stat. 1974;3(1):1–27.

    Article  Google Scholar 

  28. Davies DL, Bouldin DW. A Cluster Separation Measure. IEEE Trans Pattern Anal Mach Intell. 1979;PAMI–1(2):224–7.

    Article  Google Scholar 

  29. Shah S, Katz D, Selvaraj S, Burke M, Yancy C, Gheorghiade M, et al. Phenomapping for Novel Classification of Heart Failure With Preserved Ejection Fraction. Circulation. 2014;131.

  30. Gimeno-Miguel A, Gutiérrez AG, Poblador-Plou B, Coscollar-Santaliestra C, Pérez-Calvo JI, Divo MJ, et al. Multimorbidity patterns in patients with heart failure: an observational Spanish study based on electronic health records. BMJ Open. 2019;9(12):e033174.

    Article  PubMed  PubMed Central  Google Scholar 

  31. Peters AE, Tromp J, Shah SJ, Lam CSP, Lewis GD, Borlaug BA, et al. Phenomapping in heart failure with preserved ejection fraction: insights, limitations, and future directions. Cardiovasc Res. 2022;118(18):3403–15.

    Article  CAS  PubMed Central  Google Scholar 

  32. Arbelaitz O, Gurrutxaga I, Muguerza J, Pérez JM, Perona I. An extensive comparative study of cluster validity indices. Pattern Recog. 2013;46(1):243–56.

    Article  Google Scholar 

Download references


This work was supported by the IntelligentCare project LISBOA-01-0247-FEDER-045948, which is is co-financed by the ERDF/LISBOA2020. Also funded by FCT, under CMU-Portugal and its INESC-ID and LASIGE Research Units, ref. UIDB/50021/2020, ref. UIDB/00408/2020, and ref. UIDP/00408/2020.

Author information

Authors and Affiliations



CM, AST, MS, BN, NS and FL contributed to the conceptualization of the research project. Formal analysis was performed by CM, AST, MS, and MF. Funding acquisition was secured by NS, FL, and MS. Supervision was provided by AST and MS. JM and CAM performed data curation. CM was responsible for writing the original draft preparation. The manuscript was reviewed and edited by BN, MS, PS, AST, NS and FL. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Bernardo Neves.

Ethics declarations

Ethics approval and consent to participate

This study was conducted in accordance with the ethical principles outlined in the Declaration of Helsinki and was approved by the local Institutional Review Board from Hospital da Luz Lisboa (HLL). The ethical approval number by IRB is CES/03/2021/ME. No informed consent was collected because the project is secondary analysis of deidentified data, obtained through fully automated processes (no human interaction). Thus, the local IRB granted waivers for informed consent.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Martins, C., Neves, B., Teixeira, A.S. et al. Identifying subgroups in heart failure patients with multimorbidity by clustering and network analysis. BMC Med Inform Decis Mak 24, 95 (2024).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: