Coronary heart disease and mortality following a breast cancer diagnosis.

Background Coronary heart disease (CHD) is a leading cause of morbidity and mortality for breast cancer survivors, yet the joint effect of adverse cardiovascular health (CVH) and cardiotoxic cancer treatments on post-treatment CHD and death has not been quantified. Methods We applied statistical and machine learning approaches to evaluate the 10-year risk of these outcomes among 1934 women diagnosed with breast cancer during 2006 or 2007. Overall CVH scores were classified as poor, intermediate, or ideal based on 5 factors (smoking, body mass index, blood pressure, glucose/hemoglobin A1c, and cholesterol) taken from clinical data recorded within the 5 years prior to the breast cancer diagnosis. Receipt of potentially cardiotoxic breast cancer treatments was indicated if the patient received anthracyclines or hormone therapies. We modeled the outcomes of post-cancer diagnosis CHD and death, respectively. Results Both approaches indicated that the joint effect of poor CVH and receipt of cardiotoxic treatments on CHD (75.9%) and death (39.5%) was significantly higher than their independent effects [poor CVH (55.9%) and cardiotoxic treatments (43.6%) for CHD, and poor CVH (29.4%) and cardiotoxic treatments (35.8%) for death]. Conclusions Better CVH appears to be protective against the development of CHD even among women who had received potentially cardiotoxic treatments. This study determined the extent to which attainment of ideal CVH is important for CHD and mortality outcomes among women diagnosed with breast cancer.

found that a poorer ideal CVH score, comprising the aforementioned factors plus blood pressure, cholesterol, and glucose, was associated with a higher incidence of cardiovascular disease, cancer, and breast cancer specifically [20].
Our evaluation of California cancer registry data highlighted the possible role of shared risk factors in the development of both cancer and CHD, reporting that cancer survivors tend to have multiple CHD risk factors, and that survivorship care often does not address these risk factors [21,22]. Favorable levels of risk factors common to both CHD and cancer are associated with improved CHD and cancer survival [23]. Yet, in addition to the problem of shared risk factors, therapies used to treat breast cancer are linked with cardiovascular injury, thus increasing CHD susceptibility via the "multiple-hit" hypothesis [24][25][26][27][28][29][30][31][32][33]. Breast cancer therapies that are potentially cardiotoxic include chemotherapies, radiotherapy, hormonal treatments, and monoclonal antibodies [24].
To our knowledge, existing studies have not yet assessed the joint effect (interaction) of predisposing cardiovascular risk factors and cancer treatments among breast cancer survivors. Subpopulations, such as breast cancer survivors in poor CVH prior to their cancer diagnosis, may be particularly susceptible to the late effects of chemotherapy, radiation, and other cancer treatments. Thus, this analysis will build on our previous work in the WHI which assessed the relationship between CVH and incident CHD and cancer [20].
A better understanding of the synergistic associations between poor CVH and breast cancer treatments on CHD risk after breast cancer is warranted, as it has the potential to guide CHD and cancer treatment as well as post-treatment cancer-related follow-up care. Screening and treatment of poor CVH at the time of cancer diagnosis and treatment planning may improve morbidity and mortality from CHD among breast cancer survivors [4,21,34,35,36]. Existing literature indicates that left-sided radiation, in certain doses, has a synergistic effect with pre-existing cardiac risk factors on the risk of ischemic heart disease [17]. Our goal was to add to this literature by investigating the receipt of radiation alongside other types of cancer therapies on risk of CHD and mortality using novel statistical techniques [37].

Data source and study design
In this study, electronic health record (EHR) data were obtained from a large midwestern medical center. The patients (n = 1934) were all initially diagnosed with breast cancer during 2006 or 2007 and did not have preexisting CHD. We included follow-up data for 10 years following the initial diagnosis. Our goal was to investigate the association between CVH, potentially cardiotoxic cancer treatments, age, race, and the 10-year risk of post-treatment CHD [38] and death, respectively. We defined CHD according to 217 unique ICD-9/10 diagnosis codes and 14 unique procedure codes, and date of death was ascertained from EHR data, which were updated regularly with data from the National Death Index.
We utilized the following measures of CVH: smoking status, body mass index (BMI), blood pressure, glucose/hemoglobin A1c, and cholesterol [20]. These measures were introduced by the AHA and are shown in detail in Table 1. The most recent CVH data were ascertained within the 5 years prior to the diagnosis of breast cancer. We used these baseline data to assign a pre-treatment CVH score to each woman with a breast cancer diagnosis. A value of "ideal" corresponded with 2 points on that submetric; a value of "intermediate" with 1 point; and a value of "poor" with 0 points.
For smoking status, for which data were complete, we classified current smoking as 0 points and not current smoking as 1 point, as there were no data available to indicate whether a woman had never smoked or had quit for more than 1 year (representing the "ideal" category). However, CVH data were not complete for all women; therefore, we imputed a value of 2 points for all missing submetric values. We tested the robustness of this strategy by imputing a value of 1 or 0, respectively, for missing values. For all analyses, overall CVH was calculated as the sum of all points, divided by the total possible points [10], and multiplied by 100, and was defined as: 0-< 30% for poor; 30-< 80% for intermediate; and 80-100% for ideal.
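The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the submetric keys, the function names, and the choice of denominator (the sum of the per-submetric maxima, with smoking capped at 1 point) are assumptions made for the example, and the denominator can be overridden through the hypothetical `total_possible` parameter.

```python
from typing import Dict, Optional

# Maximum points per submetric as described in the text (smoking is scored 0/1).
# The key names are hypothetical labels chosen for this sketch.
MAX_POINTS = {
    "smoking": 1,
    "bmi": 2,
    "blood_pressure": 2,
    "glucose": 2,
    "cholesterol": 2,
}


def cvh_percent(points: Dict[str, Optional[int]],
                impute: int = 2,
                total_possible: Optional[int] = None) -> float:
    """Return the overall CVH score as a percentage of possible points.

    Missing submetrics (None or absent) are imputed with `impute` points;
    the text uses 2 for the main analysis and 1 or 0 as robustness checks.
    """
    total = 0
    for name, max_pts in MAX_POINTS.items():
        value = points.get(name)
        if value is None:
            value = min(impute, max_pts)  # never exceed the submetric maximum
        total += value
    # Assumption: the denominator defaults to the sum of submetric maxima.
    denom = total_possible if total_possible is not None else sum(MAX_POINTS.values())
    return 100.0 * total / denom


def cvh_category(percent: float) -> str:
    """Map a percentage to the poor / intermediate / ideal bands in the text."""
    if percent < 30:
        return "poor"
    if percent < 80:
        return "intermediate"
    return "ideal"


# Example: two submetrics missing, imputed under each strategy.
patient = {"smoking": 0, "bmi": 0, "blood_pressure": 1,
           "glucose": None, "cholesterol": None}
for impute_value in (2, 1, 0):
    pct = cvh_percent(patient, impute=impute_value)
    print(impute_value, round(pct, 1), cvh_category(pct))
```

Running the loop with each imputation value shows how sensitive a single patient's category can be to the handling of missing submetrics, which is the purpose of the robustness check described above.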
Of interest for these analyses were the following treatments, due to their potential adverse effects on the myocardium: chemotherapy, left-sided radiation, hormone-related or anti-estrogen pills, and Herceptin. We grouped the breast cancer treatments into eight categories according to which medicines were ordered: anthracyclines, hormone therapy, aromatase inhibitors, monoclonal antibodies, antimicrotubule agents, alkylating agents, antimetabolites, and other (e.g., Bortezomib). In our current analysis, we included the receipt (yes/no) of each type of treatment.
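As an illustration of the yes/no encoding, the sketch below turns a list of ordered medicines into eight binary treatment indicators. The drug-to-category lookup is a small hypothetical table assembled for this example (one well-known agent per class, plus Herceptin and Bortezomib from the text); it is not the mapping used in the study, and unrecognized drug names are simply ignored here by assumption.

```python
from typing import Dict, Iterable

# The eight treatment categories named in the text.
TREATMENT_CATEGORIES = [
    "anthracyclines",
    "hormone therapy",
    "aromatase inhibitors",
    "monoclonal antibodies",
    "antimicrotubule agents",
    "alkylating agents",
    "antimetabolites",
    "other",
]

# Hypothetical, deliberately tiny lookup table for illustration only.
DRUG_TO_CATEGORY = {
    "doxorubicin": "anthracyclines",
    "tamoxifen": "hormone therapy",
    "anastrozole": "aromatase inhibitors",
    "trastuzumab": "monoclonal antibodies",
    "herceptin": "monoclonal antibodies",  # brand name of trastuzumab
    "paclitaxel": "antimicrotubule agents",
    "cyclophosphamide": "alkylating agents",
    "fluorouracil": "antimetabolites",
    "bortezomib": "other",
}


def treatment_flags(ordered_drugs: Iterable[str]) -> Dict[str, int]:
    """Return a 0/1 receipt indicator for each of the eight categories."""
    flags = {category: 0 for category in TREATMENT_CATEGORIES}
    for drug in ordered_drugs:
        category = DRUG_TO_CATEGORY.get(drug.strip().lower())
        if category is not None:  # assumption: unmapped names are skipped
            flags[category] = 1
    return flags


flags = treatment_flags(["Doxorubicin", "Herceptin", "Cyclophosphamide"])
print(flags)
```

Encoding each category as a separate 0/1 feature keeps the representation compatible with all three classifiers without assuming any ordering among treatments.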

Statistical analysis
We classified age at breast cancer diagnosis into three groups: 20-40, 41-60, and 61-100 years. Race/ethnicity was defined as black, non-black, or unknown. After we quantified and categorized the features of CVH, cancer treatments, age, and race, we applied both traditional statistical methods and machine learning algorithms [i.e., support vector machines (SVM) [39], decision tree [40], and logistic regression [41]] to investigate the associations of age, race, CVH, cancer treatments, and the interaction between CVH and cancer treatments with CHD and all-cause mortality, respectively. We also conducted Welch's t-test and produced boxplots to evaluate the differences between the independent and joint effects of CVH and cancer treatments.
In the machine learning models, we used CVH, treatment, and the interaction of CVH and treatment as features, and applied linear SVM, decision tree, and logistic regression models, respectively, to predict whether a woman had incident CHD or died during the 10 years of follow-up. For the death prediction, we randomly selected a similar number of patients who had died (n = 468) to compare to a sample of patients who had not died (n = 374) due to the imbalance of our data according to mortality. We used all patient observations in the CHD risk prediction models. We tested the 5 CVH submetrics and 8 treatment categories as input features for the classification models. The dataset was randomly split into training (80%) and test (20%) sets; the models were trained on the former and evaluated on the latter. Accuracy and area under the receiver operating characteristic curve (AUC) were calculated to evaluate model performance. Analyses were conducted using the scikit-learn, SciPy, and Matplotlib libraries with Python, version 3.6.5 (2018).
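A minimal sketch of the modeling workflow described above, using scikit-learn, might look as follows. Everything about the data is synthetic and invented for the example (it is not the study data); the form of the interaction feature, the tree depth, the stratified split, and the random seeds are likewise assumptions, since the text does not specify them.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 1000

# Synthetic stand-in features: 5 CVH submetrics (0-2) and 8 treatment flags (0/1).
cvh = rng.integers(0, 3, size=(n, 5)).astype(float)
trt = rng.integers(0, 2, size=(n, 8)).astype(float)

# Hypothetical binary summaries used to build an interaction term and an outcome.
poor_cvh = (cvh.sum(axis=1) < 5).astype(float)
any_trt = (trt[:, :2].sum(axis=1) > 0).astype(float)
interaction = (poor_cvh * any_trt)[:, None]
logit = -1.5 + 1.2 * poor_cvh + 0.8 * any_trt + 0.5 * poor_cvh * any_trt
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

feature_sets = {
    "CVH": cvh,
    "treatment": trt,
    "CVH + treatment": np.hstack([cvh, trt, interaction]),
}
models = {
    "linear SVM": lambda: SVC(kernel="linear", random_state=0),
    "decision tree": lambda: DecisionTreeClassifier(max_depth=4, random_state=0),
    "logistic regression": lambda: LogisticRegression(max_iter=1000),
}

results = {}
for fname, X in feature_sets.items():
    # 80% training / 20% test split, as in the text.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=0, stratify=y)
    for mname, make_model in models.items():
        clf = make_model().fit(X_tr, y_tr)
        # AUC needs a continuous score: use the margin when available,
        # otherwise the predicted probability of the positive class.
        if hasattr(clf, "decision_function"):
            scores = clf.decision_function(X_te)
        else:
            scores = clf.predict_proba(X_te)[:, 1]
        results[(fname, mname)] = (
            accuracy_score(y_te, clf.predict(X_te)),
            roc_auc_score(y_te, scores),
        )

for (fname, mname), (acc, auc) in results.items():
    print(f"{fname:16s} {mname:20s} accuracy={acc:.2f} AUC={auc:.2f}")
```

Fitting the same three classifiers on each feature set makes the comparison of CVH alone, treatment alone, and the combined features direct, because only the inputs change between runs.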

Results
The average age of the population was 58.5 years, and the majority of women (73%) were non-black (Table 2). Approximately 20% of women were currently taking a cholesterol medication, and few women (3%) were current smokers (Table 2). A total of 341 patients received any class of cardiotoxic cancer treatment. Among these 341 patients, 46% of women received aromatase inhibitors and 26% of women received hormone therapy. During the 10-year follow-up period, one-third of the population developed CHD and 19% died. Figure 1 shows the counts of women with each outcome of interest and the proportion of women represented in each stratum. The occurrence of CHD was lowest among the youngest women (20-40 years), and the prevalence of CHD steadily increased across older age groups (Fig. 1a). Black women experienced a higher occurrence (48%) of CHD compared to the other race groups (31%) (Fig. 1b). Rates of CHD were lower among women with an ideal CVH score (24%) as compared to those with CVH at non-ideal levels (61.9%) (Fig. 1c). Receipt of potentially cardiotoxic breast cancer treatments was associated with an increased occurrence of post-treatment CHD. Rates of incident CHD were higher among women who received any cancer treatment (58.9%) compared to the women who did not receive any cancer treatments (29.1%) (Fig. 1d). We observed similar trends for the outcome of mortality (Fig. 1e-h).
In particular, Fig. 1f shows that 29% of the black women died compared to 15.9% of the non-black women, and Fig. 1h shows that a higher percentage of patients died in the group that received treatment. Women who died during the 10-year follow-up tended to be older, to be of black race, to have received cancer treatments, and to have non-ideal CVH.
In Fig. 2, we show the independent and joint effects of receipt of cardiotoxic breast cancer treatments and poorer CVH. Women in poor (non-ideal) CVH who were also exposed to cardiotoxic treatments had a synergistically higher occurrence of post-treatment CHD (75.9%) compared to women not exposed to cardiotoxic treatments who were in good CVH (20.8%) (Fig. 2a). Women in poor CVH who were not exposed to cardiotoxic treatments, as well as women in good CVH who were exposed to cardiotoxic treatments, had an elevated occurrence of post-treatment CHD (55.9 and 43.6%, respectively), but did not experience a rate as high as those who were doubly exposed. Similar trends were observed for the outcome of mortality (Fig. 2b). In addition, the independent effect of treatment was larger on CHD (43.6%) than on death (35.8%).
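The stratified comparison above amounts to computing the outcome occurrence within each of the four combinations of CVH status and treatment exposure. A minimal sketch using only the standard library is given below; the record layout `(poor_cvh, treated, event)` and the toy records are hypothetical and chosen purely to make the arithmetic easy to follow.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def occurrence_by_stratum(
    records: Iterable[Tuple[int, int, int]]
) -> Dict[Tuple[bool, bool], float]:
    """Percent of patients with the outcome in each (poor CVH, treated) stratum."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [events, patients]
    for poor_cvh, treated, event in records:
        key = (bool(poor_cvh), bool(treated))
        counts[key][0] += int(event)
        counts[key][1] += 1
    return {key: 100.0 * events / total
            for key, (events, total) in counts.items()}


# Toy records (poor_cvh, treated, event); these are NOT study data.
toy_records = [
    (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 1, 1),  # doubly exposed
    (1, 0, 1), (1, 0, 0),                        # poor CVH only
    (0, 1, 1), (0, 1, 0),                        # treatment only
    (0, 0, 0), (0, 0, 0), (0, 0, 1), (0, 0, 0),  # neither exposure
]
rates = occurrence_by_stratum(toy_records)
for (poor_cvh, treated), pct in sorted(rates.items()):
    print(f"poor CVH={poor_cvh!s:5} treated={treated!s:5} occurrence={pct:.1f}%")
```

Keeping the four strata separate, rather than pooling by a single exposure, is what allows the doubly exposed group to be compared with each singly exposed group.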
The boxplots in Fig. 2c also indicate a significant difference between CHD rates among women who were doubly exposed and women who were independently exposed.
We obtained similar results using machine learning models. Table 3/4 (in the supplementary material) lists the performance results for death/CHD prediction by the three models. The accuracy for predicting death by SVM was 69% for models containing CVH (68% by decision tree, 69% by logistic regression), 63% for models containing cancer treatment (69% by decision tree, 66% by logistic regression), and 70% for models containing both CVH and treatment (72% by decision tree, 72% by logistic regression). The metrics of precision, recall, and F1-score showed a similar trend. The same trends held for the CHD prediction (Table 4).
The first column in Fig. 3 shows the AUC plots for CHD prediction while the second column shows the prediction characteristics for the outcome of death by SVM, decision tree, and logistic regression classifiers under three different conditions of features: CVH, cancer treatments, and combined CVH and cancer treatments. The average AUC of the three machine learning models was 0.65 for CVH, 0.60 for cancer treatment, and 0.73 for both CVH and cancer treatments. We obtained similar results for the mortality analyses. We also performed 10-fold cross validation for each model and the results were similar (data not shown). The results from Table 3 and Fig. 3 indicate that all three models achieved higher accuracy with the inclusion of joint effects as compared to only individual effects. Specifically, models which include both CVH and receipt of treatment data provide additional information and improve the prediction of CHD and death. Patients with poor overall CVH who received cancer treatments had the highest risk of CHD and death.
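For the 10-fold cross-validation mentioned above, a hedged sketch with scikit-learn is shown below. The data come from `make_classification` and are purely synthetic; the use of stratified, shuffled folds, AUC as the scoring metric, and the tree depth are assumptions, as the text does not report these details.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a 14-feature design matrix (NOT the study data).
X, y = make_classification(n_samples=500, n_features=14, n_informative=6,
                           random_state=0)

# Assumption: stratified, shuffled 10-fold splits with a fixed seed.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

classifiers = {
    "linear SVM": SVC(kernel="linear", random_state=0),
    "decision tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
}

# One AUC value per fold for each classifier.
cv_auc = {
    name: cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
    for name, clf in classifiers.items()
}

for name, scores in cv_auc.items():
    print(f"{name}: mean AUC = {scores.mean():.2f} (SD {scores.std():.2f})")
```

Reporting the fold-to-fold spread alongside the mean gives a sense of how much a single 80/20 split could differ from the cross-validated estimate.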

Discussion
In this study, we utilized data from the EHR to identify women who were diagnosed with breast cancer in order to examine the independent and joint effects of CVH and cancer treatments on the 10-year risk of post-treatment CHD or death. Our results indicated that women with ideal CVH scores and women who did not receive potentially cardiotoxic cancer treatments had the lowest risk of post-treatment CHD or death, while the joint effects of poor CVH and exposure to cancer treatments significantly increased the risk of post-treatment CHD or death. Additional factors that were associated with a higher prevalence of CHD and death included older age and black race.
Our results were consistent with previous conclusions that minority and older adults were more likely to have poorer CVH and ideal CVH was inversely associated with cancers and cardiovascular disease [20]. Consistent with biologic plausibility, our results indicated a higher risk of post-treatment CHD among those who received breast cancer treatments such as ionizing radiation to the heart [27].
The innovation in this study was to investigate the joint effects of CVH and potentially cardiotoxic breast cancer treatments using both statistical methods and multiple machine learning approaches. The results from all these methods were consistent, indicating the robustness of our methods and results. Our next step is to investigate which individual treatments (e.g., anthracyclines, hormone therapy) and individual CVH submetrics (e.g., BMI, blood pressure) are the most important variables for predicting CHD and death; these questions are beyond the scope of this paper. We also plan to replicate these analyses in distinct cancer types. Another next step is to deploy and evaluate clinical decision support in the cancer survivorship setting for managing cardiovascular late effects among cancer survivors. Our clinical decision support system (CDSS) presents CVH and cancer treatment data separately in the EHR-embedded data visualization. Our goal is to one day integrate a validated cardiovascular risk algorithm into our existing CDSS to better target cardiovascular disease prevention and management efforts in cancer survivorship.

Limitations
We encountered many limitations in using EHR data for these analyses. First, there were many missing data for CVH, likely because these women were not being seen for preventive care but rather for cancer care and treatment. Second, we acknowledge that we may be missing CHD and mortality outcome data for women who obtained cancer care and treatment at our medical center but then returned home and sought care outside of our medical center. Third, we used CVH measurements taken up to 5 years prior to the cancer diagnosis, and the time frame varied for each woman. Fourth, physical activity and diet data are not commonly recorded in the EHR as structured, actionable data elements. If physical activity and diet data do exist in the EHR, they are usually recorded as clinical notes using free text. Importantly, these data are not easily translated into the American Heart Association's metric definitions and thus are not actionable at the point of care or easily incorporated into risk scoring algorithms. Similarly, data on diagnosis and treatment have not always been stored as structured data elements in the EHR. Conducting this analysis required mining data from legacy EHR systems, and for pragmatic reasons we accessed only structured data elements, which resulted in incomplete data ascertainment. Finally, we are missing data on radiotherapy. In future work, we will explore recovering the missing radiotherapy data from multiple data sources.

Conclusions
An ideal CVH score predicted a lower risk of post-treatment CHD or death. Receipt of cardiotoxic breast cancer treatments was associated with increased post-treatment CHD or death, and there was a synergistic effect of CVH such that better CVH seemed to be protective against the development of CHD even among women who had received potentially cardiotoxic treatments. This study determined the extent to which ideal CVH is important to attain and maintain for more favorable outcomes following a breast cancer diagnosis.
Fig. 3 The first column (a-c) represents CHD prediction, and the second column (d-f) shows results of mortality prediction. a and d show the AUC in ROC by SVM models, b and e show the AUC in ROC by decision tree models, and c and f show the AUC in ROC by logistic regression models. The three curves in each plot represent the individual and joint effects of CVH and potentially cardiotoxic treatments.
Additional file 1 Table S1 Performance results of the three models for prediction of the outcome of mortality. Table S2 Performance results of the three models for prediction of the outcome of CHD.
Abbreviations CHD: Coronary heart disease; CVH: Cardiovascular health; AHA: American Heart Association; EHR: Electronic health record; BMI: Body mass index; AUC: Area under the receiver operating characteristic curve; SVM: Support vector machines