Description and validation of a Markov model of survival for individuals free of cardiovascular disease that uses Framingham risk factors
BMC Medical Informatics and Decision Making volume 4, Article number: 6 (2004)
Estimation of cardiovascular disease risk is increasingly used to inform decisions on interventions, such as the use of antihypertensives and statins, or to communicate the risks of smoking. Crude 10-year cardiovascular disease risks may not give a realistic view of the likely impact of an intervention over a lifetime and will underestimate the risks of smoking. A validated model of survival to act as a decision aid in the consultation may help to address these problems. This study describes the development of such a model for use with people free of cardiovascular disease and evaluates its accuracy against data from a United Kingdom cohort.
A Markov cycle tree evaluated using cohort simulation was developed utilizing Framingham estimates of cardiovascular risk, 1998 United Kingdom mortality data, the relative risk for smoking related non-cardiovascular disease risk and changes in systolic blood pressure and serum total cholesterol with age. The model's estimates of survival at 20 years for 1391 members of the Whickham survey cohort between the ages of 35 and 65 were compared with the observed survival at 20-year follow-up.
The model estimate for survival was 75% and the observed survival was 75.4%. The correlation between estimated and observed survival was 0.933 over 39 subgroups of the cohort stratified by estimated survival, 0.992 for the seven 5-year age bands from 35 to 64, 0.936 for the ten 10 mmHg systolic blood pressure bands between 100 mmHg and 200 mmHg, and 0.693 for the fifteen 0.5 mmol/l total cholesterol bands between 3.0 and 10.0 mmol/l. The model significantly underestimated mortality in those people with a systolic blood pressure greater than or equal to 180 mmHg (p = 0.006).
The average gain in life expectancy from the elimination of cardiovascular disease as a cause of death was 4.0 years for all the 35-year-old men in the sample (n = 24), and 1.8 years for all the 35-year-old women in the sample (n = 32).
This model accurately estimates 20-year survival in subjects from the Whickham cohort with a systolic blood pressure below 180 mmHg.
The evaluation of the risk of developing coronary heart disease (CHD) is increasingly used as a basis for making treatment decisions to prevent cardiovascular disease (CVD). Internationally, the most established measure is the Framingham risk for developing CHD over a 10-year period [2–4]. Such measures also have some value as a means of communicating risk to individuals in consultations. This facilitates patient participation in treatment decisions and can help inform advice about the risks of smoking. A common approach is to use a 15% 10-year risk of CHD as a threshold for using antihypertensive drugs in people with a systolic blood pressure between 140 and 160 mmHg [3, 4]. However, there are weaknesses to using simple absolute CHD risk without consideration of other factors. Cardiovascular risk increases with age and so the elderly more often cross this treatment threshold. Most other causes of death also increase with age, and life expectancy falls with age. Given that the benefits of treatment accrue over time, younger patients with a lower risk of CHD but a greater life expectancy might have more to gain from treatment. Also, when communicating the risks of smoking, the CHD risk underestimates the true risk because of the wide variety of other pathologies that smoking causes.
The 10-year Framingham CHD risk for an individual only gives a crude idea of the likely impact of treatment or smoking cessation on an individual's life as it does not take into account the impact of competing causes of death, in particular other significant causes of mortality related to smoking. Other Markov models have been developed to assess the impact of cardiovascular disease mortality or to evaluate the cost-effectiveness of treatment, but most do not take into account the relative risks of non-cardiovascular death for smokers compared to non-smokers [7–10], or model survival in populations rather than individuals. Grover et al developed a model based on the Lipid Research Clinics program which makes some adjustment for the relative risks of smoking. It has been validated against a number of intervention trials showing that its predictions of survival correlate highly with the observed survival. For the present study, a Markov cycle tree evaluated using cohort simulation was developed to estimate survival over a lifetime. The model uses the Framingham equations for calculating CVD risks. The model also takes into account non-cardiovascular competing causes of death and models the changes in CHD risk factors with age. A Markov cycle tree structure was used because of the complex variety of pathways between the starting 'well' state and the absorbing 'dead' state.
The model presented in this paper takes into account competing causes of death, changes in risk factors with age and the relative risks of smoking on non-CVD mortality. It can estimate the probability of survival (between 0.0 and 1.0) at annual increments from the start age to the age of 85. Before such a model could be used in clinical practice to inform treatment decisions, it is important that some measure of its predictive accuracy is obtained.
This study uses data from the Whickham study [14, 15] to assess the accuracy of the model at predicting survival in this cohort at twenty years. The original Whickham study was conducted between 1972 and 1974 in a mixed urban and rural area close to Newcastle upon Tyne. The cohort included 2779 adults aged over 18 years, randomly identified to generate a sample that closely matched the United Kingdom (UK) population in terms of age, gender and social class. The original data set included blood pressure (BP), electrocardiogram (ECG) and serum total cholesterol (TC), thus allowing its use with the Framingham equations. A 20-year follow-up study was also conducted, which collected further data on the incidence of thyroid disorders and on morbidity and mortality.
Absolute annual non-CVD risks of death were derived by linear interpolation from the UK National mortality statistics for 1998. The risks for those causes of death that were smoking related were adjusted for smokers and non-smokers using the relative risks from the 4-year follow-up of the US Cancer Society's 50-state study (CPS-II) quoted in the US Surgeon General's report of 1989 (Table 1).
These relative risks were used to adjust the death rates from smoking related causes of death using the formulae below:
RS = Y × M / (S × (Y − 1) + 1) (for smokers)
RN = M / (S × (Y − 1) + 1) (for non-smokers)
where:
RS = Absolute risk for smokers.
RN = Absolute risk for non-smokers.
M = Mortality rate per person.
Y = Relative risk for smokers.
S = Proportion of smokers.
The proportion of smokers at each age for each sex was taken from the Health Survey for England 1998. A similar formula was used by Pharoah in his life table modelling of intervention with statins.
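The adjustment can be checked with a short sketch. The function name and the input values below are illustrative assumptions, not figures from Table 1 or from the mortality statistics; the point is that the smoker-weighted average of RS and RN recovers the population rate M.

```python
def smoking_adjusted_risks(mortality_rate, relative_risk, smoker_proportion):
    """Split a population mortality rate M into absolute risks for
    non-smokers (RN) and smokers (RS), given the relative risk for
    smokers Y and the proportion of smokers S."""
    denominator = smoker_proportion * (relative_risk - 1.0) + 1.0
    risk_non_smoker = mortality_rate / denominator      # RN = M / (S(Y-1) + 1)
    risk_smoker = relative_risk * risk_non_smoker        # RS = Y * RN
    return risk_non_smoker, risk_smoker


# Made-up inputs for illustration only: M = 0.002, Y = 2.0, S = 0.25
rn, rs = smoking_adjusted_risks(0.002, 2.0, 0.25)

# Weighting the two risks by smoking prevalence gives back M.
recovered_m = 0.25 * rs + (1.0 - 0.25) * rn
```

Because RS is defined as Y times RN, the two formulae follow directly from M = S × RS + (1 − S) × RN.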
The model used the Framingham risks for CHD and stroke rather than the Framingham risk for CVD death, as CHD and stroke account for the majority of CVD deaths and are easier to map to the UK mortality statistics. The annual risks for the various CVD risks included in the model were calculated by taking one quarter of the 4-year Framingham risk. The 4-year Framingham risk was used as this is the shortest period calculable using the Framingham equation.
For the states 'Survived a myocardial infarction (MI)', 'Survived other CHD', 'Survived a stroke' and 'Survived other CVD' the risk of death is estimated by taking the Framingham risk for that individual and multiplying it by the corresponding relative risk in Table 2.
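As a minimal sketch of this step, the first function below takes one quarter of a 4-year risk, as described above. The second function is not part of the model: it is a constant-hazard conversion included only to show how little the two differ when the 4-year risk is small. Function names and the example risk are assumptions made for illustration.

```python
def annual_risk_quarter(four_year_risk):
    """Annual risk taken as one quarter of the 4-year risk (the approach in the text)."""
    return four_year_risk / 4.0


def annual_risk_constant_hazard(four_year_risk):
    """Comparison only: the annual risk that compounds to the 4-year risk
    if the hazard is assumed constant over the four years."""
    return 1.0 - (1.0 - four_year_risk) ** 0.25


# Hypothetical 4-year risk of 4%
quarter = annual_risk_quarter(0.04)
compounded = annual_risk_constant_hazard(0.04)
difference = compounded - quarter  # small and positive for small risks
```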
Six states are modelled: 'Alive and well', 'Survived an MI', 'Survived other CHD', 'Survived a stroke', 'Survived other CVD' and 'Dead'. The state 'Survived other CHD' would largely consist of those who develop angina without first suffering an MI. The state 'Survived other CVD' would consist of other CVD states such as intermittent claudication. Their relationships are shown in the state transition diagram in Figure 1. There is no transfer between the four states representing survival of a cardiovascular event and there is a modelling assumption that the risk of death is not increased by further cardiovascular events.
The basic method used by the model is outlined in the Markov cycle-tree in Figure 2. The activity diagram is shown in Figure 3. The time horizon of the model is to the age of 85, and the cycle length is one year. Through the lifetime of the individual, the model adjusts the BP and TC using the changes in mean BP and TC for each year of age derived from the 'Health Survey for England: cardiovascular disease in 1998'. It was implemented using the Microsoft Excel™ spreadsheet package.
The model estimates survival for an individual aged between 35 and 75. It starts at the current age of the individual, estimating mortality in each successive year, up to the time horizon of the age of 85 years, taking account of the risk factors in Table 3.
All cases between the ages of 35 and 65 within the Whickham data set were identified and the risk factors described in Table 3 were extracted. Absence of left ventricular hypertrophy (LVH) and an average high density lipoprotein (HDL) of 1.3 mmol/l for males and 1.6 mmol/l for females were assumed, as these are the approximate averages in the Health Survey for England: CVD in 1998.
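The cohort simulation over the six states can be sketched as follows. This is a simplified illustration, not the spreadsheet implementation: it holds every transition probability constant across cycles, whereas the model described here recalculates risks each year from age, BP, TC and the mortality tables. All names and probabilities are hypothetical.

```python
STATES = ["well", "mi", "other_chd", "stroke", "other_cvd", "dead"]


def simulate_cohort(start_age, p_event, p_death_well, p_death_after, horizon_age=85):
    """Return [(age, probability of being alive)] from start_age to horizon_age.

    p_event:       annual probability of moving from 'well' to each survivor state
    p_death_well:  annual probability of death from the 'well' state
    p_death_after: annual probability of death from each survivor state
    """
    dist = {state: 0.0 for state in STATES}
    dist["well"] = 1.0  # the whole cohort starts alive and well
    survival = [(start_age, 1.0)]
    for age in range(start_age, horizon_age):  # one-year cycles
        new = {state: 0.0 for state in STATES}
        new["dead"] = dist["dead"]  # 'dead' is absorbing
        # Transitions out of 'well': die, suffer an event, or stay well.
        new["dead"] += dist["well"] * p_death_well
        for state, p in p_event.items():
            new[state] += dist["well"] * p
        new["well"] = dist["well"] * (1.0 - p_death_well - sum(p_event.values()))
        # Survivor states: die or remain; no transfer between them.
        for state, p in p_death_after.items():
            new["dead"] += dist[state] * p
            new[state] += dist[state] * (1.0 - p)
        dist = new
        survival.append((age + 1, 1.0 - dist["dead"]))
    return survival


# Hypothetical constant annual probabilities for a 50-year-old
example_survival = simulate_cohort(
    start_age=50,
    p_event={"mi": 0.004, "other_chd": 0.006, "stroke": 0.002, "other_cvd": 0.003},
    p_death_well=0.008,
    p_death_after={"mi": 0.05, "other_chd": 0.03, "stroke": 0.06, "other_cvd": 0.03},
)
```

The survival probability at 20 years used in the validation would correspond to the entry for start age plus 20 in the returned list.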
Cases with missing data or any kind of heart disease or cerebrovascular disease at baseline were excluded. The model estimate of the probability of survival for each case was identified at 20 years. Once the survival probabilities had been generated the average actual survival was calculated by finding the proportion of the cohort still alive at 20 years. The average probability of survival was calculated by finding the mean of all the estimated 20-year survival probabilities generated by the model.
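The two summary figures can be computed as in the sketch below. The function name and the four-person example are assumptions for illustration and are not Whickham data.

```python
def compare_survival(estimated_probs, alive_at_20_years):
    """Return (mean estimated 20-year survival, observed proportion alive at 20 years)."""
    mean_estimated = sum(estimated_probs) / len(estimated_probs)
    observed = sum(1 for alive in alive_at_20_years if alive) / len(alive_at_20_years)
    return mean_estimated, observed


# Made-up example: four subjects, three of whom were alive at follow-up
mean_estimated, observed = compare_survival(
    [0.9, 0.8, 0.7, 0.6],
    [True, True, True, False],
)
```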
Further analysis was conducted by sorting and grouping subjects in the order of their rank for each factor in Table 4. The mean actual survival in each group of these subjects was plotted against the mean estimated survival probability at 20 years.
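A sketch of this grouped comparison follows. Two choices here are assumptions, since the text does not specify them: subjects are split into rank groups of roughly equal size, and agreement is summarised with a Pearson correlation coefficient. Field names and records are hypothetical.

```python
import math


def group_means(records, factor, n_groups):
    """Sort records by `factor`, split them into rank groups of roughly equal
    size, and return [(mean estimated survival, observed survival)] per group."""
    ordered = sorted(records, key=lambda r: r[factor])
    size = math.ceil(len(ordered) / n_groups)
    groups = []
    for start in range(0, len(ordered), size):
        group = ordered[start:start + size]
        mean_estimated = sum(r["estimated"] for r in group) / len(group)
        observed = sum(1 for r in group if r["alive"]) / len(group)
        groups.append((mean_estimated, observed))
    return groups


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up records, grouped by age into two rank groups
records = [
    {"age": 40, "estimated": 0.9, "alive": True},
    {"age": 45, "estimated": 0.8, "alive": True},
    {"age": 60, "estimated": 0.6, "alive": True},
    {"age": 62, "estimated": 0.4, "alive": False},
]
groups = group_means(records, "age", 2)
r = pearson([g[0] for g in groups], [g[1] for g in groups])
```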
The biological and measurement variability of BP and TC are significant. A sensitivity analysis for these two factors was conducted. For systolic BP, a total coefficient of variation (CVT) of 5.6% was taken, giving 95% confidence intervals (95% CI) of approximately +/- 11%. For TC, a CVT of 7.4% was used, giving approximate 95% CI of +/- 15%. A grid of 12 hypothetical subjects covering each combination of age (35, 50 and 65), gender and smoking status was drawn up, using a systolic BP of 135 mmHg for the TC sensitivity analysis and a TC of 5.5 mmol/l for the systolic BP sensitivity analysis.
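The approximate interval widths follow from multiplying the coefficient of variation by 1.96, assuming normally distributed variation (an assumption of this sketch): 1.96 × 5.6% is about 11% and 1.96 × 7.4% is about 14.5%. The code below applies this to the fixed values used in the grid and enumerates the 12 hypothetical subjects; all names are illustrative.

```python
from itertools import product

Z_95 = 1.96  # two-sided 95% point of the normal distribution (assumed)


def approx_95_range(value, cv_percent):
    """Approximate 95% range for a measurement with the given coefficient of variation."""
    half_width = value * Z_95 * cv_percent / 100.0
    return value - half_width, value + half_width


# Ranges around the fixed values used in the sensitivity analysis
bp_low, bp_high = approx_95_range(135.0, 5.6)   # systolic BP in mmHg
tc_low, tc_high = approx_95_range(5.5, 7.4)     # total cholesterol in mmol/l

# 3 ages x 2 genders x 2 smoking statuses = 12 hypothetical subjects
grid = list(product((35, 50, 65), ("male", "female"), (True, False)))
```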
A sub-group analysis was performed to examine the performance of the model in the sub-groups in Table 5.
The model was used to estimate the potential gains in life expectancy (PGLE) from the elimination of CVD as a cause of death. Half cycle correction was used. This was done firstly for a typical 35-year-old non-smoking man and a typical 35-year-old non-smoking woman, each with a systolic BP of 131 mmHg and a TC/HDL ratio of 4.08. Secondly, the model was used to estimate the PGLE of each 35-year-old in the Whickham sample to give the average PGLEs for these men and women at the age of 35 years.
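One common form of half-cycle correction averages the survival probability at the start and end of each cycle, which is the form assumed in the sketch below; the text does not state which form the spreadsheet used. The PGLE is then the difference between life expectancy with CVD eliminated and baseline life expectancy, both truncated at the time horizon. The survival curves shown are made up.

```python
def life_expectancy(survival):
    """Expected years lived up to the horizon, given the probability of being
    alive at the start of each yearly cycle (first entry 1.0, last entry at
    the horizon). Each cycle contributes the mean of its start and end values."""
    return sum((start + end) / 2.0 for start, end in zip(survival, survival[1:]))


def potential_gain(survival_baseline, survival_without_cvd):
    """PGLE: life expectancy with CVD eliminated minus baseline life expectancy."""
    return life_expectancy(survival_without_cvd) - life_expectancy(survival_baseline)


# Made-up two-cycle example
baseline = [1.0, 0.9, 0.8]
without_cvd = [1.0, 1.0, 1.0]
gain = potential_gain(baseline, without_cvd)
```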
Results and discussion
Of the 2779 people in the Whickham cohort, 1541 were between the ages of 35 and 65 inclusive. Of these, 8 had missing data and another 142 were excluded because of pre-existing CVD, leaving 1391 cases for analysis. The results of the subgroup analysis are shown in Table 5. The correlations between the model's estimated survival and actual survival are shown in Table 6. Graphs showing the plots of the model's estimates of survival probability and the average survival in the Whickham cohort for each of the groups analysed in Table 3 are shown in Figures 4, 5, 6, 7, 8 and 9.
There is a high level of agreement between the predictions of the model and the actual survival in the Whickham cohort. However, as can be seen from Table 5, the model statistically significantly underestimates mortality at twenty years in those people with a systolic BP of 180 mmHg or more, even though there were only 95 cases in that group (p = 0.006). There were no other significant differences between the two groups.
The correlations between model estimates of the probability of survival and actual survival in the specified groupings are given in Table 6. The results of the sensitivity analysis are given in Table 7.
The PGLE was 2.7 years for our typical 35-year-old man and 1.8 years for our typical 35-year-old woman. For the twenty-four 35-year-old men in the sample used in the validation, the average PGLE was 4.0 years. For the thirty-two 35-year-old women the PGLE was 1.8 years.
The survival estimated by this model and the observed survival in the Whickham study correlate highly. The model significantly overestimates survival in those with a systolic BP of 180 mmHg or more (p = 0.006).
For 50-year-old men the sensitivity analysis shows up to a 9.5% range in the 95% confidence intervals (CI) for the estimated survival at 20 years based on 3 BP readings and a 6.5% range based on a single measurement of the TC. Otherwise the 95% CI do not exceed +/- 2% of the mean. This would underline the need to take more than 3 BP readings and at least two serum TC measurements in middle-aged men.
This model is focused on individuals and is not a population simulation as is the CHD Policy Model. Consequently it can be used to give individualised risk information in real time. Grover et al used data from the Lipid Research Clinics cohort and included a multivariate model for death from 'other causes' in addition to CHD and stroke and so took account of competing causes of death. In addition to modelling competing risk of death, the model described here adjusts the risks of non-cardiovascular death by smoking status and also models the change in BP and cholesterol through a lifetime.
This system may be of value in the development of public health programs where different intervention and prevention strategies could be modelled giving reliable estimates of the impact on survival. The inclusion of an adjustment for the non-CVD risks of death from smoking is particularly important here, as these competing causes of death will have a greater impact in smokers than in non-smokers.
Tools such as these will naturally be attractive to insurers for actuarial assessment but their use may be controversial. The model may reduce uncertainty in survival in certain groups and reduce risk in setting the levels of premiums. However, this could be regarded as undermining the very principles on which insurance is based – uncertainty and the sharing of risk.
The weaknesses of the model and further development
Historical variation and the effect on conclusions of validity
The Framingham equations used here were developed from cohort data collected in the 1960s and 1970s. We know that since then there has been a fall in the incidence of CHD mortality and a change in the prevalence of smoking. Whilst it may be that the change in incidence is due to the change in the prevalence of risk factors included in the Framingham equation, we cannot be sure that there are not other extraneous factors that have varied over that time and may have affected the incidence of CHD. If so, the validity of the Framingham equation in modern populations may be undermined. For example, some other risk factor such as serum fibrinogen levels, homocysteine, soft water, chlamydia infection or an as yet undetected factor may have altered, which would distort the Framingham predictions. It is not just risk factors that are relevant here, but also protective factors such as a moderate intake of red wine or exercise. We have used the Whickham survey that collected its 20-year follow-up data in the early 1990s. This model will be used in individuals at the beginning of a period of prediction rather than at the end, so the best we can say is that this model was valid when making predictions of survival in a UK population 30 years ago when the Whickham data was first being collected.
This model makes a number of assumptions about non-smokers and smokers that do not quite fit the real world. These assumptions are that:
• Smokers remain smokers and do not quit.
• Non-smokers have never smoked at all.
If this model were to be used to assess the benefits of quitting smoking, it would be on the assumption that a quitter's risk falls instantly to the risk of non-smokers. These are clearly invalid assumptions. When evaluating the survival of smokers it is on the assumption that they will remain smokers all their life. This impacts upon our evaluation, because those individuals who were recorded as smokers at their baseline assessment at time 0 were assumed to remain smokers until death or their 20-year follow-up. Clearly many will have given up in the intervening period. The net results of this would be an overestimate of mortality in smokers. This is in keeping with our results that show a small, non-significant overestimate of mortality in smokers.
The risks of ex-smokers probably never fall to those of non-smokers, and the reduction in risk varies with the age at which the smoker quits. Quitting before middle age reaps greater benefit than quitting later in life. It would be feasible to use the different relative risk for ex-smokers given in the Surgeon General's report of 1989 to improve the model for ex-smokers. It would even be possible to indicate the impact of quitting at various times in the future.
This model can give an indication of life gained with intervention or elimination of CHD and stroke. For example, for our typical 35-year-old non-smoking man (systolic BP of 131 mmHg and a TC/HDL ratio of 4.08), eliminating CHD and stroke as a cause of death (relative risk reduction of 100%) would reduce his mortality to the age of 85 by about 12.1% (Figure 10). This amounts to an average PGLE of 2.71 years. The pale yellow coloured part of the graph in Figure 10 represents the life gained.
The Coronary Heart Disease Policy Model predicted the PGLE from eliminating CHD to be 3.1 years for men. This is in keeping with the predictions of our model. However, the Coronary Heart Disease Policy Model also predicted a PGLE of 3.3 years for women. This is actually higher than the quoted PGLE for men and nearly double the PGLE predicted by our model. CVD death rates are higher in men than women across all ages in the UK and so the model described here would seem more consistent with the observed epidemiological data.
If the forty people aged 55 in this study reduced their risk of CHD or stroke by 88%, then the PGLE would be 2.2 years. Wald et al in their paper on the Polypill estimated that a third of 55-year-olds would gain about 11 years free of CHD events. Their 'simple Markov model' did take account of CVD as well as of 'dying from another cause' but had markedly divergent results from this study. This may reflect the small sample size of 55-year-olds in this study (n = 40), or a failure of Wald's model to take into account all of the factors included in this model.
Mackenbach estimated that the PGLE from the elimination of CVD would be about 4.0 years from birth. This would seem to be in keeping with the results of our model as the proportion of deaths from CVD prior to the age of 35 is very small. On the whole it would seem that the predictions of this model are in keeping with the bulk of other model estimates of PGLE.
This model gives valid estimates of 20-year survival in Whickham cohort members between the ages of 35 and 65 who are free of CVD and have systolic BPs below 180 mmHg. It could form the basis of a decision aid in the primary prevention of CVD. It would be useful in the modelling of intervention and prevention strategies and could be a valuable tool for actuarial assessment.
Anderson KM, Wilson PW, Odell PM, Kannel WB: An updated coronary risk profile. A statement for health professionals. Circulation. 1991, 83: 356-362.
Wood D, Durrington P, Poulter N, McInnes G, Rees A, Wray R: Joint British recommendations on prevention of coronary heart disease in clinical practice. British Cardiac Society, British Hyperlipidaemia Association, British Hypertension Society, endorsed by the British Diabetic Association. Heart. 1998, 80 (Suppl 2): S1-29.
Department of Health: National Service Framework for Coronary Heart Disease: Main document. Department of Health. 2000
Baker S, Priest P, Jackson R: Using thresholds based on risk of cardiovascular disease to target treatment for hypertension: modelling events averted and number treated. BMJ. 2000, 320: 680-685. 10.1136/bmj.320.7236.680.
Montgomery AA, Fahey T, Ben Shlomo Y, Harding J: The influence of absolute cardiovascular risk, patient utilities, and costs on the decision to treat hypertension: a Markov decision analysis. J Hypertens. 2003, 21: 1753-1759. 10.1097/00004872-200309000-00026.
Tsevat J, Weinstein MC, Williams LW, Tosteson AN, Goldman L: Expected gains in life expectancy from various coronary heart disease risk factor modifications. Circulation. 1991, 83: 1194-1201.
Lai D, Hardy RJ: Potential gains in life expectancy or years of potential life lost: impact of competing risks of death. Int J Epidemiol. 1999, 28: 894-898. 10.1093/ije/28.5.894.
Lloyd-Jones DM, Larson MG, Beiser A, Levy D: Lifetime risk of developing coronary heart disease. Lancet. 1999, 353: 89-92. 10.1016/S0140-6736(98)10279-9.
Magni P, Quaglini S, Marchetti M, Barosi G: Deciding when to intervene: a Markov decision process approach. Int J Med Inf. 2000, 60: 237-253. 10.1016/S1386-5056(00)00099-X.
Pharoah PD, Hollingworth W: Cost effectiveness of lowering cholesterol concentration with statins in patients with and without pre-existing coronary heart disease: life table method applied to health authority population. BMJ. 1996, 312: 1443-1448.
Weinstein MC, Coxson PG, Williams LW, Pass TM, Stason WB, Goldman L: Forecasting coronary heart disease incidence, mortality, and cost: the Coronary Heart Disease Policy Model. Am J Public Health. 1987, 77: 1417-1426.
Grover SA, Paquet S, Levinton C, Coupal L, Zowall H: Estimating the benefits of modifying risk factors of cardiovascular disease: a comparison of primary vs secondary prevention. Arch Intern Med. 1998, 158: 655-662. 10.1001/archinte.158.6.655.
Sonnenberg FA, Beck JR: Markov models in medical decision making: a practical guide. Med Decis Making. 1993, 13: 322-338.
Vanderpump MP, Tunbridge WM, French JM, Appleton D, Bates D, Clark F, Grimley EJ, Rodgers H, Tunbridge F, Young ET: The development of ischemic heart disease in relation to autoimmune thyroid disease in a 20-year follow-up study of an English community. Thyroid. 1996, 6: 155-160.
Tunbridge WM, Evered DC, Hall R, Appleton D, Brewis M, Clark F, Evans JG, Young E, Bird T, Smith PA: Lipid profiles and cardiovascular disease in the Whickham area with particular reference to thyroid failure. Clin Endocrinol (Oxf). 1977, 7: 495-508.
Office for National Statistics: Mortality Statistics: General: Series DH1 no. 31. The Stationery Office. 1998
Surgeon General: 1989 Surgeon General Report: Reducing the Health Consequences of Smoking. US Dept Health and Human Services. 1989
Erens B, Primatesta P: Health Survey for England: Cardiovascular Disease. National Statistical Office. 1999
Moore CR, Krakoff LR, Phillips RA: Confirmation or exclusion of stage I hypertension by ambulatory blood pressure monitoring. Hypertension. 1997, 29: 1109-1113.
Wollin SD, Jones PJ: Alcohol, red wine and cardiovascular disease. J Nutr. 2001, 131: 1401-1404.
Manson JE, Greenland P, LaCroix AZ, Stefanick ML, Mouton CP, Oberman A, Perri MG, Sheps DS, Pettinger MB, Siscovick DS: Walking compared with vigorous exercise for the prevention of cardiovascular events in women. N Engl J Med. 2002, 347: 716-725. 10.1056/NEJMoa021067.
Taylor DH, Hasselblad V, Henley SJ, Thun MJ, Sloan FA: Benefits of smoking cessation for longevity. Am J Public Health. 2002, 92: 990-996.
Wald NJ, Law MR: A strategy to reduce cardiovascular disease by more than 80%. BMJ. 2003, 326: 1419. 10.1136/bmj.326.7404.1419.
Mackenbach JP, Kunst AE, Lautenbach H, Oei YB, Bijlsma F: Gains in life expectancy after elimination of major causes of death: revised estimates taking into account the effect of competing causes. J Epidemiol Community Health. 1999, 53: 32-37.
Hannerz H, Nielsen ML: Life expectancies among survivors of acute cerebrovascular disease. Stroke. 2001, 32: 1739-1744.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1472-6947/4/6/prepub
This study was conducted as the dissertation for the MSc in Health Informatics at the Centre for Health Informatics and Medical Education at University College London. Particular thanks go to Dr Paul Taylor, Director of the M.Sc. course and my PhD supervisor, and to Professor Patrick Vallance, Director of the Centre for Clinical Pharmacology, Therapeutics and Toxicology at UCL, who supervised the M.Sc. dissertation. The Eastern Regional Health Authority of the NHS supported my M.Sc. and this study with an Enterprise Award Scheme grant directed by Dr Jim Elliot. The principal author is the lead researcher in the Laindon Health Centre Primary Care Research Team, which is supported by a grant from the Essex Primary Care Research Network (EPCRN), formerly part of the East London and Essex Network of Researchers (ELENoR). Thanks are due to Dr Elizabeth Murray, my secondary supervisor, and Drs Montgomery, Grover and Martin for constructive criticism of the paper.
Dr Chris Martin: Thurrock Primary Care NHS Trust has purchased a license to use an implementation of this model in primary care consultations. Dr Martin may further exploit any intellectual property rights arising from this work.
Dr Mark Vanderpump: None declared.
Mrs Joyce French: None declared.
CM conceived and created the model. He performed the validation and was the principal author of the paper. MVP and JF collected, cleaned and analyzed the Whickham cohort data that were used for the model validation and commented on the final draft of the paper.
Cite this article
Martin, C., Vanderpump, M. & French, J. Description and validation of a Markov model of survival for individuals free of cardiovascular disease that uses Framingham risk factors. BMC Med Inform Decis Mak 4, 6 (2004) doi:10.1186/1472-6947-4-6