Concordance between decision analysis and matching systematic review of randomized controlled trials in assessment of treatment comparisons: a systematic review
BMC Medical Informatics and Decision Making volume 14, Article number: 57 (2014)
Systematic review (SR) of randomized controlled trials (RCT) is the gold standard for informing treatment choice. Decision analyses (DA) also play an important role in informing health care decisions. It is unknown how often the results of DA and matching SR of RCTs are in concordance. We assessed whether the results of DA are in concordance with SR of RCTs matched on patient population, intervention, control, and outcomes.
We searched PubMed up to 2008 for DAs comparing at least two interventions followed by matching SRs of RCTs. Data were extracted on patient population, intervention, control, and outcomes from DAs and matching SRs of RCTs. Data extraction from DAs was done by one reviewer and from SR of RCTs by two independent reviewers.
We identified 28 DAs representing 37 comparisons for which we found matching SR of RCTs. Results of the DAs and SRs of RCTs were in concordance in 73% (27/37) of cases. The sensitivity analyses conducted in either DA or SR of RCTs did not impact the concordance. Use of a single data source (4/37) versus multiple data sources (33/37) in the design of the DA model was statistically significantly associated with concordance between DA and SR of RCTs.
Our findings illustrate the high concordance of current DA models compared with SR of RCTs. It has been shown previously that there is 50% concordance between DAs and matching single RCTs. Our study, showing a concordance of 73% between DAs and matching SRs of RCTs, underlines the importance of the totality of evidence (i.e. SR of RCTs) in the design of DA models and in medical decision-making in general.
Medical decision-making requires a comprehensive analysis of benefits and harms associated with available treatment options. Randomized controlled trials (RCTs), and in turn systematic reviews (SRs) of RCTs, are considered the reference standard in resolving treatment uncertainties [1, 2]. However, in many instances RCTs, and in turn SRs of RCTs, lack the power and the long duration of follow-up needed to assess long-term outcome estimates. Decision analysis (DA) can be used to provide the required estimates to inform medical decision-making.
Decision analysis models have been used in medical decision-making since 1972 [5–7]. For a DA model to be useful and applicable, it should reflect the real problems of patients, and data on clinical outcome probabilities should be generated using a systematic approach. While DAs can allow users to make informed decisions when confronted with difficult clinical scenarios, their oversimplification of real-world scenarios can be problematic. DA models are often based on data derived from empirical studies with short-term follow-up, and the biases from these studies influence modeled outcomes. Although there are guidelines for assessing the usefulness of DAs and their role in medical decision-making [9, 10], very few studies have assessed the soundness of DAs. That is, how often DA results agree with findings of matching RCTs or SRs of RCTs (published after the DA) has not been comprehensively investigated. We are aware of two studies that have assessed the soundness of DA using subsequent clinical study results [11, 12]. The study by Bress et al., conducted in the field of infectious diseases, determined that findings of DAs were in concordance with findings of clinical studies (including RCTs and observational studies) in 75% of the cases assessed. We have previously shown that findings of DAs and matching RCTs are in concordance only 50% of the time. However, it is not known how often findings of DAs correspond with matching SRs of RCTs. Accordingly, the objective of this study is to assess how often findings of DAs are in concordance with matching SRs of RCTs.
Decision analyses comparing two or more treatments were eligible for inclusion in this study. Systematic reviews of RCTs published after the matching DA models were included. That is, SRs of RCTs published before the matching DA were not included in this study. If a matching SR for an included DA model was not found, the DA was excluded from the study.
We searched PubMed and the Cochrane Library to identify DAs and matching SRs of RCTs.
Search strategy: decision analysis papers
“Decision Support Techniques” was introduced as a Medical Subject Heading (MeSH) term in 2000. Hence, we searched PubMed (Medline) for DAs published from 01/2000 to 12/2008 using the following search strategy: (“Decision Support Techniques”[Mesh] OR (“Decision Making”[Mesh]) OR (“Decision Analysis”) AND (“Therapeutics”[Mesh] OR “therapy ”[Subheading] OR “Treatment Outcome” [Mesh] OR “Therapies, Investigational”[Mesh])).
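As an illustration only, the search string can be assembled from its two groups of terms. The sketch below assumes one plausible reading of the parenthesization (study-design terms OR-ed together, therapy terms OR-ed together, and the two groups AND-ed); the helper name `or_group` is hypothetical, and the date restriction is not encoded.

```python
def or_group(terms):
    """Join search terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


# Terms copied from the search strategy in the text.
design_terms = [
    '"Decision Support Techniques"[Mesh]',
    '"Decision Making"[Mesh]',
    '"Decision Analysis"',
]
therapy_terms = [
    '"Therapeutics"[Mesh]',
    '"therapy"[Subheading]',
    '"Treatment Outcome"[Mesh]',
    '"Therapies, Investigational"[Mesh]',
]

# Assumption: each group is OR-ed internally, then the two groups are AND-ed.
query = or_group(design_terms) + " AND " + or_group(therapy_terms)
print(query)
```

Keeping the terms in lists makes the grouping explicit, which is where a hand-typed boolean query is most likely to go wrong.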
Search strategy: systematic reviews of RCTs
We searched PubMed and the Cochrane Library for systematic reviews (SRs) of RCTs that matched the identified DAs based on patient population (P), intervention (I), control (C) and outcome (O) (PICO) criteria. Clinical Queries search strategies in PubMed, which have been updated based on the filter developed by Haynes et al., were also used to search for SRs of RCTs matching the DAs. Systematic reviews of RCTs published after the matching DA models were included. Keywords from DA intervention and control arms were used and, if necessary, search returns were narrowed by using keywords from the DA patient population.
Abstracts of all the identified studies were reviewed by one reviewer (RM) for inclusion according to the pre-determined criteria. In addition, 2 reviewers (BD and AK) randomly selected and reviewed 15% of the citations for inclusion to assess for accuracy. Another set of reviewers (HG and HW) reviewed the list of all citations to identify matching SRs of RCTs for the included DAs. The list of matched SRs of RCTs and DAs was further confirmed independently by 2 reviewers (RM and AK). Any disagreements in the selection process of DAs and matching SRs of RCTs were resolved by consensus.
Data were extracted from each included DA and SR of RCTs using a standardized data extraction form. Data were extracted on PICO elements from all DAs and matching SRs of RCTs. From each DA we also extracted data on whether single versus multiple data sources were used in the design of the DA model. From each included SR of RCTs, data were also extracted on the number of RCTs included, sample size and year of publication. Data abstraction from included DAs was done by one reviewer (RM) and from SRs of RCTs by two reviewers (HG and HW). Senior reviewers (BD and AK) randomly selected and reviewed 15% of the extracted data from included studies to assess for accuracy.
Matching of DA and SR of RCTs
Abstracts of identified SRs of RCTs were reviewed by two reviewers (HW and HG) independently to determine the degree of matching based on PICO elements. Each individual PICO element was matched at one of 3 levels, classified as optimum, broad or broadest match. The match was first made at the participant/patient population level, followed by intervention(s), control and outcome(s). If the initial match at the participant/disease level was not achieved, the DA was excluded from the review. A PICO element of an SR of RCTs was considered an optimum match to a DA if it involved the same PICO element, a broad match if it involved a similar PICO element, and a broadest match if it involved only a slightly similar PICO element.
Examples of optimum, broad and broadest match are shown in Table 1. In situations where multiple matches were found, the most recently published SR of RCTs was chosen.
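The selection rules above can be sketched as a small data structure. This is a minimal illustration, not the authors' procedure: the class and function names are hypothetical, the candidate reviews are invented, and publication order is approximated by comparing publication years only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class MatchLevel(Enum):
    OPTIMUM = "optimum"    # same PICO element
    BROAD = "broad"        # similar PICO element
    BROADEST = "broadest"  # only slightly similar PICO element


@dataclass(frozen=True)
class CandidateReview:
    label: str                        # hypothetical identifier
    year: int                         # publication year of the SR
    population: Optional[MatchLevel]  # None means no population match


def select_matching_review(
    da_year: int, candidates: List[CandidateReview]
) -> Optional[CandidateReview]:
    """Return the most recent SR published after the DA, or None if the DA is excluded."""
    eligible = [
        c for c in candidates
        if c.population is not None  # a population match is required first
        and c.year > da_year         # assumption: "after" is judged by year alone
    ]
    if not eligible:
        return None
    # With several matches, the most recently published SR is chosen.
    return max(eligible, key=lambda c: c.year)


# Invented example candidates for a DA published in 2001.
candidates = [
    CandidateReview("SR-A", 2003, MatchLevel.OPTIMUM),
    CandidateReview("SR-B", 2009, MatchLevel.BROAD),
    CandidateReview("SR-C", 1998, MatchLevel.OPTIMUM),  # predates the DA
    CandidateReview("SR-D", 2010, None),                # no population match
]
chosen = select_matching_review(2001, candidates)
print(chosen.label if chosen else "DA excluded")
```

Note that this rule favors recency over degree of match, as the text describes: a broad match from 2009 is chosen over an optimum match from 2003.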
Concordance and impact of sensitivity analysis
It is well established that the majority of DAs conduct and report sensitivity analyses, and the final outcomes of the DA model may be influenced by these sensitivity analyses. Hence, to assess the impact of the DA outcomes after these sensitivity analyses on concordance or discordance with the findings of matching SRs of RCTs, we extracted data on the sensitivity analyses. Specifically, for each included DA and SR of RCTs, two review authors (HW and HG) independently extracted data on the authors’ overall conclusion (indicating which treatment was better), whether the authors’ conclusion changed after sensitivity analysis, and whether the conclusion(s) of the DA and the SR of RCTs agreed or disagreed. Discrepancies between the two reviewers’ judgments were resolved by discussion and mutual consensus with another reviewer (RM).
We used descriptive statistics to report concordance or discordance between results of DAs and SRs of RCTs. The impact of variables used in the construction of the DA model on concordance between DAs and matching SRs of RCTs was assessed by Fisher’s exact test. The impact of the sample size of SRs of RCTs on concordance of findings between DAs and SRs of RCTs was tested using the Kruskal–Wallis one-way analysis of variance test.
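To make the sample-size comparison concrete, the following is a minimal pure-Python sketch of the Kruskal–Wallis test restricted to two groups (concordant versus discordant SRs). The sample sizes in the usage lines are invented for illustration and are not the study data. With two groups the statistic has one degree of freedom, so the chi-square tail probability can be written with `math.erfc`.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_two_groups(group_a, group_b):
    """Return (H, p) for two independent groups, with tie correction."""
    pooled = list(group_a) + list(group_b)
    n = len(pooled)
    ranks = average_ranks(pooled)
    sum_a = sum(ranks[:len(group_a)])
    sum_b = sum(ranks[len(group_a):])
    h = 12 / (n * (n + 1)) * (
        sum_a ** 2 / len(group_a) + sum_b ** 2 / len(group_b)
    ) - 3 * (n + 1)
    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:  # every value is tied, so the groups cannot differ
        return 0.0, 1.0
    h = max(h / correction, 0.0)  # guard against tiny negative rounding error
    # Chi-square survival function with 1 degree of freedom.
    return h, erfc(sqrt(h / 2))


# Hypothetical SR sample sizes, invented for illustration only.
concordant_sizes = [42, 310, 980, 2610, 5400, 32523]
discordant_sizes = [150, 1200, 2900, 8700]
h_stat, p_value = kruskal_wallis_two_groups(concordant_sizes, discordant_sizes)
print(f"H = {h_stat:.3f}, p = {p_value:.3f}")
```

A rank-based test suits these data because SR sample sizes are heavily skewed (the reported range is 42 to 32523), so a comparison of means would be dominated by the largest reviews.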
The PubMed search for DAs yielded 42,704 citations (Figure 1). We excluded 42,617 studies after reviewing abstracts and found 87 studies that used DA modeling to compare two or more interventions. We found matching SR of RCTs for 32% (28/87) of DAs. These 28 DAs included 37 comparisons for which we found a matching SR of RCTs.
Characteristics of included DAs and SRs
Infection (11/37) and cancer (10/37) were the most frequently studied diseases using DA modeling. The included DAs investigated effects of medications in 57% (21/37) of cases compared with surgical interventions in 38% (14/37) of cases (Table 2). Ninety-five percent (35/37) of DAs did not collect any primary data and used data published in the literature in designing the DA model. Only 5% (2/37) of DA models used a systematic approach (e.g. meta-analysis) to data collection. Similarly, 5% (2/37) of DA models used expert opinion in designing the DA model. Ninety-seven percent (36/37) of DAs conducted sensitivity analysis while 51% (19/37) of included SRs of RCTs conducted sensitivity or sub-group analysis. The median sample size of included SRs of RCTs was 2610 (range: 42 to 32523).
Matching between DA and SR of RCTs
As summarized in Table 3, the match between DA participant characteristics and those of the SR of RCTs was considered optimum in 57% (21/37), broad in 22% (8/37) and broadest in 22% (8/37) of cases. The match for interventions studied in DAs with interventions in the SR of RCTs was optimum in 95% (35/37) of cases, and broad and broadest in 1/37 cases each. Similarly, the matching of controls in the DA with the controls used in the SR of RCTs was optimum in 92% (34/37), broad in 5% (2/37) and broadest in 3% (1/37) of cases (Table 3).
Concordance between findings of DA and SR of RCTs
Overall, the findings of the DAs and the SRs of RCTs were in concordance in 73% (27/37) of cases. Twenty-seven percent (10/37) of the SR of RCTs findings were discordant with the findings of the DA (Figure 2).
Of the 21 pairs of DA and SR with an optimum match of patient characteristics, 67% (14/21) of the DA findings were in concordance with the findings of the matching SR of RCTs. Of the 8 pairs with a broad match of patient characteristics, 88% (7/8) of the DA findings were in concordance with the findings of the matching SR of RCTs. Of the 8 pairs with the broadest match of patient characteristics, 75% (6/8) of the DA findings were in concordance with the findings of the matching SR of RCTs. There was no association between degree of matching based on patient characteristics and concordance of findings of DAs and SRs of RCTs (p = 0.52). Similarly, there was no association between degree of matching based on intervention characteristics (p = 0.21) or control characteristics (p = 0.17) and concordance of findings of DAs and SRs of RCTs.
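As a quick arithmetic check, the stratum counts reported in this paragraph can be tallied to confirm that they add up to the overall result of 27 concordant comparisons out of 37. The sketch below uses only counts stated in the text; the variable and function names are illustrative.

```python
# Concordant / total comparisons by degree of patient-population match,
# as reported in the text.
strata = {
    "optimum": (14, 21),
    "broad": (7, 8),
    "broadest": (6, 8),
}


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


for level, (concordant, total) in strata.items():
    print(f"{level}: {concordant}/{total} = {percent(concordant, total):.1f}%")

overall_concordant = sum(c for c, _ in strata.values())
overall_total = sum(t for _, t in strata.values())
print(
    f"overall: {overall_concordant}/{overall_total} = "
    f"{percent(overall_concordant, overall_total):.1f}%"
)
```

Printing one decimal place makes the rounding behind each whole-number percentage explicit.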
The majority of the sensitivity analyses conducted in DA (33/37) did not impact the concordance between findings of DA and matching SR of RCTs. In three cases, the findings of DA and matching SR of RCTs were similar after the sensitivity analysis [14, 16, 27].
Impact of decision analysis design attributes on concordance
The findings of DAs using multiple data sources were more likely to be concordant with matching SRs of RCTs than those of DAs using a single data source, and this association reached statistical significance (p = 0.05) (Table 4). Incorporation of data from meta-analysis (p = 1.00), use of expert opinion (p = 0.06) and primary versus secondary data collection (p = 0.47) in the design of the DA model did not have any impact on concordance between findings of DAs and SRs of RCTs (Table 4). It is important to note that the meta-analyses used to inform the design of the DAs were published before the DA model and were not used for matching the results of DAs and SRs. The distribution of sample size of SRs of RCTs was similar between the SRs whose findings were in concordance with their matched DAs and the SRs whose findings were discordant with their matched DAs (p = 0.78).
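For readers who want to see how a p-value of this kind arises, here is a minimal pure-Python sketch of a two-sided Fisher's exact test. The margins come from the text (4 single-source versus 33 multiple-source comparisons; 27 concordant versus 10 discordant), but the individual cell counts are an assumption chosen to be consistent with those margins; they are not taken from Table 4.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Assumed cell counts (hypothetical, consistent with the reported margins):
#                   concordant  discordant
# single source          1           3
# multiple sources      26           7
p_value = fisher_exact_two_sided(1, 3, 26, 7)
print(f"two-sided p = {p_value:.4f}")
```

Under these assumed counts the result rounds to 0.05. With only four single-source comparisons, moving a single comparison between cells changes the p-value substantially, which is one reason to interpret this association cautiously.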
Summary of evidence
To our knowledge, this is the first SR to date comparing the results of DAs with matching SRs of RCTs. The findings show that there is a high level of concordance between findings of DAs and matching SRs of RCTs, and that the use of multiple sources of data in decision analyses appears to increase the predictive value of DA.
Comparative effectiveness research (CER) is gaining popularity and assesses multiple interventions by comparing their long-term outcomes. However, in many instances the RCTs and other studies that are used in CER lack the power and the long duration of follow-up needed to assess long-term outcome estimates. Over the past 39 years, DA has been applied to a variety of clinical problems to provide these much-desired estimates to improve clinical decision-making. However, the concordance between findings of the empirical efficacy studies that are used for decision-making (i.e. RCTs) and DA findings is not known. This SR, the largest to date, shows concordant findings of DAs and matching SRs of RCTs in 73% of cases. The findings from our study also emphasize the importance of SRs, as we have previously shown that results of DAs and matching single RCTs agree only about 50% of the time.
Our findings are also in line with the other study on the topic. The study by Bress et al. focused on infectious diseases and assessed the concordance of DA findings with subsequent clinical study results. It determined that findings of DAs were in concordance with findings of clinical studies in 75% of the cases assessed. However, that study was limited to the field of infectious diseases and compared findings of DAs with either RCTs or observational studies. Moreover, it did not comprehensively report the impact of DA design attributes and sample size of matching SRs on concordance between findings of DAs and SRs. We also explored the reasons for concordance and discordance between findings of DAs and matching SRs of RCTs using multiple analyses. Specifically, we investigated the impact of DA design factors and sample size of matching SRs of RCTs on concordance and discordance between findings of DAs and SRs of RCTs. Our results indicate that none of the attributes except use of single versus multiple data sources in the design of DA models is significantly associated with concordance of findings between DAs and matching SRs of RCTs. Sample size of matching SRs of RCTs did not have any impact on concordance or discordance between findings of DAs and SRs of RCTs either. Another factor that may affect concordance between findings of DAs and SRs is the degree of matching between DA and SR PICO attributes, which was assessed in our study. In our study, the interventions and controls studied in DAs closely matched those of the SRs in the majority of cases. However, the patient populations considered in DAs closely matched those of the SRs in only 57% of cases. Nonetheless, the degree of patient population matching did not have any impact on concordance between findings of DAs and SRs of RCTs.
Our study has some limitations. There were a relatively small number of published DAs and an even smaller number with matching SRs of RCTs. However, since DAs are mostly conducted when an RCT is not available, this was expected. Nonetheless, our findings are based on a small sample size (n = 37) and hence should be interpreted with caution. We did not search for unpublished DAs or SRs of RCTs. As noted by Bress et al., our literature search also could not distinguish DAs from other study designs. If “decision analysis” were a MeSH term, such searches would be more efficient and reproducible. As a result, we reviewed a large volume of citations (n = 42,704), and hence our search has not been updated since 2008.
Our results show the high concordance of findings of current DA models compared with findings of SRs of RCTs. Moreover, our results outline the importance of SRs of RCTs compared with a single RCT in medical decision-making. That is, the concordance between DA findings and matching single RCT findings was only 50%, but the concordance between findings of DAs and the matching totality of evidence (i.e. SR of RCTs) was 73%. This underscores the importance of using research synthesis in medical decision-making.
Our study findings are important and informative for the design of DA models. It is known that, unless all clinically important factors have been included, a DA lacks sufficient representativeness to be clinically useful [7, 28–30]. Moreover, DA designs need to follow a consistent set of best practices for selecting empirical evidence (estimates from SR/MA rather than individual studies), adjusting it for bias and incorporating it into the model [3, 31–33]. Our findings further highlight the need for further investigation of the impact of DA design attributes, such as use of meta-analysis data and data from multiple sources, on the clinical rationality of DA models. Investigating the influence of DA design attributes on the usefulness of DA models in decision-making will further improve the use of DAs in healthcare decision-making and policy development.
Salmond SS: Randomized controlled trials: methodological concepts and critique. Orthop Nurs. 2008, 27: 116-122. 10.1097/01.NOR.0000315626.44137.94. quiz 123–114
Deeks JJ: Systematic reviews of published evidence: miracles or minefields?. Ann Oncol. 1998, 9: 703-709. 10.1023/A:1008335706631.
Goldhaber-Fiebert JD: Accounting for biases when linking empirical studies and simulation models. Med Decis Making. 2012, 32: 397-399. 10.1177/0272989X12441398.
Sibbald B, Roland M: Understanding controlled trials. Why are randomised controlled trials important?. BMJ. 1998, 316: 201-10.1136/bmj.316.7126.201.
Pauker SG, Kassirer JP: Decision analysis. N Engl J Med. 1987, 316: 250-258. 10.1056/NEJM198701293160505.
Pauker SG, Kassirer JP: Clinical application of decision analysis: a detailed illustration. Semin Nucl Med. 1978, 8: 324-335. 10.1016/S0001-2998(78)80018-X.
Kassirer JP, Moskowitz AJ, Lau J, Pauker SG: Decision analysis: a progress report. Ann Intern Med. 1987, 106: 275-291. 10.7326/0003-4819-106-2-275.
Sears ED, Chung KC: Decision analysis in plastic surgery: a primer. Plastic Reconstruct Surg. 2010, 126 (4): 1373-1380. 10.1097/PRS.0b013e3181ead10a.
Richardson WS, Detsky AS: Users’ guides to the medical literature. VII. How to use a clinical decision analysis. B. What are the results and will they help me in caring for my patients? Evidence Based Medicine Working Group. JAMA. 1995, 273: 1610-1613. 10.1001/jama.1995.03520440064038.
Richardson WS, Detsky AS: Users’ guides to the medical literature. VII. How to use a clinical decision analysis. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA. 1995, 273: 1292-1295. 10.1001/jama.1995.03520400062046.
Bress JN, Hulgan T, Lyon JA, Johnston CP, Lehmann H, Sterling TR: Agreement of decision analyses and subsequent clinical studies in infectious diseases. Am J Med. 2007, 120 (461): e461-e469.
Mhaskar RS, Kumar A, Djulbegovic B: Agreement of Decision Analyses and Matching Randomized Controlled Trials in Assessment of Treatment Comparisons: A Systematic Review. 2008, Pennsylvania: Society for Medical Decision Making
Wilczynski NL, McKibbon KA, Haynes RB: Sensitive Clinical Queries retrieved relevant systematic reviews as well as primary studies: an analytic survey. J Clin Epidemiol. 2011, 64: 1341-1349. 10.1016/j.jclinepi.2011.04.007.
Elkin EB, Weinstein MC, Kuntz KM, Bunnell CA, Weeks JC: Adjuvant ovarian suppression versus chemotherapy for premenopausal, hormone-responsive breast cancer: quality of life and efficacy tradeoffs. Breast Cancer Res Treat. 2005, 93: 25-34. 10.1007/s10549-005-3380-2.
Cuzick J, Ambroisine L, Davidson N, Jakesz R, Kaufmann M, Regan M, Sainsbury R, LHRH-agonists in Early Breast Cancer Overview group: Use of luteinising-hormone-releasing hormone agonists as adjuvant treatment in premenopausal patients with hormone-receptor-positive breast cancer: a meta-analysis of individual patient data from randomised adjuvant trials. Lancet. 2007, 369: 1711-1723.
Verhoef LC, Stalpers LJ, Verbeek AL, Wobbes T, van Daal WA: Breast-conserving treatment or mastectomy in early breast cancer: a clinical decision analysis with special reference to the risk of local recurrence. Eur J Cancer. 1991, 27: 1132-1137. 10.1016/0277-5379(91)90310-A.
Morris AD, Morris RD, Wilson JF, White J, Steinberg S, Okunieff P, Arriagada R, Le MG, Blichert-Toft M, van Dongen JA: Breast-conserving therapy vs mastectomy in early-stage breast cancer: a meta-analysis of 10-year survival. Cancer J Sci Am. 1997, 3: 6-12.
Boughey JC, Cormier JN, Xing Y, Hunt KK, Meric-Bernstam F, Babiera GV, Ross MI, Kuerer HM, Singletary SE, Bedrosian I: Decision analysis to assess the efficacy of routine sentinel lymphadenectomy in patients undergoing prophylactic mastectomy. Cancer. 2007, 110: 2542-2550. 10.1002/cncr.23067.
Kell MR, Burke JP, Barry M, Morrow M: Outcome of axillary staging in early breast cancer: a meta-analysis. Breast Cancer Res Treat. 2010, 120: 441-447. 10.1007/s10549-009-0705-6.
Higgins KM, Shah MD, Ogaick MJ, Enepekides D: Treatment of early-stage glottic cancer: meta-analysis comparison of laser excision versus radiotherapy. J Otolaryngol Head Neck Surg. 2009, 38: 603-612.
Wong JB, Sonnenberg FA, Salem DN, Pauker SG: Myocardial revascularization for chronic stable angina. Analysis of the role of percutaneous transluminal coronary angioplasty based on data available in 1989. Ann Intern Med. 1990, 113: 852-871. 10.7326/0003-4819-113-11-852.
Jeremias A, Kaul S, Rosengart TK, Gruberg L, Brown DL: The impact of revascularization on mortality in patients with nonacute coronary artery disease. Am J Med. 2009, 122: 152-161. 10.1016/j.amjmed.2008.07.027.
Stalpers LJ, Verbeek AL, van Daal WA: Radiotherapy or surgery for T2N0M0 glottic carcinoma? A decision-analytic approach. Radiother Oncol. 1989, 14: 209-217. 10.1016/0167-8140(89)90169-2.
Nease RF, Ross JM: The decision to enter a randomized trial of tamoxifen for the prevention of breast cancer in healthy women: an analysis of the tradeoffs. Am J Med. 1995, 99: 180-189. 10.1016/S0002-9343(99)80138-7.
Cuzick J, Powles T, Veronesi U, Forbes J, Edwards R, Ashley S, Boyle P: Overview of the main outcomes in breast-cancer prevention trials. Lancet. 2003, 361: 296-300. 10.1016/S0140-6736(03)12342-2.
Practical Nonparametric Statistics. Edited by: Conover WJ. 1999, John Wiley & Sons, 3
Col NF, Eckman MH, Karas RH, Pauker SG, Goldberg RJ, Ross EM, Orr RK, Wong JB: Patient-specific decisions about hormone replacement therapy in postmenopausal women. JAMA. 1997, 277: 1140-1147. 10.1001/jama.1997.03540380054031.
Dolan JG: Can decision analysis adequately represent clinical problems?. J Clin Epidemiol. 1990, 43: 277-284. 10.1016/0895-4356(90)90008-D.
Dolan JG: Clinical decision analysis. Med Decis Making. 2001, 21: 150-151.
Dolan JG: Multi-criteria clinical decision support: a primer on the use of multiple criteria decision making methods to promote evidence-based, patient-centered healthcare. Patient. 2010, 3: 229-248. 10.2165/11539470-000000000-00000.
Weinstein MC, O’Brien B, Hornberger J, Jackson J, Johannesson M, McCabe C, Luce BR, ISPOR Task Force on Good Research Practices--Modeling Studies: Principles of good practice for decision analytic modeling in health-care evaluation: report of the ISPOR Task Force on Good Research Practices–Modeling Studies. Value Health. 2003, 6: 9-17. 10.1046/j.1524-4733.2003.00234.x.
Philips Z, Bojke L, Sculpher M, Claxton K, Golder S: Good practice guidelines for decision-analytic modelling in health technology assessment: a review and consolidation of quality assessment. Pharmacoeconomics. 2006, 24: 355-371. 10.2165/00019053-200624040-00006.
Philips Z, Ginnelly L, Sculpher M, Claxton K, Golder S, Riemsma R, Woolacoot N, Glanville J: Review of guidelines for good practice in decision-analytic modelling in health technology assessment. Health Technol Assess. 2004, 8 (iii-iv, ix-xi): 1-158.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1472-6947/14/57/prepub
All authors had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. All authors declare no conflict of interest.
RM conducted the search, identified all the DA models, and extracted the data from DA models. HW and HM extracted data from SR of RCTs. AK and BD randomly checked the accuracy of extracted data. RM wrote the first draft of the manuscript. HW, HM, AK and BD contributed to the final draft of this manuscript. All authors read and approved the final manuscript.
Cite this article
Mhaskar, R.S., Wao, H., Mahony, H. et al. Concordance between decision analysis and matching systematic review of randomized controlled trials in assessment of treatment comparisons: a systematic review. BMC Med Inform Decis Mak 14, 57 (2014). https://doi.org/10.1186/1472-6947-14-57