Study Setting
Massachusetts regulations require all acute care non-federal hospitals that provide cardiac surgery to collect data using a standardized data collection instrument based on the Society of Thoracic Surgeons (STS) registry [15]. Each institution is required to submit data quarterly to Mass-DAC, and participating centers collect the data using a variety of point-of-care collection tools, chart review, and patient follow-up. Mass-DAC manually adjudicates all cases with adverse outcomes as well as a sample of all other case records. Yearly reports of hospital and surgeon 30-day mortality performance are published. Additional information and annual public reports are available online [3].
A total of 23,020 admissions for isolated adult coronary artery bypass surgery occurred from January 1, 2002 to September 30, 2007. These surgeries did not involve valve replacement or other associated cardiac surgical procedures. We selected them for our study because the state uses them as the primary index of institution and surgeon quality for cardiovascular surgery. In 2006, Mass-DAC changed reporting from a calendar year basis to a fiscal year basis that runs from October 1 through September 30. Consequently, the 2006 fiscal year analysis included the last three months of the 2005 calendar year. The primary patient outcome of the registry is 30-day all-cause mortality after isolated coronary artery bypass graft surgery. We focused on 30-day all-cause hospital-specific risk-standardized mortality rates. The current study was approved by the Institutional Review Boards of Brigham and Women's Hospital and Harvard Medical School.
Gold Standard Statistical Analysis
Mass-DAC reports the data annually using Bayesian hierarchical logistic regression [1]. The model assumes that the log-odds of mortality is linearly related to a set of patient risk factors and permits baseline risk to vary across hospitals through the inclusion of a hospital-specific intercept. Estimates of the model parameters, including the between-hospital variance, the overall mean hospital log-odds, and the regression coefficients of the patient-level risk factors, are obtained via Markov chain Monte Carlo (MCMC) methods. The MCMC method uses the Gibbs sampler to sample sequentially from probability distributions and produces a Markov chain with the joint posterior density as its stationary distribution [16]. This is accomplished by selecting a set of starting values, performing a number of "burn-in" sampling iterations that are not recorded, and then collecting and averaging additional sampling iterations to form the final posterior estimates. The primary analysis used all of the data for the year and declared a hospital an outlier if the lower limit of the 95% posterior interval of the risk-adjusted institutional standardized mortality rate exceeded the unadjusted statewide mortality rate. Because of the small number of cardiac surgery hospitals in Massachusetts, Mass-DAC also performs cross-validation analyses in which each hospital is eliminated in turn and data from the remaining hospitals are used to assess performance in the eliminated hospital. This strategy was developed to prevent one large center from having too great an influence on statewide risk expectations. A hospital was considered an outlier if either the lower limit of the 95% posterior interval from the statewide comparison exceeded the unadjusted statewide mortality rate or the posterior predictive p-value from the cross-validation analysis was 0.01 or smaller.
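The two-part outlier rule described above can be sketched in a few lines. This is a minimal illustration, not Mass-DAC's implementation: it assumes posterior draws of a hospital's risk-standardized mortality rate are already available from the MCMC run, and the function names (`percentile`, `is_outlier`) and the linear-interpolation percentile are hypothetical choices made for this example.

```python
from typing import Optional, Sequence


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Linear-interpolation percentile (q in [0, 1]) of pre-sorted values."""
    if not sorted_values:
        raise ValueError("need at least one posterior draw")
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def is_outlier(
    smr_draws: Sequence[float],
    statewide_rate: float,
    cv_p_value: Optional[float] = None,
    p_cutoff: float = 0.01,
) -> bool:
    """Flag a hospital using the two criteria described in the text.

    smr_draws      -- posterior draws of the hospital's standardized mortality
                      rate (assumed to come from the hierarchical model)
    statewide_rate -- unadjusted statewide mortality rate
    cv_p_value     -- posterior predictive p-value from the leave-one-hospital-out
                      analysis, or None if that analysis was not run
    """
    draws = sorted(smr_draws)
    lower_limit = percentile(draws, 0.025)  # lower end of the 95% interval
    interval_flag = lower_limit > statewide_rate
    cv_flag = cv_p_value is not None and cv_p_value <= p_cutoff
    return interval_flag or cv_flag
```

Either criterion alone is sufficient to flag a hospital, which is why a hospital can be declared an outlier through the cross-validation analysis even when its posterior interval still covers the statewide rate.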
Hospital 9 was declared an outlier in the 2002-2005 reports, and hospital 8 was declared an outlier in the 2004 report. All of the hospital outliers were detected only by the cross-validation evaluation performed by Mass-DAC. Although the original 2002 public report did not include the cross-validation analysis method, we reproduced this analysis protocol exactly on these data to provide consistency across all years. A summary of the risk-adjusted standardized mortality incidence rate (with upper and lower limits of the 95% interval) by hospital and year is reported in Figure 1. The dotted line in the figure represents the statewide unadjusted mortality incidence rate, and values in red indicate an outlying hospital.
Automated Risk Adjusted Sequential Probability Ratio Testing (RA-SPRT)
The SPRT control chart methodology detects unacceptable event rates by evaluating each unit of analysis sequentially in time [12, 13]. The hypothesis tested is whether the observed outcome event rate exceeds the accepted or baseline event rate, given a specific odds ratio (OR) and type I and type II error rates [12]. The method accepts or rejects this hypothesis, or continues monitoring, after each sequential case is evaluated. Risk adjustment is performed through a risk prediction model, whereby the cumulative log-likelihood ratio is adjusted by the predicted probability of the outcome [17]. Adjustments for repeated measurement error (re-analysis after each additional case) are incorporated explicitly in the framework. These features are uncommon in statistical process control methods and are strengths of this method.
The following describes the calculations necessary to construct risk-adjusted SPRT control charts as refined by Rogers and colleagues [13] based on Spiegelhalter's work [12]. The control limits are defined by
$$h_0 = \frac{\ln\left(\frac{\beta}{1-\alpha}\right)}{\ln(OR)} \tag{1}$$
and
$$h_1 = \frac{\ln\left(\frac{1-\beta}{\alpha}\right)}{\ln(OR)} \tag{2}$$
where h0 and h1 are the cumulative log-likelihood ratio values at which the null hypothesis (H0) or the alternate hypothesis (H1) is accepted (respectively), OR is the odds ratio, α is the type I error rate, and β is the type II error rate.
The cumulative log-likelihood ratio value (Tcum) is calculated in sequence for higher risk detection (OR > 1) with
$$T^{cum}_{i} = T^{cum}_{i-1} + (O_i - s_i) \tag{3}$$
where Tcum is 0 before the first case ($T^{cum}_{0} = 0$), Oi is the observed binary outcome (0 or 1) for the ith case, and
$$s_i = \frac{\ln(1 - p_i + OR \cdot p_i)}{\ln(OR)} \tag{4}$$
where pi is the calculated probability of the outcome for the ith case as determined by the risk prediction model.
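The control limits and the sequential update of Tcum can be illustrated with a short sketch. It assumes the usual Wald limits scaled by ln(OR) and a per-case increment of Oi minus ln(1 − pi + OR·pi)/ln(OR); the function names `sprt_limits` and `ra_sprt`, and the decision labels, are hypothetical and are not part of DELTA.

```python
import math
from typing import Iterable, List, Tuple


def sprt_limits(odds_ratio: float, alpha: float, beta: float) -> Tuple[float, float]:
    """Return (h0, h1): the lower (accept H0) and upper (accept H1) limits."""
    if odds_ratio <= 1:
        raise ValueError("this sketch covers higher-risk detection only (OR > 1)")
    log_or = math.log(odds_ratio)
    h0 = math.log(beta / (1 - alpha)) / log_or  # negative value
    h1 = math.log((1 - beta) / alpha) / log_or  # positive value
    return h0, h1


def ra_sprt(
    cases: Iterable[Tuple[int, float]],
    odds_ratio: float,
    alpha: float,
    beta: float,
) -> Tuple[List[float], str]:
    """Run a risk-adjusted SPRT over (observed_outcome, predicted_probability) pairs.

    Returns the path of cumulative values and one of
    "accept_H1", "accept_H0", or "continue" (no limit crossed).
    """
    h0, h1 = sprt_limits(odds_ratio, alpha, beta)
    log_or = math.log(odds_ratio)
    t_cum = 0.0
    path: List[float] = []
    for observed, p in cases:
        # Risk-adjusted expected contribution of this case.
        s = math.log(1 - p + odds_ratio * p) / log_or
        t_cum += observed - s
        path.append(t_cum)
        if t_cum >= h1:
            return path, "accept_H1"
        if t_cum <= h0:
            return path, "accept_H0"
    return path, "continue"
```

With OR = 2, a type I error rate of 0.05, and a type II error rate of 0.10, this sketch gives limits of roughly −3.25 and 4.17, so a run of deaths among low-risk patients crosses the upper limit quickly, whereas many consecutive survivors are needed to reach the lower limit.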
We have previously described an automated real-time safety monitoring tool, Data Extraction and Longitudinal Trend Analysis (DELTA), that can perform large numbers of concurrent prospective analyses using a variety of statistical methodologies and alerting thresholds [18]. The system uses Microsoft SQL Server 2005 (Microsoft Corp., Redmond, WA) to provide internal data storage and configuration information, as well as the capability to integrate with external databases. The user interface was developed in the Microsoft .NET programming environment and is displayed in a web browser from a Microsoft IIS 6.0 Web Server (Microsoft Corp., Redmond, WA). Security of patient data is further addressed by record de-identification steps and user login access restrictions [19].
The RA-SPRT method was implemented directly within DELTA. The statistical module evaluated the data after the cohort inclusion and exclusion criteria and the necessary statistical parameters, such as the desired odds ratio and the type I and type II error rates, had been set. Parameters and risk variable selections for the logistic regression risk adjustment were then passed through a bi-directional interface to SAS (Version 9.1, SAS Institute, Cary, NC) to develop the required logistic regression models.
Statistical Analysis
The risk-adjusted sequential probability ratio testing (RA-SPRT) method was used to evaluate the data separately for each calendar or fiscal year. Although one of the strengths of this method is that it can accumulate data continuously until the alerting odds ratio hypothesis is accepted or rejected, analyses were terminated at the end of each calendar or fiscal year and the cumulative log-likelihood ratio was reset to 0. This was done so that the results would be directly comparable to the gold standard. Risk adjustment was performed using standard logistic regression with the same risk factors used in the source method by Mass-DAC. Each logistic regression model was developed from data in the prior 11 months and then applied sequentially to each case in the "current" month. This process was repeated throughout the entire range of the data. Data were not available prior to 2002, so the models developed prior to December 2002 used from one to ten months of data (depending on the analysis month). Data from January 2002 were not analyzed by RA-SPRT because they were required to build the first model. The risk models developed from the first two months of data showed regression coefficient instability and poor calibration, as would be expected with small sample sizes. Both of these measures subsequently stabilized for the remainder of the data. A type I error rate of 0.05 and a type II error rate of 0.10 were used in each of the RA-SPRT analyses, and an OR of 2.0 was defined as a reasonable threshold for concern regarding the clinical quality of the institution evaluated. An outlier was declared if a hospital exceeded the log-likelihood ratio threshold at any point during that calendar or fiscal year.
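The rolling model-development schedule can be made concrete with a small sketch. It assumes months are indexed from 0 (January 2002) and that each model is fit on up to the 11 months preceding the month it is applied to; `training_window` and `monthly_plan` are hypothetical helper names, and the logistic regression fit itself is not shown.

```python
from typing import List, Tuple


def training_window(month_index: int, window: int = 11) -> List[int]:
    """Months whose cases are used to fit the model applied to `month_index`.

    Index 0 is the first month of available data. Early months have fewer
    than `window` months of history, so their windows are shorter.
    """
    start = max(0, month_index - window)
    return list(range(start, month_index))


def monthly_plan(n_months: int, window: int = 11) -> List[Tuple[int, List[int]]]:
    """Pair every analyzable month with its training months.

    Month 0 is skipped because it has no prior data and is needed
    to build the first model.
    """
    return [(m, training_window(m, window)) for m in range(1, n_months)]
```

Under these assumptions, the model applied to month 1 (February 2002) is built from a single month of data, the model applied to month 10 (November 2002) from ten months, and every model from month 11 (December 2002) onward from a full 11-month window that advances one month at a time.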