Open Access

Use of outcomes to evaluate surveillance systems for bioterrorist attacks

  • Kerry A McBrien1, 2,
  • Ken P Kleinman3,
  • Allyson M Abrams3 and
  • Lisa A Prosser3, 4
BMC Medical Informatics and Decision Making 2010, 10:25

DOI: 10.1186/1472-6947-10-25

Received: 22 June 2009

Accepted: 7 May 2010

Published: 7 May 2010

Abstract

Background

Syndromic surveillance systems can potentially be used to detect a bioterrorist attack earlier than traditional surveillance, by virtue of their near real-time analysis of relevant data. Receiver operating characteristic (ROC) curve analysis using the area under the curve (AUC) as a comparison metric has been recommended as a practical evaluation tool for syndromic surveillance systems, yet traditional ROC curves do not account for timeliness of detection or subsequent time-dependent health outcomes.

Methods

Using a decision-analytic approach, we predicted outcomes, measured in lives, quality-adjusted life years (QALYs), and costs, for a series of simulated bioterrorist attacks. We then evaluated seven detection algorithms applied to syndromic surveillance data, comparing outcomes-weighted ROC curves with simple ROC curves and timeliness-weighted ROC curves. We performed sensitivity analyses by varying the model inputs between best and worst case scenarios and by applying different methods of AUC calculation.

Results

The decision analytic model results indicate that if a surveillance system was successful in detecting an attack, and measures were immediately taken to deliver treatment to the population, the lives, QALYs and dollars lost could be reduced considerably. The ROC curve analysis shows that the incorporation of outcomes into the evaluation metric has an important effect on the apparent performance of the surveillance systems. The relative order of performance is also heavily dependent on the choice of AUC calculation method.

Conclusions

This study demonstrates the importance of accounting for mortality, morbidity and costs in the evaluation of syndromic surveillance systems. Incorporating these outcomes into the ROC curve analysis allows for more accurate identification of the optimal method for signaling a possible bioterrorist attack. In addition, the parameters used to construct an ROC curve should be given careful consideration.

Background

Given the realistic possibility of bioterrorist attacks, a key public health challenge lies in identifying practical disease surveillance methods that will minimize associated casualties and costs by enabling a timely response. Illnesses caused by many bioterrorism agents, including anthrax, present with a prodrome indistinguishable from that of influenza or other common illnesses, making syndromic surveillance systems a useful option for the detection of bioterrorist attacks [1]. Traditional public health surveillance methods rely upon reported disease-specific diagnoses; syndromic surveillance systems instead use statistical algorithms to detect aberrations in pre-diagnostic data. For example, cases of inhalational anthrax may manifest as an increase in the number of ICD-9 codes for bronchitis, cough, or pneumonia in an electronic medical record system [2].

Due to the paucity of authentic data on bioterrorist attacks, researchers have used simulated bioterrorist attacks to assess the performance of syndromic surveillance systems [3]. Previous studies have used the sensitivity, specificity, predictive values, and variations of receiver operating characteristic (ROC) curves to evaluate the performance of syndromic surveillance systems using simulated data [3]. The success of a surveillance system, however, will depend not only on whether the attack was detected, but also on the timeliness with which it was detected [4]. Kleinman et al simulated a set of hypothetical bioterrorist attacks with anthrax [2] and, using modified ROC curve analysis, evaluated seven detection algorithms by weighting the sensitivity measure by the time lag in detecting an attack. They found that both the absolute and the relative performance of the systems differed after timeliness was incorporated into the metric [4].

Although timeliness adds an important element to the sensitivity metric, it remains a proxy for key health and financial outcomes: deaths, illnesses, and costs. In a follow-up study, Kleinman et al weighted the sensitivity metric by the proportion of affected individuals and found again that the weighting changed the relative performance of the systems [5]. This study extends previous research by incorporating associated costs, lives lost and illness averted into the sensitivity metric of ROC curves. It accounts for the health and financial benefits of early detection, while also accounting for consequences of side effects of prophylaxis, adverse events from treatment, and the long-term sequelae of disease.

Methods

We simulated a series of anthrax attacks and performed an evaluation of seven statistical detection algorithms applied to syndromic surveillance data. The evaluation employed weighted ROC curves that incorporate the following outcomes in order of increasing comprehensiveness: lives, quality-adjusted life years (QALYs), and costs. A decision-analytic approach was used to predict outcomes using data from the simulated attacks. Predicted outcomes were then used to construct outcomes-weighted ROC curves for each of the candidate metrics.

Simulated Attack Data

We used two data sets: simulated attack data and observed surveillance system data. Full data on the simulations and observed data can be found in a previous publication; we offer a brief summary here [2]. The observed data are counts of respiratory complaints recorded in electronic medical records of the ambulatory encounters of approximately 250,000 patients in eastern Massachusetts. In the simulation data, an attack is assumed to be the result of the release of anthrax spores from a crop-dusting airplane. Each attack produced simulated cases; a fraction of these, matching the share of the total population covered by the observed surveillance system, was added to the surveillance system data by zip code. We then assessed the evidence for the attack using various statistical algorithms described below. Since detection ability depends on the time of year and day of the week, we repeated this exercise with three separate simulated attacks on each day of 2003. We recorded the simulated number affected on each day. The total population in this region is approximately 2.5 million [2].

We recorded which of the simulated attacks were detected, and on which day, by each of seven algorithms, using 11 different sensitivity thresholds. Three of the algorithms (Scan 1, Scan 3, and Scan 7) used space-time scan statistic methods with maximum signal durations of one, three, or seven days, respectively [6]. Three others (GLMM 1, GLMM 3, and GLMM 7) used a Poisson generalized linear mixed effects model with fixed one-, three-, or seven-day durations [7]. The last method used a time series approach [4]. The number of false positive signals that occurred over a one-year period for each of the seven methods and thresholds was calculated as the number of "detections" in the absence of simulated attacks.

Decision analytic model

Figure 1 depicts the states that an individual may progress through following a bioterrorist attack with anthrax. A healthy individual exposed to spores of Bacillus anthracis may develop anthrax illness. The illness begins in the prodromal phase and, if not treated, will progress to the fulminant phase according to a time-dependent probability distribution [8]. Death was assumed to occur within 24 hours of developing fulminant disease, regardless of whether medical treatment was provided [8]. Using the time-dependent probability of disease progression, we were able to estimate the number of individuals that would be expected to fall into each illness state (i.e., prodromal illness, fulminant disease, recovery with and without long-term sequelae, death) on each of days one through ten following a given attack.
Figure 1

Decision model. Decision analytic model for a bioterrorist attack with Bacillus anthracis.

If an attack was signaled by the detection system, it was assumed that all cases of anthrax would receive appropriate and timely medical care. Those with prodromal illness would be treated with multiple antibiotics, with regimens similar to those used for the anthrax attacks in the U.S. in 2001 [8]. If treated, an individual with prodromal illness has a chance of survival. Those who recover may experience sequelae of anthrax illness. Those with fulminant disease would be admitted to critical care units and treated, but were assumed to ultimately die from anthrax exposure.

The remainder of the population of Eastern Massachusetts would be considered at risk, and would be started on antibiotic prophylaxis with a 60-day course of oral ciprofloxacin. Prophylaxis was considered to be 100% effective in preventing the development of illness among those exposed to anthrax. A percentage of these individuals would be expected to experience mild or severe adverse effects from the medication. However, it was assumed that oral ciprofloxacin does not carry any risk of mortality.

Decision Analysis Inputs

The model inputs were derived using a combination of literature review, empirical calculation, and expert interview. They are summarized in Table 1.
Table 1

Decision analytic model inputs

Variable | Value | Range for sensitivity analysis | Source
Health state transition probabilities | | |
Probability of recovery from fulminant anthrax | 0 | - | [8]
Probability of recovery from prodromal anthrax | 0.857 | 0.66-0.9375 | [8]
Probability of prophylactic antibiotic effectiveness | 1.0 | - | [9]
Probability of developing mild side effects from antibiotics | 0.57 | 0.3-0.57 | [10]
Probability of developing severe side effects from antibiotics | 0.003 | 0-0.01 | [10]
Probability of seeking medical attention for mild side effects | 0.16 | 0-0.25 | [10]
Probability of seeking medical attention for severe side effects | 1.0 | - | [10]
Health state utility values | | |
Death | 0 | - |
Recovery from anthrax | 0.56* | 0.4*-0.56 | [12, 13, 24]
Mild side effects from antibiotics | 0.998 | 0.994-0.999 | [10, 14]
Severe side effects from antibiotics | 0.992 | 0.980-0.997 | [10]
No side effects from antibiotics - healthy | 1.0 | - |
Cost estimates (2006 USD) | | |
Cost of treatment of prodromal anthrax | 9,223 | 9,223-18,446 | [15-17]
Cost of a course of prophylactic antibiotics | 638 | - | [19, 20]
Cost of office visit for mild side effects | 30 | - | [17]
Cost of treatment of severe side effects | 189 | - | [17, 18]
Willingness to pay to avoid sequelae associated with recovery from anthrax | 214,910§ per year | Life*-5 yrs | [12, 22]
Value of a statistical life | 7.3 million | 5.51 million-13.23 million | [22]

*This health state utility was matched to the EQ-5D [24], and assumed to continue throughout the life span. An average age of 36.5 years was assumed according to the US Census Bureau and average remaining life span of 43 years was calculated using the 2002 US Life Tables [25, 26].

A five year duration for this health state utility was used as the upper bound.

The upper bound was estimated by doubling the cost for the base case.

§This estimate was adjusted to assume that one out of every six anthrax patients would be able to return to work.

Analysis of Lives and Quality-Adjusted Life-Years (QALYs)

Three main endpoints were included in the analysis: lives, quality-adjusted life years, and costs. The total number of deaths, given detection on a given day following an attack, was determined by summing the number of fulminant and dead cases and adding to this the number of prodromal cases expected to progress to the fulminant stage despite treatment. The probabilities of progressing through the states of anthrax illness were taken from a comprehensive review of anthrax illness [8]. The base case estimate is that associated with the 2001 US attacks, where six out of seven individuals with prodromal illness survived. The probabilities associated with prophylactic antibiotics were derived from CDC surveillance reports of the 2001 prophylaxis population [9, 10]. Other probabilities were derived from the literature and expert opinion (P. Brachman, pers. comm.).
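The death count described above can be sketched as a short calculation. This is a minimal illustration rather than the authors' model: the function name and the state counts in the example are hypothetical, and only the recovery probability comes from Table 1.

```python
# Base-case probability of recovery from treated prodromal anthrax (Table 1).
P_RECOVER_PRODROMAL = 0.857


def expected_deaths(prodromal, fulminant, dead, p_recover=P_RECOVER_PRODROMAL):
    """Expected total deaths if the attack is detected with these state counts.

    Fulminant cases are assumed to die despite treatment (recovery
    probability 0 in Table 1); treated prodromal cases die with
    probability 1 - p_recover.
    """
    return dead + fulminant + prodromal * (1.0 - p_recover)


# Hypothetical state counts on the day of detection, for illustration only.
deaths = expected_deaths(prodromal=700, fulminant=50, dead=10)
print(round(deaths, 1))
```

People who are exposed but not yet ill do not appear in the sum because prophylaxis is assumed to be fully effective.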

To calculate the number of quality-adjusted life years gained through earlier detection, each final health state was assigned a utility value between zero and one, one being equivalent to perfect health and zero being equivalent to death. Utilities represent individuals' relative preference for a health state and are used to adjust for the lower quality of life associated with short- or long-term morbidity [11]. By adjusting years of life by their associated health state utilities, the number of quality adjusted life years (QALYs) can be calculated. One QALY can be roughly described as equivalent to one year in perfect health. This metric accounts for both life years gained due to averted deaths and morbidity averted due to earlier detection. The number of life years gained for an averted death was calculated using US life tables and an average population age of 36.5 years.

The long-term sequelae of anthrax illness include depression, anxiety and long-term cardiac and respiratory disability [12]. The base case estimate for the utility associated with this state assumes life-long sequelae [13]. The range for sensitivity analysis varies the duration of sequelae from 5 years to life. Side effects from antibiotics were assumed to last for three days in the base case and were varied from one to seven days in the sensitivity analysis. Health state utilities were derived from the literature [14]. An annual discount rate of 0.03 was applied to all future health states.
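As a rough illustration of how discounting enters the QALY calculation, the sketch below uses the 3% discount rate, the 43-year remaining life span, and the base-case utility of 0.56 from the text and Table 1. The discounting convention (discrete annual steps, first year undiscounted) and the function names are assumptions, not details confirmed by the source.

```python
DISCOUNT_RATE = 0.03       # annual discount rate stated in the text
REMAINING_YEARS = 43       # average remaining life span (Table 1 footnote)
UTILITY_SEQUELAE = 0.56    # base-case utility after recovery from anthrax


def discounted_years(years=REMAINING_YEARS, rate=DISCOUNT_RATE):
    """Present value of `years` future life years.

    Assumption: year t (counting from 0) is discounted by (1 + rate) ** t.
    """
    return sum(1.0 / (1.0 + rate) ** t for t in range(years))


def qalys_lost_per_death():
    """QALYs lost when a person in full health dies at the average age."""
    return discounted_years()


def qalys_lost_per_recovered_case(utility=UTILITY_SEQUELAE):
    """QALYs lost by a survivor who lives with life-long sequelae."""
    return (1.0 - utility) * discounted_years()


print(qalys_lost_per_death() > qalys_lost_per_recovered_case())
```

Shortening the duration of sequelae to five years, as in the sensitivity analysis, would correspond to calling `discounted_years(5)` in the second function.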

Cost analysis

Defining outcomes in terms of costs allows for the most comprehensive adjustment as it takes into account morbidity and mortality, as well as cost of treatment. Costs of medical care were based on Medicare payment rates, wholesale prescription drug prices, and published estimates from health economic literature. The cost of treating prodromal anthrax included costs of hospitalization and infectious disease consultative services [15-17]. The cost associated with side effects is that of a brief office visit for those with mild side effects who seek treatment and that of an emergency room visit for all those with severe side effects [17, 18]. The cost of prophylactic antibiotics assumed a 60-day regimen of oral ciprofloxacin [19, 20]. All costs were converted to 2006 US dollars [21].

In the cost analysis, health effects are converted to dollar values using published willingness-to-pay amounts and estimates for the value of a statistical life [12, 22]. For example, the published value for willingness to pay to avoid permanent disability was $1,032 per day and this value would replace the health state utility value used in the QALY analysis [22]. Therefore, the cost analysis includes direct costs of medical care as well as morbidity and mortality effects converted into dollar values.
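The direct medical cost of prophylaxis can be illustrated with the base-case inputs in Table 1. The sketch below is an assumption about how those inputs combine into an expected cost per person; it covers direct costs only and leaves out the monetised morbidity and mortality terms described above. The function names are illustrative.

```python
# Base-case inputs copied from Table 1 (2006 US dollars).
COST_PROPHYLAXIS = 638.0
COST_MILD_VISIT = 30.0
COST_SEVERE_TREATMENT = 189.0
P_MILD = 0.57
P_SEVERE = 0.003
P_SEEK_CARE_MILD = 0.16
P_SEEK_CARE_SEVERE = 1.0


def prophylaxis_cost_per_person():
    """Expected direct cost of one 60-day course, including side-effect care."""
    mild = P_MILD * P_SEEK_CARE_MILD * COST_MILD_VISIT
    severe = P_SEVERE * P_SEEK_CARE_SEVERE * COST_SEVERE_TREATMENT
    return COST_PROPHYLAXIS + mild + severe


def prophylaxis_cost(population):
    """Expected direct cost of giving prophylaxis to `population` people."""
    return population * prophylaxis_cost_per_person()


print(round(prophylaxis_cost_per_person(), 3))  # 641.303
```

The same per-person figure applies whether the alarm is true or false, which is why false positive signals carry a cost in this framework.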

Weighted ROC curves

We used weighted ROC curves to determine the relative performance of the detection algorithms. Traditional ROC curves are constructed by plotting the sensitivity versus 1-specificity for various decision thresholds of a test. A comparison of the area under the ROC curve (AUC) can be used to determine the relative performance of several different tests [4]. The weighted ROC curves used here replace the sensitivity of the test traditionally found on the y-axis of the ROC curve with a metric weighted by health outcomes.

In the case of disease surveillance, the benefit of detection depends on the number of adverse events averted, which in turn depends on the timeliness of detection. The surveillance system was considered to have failed if the attack was not detected before the tenth day, as it is generally accepted that an anthrax epidemic would be caught by traditional surveillance methods no later than the tenth day [4]. Using the expected outcomes on day nine as a baseline, we calculated the benefits of early detection by weighting each detected attack by the proportion of the outcome saved given the day of detection. The proportions of lives saved for each detected attack were then averaged together with undetected attacks, over all 1095 simulations, to determine a weighted sensitivity for each algorithm at each threshold. This same procedure was used to determine a weighted sensitivity for QALYs and costs saved.
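The averaging step described above can be sketched as follows. The function name, the weight table, and the example detections are hypothetical; the sketch assumes the weights have already been expressed as the proportion of the outcome saved relative to the day-nine baseline.

```python
def weighted_sensitivity(detection_days, proportion_saved):
    """Average the outcome-weighted detections over all simulated attacks.

    detection_days: one entry per simulated attack, giving the day of
        detection or None when the attack was missed.
    proportion_saved: maps day of detection to the fraction of the outcome
        saved relative to the day-nine baseline (values in [0, 1]).
    """
    if not detection_days:
        raise ValueError("at least one simulated attack is required")
    total = 0.0
    for day in detection_days:
        # Missed attacks, and days absent from the weight table, contribute zero.
        if day is not None:
            total += proportion_saved.get(day, 0.0)
    return total / len(detection_days)


# Hypothetical weights and detections, for illustration only.
example_weights = {1: 1.0, 4: 0.5, 9: 0.0}
example_days = [1, 4, None, 9]
print(weighted_sensitivity(example_days, example_weights))  # 0.375
```

With every weight set to 1 for days one through nine, the function reduces to the ordinary (unweighted) sensitivity.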

The weighted ROC curves were constructed with the weighted sensitivity on the y-axis, and the false positive rate per day on the x-axis. The area under the weighted ROC curves (AUC) was calculated for each of the seven statistical algorithms across the three dimensions of lives, QALYs and costs. The relative performance of the seven statistical algorithms was assessed by comparing their respective AUCs. In the base case analysis, the AUC was calculated using a non-parametric trapezoidal method [23].
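A non-parametric trapezoidal AUC can be sketched in a few lines. The anchoring at the origin and the optional extrapolation to (1, 1) are assumptions about curve construction made for this illustration, and the example points are hypothetical.

```python
def trapezoidal_auc(points, extrapolate_to=(1.0, 1.0)):
    """Trapezoidal area under a weighted ROC curve.

    points: (false_positive_rate, weighted_sensitivity) pairs, one per
        threshold, in any order.
    extrapolate_to: final point appended when the curve stops short of it;
        pass None to integrate only over the observed range.
    """
    pts = sorted(points)
    if pts[0][0] > 0.0:
        # Assumption: the curve starts at the origin.
        pts.insert(0, (0.0, 0.0))
    if extrapolate_to is not None and pts[-1][0] < extrapolate_to[0]:
        pts.append(extrapolate_to)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical curve with two thresholds, for illustration only.
print(round(trapezoidal_auc([(0.05, 0.2), (0.2, 0.5)]), 4))  # 0.6575
```

When the observed points cluster at low false positive rates, most of the area comes from the extrapolated segment.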

Sensitivity Analysis

Three types of sensitivity analyses were performed. First, input parameters were varied using the ranges identified in Table 1 to represent best and worst case scenarios. These scenarios reflect the parameter sets that bound the results for best and worst performance of the surveillance systems. Second, we reduced the proportion of the population required to receive prophylaxis, assuming it would be possible to accurately identify the location of the release, thus requiring prophylaxis for only 40% of the population. Third, we used alternate methods to calculate the AUC: a rectangular method and a truncated method. The rectangular method assumes that movement between test thresholds is discrete and that the sensitivity of the systems remains constant between false positive rates. The truncated method is used to reflect the fact that a high false positive rate would not be tolerated, and that only a portion of the curve is relevant. We used a false positive rate of 0.1 alarms per day as a cut-off point. Figure 2 shows a graphical depiction of the three AUC methods.
Figure 2

AUC calculation methods. Weighted receiver operating characteristic curve for the generalized linear mixed effects model (GLMM 1) using lives-weighted sensitivity, depicting three methods for determining the area under the curve (AUC). From top to bottom: trapezoidal, rectangular, and truncated.
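The two alternative area calculations can be sketched as below. Several details are assumptions rather than facts from the source: that the rectangular method holds the last observed sensitivity out to a false positive rate of 1, that the truncated method integrates trapezoids up to the cut-off with linear interpolation, and the `normalize` flag, since the source does not say whether the truncated area was rescaled. The example points are hypothetical.

```python
def rectangular_auc(points, x_max=1.0):
    """Step-function area: sensitivity is held constant until the next threshold."""
    pts = sorted(points)
    area = 0.0
    for i, (x, y) in enumerate(pts):
        # Assumption: the last sensitivity is held out to x_max.
        next_x = pts[i + 1][0] if i + 1 < len(pts) else x_max
        area += (next_x - x) * y
    return area


def truncated_auc(points, cutoff=0.1, normalize=False):
    """Trapezoidal area restricted to false positive rates at or below cutoff."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= cutoff:
            break
        if x1 > cutoff:
            # Interpolate linearly to find the sensitivity at the cut-off.
            y1 = y0 + (y1 - y0) * (cutoff - x0) / (x1 - x0)
            x1 = cutoff
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area / cutoff if normalize else area


# Hypothetical curve, for illustration only.
curve = [(0.0, 0.0), (0.05, 0.2), (0.2, 0.5)]
print(round(rectangular_auc(curve), 4))  # 0.43
print(round(truncated_auc(curve), 4))    # 0.0175
```

Because the rectangular method caps each algorithm at its last observed sensitivity and the truncated method ignores everything beyond the cut-off, both can rank algorithms differently from the trapezoidal method.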

Results

Lives, QALYs, and Costs by Day of Detection

Figure 3 shows the number of people predicted to have each of the three phases of anthrax illness on days 1 through 9 following an attack. The remainder of the population at risk would be eligible for prophylaxis. Of note, the increase in the total number of people affected each day follows a non-linear pattern. Figure 4 depicts the number of lives, QALYs and costs that we predict could be saved by day of detection. Again, the change by day is non-linear, indicating that a one-day delay in detection has a differential impact depending on the number of days that have elapsed since the attack. For instance, a delay from day 4 to day 5 would result in a larger loss than a delay from day 1 to day 2. Detection on any of the first three days has a similar effect; in this case our model estimates that approximately 1400 lives, 50,000 QALYs, and US$18 billion could potentially be saved (Figure 4).
Figure 3

Time-dependent illness counts. Number predicted to have anthrax disease by phase of illness and day of detection following a bioterrorist attack with Bacillus anthracis.

Figure 4

Time-dependent attack outcomes. Potential lives, quality-adjusted life years (QALYs), and costs saved by day of detection following a bioterrorist attack with Bacillus anthracis.

Weighted ROC Curve Analysis - Base case

Using weighted ROC curves that incorporate lives saved, QALYs gained, or costs, the Time-series system is consistently the best-performing method. GLMM 1 is second best and GLMM 7 is consistently the worst performer. The relative performance of the other tests varies by the measure used (Table 2). These areas were calculated using the trapezoidal method, which assumes that the sensitivity of the detection algorithms is continuously modifiable. In some cases the curves required extrapolation to a false positive rate of 1 alarm per day; in these cases the extrapolation point chosen was (1,1).
Table 2

AUC of weighted ROC curves - base case

System (lives-weighted) | AUC | System (QALYs-weighted) | AUC | System (costs-weighted) | AUC | System (timeliness-weighted [4]) | AUC
TS | 0.514 | TS | 0.448 | TS | 0.424 | TS | 0.65
GLMM 1 | 0.447 | GLMM 1 | 0.411 | GLMM 1 | 0.397 | Scan 1 | 0.41
Scan 1 | 0.431 | Scan 1 | 0.36 | GLMM 3 | 0.336 | Scan 3 | 0.378
Scan 3 | 0.413 | GLMM 3 | 0.356 | Scan 1 | 0.333 | Scan 7 | 0.349
GLMM 3 | 0.409 | Scan 3 | 0.34 | Scan 3 | 0.313 | GLMM 3 | 0.276
Scan 7 | 0.377 | Scan 7 | 0.309 | Scan 7 | 0.283 | GLMM 7 | 0.271
GLMM 7 | 0.316 | GLMM 7 | 0.269 | GLMM 7 | 0.251 | GLMM 1 | 0.235

Weighted ROC Curve Analysis - Sensitivity analysis

In the sensitivity analysis that assumes the most optimistic scenario, there is little change in the relative performance of the candidate methods. Time-series remains the best-performing method and GLMM 7 is consistently the worst (Table 3). The relative order of performance does change for the worst case scenario, and Time-series is no longer consistently the best-performing method (Table 4).
Table 3

Sensitivity analysis of AUC of weighted ROC curves - best case scenario

System (lives-weighted) | AUC | System (QALYs-weighted) | AUC | System (costs-weighted) | AUC
TS | 0.540 | TS | 0.499 | TS | 0.533
Scan 1 | 0.470 | GLMM 1 | 0.432 | GLMM 1 | 0.469
GLMM 1 | 0.468 | Scan 1 | 0.406 | Scan 1 | 0.469
Scan 3 | 0.454 | GLMM 3 | 0.385 | Scan 3 | 0.455
GLMM 3 | 0.438 | Scan 3 | 0.384 | GLMM 3 | 0.439
Scan 7 | 0.413 | Scan 7 | 0.348 | Scan 7 | 0.415
GLMM 7 | 0.339 | GLMM 7 | 0.293 | GLMM 7 | 0.340

Table 4

Sensitivity analysis of AUC of weighted ROC curves - worst case scenario

System (lives-weighted) | AUC | System (QALYs-weighted) | AUC | System (costs-weighted) | AUC
TS | 0.467 | GLMM 1 | 0.375 | GLMM 1 | 0.329
GLMM 1 | 0.434 | TS | 0.321 | GLMM 3 | 0.242
Scan 1 | 0.400 | GLMM 3 | 0.306 | TS | 0.212
Scan 3 | 0.391 | Scan 1 | 0.273 | Scan 1 | 0.178
GLMM 3 | 0.385 | Scan 3 | 0.263 | Scan 3 | 0.170
Scan 7 | 0.352 | Scan 7 | 0.240 | GLMM 7 | 0.168
GLMM 7 | 0.300 | GLMM 7 | 0.225 | Scan 7 | 0.155

When the population targeted for prophylaxis is reduced to 40% of the base-case population, the relative ordering of performance remains consistent with that of the base case (Table 5).
Table 5

Sensitivity analysis of AUC of weighted ROC curves - 40% prophylaxis

System (lives-weighted) | AUC | System (QALYs-weighted) | AUC | System (costs-weighted) | AUC
TS | 0.514 | TS | 0.467 | TS | 0.449
GLMM 1 | 0.447 | GLMM 1 | 0.415 | GLMM 1 | 0.403
Scan 1 | 0.431 | Scan 1 | 0.371 | Scan 1 | 0.348
Scan 3 | 0.413 | GLMM 3 | 0.363 | GLMM 3 | 0.345
GLMM 3 | 0.409 | Scan 3 | 0.351 | Scan 3 | 0.327
Scan 7 | 0.377 | Scan 7 | 0.318 | Scan 7 | 0.296
GLMM 7 | 0.316 | GLMM 7 | 0.275 | GLMM 7 | 0.259

When the AUC calculation method is changed from trapezoidal to either rectangular or truncated, there is marked difference in the relative performance of the surveillance systems, as is demonstrated in Tables 6 and 7. While GLMM 7 remains the worst performer, the two systems that consistently had the top performances with the trapezoidal calculation, Time-series and GLMM 1, have relatively poor performance using either the rectangular or the truncated method. The Scan systems appear to have the best performance using these methods, with Scan 3 being the best performer across all analyses. It is also notable that the ordering of performance is somewhat more consistent across the three outcome weightings using either the rectangular or truncated method, as compared to the trapezoidal method.
Table 6

Sensitivity analysis of AUC of weighted ROC curves - rectangular calculation method

System (lives-weighted) | AUC | System (QALYs-weighted) | AUC | System (costs-weighted) | AUC
Scan 3 | 0.364 | Scan 3 | 0.297 | Scan 3 | 0.272
Scan 7 | 0.347 | Scan 7 | 0.282 | Scan 7 | 0.258
Scan 1 | 0.325 | Scan 1 | 0.267 | Scan 1 | 0.245
GLMM 3 | 0.306 | TS | 0.254 | TS | 0.235
TS | 0.303 | GLMM 3 | 0.248 | GLMM 3 | 0.227
GLMM 1 | 0.260 | GLMM 1 | 0.213 | GLMM 1 | 0.195
GLMM 7 | 0.246 | GLMM 7 | 0.199 | GLMM 7 | 0.182

Table 7

Sensitivity analysis of AUC of weighted ROC curves - truncated method

System (lives-weighted) | AUC | System (QALYs-weighted) | AUC | System (costs-weighted) | AUC
Scan 3 | 0.205 | Scan 3 | 0.163 | Scan 3 | 0.148
Scan 1 | 0.201 | Scan 1 | 0.162 | Scan 1 | 0.148
Scan 7 | 0.171 | Scan 7 | 0.133 | Scan 7 | 0.119
GLMM 3 | 0.104 | GLMM 3 | 0.082 | TS | 0.075
TS | 0.097 | TS | 0.081 | GLMM 3 | 0.074
GLMM 1 | 0.065 | GLMM 1 | 0.052 | GLMM 1 | 0.047
GLMM 7 | 0.041 | GLMM 7 | 0.032 | GLMM 7 | 0.029
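The reordering between AUC methods can be made explicit by comparing rank positions. The sketch below uses the lives-weighted orderings as listed in Tables 2, 6 and 7; the helper name is illustrative.

```python
# Lives-weighted orderings, best to worst, as listed in Tables 2, 6 and 7.
TRAPEZOIDAL = ["TS", "GLMM 1", "Scan 1", "Scan 3", "GLMM 3", "Scan 7", "GLMM 7"]
RECTANGULAR = ["Scan 3", "Scan 7", "Scan 1", "GLMM 3", "TS", "GLMM 1", "GLMM 7"]
TRUNCATED = ["Scan 3", "Scan 1", "Scan 7", "GLMM 3", "TS", "GLMM 1", "GLMM 7"]


def rank_shift(base, alternative):
    """Change in rank position for each system (positive means it ranks lower)."""
    return {name: alternative.index(name) - base.index(name) for name in base}


for label, ordering in (("rectangular", RECTANGULAR), ("truncated", TRUNCATED)):
    print(label, rank_shift(TRAPEZOIDAL, ordering))
```

In both comparisons Time-series and GLMM 1 each drop four places, Scan 3 rises three places to first, and GLMM 7 stays last.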

Discussion

Through the use of decision analytic modeling, we were able to translate the number of people affected by a hypothetical bioterrorist attack to relevant outcomes and incorporate these outcomes into the evaluation metric. The outcomes were lives lost, QALYs lost, and costs incurred, costs being the most comprehensive of the three. In the base case, using the trapezoidal method of AUC calculation, the relative order of performance remained fairly consistent across the three outcome weights. However, we found the relative order of performance was sensitive to the model inputs as well as the method of AUC calculation.

Our model predicts that, if undetected until the ninth day, a bioterrorist attack with Bacillus anthracis would have a significant detrimental effect on the health of the population. If a surveillance system were successful in detecting the attack before the ninth day, and measures were immediately taken to deliver treatment to the population, the lives, QALYs and dollars that would be lost could be reduced considerably. Earlier detection results in better outcomes: our model estimates an absolute cost savings of several billion dollars for a detection on the eighth day rather than the ninth. Conversely, a false positive alarm has negative consequences associated with the unnecessary use of prophylactic antibiotics, namely the cost incurred and the adverse effects of the medication. Consequently, a high-performing surveillance system should not only be capable of detecting an attack before the ninth day, but should also detect the attack in as timely a manner as possible and with a low rate of false positives.

Public health authorities must consider both the positive and negative aspects of the programs they choose to implement. In the case of surveillance for bioterrorism attacks, the benefits of early detection must be balanced by the adverse effects of false positive alarms, an aspect of surveillance systems that supports the use of weighted ROC curve analysis in their evaluation. Results comparing alternative surveillance algorithms could be used to select an optimal algorithm depending on the outcome public decision makers choose to optimize. This kind of information, and the cost information assembled in Table 1, can help inform discussions about the value and appropriate role for syndromic surveillance.

When timeliness was incorporated into the evaluation metric by Kleinman et al [4], the Time-series method was the best performer, consistent with our results. The order of relative performance of the other systems, however, was different in the present analysis. Using a different weighting scheme, Kleinman et al also performed an evaluation of the systems that incorporated the number of people affected by the attacks [5], and found an order of relative performance that differs from both their previous analysis and the present analysis. As noted in the results, the impact of a delay in detection varies with day of detection. We have shown that this variation also affects the apparent performance of the surveillance systems, and thus the incorporation of outcomes into the evaluation metric has an important effect on their ranking.

It is reasonable to conclude that the shape of the outbreak also plays a part in the relative order of performance. If the increase in the number affected is greater during the first days following the attack, it follows that a detection system that performs best in those first days would result in fewer losses. However, if the increase in the number of affected people is highest several days after the attack, a detection system with a higher cumulative sensitivity during the preceding days would have the best performance, even if detection is delayed by several days. Regardless of the shape of the outbreak, in all but a linear relationship between number affected and time, the incorporation of outcomes has a significant impact on relative performance. The modified ROC curves described in this paper allow for several dependent variables to be taken into account in one evaluation metric.

The results remained fairly constant in the 'best' and 'worst' case scenario analyses, indicating that our model is robust to variation in model inputs. However, the relative order of performance is heavily dependent on the choice of AUC calculation method. The rectangular and truncated methods produced results quite different from the classic trapezoidal method. The trapezoidal method assumes that the surveillance system threshold can be adjusted in a continuous manner, such that the false positive rate can be set anywhere between zero and one. This may not in fact be a reasonable assumption given the complexity underlying the statistical algorithms used by the surveillance systems. Moreover, due to the need for extrapolation and the resulting shape of the curves (Figure 2), this method gives more weight to the latter portion of the curve where there are fewer data points. The rectangular method assumes that the thresholds are discrete and that the sensitivity of each detection algorithm has a preset maximum. The area is therefore limited by the preset maximum and the relative performance as measured by the AUC is thus affected. The truncated method goes a step further and assumes that there is a false positive rate beyond which the negative consequences are too great to consider using the system, allowing the remainder of the graph to be disregarded. In this case, we arbitrarily chose 0.1 false positives per day as the cutoff. As is demonstrated in the results, the relative ordering changed significantly from that determined in the base case.

Further research into the differences between the methods of area calculation is needed. If these evaluation methods were adopted and used by public health authorities, consideration should be given to the assumptions underlying the method of ROC curve construction, including the shape of the outbreak, the flexibility of the detection algorithms and the threshold for an acceptable false positive rate. For example, if an acceptable false positive rate were defined, this would restrict the portion of the curve to be studied and potentially minimize the variation when alternative calculation methods are used. The relative performances of the surveillance systems within these bounds would be more accurately applicable to a real-world setting.

Furthermore, the interpretation of these weighted ROC curves is limited due to the nature of the data used to construct the axes, a constraint shared by earlier analyses of this type [4]. The false-positive rate on the x-axis is based on only one year of historical data while the sensitivity is calculated from a simulated data set with multiple events that occur an arbitrary number of times. Rather than using the analysis to draw conclusions about the absolute performance of each system, the intention is to compare the area under the weighted ROC curves from the seven statistical algorithms in order to assess their performance relative to each other.

Although the probability and utility estimates were the best available from the literature, data on some model inputs were sparse because few anthrax cases have been reported. For example, the probability of recovery from anthrax disease was based on the only data available, the reported case series of seven individuals treated for prodromal anthrax in the 2001 outbreak [8]. Furthermore, the analysis assumed that all individuals with symptomatic anthrax illness would be treated on the day of detection and that antibiotic prophylaxis would be provided within one day to all persons at risk, an idealized scenario that may not be met in practice.

Conclusion

This study demonstrates the importance of accounting for mortality, morbidity, and costs in the evaluation of syndromic surveillance systems. Incorporating these outcomes into the ROC analysis allows for more accurate identification of the optimal method for signaling a possible bioterrorist attack. Future research should consider the capabilities of current surveillance systems and determine acceptable false positive rates, in order to appropriately calibrate available surveillance systems.

Declarations

Acknowledgements

We would like to acknowledge Matthew Kelly, Davut Savaser, and JiYeon Kim for their contributions to earlier versions of this paper. We would also like to thank Philip S. Brachman for his expert assistance in developing assumptions for the decision analytic model. This work was supported by CDC cooperative agreement UR8/CCU115079, NIH-sponsored (NIGMS) # 1 U01 GM076672, and NIH grant #1 R21 LM008707-01.

Authors’ Affiliations

(1) Harvard School of Public Health
(2) Blue Cross Blue Shield of Massachusetts
(3) Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care
(4) University of Michigan Health System

References

  1. Henning KJ: What is syndromic surveillance? MMWR Morb Mortal Wkly Rep. 2004, 53 (Suppl): 5-11.
  2. Kleinman KP, Abrams A, Mandl K, Platt R: Simulation for assessing statistical methods of biologic terrorism surveillance. MMWR Morb Mortal Wkly Rep. 2005, 54 (Suppl): 101-108.
  3. Buckeridge DL, Burkom H, Campbell M, Hogan WR, Moore AW: Algorithms for rapid outbreak detection: a research synthesis. J Biomed Inform. 2005, 38 (2): 99-113. 10.1016/j.jbi.2004.11.007.
  4. Kleinman KP, Abrams AM: Assessing surveillance using sensitivity, specificity and timeliness. Stat Methods Med Res. 2006, 15 (5): 445-464.
  5. Kleinman K, Abrams A: Assessing the utility of public health surveillance using specificity, sensitivity, and lives saved. 2008
  6. Kleinman KP, Abrams AM, Kulldorff M, Platt R: A model-adjusted space-time scan statistic with an application to syndromic surveillance. Epidemiol Infect. 2005, 133 (3): 409-419. 10.1017/S0950268804003528.
  7. Kleinman K: Generalized Linear Models and Generalized Linear Mixed Models for Small-Area Surveillance. Spatial and Syndromic Surveillance for Public Health. 2005, London: Wiley
  8. Holty JE, Bravata DM, Liu H, Olshen RA, McDonald KM, Owens DK: Systematic review: a century of inhalational anthrax cases from 1900 to 2005. Ann Intern Med. 2006, 144 (4): 270-280.
  9. Update: Investigation of bioterrorism-related anthrax, 2001. MMWR Morb Mortal Wkly Rep. 2001, 50 (45): 1008-1010.
  10. Shepard CW, Soriano-Gabarro M, Zell ER, Hayslett J, Lukacs S, Goldstein S, Factor S, Jones J, Ridzon R, Williams I: Antimicrobial postexposure prophylaxis for anthrax: adverse events and adherence. Emerg Infect Dis. 2002, 8 (10): 1124-1132.
  11. Neumann PJ, Goldie SJ, Weinstein MC: Preference-based measures in economic evaluation in health care. Annu Rev Public Health. 2000, 21: 587-611. 10.1146/annurev.publhealth.21.1.587.
  12. Reissman DB, Whitney EA, Taylor TH, Hayslett JA, Dull PM, Arias I, Ashford DA, Bresnitz EA, Tan C, Rosenstein N: One-year health assessment of adult survivors of Bacillus anthracis infection. JAMA. 2004, 291 (16): 1994-1998. 10.1001/jama.291.16.1994.
  13. Fine AM, Wong JB, Fraser HS, Fleisher GR, Mandl KD: Is it influenza or anthrax? A decision analytic approach to the treatment of patients with influenza-like illnesses. Ann Emerg Med. 2004, 43 (3): 318-328. 10.1016/j.annemergmed.2003.09.007.
  14. Kaplan RM, Anderson JP: A general health policy model: update and applications. Health Serv Res. 1988, 23 (2): 203-235.
  15. Jernigan JA, Stephens DS, Ashford DA, Omenaca C, Topiel MS, Galbraith M, Tapper M, Fisk TL, Zaki S, Popovic T: Bioterrorism-related inhalational anthrax: the first 10 cases reported in the United States. Emerg Infect Dis. 2001, 7 (6): 933-944. 10.3201/eid0706.010604.
  16. DRG Relative Weights - 2006. [http://www.cms.hhs.gov/AcuteInpatientPPS/FFD/itemdetail.asp?filterType=none&filterByDID=-99&sortByDID=2&sortOrder=ascending&itemID=CMS022585&intNumPerPage=10]
  17. 2006 National Physician Fee Schedule Relative Value File. [http://www.cms.hhs.gov/PhysicianFeeSched/pfsrvf/list.asp]
  18. 2007 Outpatient PPS Pricer. [http://www.cms.hhs.gov/PCPricer]
  19. 2006 Drug Topics Red Book. 2006, Thomson, 110
  20. Spencer RC: Bacillus anthracis. J Clin Pathol. 2003, 56 (3): 182-187. 10.1136/jcp.56.3.182.
  21. Gross Domestic Product Deflator Inflation Calculator. [http://cost.jsc.nasa.gov/inflateGDP.html]
  22. Viscusi WK, Aldy JE: The Value of a Statistical Life: A Critical Review of Market Estimates Throughout the World. The Journal of Risk and Uncertainty. 2003, 27 (1): 5-76. 10.1023/A:1025598106257.
  23. Hunink MG, Glasziou P: Summary indices for comparing diagnostic test performance. Decision making in health and medicine. 2005, Cambridge: Cambridge University Press, 190-195. Fourth
  24. Johnson JA, Coons SJ, Ergo A, Szava-Kovats G: Valuation of EuroQOL (EQ-5D) health states in an adult US sample. Pharmacoeconomics. 1998, 13 (4): 421-433. 10.2165/00019053-199813040-00005.
  25. Profile of General Demographic Characteristics. [http://factfinder.census.gov/servlet/QTTable?_bm=y&-geo_id=04000US25&-qr_name=DEC_2000_SF1_U_DP1&-ds_name=DEC_2000_SF1_U]
  26. Arias E: United States life tables, 2002. Natl Vital Stat Rep. 2004, 53 (6): 1-38.

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1472-6947/10/25/prepub

Copyright

© McBrien et al; licensee BioMed Central Ltd. 2010

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
