Time series modeling for syndromic surveillance

Reis, Ben Y; Mandl, Kenneth D

doi:10.1186/1472-6947-3-2

Research article
Open access
Published: 23 January 2003

Time series modeling for syndromic surveillance

Ben Y Reis¹ &
Kenneth D Mandl²

BMC Medical Informatics and Decision Making volume 3, Article number: 2 (2003) Cite this article

25k Accesses
152 Citations
2 Altmetric
Metrics details

Abstract

Background

Emergency department (ED) based syndromic surveillance systems identify abnormally high visit rates that may be an early signal of a bioterrorist attack. For example, an anthrax outbreak might first be detectable as an unusual increase in the number of patients reporting to the ED with respiratory symptoms. Reliably identifying these abnormal visit patterns requires a good understanding of the normal patterns of healthcare usage. Unfortunately, systematic methods for determining the expected number of (ED) visits on a particular day have not yet been well established. We present here a generalized methodology for developing models of expected ED visit rates.

Methods

Using time-series methods, we developed robust models of ED utilization for the purpose of defining expected visit rates. The models were based on nearly a decade of historical data at a major metropolitan academic, tertiary care pediatric emergency department. The historical data were fit using trimmed-mean seasonal models, and additional models were fit with autoregressive integrated moving average (ARIMA) residuals to account for recent trends in the data. The detection capabilities of the model were tested with simulated outbreaks.

Results

Models were built both for overall visits and for respiratory-related visits, classified according to the chief complaint recorded at the beginning of each visit. The mean absolute percentage error of the ARIMA models was 9.37% for overall visits and 27.54% for respiratory visits. A simple detection system based on the ARIMA model of overall visits was able to detect 7-day-long simulated outbreaks of 30 visits per day with 100% sensitivity and 97% specificity. Sensitivity decreased with outbreak size, dropping to 94% for outbreaks of 20 visits per day, and 57% for 10 visits per day, all while maintaining a 97% benchmark specificity.

Conclusions

Time series methods applied to historical ED utilization data are an important tool for syndromic surveillance. Accurate forecasting of emergency department total utilization as well as the rates of particular syndromes is possible. The multiple models in the system account for both long-term and recent trends, and an integrated alarms strategy combining these two perspectives may provide a more complete picture to public health authorities. The systematic methodology described here can be generalized to other healthcare settings to develop automated surveillance systems capable of detecting anomalies in disease patterns and healthcare utilization.

Peer Review reports

Background

The earliest detectable sign of a covert germ warfare attack may be unusual increases in the number of people seeking healthcare. For example, victims of an anthrax attack might present to emergency departments with flu-like or respiratory syndromes [1]. The identification of abnormally high visit rates relies on a good understanding of the normal patterns of healthcare utilization. Once expected rates of utilization are established, alarm thresholds can be set for public health alerts.

Most previous models of healthcare usage have not incorporated real time data and were generally built to improve resource management, rather than to detect diseases [2–4]. Recently, the growing bioterrorist threat has led to the development of real-time syndromic surveillance systems built on top of existing information systems [5–7], with increased research activity in the biomedical engineering [8], public health [9], and general scientific [10] communities.

A number of different approaches have been developed for syndromic surveillance, with systems monitoring over the counter drug sales [11], web-based physician-entered reports [12], consumer health hotline telephone calls [13], and ambulatory care visit records [14]. However, due to the timeliness and data availability considerations, the majority of systems to date have focused on monitoring visit data from emergency department information systems. Several regional syndromic surveillance systems that focus on real time clinical and administrative emergency department datasets have been deployed [15–17].

Two data elements consistently relied upon are the International Classification of Disease (ICD) diagnostic codes and ED chief complaints (CC). ICD codes are universally used in the United States for hospital billing. Many EDs also record information about patients' presenting chief complaints. ICD and CC data often become available in databases during or shortly after a patient's visit, thus providing a basis for real time surveillance. CCs vary in quality because they are recorded prior to physician involvement in care, but on the other hand are timelier because they are elicited and recorded upon arrival to the ED. The ICD billing codes tend to be more accurate, though the delay in their availability ranges from hours to weeks [18, 19]. To enable interoperability between different systems, data interchange standards are being proposed for communicating emergency department data for public health uses including syndromic surveillance [20].

Our goal is to be able to detect a possible anthrax attack by detecting an anomalous surge in emergency department usage and particularly in patients presenting with respiratory and flu-like syndromes. Using CC data, we sought to define the expected daily rates for emergency department total volume as well as for the frequency of visits of patient with flu-like and respiratory illness. These expected rates can be compared with actual rates in attempting to identify anomalies.

In addition to presenting a generalized framework for biosurveillance model development, the paper tests the model with simulated outbreaks, and includes a proposal for handling specific alarm-related issues that arise with ARIMA-based models. Furthermore, the present study focuses on data from a pediatric population, something not prevalent in the prior literature.

Methods

The record review was approved by the Children's Hospital Boston Committee on Clinical Investigation (Protocol Number M-01-03-040). The setting was Children's Hospital Boston, an urban, tertiary care pediatric teaching facility. Subjects were consecutive patients visiting the emergency department from June 2, 1992 through January 4, 2002. Patients with respiratory illness were identified using a previously validated set of chief complaint and ICD codes.

We built models of ED utilization using a time-series analytic approach. Models were constructed through an iterative process and were trained on roughly the first eight years of data (2,775 days). The remaining two full years of data (730 days) were reserved for testing and validation. The model and the statistical methods used in its development are described below. Complete details are also available from the authors. Our approach consisted of three main phases: signal characterization, model development, and model validation.

Signal characterization

As a first step towards understanding the trends and behaviours of ED utilization, we performed a principal Fourier component analysis on the daily visit totals (Figure 1). A Fast Fourier Transform (FFT) was calculated from the time series data, and the results indicated the presence of strong weekly and yearly periodicities in the data. The analysis was performed using the Matlab software package, Release 12, Version 6.0 [21]. Guided by these findings, we calculated the ensemble average profiles for both the weekly and yearly periodicities (Figures 2 and 3). These profiles represent the average number of visits for each day of the week and year respectively.

Model development

Our model development methods followed established time series forecasting procedures. The signal was broken up into its various components: the overall mean, the weekly trend and the yearly trend. Mean Absolute Percentage Error (MAPE) was used to measure of quality of fit. A MAPE of 0% indicated a perfect fit of the model to the training dataset.

The trimmed mean seasonal model was then constructed from the overall mean, the weekly trend, and the yearly trend, as follows:

Expected number of visits = Overall mean.+ Mean for day of week + Trimmed mean for day of year

We started with a simple-mean model that consistently predicted 137 total visits every day, 48 of which were respiratory-related. This mean was subtracted from the data. Next the weekly average profile of the resulting signal was calculated, yielding a model of the average utilization patterns for each day of the week. The weekly profile was also subtracted out from the data. The yearly average profile of the resulting signal was calculated, yielding a model of the average utilization patterns for each day of the year. To increase the robustness of this model and ensure that it was not overly sensitive to outliers in the training set, a trimmed mean function (ignoring a certain percent of the highest and lowest values) was used for calculating the yearly profiles. Testing a range of trimming magnitudes, we found that excluding the highest and lowest 25% of the values yielded the best results as far as improving the quality of fit (low MAPE).

When analyzing the residuals (forecasting errors) of the trimmed-mean seasonal model, a significant degree of autocorrelation was detected. Significant autocorrelations indicate that, for example, a high utilization rate on one day is associated with a high utilization rate on the following day. To model these localized auto-correlative effects, we fit an Auto-Regressive Integrated Moving Average (ARIMA) time-series forecasting model to the residuals (forecasting errors) of the trimmed mean seasonal model. This class of models is often used in economic time series forecasting, and is described in full detail in [22]. The ARIMA model was fit using the SAS Release 8.2 software package Time Series Forecasting System (Cary, NC) [23].

Three main parameters need to be selected when fitting an ARIMA model: the order of autocorrelation (AR), the order of Integration (I), and the order of Moving Average (MA). The higher-order models are more complex and can usually achieve a better fit of the training data set, while the simpler low-order models are usually less likely to over-fit to training dataset.

The first derivative of the data was taken to ascertain whether Integration would improve the model. Since forecasting was not improved, the optimal degree of integration was determined to be zero (I = 0). Different combinations of AR and MA orders were then tested. Of all the models tested, an ARIMA(2,0,1) model (second-order Auto-Regressive combined with first-order Moving Average) was found to work best for overall ED volume. An ARIMA(1,0,1) model (first-order Auto-Regressive combined with first-order Moving Average) was found to work best for respiratory-related ED volume. The SAS toolkit was used to automatically fit the parameters for the models.

Model validation

The models, trained on approximately the first eight years of data – the training set, were validated using the final two years of the dataset – the validation dataset. Again, a MAPE of 0% indicated a perfect fit of the model to the validation dataset.

We set out to quantify the detection performance of a biosurveillance system based on the above ARIMA model of overall ED volume by measuring its ability to detect outbreaks. Since the historical dataset used here was devoid of any known outbreaks and there is a paucity of data available on actual germ warfare attacks [11], we introduced a set of simulated outbreaks into the historical visit data by adding a certain number of simulated visits to specified days.

The simulated outbreaks lasted seven days each. While real outbreaks may last well beyond these durations, we focused on the first few days since useful detection systems should be able to spot outbreaks within that time-frame.

The outbreaks were spaced 15 days apart and there were 233 simulated 7-day outbreaks in total inserted into the data. This ensured that all the effects of any previous outbreak could be reset from the detection system's memory before the onset of the next outbreak. Furthermore, since the only significant periodicities present in the data were seven day (weekly) and 365.25 day (yearly), the 15-day spacing of outbreaks yielded a good unbiased sample of days for infusing outbreaks.

Each complete 3,505-day simulation used outbreaks of only one size. Different sizes of outbreaks were tested in separate simulations. For each simulation, sensitivity was calculated as the ratio of the number of outbreaks detected (true positives) to the total number of outbreaks. Since there is an inherent tradeoff between sensitivity and specificity, a benchmark specificity of 97% was chosen, producing approximately one false alarm per month. A study that evaluates different detection approaches using simulated outbreaks can be found in [24].