County and state agencies receive reports of individual gastrointestinal cases as well as infectious disease outbreaks. Title 17 of the California Code of Regulations mandates that health care providers report cases of specified diagnosed diseases, as well as outbreaks of any disease, to local health departments. Health departments may also become aware of outbreaks through follow-up with individual reported cases, citizen complaints, and other means. The definition of an outbreak differs by disease but typically entails a group of related cases for which a common source is identified or suspected; outbreaks may include as few as two cases.
Reports of gastrointestinal disease cases among residents from 2001 through 2007 were requested from each county health department in the drinking water service area. Three adjacent counties transmitted data in electronic format. Each case report included etiology, date of report to the health department, gender, age, city, and county.
Electronic records of outbreak data for all three participating counties were provided by the California Department of Public Health, which receives outbreak reports following county and state health department outbreak investigations. These data were combined and reconciled with electronic records and with records manually extracted from paper files at two of the participating county health departments. For each outbreak, information was provided on etiology, number of cases, date of symptom onset for the first and last cases, affected counties, and whether the outbreak occurred in an institutional setting such as a nursing home. Outbreaks of diseases listed as reportable in Title 17 and outbreaks of diseases not listed as reportable were both included. Individual cases reportable under Title 17 that were associated with an outbreak may also appear in the diarrhea case dataset; however, sufficient information was not available to link the outbreak and case datasets. The Committee on Human Research at the University of California, San Francisco approved the study protocol.
Over-the-counter drug sales records were purchased from the NRDM. Records for 2005-2007 were provided as an electronic file; records for 2003-2004 were downloaded through the NRDM web interface. NRDM over-the-counter drug sales records are divided into 18 categories based on common use, form, and whether the product is intended for adult or pediatric populations. The NRDM drug categories are: diarrhea remedies, anti-fever adult, anti-fever pediatric, bronchial remedies, baby/child electrolytes, chest rubs, cold relief adult liquid, cold relief adult tablet, cold relief pediatric liquid, cold relief pediatric tablet, cough syrup adult liquid, cough adult tablet, cough syrup pediatric liquid, cough/cold, hydrocortisones, nasal product internal, throat lozenges, and thermometers. Sales are counted as the number of units sold regardless of package size. Daily totals are available by category both for all units sold and for units sold excluding those for which discounts or other promotions were offered during the reporting period. NRDM provides information on the number of stores enrolled and reporting; from 2005 through 2007, approximately 47% of the stores enrolled to report anti-diarrhea drug sales actually reported (stores enrolled per week: 1389-1706; stores reporting: 592-836).
Our analysis variable was the proportion of non-promotional sales accounted for by diarrhea remedies, that is, non-promotional diarrhea remedy sales divided by non-promotional sales for all categories combined (Diarrheal Remedy Sales). Diarrheal remedies are products taken for the relief of diarrhea and include bismuth, attapulgite, subsalicylate, and loperamide hydrochloride products. Sales records of diarrheal remedies were available for the entire study area from July 2003 through 2007. Proportional sales were used instead of counts to control for unknown confounders such as changes in store hours.
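The calculation of the analysis variable can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the dictionary layout, and the unit counts are hypothetical, and only the category labels come from the NRDM list above.

```python
def diarrheal_remedy_share(non_promo_units):
    """Share of non-promotional unit sales that are diarrhea remedies.

    non_promo_units: dict mapping category name -> non-promotional units sold.
    Returns None when no units were reported (proportion undefined).
    """
    total = sum(non_promo_units.values())
    if total == 0:
        return None
    return non_promo_units.get("diarrhea remedies", 0) / total

# Hypothetical unit counts for one reporting period (not study data).
period = {
    "diarrhea remedies": 120,
    "cough/cold": 480,
    "thermometers": 200,
    "throat lozenges": 400,
}
share = diarrheal_remedy_share(period)

# If shorter store hours cut sales in every category by half,
# the counts change but the proportion does not.
halved = {category: units / 2 for category, units in period.items()}
share_halved = diarrheal_remedy_share(halved)
print(share, share_halved)
```

The second calculation illustrates the stated rationale for using a proportion: a factor that scales all categories equally cancels out of the ratio.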
Diarrheal Remedy Sales, gastrointestinal case data, and outbreak data were aggregated by week for analysis. Diarrheal Remedy Sales were aggregated by week of sale, cases by week of report to the health department, and outbreaks by week of onset of the first outbreak-associated case. Data were divided into three parts for model building, model validation, and forecasting.
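A minimal sketch of the weekly aggregation step is given below. The week convention is an assumption (weeks starting on Monday), since the text does not specify one, and the function names and dates are hypothetical. The same helper would be applied to the sale date, the report date, or the first onset date, depending on the dataset.

```python
from collections import Counter
from datetime import date, timedelta

def week_start(d):
    """Monday of the week containing d (assumed week convention)."""
    return d - timedelta(days=d.weekday())

def weekly_counts(dates):
    """Count events per week, keyed by the Monday that starts the week."""
    return Counter(week_start(d) for d in dates)

# Hypothetical report dates for individual cases (not study data).
report_dates = [
    date(2005, 1, 3),   # Monday
    date(2005, 1, 5),   # Wednesday, same week
    date(2005, 1, 9),   # Sunday, same week
    date(2005, 1, 10),  # Monday of the following week
]
counts = weekly_counts(report_dates)
for start in sorted(counts):
    print(start.isoformat(), counts[start])
```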
We used methods developed by Box and Jenkins to build autoregressive integrated moving average (ARIMA) models. Estimates of model parameters were obtained through the method of least squares. All analyses were performed using SAS version 9.1 (SAS Institute Inc., Cary, NC, USA). Using PROC ARIMA, following either pre-whitening or double pre-whitening, Diarrheal Remedy Sales were cross-correlated with the number of diarrhea cases in the same week and with weekly case counts lagged 1 to 19 weeks before and after.
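The idea behind pre-whitening followed by cross-correlation can be illustrated with a simplified sketch. It is not a reproduction of the SAS analysis: a first-order autoregressive filter fitted by least squares stands in for whatever ARIMA model was actually selected, the data are synthetic, and all function and variable names are hypothetical. In single pre-whitening, the filter fitted to the input series is applied to both series; in double pre-whitening, each series would instead be filtered by its own model.

```python
import random

def mean(v):
    return sum(v) / len(v)

def fit_ar1(series):
    """Least-squares estimate of phi in z[t] = phi * z[t-1] + e[t] (z = centered series)."""
    m = mean(series)
    z = [v - m for v in series]
    num = sum(z[t] * z[t - 1] for t in range(1, len(z)))
    den = sum(z[t - 1] ** 2 for t in range(1, len(z)))
    return num / den

def ar1_filter(series, phi):
    """Residuals after removing AR(1) structure with coefficient phi."""
    m = mean(series)
    z = [v - m for v in series]
    return [z[t] - phi * z[t - 1] for t in range(1, len(z))]

def cross_corr(a, b, lag):
    """Sample correlation between a[t] and b[t + lag] for equal-length series."""
    n = len(a)
    ma, mb = mean(a), mean(b)
    sa = sum((v - ma) ** 2 for v in a) ** 0.5
    sb = sum((v - mb) ** 2 for v in b) ** 0.5
    cov = sum((a[t] - ma) * (b[t + lag] - mb) for t in range(n) if 0 <= t + lag < n)
    return cov / (sa * sb)

# Synthetic weekly data (not study data): y follows x with a 2-week delay.
random.seed(0)
n = 200
x = [0.0]
for _ in range(n - 1):
    x.append(0.6 * x[-1] + random.gauss(0, 1))
y = [0.0, 0.0] + [0.8 * x[t - 2] + random.gauss(0, 0.2) for t in range(2, n)]

phi = fit_ar1(x)               # model fitted to the input series
x_white = ar1_filter(x, phi)   # pre-whitened input
y_filt = ar1_filter(y, phi)    # output passed through the same filter

lags = range(-19, 20)          # 19 weeks before and after, as in the text
ccf = {k: cross_corr(x_white, y_filt, k) for k in lags}
best_lag = max(ccf, key=lambda k: abs(ccf[k]))
print(best_lag, round(ccf[best_lag], 2))
```

Filtering first matters because two autocorrelated series can show large cross-correlations at many lags even when they are unrelated; after pre-whitening, a pronounced spike at a single lag is easier to interpret as a lead or lag relationship.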
The relationship between Diarrheal Remedy Sales and gastrointestinal outbreaks was examined graphically and through regression. Because a 2006 report by Edge and colleagues suggested that over-the-counter drug sales are sensitive to viral infection, specifically norovirus, Diarrheal Remedy Sales were compared to outbreaks of all etiologies combined and to outbreaks of norovirus alone. Furthermore, because institutionalized populations, such as nursing home residents, may not purchase drugs from over-the-counter drug vendors in the same way as the non-institutionalized population, analyses were repeated excluding outbreaks that occurred in an institutional setting. Residuals from the univariate Diarrheal Remedy Sales model were regressed on the number of outbreaks per week and on the number of outbreak-associated cases per week.
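The residual regression can be sketched as an ordinary least-squares fit with a single predictor. This is an illustration under assumptions: the text does not state the exact regression specification, and the function name and numbers below are hypothetical.

```python
def simple_ols(x, y):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical weekly values (not study data).
outbreaks_per_week = [0, 1, 2, 3]
model_residuals = [0.1, 0.4, 1.1, 1.4]

slope, intercept = simple_ols(outbreaks_per_week, model_residuals)
print(round(slope, 2), round(intercept, 2))
```

A positive slope would indicate that weeks with more outbreaks tend to have sales above what the univariate model predicts; the same fit can be repeated with outbreak-associated cases per week as the predictor.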
The univariate Diarrheal Remedy Sales ARIMA model was used to auto-forecast sales for 105 weeks with weekly model updating (one-week-ahead forecasting). A signal was generated when an actual observation exceeded the upper 95% confidence limit of its forecast. An outbreak week was any week during which one or more outbreaks were ongoing, that is, the outbreak started in or before that week and ended in or after that week. Model sensitivity was calculated as the number of outbreak weeks with a signal divided by the total number of outbreak weeks. Specificity was calculated as the number of weeks with neither a signal nor an outbreak divided by the total number of weeks without an outbreak. Calculations were done with all outbreaks and repeated in subsets restricted to larger outbreaks (50 or more cases, or 100 or more cases). To evaluate whether model-derived alerts identified outbreak weeks more reliably than randomly chosen alerts, sensitivity and specificity calculations were repeated for three sets of randomly chosen dates.
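The signal, outbreak-week, sensitivity, and specificity definitions above can be sketched as follows. This is an illustrative implementation under assumptions: weeks are represented as 0-based indices, outbreaks as inclusive (first week, last week) pairs, and the upper confidence limits are taken as given rather than produced by an ARIMA forecast. All names and numbers are hypothetical.

```python
def outbreak_weeks(outbreaks, n_weeks):
    """Weeks overlapped by at least one outbreak.

    outbreaks: list of (first_week, last_week) pairs, inclusive, 0-based.
    """
    flagged = set()
    for first, last in outbreaks:
        flagged.update(range(max(first, 0), min(last, n_weeks - 1) + 1))
    return flagged

def signal_weeks(observed, upper_limits):
    """Weeks in which the observation exceeds the upper 95% forecast limit."""
    return {w for w, (obs, ub) in enumerate(zip(observed, upper_limits)) if obs > ub}

def sensitivity_specificity(signals, outbreak_wks, n_weeks):
    """Sensitivity and specificity as defined in the text (None if undefined)."""
    non_outbreak = set(range(n_weeks)) - outbreak_wks
    true_pos = len(signals & outbreak_wks)    # outbreak weeks with a signal
    true_neg = len(non_outbreak - signals)    # no signal and no outbreak
    sens = true_pos / len(outbreak_wks) if outbreak_wks else None
    spec = true_neg / len(non_outbreak) if non_outbreak else None
    return sens, spec

# Hypothetical 10-week example (not study data).
n_weeks = 10
observed = [10, 11, 12, 20, 13, 19, 18, 11, 12, 10]
upper = [15] * n_weeks
outbreaks = [(2, 4), (4, 5), (8, 8)]

ob_weeks = outbreak_weeks(outbreaks, n_weeks)   # weeks 2, 3, 4, 5, 8
signals = signal_weeks(observed, upper)         # weeks 3, 5, 6
sens, spec = sensitivity_specificity(signals, ob_weeks, n_weeks)
print(sens, spec)
```

The comparison against random alerts could reuse `sensitivity_specificity` with a randomly drawn set of weeks of the same size in place of `signals`.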