EURIS (European Resistance Intervention Study) was launched as a multinational study in September of 2000 to identify the multitude of complex risk factors that contribute to the high carriage rate of drug resistant Streptococcus pneumoniae strains in children attending Day Care Centers in several European countries. Managing the very large volume of data generated required the development of a web-based infrastructure – EURISWEB – that includes a relational online database, coupled with a query system for data retrieval, and allows integrative storage of demographic, clinical and molecular biology data generated in EURIS.
All components of the system were developed using open source programming tools: data storage management was supported by PostgreSQL, and the hypertext preprocessor to generate the web pages was implemented using PHP. The query system is based on a software agent running in the background specifically developed for EURIS.
The website currently contains data related to approximately 13,500 nasopharyngeal samples and over one million measures taken from 5,250 individual children, as well as over one thousand pre-made and user-made queries aggregated into several reports. It is presently in use by participating researchers from three countries (Iceland, Portugal and Sweden).
An operational model centered on a PHP engine builds the interface between the user and the database automatically, allowing an easy maintenance of the system. The query system is also sufficiently adaptable to allow the integration of several advanced data analysis procedures far more demanding than simple queries, eventually including artificial intelligence predictive models.
Social forces that produced Day Care Centers (DCCs) for preschool age children in many developed countries have – ironically – also created in these structures one of the major ecological reservoirs of drug resistant strains of Streptococcus pneumoniae, which spread globally and began to create serious complications in the chemotherapy of diseases caused by this dangerous pathogen [1–3]. Day Care Centers recruit in close physical proximity children of an age group that is characterized by high rate of carriage of S. pneumoniae, an immature immune system and frequent viral and bacterial respiratory tract infections leading to extensive use of antimicrobial agents which provide a powerful selective milieu for the emergence of resistant strains [4–7]. The best evidence that such strains can cause both pediatric and adult disease came from molecular epidemiological studies, which demonstrated that resistant clones of S. pneumoniae most frequently identified in disease [8, 9] were also the ones frequently carried in the nasopharynx of healthy children in DCCs [10–12].
If DCCs are ecological reservoirs of resistant S. pneumoniae, then a reduction in the rate of carriage of such strains in DCCs should also have an impact on the frequency of infections caused by resistant pneumococci. Testing the efficacy of such a novel strategy was the purpose of the multinational initiative EURIS (European Resistance Intervention Study – Reducing Resistance in Respiratory Tract Pathogens in Children), launched by the European Community in September of 2000 and running until 2003. Investigators from four countries (France, Iceland, Portugal and Sweden), supported by scientists from Germany and the USA, joined forces to test the effect of a variety of intervention methods (e.g. reduction in drug prescriptions; changing antibiotic dosing; improving hygienic conditions in DCCs etc.) on the frequency of nasopharyngeal colonization by resistant pneumococci in carefully controlled studies.
The structure of EURIS is composed of four centers where strain collections and interventions are carried out: Portugal – Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa; Iceland – Landspitali University Hospital; Sweden – Swedish Institute for Infectious Diseases Control; France – Institut National de la Santé et de la Recherche Médicale. However, EURISWEB represents data generated in only three of the four collection centers: Portugal, Iceland and Sweden, the three countries in which the timing, the age groups of the children and the methods used were fully harmonized. The French initiative, while addressing the same issues, was not directly comparable, as it involved different age groups, a different mode of sampling, and schools rather than Day Care Centers. Therefore the French data were deposited in a different database. Four additional collaborating units assist as reference centers for the harmonization of methods in clinical microbiology (Iceland: Landspitali University Hospital); molecular epidemiology of antibiotic resistant genes and clones (USA: Laboratory of Microbiology, The Rockefeller University; Germany: University of Kaiserslautern); data management and mathematical modeling of epidemiological aspects of EURIS (Portugal: Instituto de Biologia Experimental e Tecnológica).
The risk factors, i.e. the nature and number of the factors that influence the rate of carriage of drug resistant S. pneumoniae in preschool age children and their quantitative contribution to the degree of colonization, are not well understood. Furthermore, major risk factors for nasopharyngeal colonization may differ significantly from one setting to another [14–16], which makes the analysis of data generated by a multinational study like EURIS more complex. The evaluation and comparison of such massive amounts of surveillance data necessitated the construction of a computerized infrastructure organized in such a manner that it would assure not only data storage and retrieval but also an eventual bioinformatics analysis. The purpose of this communication is to describe such a web-based infrastructure specially designed to fit the purposes of EURIS – the EURISWEB.
Several potential conflicting attributes had to be accommodated in the design of such an infrastructure. On the one hand, it was to provide full integration of data from different countries in a common normalized repository, fully accessible to all EURIS participants. On the other hand, it was also supposed to exhibit the properties of a local database with full separation between the countries involved. Finally, it was anticipated that, eventually, EURISWEB would be made available for wider usage for research and public health management at a later stage, with steep requirements of stability, scalability, security, user-friendly access, low cost portability, and transparent implementation for subsequent independent development. In the design of EURISWEB we took into account the multiple goals of such a web-based infrastructure which now includes a relational online database coupled with data retrieval and analysis tools, where registered users can access data and tools by using a personal login name and password through a standard web browser. Ultimately three of the four centers (Iceland, Portugal and Sweden), in which the nature of the pediatric population and mode of sampling were most comparable, chose to combine all data for deposition in EURISWEB, which now covers a large number of participant institutions: 16 DCCs in Portugal, 30 DCCs in Iceland and 25 DCCs in Sweden; a wide variety of sources of data, including demographic, socio-economic factors, clinical data, patterns and types of drug use and drug prescription; microbiological data on the antibiotypes and serotypes as well as molecular types of the pneumococcal isolates and DNA fingerprints of resistant genes.
A demo version of EURISWEB is available to the general public, accessible with username euris and password welcome. Those who wish to receive the e-mails sent by the query agent (see User-Friendly Query System) should request a personal account from the authors. As any modifications applied to the current implementation of the infrastructure will automatically be reflected in the demo version, new features may already be apparent when compared with the illustrations and examples used in this manuscript.
Data and data acquisition
The diverse surveillance data (demographic, clinical, microbiological etc.) generated in project EURIS are used to fill five different types of Questionnaires which serve as the source of information to be introduced into the EURISWEB database. The relationship between the five questionnaires is described and illustrated later in this report (see Database structure and Database tables versus online forms). Typically each site will update surveillance information at least once per year. Questionnaires 1 and 2 are provided by the staff of each participating DCC.
• Questionnaire 1 contains information regarding physical features of the DCC (address, number of rooms and windows, area inside and outside the facility, number of children and staff, hygiene protocols and practice) – see Figure 1.
• Questionnaires 2 provide the same type of information for each room (also referred to as "unit") in the particular DCC – see Figure 1.
• Questionnaires 3 are filled at least once every year by the parents of the children. They contain demographic information on the household and environment where the child lives, including number and age of siblings, shared bedrooms, and specific conditions such as smoking in the house.
• Questionnaires 4 are filled by the parents just prior to each strain collection. They provide information on antibiotic consumption prior to sampling (type of antibiotic, taken when and for how long). Also provided are data on illness and hospitalizations of the child.
• Questionnaires 5 are filled by the participating microbiology and molecular biology laboratories. They contain characterization of each S. pneumoniae isolate for serotype; antibiotype (susceptibility to oxacillin, chloramphenicol, erythromycin, clindamycin, tetracycline, sulfamethoxazole-trimethoprim, and levofloxacin); MIC values for penicillin and ceftriaxone; molecular type by PFGE (Pulsed-Field Gel Electrophoresis) and MLST (Multilocus Sequence Typing) (for selected isolates); DNA probes for antibiotic resistance factors; and RFLP (Restriction Fragment Length Polymorphism) for pbp (penicillin binding protein) genes of selected penicillin resistant isolates. All data in questionnaire 5 are obtained by common harmonized methods.
Although there was an effort towards the normalization of data acquisition taking place in the different participant countries, the questionnaires delivered to the DCCs and to the parents contain various questions that reflect realities specific to the country involved. Accordingly, some questions only appear in the questionnaires of some countries. Also, the frequency with which updated information is collected differs between countries. Since discarding data was to be avoided at all costs in order not to confront local practices, the normalization process had to be extended to database conception itself. Instead of designing an optimal database structure for each country, an iterative consulting process was followed for nearly a year to produce a normalized database structure that fits the reality presented by all the countries involved. The final structure of the EURISWEB database accommodates both country specificity and common European health management practices.
Besides providing comprehensive data storage, the web-based data management infrastructure must also allow the easy querying and retrieval of the data it contains. Since some of the retrieval requests may generate large amounts of data, or may require intensive computation, the requests are processed as background processes managed by a software agent that sends e-mails to the user with information on the execution state of each request and, finally, a link to the completed report. User-friendliness was the primary concern in building the interface available to make these requests, with the current version reflecting extensive user feedback.
Software and Hardware
All the software used to implement this infrastructure is Open Source and is provided under public license. The scripts that generate the HTML (HyperText Markup Language) interfaces were written in PHP (PHP: Hypertext Preprocessor) 4.x. The Database Management System is PostgreSQL 7.1.x. The scripts for the agents that handle the data retrieval requests were written in shell (bash 2.04) and Perl (5.x). The server is a PC, CPU 2x PIII (coppermine) @ 800 MHz, with 512 MB SDRAM, running Linux (based on Slackware 7.1, kernel 2.2.x) and an SSL (Secure Sockets Layer) enabled Apache Web-Server (Apache/1.3.x Ben-SSL/1.x). Software versions described above were updated regularly throughout the course of the EURIS project (2000–2003) with no negative impact on its performance.
The basic internal structure of the database consists of 9 tables with an average of 14 fields per table. Figure 2 shows a simple model of this structure, where boxes represent tables and lines represent the relations between them. There is only one type of relation in this structure, which is "one-to-many", the "one" side being represented by the single line and the "many" side being represented by the forked line. A one-to-many relation between two tables means that one record from one table can be associated with several records from the other table. For example, one DCC can be associated with several rooms (units) in the same DCC; one unit can be associated with several children; and one child can be associated with several siblings.
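The one-to-many chain DCC, unit, child can be illustrated with a minimal sketch. It is written in Python with the standard sqlite3 module purely so that it runs anywhere; EURISWEB itself uses PostgreSQL and PHP. All table and column names here are hypothetical, since the actual fields of Figure 2 are not listed in the text.

```python
import sqlite3

# Hypothetical schema: each "many" side carries a foreign key to its "one" side.
SCHEMA = """
CREATE TABLE dcc   (dcc_id   INTEGER PRIMARY KEY, code TEXT NOT NULL);
CREATE TABLE unit  (unit_id  INTEGER PRIMARY KEY,
                    dcc_id   INTEGER NOT NULL REFERENCES dcc(dcc_id),
                    code     TEXT NOT NULL);
CREATE TABLE child (child_id INTEGER PRIMARY KEY,
                    unit_id  INTEGER NOT NULL REFERENCES unit(unit_id),
                    code     TEXT NOT NULL);
"""


def build_demo():
    """Create an in-memory database with one DCC, two units and three children."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces references only on request
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO dcc VALUES (1, 'DCC-A')")
    conn.executemany("INSERT INTO unit VALUES (?, ?, ?)",
                     [(1, 1, "U1"), (2, 1, "U2")])
    conn.executemany("INSERT INTO child VALUES (?, ?, ?)",
                     [(1, 1, "C1"), (2, 1, "C2"), (3, 2, "C3")])
    return conn


def children_per_unit(conn, dcc_id):
    """Walk the one-to-many chain: units of one DCC, with their child counts."""
    rows = conn.execute(
        "SELECT u.code, COUNT(c.child_id) "
        "FROM unit u LEFT JOIN child c ON c.unit_id = u.unit_id "
        "WHERE u.dcc_id = ? GROUP BY u.unit_id ORDER BY u.code",
        (dcc_id,),
    ).fetchall()
    return dict(rows)
```

Calling children_per_unit(build_demo(), 1) returns one entry per unit of the DCC. Inserting a child that points to a non-existent unit is rejected by the foreign key, which is the integrity guarantee a one-to-many relation is meant to provide.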
The description illustrated in Figure 2 is country specific. A separate set of tables was defined for each of the three participant countries – Portugal, Iceland and Sweden, all inside the same database, but not formally connected to each other. Although the questionnaires for the different countries have significant differences, as some countries may lack many fields or even whole tables of this structure, the critical feature is that all the common fields can be found in exactly the same location in each country-specific structure. Equally critical, the key fields are obligatorily shared by all countries, a feature that can only be easily achieved if a common host infrastructure is in place, which is the case in EURISWEB. Because the frequency with which updated information is collected differs between countries (see Data and data acquisition, and Database conception), many of the key fields are related to the specification of the sampling periods, playing an important role as temporal normalization features. Once the access restrictions are lowered, the conservation of ontology and structure enables intersection between country-specific structures to produce comprehensive data sets jointly describing epidemiological data, which are valid for all the participating countries. Furthermore, because all countries also share the same data retrieval system (see User-Friendly Query System), queries already built by different countries produce compatible results that can be promptly joined after removing the country-specific fields.
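As an illustration of how shared field names make country-specific results joinable, the following sketch finds the columns common to several country table sets and builds one combined query over them. It is an assumption-laden example in Python and SQLite, not the EURISWEB code: the table names (pt_child, is_child, se_child) and all columns are invented for the demonstration.

```python
import sqlite3


def demo_connection():
    """Three hypothetical country-specific tables sharing a core set of fields."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE pt_child (child_code TEXT, period TEXT, age_months INTEGER, pt_only TEXT);
    CREATE TABLE is_child (child_code TEXT, period TEXT, age_months INTEGER);
    CREATE TABLE se_child (child_code TEXT, period TEXT, age_months INTEGER, se_only INTEGER);
    INSERT INTO pt_child VALUES ('PT-001', '2002A', 30, 'x');
    INSERT INTO is_child VALUES ('IS-001', '2002A', 41);
    INSERT INTO se_child VALUES ('SE-001', '2002A', 28, 1);
    """)
    return conn


def common_columns(conn, tables):
    """Columns present in every table, in the order used by the first table."""
    per_table = []
    for table in tables:
        # Second item of each PRAGMA table_info row is the column name.
        per_table.append([row[1] for row in conn.execute(f"PRAGMA table_info({table})")])
    shared = set(per_table[0]).intersection(*per_table[1:])
    return [col for col in per_table[0] if col in shared]


def combined_query(conn, tables):
    """Build a UNION ALL over the common fields, dropping country-specific ones.

    Table names are interpolated into the SQL text, so they must come from a
    fixed, trusted list and never from user input.
    """
    col_list = ", ".join(common_columns(conn, tables))
    parts = [f"SELECT '{t}' AS source, {col_list} FROM {t}" for t in tables]
    return " UNION ALL ".join(parts)
```

Because the common fields carry the same names in every structure, the combined query needs no per-country mapping; the country-specific columns (pt_only and se_only in this example) simply fall away.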
The interface between the database and the users is made of standard HTML pages (no external applications, "plug-ins", needed on the client side). Data entry is performed through five online forms that mimic the original paper questionnaires, to facilitate the insertion task (see examples of two forms in Figure 1). All data entered in the forms are submitted to online validation procedures before entering the database, thus avoiding some of the most common user errors that may cause integrity or consistency violations in the database. Upon pressing the Insert button to submit data, the user is promptly informed of any mistakes and given a chance to resolve them on the same page (example in Figure 3). Only after passing all the checks are the data effectively inserted in the database and fitted into the respective internal data structure (see Database structure). Searching and visualizing data can be done on a record-by-record basis, using the same five-form format, or by browsing a table that shows several records at the same time (example in Figure 4). Some simple statistics can also be requested online. For convenience, most of the tables presented can be directly viewed or saved in Excel format.
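The validation step can be sketched as follows. This is a minimal Python illustration of the idea of reporting all mistakes at once rather than stopping at the first one; the field names, types and ranges in RULES are hypothetical and do not reproduce the actual EURISWEB checks, which are implemented in PHP.

```python
from datetime import date

# Hypothetical per-field rules for one form.
RULES = {
    "child_code":     {"type": str,  "required": True},
    "age_months":     {"type": int,  "required": True, "min": 0, "max": 84},
    "sampling_date":  {"type": date, "required": True},
    "smoker_at_home": {"type": bool, "required": False},
}


def parse(raw, kind):
    """Convert a submitted string to the declared type or raise ValueError."""
    if kind is int:
        return int(raw)
    if kind is date:
        return date.fromisoformat(raw)  # expects YYYY-MM-DD
    if kind is bool:
        if raw.lower() in ("yes", "no"):
            return raw.lower() == "yes"
        raise ValueError(raw)
    return raw


def validate(form):
    """Return (clean_values, errors); errors maps each bad field to a message."""
    clean, errors = {}, {}
    for name, rule in RULES.items():
        raw = str(form.get(name, "")).strip()
        if not raw:
            if rule["required"]:
                errors[name] = "this field is required"
            continue
        try:
            value = parse(raw, rule["type"])
        except ValueError:
            errors[name] = f"'{raw}' is not a valid {rule['type'].__name__}"
            continue
        low, high = rule.get("min"), rule.get("max")
        if (low is not None and value < low) or (high is not None and value > high):
            errors[name] = f"must be between {low} and {high}"
            continue
        clean[name] = value
    return clean, errors
```

A caller would insert the record only when the returned error dictionary is empty, and otherwise redisplay the form with every message next to its field.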
Data retrieval requests can also be made by filling a simple online form in which the amount of typing required is kept to a minimum (see User-Friendly Query System). The results can be viewed and downloaded in delimited text format, also readily importable into Excel.
Database tables versus online forms
The relationship between the internal database structure and the set of online forms available to the user is not a one-to-one association. Behind each form there can be more than one table, as shown in Figure 5. Although the mimicking of the original questionnaires by the online forms is meant to facilitate the user's adaptation to the data insertion and visualization, that is not the optimal data organization in a relational database. For example, the repeated set of questions about each antibiotic taken prior to sampling (see Data and data acquisition) should not result in a repeated set of fields in the same database table (table QUESTIONNAIRE, see Figure 2). Instead, each set of questions constitutes a row of fields in a different table (table ANTIBIOTICS, same figure).
The operational model of the database interface is depicted in Figure 6, where the arrows represent flow of information between the various entities. The five online forms for record insertion and visualization, available to the user, are all built with the same general procedure (PHP engine). This program, written in PHP, reads files that contain all the information regarding the forms layout (layout files), designs the forms and manages all the interactions between the users and the database. Each layout file describes a form (for all countries) and consists of a few lines written in a subset of the PHP language, which indicate each field's properties, such as whether it is a numeric or Boolean field, a date or time field, and what are the range and type of values allowed. This program and the subset of PHP used to define the layout files are the core of the surveillance system reported here. Accordingly, to alter an existing form, or generate a new one, all the database manager has to do is update or build a layout file.
The layout files also include the description of the connection between the form fields and the actual database fields. This information must be in accordance with the internal database structure, which is managed by SQL (Structured Query Language) code also stored in files (structure files). Therefore, the database manager will need to keep the layout files consistent with any changes in the structure files required by modifications in the online interface. These two simple tasks ensure both the automatic construction of personalized forms – together with online validation check procedures – and a smooth linkage between them and the database internal structure.
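The layout-driven approach can be sketched in a few lines. The example below is a Python stand-in for the PHP engine, under the assumption that a layout is a list of field descriptions each naming its form control and its target database column; the LAYOUT contents, the "table.column" notation and the function names are hypothetical.

```python
from html import escape

# Hypothetical layout: one entry per form field, including the linked database column.
LAYOUT = [
    {"name": "dcc_code", "label": "DCC code", "kind": "text",
     "column": "dcc.code"},
    {"name": "n_rooms", "label": "Number of rooms", "kind": "number",
     "column": "dcc.n_rooms", "min": 1, "max": 50},
    {"name": "has_garden", "label": "Outdoor area", "kind": "boolean",
     "column": "dcc.has_garden"},
]


def render_field(field):
    """Turn one layout entry into an HTML control."""
    name = escape(field["name"])
    if field["kind"] == "boolean":
        control = f'<input type="checkbox" name="{name}">'
    elif field["kind"] == "number":
        bounds = "".join(f' {k}="{field[k]}"' for k in ("min", "max") if k in field)
        control = f'<input type="number" name="{name}"{bounds}>'
    else:
        control = f'<input type="text" name="{name}">'
    return f"<label>{escape(field['label'])} {control}</label>"


def render_form(layout, action):
    """Build the whole form from the layout, ending with the Insert button."""
    body = "\n".join(render_field(f) for f in layout)
    return (f'<form method="post" action="{escape(action)}">\n{body}\n'
            '<input type="submit" value="Insert">\n</form>')


def insert_statement(layout, table):
    """Derive a parameterized INSERT for one table from the column links."""
    prefix = table + "."
    cols = [f["column"][len(prefix):] for f in layout if f["column"].startswith(prefix)]
    marks = ", ".join("?" for _ in cols)
    return f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({marks})"
```

With this arrangement, adding a field to a form amounts to adding one entry to the layout, which is the maintenance property the operational model relies on.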
User-Friendly Query System
Although SQL is the standard way to access data stored in a database, using it requires some prior knowledge and experience from the user. The User-Friendly Query System, available to all the EURISWEB users, is an interface that facilitates query construction in order to make the wide range of possibilities offered by SQL amenable to the untrained user. The users are presented with a series of selection boxes where they can select the fields they want to see, the restrictions they want to apply to the records returned, and how the returned records are to be grouped (Figure 7). The chosen options are then transformed into actual SQL formatted statements that are sent to the query management agent, through the PHP engine, as shown in Figure 8. The arrows in the figure represent flow of information between the various entities (see Figure 9 for the whole operational model).
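The translation from selection boxes to SQL can be sketched as below. This is an illustrative Python function, not the EURISWEB implementation: the field list, the default table name microbiology and the rule that grouped queries report a count are all assumptions made for the example.

```python
# Hypothetical whitelist of fields and operators offered in the selection boxes.
ALLOWED_FIELDS = {"serotype", "period", "penicillin_mic", "unit_code"}
ALLOWED_OPS = {"=", "<>", "<", ">", "<=", ">="}


def build_query(fields, restrictions=(), group_by=(), table="microbiology"):
    """Turn form selections into (sql, params).

    fields       -- columns the user wants to see
    restrictions -- (field, operator, value) triples
    group_by     -- columns to group on; grouped queries also return a count
    """
    names = list(fields) + [r[0] for r in restrictions] + list(group_by)
    unknown = [n for n in names if n not in ALLOWED_FIELDS]
    if unknown:
        raise ValueError(f"unknown field(s): {unknown}")
    if group_by and not set(fields) <= set(group_by):
        raise ValueError("with grouping, every selected field must be a grouping field")

    select = list(fields) + (["COUNT(*) AS n"] if group_by else [])
    sql = f"SELECT {', '.join(select)} FROM {table}"

    clauses, params = [], []
    for name, op, value in restrictions:
        if op not in ALLOWED_OPS:
            raise ValueError(f"unknown operator: {op}")
        clauses.append(f"{name} {op} ?")  # values travel as bound parameters
        params.append(value)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    if group_by:
        sql += " GROUP BY " + ", ".join(group_by)
    return sql, params
```

Only names taken from the whitelist ever reach the SQL text, while user-typed values are passed separately as parameters, so the untrained user can compose queries without being able to alter their structure.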
The query agent manages all the requests and runs them exclusively in the background, so that high usage rates and complex requests do not interfere with the normal usage of the database interface. The agent interacts with the database and informs the users, by e-mail, when their requests start being processed and when they finish, including whether or not the query was successfully answered (the interface gives users enough freedom to request impossible things); when it was not, the results presented are an empty text page. For security reasons, the results of queries are never sent by e-mail; they can only be downloaded from the server via an SSL connection.
Users can rerun, edit, or delete saved queries. They can also group queries into reports, so that a single request will yield all the results from the several queries of that report. Furthermore, users can save restrictions used often, and apply them to other queries. To minimize the time and effort required of the users, we have provided several pre-made queries, already aggregated into several logical reports. This feature may prove particularly useful if standard reporting formats become a regulatory requirement.
The EURIS online database has been adopted as the data storage standard by three of the EURIS participant countries – Portugal, Iceland, and Sweden. Growing steadily since its launch in February 2001, it now has 24 registered users and contains a total of 213 DCC records, 720 unit records, 10991 children records, 13207 questionnaire records, and 13504 microbiology records, totaling more than 25 megabytes of data. The User-Friendly Query System, available since April 2002, now contains 400 pre-made and 786 user-made queries, aggregated into several reports.
Discussion and Conclusions
Privacy and security precautions
In EURISWEB, each user record includes not only a login name and password, but also a country identification, which completely blocks access to data belonging to other countries. In fact, by using different table sets for each country, the central database can behave like several different local databases, and users are never aware that their access is restricted to only a subset of the complete system. In the near future, the fact that all countries use the same normalized structure will allow simple queries and complex data analysis to be performed on the common data, as if we were dealing with a single country.
The user registration process also includes a level access number that defines restrictions for each type of user. Although the system was built anticipating this need, other precautions proved sufficient to monitor and to recover from possible destructive actions. All user inputs are scanned for invalid characters to prevent SQL code injection, and a record of all actions performed in the database is kept, including who did what, and when. Any accidentally deleted record can be promptly restored by the database manager; all the updates a record has undergone since its insertion can be tracked; and many database usage statistics can be easily performed.
Intrusion by unauthorized parties (hackers) is repelled by the need to log on with login name and password, and subsequent identification of the user with cookies protected by SSL, without which no page is ever shown and no query is run. A brute force attack is also limited by a delay introduced in the password checking cycle and by resource consumption at the server. Additionally, page accesses are monitored on a daily basis. Repeated login attempts would therefore be promptly detected before a sufficiently high number of probes take place. Furthermore, a firewall protects the server from being accessed on ports other than the HTTPS port, and the server software (Apache, PHP, kernel, etc.) is promptly updated if any security breach is detected in the current versions.
In all cases, names of children and DCCs are not kept in the database, instead being replaced by codes and acronyms manually assigned prior to insertion.
The operational model described in the Results section is the basis for easy improvements and extensions to the whole infrastructure of EURISWEB. As a consequence of the design described in that section, database management can be fully dealt with by manipulation of the layout and structure files. The core element of this operational functionality is supported by the PHP engine described in the Results section. As a result both maintenance and development scale well with increasing usage, particularly since availability of high performance hardware and Internet access have ceased to be an issue. It is noteworthy that the layout and structure files are particularly suited for extensions to the current model, including having new types of data integrated into the set already stored; having new countries and new country-specificities accommodated, while retaining previous accessibility and privacy. The demo version of the database shell and query system, made publicly available (see Availability for URL and login directions), was built by configuring a fictitious new country structure, which was achieved by performing minor modifications in the layout files.
This illustrates that a possibly useful extension to this system would be to allow selected users to manage their own tables and forms by providing a web-based interface to the layout and structure files. These users with management access permissions would not need to know the PHP or SQL languages, but would simply interact with online forms with the same level of complexity as the regular database query forms. In our experience, most of the tasks requested of the database manager are simple and pose no risk whatsoever to the data, like adding an item to a drop-down selection field, resizing a text edit field, or even adding a new field to an already existing table. Given the wide geographic distribution of the EURISWEB users, describing what needs to be done to the database manager is as time consuming as specifying it in such idealized management forms. The database manager could then be left with only the more complex and "dangerous" tasks. Figure 9 shows how the operational model implemented, including the query system, could be configured to greatly reduce the need for low level data management.
The extension of data management to include data analysis is particularly suited for web-based implementations – such as EURISWEB – since all computation takes place on the server side. This enables a bioinformatics approach to establish itself alongside data storage. As a consequence, advanced data mining tools such as multivariate statistical analysis or the identification of artificial intelligence predictive models using neural networks and rule extraction by genetic algorithms can be made available alongside the data itself. This is mutually beneficial for usage and development: users gain analysis tools next to their data, and bioinformatics tool development has, in return, ready access to extensive datasets for validation as well as, even more important, facilitated interaction with domain experts who provide the context for its interpretation. This bi-directional integration enabled by a web-based development environment offers very favorable conditions for practical implementation of a full-fledged epidemiological information system.
Gray BM, Turner ME, Dillon HC: Epidemiologic studies of Streptococcus pneumoniae in infants. The effects of season and age on pneumococcal acquisition and carriage in the first 24 months of life. Am J Epidemiol. 1982, 116: 692-703.
Yagupsky P, Porat N, Fraser D, Prajgrod F, Merires M, McGee L, Klugman KP, Dagan R: Acquisition, carriage, and transmission of pneumococci with decreased antibiotic susceptibility in young children attending a day care facility in southern Israel. J Infect Dis. 1998, 177: 1003-1012.
Reichler MR, Allphin AA, Breiman RF, Schreiber JR, Arnold JE, McDougal LK, Facklam RR, Boxerbaum B, May D, Walton RO, Jacobs MR: The spread of multiply resistant Streptococcus pneumoniae at a day care center in Ohio. J Infect Dis. 1992, 166: 1346-1353.
McGee L, Klugman KP, Tomasz A: Serotypes and clones of antibiotic-resistant pneumococci. In Streptococcus pneumoniae. Molecular Biology & Mechanisms of Disease. Edited by: Tomasz A. 1996, New York: Mary Ann Liebert, 375-379.
McGee L, McDougal L, Zhou J, Spratt BG, Tenover FC, George R, Hakenbeck R, Hryniewicz W, Lefevre JC, Tomasz A, Klugman KP: Nomenclature of major antimicrobial-resistant clones of Streptococcus pneumoniae defined by the pneumococcal molecular epidemiology network. J Clin Microbiol. 2001, 39: 2565-2571. 10.1128/JCM.39.7.2565-2571.2001.
Sá-Leão R, Tomasz A, Sanches IS, Brito-Avô A, Vilhelmsson SE, Kristinsson KG, de Lencastre H: Carriage of internationally spread clones of Streptococcus pneumoniae with unusual drug resistance patterns in children attending day care centers in Lisbon, Portugal. J Infect Dis. 2000, 182: 1153-1160. 10.1086/315813.
Sá-Leão R, Tomasz A, Sanches IS, Nunes S, Alves CR, Brito-Avô A, Saldanha J, Kristinsson KG, de Lencastre H: Genetic diversity and clonal patterns among antibiotic-susceptible and -resistant Streptococcus pneumoniae colonizing children: day care centers as autonomous epidemiological units. J Clin Microbiol. 2000, 38: 4137-4144.
de Lencastre H, Tomasz A: From ecological reservoir to disease: the nasopharynx, day care centres and drug resistant clones of Streptococcus pneumoniae. J Antimicrob Chemother. 2002, 50 (Suppl C): 75-81. 10.1093/jac/dkf511.
Boken DJ, Chartrand SA, Moland ES, Goering RV: Colonization with penicillin-nonsusceptible Streptococcus pneumoniae in urban and rural child-care centers. Pediatr Infect Dis J. 1996, 15: 667-672. 10.1097/00006454-199608000-00006.
This work was supported by grant QLK2-CT-2000-01020 (EURIS) from the European Commission. JC was supported by grant SFRH/BD/3123/2000, and AM by grant POCTI/1999/BSE/34794 (SAPIENS), both by Fundação para a Ciência e a Tecnologia, Ministério da Ciência e do Ensino Superior, Portugal. We also thankfully acknowledge the EURIS Portuguese team (C. Simas, R. Mato, S. Nunes, N. Sousa, N. Frazão) and the Icelandic Team for early advice that was essential for the EURISWEB prototype design, as well as J. Saldanha (Portugal) for help in designing the questionnaires.
Authors and Affiliations
Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Av. República (EAN), PO Box 127, 2781-901, Oeiras, Portugal
Sara Silva, Rodrigo Gouveia-Oliveira, António Maretzek, João Carriço, Ilda Santos Sanches, Hermínia de Lencastre & Jonas Almeida
Department of Pediatrics and Microbiology, Landspitali University Hospital, Reykjavik, Iceland
Thorolfur Gudnason & Karl G Kristinsson
Swedish Institute for Infectious Diseases Control, Department of Epidemiology, Se-171 82, Solna, Sweden
Centro de Saúde de Oeiras, Av. Salvador Allende, 2780-163, Oeiras, Portugal
Laboratory of Microbiology, The Rockefeller University, 1230 York Avenue, New York, NY, 10021, USA
Alexander Tomasz & Hermínia de Lencastre
Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Monte de Caparica, 2829-516, Caparica, Portugal
Ilda Santos Sanches
Dept Biometry & Epidemiology, Medical Univ South Carolina, 135 Cannon Street, Suite 303, PO Box 250835, Charleston, SC, 29425, USA
SS designed and manages the database, and prepared the manuscript draft. AM designed the query system and does the software and hardware maintenance. SS, RGO, AM and JC implemented the database and query system. TG, KGK, KE, AT, ISS, HL and JSA conceived of the EURIS Project, and participated in its design and coordination. ABA participated in the design of the study in Portugal. All authors read and approved the final manuscript.
Silva, S., Gouveia-Oliveira, R., Maretzek, A. et al. EURISWEB – Web-based epidemiological surveillance of antibiotic-resistant pneumococci in Day Care Centers.
BMC Med Inform Decis Mak 3, 9 (2003). https://doi.org/10.1186/1472-6947-3-9