BMC Medical Informatics and Decision Making

Background: The integration of Information Systems (IS) is essential to support shared care and to provide consistent care to individuals – patient-centred care. This paper identifies, appraises and summarises studies examining different approaches to integrate patient data from heterogeneous IS.


Background
This review appraises studies examining the different approaches to integrating patient data from heterogeneous IS. Special attention is given to the type of integration engine and the type of integrated data. Articles published in the English literature between 1995 and 2005 with abstracts available were reviewed. We aimed to specifically review the integration of patient data, and how systems are evolving in practice to meet patient, professional and organisational needs.
A patient record is a set of documents containing clinical and administrative information regarding one particular patient, supporting communication and decision making in daily practice, and having different users and purposes [1]. Clinical care increasingly requires healthcare professionals to access patient record information that may be distributed across multiple sites, held in a variety of paper and electronic formats, and represented as mixtures of narrative, structured, coded and multimedia entries [2]. In hospitals, information technologies tend to combine different modules or subsystems, resulting in a best-of-breed approach [3]. Integration of healthcare Information Systems (IS) is essential to support shared care in hospitals, to provide proper care to mobile individuals and to make regional healthcare systems more efficient. However, to integrate clinical IS in a way that will improve communication and data use for healthcare delivery, research and management, many different issues must be addressed [4][5][6]. Consistently combining data from heterogeneous sources takes a great deal of effort because the individual feeder systems usually differ in several aspects, such as functionality, presentation, terminology, data representation and semantics [3]. Making electronic health records interoperable is still a challenge because good solutions for preserving clinical meaning across heterogeneous systems remain to be explored [2]. Over the years different solutions to these problems have been proposed and some have been applied. Many of these solutions coexist in today's healthcare settings and are influenced by technology innovation and changes in healthcare delivery. Some of these solutions use differing standards and data architectures that may prove to be the greatest obstacle to semantic interoperability [7].

Eligible studies
Only studies describing or evaluating IS implementation for integrating patient data from heterogeneous IS were selected.

Review team
The review team was composed of three computer scientists (Ana Margarida Ferreira, Pedro Vieira Marques and Ricardo Cruz Correia) and one medical doctor (Filipa Canário Almeida), advised by health informaticians experienced in systematic reviewing (Jeremy Crispin Wyatt and Altamiro Costa Pereira).

Search methods
The bibliographic databases were searched between September and October 2005. Since there is no specific standardised MeSH term, we developed a search string that includes the concepts of patient record, computers and data integration or sharing. Only articles with an abstract in English were considered. Given the significant evolution in ICT in the last decade, only studies published after 1994 (the last ten years) were included.
Three distinct bibliographic databases were searched: Medline (via Pubmed), ISI (ISI Web of Knowledge) and IEEE (IEEE Xplore). The query search string used in each database was ((medical or clinical or patient) and record*) and (comput* or digital or electronic*) and (integrat* or link* or sharing or share or shared).
This search method found 2443 articles in Pubmed, 961 in ISI and 414 in IEEE Xplore, a total of 3818 articles. After eliminating duplicate articles 3124 were selected.
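The boolean query string above can be read as four AND-ed groups of OR-ed terms, with `*` acting as a truncation wildcard. The following is a minimal sketch of that logic applied to free text, not a reproduction of how Pubmed, ISI or IEEE Xplore evaluate queries. The names `QUERY_GROUPS`, `matches_query` and `deduplicate`, and the title-normalisation rule used to detect duplicates, are assumptions made for illustration only.

```python
import re

# Each inner list is one OR-group of the query; the groups are AND-ed together.
# A trailing * in the original query is a wildcard, written here as \w*.
QUERY_GROUPS = [
    [r"medical", r"clinical", r"patient"],
    [r"record\w*"],
    [r"comput\w*", r"digital", r"electronic\w*"],
    [r"integrat\w*", r"link\w*", r"sharing", r"share", r"shared"],
]


def matches_query(text: str) -> bool:
    """Return True when every AND-ed group has at least one matching term."""
    return all(
        any(re.search(rf"\b{term}\b", text, flags=re.IGNORECASE) for term in group)
        for group in QUERY_GROUPS
    )


def deduplicate(titles):
    """Keep the first copy of each title, ignoring case and extra whitespace.

    This normalisation is a hypothetical rule; the review does not state
    how duplicates across the three databases were detected.
    """
    seen, unique = set(), []
    for title in titles:
        key = " ".join(title.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(title)
    return unique


if __name__ == "__main__":
    print(matches_query("Integrating electronic patient records across hospitals"))
    print(deduplicate(["A shared record", "a  shared record", "Another title"]))
```

A real search would run against the fields each database indexes (title, abstract, keywords), so hit counts from a sketch like this would not be expected to match the figures reported here.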

Selection of studies for the review
All four reviewers from the review team were involved in study selection. Six combinations of reviewer pairs were defined, due to the large number of articles found. The first selection was based on the study title. Each pair of reviewers read 512 titles. A study was considered eligible when at least one of the reviewers considered that the title mentioned one of three key concepts:
- Patient records (e.g. patient record, EPR, EHR, EMR, clinical documents - CDA, administrative database)
- Integration (e.g. IS integration, record linkage, information sharing)
- Distributed environment (e.g. e-Health, distributed healthcare, shared healthcare)
A total of 923 of 3124 articles were selected in this first selection on title alone.
The second phase of the study selection was based on abstracts. Again, six combinations of reviewer pairs were defined. Each pair of reviewers read 154 abstracts. The inclusion criterion in this phase was that articles should fulfil all three of the following conditions:
- Describe or assess IS implementations
- Integrate patient data from various IS
- Describe the technology used to integrate
To maximize specificity, only selection by both reviewers was considered adequate. In cases of disagreement a third reviewer was called to decide. A total of 84 out of 923 articles were selected to be read entirely. These 84 articles were grouped into 69 distinct integration projects to avoid the distortion created by multiple papers describing the same project. All statistical analysis is based on projects and not on articles. Some of the articles (n = 13) were descriptions of project plans or architecture models that had not yet been implemented in a real scenario, not even as a prototype. These projects were also excluded, leaving only 56 projects. Figure 1 is a flowchart illustrating the different stages of paper selection.
Figure 2 illustrates the stages of a generic integration of heterogeneous IS. The variables examined in this review are related to these stages and are intended to describe the context where the integration takes place (country, date, area covered, institutions involved, type of final users), the type of data integrated and the technology used (standards, communication methods, integration model, repositories of data, client applications).
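The two decision rules used during study selection (inclusive on titles, strict on abstracts with a third reviewer resolving disagreements) can be sketched as below. The function names and the boolean encoding of each reviewer's vote are assumptions for illustration; the counts in `FUNNEL` are the article counts reported in the text.

```python
from typing import Optional


def title_phase(vote_a: bool, vote_b: bool) -> bool:
    """First phase: a title is kept when at least one reviewer selects it."""
    return vote_a or vote_b


def abstract_phase(vote_a: bool, vote_b: bool, third_vote: Optional[bool] = None) -> bool:
    """Second phase: both reviewers must agree; a third reviewer breaks ties."""
    if vote_a == vote_b:
        return vote_a
    if third_vote is None:
        raise ValueError("reviewers disagree: a third reviewer's vote is required")
    return third_vote


# Article counts reported at each stage of the selection.
FUNNEL = [
    ("retrieved from the three databases", 3818),
    ("after eliminating duplicates", 3124),
    ("selected on title", 923),
    ("selected on abstract", 84),
]

if __name__ == "__main__":
    for (_, before), (stage, after) in zip(FUNNEL, FUNNEL[1:]):
        print(f"{stage}: {after} of {before} kept ({after / before:.1%})")
```

The inclusive rule on titles favours sensitivity, while requiring agreement on abstracts favours specificity, which matches the stated aim of the second phase.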

Underlying model and definition of variables
The variables are:
- Country where the system is implemented;
- Date of article publication;
- Area covered by each project (country, region, hospital, department);
- Institutions involved as sources for patient data integration, i.e., institutions that own feeder systems to integration (departments, hospitals, primary care, private clinics, private labs, patient health portals); multiple values are accepted;
- What type of medical data is integrated (lab orders, lab results, prescription orders, diagnosis or problems, procedures, admission letters, discharge letter, transfers letters, referral letters, medical images, biosignals); multiple values are accepted;

Figure 1: Diagram showing the methods used for study selection.
available to talk with the central repository; semantic, i.e. when all possible data has a predefined message template, so that both semantics and syntax are known; generic, i.e. when the document structure accepts a certain degree of evolution without redefining the whole template), adapted from Bernstein et al.

Statistical analysis
The statistical analysis was performed with SPSS® version 14. P values in Table 1 were calculated using Pearson and linear-by-linear association chi-square tests with a significance level of 0.05.
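For readers unfamiliar with the two tests named above, the following is a minimal sketch of how each statistic is computed from a contingency table. It is not the SPSS procedure used in the review. The function names are assumptions, the P value helper covers only 1 and 2 degrees of freedom (where the chi-square tail has a closed form), and the counts in the usage lines are illustrative and not data taken from the review.

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def linear_by_linear(table, row_scores, col_scores):
    """Linear-by-linear association statistic M^2 = (n - 1) * r^2 (1 degree of freedom).

    r is the Pearson correlation between row and column scores,
    weighted by the cell counts.
    """
    n = sum(sum(row) for row in table)
    mean_r = sum(row_scores[i] * sum(row) for i, row in enumerate(table)) / n
    mean_c = sum(col_scores[j] * sum(col) for j, col in enumerate(zip(*table))) / n
    cov = var_r = var_c = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            dr, dc = row_scores[i] - mean_r, col_scores[j] - mean_c
            cov += count * dr * dc
            var_r += count * dr * dr
            var_c += count * dc * dc
    r = cov / math.sqrt(var_r * var_c)
    return (n - 1) * r * r


def chi_square_p_value(stat, dof):
    """Upper-tail P value of a chi-square statistic, for dof 1 or 2 only."""
    if dof == 1:
        return math.erfc(math.sqrt(stat / 2))
    if dof == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")


if __name__ == "__main__":
    # Illustrative counts only: rows = (hospital-level, other), columns = three periods.
    example = [[4, 7, 5], [3, 13, 24]]
    stat, dof = pearson_chi_square(example)
    print(stat, dof, chi_square_p_value(stat, dof))
    trend = linear_by_linear(example, [0, 1], [0, 1, 2])
    print(trend, chi_square_p_value(trend, 1))
```

The Pearson test ignores the ordering of the time periods, whereas the linear-by-linear test uses the column scores and is therefore the natural choice for detecting a trend over time.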

Study selection
The agreement rate for the first phase was 83%, and for the second phase was 77%. The number of different IS implemented was 56.
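The text does not define how the agreement rate was calculated. A minimal sketch, assuming it is the simple proportion of items on which both reviewers of a pair made the same include/exclude decision, is given below; the function name and the example votes are hypothetical.

```python
def agreement_rate(decisions_a, decisions_b):
    """Proportion of items on which two reviewers made the same decision."""
    if len(decisions_a) != len(decisions_b) or not decisions_a:
        raise ValueError("both reviewers must rate the same non-empty set of items")
    agreed = sum(a == b for a, b in zip(decisions_a, decisions_b))
    return agreed / len(decisions_a)


if __name__ == "__main__":
    # Hypothetical votes (True = include) from two reviewers on four titles.
    print(agreement_rate([True, False, True, True], [True, True, True, False]))
```

Simple agreement does not correct for agreement expected by chance; a chance-corrected statistic such as Cohen's kappa would be a stricter alternative.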

Trends
Area covered by integration
59% of the IS covered only a region, while 29% covered a hospital, 9% a department and 4% a whole country. There was a downward trend in publications related to projects that cover a hospital, from 57% until 1999 to 35% in 2000-02 and 17% in 2003-05. The number of projects covering a region or country has increased over the years, and currently represents 76% (p = 0.037).

Institutions involved in the integration
Most of the integrated information comes from hospital IS (69%), with departmental (40%) and primary care (33%) IS representing the next two most frequent institution types. Four projects (8%) integrated information from health portals; all were published in the most recent period considered (2003-05).

User groups
As expected, all information systems provided access to health professionals. Two recent projects claim to give patients access to their data [9,10]. Medical doctors are more often referenced as users (48%) than nurses (10%).

Integrated data
77% of the projects integrated diagnosis and problems, 67% medical images, 65% lab results, 63% discharge notes and 60% procedures. There has been an increase in projects integrating referral letters (from 0% until 1999, to 18% in 2000-02 and to 25% in 2003-05).

Type of models
Regarding the type of integration model, although the proportions of projects using predefined message templates (semantic - all data structured) and middleware are very similar (44% and 40% respectively), there seems to be a trend towards more predefined message templates (46% in 2003-05) and fewer middleware solutions (31% in 2003-05). This tendency is clearer if the values of the projects using messaging (both "Semantic - all data structured" and "Generic - structure and data dynamic") are added, representing 54% in 2003-05. Direct communication to databases is rarely used (10%) and more flexible messaging is now appearing (12% in 2003-05).

Messaging standards
HL7 is the most frequently used messaging standard (68%). It seems that CDA is becoming the reference to use inside HL7 (25% in 2003-05). DICOM is becoming less used when compared to other standards, which is understandable as it is mainly for images. Nevertheless, DICOM is no longer the only successful example of standards use in medical communication protocols. Other standards have very low usage nowadays (19% in 2003-05).

Repository
Regarding the type of data storage, 77% of the projects stored data in databases, 25% used virtual repositories and 16% stored in files. There is no real change over the periods considered.

Notes to Table 1:
†: single variable with mutually exclusive response categories
‡: multiple variables with dichotomous response categories (yes or no)
*: not statistically significant
§: linear-by-linear association chi-square test used (except in variable type of model)
ᐍ: models used for semantic interoperability: direct communication when the systems create different interfaces to connect; middleware when an API is used to talk with the central repository; semantic when all data has a predefined message template; generic when the document structure accepts evolution without re-defining the whole template. P value calculated using Pearson association chi-square test.

Current status (results regarding 2003-05)
Currently there are more projects carrying out regional integration, especially between hospitals and primary care. Referral letters are mentioned in 7 of the 29 projects described in articles published in 2003-05. It is also clear that patients are becoming active participants, because they appear for the first time as a user group in more recent projects.
Regarding integration models, messaging between systems, both Semantic and Generic, has lately been used more frequently (58%) than middleware (31%). Databases are still the most common method for data storage (86%). Communication between integrated systems uses many different technologies, with Web services being used in 41% of the projects. The most common user interface by far is the Web browser (90%).

Discussion
Our results show an increasing number of publications describing projects which integrate data from multiple Information Systems. This is in agreement with our initial assumption about the interest in improving the communication of health-related data to support person-centred healthcare. As the number of heterogeneous health IS grows, their integration becomes a priority. Moreover, we may be witnessing an increasing interest in regional integration between heterogeneous healthcare information systems across different institutions, to help communication between the different stakeholders (primary and secondary care doctors, nurses and patients). This is also supported by the increasing communication of referral letters.
Also noteworthy are the efforts being put into integration in countries like Germany, Greece and Denmark, which are trying to implement nationwide integrated healthcare networks fed by heterogeneous information systems.
Messaging technologies (in particular HL7) are used more often than middleware solutions (like DCOM or CORBA). Web-based technologies (Web services and Web browsers) support most of the projects, indicating that these new technologies are quickly adopted in healthcare institutions. Nevertheless, it is obvious that many distinct technological solutions coexist to integrate patient data.
The concept of message passing appears to be radically different from the conventional concept of procedure calls or operation invocation, but the difference is more one of pedagogical emphasis than of semantics. Message passing emphasizes the remoteness of the object and the caller's lack of knowledge of the code body which will be executed. However, any procedure call can be viewed as an exchange of messages [11]. The main difference between the two approaches is that Web services (messaging) rely on open Internet standards like HTTP, XML, SOAP, WSDL, UDDI and WSFL, whereas DCOM and CORBA solutions (middleware) often resulted in single-vendor implementation requirements.
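The point that any procedure call can be viewed as an exchange of messages can be illustrated with a small sketch. Everything here is hypothetical: the operation `get_lab_results`, the in-memory `LAB_RESULTS` store and the JSON message layout are assumptions chosen for brevity, and do not correspond to HL7, SOAP or any system covered by the review.

```python
import json

# Hypothetical feeder-system data, keyed by an invented patient identifier.
LAB_RESULTS = {
    "patient-001": [{"test": "haemoglobin", "value": 13.5, "unit": "g/dL"}],
}


def get_lab_results(patient_id):
    """Conventional procedure call: the caller invokes the code body directly."""
    return LAB_RESULTS.get(patient_id, [])


# Operations the receiving system is willing to execute on request.
OPERATIONS = {"get_lab_results": get_lab_results}


def handle_message(request_text: str) -> str:
    """Receiver side: decode the request, run the named operation, encode a reply."""
    request = json.loads(request_text)
    operation = OPERATIONS[request["operation"]]
    result = operation(**request["arguments"])
    return json.dumps({"result": result})


def call_by_message(operation: str, **arguments):
    """Caller side: the same call expressed as a request/response message pair."""
    request_text = json.dumps({"operation": operation, "arguments": arguments})
    response_text = handle_message(request_text)  # would cross a network in practice
    return json.loads(response_text)["result"]


if __name__ == "__main__":
    print(get_lab_results("patient-001"))
    print(call_by_message("get_lab_results", patient_id="patient-001"))
```

Both paths return the same data. The message-based caller only needs to know the message format, not the code that will run, which is the remoteness that message passing emphasizes.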
One key omission from the literature reviewed is that most of the project publications failed to mention any type of error detection. We feel that it is mandatory to verify the quality of integrated data, so that instead of propagating data errors, alerts regarding data quality can be triggered and correction processes can take place [12].
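As an illustration of the kind of check meant here, the sketch below validates an integrated record before it is passed on and returns alerts instead of silently propagating errors. The record layout, field names and the three rules are hypothetical; none of the reviewed projects is known to use them.

```python
from datetime import date


def find_quality_issues(record, today):
    """Return a list of data-quality alerts for one integrated patient record."""
    issues = []
    if not record.get("patient_id"):
        issues.append("missing patient identifier")
    birth_date = record.get("birth_date")
    if birth_date is None:
        issues.append("missing birth date")
    elif birth_date > today:
        issues.append("birth date is in the future")
    for index, result in enumerate(record.get("lab_results", [])):
        if result.get("value") is None:
            issues.append(f"lab result {index} has no value")
    return issues


if __name__ == "__main__":
    # Hypothetical record assembled from two feeder systems.
    record = {
        "patient_id": "",
        "birth_date": date(2090, 1, 1),
        "lab_results": [{"test": "glucose", "value": None}],
    }
    for alert in find_quality_issues(record, today=date(2005, 10, 1)):
        print("ALERT:", alert)
```

Returning alerts rather than raising an exception lets the integration layer decide whether to block the record, flag it for correction at the source, or display it with a warning.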

Limitations
One of the main limitations of this review is the lack of detail reported in most of the articles, and especially the absence of any impact evaluation of the technologies they describe, despite the enormous cost of such systems and the evident change in working practices that they entail. The percentage of missing values for each time interval varied between 0 and nearly 50%, depending on the type of variable analysed and the interval of time considered.
Another limitation is that considering only papers published in the last ten years may exclude early work on integration in hospitals, although we feel this is justifiable given the significant evolution in ICT in the last decade.
Although we feel that grouping the papers into projects is essential to decrease the bias of multiple publications of the same project, for some of the papers it was difficult to determine whether they were describing the same project or not.

Conclusion
Currently people are more mobile and live longer, and health care is more shared than ever before. It is clear that Information Systems are evolving to meet people's needs by implementing regional networks, allowing patient access and integrating ever more items of patient data. We conclude that patient information is becoming more accessible as there are more integrated IS, which are more likely to involve primary care and a wider range of patient data.
Web-based technologies and messaging technologies are supporting most of the current integration projects, indicating that these new technologies are quickly adopted in healthcare institutions. Many distinct technological solutions coexist to integrate patient data, using differing standards and data architectures which may hinder further interoperability.