Providing transparency-related metadata
Health professionals have begun to realize that it is their responsibility to guide consumers and patients to the best available medical information on the Web. Many national governments and medical societies have acknowledged that it is their responsibility to help users identify "good quality" information sources. They have begun to develop national health gateways (such as HealthInsite in Australia, NHS Direct in the UK, or Healthfinder in the USA), portal sites, and other forms of "infomediaries" such as seals of approval [2] or certification mechanisms, in an effort to help consumers locate trustworthy information resources.
However, current approaches do not harness any of the advantages of the Web as a decentralized, distributed information system. There is a need for "next generation" tools, including intelligent knowledge-based tools, allowing consumers to positively and actively identify reliable health information that suits their needs.
MedCIRCLE had three application partners: CISMeF in France, the Agency for Quality in Medicine (AQuMed) in Germany, and the Official Medical College of Barcelona (COMB). AQuMed was founded in March 1995 as a joint institution of the German Medical Association and the National Association of Statutory Health Insurance Physicians. AQuMed established a health gateway (URL: http://www.patienten-information.de) for laypersons, listing consumer health information sites. Before MedCIRCLE, documents were evaluated using the DISCERN instrument [11]. COMB (URL: http://www.comb.es) represents the medical profession of Barcelona. To date, in the project "Web Medica Acreditada", COMB has accredited more than 300 Spanish-language health websites from Spain and Latin America [12]. The Knowledge Management Department of the German Research Center for Artificial Intelligence (DFKI GmbH) provided consultancy services, especially in the area of ontology modeling. DFKI also provided the technical infrastructure and development resources for the project.
CISMeF terminology
The CISMeF team is composed of five medical librarians, two medical informaticians, one engineer, and three Ph.D. students and two Master's students in Computer Science. CISMeF uses two standard tools for organizing information: the MeSH (Medical Subject Headings) thesaurus from the US National Library of Medicine, and several metadata element sets [13]: (a) 11 of the 15 elements of the Dublin Core metadata format, to describe and index all the health resources included in CISMeF (author or creator, date, description, format, identifier, language, publisher, resource type, rights, subject and keywords, and title); (b) the 11 elements of the Educational category of Learning Object Metadata (LOM), for teaching resources; (c) specific metadata for evidence-based medicine resources (the level of evidence and the method used to calculate it), which also describe the health content [14]; and (d) the HIDDEL metadata set (Health Information Disclosure, Description and Evaluation Language) [15].
Description of the HIDDEL language
HIDDEL is a metadata language and an ontology that enables the expression of descriptive and evaluative annotations in XML/RDF. The first version of HIDDEL was developed during the MedCERTAIN project (MedPICS Certification and Rating of Trustworthy Health Information on the Net, http://www.medcertain.org) [16]. HIDDEL evolved from MedPICS [17], a basic rating vocabulary for medical information conforming to the Platform for Internet Content Selection (PICS) [18]. HIDDEL is used to enhance the transparency of health information on the Internet.
HIDDEL is based on existing quality criteria such as the Health On the Net (HON) Code of Conduct [2]. It was developed together with a quality management process model. HIDDEL can be used by information providers for self-disclosure, but also by third parties, such as quality-controlled health gateways, to evaluate health information providers. It provides three levels of evaluation: (a) self-disclosure, (b) evaluation by non-medical experts, and (c) evaluation by medical experts. As a quality-controlled subject gateway, CISMeF uses HIDDEL only as a third party.
The HIDDEL vocabulary can be downloaded freely from the MedCIRCLE Web site, provided that the source is acknowledged and that requests for changes or extensions are fed back to the community. At present, HIDDEL is available in four languages: English, German, French and Spanish. Because the vocabulary is controlled, annotations can be translated automatically (except for free text).
HIDDEL has a dual structure: on the one hand, Infoprovider metadata, which describe the health information provider (e.g., the name of the person responsible for the quality of the Web site), and on the other hand, Sitespecific metadata, which are devoted to the evaluation of one Web site (e.g., language). This dual structure made an inheritance process possible: each resource inherits the Infoprovider metadata of its publisher. In CISMeF, we have applied Sitespecific metadata to each resource (mainly quality-controlled documents) from a publisher already included in the CISMeF database. The name of the person responsible for the quality of the Web site, which is one of the Infoprovider metadata, is the same for every document of the Web site. In contrast, the language of the document, which is one of the Sitespecific metadata, may vary from one document to another.
The CISMeF team implemented the whole HIDDEL structure in the CISMeF database. This involved creating database triggers, which ensure the automated transfer of CISMeF metadata to HIDDEL metadata, and creating new forms (interface recasting) to handle non-CISMeF metadata. Because the HIDDEL elements are optional and repeatable, CISMeF selected 70 of the 305 metadata elements. Most of the metadata previously used in CISMeF, in particular the Dublin Core elements, are also included in HIDDEL; these were transferred to HIDDEL automatically by the triggers.
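The dual structure described above can be illustrated with a minimal Java sketch. It is not the CISMeF implementation, which relies on database triggers. The class names and the keys "qualityResponsible" and "language" are hypothetical and are not actual HIDDEL element names. The sketch only shows how provider-level metadata can be entered once and shared by every resource of the same publisher, while resource-level metadata vary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class HiddelInheritanceSketch {

    /** Provider-level ("Infoprovider") metadata, entered once per publisher. */
    static final class InfoProvider {
        final Map<String, String> metadata = new LinkedHashMap<>();
    }

    /** One resource with its own ("Sitespecific") metadata and a link to its publisher. */
    static final class Resource {
        final InfoProvider provider;
        final Map<String, String> siteSpecific = new LinkedHashMap<>();

        Resource(InfoProvider provider) {
            this.provider = provider;
        }

        /** Provider-level values first, then the resource's own values on top. */
        Map<String, String> effectiveMetadata() {
            Map<String, String> merged = new LinkedHashMap<>(provider.metadata);
            merged.putAll(siteSpecific);
            return merged;
        }
    }

    public static void main(String[] args) {
        InfoProvider publisher = new InfoProvider();
        // Hypothetical key and value, for illustration only.
        publisher.metadata.put("qualityResponsible", "Dr. A. Example");

        Resource french = new Resource(publisher);
        french.siteSpecific.put("language", "fr");

        Resource english = new Resource(publisher);
        english.siteSpecific.put("language", "en");

        System.out.println(french.effectiveMetadata());
        System.out.println(english.effectiveMetadata());
    }
}
```

Both resources report the same person responsible for quality, because that value is stored only on the shared provider, whereas the language differs from one resource to the other.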
Interoperability
The interoperability process consists of an exchange of RDF files containing experts' annotations "written" in HIDDEL. The semantic-based Archer Annotation System handles the reception of RDF annotations. Archer is a Web application for annotating health information Web sites using the HIDDEL vocabulary. It is a technical platform and an organizational infrastructure that can be used by consumers, health information providers, and third-party rating services. The first version of Archer was implemented as part of MedCERTAIN and was further enhanced in the course of the successor project, MedCIRCLE, to allow the exchange of metadata between third-party rating organizations.
In addition, through its search engine Doc'CISMeF, CISMeF provides external links to the Archer backend servlets and internal links to the disclosure information of rated sites (see Figure 1). Since August 2002, the CISMeF team has embedded RDF metadata (URL: http://www.w3.org/RDF) in the generated HTML pages, making them not only machine-readable (as every HTML page is) but also machine-processable. One of the main goals of this metadata element set was thereby fulfilled: it became interoperable with other Internet services. Moreover, an RDF Schema describing CISMeF-specific metadata was created (URL: http://doccismef.chu-rouen.fr/cismef.xml).
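To give an idea of what machine-processable metadata of this kind can look like, the following Java sketch serializes a few Dublin Core elements as an RDF/XML description of one resource. It is an illustration under stated assumptions, not the CISMeF generator: the class name, the example URL and the example values are hypothetical, and the way the RDF is embedded in the HTML page is not shown because the text does not specify it. Only the RDF and Dublin Core namespace URIs are the standard ones.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DublinCoreRdfSketch {

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    /** Escapes the characters that are special in XML text and attribute values. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /**
     * Builds one rdf:Description about the given resource. Keys of dcElements
     * are expected to be Dublin Core element names such as "title" or
     * "language"; this sketch does not validate them.
     */
    static String toRdfXml(String resourceUrl, Map<String, String> dcElements) {
        StringBuilder sb = new StringBuilder();
        sb.append("<rdf:RDF xmlns:rdf=\"").append(RDF_NS)
          .append("\" xmlns:dc=\"").append(DC_NS).append("\">\n");
        sb.append("  <rdf:Description rdf:about=\"")
          .append(escapeXml(resourceUrl)).append("\">\n");
        for (Map.Entry<String, String> e : dcElements.entrySet()) {
            sb.append("    <dc:").append(e.getKey()).append('>')
              .append(escapeXml(e.getValue()))
              .append("</dc:").append(e.getKey()).append(">\n");
        }
        sb.append("  </rdf:Description>\n</rdf:RDF>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> dc = new LinkedHashMap<>();
        // Hypothetical values, for illustration only.
        dc.put("title", "Example patient leaflet");
        dc.put("language", "fr");
        dc.put("publisher", "Example Hospital");
        System.out.print(toRdfXml("http://example.org/leaflet.html", dc));
    }
}
```

A program that reads such a description can retrieve the title, language and publisher of the resource without parsing the visible HTML, which is what distinguishes machine-processable from merely machine-readable pages.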
In practical terms, interoperability relies on a three-step process (see Figure 2): (1) RDF file generation: a Java program (RDFWriter.class) formats evaluation data according to the MedCIRCLE RDF Schema of annotations; (2) RDF file export: a Java program (RDFSender) sends the RDF files to the MedCIRCLE Web server using HTTP POST; (3) reception and ID allocation: for each transmitted file, the MedCIRCLE Web server returns an ID number that is used to access the exported metadata.
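Steps (2) and (3) can be sketched as follows. This is not the RDFSender program itself, whose source is not reproduced here. It uses the standard java.net.http client (Java 11 or later), which postdates the project. The endpoint URL, the "application/rdf+xml" content type and the assumption that the server answers with the bare numeric ID in the response body are all hypothetical choices made for the illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class RdfExportSketch {

    /** Step 2: wrap an RDF/XML payload in an HTTP POST request. */
    static HttpRequest buildPost(URI endpoint, String rdfXml) {
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/rdf+xml") // assumed media type
                .POST(HttpRequest.BodyPublishers.ofString(rdfXml))
                .build();
    }

    /** Step 3: read the allocated ID, assuming the body is just the number. */
    static long parseAllocatedId(String responseBody) {
        return Long.parseLong(responseBody.trim());
    }

    /** Sends one RDF file and returns the ID allocated by the server. */
    static long export(HttpClient client, URI endpoint, String rdfXml)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
                client.send(buildPost(endpoint, rdfXml),
                            HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return parseAllocatedId(response.body());
    }

    public static void main(String[] args) {
        // Hypothetical endpoint; the request is built but not sent here.
        URI endpoint = URI.create("http://example.org/medcircle/annotations");
        HttpRequest request = buildPost(endpoint, "<rdf:RDF/>");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Keeping the request construction and the ID parsing in separate methods lets each be checked without a network connection; only export actually contacts the server.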