QL4MDR: a GraphQL query language for ISO 11179-based metadata repositories
BMC Medical Informatics and Decision Making volume 19, Article number: 45 (2019)
Heterogeneous healthcare instance data can hardly be integrated without harmonizing its schema-level metadata. Many medical research projects and organizations use metadata repositories (MDRs) to edit, store and reuse data elements. However, existing metadata repositories differ in their software implementation and have shortcomings when it comes to exchanging metadata. This work aims to define a uniform interface with a technical interlingua between the different MDR implementations in order to enable and facilitate the exchange of metadata, to query over distributed systems and to promote cooperation. To design a unified interface for multiple existing MDRs, a standardized data model must be agreed on. ISO 11179 is an international standard for the representation of metadata, and since most MDR systems claim to be at least partially compliant, it is a suitable basis for defining such an interface. Because compliance is often only partial, each repository must be able to declare which parts of the standard it can serve, and the interface must be able to handle highly linked data. GraphQL is a data access layer and defines query techniques designed to navigate easily through complex data structures.
We propose QL4MDR, an ISO 11179-3 compatible GraphQL query language. The GraphQL schema for QL4MDR is derived from the ISO 11179 standard and defines objects, fields, queries and mutation types. Entry points within the schema define the paths through the graph that enable search functionality, while mutation types promote exchange by allowing metadata to be created, updated and deleted. QL4MDR is the foundation for the uniform interface, which is implemented in a modern web-based interface prototype.
We have introduced a uniform query interface for metadata repositories combining the ISO 11179 standard for metadata repositories and the GraphQL query language. A reference implementation was built on the existing Samply.MDR. The interface facilitates access to metadata, enables better interaction with metadata and provides a basis for connecting existing repositories. We invite other ISO 11179-based metadata repositories to take this approach into account.
Heterogeneity of healthcare data from different sources is a well-known obstacle limiting data integration and analytics. If the same facts are expressed in various ways, understanding and exchanging data becomes a demanding process that ties up resources in the form of data specialists and is both labor-intensive and error-prone.
As a remedy, the unambiguous interpretation and, thus, integration of such “instance data” can be facilitated by describing their variety and characteristics using “metadata”. If curated and semantically annotated, metadata is instrumental in data integration. For example, metadata can be used for validation and transformation of instance data: once metadata has been harmonized at the schema level, matchings and mappings between different metadata sets can be used to generate the transformation of instance data, as conceptually shown in Fig. 1. It has been shown that such processing rules can serve to integrate and exchange healthcare instance data.
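As a minimal sketch of this idea, the following Python snippet derives an instance-data transformation from a schema-level mapping between two value domains. The field names, permitted values and the mapping itself are invented for illustration and are not taken from any of the repositories discussed here.

```python
# Hypothetical value domains of two data elements that describe the same concept.
SOURCE_VALUE_DOMAIN = {"m", "f", "u"}
TARGET_VALUE_DOMAIN = {"male", "female", "unknown"}

# Schema-level mapping between permitted values (assumed to be curated by hand).
VALUE_MAPPING = {"m": "male", "f": "female", "u": "unknown"}


def transform(record, source_field, target_field):
    """Validate a source value against its value domain, then translate it."""
    value = record[source_field]
    if value not in SOURCE_VALUE_DOMAIN:
        raise ValueError(f"{value!r} is not permitted for {source_field}")
    mapped = VALUE_MAPPING[value]
    if mapped not in TARGET_VALUE_DOMAIN:
        raise ValueError(f"mapping yields {mapped!r}, which the target does not permit")
    return {target_field: mapped}


print(transform({"sex": "f"}, "sex", "gender"))
```

The same mapping serves both purposes named above: values outside the source value domain are rejected (validation), and permitted values are rewritten into the target representation (transformation).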
Many projects and organizations in the field of medical informatics research already utilize metadata repositories (MDRs) to store, edit, use and reuse metadata. As a result, a multitude of MDR implementations have emerged, each one featuring its own web interface, e.g. the Common Data Element Browser from the National Institutes of Health, the US Health Information Knowledgebase, the Samply.MDR and the METeOR of the Australian Institute of Health and Welfare. While mapping between data elements within one MDR is a well-researched topic, the exchange between several MDRs, a requirement for the exchange and integration process across consortia, has been much less studied. This exchange is the focus of this study and is represented on the left side of Fig. 1. Fortunately, most MDR systems claim to be partly conformant to the metadata standard ISO 11179, so that in principle metadata can be exchanged between MDRs [4, 7, 8, 9, 10]. However, while ISO 11179-3 defines a metamodel and basic attributes for describing metadata, it does not provide an implementation. After studying the systems mentioned above, we found that some provide no query endpoint at all, while the interfaces of others are outdated. Existing metadata exchange standards are not focused on the ISO 11179 standard, and they are proprietary and rigid due to their design and technologies. The semantically enhanced metadata is therefore unavailable due to technical or syntactical heterogeneity. In summary, before we can exploit metadata from several MDRs for data integration, we face a problem of metadata integration.
We propose a uniform interface to access multiple MDRs as long as they follow a specific metadata standard. A prominent example of a uniform interface for clinical systems is HL7 Fast Healthcare Interoperability Resources (FHIR). It provides standardized exchange formats together with modern tooling such as JSON, Atom and REST. However, the standard is at a disadvantage when deeply structured resources are to be processed. Since metadata is predominantly deeply nested, effective technical access to real MDR systems is urgently needed.
The ISO 11179 standard for metadata repositories
In order to design a uniform interface suitable for several existing MDRs, a standard data model needs to be agreed on. The ISO/IEC standard 11179 is commonly used for the modelling of metadata and of the corresponding repositories and registries. The standard defines a core model in order to harmonize the formal representation of metadata. This core model is divided into two layers: the representational and the conceptual layer. The representational layer defines the key concept Data Element as a single information element and a Value Domain describing the datatypes and their value ranges. The conceptual layer groups Data Elements into concepts to describe their semantic similarity. In addition to the core model, the standard defines various entities to capture the information corresponding to the metadata. Consequently, the interface must be able to query a highly linked data model.
Fast Healthcare Interoperability Resources
A uniform interface is a common way to overcome the problem of heterogeneity in data exchange. A significant example is the Fast Healthcare Interoperability Resources standard, the newest member of the HL7 standards family. FHIR defines information components, called resources, and a standardized way to retrieve and manipulate these components. The FHIR resource DataElement and the ISO 11179 profile, defined for representing metadata in FHIR version DSTU2, were the basis for a functional MDR prototype. With FHIR version STU3, however, the DataElement resource has been marked deprecated, and a suitable successor has not yet been defined. In particular, FHIR developers state that REST interfaces are not a suitable communication approach for the complex, nested queries required in the exchange of ISO 11179-3 information.
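The four central entities of the core model and their links can be sketched as plain Python data classes. This is a simplified illustration only: the attribute names are chosen for readability and are assumptions, not the attribute names defined in ISO 11179-3, and most entities of the standard are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConceptualDomain:
    """Conceptual layer: the set of value meanings a concept can take."""
    name: str


@dataclass
class DataElementConcept:
    """Conceptual layer: what is described, independent of representation."""
    name: str
    conceptual_domain: Optional[ConceptualDomain] = None


@dataclass
class ValueDomain:
    """Representational layer: datatype and permitted values."""
    datatype: str
    permitted_values: List[str] = field(default_factory=list)
    conceptual_domain: Optional[ConceptualDomain] = None


@dataclass
class DataElement:
    """Representational layer: a single information element."""
    designation: str
    value_domain: ValueDomain
    concept: Optional[DataElementConcept] = None


# Illustrative instance: the mass of a person, recorded as a decimal number.
mass_domain = ConceptualDomain("Mass")
person_mass = DataElement(
    designation="Person mass in kilograms",
    value_domain=ValueDomain("decimal", conceptual_domain=mass_domain),
    concept=DataElementConcept("Person mass", conceptual_domain=mass_domain),
)
print(person_mass.value_domain.datatype)
```

Even this reduced model shows why the data are highly linked: a query about one data element quickly touches three further objects.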
GraphQL, initially developed by Facebook, is a query language especially suited for highly linked data models and is used by GitHub, Twitter and the German railway company Deutsche Bahn, among others [17, 18]. The FHIR standard itself introduced GraphQL as a query alternative to REST APIs. Technically, GraphQL functions as a database abstraction layer providing a single API endpoint for both queries and mutations. The provided information objects are defined in a schema, which is expressive and supports inheritance, interfaces, custom types and attribute constraints such as non-nullable entries. Creating a GraphQL schema requires defining:
Objects and Fields to define information representation
Queries to define how object types can be queried, including filtering and
Mutations to enable input types for information capturing and manipulation.
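The three building blocks listed above can be illustrated with a small schema fragment in the GraphQL schema definition language, held here in a Python string. The type and field names are hypothetical and much smaller than the actual QL4MDR schema; the helper merely lists which types the fragment declares.

```python
import re

# Hypothetical schema fragment; not the published QL4MDR schema.
SCHEMA_SDL = """
type ValueDomain {
  datatype: String!
  permittedValues: [String!]
}

type DataElement {
  id: ID!
  designation: String!
  valueDomain: ValueDomain!
}

input DataElementInput {
  designation: String!
  datatype: String!
}

type Query {
  dataElements(designation: String): [DataElement!]!
}

type Mutation {
  createDataElement(input: DataElementInput!): DataElement!
}
"""


def declared_types(sdl):
    """Return the names of all object and input types declared in the SDL text."""
    return re.findall(r"^(?:type|input)\s+(\w+)", sdl, flags=re.MULTILINE)


print(declared_types(SCHEMA_SDL))
```

The object types and their fields describe the information representation, the `Query` type defines a filterable entry point, and the `Mutation` type together with the input type covers information capture. The exclamation marks express the non-nullable constraints mentioned above.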
We used the GraphQL reference library graphql-java to derive the QL4MDR API and its documentation from the defined schema. As a next step, we implemented the API in a widely used open-source ISO 11179-based metadata repository, Samply.MDR. We created the necessary data fetchers using the underlying Samply.MDR database access layer, which ensures backwards compatibility across MDR versions and allows the use of the existing access control based on OpenID Connect. As an optimization, we implemented resource resolvers that reduce the number of database connections via lazy loading, e.g. fetching a namespace including each data element with the corresponding value domains without producing a large number of database queries.
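One way to read this optimization is as deferred, batched loading: lookups are collected first and sent to the database together once a value is actually needed. The sketch below illustrates that general technique in Python under this assumption. It is not the Samply.MDR implementation, and the class and method names are hypothetical.

```python
class FakeDatabase:
    """Stand-in for a database access layer that counts its queries."""

    def __init__(self):
        self.query_count = 0
        self._value_domains = {1: "integer", 2: "string", 3: "date"}

    def value_domains_by_ids(self, ids):
        self.query_count += 1  # one round trip, regardless of how many ids
        return {i: self._value_domains[i] for i in ids}


class LazyValueDomainLoader:
    """Collects requested ids and loads them in one batch on first access."""

    def __init__(self, db):
        self._db = db
        self._pending = set()
        self._cache = {}

    def request(self, value_domain_id):
        """Register interest in a value domain and return a deferred lookup."""
        self._pending.add(value_domain_id)
        return lambda: self._resolve(value_domain_id)

    def _resolve(self, value_domain_id):
        missing = self._pending - set(self._cache)
        if missing:
            self._cache.update(self._db.value_domains_by_ids(sorted(missing)))
        self._pending.clear()
        return self._cache[value_domain_id]


db = FakeDatabase()
loader = LazyValueDomainLoader(db)
deferred = [loader.request(i) for i in (1, 2, 3)]  # no database access yet
datatypes = [resolve() for resolve in deferred]    # first access triggers one batch
print(datatypes, db.query_count)
```

A naive resolver would issue one query per data element; here the three value domains are fetched with a single query.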
Having reviewed the ISO 11179-3 core model, we propose a compatible GraphQL schema, a GraphQL-based API QL4MDR and a prototypical implementation of a modern web-based interface.
Definition of an ISO 11179-compatible GraphQL schema
We derived the GraphQL schema for QL4MDR from the ISO 11179 standard. In particular, we considered the third part, which describes the core model. The core model describes 26 entities, of which the QL4MDR schema covers a subset. The schema consists of the following: a) Object types with corresponding fields, b) Query types and c) Mutation types.
Objects & Fields
The ISO 11179-3 core model is represented in four Object types: Data Element, Value Domain, Data Element Concept and Conceptual Domain. The standard also comprises Namespace and the customizable Slots as structures for the identification of metadata. Also, all required Objects related to the previous six types are included in the QL4MDR schema, resulting in 13 Object types.
ISO 11179-3 further specifies these basic Object types by attributes. We translated these attributes into GraphQL fields, which can be used to filter and constrain the query. To enhance filter functionality, Object types with fewer than two attributes are included in related Objects as fields. For example, the ISO 11179 Property Class results in the string representation Property related to the Data Element Concept.
GraphQL queries start at an entry point and traverse the data graph. QL4MDR provides six entry points: Data Element as the central information item, Value Domain, Data Element Concept, Conceptual Domain, Namespace and Slot. Each entry point provides a particular set of filters to specify the requested information, e.g. all concepts regarding Person and its mass. Since slots can contain custom information about each data element, they allow additional parameters for better querying.
The QL4MDR data graph has a defined direction, which we derived from the cardinality described in ISO 11179-3 and which is represented by directed lines in Fig. 2. QL4MDR queries should be formulated so that they traverse the graph along the defined directions.
Of the six available entry points for querying, we selected three as valid starting points for mutations: Namespace, Conceptual Domain and the pivotal Data Element. This selection provides two important guarantees. First, each entity can be created, modified or deleted, as there is always a path to it from one of these entry points. Second, it is impossible to define cyclical mutations.
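The interplay of entry point, filter and traversal can be sketched in a few lines of Python over an in-memory data set. The data, the filter parameter and the field names are hypothetical; a real QL4MDR server would delegate this work to the GraphQL library and its data fetchers.

```python
# Hypothetical in-memory metadata; field names are illustrative only.
DATA_ELEMENTS = [
    {
        "designation": "Person mass",
        "valueDomain": {"datatype": "decimal", "unitOfMeasure": "kg"},
        "slots": [{"key": "source", "value": "demo"}],
    },
    {
        "designation": "Person height",
        "valueDomain": {"datatype": "decimal", "unitOfMeasure": "cm"},
        "slots": [],
    },
]


def data_elements(designation_contains=None):
    """Entry point: return data elements, optionally filtered by designation."""
    return [
        element
        for element in DATA_ELEMENTS
        if designation_contains is None
        or designation_contains.lower() in element["designation"].lower()
    ]


def select(node, selection):
    """Project a node onto the requested fields, following nested selections."""
    result = {}
    for field_name, sub_selection in selection.items():
        value = node[field_name]
        if sub_selection is None:
            result[field_name] = value  # leaf field
        elif isinstance(value, list):
            result[field_name] = [select(item, sub_selection) for item in value]
        else:
            result[field_name] = select(value, sub_selection)
    return result


# Roughly: { dataElements(designation: "mass") { designation valueDomain { datatype } } }
selection = {"designation": None, "valueDomain": {"datatype": None}}
result = [select(element, selection) for element in data_elements("mass")]
print(result)
```

The filter narrows the set of start nodes, and the nested selection determines how far the query walks into the graph and which fields are returned.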
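Both guarantees are properties of a directed graph and can be checked mechanically. The following Python sketch does so for an assumed edge set: the actual directions are those of Fig. 2, which is not reproduced here, so the edges below are a hypothetical stand-in.

```python
# Assumed directions between object types (hypothetical, see lead-in).
EDGES = {
    "Namespace": ["DataElement"],
    "DataElement": ["ValueDomain", "DataElementConcept", "Slot"],
    "ConceptualDomain": ["ValueDomain", "DataElementConcept"],
    "ValueDomain": [],
    "DataElementConcept": [],
    "Slot": [],
}
MUTATION_ENTRY_POINTS = ["Namespace", "ConceptualDomain", "DataElement"]


def reachable(edges, starts):
    """Return all nodes reachable from the start nodes (depth-first search)."""
    seen = set()
    stack = list(starts)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges[node])
    return seen


def is_acyclic(edges):
    """Kahn's algorithm: the graph is acyclic if every node can be removed."""
    indegree = {node: 0 for node in edges}
    for targets in edges.values():
        for target in targets:
            indegree[target] += 1
    queue = [node for node, degree in indegree.items() if degree == 0]
    removed = 0
    while queue:
        node = queue.pop()
        removed += 1
        for target in edges[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                queue.append(target)
    return removed == len(edges)


print(reachable(EDGES, MUTATION_ENTRY_POINTS) == set(EDGES), is_acyclic(EDGES))
```

Under the assumed edges, every object type is reachable from one of the three mutation entry points, and no sequence of nested mutations can return to its starting type.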
The proposed interface follows two major design decisions, which result in advantages with regard to MDR interoperability: choosing GraphQL rather than a RESTful or service-oriented interface, and basing QL4MDR on the ISO 11179-3 standard rather than a proprietary implementation.
GraphQL vs. traditional interfaces
GraphQL can be regarded as a variation of the widely used RESTful design pattern, but it differs in specific characteristics and yields both advantages and limitations. As a GraphQL-based API, QL4MDR can answer even complex questions by navigating across the various entities of the ISO 11179 standard, thus reducing the required number of queries. In other words, RESTful or service-oriented interfaces need substantially more requests to provide the same information. The number of queries against a RESTful interface depends on the number of requested data elements. For example, consider an electronic data capture solution requesting validation rules for all data elements present in a given namespace, as shown in Fig. 3. Additionally, a RESTful client receives redundant information, as it is forced to query data elements with all properties and has to discard those that are of no further benefit. A RESTful interface could implement tailored routes, but the cost of maintaining them would outweigh the benefit.
Another advantage of GraphQL lies in the creation of meaningful documentation. In particular, GraphQL implementations like graphql-java can generate both human- and machine-readable documentation from the defined schema. The introspection feature not only allows users and developers to understand the interface more easily; its machine-readable representation also enables dynamic and loose coupling between server and clients, thus facilitating the federation of various, technically different ISO 11179-based MDRs. Previous standards like WS-MetadataExchange cannot offer that flexibility and loose coupling due to their heavyweight service-oriented architecture.
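The difference in request count and payload can be made concrete with a small simulation. The sketch assumes a REST layout with one request for the namespace listing and one per data element, and an in-memory store with invented properties. Neither reflects a specific MDR.

```python
# Hypothetical namespace with five data elements and five properties each.
DATA_ELEMENTS = {
    f"de{i}": {
        "designation": f"Element {i}",
        "definition": "Example definition",
        "datatype": "integer",
        "validation": {"min": 0, "max": 100},
        "version": 1,
    }
    for i in range(1, 6)
}


def fetch_rest(store):
    """One request for the listing, then one full representation per element."""
    requests = 1  # listing of the namespace members
    rules, fields_received = {}, 0
    for element_id in list(store):
        requests += 1
        element = store[element_id]      # all properties are transferred
        fields_received += len(element)
        rules[element_id] = element["validation"]  # everything else is discarded
    return rules, requests, fields_received


def fetch_graphql(store):
    """One request whose selection asks only for the validation rules."""
    rules = {element_id: element["validation"] for element_id, element in store.items()}
    return rules, 1, len(rules)


rest_rules, rest_requests, rest_fields = fetch_rest(DATA_ELEMENTS)
gql_rules, gql_requests, gql_fields = fetch_graphql(DATA_ELEMENTS)
print(rest_requests, rest_fields, gql_requests, gql_fields)
```

Both clients end up with the same validation rules, but under these assumptions the REST client needs N + 1 requests for N data elements and receives every property, whereas the GraphQL client needs one request and receives only the selected field.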
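Introspection is part of the GraphQL specification: a client can ask any conforming server for its schema through the `__schema` meta-field. The Python sketch below builds such a request and extracts the type names from a response. The endpoint URL and the sample response are made up for illustration, and the request is only constructed, not sent.

```python
import json
import urllib.request

ENDPOINT = "https://mdr.example.org/graphql"  # hypothetical endpoint

# Standard introspection query listing the types a server exposes.
INTROSPECTION_QUERY = "{ __schema { queryType { name } types { name kind } } }"


def build_introspection_request(endpoint):
    """Build (but do not send) an HTTP POST request carrying the query as JSON."""
    body = json.dumps({"query": INTROSPECTION_QUERY}).encode("utf-8")
    return urllib.request.Request(
        endpoint, data=body, headers={"Content-Type": "application/json"}
    )


def schema_type_names(response):
    """Return the exposed type names, skipping built-in introspection types."""
    types = response["data"]["__schema"]["types"]
    return sorted(t["name"] for t in types if not t["name"].startswith("__"))


# Invented response in the shape defined by the introspection system.
sample_response = {
    "data": {
        "__schema": {
            "queryType": {"name": "Query"},
            "types": [
                {"name": "DataElement", "kind": "OBJECT"},
                {"name": "ValueDomain", "kind": "OBJECT"},
                {"name": "__Schema", "kind": "OBJECT"},
            ],
        }
    }
}

request = build_introspection_request(ENDPOINT)
print(request.get_method(), schema_type_names(sample_response))
```

A federating client could use such a response to find out which parts of the schema a given repository serves before sending a query to it.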
Adherence to metadata standards instead of their implementations
QL4MDR is not tailored to a specific repository implementation but modelled strictly after the ISO 11179-3 standard. This approach yields both advantages and limitations.
On the one hand, adhering to ISO 11179-3 as the common metadata model ensures reusable queries that can be executed against various MDR implementations, as long as they follow ISO 11179-3 and implement QL4MDR. On the other hand, metadata management systems are sometimes customized for specific use cases and specifications, which go beyond what ISO 11179-3 defines. For instance, Samply.MDR implements the so-called Data Element Group to organize certain data elements. As this entity is not included in the standard, it obviously cannot be queried via QL4MDR. However, workarounds are possible: in this case, for example, Data Element Groups could be treated as complex data elements consisting of several data elements, a designation and a definition.
Designing a common interface is the first step on the way to a simple federation of heterogeneous MDRs via a uniform and standardized interface, and therefore to reusing metadata. An interface alone, however, cannot address common problems of handling metadata in a distributed context, such as consolidation of datasets and/or the mediation between existing sets, matching and mapping of data elements and protection of intellectual property (study designs, etc.). Also, federating various MDR instances yields the usual problems of distributed information systems such as replication, consistency and duplicate detection, addressing and operational availability and versioning. QL4MDR is made for MDRs that are based on ISO 11179-3; non-ISO-based systems are currently out of scope.
Lastly, one must consider that, like any other interface, QL4MDR can only offer functionality or serve information that is available in the underlying MDR. In the case of ISO 11179, not all MDRs implement all components of the extensive standard. For example, although QL4MDR does cover the conceptual layer, it is unavailable in our reference implementation as it is not available in Samply.MDR. To some extent, such limitations can be mitigated: in our example, the additional semantic information can be stored in the optional slot of a data element. However, for the sake of interoperability across MDR implementations, we argue that compliance with the ISO 11179 standard is preferable to such workarounds.
We have presented a uniform query interface for various implementations of metadata repositories. To ensure compatibility and sustainability, we did not invent new paradigms but reused existing standards, namely the widely used ISO 11179 standard for metadata registries and the GraphQL query language. We provide a reference implementation based on the widely used Samply.MDR software, which is available under https://bitbucket.org/medicalinformatics/. QL4MDR could be integrated into other MDR implementations following the ISO 11179 metadata representation by implementing the required GraphQL data fetcher and the HTTP-based query endpoint. Once integrated into MDRs, QL4MDR not only enables better interaction with a single metadata repository in a uniform manner standardized on ISO 11179-3, but also serves as the foundation for a federation of existing implementations and research networks’ instances. Thus, we invite authors of other ISO 11179-based metadata registries to consider this approach for implementation.
Availability and requirements
The source code is freely available as open source on Bitbucket.
Project name: Samply.MDR.GraphQL.
Project home page: https://bitbucket.org/medicalinformatics/samply.mdr.ql4mdr
Operating system(s): Platform independent.
Programming language: Java.
Other requirements: Java 1.3.1 or higher, Tomcat 4.0 or higher.
License: GNU Affero General Public License.
Any restrictions to use by non-academics: no licence needed.
FHIR: Fast Healthcare Interoperability Resources
Khoumbati K, Themistocleous M, Irani Z. Integration Technology Adoption in Healthcare Organisations: A Case for Enterprise Application Integration. Proceedings of the 38th Annual Hawaii International Conference on System Sciences. 2005:9.
Dugas M. Design of case report forms based on a public metadata registry: re-use of data elements to improve compatibility of data. Trials. 2016;17:566.
Aubrecht P, Kouba Z. Metadata Driven Data Transformation. In: ISAS-SCI (1). Citeseer; 2001. p. 332–336.
Nadkarni PM, Brandt CA. The common data elements for cancer research: remarks on functions and structure. Methods Inf Med. 2006;45:594–601.
Kadioglu D, Weingardt P, Lablans M, Ückert F, Wagner TO. Samply.MDR – Ein Open-Source-Metadaten-Repository. German Medical Science GMS Publishing House. 2016.
Australian Institute of Health and Welfare. METeOR home. http://meteor.aihw.gov.au/content/index.phtml/itemId/181162. Accessed 29 Jun 2018.
Stausberg J, Löbe M, Verplancke P, Drepper J, Herre H, Löffler M. Foundations of a metadata repository for databases of registers and trials. Stud Health Technol Inform. 2009;150:409–13.
Ngouongo SM, Löbe M, Stausberg J. The ISO/IEC 11179 norm for metadata registries: does it cover healthcare standards in empirical research? J Biomed Inform. 2013;46:318–27.
Richesson RL, Nadkarni P. Data standards for clinical research data collection forms: current status and challenges. J Am Med Inform Assoc. 2011;18:341–6.
Park YR, Yoon YJ, Kim HH, Kim JH. Establishing semantic interoperability of biomedical metadata registries using extended semantic relationships. Stud Health Technol Inform. 2013;192:618–21.
Ballinger K, Box D, Curbera F, Davanum S, Ferguson D, Graham S, et al. Web services metadata exchange (WS-MetadataExchange). OASIS draft. 2004.
Benson T, Grieve G. Principles of Health Interoperability. Springer; 2016.
ISO/IEC 11179-3. Information Technology – Metadata Registries (MDR), Part 3: Registry Metamodel and Basic Attributes, Edition 3, see https://www.iso.org/standard/50340.html. 2013.
Ulrich H, Kock A-K, Duhm-Harbeck P, Habermann JK, Ingenerf J. Metadata repository for improved data sharing and reuse based on HL7 FHIR. Stud Health Technol Inform. 2016;228:162–6.
Hay D. GraphQL | Hay on FHIR. https://fhirblog.com/2017/08/17/graphql/. Accessed 2 Jul 2018.
Buna S. Learning GraphQL and Relay. Packt Publishing Ltd; 2016.
Facebook Inc. GraphQL: Users. http://graphql.org/users. Accessed 6 Jun 2018.
DB Systel GmbH. API-Portal - 1BahnQL-Free. https://developer.deutschebahn.com/store/apis/info?name=1BahnQL-Free&version=v1&provider=DBOpenData. Accessed 6 Jun 2018.
Health Level 7. Graphql - FHIR v3.4.0. http://build.fhir.org/graphql.html. Accessed 27 Jun 2018.
Facebook Inc. GraphQL: A query language for APIs. http://graphql.org/. Accessed 27 Jun 2018.
Kadioglu D, Breil B, Knell C, Lablans M, Mate S, Schlue D, et al. Samply.MDR - a metadata repository and its application in various research networks. Stud Health Technol Inform. 2018;253:50–4.
Sakimura N, Bradley J, Jones M, de Medeiros B, Mortimore C. OpenID Connect Core 1.0 incorporating errata set 1. The OpenID Foundation, specification. 2014.
Kern J, Tas D, Ulrich H, Schmidt EE, Ingenerf J, Ückert F, et al. A Method to use Metadata in legacy Web Applications: The Samply.MDR.Injector. Stud Health Technol Inform - In Press. 2018.
Kumari S, Rath SK. Performance comparison of soap and rest based web services for enterprise application integration. In: Advances in Computing, Communications and Informatics (ICACCI), 2015 International Conference on. IEEE; 2015. p. 1656–1660.
The project is partially supported by a grant LA 3859/2–1 by the German Research Foundation (Deutsche Forschungsgemeinschaft). The funding agency had no role in study design, data collection, data analysis, results interpretation or in writing the manuscript.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.