Pragmatic MDR: a metadata repository with bottom-up standardization of medical metadata through reuse

Abstract

Background

The variety of medical documentation often leads to incompatible data elements that impede data integration between institutions. A common approach to standardizing and distributing metadata definitions is the use of ISO/IEC 11179-compliant metadata repositories with top-down standardization. To the best of our knowledge, however, it is not yet common practice to reuse the content of publicly accessible metadata repositories for the creation of case report forms or routine documentation. We suggest an alternative concept called pragmatic metadata repository, which enables a community-driven bottom-up approach for agreeing on data collection models. A pragmatic metadata repository collects real-world documentation and considers frequent metadata definitions as high quality with potential for reuse.

Methods

We implemented a pragmatic metadata repository proof of concept application and filled it with medical forms from the Portal of Medical Data Models. We applied this prototype in two use cases to demonstrate its capabilities for reusing metadata: first, integration into a study editor for the suggestion of data elements and, second, metadata synchronization between two institutions. Moreover, we evaluated the emergence of bottom-up standards in the prototype and two medical data managers assessed their quality for 24 medical concepts.

Results

The resulting prototype contained 466,569 unique metadata definitions. Integration into the study editor led to a reuse of 1836 items and item groups. During the metadata synchronization, semantic codes of 4608 data elements were transferred. Our evaluation revealed that for less complex medical concepts weak bottom-up standards could be established. However, more diverse disease-related concepts showed no convergence of data elements due to an enormous heterogeneity of metadata. The survey showed fair agreement (Kalpha = 0.50, 95% CI 0.43–0.56) for good item quality of bottom-up standards.

Conclusions

We demonstrated the feasibility of the pragmatic metadata repository concept for medical documentation. Applications of the prototype in two use cases suggest that it facilitates the reuse of data elements. Our evaluation showed that bottom-up standardization based on a large collection of real-world metadata can yield useful results. The proposed concept is not intended to replace existing top-down approaches; rather, it complements them by showing what is commonly used in the community to guide other researchers.

Background

Due to the medical complexity and heterogeneity of data element definitions, an enormous variety of medical documentation exists [1]. This variety often leads to incompatible data elements that impede data integration between different institutions [2]. Standardizing and reusing such metadata definitions has two major advantages. First, it yields harmonized data sets that allow data exchange between institutions [3, 4] and facilitate data analyses, such as multi-site phenotyping [5] or machine learning [6]. Second, medical documentation does not have to be developed from scratch, which reduces costs [7]. A common approach pursued in past years to facilitate standardization and reuse is the metadata repository (MDR), a database that gathers, retains, and disseminates standardized data element definitions [8]. Several implementations based on the ISO/IEC 11179 norm for metadata registries exist [9]. Table 1 summarizes publicly accessible instances for healthcare applications. Existing MDRs usually apply a top-down approach for metadata standardization through an expert committee or another manually controlled procedure [10]. To the best of our knowledge, however, it is not yet common practice to reuse data element definitions from one of these MDRs for the creation of case report forms or routine documentation.

Table 1 Publicly accessible metadata repositories in the healthcare domain

In this work, we suggest an alternative approach called pragmatic metadata repository, which enables a community-driven bottom-up approach for agreeing on standards and facilitates metadata sharing. We define a pragmatic MDR with the following key principles:

  1. Based on real-world metadata definitions that were already used for data collections in medical research or routine healthcare

  2. Frequency-based scoring of data elements leading to de facto standards

  3. Open access to share, query, and reuse content across institutions

In contrast to existing repositories, a pragmatic MDR contains a large collection of real-world metadata definitions from different sources. To obtain this collection, it allows everyone to share data, i.e. it is community-driven. When a data element definition is used in many real-world settings, this indicates that the definition has already been tested and is well accepted. Moreover, many data sets already exist that could potentially be compared to data from a newly designed system that adopts such a data element definition. Hence, the pragmatic MDR concept considers frequent metadata definitions as high quality with an increased potential for reuse and scores them higher. To this end, a pragmatic MDR automatically detects equivalent definitions, aggregates them, and only stores a single copy along with its number of occurrences. We call this concept of reflecting metadata quality through its frequency in real-world documentation bottom-up standardization [11]. A comparable practice is assessing the relevance of a scientific paper by its number of citations. We expect this pragmatic MDR concept with bottom-up standardization to provide more suitable data element definitions for the creation of case report forms or routine documentation than existing MDRs.

The main objective of this work is to carry out a feasibility study for the suggested pragmatic MDR concept by implementing a proof of concept application fulfilling the above key principles. We apply and evaluate this prototype in two different use cases to demonstrate its capabilities for metadata reuse. First, we integrate it into the study form editor ODMEdit [12] as a suggestion mechanism for data elements during the creation of medical documentation. Second, our partners at the University Medicine of Greifswald use the prototype for automatic synchronization of metadata shared in the Portal of Medical Data Models (MDM Portal) [13]. In addition to that, we perform an evaluation of bottom-up standardization in a pragmatic MDR and verify the quality of the derived data element definitions. To this end, two medical data managers evaluate different properties of the top three items for 24 important medical concepts.

Methods

Pragmatic MDR proof of concept implementation

Our implementation was guided by the three principles for a pragmatic MDR. To obtain a large set of real-world metadata definitions for the proof of concept (principle 1), we used the content of the MDM Portal [13]. The portal stores medical forms in the Clinical Data Interchange Standards Consortium (CDISC) Operational Data Model (ODM) [19, 20]. To avoid data conversion and development of a new data model, the prototype’s metadata model was derived from ODM. Figure 1 illustrates ODM’s tree structure in the center with a corresponding example form in the MDM Portal on the left. The eight depicted ODM elements served as atomic resources that were stored in the pragmatic MDR prototype. Sharing of metadata definitions (principle 3) is also realized through the MDM Portal that already offers a simple upload mechanism and automatically synchronizes with the MDR. Our proof of concept implementation split up incoming ODM files into atomic elements and aggregated equivalent elements while it kept a list of their original occurrences as illustrated in Fig. 2. Our prototype treated elements as equivalent if they agreed in every ODM property. We used the open-source search platform Apache Solr [21] with a custom search strategy to rank results with a tradeoff between query matching and the logarithm of their number of occurrences for frequency-based scoring of data elements (principle 2). Open access for querying and reusing metadata definitions (principle 3) was realized through a publicly accessible resource-oriented REST API [22]. For each element, URL endpoints were created to request a single resource or a collection as shown on the right of Fig. 1. We used Spring Boot [23], an open-source Java framework, and PostgreSQL, a database management system, for our implementation.
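The aggregation step described above can be illustrated with a minimal sketch. It is not the prototype's code (the prototype was written in Java with Spring Boot and PostgreSQL); the dictionary layout, the property names, and the helper names `canonical_key` and `aggregate` are assumptions made for illustration. Two elements are treated as equivalent only if every property agrees, and each distinct definition is stored once together with the list of forms in which it occurred.

```python
import json


def canonical_key(element: dict) -> str:
    """Serialize all properties deterministically, so two elements share a key
    only if they agree in every property (strict equivalence)."""
    return json.dumps(element, sort_keys=True, ensure_ascii=False)


def aggregate(forms: dict[str, list[dict]]) -> dict[str, dict]:
    """Store each distinct element once and keep track of its original occurrences."""
    store: dict[str, dict] = {}
    for form_id, elements in forms.items():
        for element in elements:
            key = canonical_key(element)
            entry = store.setdefault(key, {"definition": element, "occurrences": []})
            entry["occurrences"].append(form_id)
    return store


# Invented example data: Item B appears unchanged in both forms.
forms = {
    "form-1": [
        {"type": "Item", "name": "Item A", "question": "Body weight", "dataType": "float"},
        {"type": "Item", "name": "Item B", "question": "Pulse", "dataType": "integer"},
    ],
    "form-2": [
        {"type": "Item", "name": "Item B", "question": "Pulse", "dataType": "integer"},
        {"type": "Item", "name": "Item C", "question": "Pulse", "dataType": "float"},
    ],
}

store = aggregate(forms)
for entry in store.values():
    print(entry["definition"]["name"], len(entry["occurrences"]), entry["occurrences"])
```

Note that Item B and Item C share a question text but differ in data type, so strict equivalence keeps them apart. The length of each `occurrences` list is the frequency that a search ranking could then take into account.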

Fig. 1

Medical form in the MDM portal with respective ODM elements and REST API endpoints. A medical form with a single data element Body weight classification in the MDM Portal (left) with the corresponding ODM definitions as a tree structure (center). Association is indicated with the text behind the ODM definitions. Note that the Protocol element is not displayed in the MDM Portal. The right side shows the REST API endpoints of the pragmatic MDR proof of concept implementation to query collections or single resources. The endpoints can be queried with HTTP GET requests and are secured with an API key indicated by the blue fields

Fig. 2

Aggregation of equivalent metadata definitions in the pragmatic MDR proof of concept. Simplified insertion procedure of two ODM models into the pragmatic MDR proof of concept application. Equivalent elements (Item B) and their children (Codelist B) are aggregated while their original occurrences are kept track of. The number of occurrences is used for frequency-based scoring during search

Application of pragmatic MDR proof of concept in two use cases

We applied the pragmatic MDR proof of concept in two use cases to demonstrate its capabilities for reusing metadata. The first use case was a suggestion and reuse mechanism for ItemGroup and Item resources in the study editor ODMEdit [12]. This web-based editor was implemented in R and the suggestion mechanism was realized with JavaScript. To insert a metadata definition into the current working document, ODMEdit processed the JSON response of a specific ItemGroup or Item resource and transformed it into its internal representation format. The second use case was a collaboration with the University Medicine of Greifswald, which coordinates the Study of Health in Pomerania (SHIP), a major epidemiological study in Germany initiated in 1997 to obtain scientifically valid data on factors contributing to a shorter life expectancy in eastern Germany [24]. In prior work, the metadata of SHIP had already been converted to ODM and imported into the MDM Portal [25]. This process included semantic annotation with Unified Medical Language System (UMLS) codes [26] by medical experts. Since this is a laborious process, we wanted to integrate this valuable information into the SHIP database. To this end, we implemented a script that automatically queried all SHIP metadata definitions from the pragmatic MDR and transferred semantic codes into the SHIP data dictionary.
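The second use case can be sketched as follows. The actual script and the JSON layout returned by the REST API are not given in the text, so the field names (`aliases`, `context`, `name`), the identifiers, and the code values below are hypothetical placeholders. The network request is replaced by an in-memory dictionary so that the sketch stays self-contained.

```python
def umls_codes(item: dict) -> list[str]:
    """Collect alias names whose context mentions UMLS (field names are assumed)."""
    return [
        alias["name"]
        for alias in item.get("aliases", [])
        if "UMLS" in alias.get("context", "")
    ]


def transfer_codes(data_dictionary: dict, mdr_items: dict) -> int:
    """Copy semantic codes from repository items into the local data dictionary.

    Returns the number of dictionary entries that received at least one code."""
    updated = 0
    for variable_id, entry in data_dictionary.items():
        codes = umls_codes(mdr_items.get(variable_id, {}))
        if codes:
            entry["umls_codes"] = codes
            updated += 1
    return updated


# Stand-in for the responses of the repository; codes are placeholders, not real UMLS codes.
mdr_items = {
    "var-001": {"question": "Body weight",
                "aliases": [{"context": "UMLS CUI", "name": "C-PLACEHOLDER-1"}]},
    "var-002": {"question": "Free text comment", "aliases": []},
}
data_dictionary = {
    "var-001": {"label": "Body weight"},
    "var-002": {"label": "Comment"},
    "var-003": {"label": "Not in the repository"},
}

updated = transfer_codes(data_dictionary, mdr_items)
print(updated, data_dictionary["var-001"])
```

Entries without a matching repository item or without semantic codes are left untouched, so the transfer can be repeated safely as more items are annotated.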

Evaluation of bottom-up standardization in the pragmatic MDR proof of concept

Our evaluation of bottom-up standardization was two-fold: first, we checked to what extent bottom-up standards emerged in our proof of concept that was filled with the content of the MDM Portal; second, we evaluated the quality of these standards. While the prototype contained different metadata resources, items were used for this evaluation, since they were the smallest building block that is commonly standardized and shared. Moreover, we restricted the evaluation to items with an English question text. To cover a broad spectrum of relevant item definitions for our evaluation, six item concepts were chosen from each of four groups: Clinical Data Acquisition Standards Harmonization (CDASH) vital signs [27], the six most frequent Logical Observation Identifiers Names and Codes (LOINC) codes [28], items related to ischaemic heart disease, and items related to stroke. CDASH vital signs and LOINC codes are common data elements used in medical documentation. Ischaemic heart disease and stroke are the top two global causes of death according to the World Health Organization [29]. For CDASH vital signs and LOINC codes, we used their names to query the pragmatic MDR. To identify important items related to ischaemic heart disease and stroke, we identified relevant medical documentation in the MDM Portal (Footnote 1) and collected all medical concepts (UMLS codes) they contained. We then used the UMLS names of the six most frequent concepts for each disease as search queries. The second column in Table 2 shows the resulting 24 queries.

Table 2 Overview of item definitions for evaluation of bottom-up standardization

To evaluate whether bottom-up standards emerged, we plotted cumulative occurrences of item definitions for each search query. Moreover, we determined the ratio of occurrences of the top three search results compared to all results of a query. The top three search results should make up a considerable amount of all items to be considered as standards. Since our prototype implementation required item definitions to agree in every property to aggregate them, we expected this ratio to be low. A first experiment confirmed this suspicion. Hence, we performed the same analyses with a relaxed equivalence definition that only required the ODM question element in lower case to coincide. We chose the question element because this is the text displayed to users. Since the search mechanism also considered partial matches, this analysis might have included item definitions that were only slightly related to the original medical concept, especially when the search query consisted of several words. Nevertheless, we thought that this analysis could yield insights into the emergence of bottom-up standards. Note that we performed this evaluation 1 year after the initial synchronization, so it was based on a larger amount of content from the MDM Portal.
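The two analyses described above can be sketched in a few lines. The result list, its ranking, and the occurrence counts are invented, and the helper names `top_three_ratio` and `relax` are hypothetical. The sketch assumes results arrive in ranked order and, for the relaxed variant, that merged groups keep the rank of their first member; the text does not specify how merged results were re-ranked.

```python
def top_three_ratio(results: list[dict]) -> float:
    """Share of all occurrences that fall on the three highest-ranked results."""
    total = sum(r["occurrences"] for r in results)
    top = sum(r["occurrences"] for r in results[:3])
    return top / total if total else 0.0


def relax(results: list[dict]) -> list[dict]:
    """Merge results whose question text coincides after lowercasing.

    Dictionaries preserve insertion order, so each merged group keeps the rank
    of its first member (an assumption of this sketch)."""
    merged: dict[str, int] = {}
    for r in results:
        key = r["question"].lower()
        merged[key] = merged.get(key, 0) + r["occurrences"]
    return [{"question": q, "occurrences": n} for q, n in merged.items()]


# Invented search results in ranked order; identical question texts may still
# differ in other properties under strict equivalence.
results = [
    {"question": "Pulse", "occurrences": 5},
    {"question": "pulse", "occurrences": 3},
    {"question": "Pulse rate", "occurrences": 2},
    {"question": "Pulse", "occurrences": 4},
    {"question": "Heart rate / pulse", "occurrences": 1},
    {"question": "PULSE", "occurrences": 5},
    {"question": "Pulse regular?", "occurrences": 2},
    {"question": "Pulse (bpm)", "occurrences": 3},
]

strict_ratio = top_three_ratio(results)
relaxed_ratio = top_three_ratio(relax(results))
print(f"strict: {strict_ratio:.2f}, relaxed: {relaxed_ratio:.2f}")
```

In this toy data the relaxed definition raises the ratio because variants that differ only in letter case collapse into one group, which is the effect the relaxed analysis was designed to expose.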

In the second part of our evaluation, two medical data managers evaluated the quality of the top three bottom-up standards derived for each medical concept query. We selected the top three results for evaluation to have a larger set of test samples. The evaluation was performed with a self-designed questionnaire, which included questions for eight ODM item properties: Question, CodeList, Name, DataType, Length, Description, Alias, RangeCheck. Questions were derived from the definitions in the ODM standard [19]. Moreover, the data managers assessed whether the identified item definitions were a good match for the search query and their relevance for reuse in a case report form. This resulted in ten questions for each item definition. Rating was performed with an ordinal Likert Scale from one to five: strongly disagree (SD), disagree (D), neither agree nor disagree (N), agree (A), strongly agree (SA). We did a test evaluation with different item definitions and used the feedback to design the final evaluation questionnaires. We generated descriptive statistics for the ratings of each evaluator and calculated Krippendorff's alpha coefficient with bootstrap confidence intervals as a statistical measure for interrater agreement [30]. All analysis methods were determined a priori in a study protocol. For the final analysis, the color maps and export methods of the heat maps were adjusted slightly to account for correct formatting. Evaluation questionnaires and our study protocol are available as Additional files 1 and 2.
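For readers unfamiliar with the agreement measure, the following sketch computes Krippendorff's alpha with the ordinal distance metric from a coincidence matrix. It follows the standard textbook definition and is not the authors' analysis code; the function name is hypothetical, the mapping of the Likert labels to 1 to 5 is an assumption, and the bootstrap confidence interval is omitted. Undefined ratings can be handled by leaving them out of a unit's list.

```python
import math
from collections import Counter
from itertools import permutations


def krippendorff_alpha_ordinal(units: list[list[int]], categories: list[int]) -> float:
    """Krippendorff's alpha for ordinal data.

    units: one list per rated object, holding the ratings it received.
    categories: all possible values in ascending rank order."""
    # Coincidence matrix: every ordered pair of ratings within a unit,
    # weighted by 1 / (number of ratings in that unit - 1).
    coincidences: Counter = Counter()
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue  # a single rating cannot be paired
        for a, b in permutations(ratings, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    n_c = {c: sum(coincidences[(c, k)] for k in categories) for c in categories}
    n = sum(n_c.values())

    def delta_sq(c: int, k: int) -> float:
        # Ordinal metric: squared number of ranks between c and k.
        lo, hi = sorted((categories.index(c), categories.index(k)))
        between = sum(n_c[categories[g]] for g in range(lo, hi + 1))
        return (between - (n_c[c] + n_c[k]) / 2.0) ** 2

    observed = sum(w * delta_sq(c, k) for (c, k), w in coincidences.items())
    expected = sum(n_c[c] * n_c[k] * delta_sq(c, k) for c in categories for k in categories)
    if expected == 0:
        return math.nan  # all ratings in one category: alpha is undefined
    return 1.0 - (n - 1) * observed / expected


likert = [1, 2, 3, 4, 5]  # SD, D, N, A, SA (assumed numeric coding)
perfect = krippendorff_alpha_ordinal([[1, 1], [2, 2], [4, 4], [5, 5]], likert)
opposed = krippendorff_alpha_ordinal([[1, 2], [2, 1]], likert)
print(perfect, opposed)
```

A value of 1 indicates perfect agreement, 0 indicates agreement at chance level, and negative values indicate systematic disagreement.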

Results

Pragmatic MDR proof of concept implementation

Figure 3 shows a schematic overview of the resulting proof of concept application. Initially, 15,306 medical forms in ODM format were transferred to the pragmatic MDR. New forms that were uploaded to the portal were synchronized automatically (1). Figure 3 contains a table showing total and unique counts for the resulting resources in the pragmatic MDR. There were fewer unique resources because equivalent metadata definitions were aggregated. Reuse indicates the ratio of total and unique resources, i.e. it shows the average number of equivalent definitions. In total, the pragmatic MDR contained 853,445 metadata definitions of which 466,569 were unique. Most resources belonged to the type Item and CodeListItem with 387,977 and 286,344 elements. Together with MeasurementUnit and CodeList they had the highest reuse ratio. The REST API can be queried manually and responds in JSON format (2). The depicted example query illustrates a request to the item endpoint with a single parameter query (Footnote 2). The main purpose of the API is to enable the integration into applications that query and reuse metadata definitions in an automatic fashion (3). We demonstrated this for the study editor ODMEdit and the SHIP data dictionary. Medical metadata created with these applications or metadata from external sources can be shared via the MDM Portal (4). In this way, a feedback loop is established in which reused metadata definitions are shared again and can contribute to bottom-up standardization.
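As a worked example of the reuse ratio defined above, the overall figures quoted in this paragraph can be plugged in directly. The variable names are ours; the per-resource ratios are listed in Fig. 3 and are not recomputed here.

```python
total_definitions = 853_445   # all metadata definitions after the initial import
unique_definitions = 466_569  # definitions remaining after aggregation

# Reuse: average number of equivalent definitions per unique definition.
reuse = total_definitions / unique_definitions
# Share of definitions that were duplicates of an already stored definition.
aggregated_share = 1 - unique_definitions / total_definitions

print(f"reuse ratio: {reuse:.2f}")
print(f"aggregated share: {aggregated_share:.1%}")
```

On average, each unique definition therefore stood for slightly fewer than two definitions in the imported forms.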

Fig. 3

Schematic overview and content of pragmatic MDR proof of concept implementation. This overview shows the content of the pragmatic MDR proof of concept for each ODM element after the initial synchronization with the MDM Portal. New content in the MDM Portal is automatically transferred to the pragmatic MDR (1) and the portal also serves as a frontend to share new metadata definitions (4). The REST API can be queried manually (2) or it can be integrated into applications that query data automatically, e.g. for reuse of metadata definitions (3)

Application of pragmatic MDR proof of concept in two use cases

For the first use case, we integrated the pragmatic MDR into the study editor ODMEdit [12] as a suggestion mechanism to explore existing metadata definitions for ItemGroup and Item resources. Moreover, it was possible to reuse complete definitions and integrate them into the current working document. This allowed users to assess and directly reuse 39,518 unique ItemGroup and 234,766 unique Item definitions within ODMEdit. Usage statistics showed that 955 ItemGroup and 881 Item resources were reused during a 9-month test period with medical experts creating medical documentation for the MDM Portal [13]. In the second use case, we integrated the pragmatic MDR prototype into the SHIP data dictionary [24]. A script queried the pragmatic MDR REST API with unique item identifiers to retrieve semantic coding that was added by medical experts in the MDM Portal. During this process, semantic codes were transferred for 4608 data elements. To our knowledge, this is one of the largest efforts to exchange metadata between different institutions in an automatic fashion. Medical experts need on average 1 min to code a single item [31], so this transfer saved approximately 77 h of work (4608 min / 60 ≈ 76.8 h).

Evaluation of bottom-up standardization in the pragmatic MDR proof of concept

We analyzed 24 item concepts from four different categories for our evaluation of bottom-up standardization. The search query for each concept along with the total number of search results is given in the second column of Table 2. Since the search mechanism also took partial matches into account, albeit with a lower score, queries with several words tended to return more results. The third and fourth columns in Table 2 contain the ODM Question property and the number of occurrences of the top three search results. Note that the Question property could be the same across different items when these data elements differed in other properties (see concept Sodium). Moreover, the search mechanism used a combination of frequency and query matching, hence the first result did not necessarily have the most occurrences (see concept Hemoglobin). In this case, the first item definition was a better match for the search query, which led to a higher score even though it had fewer occurrences. The last two columns contain the question and occurrences for a relaxed equivalence definition that only required the question texts in lowercase to coincide. Item definitions for the quality evaluation are given in Additional file 4.
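The ranking behaviour noted for Hemoglobin can be reproduced with a toy scoring function. The exact formula of the custom Solr search strategy is not given in the text, so the additive combination, the `weight` parameter, and all numbers below are assumptions chosen only to show how a logarithmic frequency term lets a better text match outrank a more frequent definition.

```python
import math


def combined_score(match_score: float, occurrences: int, weight: float = 0.5) -> float:
    """Hypothetical tradeoff: text relevance plus a damped frequency bonus.

    The logarithm keeps very frequent definitions from dominating the ranking."""
    return match_score + weight * math.log(occurrences)


# Invented candidates: (question text, text match score, number of occurrences).
candidates = [
    ("Hemoglobin A1c measurement", 1.5, 40),
    ("Hemoglobin", 3.0, 10),
]

ranked = sorted(candidates, key=lambda c: combined_score(c[1], c[2]), reverse=True)
for question, match, occurrences in ranked:
    print(f"{question}: {combined_score(match, occurrences):.2f} ({occurrences} occurrences)")
```

With these invented values the exact match ranks first although it has a quarter of the occurrences. A larger `weight` would shift the balance towards frequency.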

Figure 4 shows the cumulative occurrences of the search results for each item concept query. Consider, for example, the item query “Pulse” in plot (a): there was one item definition that occurred 28 times, there were three definitions that occurred at least 23 times, there were 10 definitions that occurred at least 17 times, and there were 446 definitions that occurred at least once, i.e. 446 definitions in total. This was a common trend across all item concepts. Few item definitions occurred very often, but there were a lot of definitions that occurred only once or twice. For ischaemic heart disease and stroke-related concepts, the most frequent definitions had fewer occurrences than those for CDASH vital signs or LOINC codes. Item concept queries consisting of several words led to many search results and several very frequent item definitions because the search mechanism also included partial matches. Hence, for instance, the query “Coronary heart disease” also returned all item definitions that contained the term “disease”. The ratios of occurrences for the top three search results compared to all results were 2.07 ± 2.06% for CDASH vital signs, 4.89 ± 3.30% for LOINC codes, 0.49 ± 0.61% for ischaemic heart disease, and 2.42 ± 2.15% for stroke. We repeated the same analyses with a relaxed equivalence definition that only required the ODM question element in lower case to coincide. Figure 5 shows the corresponding plots of cumulative occurrences, which reveal an increased number of definitions with many occurrences. With the relaxed definition, the ratios of occurrences for the top three results were 8.98 ± 11.83%, 17.13 ± 8.90%, 1.02 ± 1.30%, and 3.68 ± 3.52% for the four categories in the same order. Looking at the absolute number of occurrences in Table 2, we can observe that for CDASH vital signs and LOINC codes most definitions had many occurrences. This effect increased with the relaxed equivalence definition. For ischaemic heart disease and stroke-related concepts, however, there were only a few definitions with many occurrences and the top three results often had only one or few occurrences even when clustered by the question text.
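The way the cumulative plots are read can be made concrete with a small sketch. The occurrence counts below are invented so that they reproduce the four values quoted for the query “Pulse”; the real distribution is only available in Fig. 4, and the function name is hypothetical.

```python
def definitions_with_at_least(counts: list[int], k: int) -> int:
    """Number of item definitions that occur at least k times."""
    return sum(1 for c in counts if c >= k)


# Invented counts: ten frequent definitions followed by a long tail of singletons.
pulse_counts = [28, 25, 23, 22, 21, 20, 19, 18, 17, 17] + [1] * 436

for threshold in (28, 23, 17, 1):
    n_definitions = definitions_with_at_least(pulse_counts, threshold)
    print(f"at least {threshold:>2} occurrences: {n_definitions} definitions")
```

The long tail of definitions that occur only once is what keeps the top-three ratios low even when the most frequent definitions are used dozens of times.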

Fig. 4

Plots of cumulative occurrences for 24 analyzed item concepts. Cumulative occurrences of all items in the pragmatic MDR proof of concept were generated with the item concept queries in Table 2. Each plot contains six item concepts from one category. a CDASH vital signs. b Most frequent LOINC codes. c Most frequent ischaemic heart disease-related UMLS concepts from MDM Portal. d Most frequent stroke-related UMLS concepts from MDM Portal

Fig. 5

Plots of cumulative occurrences for 24 analyzed item concepts with equivalence on question level. Cumulative occurrences of all items in the pragmatic MDR proof of concept were generated with the item concept queries in Table 2. In contrast to Fig. 4, item concepts were clustered based on their lowercase question text. Each plot contains six item concepts from one category. a CDASH vital signs. b Most frequent LOINC codes. c Most frequent ischaemic heart disease-related UMLS concepts from MDM Portal. d Most frequent stroke-related UMLS concepts from MDM Portal

Results of the quality evaluation are summarized in Table 3. It contains an overview of responses for both raters and each item property. Since the ODM standard defines CodeList, Length, Description, Alias, and RangeCheck as optional attributes, some of these properties were undefined and could not be assessed (column Undefined). A single response of rater A for the description property was invalid, which we treated as Undefined. The last row summarizes all responses of both raters. Moreover, median values are highlighted in bold. We can observe that RangeCheck, CodeList, and Description properties were missing very often. For CodeList this was due to the fact that some items did not offer a value list for selection. On the other hand, only four Alias properties were missing, indicating a high coverage of semantic codes among top search results. The overall rating of the item definitions was positive (median for both raters and all responses is A). Responses of rater A were slightly more positive than those of rater B with one median value for N, seven for A, and three for SA compared to one median value for D, three for N, five for A, and one for SA. Interrater agreement could be considered fair (Kalpha = 0.50, 95% CI 0.43–0.56) [32] (Additional file 5 provides a contingency table for rater agreement). Item properties Question, DataType, and Alias were rated higher than CodeList, Description, and the item’s relevance. Differences between item categories, i.e. concepts 1–6, 7–12, 13–17, and 18–24, were very small (see Additional file 6).

Table 3 Responses for quality evaluation of bottom-up standards

Discussion

Heterogeneity of medical metadata hampers bottom-up standardization

We investigated the concept of bottom-up standardization in a pragmatic MDR that imported at least 15,306 medical forms, identified equivalent definitions, and scored them according to their number of occurrences. To evaluate the emergence of bottom-up standards, we considered 24 important medical concepts, analyzed cumulative occurrences of related items, and determined absolute numbers and ratios of occurrences of the top three search results for each concept. However, plots of cumulative occurrences took into account all related items and they were skewed by partial matches of the search mechanism, so we consider them less relevant. Analysis of the top three search results, on the other hand, was more specific, because these item definitions received the best tradeoff between query matching and the number of occurrences. Hence, in the following, we focus on absolute numbers and ratios of occurrences.

For ischaemic heart disease and stroke concepts, the occurrence ratio and absolute occurrences showed that no clear bottom-up standards emerged in the pragmatic MDR prototype. There were usually more than a thousand total search results, but many of the most frequent definitions only had one or few occurrences. While this effect decreased a little bit with the relaxed equivalence definition using only the question text, we would not call the results clear bottom-up standards. Hence, as a main result of our analysis, we can conclude that there exists an enormous heterogeneity of metadata for medical concepts for diseases. This is consistent with previous work that showed a strong need for metadata harmonization to generate disease-specific common data elements [33, 34].

The situation was different for CDASH vital signs and LOINC codes. The latter already showed a considerable occurrence ratio with strong item equivalence, which increased to 17.13 ± 8.90% when applying equivalence only on question level. That means for LOINC codes three bottom-up standardized questions represented on average 17.13% of all questions that matched the respective search query. In addition to that, the absolute numbers of occurrences were also very high. Hence, we conclude that for laboratory values our proof of concept was able to determine bottom-up standards. While for CDASH vital signs the occurrence ratios were not as high as for LOINC codes, probably due to higher numbers of total search results, we think the ratio for the relaxed equivalence definition of 8.98 ± 11.83% in combination with a high absolute number of occurrences suggests that weak bottom-up standards emerged.

This discrepancy between the medical categories probably stems from a lower medical complexity: it is easier to agree on data elements to collect vital signs or laboratory values than information on a complex medical condition. However, we are also convinced that for CDASH vital signs and LOINC codes there is still much room for improvement. For instance, consider the concept Body Temperature with the relaxed equivalence criteria (Table 2). The second and third search results had only one and two occurrences, which is unlikely to reflect the heterogeneity of collecting the body temperature. Moreover, the bottom-up standards for CDASH vital signs and LOINC codes were very simplistic. There were no complex question texts in the top three search results since it is probably much harder to agree on those.

Our self-designed quality evaluation of bottom-up standards showed an overall fair agreement for good item quality. However, we think this evaluation has only weak validity since there were only two raters and the questions were derived from the ODM standard, which was the format of the original data. Moreover, there might be different use cases for item definitions that were not well reflected in our questionnaire. We conclude that our evaluation gives a hint that our proof of concept can offer useful item definitions for certain scenarios even when bottom-up standards might not emerge.

A pragmatic MDR can facilitate reusing and sharing of metadata

Reusing medical metadata saves costs in the creation of medical documentation and fosters harmonized data collections [7]. In contrast to existing MDRs, a pragmatic MDR usually offers a larger variety of different metadata definitions for the same medical concept. This allows users to choose a definition from several suggestions, which can facilitate metadata reuse. We demonstrated this for the study editor ODMEdit [12], but external applications are also possible [35].

It is common today that designers of medical information systems do not publish their documentation [36]. Sharing metadata in a pragmatic MDR should only require its occurrence in a real-world data collection; all data processing and bottom-up standardization should be performed automatically. Hence, the sharing process can be simplified to a file upload as we have realized it for our prototype. By reducing the effort to publish medical documentation, a pragmatic MDR might increase the amount of shared metadata. Furthermore, due to the simple policy for metadata sharing, a pragmatic MDR can be used to transfer metadata. We have demonstrated this in the second use case: we reused SHIP metadata even though these definitions did not necessarily emerge as bottom-up standards.

Bottom-up versus top-down approach for metadata standardization

We discuss some theoretical considerations of bottom-up and top-down standardization not verified in this study to outline key differences and to give an idea where each concept might be advantageous. Bottom-up standards should be the most frequent definitions of a medical concept in a collection of real-world metadata, which usually indicates that they were already used in many settings and are well accepted. Second, since many existing data collections already use this data element, reusing it leads to compatible data collections. Third, bottom-up standardization automatically adapts to changes since shared documentation directly shows up in a pragmatic MDR and influences the scoring mechanism. In addition to that, automatic processing of shared metadata can yield a more neutral scoring of data elements and reduces standardization costs. Lastly, bottom-up standardization offers several candidate definitions, which might be better suited to reflect the heterogeneity of medical documentation. However, this data-driven approach requires a large amount of shared metadata definitions and bottom-up standards highly depend on the data quality. In our feasibility study, we simulated this process with forms from the MDM Portal, which are curated by medical experts. Besides, frequency alone cannot measure the quality of a data element. Due to the large number of different data elements for important medical concepts, many high-quality definitions with few usages will receive a low score and, hence, will be difficult to find in a pragmatic MDR.

The top-down approach, on the other hand, offers full control to define a single source of truth for data collections, which is necessary to enforce guidelines for semantic interoperability. The quality of these top-down metadata definitions depends on the expertise and opinion of experts. Certainly, real-world examples will be considered before agreeing on a definition, but the decision is probably driven by a data consumer perspective, which demands as much data as possible in a highly structured way, resulting in more complex metadata. Data producers, in contrast, want to reduce their effort for data collection and might choose a simpler variant. Moreover, top-down standardization is a manual process, which increases costs. In our opinion, top-down MDRs can be advantageous when the scope of an MDR is small, there is high agreement among experts on metadata definitions, or full control of the data is necessary. In contrast, the pragmatic bottom-up concept could be of value when the scope of an MDR is broad and no single ground truth definition exists or is necessary. Due to these different characteristics, bottom-up standardization is unlikely to replace top-down approaches. However, depending on the application, it could serve as a useful complement.

Pragmatic MDR and ISO/IEC 11179 norm for metadata registries

ISO/IEC 11179 specifies a conceptual model for MDRs and metadata representation, which includes a data element definition, conceptual domain, value domain, and data element concept [9]. Such a rigorous data model can improve data definitions, collection guidelines, and quality, which ultimately improves the overall data quality of medical data collections [37]. Extensive content curation is necessary to ensure adherence to this data model. In practice, ISO/IEC 11179 compliant MDRs try to fulfill this data model through their top-down approach for standardization. However, an evaluation of caCORE in 2006 identified several limitations concerning inconsistent, insufficient, and redundant content [15]. This evaluation demonstrates the intrinsic difficulty of maintaining a consistent and complete ISO/IEC 11179 compliant MDR.

For a pragmatic MDR, ISO/IEC 11179 is not well suited, because deriving the conceptual domain and value domain automatically is hard, which would impede automatic data processing. For a pragmatic MDR, it is therefore preferable to use a more relaxed data model to simplify data sharing and obtain a large collection of real-world metadata definitions. Our proof of concept implementation fulfills ISO/IEC 11179 in part, due to the properties of ODM [38]. Using ISO/IEC 11179 for an MDR can be advantageous in situations similar to those that favor the top-down standardization approach: when the scope is small and there is high agreement among experts on metadata definitions. However, for the pragmatic MDR concept, the data model is usually too rigorous.
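The contrast between a rigorous and a relaxed data model can be sketched with a few Python data classes. This is a strongly simplified illustration under our own assumptions: it is neither the ISO/IEC 11179 metamodel nor the data model of our prototype, and all class names, field names, and example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConceptualDomain:
    """Set of meanings a value may take, independent of representation."""
    name: str


@dataclass
class ValueDomain:
    """Concrete representation (data type, permitted values) of a conceptual domain."""
    conceptual_domain: ConceptualDomain
    data_type: str
    permitted_values: list = field(default_factory=list)


@dataclass
class DataElementConcept:
    """What is being described, without any representation."""
    name: str
    conceptual_domain: ConceptualDomain


@dataclass
class RegisteredDataElement:
    """Simplified rigorous element: every linked part has to be curated."""
    definition: str
    concept: DataElementConcept
    value_domain: ValueDomain


@dataclass
class RelaxedDataElement:
    """Relaxed element that can be filled directly from a form item."""
    question: str
    data_type: str
    code_list: list = field(default_factory=list)
    semantic_code: Optional[str] = None  # may be missing in real-world forms


# Made-up example: the same item in both models.
sex_domain = ConceptualDomain("Sex")
registered = RegisteredDataElement(
    definition="Sex of the patient",
    concept=DataElementConcept("Patient sex", sex_domain),
    value_domain=ValueDomain(sex_domain, "text", ["female", "male"]),
)
relaxed = RelaxedDataElement("Sex", "text", ["female", "male"])
```

The relaxed element only needs information that is present in the form itself, whereas the rigorous variant additionally requires a conceptual domain and a data element concept that someone has to assign and keep consistent.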

Limitations

Our definition of a pragmatic MDR along three principles is relatively loose. For example, we do not specify how open access should be ensured, nor do we clearly define frequency-based scoring. It is arguable whether our proof of concept implementation satisfies the third principle of open access. At present, sharing of medical documentation is implemented only through the MDM Portal to exploit its data as initial content and to offer a graphical interface for sharing. Querying and reusing metadata definitions is secured with API keys. However, the necessary keys are provided to all interested parties on request. Moreover, our prototype is limited to ODM metadata; its internal data model and API are derived from this standard. In addition, due to the content of the MDM Portal, we can ensure that our proof of concept is based on real-world metadata definitions, but it remains open how to ensure this in other settings.

The proposed bottom-up concept of ranking data elements according to their frequency in real-world data collections is certainly a rather simplistic approach to determine their quality and relevance for reuse. Yet, we considered it worthwhile to study its potential to establish an ordering in a large collection of metadata. Also, we performed our quality evaluation of frequent item definitions with only two evaluators, who showed fair agreement, and used self-designed questionnaires that were derived from the ODM standard [19]. Our evaluation does not explicitly measure the reliability, validity, and economic factors of a medical item definition. Hence, our evaluation results should be interpreted with caution. We only considered 24 important item concepts from four different categories. The quality of less relevant item definitions might be worse. The same applies to our analysis of cumulative frequencies as well as occurrence ratios and absolute occurrences of the top three search results. While our analysis suggests that certain definitions emerge as de facto standards, it is not clear whether this also holds for other item concepts.

Lastly, our overview of existing MDRs and the ISO/IEC 11179 norm is limited. These infrastructures have different guidelines for their content that we subsumed as top-down standardization. This is certainly an oversimplification. We did this to contrast them with our approach for bottom-up standardization.

Conclusions

In this study, we suggested the pragmatic MDR concept, which enables a community-driven bottom-up approach for standardization of medical metadata. In contrast to existing MDRs, it is based on a large collection of real-world metadata and uses frequency-based scoring of data elements as a proxy for their quality. We successfully implemented a proof of concept application and filled it with 466,569 unique metadata definitions from the MDM Portal. Applications of this prototype in two use cases suggest that it can facilitate the reuse of metadata. Moreover, our analysis for the existence of bottom-up standards showed that for rather simple medical concepts such as laboratory values and vital signs at least weak bottom-up standards emerged in our prototype. For more diverse concepts related to ischaemic heart disease or stroke, such standards could not be determined due to an enormous heterogeneity of data elements. Our evaluation of metadata quality suggests that our proof of concept can offer useful item definitions. In our opinion, a pragmatic MDR is a useful concept alongside existing top-down MDRs that simplifies standardization, gives a broader overview of existing metadata definitions, and offers standards derived from real-world documentation. We think it has potential to facilitate the reuse of data elements during the creation of case report forms and routine documentation.

Future work should consider refining the equivalence criteria used for metadata aggregation, as we have done with the data element’s question text. In previous work, semantic coding [33, 34, 39], existing common data elements [11], or natural language processing [40] were used to identify equivalent data elements. Such criteria would increase the reuse ratio and shorten the long tail of rare metadata definitions. Moreover, it would be of interest to consider a pragmatic MDR that is not restricted to the content of the MDM Portal and allows metadata sharing from different sources. One possibility to realize this is a public API that also supports sharing and deleting metadata definitions. Most importantly, future work should investigate the utility of the pragmatic MDR concept, for example, by connecting an application to the REST API of our prototype. There are various usage scenarios, such as suggestion mechanisms [35] or automatic semantic coding [41], and further experience is necessary to assess the value of this concept.
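To illustrate what a simple text-based equivalence criterion could look like, the following Python sketch groups data elements whose question texts become identical after normalization. It is a hypothetical example and not the aggregation logic of our prototype; the function names `normalize_question` and `group_equivalent` and the sample questions are assumptions.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_question(text):
    """Lower-case, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def group_equivalent(questions):
    """Group question texts that are identical after normalization."""
    groups = defaultdict(list)
    for question in questions:
        groups[normalize_question(question)].append(question)
    return dict(groups)


# Made-up question texts, for illustration only.
questions = ["Heart rate", "heart rate:", "Heart  Rate", "Pulse"]
groups = group_equivalent(questions)
for key, members in groups.items():
    print(key, "<-", members)
```

The three spelling variants fall into one group, which shortens the long tail, whereas a synonym such as "Pulse" remains separate. Merging synonyms would require one of the approaches named above, such as semantic codes or natural language processing.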

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Notes

  1. We queried the MDM Portal with "Ischemic heart disease" OR "Coronary heart disease" OR "heart attack" and "Stroke".

  2. Additional file 3 contains more query examples. Further endpoints and parameters are available and documented on the start page of the pragmatic MDR (https://medical-data-models.org/MDR/).

Abbreviations

A: Agree
CDASH: Clinical Data Acquisition Standards Harmonization
CDISC: Clinical Data Interchange Standards Consortium
D: Disagree
LOINC: Logical Observation Identifiers Names and Codes
MDM Portal: Portal of Medical Data Models
MDR: Metadata Repository
N: Neither agree nor disagree
Occ: Occurrences
ODM: Operational Data Model
SA: Strongly agree
SD: Strongly disagree
SHIP: Study of Health in Pomerania
UMLS: Unified Medical Language System

References

  1. Dugas M. Clinical research informatics: recent advances and future directions. Yearb Med Inform. 2015;10:174–7. https://doi.org/10.15265/IY-2015-010.

  2. Lehne M, Sass J, Essenwanger A, Schepers J, Thun S. Why digital medicine depends on interoperability. NPJ Digit Med. 2019;2:1–5.

  3. Kush RD, Warzel D, Kush MA, Sherman A, Navarro EA, Fitzmartin R, et al. FAIR data sharing: the roles of common data elements and harmonization. J Biomed Inform. 2020;107:103421.

  4. Liaw S-T, Guo JGN, Ansari S, Jonnagaddala J, Godinho MA, Borelli AJ, et al. Quality assessment of real-world data repositories across the data life cycle: a literature review. J Am Med Inform Assoc. 2021. https://doi.org/10.1093/jamia/ocaa340.

  5. Klann JG, Weber GM, Estiri H, Moal B, Avillach P, Hong C, et al. Validation of an internationally derived patient severity phenotype to support COVID-19 analytics from electronic health record data. J Am Med Inform Assoc. 2021. https://doi.org/10.1093/jamia/ocab018.

  6. He J, Baxter SL, Xu J, Xu J, Zhou X, Zhang K. The practical implementation of artificial intelligence technologies in medicine. Nat Med. 2019;25:30–6. https://doi.org/10.1038/s41591-018-0307-0.

  7. Beresniak A, Schmidt A, Proeve J, Bolanos E, Patel N, Ammour N, et al. Cost-benefit assessment of using electronic health records data for clinical research versus current practices: contribution of the Electronic Health Records for Clinical Research (EHR4CR) European Project. Contemp Clin Trials. 2016;46:85–91. https://doi.org/10.1016/j.cct.2015.11.011.

  8. Marco D, Jennings M. Universal Meta Data Models. New York: Wiley; 2004.

  9. Information technology—Metadata registries (MDR)—Part 3: registry metamodel and basic attributes. 3rd ed. Final Committee Draft ISO/IEC FCD11179-3. 2010.

  10. Redeker NS, Anderson R, Bakken S, Corwin E, Docherty S, Dorsey SG, et al. Advancing symptom science through use of common data elements. J Nurs Scholarsh. 2015;47:379–88. https://doi.org/10.1111/jnu.12155.

  11. Huser V, Amos L. Analyzing real-world use of research common data elements. AMIA Annu Symp Proc. 2018;2018:602–8.

  12. Dugas M, Meidt A, Neuhaus P, Storck M, Varghese J. ODMedit: uniform semantic annotation for data integration in medicine based on a public metadata repository. BMC Med Res Methodol. 2016;16:65. https://doi.org/10.1186/s12874-016-0164-9.

  13. Dugas M, Neuhaus P, Meidt A, Doods J, Storck M, Bruland P, Varghese J. Portal of medical data models: information infrastructure for medical research and healthcare. Database. 2016;2016:bav121. https://doi.org/10.1093/database/bav121.

  14. National Cancer Institute (NIH). Cancer Data Standards Registry and Repository (caDSR) Wiki. https://wiki.nci.nih.gov/display/caDSR. Accessed 3 Mar 2021.

  15. Nadkarni PM, Brandt CA. The Common Data Elements for cancer research: remarks on functions and structure. Methods Inf Med. 2006;45:594–601.

  16. Davies J, Gibbons J, Harris S, Crichton C. The CancerGrid experience: metadata-based model-driven engineering for clinical trials. Sci Comput Program. 2014;89:126–43.

  17. Stohr MR, Helm G, Majeed RW, Gunther A. CoMetaR: a collaborative metadata repository for biomedical research networks. Stud Health Technol Inform. 2017;245:1337.

  18. Kadioglu D, Breil B, Knell C, Lablans M, Mate S, Schlue D, et al. Samply.MDR—a metadata repository and its application in various research networks. Stud Health Technol Inform. 2018;253:50–4.

  19. Clinical Data Interchange Standards Consortium (CDISC). Operational Data Model (ODM)-XML. https://www.cdisc.org/standards/data-exchange/odm. Accessed 3 Mar 2021.

  20. Huser V, Sastry C, Breymaier M, Idriss A, Cimino JJ. Standardizing data exchange for clinical research protocols and case report forms: an assessment of the suitability of the Clinical Data Interchange Standards Consortium (CDISC) Operational Data Model (ODM). J Biomed Inform. 2015;57:88–99. https://doi.org/10.1016/j.jbi.2015.06.023.

  21. The Apache Software Foundation. Apache Solr. https://lucene.apache.org/solr/. Accessed 3 Mar 2021.

  22. Fielding RT, Taylor RN. Architectural styles and the design of network-based software architectures: University of California, Irvine Doctoral dissertation; 2000.

  23. Pivotal Software. Spring Boot Framework. https://spring.io/projects/spring-boot. Accessed 3 Mar 2021.

  24. Völzke H, Alte D, Schmidt CO, Radke D, Lorbeer R, Friedrich N, et al. Cohort profile: the study of health in Pomerania. Int J Epidemiol. 2010;40:294. https://doi.org/10.1093/ije/dyp394.

  25. Hegselmann S, Gessner S, Neuhaus P, Henke J, Schmidt CO, Dugas M. Automatic conversion of metadata from the study of health in Pomerania to ODM. Stud Health Technol Inform. 2017;236:88–96.

  26. Amos L, Anderson D, Brody S, Ripple A, Humphreys BL. UMLS users and uses: a current overview. J Am Med Inform Assoc. 2020;27:1606–11.

  27. Gaddale JR. Clinical Data Acquisition Standards Harmonization importance and benefits in clinical data management. Perspect Clin Res. 2015;6:179–83. https://doi.org/10.4103/2229-3485.167101.

  28. McDonald CJ, Huff SM, Suico JG, Hill G, Leavelle D, Aller R, et al. LOINC, a universal standard for identifying laboratory observations: a 5-year update. Clin Chem. 2003;49:624–33. https://doi.org/10.1373/49.4.624.

  29. World Health Organisation (WHO). The top 10 causes of death. https://www.who.int/news-room/fact-sheets/detail/the-top-10-causes-of-death. Accessed 3 Mar 2021.

  30. Zapf A, Castell S, Morawietz L, Karch A. Measuring inter-rater reliability for nominal data—which coefficients and confidence intervals are appropriate? BMC Med Res Methodol. 2016;16:93. https://doi.org/10.1186/s12874-016-0200-9.

  31. Varghese J, Sandmann S, Dugas M. Web-based information infrastructure increases the interrater reliability of medical coders: quasi-experimental study. J Med Internet Res. 2018;20:e274. https://doi.org/10.2196/jmir.9644.

  32. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–74.

  33. Holz C, Kessler T, Dugas M, Varghese J. Core data elements in acute myeloid leukemia: a unified medical language system-based semantic analysis and experts’ review. JMIR Med Inform. 2019;7:e13554. https://doi.org/10.2196/13554.

  34. Kentgen M, Varghese J, Samol A, Waltenberger J, Dugas M. Common data elements for acute coronary syndrome: analysis based on the unified medical language system. JMIR Med Inform. 2019;7:e14107. https://doi.org/10.2196/14107.

  35. Vengadeswaran A, Neuhaus P, Hegselmann S, Storf H, Kadioglu D. Semantically annotated metadata: interconnecting samply.MDR and MDM-Portal. Stud Health Technol Inform. 2019;267:86–92. https://doi.org/10.3233/SHTI190810.

  36. Dugas M, Jöckel KH, Friede T, Gefeller O, Kieser M, Marschollek M, et al. Memorandum “Open Metadata.” Methods Inf Med. 2015;54:376–8.

  37. Stausberg J, Lobe M, Verplancke P, Drepper J, Herre H, Loffler M. Foundations of a metadata repository for databases of registers and trials. Stud Health Technol Inform. 2009;150:409–13.

  38. Ngouongo SM, Löbe M, Stausberg J. The ISO/IEC 11179 norm for metadata registries: Does it cover healthcare standards in empirical research? J Biomed Inform. 2013;46:318–27. https://doi.org/10.1016/j.jbi.2012.11.008.

  39. Luo Z, Miotto R, Weng C. A human–computer collaborative approach to identifying common data elements in clinical trial eligibility criteria. J Biomed Inform. 2013;46:33–9. https://doi.org/10.1016/j.jbi.2012.07.006.

  40. Elghafari A, Finkelstein J. Automated identification of common disease-specific outcomes for comparative effectiveness research using ClinicalTrials.gov: algorithm development and validation study. JMIR Med Inform. 2021;9:e18298. https://doi.org/10.2196/18298.

  41. Christen V, Groß A, Rahm E. A reuse-based annotation approach for medical documents. In: International Semantic Web Conference. 2016. p. 135–50.

Acknowledgements

Not applicable.

Funding

Open Access funding enabled and organized by Projekt DEAL. This work was supported by German Research Foundation (Deutsche Forschungsgemeinschaft, DFG Grants DU 352/11-1, DU 352/11-2) and Open Access Publication Fund of University of Münster. The funding sources had no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.

Author information

Contributions

SH and MD conceived of the presented concept. SH wrote the manuscript in consultation with MS and MD. SH implemented the proof of concept application. SH and PN implemented the first use case. JH and COS implemented the second use case. SH, SG, JV, SB, and BS designed the evaluation. SH and MS performed the statistical analysis. SG, JV, PB, AM, CM, SR, and MD created the data for the proof of concept application. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Stefan Hegselmann.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. Questionnaires for the quality evaluation of bottom-up standards.

Additional file 2. Study protocol for the quality evaluation of bottom-up standards.

Additional file 3. Table with example requests for the REST API of the pragmatic MDR proof of concept.

Additional file 4. Overview of item definitions for the quality evaluation of bottom-up standards.

Additional file 5. Contingency table and heat map for rater agreement.

Additional file 6. Heat map for median ratings for each item concept across both raters and top three search results.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Hegselmann, S., Storck, M., Gessner, S. et al. Pragmatic MDR: a metadata repository with bottom-up standardization of medical metadata through reuse. BMC Med Inform Decis Mak 21, 160 (2021). https://doi.org/10.1186/s12911-021-01524-8

  • DOI: https://doi.org/10.1186/s12911-021-01524-8

Keywords