Fig. 2 | BMC Medical Informatics and Decision Making

Fig. 2

From: De-identified Bayesian personal identity matching for privacy-preserving record linkage despite errors: development and validation

Linkage techniques between two hypothetical organizations, with and without the use of direct identifiers. The primary objective is to link data from two (e.g. healthcare) organizations A and B, so that information about a given person can be related. The secondary objective is to do so for research in a way that researchers cannot see identifiable data. All methods shown achieve these objectives. The tertiary objective is to minimize or eliminate handling of direct identifiers during the process of linkage, and more generally to minimize the ability of any participating organization to learn facts about any person that they did not already know. Information is colour-coded according to Fig. 1. All methods using de-identified linkage (right-hand column) require the ability to match people without the use of direct (plaintext) identifiers. This is simple with a shared unique identifier (e.g. NHS number) but more difficult without one; the present study develops a technique for that case.
A Direct linkage. Organization A sends identity information (I) but not health information to B, tagging every person with a research ID (X) of no meaning to anyone else. A’s data, and the subset of B’s for people who match, are de-identified and linked for research.
B Hashed direct linkage. If A and B share a secret hash key, they can reproduce this process but instead of using direct identifiers (I), they can use an irreversibly hashed version (H).
C Trusted third-party (TTP) linkage. If A and B share a TTP, that TTP can perform linkage using identifiable data without B learning whom A requests, before de-identifying the linked data for research.
D Hashed TTP linkage. As before, sharing a hash key enables the TTP to operate without direct identifiers.
E Identity exchange (IDX) TTP linkage. In this more complex scheme, the TTP is used only to exchange identity information, without having to hold health information. Identity linkage (➊) occurs before de-identified health data linkage (➋).
F Hashed IDX TTP linkage. The process can be improved further by hashing
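
As an illustration of panel B (hashed direct linkage), the following is a minimal sketch of exact matching on a shared unique identifier using a keyed hash. It assumes HMAC-SHA256 as the hash function and a simple whitespace-stripping normalization; the caption specifies neither, and the names `hash_identifier`, `a_prepare`, `b_match`, and the example key are hypothetical. It does not implement the Bayesian matching developed in the study, which addresses the harder case where no shared unique identifier exists.

```python
import hashlib
import hmac

# Hypothetical shared secret; in practice A and B would agree on this privately.
SHARED_KEY = b"hypothetical-shared-secret"


def hash_identifier(identifier: str, key: bytes) -> str:
    """Return a keyed, irreversible hash (H) of a direct identifier (I).

    Normalization (strip spaces, upper-case) is an assumption so that
    trivially different spellings of the same identifier hash identically.
    """
    normalized = identifier.replace(" ", "").upper()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def a_prepare(a_people: dict, key: bytes) -> dict:
    """Organization A: map each research ID (X) to the hashed identifier (H).

    a_people is {research_id: identifier}. No health data is included.
    """
    return {x: hash_identifier(i, key) for x, i in a_people.items()}


def b_match(a_message: dict, b_people: dict, key: bytes) -> dict:
    """Organization B: hash its own identifiers and return matches.

    b_people is {identifier: health_data}. The result is keyed by A's
    research ID (X) and contains only B's data for people who match.
    """
    b_by_hash = {hash_identifier(i, key): data for i, data in b_people.items()}
    return {x: b_by_hash[h] for x, h in a_message.items() if h in b_by_hash}


if __name__ == "__main__":
    # Made-up identifiers and health data, for illustration only.
    a_people = {"X1": "943 476 5919", "X2": "123 456 7890"}
    b_people = {"9434765919": {"dx": "asthma"}, "5555555555": {"dx": "none"}}
    print(b_match(a_prepare(a_people, SHARED_KEY), b_people, SHARED_KEY))
```

A keyed hash is used in this sketch rather than a plain digest because identifiers such as NHS numbers come from a small, enumerable space; without the secret key, anyone could hash every possible value and reverse the mapping.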