Table 1 Identifiers and Record linkage operation

From: An efficient record linkage scheme using graphical analysis for identifier error detection

| Start cluster id | New cluster id | NHS number | Hospital number | Surname | Forename | Sex | Date of birth (ddmmyyyy) | Frequency of occurrence |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | NULL | 4496644 | WILSON | DAVID | M | 14061940 | 3 |
| 2 | 2 | 5170231111 | NULL | WILSON | DAVID | M | 01051939 | 1 |
| 3 | 3 | 3319004037 | 4118890 | WILSON | DAVID | M | 20011969 | 2 |
| 4 | 4 | NULL | NULL | WILSON | DAVID | M | 20011969 | 1 |
| 5 | 3 | 3319004037 | NULL | WILSON | DAVID | M | 20011969 | 2 |
| 6 | 6 | NULL | 4118890 | WILSON | DAVID | M | 20011969 | 1 |

  1. An example of identifiers provided for patients with forename and surname 'David Wilson'. The details have been changed to protect patient confidentiality. Null fields indicate there was no information provided in that field.
  2. One cycle of the record linkage is illustrated. Initially, each combination of identifiers is treated as its own discrete cluster, labelled by a cluster identifier (Start cluster id). All clusters in which at least one member shares an NHS number with a member of a different cluster are then combined into a single cluster (New cluster id); here, clusters 3 and 5 share NHS number 3319004037, so cluster 5 takes cluster id 3. The same operation then proceeds for all identifiers.
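
The linkage cycle described in note 2 can be sketched in Python. This is a minimal illustration and not the authors' implementation: the `RECORDS` structure, the field names, and the `link_on` helper are hypothetical. Giving a merged cluster the smallest of its members' ids is also an assumption, chosen because it reproduces the New cluster id column of Table 1.

```python
from collections import defaultdict

# Rows of Table 1, reduced to the fields used for linkage; None stands for NULL.
# (Hypothetical representation, for illustration only.)
RECORDS = [
    {"cluster": 1, "nhs_number": None, "hospital_number": "4496644"},
    {"cluster": 2, "nhs_number": "5170231111", "hospital_number": None},
    {"cluster": 3, "nhs_number": "3319004037", "hospital_number": "4118890"},
    {"cluster": 4, "nhs_number": None, "hospital_number": None},
    {"cluster": 5, "nhs_number": "3319004037", "hospital_number": None},
    {"cluster": 6, "nhs_number": None, "hospital_number": "4118890"},
]


def link_on(records, field):
    """Return one new cluster id per record after merging clusters that
    share a non-null value of `field`."""
    # Collect the cluster ids seen for each non-null identifier value.
    clusters_by_value = defaultdict(set)
    for rec in records:
        if rec[field] is not None:
            clusters_by_value[rec[field]].add(rec["cluster"])

    # Union-find over cluster ids; every cluster starts as its own root.
    parent = {rec["cluster"]: rec["cluster"] for rec in records}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    # Merge all clusters sharing a value, keeping the smallest id as the
    # representative (assumption that matches Table 1).
    for clusters in clusters_by_value.values():
        roots = {find(c) for c in clusters}
        smallest = min(roots)
        for root in roots:
            parent[root] = smallest

    return [find(rec["cluster"]) for rec in records]


new_ids = link_on(RECORDS, "nhs_number")
print(new_ids)  # [1, 2, 3, 4, 3, 6]
```

Applying `link_on` to the NHS number yields the New cluster id column above. Under the same assumptions, writing those ids back to the records and calling `link_on` on `hospital_number` would also merge cluster 6 into cluster 3, since both carry hospital number 4118890.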