
Table 7 Effect of collision resolution

From: An efficient record linkage scheme using graphical analysis for identifier error detection

 

| | Before collision resolution | After collision resolution | % drop |
|---|---|---|---|
| Number of clusters | 3557951 | 3618233 | |
| Clusters with multiple: | | | |
| NHS numbers | 6202 | 2122 | ~66% |
| hospital numbers | 97071 | 94238 | ~3% |
| birthdates | 58293 | 35523 | ~39% |
| deathdates | 830 | 107 | ~87% |
| genders | 81118 | 61337 | ~24% |
| forenames | 59426 | 16873 | ~71% |
| surnames | 189657 | 151593 | ~25% |

  1. After initial linkage, collision resolution is applied (see Methods). This reduces the number of clusters that contain more than one value of a given identifier, as detailed above.
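
The "% drop" column appears to be the relative decrease from the "before" count to the "after" count. The following is a minimal sketch of that arithmetic, assuming this definition; the function name `percent_drop` is hypothetical and not taken from the paper.

```python
def percent_drop(before: int, after: int) -> float:
    """Relative decrease from `before` to `after`, as a percentage.

    Assumed definition: 100 * (before - after) / before.
    """
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Counts copied from the NHS numbers and deathdates rows of the table.
nhs_drop = percent_drop(6202, 2122)
death_drop = percent_drop(830, 107)

print(f"NHS numbers: ~{nhs_drop:.0f}%")
print(f"deathdates: ~{death_drop:.0f}%")
```

Under this definition a negative result indicates an increase, as happens for the total number of clusters, which rises from 3557951 to 3618233.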