Table 2 Levels of matched records using a variety of techniques.

From: The SAIL databank: linking multiple health and social care datasets

Levels of matched records, by dataset: Primary Care General Practice (GP dataset), Secondary Care Hospital Admissions (PEDW dataset), Social Services (PARIS database).

| Technique | GP Number | GP % | PEDW Number | PEDW % | PARIS Number | PARIS % |
|---|---|---|---|---|---|---|
| Sample size | 229,127 | | 290,650 | | 18,540 | |
| Valid NHS Number | 229,117 | 99.996% | 264,868 | 91.13% | - | 0.00% |
| Valid NHS Number plus DRL | 229,123 | 99.998% | 280,729 | 96.59% | 14,158 | 76.36% |
| Valid NHS Number plus PRL (99% cut off) | 229,125 | 99.999% | 287,572 | 98.94% | 17,095 | 92.21% |
| Valid NHS Number plus PRL (95% cut off) | 229,125 | 99.999% | 288,186 | 99.15% | 17,431 | 94.02% |
| Valid NHS Number plus PRL (90% cut off) | 229,125 | 99.999% | 288,424 | 99.23% | 17,553 | 94.68% |
| Valid NHS Number plus PRL (50% cut off) | 229,125 | 99.999% | 288,670 | 99.32% | 17,639 | 95.14% |
| Overall combining Valid NHS, DRL & PRL (50%) | 229,125 | 99.999% | 288,683 | 99.32% | 17,642 | 95.16% |
  1. The numbers (and percentages) of records that could be matched using deterministic record linkage (DRL) and various thresholds of probabilistic record linkage (PRL) were assessed for each of three test datasets: the GP dataset, the PEDW dataset and the PARIS database. Records with a valid NHS number were accepted. The matching rate achieved by applying DRL followed by PRL (to the 50% threshold) was also assessed; the final row shows this result of operating the MACRAL algorithm, as illustrated in Figure 1.
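
The cascade described in the footnote can be sketched roughly as follows. This is an illustrative sketch only, not the MACRAL implementation: the `Record` fields, the `match_stage` and `match_rate` names, and the assumption that a PRL score at or above the cut-off counts as a match are all hypothetical. The percentage helper simply reproduces the arithmetic behind the table's % columns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    # All fields are hypothetical stand-ins for the outcome of each step.
    nhs_number_valid: bool       # record carries a valid NHS number
    drl_match: bool              # deterministic record linkage found a match
    prl_score: Optional[float]   # probabilistic match score in [0, 1], if any


def match_stage(record: Record, prl_cutoff: float = 0.5) -> Optional[str]:
    """Return the first stage that accepts the record, or None if unmatched.

    Order assumed from the footnote: valid NHS number, then DRL, then PRL.
    """
    if record.nhs_number_valid:
        return "nhs_number"
    if record.drl_match:
        return "drl"
    # Assumption: scores at or above the cut-off are treated as matches.
    if record.prl_score is not None and record.prl_score >= prl_cutoff:
        return "prl"
    return None


def match_rate(matched: int, sample_size: int) -> float:
    """Percentage of the sample that was matched."""
    return 100.0 * matched / sample_size


# Counts taken from the table: PARIS overall, and PEDW with NHS number only.
paris_overall = round(match_rate(17_642, 18_540), 2)
pedw_nhs_only = round(match_rate(264_868, 290_650), 2)
```

Under these assumptions, raising `prl_cutoff` from 0.5 to 0.99 accepts fewer records at the PRL stage, which is the pattern the table shows as the cut-off tightens.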