Table 4 Summary of pairwise linkage performance at selected thresholds. (A) “Odds on”: performance at θ = δ = 0, for comparison to the 50% threshold (p = 0.5, log odds = 0) of reference [24]. These settings yield a high TPR, at the cost of some misidentification. (B) Performance at the software’s default thresholds of θ = 5 and δ = 0, which optimized a weighted performance metric favouring MID reduction over TPR (see text). (C) Performance at θ = δ = 15, for a low MID. Values are a subset of data from Fig. 4. TPR, true positive rate or recall (detection of a proband who was in the sample, including correct linkages and misidentifications); MID, misidentification rate (the proportion of probands incorrectly identified). Values shown to three significant figures. † Note that the PCMIS database contained records with duplicate NHS numbers (see Table 3), particularly relevant when it is the sample database.
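The caption equates a probability of 0.5 with log odds of 0. As a rough aid to reading the other thresholds, the sketch below converts a log odds value to a probability with the logistic function. It assumes the log odds are natural logarithms, which the caption does not state; the function name `log_odds_to_probability` is illustrative and is not taken from the software.

```python
import math


def log_odds_to_probability(log_odds: float) -> float:
    """Convert log odds to a probability with the logistic function.

    Assumption: natural-log odds (the base is not given in the caption).
    """
    return 1.0 / (1.0 + math.exp(-log_odds))


# Theta values used in panels A, B and C of the table.
for theta in (0, 5, 15):
    print(f"theta = {theta:>2}: p = {log_odds_to_probability(theta):.8f}")
```

Under that assumption, θ = 0 corresponds to p = 0.5, matching the "odds on" description of panel A, and larger θ values demand probabilities progressively closer to 1.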

From: De-identified Bayesian personal identity matching for privacy-preserving record linkage despite errors: development and validation

Rows: proband database. Columns: sample database.

A. At θ = δ = 0 (high TPR):

| Proband database | CDL | PCMIS † | RiO | SystmOne |
|---|---|---|---|---|
| CDL | TPR: 1.00; MID: 0.00000654 | TPR: 0.961; MID: 0.0104 | TPR: 0.996; MID: 0.00259 | TPR: 0.964; MID: 0.0111 |
| PCMIS | TPR: 0.959; MID: 0.0111 | TPR: 1.00; MID: 0.00000848 | TPR: 0.985; MID: 0.00503 | TPR: 0.985; MID: 0.00554 |
| RiO | TPR: 0.996; MID: 0.00414 | TPR: 0.987; MID: 0.00722 | TPR: 1.00; MID: 0 | TPR: 0.990; MID: 0.00657 |
| SystmOne | TPR: 0.963; MID: 0.0218 | TPR: 0.986; MID: 0.0136 | TPR: 0.990; MID: 0.0168 | TPR: 1.00; MID: 0.000139 |

Mean (range) for non-self linkage: TPR: 0.980 (0.959–0.996); MID: 0.00965 (0.00259–0.0218)

B. At θ = 5, δ = 0 (software defaults, balanced performance):

| Proband database | CDL | PCMIS † | RiO | SystmOne |
|---|---|---|---|---|
| CDL | TPR: 1.00; MID: 0.00000654 | TPR: 0.935; MID: 0.00314 | TPR: 0.993; MID: 0.00123 | TPR: 0.941; MID: 0.00276 |
| PCMIS | TPR: 0.931; MID: 0.00279 | TPR: 1.00; MID: 0.00000848 | TPR: 0.970; MID: 0.00148 | TPR: 0.972; MID: 0.00174 |
| RiO | TPR: 0.994; MID: 0.00159 | TPR: 0.973; MID: 0.00202 | TPR: 1.00; MID: 0 | TPR: 0.976; MID: 0.00171 |
| SystmOne | TPR: 0.941; MID: 0.00429 | TPR: 0.974; MID: 0.00328 | TPR: 0.976; MID: 0.00387 | TPR: 1.00; MID: 0.000139 |

Mean (range) for non-self linkage: TPR: 0.965 (0.931–0.994); MID: 0.00249 (0.00123–0.00429)

C. At θ = δ = 15 (low MID):

| Proband database | CDL | PCMIS † | RiO | SystmOne |
|---|---|---|---|---|
| CDL | TPR: 0.990; MID: 0 | TPR: 0.577; MID: 0.00137 | TPR: 0.924; MID: 0.000664 | TPR: 0.549; MID: 0.000633 |
| PCMIS | TPR: 0.588; MID: 0.00134 | TPR: 0.928; MID: 0 | TPR: 0.786; MID: 0.000609 | TPR: 0.788; MID: 0.000699 |
| RiO | TPR: 0.926; MID: 0.000673 | TPR: 0.774; MID: 0.000619 | TPR: 0.994; MID: 0 | TPR: 0.788; MID: 0.000320 |
| SystmOne | TPR: 0.550; MID: 0.00103 | TPR: 0.777; MID: 0.00105 | TPR: 0.787; MID: 0.000980 | TPR: 0.997; MID: 0 |

Mean (range) for non-self linkage: TPR: 0.735 (0.549–0.926); MID: 0.000832 (0.000320–0.00137)
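
The "Mean (range) for non-self linkage" summaries can be recomputed from the printed cells by excluding the diagonal, where a database is linked to itself. The sketch below does this for the panel A TPR values. The helper name `non_self_summary` is illustrative, and because the inputs are the three-significant-figure values printed in the table rather than the underlying data, a recomputed mean may differ from the published one in the last digit.

```python
from statistics import mean

# Panel A (theta = delta = 0) TPR values as printed in the table.
# Rows are proband databases, columns are sample databases, both in the
# order CDL, PCMIS, RiO, SystmOne.
TPR_A = [
    [1.00, 0.961, 0.996, 0.964],
    [0.959, 1.00, 0.985, 0.985],
    [0.996, 0.987, 1.00, 0.990],
    [0.963, 0.986, 0.990, 1.00],
]


def non_self_summary(matrix):
    """Return (mean, min, max) over the off-diagonal cells of a square matrix."""
    cells = [
        value
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if i != j  # skip self-linkage (same proband and sample database)
    ]
    return mean(cells), min(cells), max(cells)


avg, lo, hi = non_self_summary(TPR_A)
print(f"TPR: {avg:.3f} ({lo:.3f}-{hi:.3f})")
```

The same helper applies to the MID values or to panels B and C by substituting the corresponding 4 × 4 matrix.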