Table 5 Improvement in predictive ability of data cleaning techniques

From: The effect of data cleaning on record linkage quality

 

| Data cleaning technique | Hospital admissions data | Synthetic data |
| --- | --- | --- |
| Remove punctuation | 0.08% (a) | +0.08% |
| Remove alt. missing values | +0.5% | 0% |
| Nickname lookup | −28% | −33% |
| Sex imputation | NA | −5% |

a. A negative sign (−) indicates a decrease in predictive ability and a positive sign (+) an increase, compared to baseline.