
Table 1 Results of the original deid program and modified program on the training set and two validation sets

From: De-identification of primary care electronic medical records free-text data in Ontario, Canada

| Feature added/Modified | Number of Free Text Records | Sensitivity/Recall | Specificity | Precision | Accuracy | F-measure |
|---|---|---|---|---|---|---|
| Original deid Program | 500 | 83.4% | 71.6% | 71.0% | 77.0% | 0.77 |
| Modification of deid Program | | | | | | |
| - Replaced deid lists for cities, businesses and medical facilities with Ontario lists and made adjustments for Ontario healthcard numbers and postal codes | 500 | 91.5% | 71.0% | 70.7% | 79.9% | 0.80 |
| - Added RPDB* names to ambiguous names, added PS‡ derived initial name removal replacement names to the unambiguous names and added list of Ontario physicians | 500 | 90.9% | 71.8% | 71.5% | 80.1% | 0.80 |
| - Improved medical eponyms lists | 500 | 90.9% | 71.8% | 71.5% | 80.1% | 0.80 |
| - Added protection for common acronyms and nomenclature | 750 | 92.6% | 72.8% | 72.7% | 81.5% | 0.81 |
| - Added 'do not remove' list | 1000 | 88.3% | 91.4% | 91.3% | 89.9% | 0.90 |
| First Validation | 700 | 86.7% | 91.4% | 91.1% | 89.0% | 0.89 |
| Second Validation | 500 | 80.2% | 87.7% | 87.4% | 83.8% | 0.84 |

*RPDB = Registered Persons Database

‡PS = Practice Solutions
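
The table does not say how the F-measure column was computed. A minimal sketch, assuming it is the balanced F-measure (F1, the harmonic mean of precision and recall), reproduces the reported values from the Precision and Sensitivity/Recall columns. The function name `f_measure` and the `rows` dictionary are illustrative only and do not come from the source.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the table's F-measure is F1; the source does not state this.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure), copied from four rows of Table 1
rows = {
    "Original deid Program": (0.710, 0.834, 0.77),
    "Added 'do not remove' list": (0.913, 0.883, 0.90),
    "First Validation": (0.911, 0.867, 0.89),
    "Second Validation": (0.874, 0.802, 0.84),
}

for name, (precision, recall, reported) in rows.items():
    computed = f_measure(precision, recall)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

Under this assumption the computed values agree with the reported ones to two decimal places for these four rows.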