
Table 1 Results of the original deid program and modified program on the training set and two validation sets

From: De-identification of primary care electronic medical records free-text data in Ontario, Canada

| Feature added/modified | Number of free-text records | Sensitivity/Recall | Specificity | Precision | Accuracy | F-measure |
| --- | --- | --- | --- | --- | --- | --- |
| Original deid program | 500 | 83.4% | 71.6% | 71.0% | 77.0% | 0.77 |
| Modification of deid program | | | | | | |
| - Replaced deid lists for cities, businesses and medical facilities with Ontario lists and made adjustments for Ontario healthcard numbers and postal codes | 500 | 91.5% | 71.0% | 70.7% | 79.9% | 0.80 |
| - Added RPDB* names to ambiguous names, added PS‡ derived initial name removal replacement names to the unambiguous names and added list of Ontario physicians | 500 | 90.9% | 71.8% | 71.5% | 80.1% | 0.80 |
| - Improved medical eponyms lists | 500 | 90.9% | 71.8% | 71.5% | 80.1% | 0.80 |
| - Added protection for common acronyms and nomenclature | 750 | 92.6% | 72.8% | 72.7% | 81.5% | 0.81 |
| - Added 'do not remove' list | 1000 | 88.3% | 91.4% | 91.3% | 89.9% | 0.90 |
| First validation | 700 | 86.7% | 91.4% | 91.1% | 89.0% | 0.89 |
| Second validation | 500 | 80.2% | 87.7% | 87.4% | 83.8% | 0.84 |
*RPDB = Registered Persons Database
‡PS = Practice Solutions
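
The table does not say how the F-measure column was calculated. The reported values are consistent with the standard F1 score, the harmonic mean of precision and recall (sensitivity), so the sketch below assumes that definition. The function name `f_measure` and the `rows` dictionary are hypothetical, introduced only for illustration. The inputs are the recall and precision figures from Table 1, written as fractions.

```python
def f_measure(recall: float, precision: float) -> float:
    """Harmonic mean of precision and recall (the F1 score).

    Assumption: Table 1 uses this standard definition; the table
    itself does not state the formula.
    """
    if recall + precision == 0:
        return 0.0  # avoid division by zero when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# (recall, precision) pairs copied from Table 1, as fractions.
rows = {
    "Original deid program": (0.834, 0.710),
    "Added 'do not remove' list": (0.883, 0.913),
    "First validation": (0.867, 0.911),
    "Second validation": (0.802, 0.874),
}

for label, (recall, precision) in rows.items():
    print(f"{label}: {f_measure(recall, precision):.2f}")
```

Under this assumption, the recomputed values, rounded to two decimals, agree with the F-measure column for these four rows.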