Table 6 Performance without customized dictionary on gold standard corpus.

From: Automated de-identification of free-text medical records

| PHI Type | PHI sub-type | Count | # FNs | Per Category Recall | Per Category Precision |
|---|---|---|---|---|---|
| Name | Patient Name | 54 | 1 | 0.981 | |
| | Patient Name Initial | 2 | 2 | 0.00 | |
| | Relative/Proxy Name | 175 | 5 | 0.971 | |
| | Clinician Name | 593 | 24 | 0.973 | 0.731 |
| Date | Date (not year) | 482 | 26 | 0.946 | |
| | Year | 46 | 11 | 0.761 | 0.712 |
| Location | | 367 | 231 | 0.371 | 0.840 |
| Phone | | 53 | 0 | 1.00 | 0.898 |
| Age over 89 | | 4 | 1 | 0.750 | 0.600 |
| Undefined | | 3 | 2 | 0.333 | N/A |
| Overall | | 1779 | 295 | 0.834 | 0.725 |

FNs are false negatives; N/A indicates not applicable.
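
As an illustration of how the recall column relates to the Count and # FNs columns, the following minimal sketch recomputes recall for a few rows. It assumes that Count is the number of gold-standard PHI instances in the row and that recall is (Count − FNs) / Count; the source does not state the formula, and the function name `recall` and the `rows` dictionary are hypothetical names introduced here.

```python
def recall(count: int, false_negatives: int) -> float:
    """Recall as (count - FNs) / count.

    Assumption: `count` is the number of gold-standard instances,
    so count - FNs is the number of instances correctly found.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    return (count - false_negatives) / count


# (Count, # FNs) pairs copied from selected rows of the table.
rows = {
    "Patient Name": (54, 1),
    "Date (not year)": (482, 26),
    "Location": (367, 231),
    "Overall": (1779, 295),
}

# Round to three decimals, matching the precision used in the table.
recalls = {name: round(recall(c, fn), 3) for name, (c, fn) in rows.items()}

for name, value in recalls.items():
    print(f"{name}: {value:.3f}")
```

Under this assumption the selected rows reproduce the tabulated recall values (0.981, 0.946, 0.371 and 0.834). Precision cannot be recomputed in the same way, because the table does not list false positives.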