Table 4 Performance on gold standard corpus.

From: Automated de-identification of free-text medical records

| PHI Type | PHI Sub-type | Count | # FNs | # FNs per 100,000 words | Per-category Recall | Per-category Precision |
|---|---|---|---|---|---|---|
| Name | Patient Name | 54 | 0 | 0 | 1.00 | |
| | Patient Name Initial | 2 | 2 | 0.598 | 0.00 | |
| | Relative/Proxy Name | 175 | 4 | 1.195 | 0.977 | |
| | Clinician Name | 593 | 3 | 1.494 | 0.995 | 0.725 |
| Date | Date (not year) | 482 | 26 | 7.769 | 0.946 | |
| | Year | 46 | 11 | 3.287 | 0.761 | 0.713 |
| Location | | 367 | 10 | 4.482 | 0.973 | 0.922 |
| Phone | | 53 | 0 | 0 | 1.00 | 0.898 |
| Age over 89 | | 4 | 1 | 0.299 | 0.750 | 0.600 |
| Undefined | | 3 | 2 | 0.598 | 0.333 | N/A |
| Overall | | 1779 | 59 | 19.720 | 0.967 | 0.749 |
FNs are false negatives; N/A indicates not applicable.
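
The recall figures in Table 4 appear to follow from the Count and # FNs columns as (Count - FNs) / Count. The Python sketch below checks that reading under the assumption that Count is the number of gold standard PHI instances for each sub-type; the names `ROWS` and `recall` are illustrative and do not come from the article. Precision cannot be recomputed the same way, because the table does not list false positives.

```python
# Assumption: recall = (Count - FNs) / Count, with Count taken as the number
# of gold standard PHI instances in each row of Table 4.

# (PHI sub-type or type, Count, # FNs), copied from Table 4.
ROWS = [
    ("Patient Name", 54, 0),
    ("Patient Name Initial", 2, 2),
    ("Relative/Proxy Name", 175, 4),
    ("Clinician Name", 593, 3),
    ("Date (not year)", 482, 26),
    ("Year", 46, 11),
    ("Location", 367, 10),
    ("Phone", 53, 0),
    ("Age over 89", 4, 1),
    ("Undefined", 3, 2),
]


def recall(count: int, false_negatives: int) -> float:
    """Fraction of gold standard instances that were not missed."""
    return (count - false_negatives) / count


# Per-row recall, rounded to three decimals as in the table.
for label, count, fns in ROWS:
    print(f"{label:22s} {recall(count, fns):.3f}")

# The "Overall" row pools counts and false negatives across all rows.
total_count = sum(count for _, count, _ in ROWS)
total_fns = sum(fns for _, _, fns in ROWS)
print(f"{'Overall':22s} {recall(total_count, total_fns):.3f}")
```

Under this assumption the pooled totals come to 1779 instances and 59 false negatives, which reproduces the Overall recall of 0.967.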