Table 4 Performance on gold standard corpus.

From: Automated de-identification of free-text medical records

| PHI Type | PHI sub-type | Count | # FNs | # FNs per 100,000 words | Per Category Recall | Per Category Precision |
|---|---|---|---|---|---|---|
| Name | Patient Name | 54 | 0 | 0 | 1.00 | |
| | Patient Name Initial | 2 | 2 | 0.598 | 0.00 | |
| | Relative/Proxy Name | 175 | 4 | 1.195 | 0.977 | |
| | Clinician Name | 593 | 3 | 1.494 | 0.995 | 0.725 |
| Date | Date (not year) | 482 | 26 | 7.769 | 0.946 | |
| | Year | 46 | 11 | 3.287 | 0.761 | 0.713 |
| Location | | 367 | 10 | 4.482 | 0.973 | 0.922 |
| Phone | | 53 | 0 | 0 | 1.00 | 0.898 |
| Age over 89 | | 4 | 1 | 0.299 | 0.750 | 0.600 |
| Undefined | | 3 | 2 | 0.598 | 0.333 | N/A |
| Overall | | 1779 | 59 | 19.720 | 0.967 | 0.749 |

FNs are false negatives; N/A indicates not applicable.
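
The recall column can be reproduced from the Count and # FNs columns. The sketch below is illustrative only: it assumes the standard definition recall = TP / (TP + FN) and that Count is the total number of gold-standard PHI instances (TP + FN) in each row. The function and variable names are hypothetical and do not come from the paper.

```python
# (Count, # FNs) per row, copied from Table 4.
ROWS = {
    "Patient Name": (54, 0),
    "Patient Name Initial": (2, 2),
    "Relative/Proxy Name": (175, 4),
    "Clinician Name": (593, 3),
    "Date (not year)": (482, 26),
    "Year": (46, 11),
    "Location": (367, 10),
    "Phone": (53, 0),
    "Age over 89": (4, 1),
    "Undefined": (3, 2),
}


def recall(count: int, false_negatives: int) -> float:
    """Recall = TP / (TP + FN), assuming Count = TP + FN."""
    if count <= 0:
        raise ValueError("count must be positive")
    return (count - false_negatives) / count


def recall_table(rows: dict) -> dict:
    """Per-row recall, rounded to three decimals as in the table."""
    return {name: round(recall(c, fn), 3) for name, (c, fn) in rows.items()}


def overall_recall(rows: dict) -> float:
    """Recall over all rows pooled together (sum of counts and FNs)."""
    total_count = sum(c for c, _ in rows.values())
    total_fns = sum(fn for _, fn in rows.values())
    return round(recall(total_count, total_fns), 3)


if __name__ == "__main__":
    for name, value in recall_table(ROWS).items():
        print(f"{name:22s} {value:.3f}")
    print(f"{'Overall':22s} {overall_recall(ROWS):.3f}")
```

Under these assumptions the computed values agree with the Per Category Recall column, including the pooled overall figure. Precision cannot be recomputed the same way, because the table does not report false positive counts.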