Table 2 Summary Statistics from Smoking Status Algorithm Testing

From: Building a tobacco user registry by extracting multiple smoking behaviors from clinical notes

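a) i2b2 trained and tested. Micro F1: 0.90

| Smoking status | Precision | Recall | F1-Score | N notes | Sensitivity | Specificity |
|---|---|---|---|---|---|---|
| Never | 0.94 | 0.94 | 0.94 | 16 | 94% | 94% |
| Ever | – | – | – | 25 | 94% | 99% |
| Former | 0.73 | 0.73 | 0.73 | 11 | 73% | 97% |
| Current | 0.62 | 0.73 | 0.67 | 11 | 73% | 95% |
| Smoker | 0.0 | 0.0 | 0.0 | 3 | 0% | 99% |
| Unknown | 1.00 | 1.00 | 1.00 | 63 | 100% | 100% |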

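b) Local record trained and tested. Micro F1: 0.90

| Smoking status | Precision | Recall | F1-Score | N notes | Sensitivity | Specificity |
|---|---|---|---|---|---|---|
| Never | 0.83 | 0.98 | 0.90 | 51 | 98% | 94% |
| Ever | – | – | – | 64 | 90% | 96% |
| Former | 0.93 | 0.83 | 0.88 | 30 | 83% | 99% |
| Current | 0.79 | 0.84 | 0.81 | 31 | 84% | 96% |
| Smoker | 0.33 | 0.33 | 0.33 | 3 | 33% | 99% |
| Unknown | 0.99 | 0.92 | 0.95 | 108 | 92% | 99% |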

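c) Local record trained, i2b2 record tested. Micro F1: 0.88

| Smoking status | Precision | Recall | F1-Score | N notes | Sensitivity | Specificity |
|---|---|---|---|---|---|---|
| Never | 0.75 | 0.94 | 0.83 | 16 | 94% | 94% |
| Ever | – | – | – | 25 | 80% | 94% |
| Former | 0.88 | 0.64 | 0.74 | 11 | 64% | 99% |
| Current | 0.86 | 0.55 | 0.67 | 11 | 55% | 99% |
| Smoker | 0.00 | 0.00 | 0.00 | 3 | 0% | 94% |
| Unknown | 1.00 | 1.00 | 1.00 | 63 | 100% | 100% |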

Overall (micro) F1-score and, for each smoking status, precision, recall, F1-score, sensitivity, and specificity for the a) i2b2 note trained and tested, b) local note trained and tested, and c) local note trained and i2b2 note tested data sets.
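
As a reading aid, the per-status F1-scores can be cross-checked against the precision and recall columns, since F1 is the harmonic mean of the two (and sensitivity is the same quantity as recall, expressed as a percentage). The sketch below is not the authors' code; the helper name `f1_score` and the choice of rows are illustrative assumptions, and because the table values are already rounded to two decimals, a recomputed F1 may differ from the reported one by about 0.01.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
rows = {
    "a) Current": (0.62, 0.73, 0.67),
    "b) Never": (0.83, 0.98, 0.90),
    "b) Former": (0.93, 0.83, 0.88),
    "c) Former": (0.88, 0.64, 0.74),
    "c) Current": (0.86, 0.55, 0.67),
}

for label, (precision, recall, reported) in rows.items():
    recomputed = f1_score(precision, recall)
    # Inputs are rounded, so allow a small tolerance when comparing.
    agrees = abs(recomputed - reported) <= 0.01
    print(f"{label}: recomputed={recomputed:.2f} reported={reported:.2f} agrees={agrees}")
```

The zero guard covers rows such as "Smoker" in panels a) and c), where precision and recall are both 0 and the harmonic mean would otherwise divide by zero.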