Table 2 Summary Statistics from Smoking Status Algorithm Testing

From: Building a tobacco user registry by extracting multiple smoking behaviors from clinical notes

a) i2b2 trained and tested
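Micro F1: 0.90

| Smoking status | Precision | Recall | F1-Score | N notes | Sensitivity | Specificity |
| --- | --- | --- | --- | --- | --- | --- |
| Never | 0.94 | 0.94 | 0.94 | 16 | 94% | 94% |
| Ever | | | | 25 | 94% | 99% |
| Former | 0.73 | 0.73 | 0.73 | 11 | 73% | 97% |
| Current | 0.62 | 0.73 | 0.67 | 11 | 73% | 95% |
| Smoker | 0.00 | 0.00 | 0.00 | 3 | 0% | 99% |
| Unknown | 1.00 | 1.00 | 1.00 | 63 | 100% | 100% |
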
b) Local record trained and tested
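Micro F1: 0.90

| Smoking status | Precision | Recall | F1-Score | N notes | Sensitivity | Specificity |
| --- | --- | --- | --- | --- | --- | --- |
| Never | 0.83 | 0.98 | 0.90 | 51 | 98% | 94% |
| Ever | | | | 64 | 90% | 96% |
| Former | 0.93 | 0.83 | 0.88 | 30 | 83% | 99% |
| Current | 0.79 | 0.84 | 0.81 | 31 | 84% | 96% |
| Smoker | 0.33 | 0.33 | 0.33 | 3 | 33% | 99% |
| Unknown | 0.99 | 0.92 | 0.95 | 108 | 92% | 99% |
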
c) Local record trained, i2b2 record tested
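Micro F1: 0.88

| Smoking status | Precision | Recall | F1-Score | N notes | Sensitivity | Specificity |
| --- | --- | --- | --- | --- | --- | --- |
| Never | 0.75 | 0.94 | 0.83 | 16 | 94% | 94% |
| Ever | | | | 25 | 80% | 94% |
| Former | 0.88 | 0.64 | 0.74 | 11 | 64% | 99% |
| Current | 0.86 | 0.55 | 0.67 | 11 | 55% | 99% |
| Smoker | 0.00 | 0.00 | 0.00 | 3 | 0% | 94% |
| Unknown | 1.00 | 1.00 | 1.00 | 63 | 100% | 100% |
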
  1. Overall F1-score, together with precision, recall, F1-score, sensitivity, and specificity by smoking status, for data sets that were a) trained and tested on i2b2 notes, b) trained and tested on local notes, and c) trained on local notes and tested on i2b2 notes
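
To show how the columns relate to one another, here is a minimal Python sketch that recomputes the "Former" row of panel a) from one-vs-rest confusion counts. The counts are not reported in the table. They are inferred here from the rounded values (8 of 11 Former notes correct, 3 false positives), and the total of 104 notes assumes that "Ever" aggregates Former, Current, and Smoker and is therefore not counted separately. The function name `one_vs_rest_metrics` is hypothetical. The micro F1 check likewise uses inferred per-class true positives and relies on the fact that, when each note receives exactly one label, micro F1 equals overall accuracy.

```python
def one_vs_rest_metrics(tp, fp, fn, tn):
    """Per-class metrics from one-vs-rest confusion counts."""

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # recall and sensitivity are the same quantity
    f1 = ratio(2 * precision * recall, precision + recall)
    specificity = ratio(tn, tn + fp)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "sensitivity": recall,
        "specificity": specificity,
    }


# Panel a) note counts: Never, Former, Current, Smoker, Unknown.
# Assumption: "Ever" (25) overlaps Former + Current + Smoker, so it is excluded.
total_notes = 16 + 11 + 11 + 3 + 63

# Inferred (not reported) counts for the "Former" class in panel a).
tp, fp, fn = 8, 3, 3
former = one_vs_rest_metrics(tp, fp, fn, tn=total_notes - tp - fp - fn)
former_rounded = {name: round(value, 2) for name, value in former.items()}
print(former_rounded)

# Inferred true positives per class: Never, Former, Current, Smoker, Unknown.
true_positives = [15, 8, 8, 0, 63]
micro_f1 = sum(true_positives) / total_notes
print(round(micro_f1, 2))
```

Under these assumptions the rounded values agree with the reported Former row (0.73, 0.73, 0.73, 73%, 97%) and with the reported micro F1 of 0.90, which suggests that the Sensitivity column restates Recall as a percentage.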