Table 2 Comparison of classifiers for opioid misuse

From: Publicly available machine learning models for identifying opioid misuse from the clinical notes of hospitalized patients

| Classifier | ROC AUC (95% CI) | F1 | Precision/PPV (95% CI) | Recall/Sensitivity (95% CI) | Specificity (95% CI) | NPV (95% CI) | P value for model fit* |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Rule-based | NAᵃ | 0.76 | 0.68 (0.57, 0.78) | 0.87 (0.76, 0.94) | 0.79 (0.71, 0.86) | 0.92 (0.85, 0.96) | < 0.01 |
| Logistic Regression CUI | 0.91 (0.86, 0.95) | 0.79 | 0.89 (0.77, 0.96) | 0.71 (0.58, 0.81) | 0.95 (0.90, 0.98) | 0.86 (0.80, 0.91) | 0.06 |
| Logistic Regression Word | 0.91 (0.86, 0.95) | 0.72 | 0.86 (0.73, 0.94) | 0.62 (0.49, 0.73) | 0.95 (0.89, 0.98) | 0.83 (0.76, 0.88) | < 0.01 |
| Convolutional Neural Network CUI | 0.93 (0.90, 0.97) | 0.81 | 0.82 (0.70, 0.90) | 0.79 (0.68, 0.88) | 0.91 (0.85, 0.95) | 0.89 (0.83, 0.94) | 0.51 |
| Convolutional Neural Network Word | 0.94 (0.91, 0.98) | 0.84 | 0.94 (0.85, 0.99) | 0.75 (0.63, 0.85) | 0.98 (0.93, 1.00) | 0.88 (0.82, 0.93) | 0.42 |
| Convolutional Neural Network Character | 0.93 (0.90, 0.97) | 0.79 | 0.88 (0.76, 0.95) | 0.72 (0.60, 0.82) | 0.95 (0.89, 0.98) | 0.87 (0.80, 0.92) | < 0.01 |
| Deep Averaging Network CUI | 0.83 (0.78, 0.88) | 0.74 | 0.68 (0.57, 0.78) | 0.87 (0.76, 0.94) | 0.79 (0.71, 0.86) | 0.92 (0.85, 0.96) | < 0.01 |
| Deep Averaging Network Word | 0.80 (0.74, 0.86) | 0.49 | 0.74 (0.56, 0.87) | 0.37 (0.25, 0.49) | 0.93 (0.87, 0.97) | 0.74 (0.67, 0.80) | < 0.01 |
| Max Pooling Network CUI | 0.93 (0.89, 0.96) | 0.79 | 0.85 (0.73, 0.93) | 0.74 (0.61, 0.83) | 0.93 (0.87, 0.97) | 0.87 (0.80, 0.92) | 0.60 |
| Max Pooling Network Word | 0.91 (0.86, 0.96) | 0.78 | 0.87 (0.76, 0.95) | 0.71 (0.58, 0.81) | 0.95 (0.89, 0.98) | 0.86 (0.79, 0.91) | 0.36 |
| Deep Averaging + Max Pooling Network CUI | 0.94 (0.91, 0.97) | 0.81 | 0.92 (0.82, 0.98) | 0.72 (0.60, 0.82) | 0.97 (0.92, 0.99) | 0.87 (0.80, 0.92) | < 0.01 |
| Deep Averaging + Max Pooling Network Word | 0.94 (0.91, 0.97) | 0.78 | 0.86 (0.74, 0.94) | 0.72 (0.60, 0.82) | 0.94 (0.88, 0.97) | 0.87 (0.80, 0.92) | 0.09 |

  1. Logistic regression used a combination of unigrams and bigrams. PPV: positive predictive value; NPV: negative predictive value; ROC AUC: area under the receiver operating characteristic curve; CUI: concept unique identifier; CI: confidence interval
  2. *Model fit was assessed with the Hosmer-Lemeshow goodness-of-fit test, where p > 0.05 indicates that the model fits the data well
  3. ᵃ NA: not applicable, because the classifier outputs binary predictions (0/1) without predicted probabilities from which to plot an ROC curve
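
The F1 column is the harmonic mean of precision and recall, so it can be roughly cross-checked against the two columns beside it. The sketch below is illustrative only: the helper name `f1_score` and the choice of rows are assumptions, and the inputs are the rounded point estimates from Table 2, so a recomputed value may differ from the reported one in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) point estimates copied from Table 2.
# Only a few rows are shown; the selection is arbitrary.
rows = {
    "Rule-based": (0.68, 0.87, 0.76),
    "Logistic Regression CUI": (0.89, 0.71, 0.79),
    "Convolutional Neural Network Word": (0.94, 0.75, 0.84),
}

for name, (precision, recall, reported) in rows.items():
    recomputed = f1_score(precision, recall)
    print(f"{name}: recomputed F1 = {recomputed:.3f}, reported F1 = {reported:.2f}")
```

Because the published precision and recall are already rounded to two decimals, small disagreements with the reported F1 are expected and do not by themselves indicate an error in the table.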