
Table 2 Model performance

From: Natural language processing to identify lupus nephritis phenotype in electronic health records

| Dataset | Algorithm | Sensitivity | Specificity | PPV | NPV | F Measure |
| --- | --- | --- | --- | --- | --- | --- |
| NU (testing set) | Baseline | 0.43 | 0.6 | 0.39 | 0.64 | 0.41 |
| NU (testing set) | Regex + structured | 0.49 | 0.93 | 0.81 | 0.76 | 0.61 |
| NU (testing set) | Full MetaMap (binary) | 0.63 | 0.92 | 0.82 | 0.81 | 0.71 |
| NU (testing set) | Full MetaMap (counts) | 0.6 | 0.95 | 0.88 | 0.80 | 0.71 |
| NU (testing set) | MetaMap mixed | 0.74 | 0.92 | 0.84 | 0.86 | 0.79 |
| VUMC | Baseline | 0.86 | 0.67 | 0.38 | 0.95 | 0.52 |
| VUMC | MetaMap mixed | 0.93 | 0.98 | 0.93 | 0.98 | 0.93 |

  1. For logistic regression-based models, a probability of 0.5 is used as the classification threshold.
  2. Abbreviations: SLE, systemic lupus erythematosus; NU, Northwestern University; VUMC, Vanderbilt University Medical Center; NLP, natural language processing; PPV, positive predictive value; NPV, negative predictive value.
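
As a reading aid, the sketch below shows one way the columns of the table relate to each other. It is not the authors' code. It assumes that the F measure is the balanced F1 score (the harmonic mean of sensitivity and PPV), which the table does not state explicitly, although the reported values agree with that formula to within about 0.01. It also assumes that a predicted probability exactly equal to the 0.5 threshold is classified as positive. The function names `classify`, `confusion_metrics`, and `f_measure` are hypothetical.

```python
from typing import Dict, List, Sequence


def classify(probabilities: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Turn predicted probabilities into binary labels.

    Assumption: a probability exactly at the threshold counts as positive.
    """
    return [p >= threshold for p in probabilities]


def f_measure(sensitivity: float, ppv: float) -> float:
    """Balanced F1: harmonic mean of sensitivity (recall) and PPV (precision).

    Assumption: the table's "F Measure" column is this balanced F1 score.
    """
    if sensitivity + ppv == 0:
        return 0.0
    return 2 * sensitivity * ppv / (sensitivity + ppv)


def confusion_metrics(y_true: Sequence[bool], y_pred: Sequence[bool]) -> Dict[str, float]:
    """Compute the five table columns from true and predicted labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)

    def ratio(num: int, den: int) -> float:
        # An undefined ratio (zero denominator) is reported as NaN.
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)
    ppv = ratio(tp, tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "ppv": ppv,
        "npv": ratio(tn, tn + fn),
        "f_measure": f_measure(sensitivity, ppv),
    }


if __name__ == "__main__":
    # Made-up toy labels and probabilities, not data from the study.
    truth = [True, True, False, False, True]
    predicted = classify([0.9, 0.4, 0.2, 0.6, 0.5])
    print(confusion_metrics(truth, predicted))

    # Cross-check against the "NU (testing set), MetaMap mixed" row.
    print(round(f_measure(0.74, 0.84), 2))
```

Applied to the rounded sensitivity (0.74) and PPV (0.84) of the NU "MetaMap mixed" row, `f_measure` gives 0.79 after rounding, matching the reported F measure. Small differences in other rows, such as the VUMC baseline, are expected because the published inputs are themselves rounded to two decimals.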