
Table 3 Comparison of performance for heuristic and machine learning models tested on holdout data

From: Using artificial intelligence to reduce diagnostic workload without compromising detection of urinary tract infections

| Model Name | AUC Score | Accuracy (%) | p-value** | PPV | NPV | Sensitivity (%), All Patients | Sensitivity (%), Pregnant | Sensitivity (%), Children < 11 Yrs | Specificity (%) | Relative Workload Reduction (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| Heuristic model (30 WBC/μl or 100 bacteria/μl) | | 63.92 | NA | 42.73 [± 0.51] | 97.01 [± 0.28] | 95.70 [± 0.15] | 85.9 [± 0.72] | 91.5 [± 0.92] | 52.10 [± 0.36] | 39.06 [± 0.38] |
| Random Forest (Class weight - 1:20) | 0.908 | 71.96 | < 0.001 | 40.47 [± 0.54] | 97.67 [± 0.25] | 95.95 [± 0.23] | 70.5 [± 2.14] | 89.8 [± 1.49] | 63.40 [± 0.54] | 47.58 [± 0.39] |
| Neural Network | 0.906 | 85.00 | < 0.001 | 71.70 [± 0.46] | 90.18 [± 0.50] | 74.03 [± 0.64] | 27.6 [± 5.74] | 69.3 [± 3.38] | 89.09 [± 0.29] | 71.98 [± 0.35] |
| Neural Network (with resampling*) | 0.904 | 79.35 | < 0.001 | 57.66 [± 0.74] | 95.54 [± 0.19] | 90.60 [± 0.35] | 56.6 [± 3.43] | 84.8 [± 2.04] | 75.16 [± 0.44] | 57.33 [± 0.38] |
| XGBoost (Class weight - 1:20) | 0.910 | 65.68 | < 0.001 | 44.05 [± 0.74] | 97.77 [± 0.13] | 96.70 [± 0.18] | 77.1 [± 1.65] | 93.1 [± 1.13] | 54.14 [± 0.61] | 40.36 [± 0.38] |

  1. Values in square brackets are 95% confidence intervals
  2. *Resampling (without replacement) at a ratio of 2:1 for positive samples to offset class imbalance
  3. ** p-values obtained by comparison with the heuristic model using McNemar's test
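
For readers who want to see how the column quantities relate to one another, the sketch below computes accuracy, PPV, NPV, sensitivity and specificity from binary predictions, and a McNemar p-value for two classifiers scored on the same samples. It is an illustration only: the function names and the toy labels are hypothetical, and the continuity-corrected chi-squared form of McNemar's test is an assumption, since the table does not say which variant was used. The confidence intervals and the relative workload reduction are not reproduced here.

```python
from math import erfc, sqrt


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(y_true, y_pred):
    """Compute standard diagnostic metrics from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "ppv": _ratio(tp, tp + fp),          # positive predictive value
        "npv": _ratio(tn, tn + fn),          # negative predictive value
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate
        "specificity": _ratio(tn, tn + fp),  # true negative rate
    }


def mcnemar_p_value(y_true, pred_a, pred_b):
    """McNemar's test on paired predictions (assumed variant: chi-squared
    with continuity correction, 1 degree of freedom)."""
    triples = list(zip(y_true, pred_a, pred_b))
    # Discordant pairs: exactly one of the two classifiers is correct.
    b = sum(1 for t, a, m in triples if a == t and m != t)
    c = sum(1 for t, a, m in triples if a != t and m == t)
    if b + c == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of the chi-squared distribution with 1 df.
    return erfc(sqrt(statistic / 2))


# Hypothetical toy labels, not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
heuristic = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
model = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

print(diagnostic_metrics(y_true, heuristic))
print(diagnostic_metrics(y_true, model))
print(mcnemar_p_value(y_true, heuristic, model))
```

On the toy labels the heuristic has higher sensitivity and the model has higher specificity, which is the same kind of trade-off the table reports between the heuristic model and, for example, the Neural Network row.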