Table 3 Test-set AUC, 20x repeated cross-validation (CV) AUC, and CV confidence interval for each machine learning technique and NLP data form

From: Word2Vec inversion and traditional text classifiers for phenotyping lupus

| Technique | Data form | CV AUC | CV CI (α=0.95) | Test AUC |
|---|---|---|---|---|
| ICD-9 billing codes | N/A | 0.897 | N/A | 0.900 |
| Word2Vec inversion | N/A | 0.963 | [0.956, 0.971] | 0.905 |
| Neural network | BOWs | 0.902 | [0.897, 0.908] | 0.925 |
| Neural network | CUIs | 0.960 | [0.957, 0.964] | 0.974 |
| Random forests | BOWs | 0.981 | [0.979, 0.984] | 0.987 |
| Random forests | CUIs | 0.987 | [0.985, 0.989] | 0.988 |
| Naïve Bayes | BOWs | 0.841 | [0.815, 0.868] | 0.841 |
| Naïve Bayes | CUIs | 0.805 | [0.777, 0.833] | 0.805 |
| Support vector machines | BOWs | 0.923 | [0.911, 0.934] | 0.923 |
| Support vector machines | CUIs | 0.980 | [0.975, 0.985] | 0.980 |
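
The table reports a mean AUC over repeated cross-validation together with a confidence interval, but does not say how the interval was computed. The sketch below shows one common way such a summary could be produced: a normal-approximation 95% interval around the mean of the per-repeat AUCs. The function name `mean_with_ci`, the `z` parameter, and the input values are all hypothetical and chosen for illustration; none of them comes from the paper, and the authors may have used a different interval method.

```python
import math
import statistics


def mean_with_ci(aucs, z=1.96):
    """Return (mean, lower, upper) for a list of per-repeat AUC values.

    Assumption: a normal approximation with z = 1.96 (about 95% coverage).
    This is an illustrative choice, not the method confirmed by the paper.
    """
    mean = statistics.fmean(aucs)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(aucs) / math.sqrt(len(aucs))
    return mean, mean - z * sem, mean + z * sem


# Hypothetical per-repeat AUCs, invented for illustration (not study data).
example_aucs = [0.90, 0.91, 0.89, 0.92, 0.90]
mean, lower, upper = mean_with_ci(example_aucs)
print(f"{mean:.3f} [{lower:.3f}, {upper:.3f}]")
```

With 20 repeats, as in the table, the same function would be called on the 20 per-repeat AUCs for one technique and data form. A bootstrap or t-based interval would be a reasonable alternative when the number of repeats is small.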