Word2Vec inversion and traditional text classifiers for phenotyping lupus

Table 3 Test-set AUC of each machine learning technique, with its 20x repeated cross-validation (CV) AUC and confidence interval, for each NLP classifier

Technique	Data form	CV AUC	CV 95% CI	Test AUC
ICD-9 billing codes	N/A	0.897	N/A	0.900
Word2Vec inversion	N/A	0.963	[0.956, 0.971]	0.905
Neural network	BOWs	0.902	[0.897, 0.908]	0.925
	CUIs	0.960	[0.957, 0.964]	0.974
Random forests	BOWs	0.981	[0.979, 0.984]	0.987
	CUIs	0.987	[0.985, 0.989]	0.988
Naïve Bayes	BOWs	0.841	[0.815, 0.868]	0.841
	CUIs	0.805	[0.777, 0.833]	0.805
Support vector machines	BOWs	0.923	[0.911, 0.934]	0.923
	CUIs	0.980	[0.975, 0.985]	0.980
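
The CV AUC and CV CI columns summarize AUC values collected over repeated cross-validation runs. The table does not state how the intervals were computed, so the following is only a minimal sketch under stated assumptions: AUC is computed as the pairwise ranking probability, and the interval is a normal-approximation confidence interval of the mean over repeats. The function names (`roc_auc`, `mean_with_ci`, `simulate_repeat`) and the synthetic data are hypothetical and are not taken from the paper.

```python
import random
import statistics
from statistics import NormalDist


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random
    negative; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_with_ci(values, level=0.95):
    """Mean of the values and a normal-approximation confidence interval
    of that mean (an assumed method, not necessarily the paper's)."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / len(values) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean, (mean - z * sem, mean + z * sem)


def simulate_repeat(rng, n=200):
    """Toy stand-in for one CV repeat: synthetic labels and noisy scores.
    A real pipeline would score held-out folds with a trained classifier."""
    labels = [rng.randint(0, 1) for _ in range(n)]
    scores = [y + rng.gauss(0.0, 0.7) for y in labels]
    return roc_auc(labels, scores)


if __name__ == "__main__":
    rng = random.Random(0)
    aucs = [simulate_repeat(rng) for _ in range(20)]  # 20 repeats, as in the table
    mean, (low, high) = mean_with_ci(aucs)
    print(f"CV AUC {mean:.3f}  CI [{low:.3f}, {high:.3f}]")
```

With only 20 repeats, a Student's t quantile would give a slightly wider interval than the normal quantile used here; the normal approximation is kept so that the sketch needs only the standard library.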

ISSN: 1472-6947