Improving sensitivity of machine learning methods for automated case identification from free-text electronic medical records

Table 6 Sensitivity and specificity of various classifiers trained on the acute renal failure data set for different percentages of over-sampling

Over-sampling (%)	SVM Sens.	SVM Spec.	MyC Sens.	MyC Spec.	RIPPER Sens.	RIPPER Spec.	C4.5 Sens.	C4.5 Spec.	Imbalance ratio
0	0.62	0.92	0.69	0.90	0.75	0.89	0.69	0.88	16
100	0.66	0.86	0.78	0.80	0.81	0.76	0.74	0.75	8
200	0.71	0.81	0.84	0.71	0.84	0.65	0.77	0.67	5
300	0.74	0.77	0.89	0.59	0.88	0.65	0.80	0.65	4
400	0.76	0.73	0.89	0.51	0.86	0.64	0.81	0.61	3
500	0.77	0.69	0.89	0.48	0.84	0.64	0.82	0.60	3
600	0.78	0.66	0.91	0.48	0.89	0.59	0.82	0.60	2
700	0.82	0.60	0.92	0.43	0.89	0.54	0.82	0.60	2
800	0.82	0.57	0.94	0.37	0.86	0.60	0.82	0.61	2
900	0.83	0.55	0.93	0.36	0.89	0.53	0.83	0.61	2
1000	0.84	0.54	0.95	0.36	0.88	0.54	0.83	0.61	1
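
The imbalance ratio column appears to follow directly from the over-sampling percentage. The sketch below assumes that the ratio is the number of negative cases per positive case, and that over-sampling by p% adds p% more copies of the positive cases, so the ratio shrinks by a factor of 1 + p/100. The function name `imbalance_ratio` is hypothetical, and the baseline of 16 is the rounded value from the 0% row, so the results are approximate.

```python
def imbalance_ratio(base_ratio: float, oversampling_pct: float) -> float:
    """Negatives per positive after the positive class grows by oversampling_pct percent.

    Assumption (not stated in the table): over-sampling by p% multiplies the
    number of positive cases by (1 + p/100) and leaves the negatives unchanged.
    """
    return base_ratio / (1 + oversampling_pct / 100)


# Baseline ratio at 0% over-sampling, as reported (already rounded) in the table.
BASE_RATIO = 16

# Rounded ratio for each over-sampling percentage listed in the table.
ratios = {pct: round(imbalance_ratio(BASE_RATIO, pct)) for pct in range(0, 1100, 100)}

for pct, ratio in ratios.items():
    print(f"{pct:>5}%  ratio ~ {ratio}")
```

Under these assumptions the rounded values agree with the last column of the table for every row, which suggests that each 100% step adds one further copy of the positive cases.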

ISSN: 1472-6947