BMC Medical Informatics and Decision Making

Table 4 Classification performance results for diseases of interest: Influenza, Diabetes, Pneumonia and HIV

From: Automatic classification of diseases from free-text death certificates for real-time surveillance

(a) Rule-based
Disease	Precision	Recall	F-measure	Confusion matrix
				Ground truth	Classifier -	Classifier +
Influenza	0.94	0.89	0.92	-	68430	2
				+	4	34
Pneumonia	0.98	0.97	0.97	-	59351	215
				+	274	8630
Diabetes	0.98	0.96	0.97	-	62519	100
				+	212	5639
HIV	0.93	0.85	0.89	-	68373	6
				+	14	77
Macro-average^a	0.94	0.96	0.95
Micro-average^b	0.98	0.98	0.98
(b) Machine learning
Disease	Precision	Recall	F-measure	Confusion matrix
				Ground truth	Classifier -	Classifier +
Influenza	0.84	0.95	0.89	-	68425	7
				+	2	36
Pneumonia	0.98	0.97	0.97	-	59364	202
				+	279	8625
Diabetes	0.98	0.99*	0.99*	-	62522	97
				+	72	5779
HIV	0.91	0.96	0.93	-	68370	9
				+	4	87
Macro-average^a	0.93	0.97	0.94
Micro-average^b	0.98	0.98	0.98

^aMacro-average is the mean of the precision, recall, and f-measure values from the four classes above
^bMicro-average aggregates the values from the confusion matrix for all the classes and calculates the measures over all the data
*Statistically significant difference between the rule-based and machine learning classifiers, as measured with a two-tailed z-test (p<0.05)
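
The footnotes above can be illustrated with a minimal sketch in Python. It recomputes precision, recall and F-measure from the rule-based confusion matrices in part (a), then forms a macro-average (mean of the per-class values) and a micro-average (measures computed on the summed counts). The names Counts, macro_average, micro_average and two_proportion_z_test are hypothetical and do not come from the article. The z-test shown is a pooled two-proportion test applied to Diabetes recall, which is an assumption: the table does not state which variant of the z-test was used or how the proportions were defined. The averages printed here need not match the average rows of the table, since the published figures may have been computed over unrounded values or a different set of classes.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """One 2x2 confusion matrix (rows: ground truth, columns: classifier)."""
    tn: int
    fp: int
    fn: int
    tp: int

    @property
    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def recall(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def f_measure(self) -> float:
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r)


# Counts copied from Table 4 (a), rule-based classifier.
RULE_BASED = {
    "Influenza": Counts(tn=68430, fp=2, fn=4, tp=34),
    "Pneumonia": Counts(tn=59351, fp=215, fn=274, tp=8630),
    "Diabetes": Counts(tn=62519, fp=100, fn=212, tp=5639),
    "HIV": Counts(tn=68373, fp=6, fn=14, tp=77),
}


def macro_average(matrices):
    """Mean of the per-class precision, recall and F-measure."""
    ms = list(matrices)
    n = len(ms)
    return (
        sum(m.precision for m in ms) / n,
        sum(m.recall for m in ms) / n,
        sum(m.f_measure for m in ms) / n,
    )


def micro_average(matrices):
    """Sum the counts over all classes, then compute the measures once."""
    ms = list(matrices)
    pooled = Counts(
        tn=sum(m.tn for m in ms),
        fp=sum(m.fp for m in ms),
        fn=sum(m.fn for m in ms),
        tp=sum(m.tp for m in ms),
    )
    return pooled.precision, pooled.recall, pooled.f_measure


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-tailed p-value).

    Assumption: the two samples are treated as independent, which may
    not hold when both classifiers are scored on the same records.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    for name, m in RULE_BASED.items():
        print(name, round(m.precision, 2), round(m.recall, 2), round(m.f_measure, 2))
    print("macro", macro_average(RULE_BASED.values()))
    print("micro", micro_average(RULE_BASED.values()))
    # Diabetes recall: rule-based 5639/5851 vs machine learning 5779/5851.
    print("z-test", two_proportion_z_test(5639, 5851, 5779, 5851))
```

The per-class values are derived from the counts rather than typed in, so a transcription error in either the counts or the rounded measures would show up as a mismatch against the table.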

ISSN: 1472-6947