BMC Medical Informatics and Decision Making

Table 8 Results for the Reddit and Twitter models on in- and out-of-source test data sets, compared to the baseline model trained on all of the data

From: Classifying patient and professional voice in social media health posts

Model(s)	Other: F1	Patient voice: F1	Prof. voice: F1	Macro F1	Acc.	Test set (size)
Reddit	0.94	0.95	0.86	0.92	0.95	Reddit (3933)
Twitter	0.74	0.69	0.00	0.47	0.71	Reddit (3933)
All	0.85	0.88	0.30	0.68	0.86	Reddit (3933)
Reddit	0.83	0.50	0.00	0.44	0.73	Twitter (1941)
Twitter	0.98	0.90	0.90	0.93	0.96	Twitter (1941)
All	0.90	0.64	0.26	0.60	0.83	Twitter (1941)
Reddit&Twitter	0.96	0.95	0.88	0.92	0.95	All (5474)
All	0.87	0.85	0.28	0.66	0.85	All (5474)

The last two rows give the results for the Reddit and Twitter models when each is tested on its own in-source test data and the results are combined, compared with the baseline model trained on all the data. We report F1 scores per label, macro-averaged F1, and accuracy across all three label types, as well as the size of each test set.
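
As a reading aid, the macro-averaged F1 column can be approximately recomputed from the per-label columns. The sketch below assumes that macro F1 is the unweighted mean of the three per-label F1 scores (the usual definition; the table itself does not spell this out). The names `rows`, `macro_f1`, and `max_abs_gap` are illustrative, and the values are copied from the table. Because the published per-label scores are rounded to two decimals, the recomputed means can differ slightly from the reported column.

```python
# Per-label F1 scores copied from Table 8, in the order
# (Other, Patient voice, Prof. voice), with the reported macro F1.
# Keys are (model, test set).
rows = {
    ("Reddit", "Reddit"): ((0.94, 0.95, 0.86), 0.92),
    ("Twitter", "Reddit"): ((0.74, 0.69, 0.00), 0.47),
    ("All", "Reddit"): ((0.85, 0.88, 0.30), 0.68),
    ("Reddit", "Twitter"): ((0.83, 0.50, 0.00), 0.44),
    ("Twitter", "Twitter"): ((0.98, 0.90, 0.90), 0.93),
    ("All", "Twitter"): ((0.90, 0.64, 0.26), 0.60),
    ("Reddit&Twitter", "All"): ((0.96, 0.95, 0.88), 0.92),
    ("All", "All"): ((0.87, 0.85, 0.28), 0.66),
}


def macro_f1(per_label_f1):
    """Unweighted mean of per-label F1 scores (assumed definition)."""
    return sum(per_label_f1) / len(per_label_f1)


def max_abs_gap(table):
    """Largest absolute gap between recomputed and reported macro F1."""
    return max(abs(macro_f1(f1s) - reported) for f1s, reported in table.values())


if __name__ == "__main__":
    for (model, test_set), (f1s, reported) in rows.items():
        print(f"{model:>14} on {test_set:<7} "
              f"recomputed={macro_f1(f1s):.3f} reported={reported:.2f}")
    print(f"largest gap: {max_abs_gap(rows):.3f}")
```

Under this assumption, a zero F1 on one label (for example, professional voice for the out-of-source models) pulls the macro average down sharply even when accuracy stays comparatively high, since every label counts equally regardless of how often it occurs.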


ISSN: 1472-6947
