Table 1 Description of the dataset by health topic

From: AutoDiscern: rating the quality of online health information with hierarchical encoder attention-based neural networks

| Topic | Breast Cancer | Arthritis | Depression |
|---|---|---|---|
| Number of Articles | 79 | 88 | 102 |
| Number of Sentences | 10,170 | 10,950 | 13,790 |
| Number of Tokens | 125,891 | 129,759 | 160,597 |
| Avg Sentences per Article | 129 | 124 | 135 |
| Avg Tokens per Article | 1,549 | 1,475 | 1,574 |
| Positive Class Prevalence | | | |
| Q4: References | 13% | 14% | 14% |
| Q5: Date | 20% | 26% | 24% |
| Q9: How Treatment Works | 85% | 28% | 52% |
| Q10: Treatment Benefits | 89% | 80% | 65% |
| Q11: Treatment Risks | 63% | 16% | 33% |
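
The "Avg Sentences per Article" row follows from the article and sentence counts in Table 1. As a minimal sketch in Python (the names `counts` and `avg_sentences` are illustrative and not taken from the paper, and rounding to the nearest integer is an assumption), the arithmetic is:

```python
# Counts copied from Table 1: (number of articles, number of sentences) per topic.
counts = {
    "Breast Cancer": (79, 10_170),
    "Arthritis": (88, 10_950),
    "Depression": (102, 13_790),
}

# Average sentences per article, rounded to the nearest integer
# (the rounding rule is an assumption, not stated in the table).
avg_sentences = {
    topic: round(sentences / articles)
    for topic, (articles, sentences) in counts.items()
}

for topic, avg in avg_sentences.items():
    print(f"{topic}: {avg}")
```

Under these assumptions the script prints 129, 124, and 135, matching the values reported in the table.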