Table 4. Performance Comparison between Human Manual Rating and Deep Learning Model. Manual performance is reported as percent agreement. Automated performance is reported as Implementation Accuracy (see Table 3).

From: AutoDiscern: rating the quality of online health information with hierarchical encoder attention-based neural networks

Question                              Manual Performance  Automated Performance
                                      2 raters  3 raters  80% coverage  100% coverage
Q4: References (HoN: Reference)       96%       89%       87%           84%
Q5: Date (HoN: Date)                  88%       80%       87%           83%
Q9: How Treatment Works               92%       -         82%           78%
Q10: Treatment Benefits               95%       -         83%           77%
Q11: Tt. Risks (HoN: Justifiability)  97%       74%       91%           81%