Corpus | Text type and size | Annotations (count) |
---|---|---|
NICTA-PIBOSO [14] | 10 000 sentences from 1000 MEDLINE abstracts | Sentences classified in the PIBOSO model: Population (812), Intervention (690), Background (2557), Outcome (4523), Study design (233) and Other (1564) |
Deléger et al. [16] | 52 FDA labels (96 675 tokens), 3503 clinical notes (>1M tokens) and CTAs (241 annotated with drugs, 51 793 tokens; 3000 annotated with disorders/symptoms, 647 246 tokens) | Disease and symptoms (12 388), medications and drug attributes (74 507) |
EBM corpus [17] | Clinical Enquiries section from the Journal of Family Practice, and excerpts from PubMed | Medical questions (456), bottom-line answers (1396), justifications (3036); these are matched to 2908 abstracts |
EBM-NLP [18] | 5000 abstracts about clinical trials from PubMed (>1M tokens) | Entities corresponding to PICO elements (counts not reported) |
Evidence Inference corpus [19] | More than 10 137 evidence questions (prompts) matched to 2419 PubMed articles about RCTs | Intervention results significantly increase (2428), significantly decrease (4470) or show no significant difference (3239) |
EBMSASS [22] | 1000 pairs of sentences of clinical evidence | Elements from the PIBOSO model (200 pairs for each class) |
Koroleva et al. [20] | Sentences from clinical trial studies in PubMed Central | Outcomes: Primary (2000 sentences) and Reported (1940) |
Chia [24] | 1000 texts from ClinicalTrials.gov (12 409 eligibility criteria) | 15 entity types (41 487) and 12 different relationships (25 017) |