Table 8 Five Way Classification Including 'Intervention' on Manually Annotated Abstracts

From: Sentence retrieval for abstracts of randomized controlled trials

                All Abstracts       Structured Subset   Unstructured Subset
                P     R     F       P     R     F       P     R     F
System 1        Accuracy = 90.14%   Accuracy = 91.39%   Accuracy = 87.35%
Aim             0.92  0.97  0.94    0.94  0.98  0.96    0.88  0.95  0.91
Method          0.85  0.81  0.83    0.86  0.83  0.84    0.80  0.76  0.78
Intervention    0.87  0.78  0.82    0.88  0.80  0.84    0.85  0.74  0.79
Results         0.91  0.97  0.92    0.91  0.95  0.93    0.89  0.92  0.90
Conclusion      0.96  0.94  0.95    0.98  0.94  0.96    0.92  0.94  0.93
System 2        Accuracy = 95.24%   Accuracy = 96.45%   Accuracy = 92.51%
Aim             0.94  0.99  0.99    0.96  1.00  0.98    0.90  0.96  0.93
Method          0.92  0.91  0.92    0.93  0.93  0.93    0.89  0.87  0.88
Intervention    0.87  0.79  0.83    0.88  0.80  0.84    0.85  0.75  0.80
Results         0.98  0.99  0.99    0.99  1.00  0.99    0.96  0.97  0.96
Conclusion      0.99  0.99  0.99    1.00  1.00  1.00    0.89  0.87  0.88
System 3        Accuracy = 95.60%   Accuracy = 96.45%   Accuracy = 94.55%
Aim             0.95  0.98  0.97    0.96  1.00  0.98    0.93  0.97  0.95
Method          0.92  0.92  0.92    0.93  0.93  0.93    0.91  0.91  0.91
Intervention    0.87  0.80  0.83    0.88  0.80  0.84    0.86  0.78  0.82
Results         0.98  0.99  0.99    0.99  1.00  0.99    0.97  0.99  0.98
Conclusion      0.99  0.99  0.99    1.00  1.00  1.00    0.99  0.98  0.99
System 4        Accuracy = 93.89%   Accuracy = 95.02%   Accuracy = 91.34%
Aim             0.95  0.97  0.96    0.96  0.99  0.97    0.92  0.93  0.93
Method          0.88  0.89  0.88    0.89  0.90  0.90    0.85  0.85  0.85
Intervention    0.77  0.71  0.74    0.78  0.73  0.75    0.76  0.69  0.72
Results         0.99  0.99  0.99    1.00  1.00  1.00    0.96  0.96  0.97
Conclusion      0.99  0.99  0.99    1.00  1.00  1.00    0.96  0.98  0.97
Sentence classification using CRFs into five classes, including Intervention. Results are reported for four systems. System 1: baseline system. System 2: feature vectors augmented with section headings from the four rhetorical roles, which are either mapped from the original headings in structured abstracts or predicted by the four-class CRF model for unstructured abstracts. System 3 (oracle): feature vectors augmented with manually corrected section headings. System 4: same as System 2, except that the training data is also augmented with training data from Set I. Precision (P), Recall (R) and F-score (F) are reported for each label over the entire data set (318), the structured subset (211) and the unstructured subset (107).
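As a reading aid, the relation between the P, R and F columns can be sketched in a few lines of Python. This assumes that F is the balanced F-score (the harmonic mean of precision and recall, i.e. beta = 1), which the note above does not state explicitly. The function name `f_score` and the `beta` parameter are illustrative and are not taken from the paper.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score for a precision/recall pair.

    Assumption: the table's F column is the balanced score (beta = 1),
    i.e. the harmonic mean of precision and recall.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both values are zero; the score is conventionally taken as 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# (P, R) pairs copied from the System 1, "All Abstracts" columns of the table.
system1_all = {
    "Aim": (0.92, 0.97),
    "Method": (0.85, 0.81),
    "Intervention": (0.87, 0.78),
}

for label, (p, r) in system1_all.items():
    print(f"{label}: F = {f_score(p, r):.2f}")
```

For these three rows the recomputed values agree with the published F column (0.94, 0.83 and 0.82). Because the published P and R are already rounded to two decimals, recomputing F from them may differ from a reported value in the last digit for other rows.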