Table 2 Precision, recall and F1 score of CLAMP, cTAKES, and MetaMap on 20,408 ASD-related PubMed abstracts

From: Natural language processing (NLP) tools in extracting biomedical concepts from research articles: a case study on autism spectrum disorder

| Tool | Number of true positives | Number of true entities | Number of predicted entities | Precision | Recall | F1 score |
| --- | --- | --- | --- | --- | --- | --- |
| CLAMP unfiltered | 96,235 | 106,284 | 370,654 | 0.26 | 0.91 | 0.40 |
| CLAMP filtered | 89,185 | 106,284 | 118,862 | 0.75 | 0.84 | 0.79 |
| cTAKES unfiltered | 101,219 | 106,284 | 489,520 | 0.21 | 0.95 | 0.34 |
| cTAKES filtered | 101,127 | 106,284 | 185,966 | 0.54 | 0.95 | 0.69 |
| MetaMap unfiltered | 97,992 | 106,286 | 1,839,606 | 0.05 | 0.92 | 0.10 |
| MetaMap filtered | 92,570 | 106,286 | 224,282 | 0.41 | 0.87 | 0.56 |
  1. The number of true entities is the number of benchmark (BM) ASD terms found in the texts. MetaMap has a slightly different number of true entities than CLAMP and cTAKES because of the pre-processing needed to run MetaMap on the texts. Details on how the statistics were computed can be found in “Methods”.
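
The reported scores are consistent with the standard definitions of precision, recall and F1 applied to the three count columns. The sketch below is a minimal illustration under that assumption; the authoritative procedure is the one described in “Methods”. The function name `precision_recall_f1` and its parameter names are hypothetical and chosen only for this example.

```python
def precision_recall_f1(true_positives, true_entities, predicted_entities):
    """Compute precision, recall and F1 from raw counts.

    Assumes the standard definitions:
      precision = TP / predicted entities
      recall    = TP / true entities
      F1        = harmonic mean of precision and recall
    Values are rounded to two decimals, matching the table's precision.
    """
    precision = true_positives / predicted_entities if predicted_entities else 0.0
    recall = true_positives / true_entities if true_entities else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return round(precision, 2), round(recall, 2), round(f1, 2)


# Counts taken from the "CLAMP filtered" row of the table.
clamp_filtered = precision_recall_f1(89_185, 106_284, 118_862)
print(clamp_filtered)
```

Applied to the other rows, the same calculation shows why filtering helps: it sharply reduces the number of predicted entities while removing comparatively few true positives, so precision rises much more than recall falls.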