Table 1 Precision, recall and F1 scores of CLAMP, cTAKES, and MetaMap on 544 ASD-related full-text PubMed articles

From: Natural language processing (NLP) tools in extracting biomedical concepts from research articles: a case study on autism spectrum disorder

| Tool (setting) | Number of true positives | Number of true entities | Number of predicted entities | Precision | Recall | F1 score |
|---|---|---|---|---|---|---|
| CLAMP unfiltered | 43,330 | 48,706 | 256,525 | 0.17 | 0.89 | 0.28 |
| CLAMP filtered | 39,533 | 48,706 | 65,037 | 0.61 | 0.81 | 0.70 |
| cTAKES unfiltered | 45,579 | 48,706 | 337,125 | 0.14 | 0.94 | 0.24 |
| cTAKES filtered | 45,509 | 48,706 | 103,783 | 0.44 | 0.93 | 0.60 |
| MetaMap unfiltered | 47,544 | 48,804 | 1,726,985 | 0.03 | 0.97 | 0.05 |
| MetaMap filtered | 45,078 | 48,804 | 145,926 | 0.31 | 0.92 | 0.46 |
  1. The number of true entities is the number of benchmark (BM) ASD terms found in the texts. MetaMap has a slightly different number of true entities than CLAMP and cTAKES because of the pre-processing needed to run MetaMap on the texts. Details on how the statistics were computed can be found in “Methods”.
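
The scores in Table 1 can be checked against the counts with the standard definitions: precision is true positives divided by predicted entities, recall is true positives divided by true entities, and F1 is their harmonic mean. The sketch below assumes those definitions apply here; the function name `prf` and the `rows` dictionary are illustrative and not taken from the article, whose Methods section remains the authoritative description of how the statistics were computed.

```python
def prf(true_positives, true_entities, predicted_entities):
    """Return (precision, recall, F1) from raw entity counts.

    Assumes the standard definitions; names here are illustrative.
    """
    if true_entities <= 0 or predicted_entities <= 0:
        raise ValueError("entity counts must be positive")
    precision = true_positives / predicted_entities
    recall = true_positives / true_entities
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts copied from Table 1: (true positives, true entities, predicted entities)
rows = {
    "CLAMP unfiltered": (43_330, 48_706, 256_525),
    "CLAMP filtered": (39_533, 48_706, 65_037),
    "cTAKES unfiltered": (45_579, 48_706, 337_125),
    "cTAKES filtered": (45_509, 48_706, 103_783),
    "MetaMap unfiltered": (47_544, 48_804, 1_726_985),
    "MetaMap filtered": (45_078, 48_804, 145_926),
}

for name, counts in rows.items():
    p, r, f = prf(*counts)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Under these assumptions, the values rounded to two decimals agree with the precision, recall, and F1 columns of the table. The counts also show why filtering helps: it removes most false positives (for example, CLAMP predictions drop from 256,525 to 65,037) while losing comparatively few true positives, so precision rises much more than recall falls.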