Table 4 Cluster quality metrics for Lingo and UTC

From: ASCOT: a text mining-based web-service for efficient search and assisted creation of clinical trials

Metric                          Lingo   UTC
Cluster purity                  0.423   0.825
Pairwise cluster contamination  0.644   0.242
Within-cluster similarity       0.363   0.531
The table shows the cluster purity, pairwise cluster contamination and within-cluster similarity scores achieved by the clustering and cluster-labeling algorithms Lingo and UTC. Experiments used 5-fold cross-validation and were performed on 1800 clinical trial protocols containing 9 frequently occurring query words.
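
As an illustration of the first metric, here is a minimal Python sketch of cluster purity. The table does not state the exact formula used in the experiments, so this assumes the common textbook definition: each cluster is credited with its most frequent gold-standard class, and purity is the credited fraction of all documents. The function name `cluster_purity` and its arguments are hypothetical and are not taken from ASCOT, Lingo or UTC.

```python
from collections import Counter


def cluster_purity(cluster_ids, gold_labels):
    """Purity under the assumed textbook definition:
    (1 / N) * sum over clusters of the size of the majority gold class."""
    if len(cluster_ids) != len(gold_labels):
        raise ValueError("cluster_ids and gold_labels must have equal length")
    if not cluster_ids:
        raise ValueError("at least one document is required")

    # Count gold labels separately inside each cluster.
    per_cluster = {}
    for cluster, label in zip(cluster_ids, gold_labels):
        per_cluster.setdefault(cluster, Counter())[label] += 1

    # Credit every cluster with its most frequent gold label.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(cluster_ids)
```

For example, `cluster_purity([0, 0, 0, 1, 1, 1], ["a", "a", "b", "b", "b", "a"])` credits two documents to each cluster and gives 4/6, or about 0.667. Under this definition higher values are better, and a clustering that matches the gold classes exactly scores 1.0.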