
Table 4 Cluster quality metrics for Lingo and UTC

From: ASCOT: a text mining-based web-service for efficient search and assisted creation of clinical trials

 

| Metric | Lingo | UTC |
| --- | --- | --- |
| Cluster purity | 0.423 | 0.825 |
| Pairwise cluster contamination | 0.644 | 0.242 |
| Within-cluster similarity | 0.363 | 0.531 |

  1. The table shows the cluster purity, pairwise cluster contamination and within-cluster similarity scores achieved by the clustering and cluster-labeling algorithms Lingo and UTC. Experiments used 5-fold cross-validation and were performed on 1800 clinical trial protocols containing 9 frequently occurring query words.
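
For readers unfamiliar with the first metric, the following is a minimal sketch of cluster purity under its common textbook definition: the fraction of items whose gold label matches the majority label of the cluster they were assigned to. The table does not state the exact definitions used in the experiments, so this is an assumption, and the function name `cluster_purity` and the toy data are hypothetical.

```python
from collections import Counter


def cluster_purity(cluster_ids, class_labels):
    """Textbook purity (an assumed definition, not confirmed by the source).

    Sums the size of the majority class in every cluster and divides
    by the total number of items.
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have the same length")
    if not cluster_ids:
        raise ValueError("inputs must be non-empty")

    # Count gold labels separately inside each cluster.
    per_cluster = {}
    for cid, label in zip(cluster_ids, class_labels):
        per_cluster.setdefault(cid, Counter())[label] += 1

    # Each cluster contributes the count of its most frequent label.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(cluster_ids)


if __name__ == "__main__":
    # Toy data: two clusters of three items, each with one mislabeled item.
    clusters = [0, 0, 0, 1, 1, 1]
    labels = ["a", "a", "b", "b", "b", "a"]
    print(round(cluster_purity(clusters, labels), 3))
```

Under this definition a score of 1.0 means every cluster contains items of a single class, so a higher value indicates cleaner clusters.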