
Table 4 Comparison among embeddings for weakly supervised Text-to-UMLS linking from MIMIC-III discharge summaries

From: Ontology-driven and weakly supervised rare disease identification from clinical notes

 

Validation set: n = 142+/400; test set: n = 187+/673 (see note below).

| Text-to-UMLS   | P (validation) | R (validation) | \(F_1\) (validation) | P (test) | R (test) | \(F_1\) (test) |
|----------------|---:|---:|---:|---:|---:|---:|
| Word2Vec-100   | 86.6 | 50.0 | 63.4 | **85.1** | 61.0 | 71.0 |
| Word2Vec-300   | 85.7 | 59.2 | 70.0 | 80.7 | 69.5 | 74.7 |
| Word2Vec-768   | 85.1 | 68.3 | 75.8 | 78.9 | 78.1 | 78.5 |
| BERT           | 88.1 | 83.8 | 85.9 | 79.5 | 91.4 | 85.1 |
| PubMedBERT     | 88.7 | 77.5 | 82.7 | 79.6 | 87.7 | 83.5 |
| SapBERT        | 88.3 | 79.6 | 83.7 | 80.8 | 89.8 | 85.1 |
| BlueBERT-base  | **90.1** | **89.4** | **89.8** | 80.4 | **92.0** | **85.8** |
| + fine-tuning  | 84.6 | 88.7 | 86.6 | 73.5 | **92.0** | 81.7 |
| BlueBERT-large | 89.1 | 80.3 | 84.4 | 79.0 | 88.8 | 83.6 |
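For reference, P and R in the header abbreviate precision and recall on the validation/test sets, and \(F_1\) is their harmonic mean; these are the standard definitions, stated here only as a reminder:

\[
P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2\,P\,R}{P + R}
\]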

  1. The column statistics (n = \(N_+\)/N) give the number of positive samples \(N_+\) and the total number of samples N in each dataset. All Word2Vec-k embeddings were pre-trained on MIMIC-III discharge summaries; a mention is represented as the average of the k-dimensional embeddings of the tokens in its context window. BERT models were used as static feature extractors (taking the second-to-last layer) unless marked with "+ fine-tuning". The best scores, whether or not considering strong supervision (SS), are bolded. We did not tune the number of randomly sampled weakly supervised training examples for the BlueBERT-base model (or any other model), so its results are slightly below those in Table 3
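As a concrete illustration of the two representation strategies described in the note, here is a minimal, hypothetical Python sketch, not the authors' code: the toy corpus, the context-window tokens, the generic bert-base-uncased checkpoint, and the mean-pooling over tokens are all assumptions made for demonstration, whereas the paper's pipeline used MIMIC-III text and the BERT variants listed in the table.

```python
# Minimal sketch of the two mention-representation strategies (assumptions:
# toy corpus, window size, and bert-base-uncased stand in for the paper's setup).
import numpy as np
import torch
from gensim.models import Word2Vec
from transformers import AutoModel, AutoTokenizer

# --- Word2Vec-k: mention = mean of the k-dimensional token vectors in a context window ---
toy_corpus = [["patient", "with", "erythema", "nodosum", "on", "admission"]]
w2v = Word2Vec(sentences=toy_corpus, vector_size=100, window=5, min_count=1)

def w2v_mention_vector(context_tokens, model):
    """Average the embeddings of in-vocabulary tokens in the context window."""
    vecs = [model.wv[t] for t in context_tokens if t in model.wv]
    return np.mean(vecs, axis=0)

context = ["with", "erythema", "nodosum", "on"]  # window around the mention
print(w2v_mention_vector(context, w2v).shape)    # (100,)

# --- BERT as static features: second-to-last hidden layer, no fine-tuning ---
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
bert.eval()

with torch.no_grad():
    enc = tok("patient with erythema nodosum on admission", return_tensors="pt")
    hidden = bert(**enc).hidden_states[-2]  # second-to-last layer: (1, seq_len, 768)
    mention_repr = hidden.mean(dim=1)       # pool token states into one 768-d vector
print(mention_repr.shape)                   # torch.Size([1, 768])
```

In the "+ fine-tuning" row, by contrast, the BERT weights themselves would be updated on the weakly labelled data rather than frozen as above.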