
Table 4 Coding results for CodiEsp dataset with \(f_s = 180\)

From: Comparison of different feature extraction methods for applicable automated ICD coding

| Feature extraction & classifiers | Macro-F1 (%) | Micro-F1 (%) | Macro-AUC (%) | Micro-AUC (%) |
|---|---|---|---|---|
| BoW | | | | |
|  LR_uni | 63.55 | 63.68 | 70.44 | 70.04 |
|  SVM_uni | 70.68 | 70.34 | 75.13 | 74.45 |
|  LR_uni_bi | 63.93 | 63.85 | 72.36 | 71.08 |
|  SVM_uni_bi | 72.46 | 72.27 | 77.22 | 75.95 |
|  LR_uni_bi_tri | 62.41 | 62.26 | 71.39 | 69.97 |
|  SVM_uni_bi_tri | 69.48 | 69.26 | 75.39 | 73.90 |
| W2V | | | | |
|  LR_word | 56.07 | 56.07 | 64.39 | 64.62 |
|  SVM_word | 59.52 | 59.63 | 66.86 | 67.11 |
| BERT_embeddings | | | | |
|  LR_word | 64.00 | 63.90 | 69.33 | 68.61 |
|  SVM_word | 59.15 | 59.02 | 64.29 | 64.11 |
|  LR_comb | 61.26 | 60.91 | 66.32 | 65.85 |
|  SVM_comb | 62.52 | 62.45 | 67.68 | 67.73 |
| BERT_finetune | | | | |
|  top_layer | 17.21 | 22.19 | 48.79 | 49.40 |
|  whole | 85.32 | 85.41 | 91.44 | 92.82 |

  1. Except for BERT_embeddings, the suffixes have the same meanings as in Table 3. For BERT_embeddings, _word denotes the BERT-mini embeddings alone, and _comb denotes the concatenation of the BERT-mini embeddings and the W2V word embeddings.
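
To help read the Macro-F1 and Micro-F1 columns, the following is a minimal sketch of how the two averages are commonly defined for multi-label coding. It is not the authors' evaluation code. The function names, the toy codes "A" and "B", and the four toy documents are assumptions made for illustration and are not taken from CodiEsp.

```python
from typing import List, Set, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; taken as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(
    gold: List[Set[str]], pred: List[Set[str]], labels: List[str]
) -> Tuple[float, float]:
    """Return (macro-F1, micro-F1) for multi-label predictions."""
    # Per-code counts: [true positives, false positives, false negatives]
    counts = {c: [0, 0, 0] for c in labels}
    for g, p in zip(gold, pred):
        for c in labels:
            if c in p and c in g:
                counts[c][0] += 1
            elif c in p:
                counts[c][1] += 1
            elif c in g:
                counts[c][2] += 1

    # Macro: compute F1 per code, then average, so every code weighs the same.
    macro = sum(f1(*counts[c]) for c in labels) / len(labels)

    # Micro: pool the counts over all codes, then compute a single F1,
    # so frequent codes dominate the score.
    tp, fp, fn = (sum(counts[c][i] for c in labels) for i in range(3))
    micro = f1(tp, fp, fn)
    return macro, micro


# Hypothetical toy data: code "A" is frequent and always right,
# code "B" is rare and always wrong.
gold = [{"A"}, {"A"}, {"A", "B"}, {"A"}]
pred = [{"A"}, {"A"}, {"A"}, {"A", "B"}]
macro, micro = macro_micro_f1(gold, pred, ["A", "B"])
print(f"macro-F1 = {macro:.2f}, micro-F1 = {micro:.2f}")
# prints "macro-F1 = 0.50, micro-F1 = 0.80"
```

In this toy case the rare code pulls the macro average well below the micro average, which is one way a gap between the two columns can arise.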