Table 4 Coding results for CodiEsp dataset with \(f_s = 180\)

From: Comparison of different feature extraction methods for applicable automated ICD coding

| Feature extraction & classifiers | Macro-F1 (%) | Micro-F1 (%) | Macro-AUC (%) | Micro-AUC (%) |
|---|---|---|---|---|
| BoW | | | | |
| LR_uni | 63.55 | 63.68 | 70.44 | 70.04 |
| SVM_uni | 70.68 | 70.34 | 75.13 | 74.45 |
| LR_uni_bi | 63.93 | 63.85 | 72.36 | 71.08 |
| SVM_uni_bi | 72.46 | 72.27 | 77.22 | 75.95 |
| LR_uni_bi_tri | 62.41 | 62.26 | 71.39 | 69.97 |
| SVM_uni_bi_tri | 69.48 | 69.26 | 75.39 | 73.90 |
| W2V | | | | |
| LR_word | 56.07 | 56.07 | 64.39 | 64.62 |
| SVM_word | 59.52 | 59.63 | 66.86 | 67.11 |
| BERT_embeddings | | | | |
| LR_word | 64.00 | 63.90 | 69.33 | 68.61 |
| SVM_word | 59.15 | 59.02 | 64.29 | 64.11 |
| LR_comb | 61.26 | 60.91 | 66.32 | 65.85 |
| SVM_comb | 62.52 | 62.45 | 67.68 | 67.73 |
| BERT_finetune | | | | |
| top_layer | 17.21 | 22.19 | 48.79 | 49.40 |
| whole | 85.32 | 85.41 | 91.44 | 92.82 |
  1. Except for BERT_embeddings, the suffixes have the same meanings as in Table 3. For BERT_embeddings, _word means that only the BERT-mini embeddings are used, and _comb means that the BERT-mini embeddings are concatenated with the W2V word embeddings
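
The Macro-F1 and Micro-F1 columns can differ when labels are unevenly represented, so a small illustration of the two averaging schemes may help in reading the table. The sketch below uses the standard definitions: macro-F1 as the unweighted mean of per-label F1 scores, and micro-F1 as the F1 computed from true positives, false positives and false negatives pooled over all labels. The function names, the label names "A" and "B", and the toy predictions are hypothetical and are not taken from the paper, whose exact evaluation code is not shown here.

```python
from typing import List, Sequence, Set, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(
    y_true: Sequence[Set[str]],
    y_pred: Sequence[Set[str]],
    labels: List[str],
) -> Tuple[float, float]:
    """Return (macro_f1, micro_f1) for multi-label predictions.

    Assumption: macro-F1 is the plain mean of per-label F1 scores.
    """
    per_label_f1 = []
    total_tp = total_fp = total_fn = 0
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
        per_label_f1.append(f1_from_counts(tp, fp, fn))
        # Pool the counts across labels for the micro average.
        total_tp += tp
        total_fp += fp
        total_fn += fn
    macro = sum(per_label_f1) / len(per_label_f1)
    micro = f1_from_counts(total_tp, total_fp, total_fn)
    return macro, micro


# Toy data (hypothetical): four documents, two labels, label "B" never predicted.
labels = ["A", "B"]
y_true = [{"A"}, {"A"}, {"A", "B"}, {"B"}]
y_pred = [{"A"}, {"A"}, {"A"}, {"A"}]

macro_f1, micro_f1 = macro_micro_f1(y_true, y_pred, labels)
print(f"macro-F1 = {macro_f1:.4f}, micro-F1 = {micro_f1:.4f}")
```

In this toy case the label that is never predicted pulls the macro average down more than the micro average, because macro averaging weights every label equally while micro averaging weights every individual decision equally.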