From: Comparison of different feature extraction methods for applicable automated ICD coding
 | Fuwai: Word | Fuwai: Character | Fuwai: Code | CodiEsp: Word | CodiEsp: Code |
---|---|---|---|---|---|
Token size | 691,418 | 1,557,769 | 44,366 | 161,078 | 11,158 |
Vocabulary size | 9,130 | 1,768 | 1,532 | 14,885 | 2,557 |
Average length | 99.5 | 224.2 | 6.4 | 161.1 | 11.2 |
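
The table does not define its three statistics, so the following is only a minimal sketch of how such figures are commonly computed. It assumes that token size is the total number of tokens, vocabulary size is the number of distinct tokens, and average length is the number of tokens per document. The function name `corpus_stats` and the toy documents are hypothetical and are not taken from the paper.

```python
from collections import Counter


def corpus_stats(documents):
    """Compute corpus statistics for a list of tokenized documents.

    `documents` is a list of token lists (words, characters, or codes).
    The definitions used here are assumptions, not the paper's stated method.
    """
    # Count every token occurrence across all documents.
    counts = Counter(token for doc in documents for token in doc)
    token_size = sum(counts.values())   # total number of tokens
    vocabulary_size = len(counts)       # number of distinct tokens
    # Mean number of tokens per document; guard against an empty corpus.
    average_length = token_size / len(documents) if documents else 0.0
    return {
        "token_size": token_size,
        "vocabulary_size": vocabulary_size,
        "average_length": average_length,
    }


# Toy example with two made-up, word-level documents.
docs = [
    ["chest", "pain", "on", "exertion"],
    ["chest", "x-ray", "normal"],
]
stats = corpus_stats(docs)
print(stats)
```

Under this reading, dividing a column's token size by its average length gives a rough estimate of the number of documents in that dataset.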