Table 3 Distribution of entities in two datasets

From: An attention-based deep learning model for clinical named entity recognition of Chinese electronic medical records

 

| Entity type (CCKS 2018) | CCKS 2018 training set (600) | CCKS 2018 test set (400) | Entity type (CCKS 2017) | CCKS 2017 training set (300) | CCKS 2017 test set (100) |
| --- | --- | --- | --- | --- | --- |
| Anatomical Part | 7838 (52%) | 6339 (63%) | Body Part | 10,719 (36%) | 3021 (32%) |
| Symptom Description | 2066 (14%) | 918 (9%) | Symptom | 7831 (26%) | 2311 (24%) |
| Independent Symptom | 3055 (20%) | 1327 (13%) | Diagnosis | 722 (2%) | 553 (6%) |
| Drug | 1005 (7%) | 813 (8%) | Test | 9546 (32%) | 3143 (33%) |
| Operation | 1116 (7%) | 735 (7%) | Treatment | 1048 (4%) | 465 (5%) |