Table 2 Token distribution of the seven types of entities in three kinds of dataset

From: Extracting clinical named entity for pituitary adenomas from Chinese electronic medical records

| Entity | Training set | Validating set | Testing set |
| --- | --- | --- | --- |
| Symptom | 10,880 | 3633 | 3655 |
| Body region | 981 | 451 | 507 |
| Disease | 3760 | 1260 | 1339 |
| Surgery | 616 | 215 | 165 |
| Medication | 742 | 205 | 197 |
| Family history | 137 | 46 | 61 |
| Disease course | 281 | 82 | 104 |
| All | 17,367 | 5892 | 6028 |
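
The "All" row can be checked against the per-entity rows with a short tally. The sketch below is illustrative only: the variable names are hypothetical, and the figures are copied from the table. Summing the seven entity rows reproduces the validating and testing totals (5892 and 6028), while the training rows sum to 17,397, which is 30 more than the listed 17,367. The table does not indicate which training figure accounts for the difference.

```python
# Token counts per entity type, copied from Table 2.
# Tuple order: (training, validating, testing). Names here are illustrative.
counts = {
    "Symptom": (10880, 3633, 3655),
    "Body region": (981, 451, 507),
    "Disease": (3760, 1260, 1339),
    "Surgery": (616, 215, 165),
    "Medication": (742, 205, 197),
    "Family history": (137, 46, 61),
    "Disease course": (281, 82, 104),
}

# Totals as listed in the "All" row of the table.
reported_all = (17367, 5892, 6028)

# Sum each column across the seven entity types.
computed_all = tuple(sum(column) for column in zip(*counts.values()))

for split, computed, reported in zip(
    ("training", "validating", "testing"), computed_all, reported_all
):
    print(f"{split}: computed={computed} reported={reported} "
          f"difference={computed - reported}")
```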