
Table 1 Statistics of CCKS2017_CNER and ICRC_CNER for entity recognition in Chinese clinical text

From: Entity recognition in Chinese clinical text using attention-based CNN-LSTM-CRF

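| Dataset (CCKS2017_CNER) | #Record | #Clinical Entity | | | | | |
|---|---|---|---|---|---|---|---|
| | | #Body | #Disease | #Symptom | #Test | #Treatment | #All |
| Training (300) | 300 | 10,719 | 722 | 7831 | 9546 | 1048 | 29,866 |
| Test (100) | 100 | 3021 | 553 | 2311 | 3143 | 465 | 9493 |
| Total (400) | 400 | 13,740 | 1275 | 10,142 | 12,689 | 1513 | 39,359 |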

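| Dataset (ICRC_CNER) | #Record | #Clinical Entity | | | | | |
|---|---|---|---|---|---|---|---|
| | | #Medication | #Disease | #Symptom | #Test | #Treatment | #All |
| Training | 600 | 1293 | 11,470 | 5270 | 17,024 | 3065 | 38,122 |
| | | 0 | 7441 | 75 | 7 | 107 | 7630 |
| Development | 176 | 475 | 3594 | 1738 | 5276 | 938 | 12,021 |
| | | 0 | 2421 | 37 | 3 | 41 | 2502 |
| Test | 400 | 999 | 7932 | 3353 | 11,326 | 2020 | 25,630 |
| | | 3 | 5153 | 57 | 6 | 61 | 5280 |
| Total | 1176 | 2767 | 22,996 | 10,361 | 33,626 | 6023 | 75,773 |
| | | 3 | 15,015 | 169 | 16 | 209 | 15,412 |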
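
The CCKS2017_CNER figures can be sanity-checked with simple arithmetic: in each row the per-type counts should add up to #All, and Training plus Test should equal Total. The following is a minimal sketch of that check, not part of the original article; the dictionary layout and the helper names `row_is_consistent` and `splits_match_total` are assumptions made for illustration, and the numbers are copied from the CCKS2017_CNER rows of the table.

```python
# Counts copied from the CCKS2017_CNER rows of Table 1.
# Order of "counts": body, disease, symptom, test, treatment.
# The data layout and function names are illustrative assumptions.
ccks = {
    "Training": {"records": 300, "counts": [10719, 722, 7831, 9546, 1048], "all": 29866},
    "Test": {"records": 100, "counts": [3021, 553, 2311, 3143, 465], "all": 9493},
    "Total": {"records": 400, "counts": [13740, 1275, 10142, 12689, 1513], "all": 39359},
}


def row_is_consistent(row):
    """Return True when the per-type counts add up to the reported #All."""
    return sum(row["counts"]) == row["all"]


def splits_match_total(table):
    """Return True when Training + Test equals Total for every column."""
    train, test, total = table["Training"], table["Test"], table["Total"]
    summed_counts = [a + b for a, b in zip(train["counts"], test["counts"])]
    summed_records = train["records"] + test["records"]
    return summed_counts == total["counts"] and summed_records == total["records"]


for name, row in ccks.items():
    print(name, "row sum matches #All:", row_is_consistent(row))
print("Training + Test equals Total:", splits_match_total(ccks))
```

The same two checks apply to the ICRC_CNER rows once the Development split is added to the sum.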