
Table 6 Statistics of the ParaMed corpus

From: ParaMed: a parallel corpus for English–Chinese translation in the biomedical domain

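| Language | Articles | Sentences | Avg. Len. | Tokens | Unique Tokens |
|----------|----------|-----------|-----------|-----------|---------------|
| English | 1,966 | 97,441 | 31.08 | 3,028,434 | 55,673 |
| Chinese | 1,966 | 97,441 | 29.93 | 2,916,779 | 46,700 |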