Table 6 Statistics of the ParaMed corpus

From: ParaMed: a parallel corpus for English–Chinese translation in the biomedical domain

Language  Articles  Sentences  Avg. Len.  Tokens     Unique Tokens
English   1,966     97,441     31.08      3,028,434  55,673
Chinese                        29.93      2,916,779  46,700
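
The Avg. Len. column appears to be tokens per sentence. The short check below is a sketch under two assumptions that the table does not state outright: that the Chinese row shares the 97,441 sentences listed for English (as expected for a sentence-aligned parallel corpus), and that averages are rounded to two decimals. The names `SENTENCES`, `TOKENS`, and `avg_len` are illustrative only.

```python
# Figures copied from Table 6. Using one sentence count for both
# languages is an assumption (the Chinese row does not list its own).
SENTENCES = 97_441
TOKENS = {"English": 3_028_434, "Chinese": 2_916_779}


def avg_len(tokens: int, sentences: int) -> float:
    """Average sentence length in tokens, rounded to two decimals."""
    return round(tokens / sentences, 2)


if __name__ == "__main__":
    for language, tokens in TOKENS.items():
        print(language, avg_len(tokens, SENTENCES))
```

Under these assumptions the computed values match the reported averages of 31.08 (English) and 29.93 (Chinese), which supports reading the blank Chinese cells as shared with the English row.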