
Table 1 Partition characteristics

From: Creating a medical dictionary using word alignment: The influence of sources and resources

| Partition | Content | Word correlation | Character correlation | Word ratio difference | Rubrics | English words per rubric, mean (SD) | Swedish words per rubric, mean (SD) | English unique words | Swedish unique words | English unique words per rubric | Swedish unique words per rubric |
|---|---|---|---|---|---|---|---|---|---|---|---|
| All | All terminology systems | 0.78 | 0.79 | | 38,575 | 3.7 (2.9) | 3.3 (3.0) | 17,679 | 25,848 | 0.5 | 0.7 |
| 1 | MeSH, one word in either English or Swedish rubric | | | 0.56 | 13,514 | 1.5 (0.7) | 1.0 (0.1) | 11,267 | 13,581 | 0.8 | 1.0 |
| 2 | MeSH, more than one word in both English and Swedish rubrics | 0.52 | 0.71 | 0.30 | 5,568 | 2.6 (0.8) | 2.3 (0.7) | 5,434 | 6,443 | 1.0 | 1.2 |
| 3 | ICF, whole | 0.69 | 0.79 | 0.53 | 1,496 | 4.7 (2.5) | 4.2 (2.8) | 991 | 1,263 | 0.7 | 0.8 |
| 4 | KSH97-P, whole | 0.70 | 0.67 | 0.49 | 968 | 4.0 (2.5) | 3.5 (2.4) | 1,324 | 1,382 | 1.4 | 1.4 |
| 5 | ICD-10, except chapter 2 level 4 | 0.77 | 0.75 | 0.37 | 10,791 | 5.2 (3.0) | 5.2 (3.4) | 5,144 | 7,219 | 0.5 | 0.7 |
| 6 | NCSP, except chapter N | 0.64 | 0.63 | 0.38 | 4,137 | 5.8 (2.7) | 5.0 (2.5) | 1,758 | 2,347 | 0.4 | 0.6 |
| 7 | ICD-10, chapter 2 level 4 | 0.38 | 0.45 | 0.71 | 713 | 3.6 (2.2) | 6.3 (2.7) | 443 | 535 | 0.6 | 0.8 |
| 8 | NCSP, chapter N | 0.55 | 0.48 | 0.25 | 1,388 | 9.4 (2.6) | 7.7 (2.3) | 249 | 285 | 0.2 | 0.2 |
  1. Content of the partitions.
  2. Kendall's tau-b correlation between the English rubrics and the corresponding Swedish rubrics, by number of words and by number of characters, and the average absolute difference between the word ratio of the rubrics in the partition and the grand mean, for the different terminology partitions.
  3. Number of parallel rubrics, mean and standard deviation of the number of words per rubric, number of unique words, and average number of unique words per rubric for the different terminology partitions.
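
As a rough illustration of how the correlation and unique-word columns described in the notes above could be computed, the following is a minimal Python sketch. It is not the authors' implementation: the function names, the whitespace tokenisation, the lower-casing, and the toy rubric pairs are all assumptions made for this example, and the rubric pairs are invented rather than taken from the terminology systems. The tau-b function uses the standard pairwise definition, in which pairs tied in only one variable count towards the denominator term of the other variable.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length numeric sequences (O(n^2) pairwise form)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: contributes to neither term
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied_x = concordant + discordant + tied_y_only  # pairs not tied in x
    untied_y = concordant + discordant + tied_x_only  # pairs not tied in y
    if untied_x == 0 or untied_y == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / sqrt(untied_x * untied_y)


def word_counts(rubrics):
    """Number of whitespace-separated words in each rubric (assumed tokenisation)."""
    return [len(r.split()) for r in rubrics]


def char_counts(rubrics):
    """Number of characters in each rubric."""
    return [len(r) for r in rubrics]


def unique_words_per_rubric(rubrics):
    """Unique lower-cased words divided by the number of rubrics."""
    vocabulary = {w.lower() for r in rubrics for w in r.split()}
    return len(vocabulary) / len(rubrics)


if __name__ == "__main__":
    # Hypothetical parallel rubrics, invented for illustration only.
    pairs = [
        ("Heart failure", "Hjärtsvikt"),
        ("Fracture of femur", "Fraktur på lårben"),
        ("Asthma", "Astma"),
        ("Diabetes mellitus type 2", "Diabetes mellitus typ 2"),
    ]
    english = [en for en, _ in pairs]
    swedish = [sv for _, sv in pairs]
    print("word correlation:", kendall_tau_b(word_counts(english), word_counts(swedish)))
    print("character correlation:", kendall_tau_b(char_counts(english), char_counts(swedish)))
    print("English unique words per rubric:", unique_words_per_rubric(english))
```

Under this reading, the "unique words per rubric" columns are the unique-word counts divided by the number of rubrics; for the "All" row, 17,679 / 38,575 ≈ 0.46, which matches the tabulated 0.5 after rounding.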