
Table 2 Rare disease classification performance on HaoDaiFu corpus

From: Improving rare disease classification using imperfect knowledge graph

| Method | (0, 0.02%] (89 diseases) F1 | (0, 0.02%] (89 diseases) MRR | (0.02%, 0.05%] (277 diseases) F1 | (0.02%, 0.05%] (277 diseases) MRR | (0.05%, 0.1%] (205 diseases) F1 | (0.05%, 0.1%] (205 diseases) MRR | (0.1%, 0.5%] (194 diseases) F1 | (0.1%, 0.5%] (194 diseases) MRR | (0.5%, 1%] (32 diseases) F1 | (0.5%, 1%] (32 diseases) MRR |
|---|---|---|---|---|---|---|---|---|---|---|
| BOW | 34.10 | 45.86 | 40.80 | 49.91 | 49.48 | 58.81 | 53.23 | 62.80 | 62.23 | 75.31 |
| LSTM | 0.00 | 0.41 | 0.01 | 1.07 | 0.38 | 5.91 | 12.29 | 27.23 | 40.07 | 53.04 |
| UpSample | 35.17 | 47.10 | 40.69 | 50.43 | 47.63 | 57.63 | 49.85 | 59.75 | 58.60 | 68.95 |
| \(\chi^{2}\) | 34.04 | 46.75 | 40.81 | 50.66 | 49.15 | 58.53 | 51.74 | 61.38 | 61.55 | 74.05 |
| BOW+\(\chi^{2}\) | 34.56 | 47.25 | 42.41 | 51.84 | 50.03 | 59.33 | 53.15 | 62.34 | 62.10 | 73.97 |
| KG\(_{1}\) | 33.66 | 44.98 | 38.25 | 47.45 | 45.17 | 53.97 | 48.07 | 57.55 | 59.21 | 71.29 |
| KG\(_{12}\) | 33.51 | 44.92 | 39.08 | 48.07 | 45.23 | 54.55 | 48.66 | 58.00 | 59.20 | 71.43 |
| BOW+KG\(^{\text {pseudo-doc}}_{1}\) | 31.91 | 42.81 | 37.51 | 46.08 | 44.08 | 53.22 | 47.01 | 56.94 | 55.91 | 69.47 |
| BOW+KG\(^{\text {pseudo-count}}_{1}\) | 34.87 | 46.14 | 41.74 | 50.14 | 49.31 | 57.94 | 52.56 | 61.59 | 61.65 | 74.19 |
| BOW+KG\(^{\text {late-fusion}}_{1}\) | 33.33 | 45.42 | 38.41 | 48.68 | 47.15 | 56.39 | 51.13 | 60.18 | 61.42 | 73.30 |
| BOW+KG\(^{\text {early-fusion}}_{1}\) | 36.87 | 48.36 | 43.11 | 51.79 | 50.06 | 58.99 | 52.86 | 61.90 | 61.90 | 73.57 |
| BOW+KG\(^{\text {early-fusion}}_{12}\) | 36.94 | 48.22 | 42.63 | 51.40 | 49.66 | 58.62 | 52.60 | 61.51 | 61.47 | 73.23 |

  1. Higher F1 and MRR are better. In each column, the highest number is shown in boldface and the second highest is underlined. The three leftmost percentage bins are the rare disease bins; the two rightmost bins are included for comparison. “ ” denotes results significantly higher than BOW (randomization test, significance level α=0.05)
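
For readers unfamiliar with the quantities in the note above, the following is a minimal Python sketch of how MRR and a randomization test are commonly computed. It is not taken from the paper: the function names, the toy inputs, the pairing of per-example scores, and the one-sided form of the test are all assumptions made for illustration.

```python
import random


def mean_reciprocal_rank(ranked_lists, gold_labels):
    """Average of 1/rank of the gold label; an example contributes 0
    when the gold label is missing from its ranked list."""
    total = 0.0
    for ranked, gold in zip(ranked_lists, gold_labels):
        if gold in ranked:
            total += 1.0 / (ranked.index(gold) + 1)
    return total / len(gold_labels)


def paired_randomization_test(scores_a, scores_b, trials=2000, seed=0):
    """Approximate one-sided randomization test (an assumed variant).

    Estimates the p-value for "system A scores higher than system B"
    by randomly swapping each pair of per-example scores and counting
    how often the shuffled mean difference reaches the observed one.
    """
    rng = random.Random(seed)
    n = len(scores_a)
    observed = (sum(scores_a) - sum(scores_b)) / n
    at_least = 0
    for _ in range(trials):
        diff = 0.0
        for a, b in zip(scores_a, scores_b):
            if rng.random() < 0.5:  # swap the two systems for this example
                a, b = b, a
            diff += a - b
        if diff / n >= observed:
            at_least += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least + 1) / (trials + 1)


# Hypothetical toy data: three queries with ranked disease predictions.
toy_mrr = mean_reciprocal_rank([["a", "b"], ["b", "a"], ["c"]], ["a", "a", "x"])
# Hypothetical per-example scores for two systems.
toy_p = paired_randomization_test([1.0] * 20, [0.0] * 20)
```

Under these assumptions, a system would be reported as significantly better than the baseline when the returned p-value falls below α=0.05.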