
Table 5 Abbreviation detection

From: Detection of sentence boundaries and abbreviations in clinical narratives

| Top 10 | [1] | w2 | [1-2] | w2 | [1-3] | w2 |
|---|---|---|---|---|---|---|
| 1 | Contains period | 0.30 | Contains period | 0.35 | S2 | 5885.83 |
| 2 | All upper case | 0.02 | C(L_norm, •) | 0.18 | S3 | 4855.66 |
| 3 | Contains digit | 0.01 | logλ | 0.13 | S4 | 1999.51 |
| 4 | - | - | C(¬L_norm, ¬•) | 0.12 | S2, S3 | 1798.60 |
| 5 | - | - | C(¬L_norm, •) | 0.09 | logλ | 1180.39 |
| 6 | - | - | C(L_norm, ¬•) | 0.09 | S5 | 894.98 |
| 7 | - | - | All upper case | 0.02 | S4, S5 | 715.70 |
| 8 | - | - | Contains digit | 8.16E-5 | S2, S5 | 617.98 |
| 9 | - | - | - | - | S2, S4, S5 | 474.86 |
| 10 | - | - | - | - | S3, S4, S5 | 256.81 |
| Top 10 | [1-4] | w2 | [1-5] | w2 | [1-6] | w2 |
|---|---|---|---|---|---|---|
| 1 | S2 | 1063.78 | S5 | 1027.15 | LT | 952.62 |
| 2 | S3 | 962.33 | S4, S5 | 914.02 | Mean-LT | 952.62 |
| 3 | S2, S3 | 507.82 | S2, S5 | 610.69 | All upper case | 549.64 |
| 4 | S4 | 391.68 | S2, S4, S5 | 527.28 | S3, S4, S5 | 529.85 |
| 5 | S3, S4, S5 | 379.70 | S2 | 463.94 | S3, S5 | 521.60 |
| 6 | S3, S5 | 325.68 | S3, S4, S5 | 274.81 | erforderl. | 403.54 |
| 7 | S5 | 265.62 | S3, S5 | 253.30 | pathol. | 392.23 |
| 8 | S4, S5 | 222.55 | Mean-LT | 145.91 | verschiebl. | 375.40 |
| 9 | logλ | 143.67 | LT | 145.91 | d-lat. | 358.11 |
| 10 | S2, S5 | 129.90 | S2, S4 | 90.13 | entzündl. | 345.21 |

Top 10 feature rankings per feature set (1: rule-based features; 2: statistical features; 3: scaling features; 4: language-dependent features; 5: length features; 6: word type features). Columns [1] to [1-6] show the ranking obtained with the listed feature sets combined. LT: length; w2: weight-based feature relevance criterion.
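As a rough illustration of how a weight-based ranking like the one tabulated above could be produced, the sketch below orders features by the square of their weight in a linear model and keeps the top k. It assumes that w2 denotes a squared feature weight, which the table itself does not state. The function name `rank_by_squared_weight`, the feature names, and the weight values are hypothetical and are not taken from the table.

```python
from typing import List, Sequence, Tuple


def rank_by_squared_weight(
    names: Sequence[str], weights: Sequence[float], k: int = 10
) -> List[Tuple[str, float]]:
    """Return the k features with the largest squared weight, best first.

    Assumption: feature relevance is measured as w * w, where w is the
    weight a linear classifier assigned to the feature.
    """
    if len(names) != len(weights):
        raise ValueError("names and weights must have the same length")
    scored = [(name, w * w) for name, w in zip(names, weights)]
    # list.sort is stable, so features with equal scores keep input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical weights for three rule-based features (illustrative only).
example = rank_by_squared_weight(
    ["contains_period", "all_upper_case", "contains_digit"],
    [0.5, -0.2, 0.1],
)
for name, score in example:
    print(f"{name}\t{score:.4f}")
```

Squaring makes the sign of a weight irrelevant, so a strongly negative weight ranks as high as an equally strong positive one.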