
Table 10 Sentence detection

From: Detection of sentence boundaries and abbreviations in clinical narratives

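| Top 10 | [1] | w2 | [1-2] | w2 | [1-3] | w2 |
|---|---|---|---|---|---|---|
| 1 | ∈ CCDict | 0.07 | Capitalization | 2.67 | Capitalization | 1.54 |
| 2 | ∈ MDDict | 2.15E-3 | All upper case | 0.47 | No "\n" | 1.09 |
| 3 | - | - | ∈ CCDict | 0.43 | ∈ CCDict | 0.58 |
| 4 | - | - | Contains digit | 0.21 | Double "\n" | 0.48 |
| 5 | - | - | Contains period | 0.02 | All upper case | 0.17 |
| 6 | - | - | ∈ MDDict | 8.32E-4 | Single "\n" | 0.11 |
| 7 | - | - | - | - | Contains digit | 0.07 |
| 8 | - | - | - | - | ∈ MDDict | 0.03 |
| 9 | - | - | - | - | Contains period | 0.01 |
| 10 | - | - | - | - | - | - |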

Top 10 feature rankings per feature set (1: language features; 2: rule-based features; 3: text format features). w2: weight-based feature relevance criterion.
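
The note above does not spell out how w2 is computed. One common weight-based relevance criterion for linear classifiers scores each feature by its squared weight and sorts in descending order. The sketch below assumes that reading; the function name, feature names, and weight values are hypothetical and are not taken from the table.

```python
def rank_by_squared_weight(weights, top_k=10):
    """Return up to top_k (feature, squared weight) pairs, largest first.

    Assumption: relevance is the squared weight of a linear model,
    which is one common reading of a weight-based criterion.
    """
    scored = [(name, w * w) for name, w in weights.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical weights for illustration only (not values from the table).
example_weights = {
    "capitalization": -1.5,
    "in_dictionary": 0.5,
    "contains_digit": 0.25,
}

ranking = rank_by_squared_weight(example_weights)
for rank, (name, score) in enumerate(ranking, start=1):
    print(rank, name, score)
```

Squaring removes the sign of the weight, so a strongly negative weight ranks as high as an equally strong positive one. A feature set with fewer than ten features simply yields a shorter ranking, which would match the dashes in the narrower columns.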