Table 7 Sentence detection

From: Detection of sentence boundaries and abbreviations in clinical narratives

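| Top 10 | 1 | w2 | 2 | w2 | 3 | w2 |
|---|---|---|---|---|---|---|
| 1 | ∈ CCDict | 0.07 | Capitalization | 1.84 | No "\n" | 0.32 |
| 2 | ∈ MDDict | 2.15E-3 | All upper case | 0.54 | Double "\n" | 0.06 |
| 3 | - | - | Contains digit | 0.27 | Single "\n" | 0.03 |
| 4 | - | - | Contains period | 1.59E-5 | - | - |
| 5-10 | - | - | - | - | - | - |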

Top 10 feature rankings per feature set (1: Language features; 2: Rule-based features; 3: Text format features). w2: weight-based feature relevance criterion.
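
As a minimal sketch of how such a per-set ranking can be derived from relevance scores, the snippet below sorts the w2 values listed in the table in descending order within each feature set. It assumes that a larger w2 value means a more relevant feature, which matches the ordering in the table but is not stated explicitly in the caption. The names `W2_SCORES` and `rank_features` are hypothetical and chosen for illustration only; how the w2 values themselves were computed is not covered here.

```python
# w2 scores copied from the table; grouping them in a dict is an illustrative choice.
W2_SCORES = {
    "1 Language features": {"∈ CCDict": 0.07, "∈ MDDict": 2.15e-3},
    "2 Rule-based features": {
        "Capitalization": 1.84,
        "All upper case": 0.54,
        "Contains digit": 0.27,
        "Contains period": 1.59e-5,
    },
    "3 Text format features": {
        'No "\\n"': 0.32,
        'Double "\\n"': 0.06,
        'Single "\\n"': 0.03,
    },
}


def rank_features(scores, top_n=10):
    """Return (rank, feature, score) tuples sorted by descending score.

    Assumption: a higher w2 score means a more relevant feature.
    """
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, name, score)
        for rank, (name, score) in enumerate(ordered[:top_n], start=1)
    ]


if __name__ == "__main__":
    for feature_set, scores in W2_SCORES.items():
        print(feature_set)
        for rank, name, score in rank_features(scores):
            print(f"  {rank}. {name}: {score:g}")
```

Feature sets with fewer than ten scored features simply yield shorter rankings, which corresponds to the "-" entries in the lower rows of the table.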