Table 4 Summary of the feature settings. w denotes the window size; if the value is absent, only the feature of the current token is used. n denotes the n of the n-gram. 'len' denotes the length of the affixes. The matching features denote the results of controlled vocabulary matching.

From: Precursor-induced conditional random fields: connecting separate entities by induction for improved clinical named entity recognition

| Set | Token | Norm-token | n-gram | Character affix | Capitalization | POS/Chunk | Matching |
|---|---|---|---|---|---|---|---|
| #1-context | w = 3 | w = 3 | | | | | |
| #2-morph | w = 3 | w = 3 | | len = 2~3 | w = 3 | | |
| #3-i2b2 | w = 5 | w = 5 | n = 2, w = 5 | len = 2~7 | w = 3 | w = 1 | |
| #3-snuh | w = 5 | w = 3 | n = 2, w = 5 | len = 2~3 | | | modifier/control |
| #3-conll | w = 5 | | | len = 3~4 | w = 5 | w = 5 | n = 1 |
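
To illustrate how the w and len settings might translate into per-token features, the following is a minimal sketch that roughly mirrors the #2-morph row (token and normalized token with w = 3, affixes with len = 2~3, capitalization with w = 3). It is not the authors' implementation. The function names, the feature-key format, the normalization rule (lowercasing, digits mapped to 0), and the reading of w as a window centred on the current token are all assumptions made for illustration.

```python
def window_features(values, i, w, name):
    """Collect values in a window of size w centred on position i.

    Assumption: w = 3 means previous, current, and next token.
    Positions outside the sentence are simply skipped.
    """
    half = w // 2
    feats = {}
    for offset in range(-half, half + 1):
        j = i + offset
        if 0 <= j < len(values):
            feats[f"{name}[{offset}]"] = values[j]
    return feats


def affix_features(token, min_len, max_len):
    """Prefixes and suffixes of the current token for each length in range."""
    feats = {}
    for n in range(min_len, max_len + 1):
        if len(token) >= n:
            feats[f"prefix{n}"] = token[:n]
            feats[f"suffix{n}"] = token[-n:]
    return feats


def normalize(token):
    """Hypothetical normalization: lowercase letters, map digits to 0."""
    return "".join("0" if c.isdigit() else c.lower() for c in token)


def morph_features(tokens, i):
    """Feature dict for token i, loosely following the #2-morph setting."""
    feats = {}
    feats.update(window_features(tokens, i, 3, "tok"))
    feats.update(window_features([normalize(t) for t in tokens], i, 3, "norm"))
    # No w is given for affixes, so only the current token is used.
    feats.update(affix_features(tokens[i], 2, 3))
    caps = [str(t[:1].isupper()) for t in tokens]
    feats.update(window_features(caps, i, 3, "cap"))
    return feats
```

Calling `morph_features(["Patient", "denies", "chest", "pain"], 1)` would return a dictionary with keys such as `tok[-1]`, `norm[0]`, `prefix2`, `suffix3`, and `cap[1]`. A dictionary of this kind is the usual input shape for CRF toolkits that accept named string features.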