BMC Medical Informatics and Decision Making

Table 4 Summary of the feature settings. (w denotes the window size; if no w value is given, only the feature of the current token is used. n denotes the order of the n-gram. 'len' denotes the length of the affixes. The matching features are the results of controlled vocabulary matching.)

From: Precursor-induced conditional random fields: connecting separate entities by induction for improved clinical named entity recognition

Set	Token	Norm-token	n-gram	character affix	capitalization	POS/Chunk	Matching
#1-context	w = 3	w = 3
#2-morph	w = 3	w = 3		len = 2~3, w = 3
#3-i2b2	w = 5	w = 5	n = 2, w = 5	len = 2~7, w = 3	w = 1
#3-snuh	w = 5	w = 3	n = 2, w = 5	len = 2~3			modifier/control
#3-conll	w = 5			len = 3~4, w = 5	w = 5	n = 1
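
As a rough illustration of how a row of this table could translate into per-token features, the sketch below builds token and character-affix features for the #2-morph setting (token w = 3, affix len = 2~3 with w = 3). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names and feature keys are hypothetical, the window size w is assumed to count the tokens in a window centered on the current token (so w = 3 covers offsets -1, 0 and +1), and the Norm-token column is omitted because its normalization rule is not given here.

```python
from typing import Dict, List


def window_offsets(w: int) -> range:
    """Offsets of a centered window of w tokens (assumption: w = 3 -> -1, 0, +1)."""
    half = w // 2
    return range(-half, half + 1)


def token_features(tokens: List[str], i: int, w: int = 1) -> Dict[str, str]:
    """Token identity features for position i; w = 1 uses the current token only."""
    feats: Dict[str, str] = {}
    for off in window_offsets(w):
        j = i + off
        if 0 <= j < len(tokens):  # skip offsets that fall outside the sentence
            feats[f"tok[{off}]"] = tokens[j]
    return feats


def affix_features(
    tokens: List[str], i: int, min_len: int, max_len: int, w: int = 1
) -> Dict[str, str]:
    """Prefix and suffix features of length min_len..max_len over a window of w tokens."""
    feats: Dict[str, str] = {}
    for off in window_offsets(w):
        j = i + off
        if not 0 <= j < len(tokens):
            continue
        tok = tokens[j]
        for n in range(min_len, max_len + 1):
            if len(tok) >= n:  # only emit affixes the token is long enough for
                feats[f"pre{n}[{off}]"] = tok[:n]
                feats[f"suf{n}[{off}]"] = tok[-n:]
    return feats


def morph_features(tokens: List[str], i: int) -> Dict[str, str]:
    """Hypothetical reading of the #2-morph row: token w = 3, affix len = 2~3, w = 3."""
    feats = token_features(tokens, i, w=3)
    feats.update(affix_features(tokens, i, min_len=2, max_len=3, w=3))
    return feats


if __name__ == "__main__":
    sentence = ["chest", "pain", "resolved"]
    for key, value in sorted(morph_features(sentence, 1).items()):
        print(key, value)
```

Feature dictionaries of this shape are a common input format for CRF toolkits; the other rows of the table would add n-gram, capitalization, POS/chunk and vocabulary-matching entries in the same way.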

ISSN: 1472-6947