Table 2 Rules for lexical feature generation

From: Establishing a baseline for literature mining human genetic variants and their relationships to disease cohorts

| Rule name | Description |
| --- | --- |
| Negation | Determine negation by checking for a negative word, e.g. a negative adjective modifying the verb or relation trigger, or a semantically negative word |
| Tense (active/passive) | Tense determination |
| Words in between entities | The sequence of words between the two entities |
| Surface distance | Distance between the two recognized entities (counting the tokens between them and the entities themselves) |
| Window left entity | A window of k words to the left of Entity 1, with their part-of-speech tags |
| Window right entity | A window of k words to the right of Entity 2, with their part-of-speech tags |
| Detect nominalization | Whether a nominalized verb occurs to the left or right of an entity, and its distance from that entity |
| Weak nominalization | Detect when an entity occurs after a preposition, and whether a nominalized biomedical verb precedes that preposition |
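
As a rough illustration of how some of these rules might be computed, the sketch below derives the negation, words-in-between, surface-distance, and window features for one entity pair. It is not the authors' implementation: the function name `lexical_features`, the `NEGATION_CUES` list, the token and span representation, and the example sentence are all assumptions made for this example. Negation is reduced here to a sentence-wide cue lookup, which is simpler than the modifier-based check described in the table, and the input is assumed to be already tokenized and part-of-speech tagged.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag), tagged upstream
Span = Tuple[int, int]   # [start, end) token indices of an entity

# Hypothetical cue list for illustration; the paper's lexicon is not given here.
NEGATION_CUES = {"no", "not", "never", "without", "neither", "nor", "lack", "absent"}


def lexical_features(tokens: List[Token], e1: Span, e2: Span, k: int = 3) -> Dict[str, object]:
    """Compute a subset of the lexical features for one entity pair.

    Assumes e1 precedes e2 in the sentence and the two spans do not overlap.
    """
    words = [w for w, _ in tokens]
    return {
        # Simplification: any negation cue anywhere in the sentence.
        "negation": any(w.lower() in NEGATION_CUES for w in words),
        # Words strictly between the end of Entity 1 and the start of Entity 2.
        "words_between": words[e1[1]:e2[0]],
        # Token count from the start of Entity 1 to the end of Entity 2,
        # i.e. both entities plus everything between them.
        "surface_distance": e2[1] - e1[0],
        # Up to k (word, tag) pairs to the left of Entity 1.
        "window_left": tokens[max(0, e1[0] - k):e1[0]],
        # Up to k (word, tag) pairs to the right of Entity 2.
        "window_right": tokens[e2[1]:e2[1] + k],
    }


# Made-up example sentence with Penn Treebank style tags.
sentence = [
    ("The", "DT"), ("mutation", "NN"), ("V600E", "NN"), ("was", "VBD"),
    ("not", "RB"), ("detected", "VBN"), ("in", "IN"), ("melanoma", "NN"),
    ("patients", "NNS"), (".", "."),
]
feats = lexical_features(sentence, e1=(2, 3), e2=(7, 9), k=2)
print(feats["words_between"])     # ['was', 'not', 'detected', 'in']
print(feats["surface_distance"])  # 7
```

The tense and nominalization rules are left out of the sketch because they depend on verb morphology and a list of nominalized biomedical verbs that the table does not specify.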