Table 1 Lexical Variants Included in this Paper

From: Complexities, variations, and errors of numbering within clinical notes: the potential impact on information extraction and cohort-identification

| Lexical Variant Category | Examples |
| --- | --- |
| Positive integers | ‘three’, ‘thirty-three’, ‘seventy-three’ |
| Negative integers | ‘minus three’, ‘minus 3’ |
| Fractions | ‘one third’, ‘one thirds’, ‘six eights’ |
| Dimensions | ‘one by three’, ‘two by four’ |
| Ranges/odds | ‘one to three’, ‘two to four’ |
| Dates, including invalid | ‘January 35’, ‘June 31’, ‘September 38’ |
| Roman numerals | ‘X’, ‘XV’, ‘XXIV’, ‘XXVIII’, ‘XXXV’ |
| Medical classifications | ‘1A’, ‘IID’, ‘type 2’, ‘type II’, ‘class III’ |
| Ages, including implausible values | ‘135 year old’, ‘septuagenarian’ |
| Expressions of quantity | ‘billions’, ‘octillion’, ‘gobs of’ |
| Ordering/ranking | ‘1st’, ‘1rd’, ‘firstly’, ‘1stly’, ‘primary’ |
| Tuples | ‘single’, ‘double’, ‘triple’, ‘quadruple’ |
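
The table includes deliberately malformed variants, such as invalid dates (‘June 31’) and ill-formed ordinals (‘1rd’). As a rough illustration of how such variants might be flagged during information extraction, the Python sketch below checks two of the categories. It is not the method used in the paper; the function names `is_valid_month_day`, `expected_ordinal_suffix`, and `is_well_formed_ordinal` are hypothetical, and the code assumes English month names (the default locale for `calendar.month_name`) and treats February 29 as valid.

```python
import calendar
import re

# Map lowercase English month names to month numbers (1-12).
# Assumption: the default (English) locale is in effect.
MONTHS = {name.lower(): i for i, name in enumerate(calendar.month_name) if name}


def is_valid_month_day(text: str) -> bool:
    """Return True if a 'Month DD' string names a real calendar day."""
    match = re.fullmatch(r"([A-Za-z]+)\s+(\d{1,2})", text.strip())
    if not match:
        return False
    month = MONTHS.get(match.group(1).lower())
    if month is None:
        return False
    day = int(match.group(2))
    # Assumption: use a leap year so that 'February 29' counts as valid.
    return 1 <= day <= calendar.monthrange(2024, month)[1]


def expected_ordinal_suffix(n: int) -> str:
    """Return the standard English ordinal suffix for n."""
    if n % 100 in (11, 12, 13):
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")


def is_well_formed_ordinal(token: str) -> bool:
    """Return True if a token such as '1st' pairs digits with the right suffix."""
    match = re.fullmatch(r"(\d+)(st|nd|rd|th)", token.lower())
    if not match:
        return False
    return match.group(2) == expected_ordinal_suffix(int(match.group(1)))


if __name__ == "__main__":
    for date_text in ["January 35", "June 31", "September 38", "June 30"]:
        print(date_text, is_valid_month_day(date_text))
    for token in ["1st", "1rd", "1stly"]:
        print(token, is_well_formed_ordinal(token))
```

A check like this only flags a variant as malformed; whether to discard, correct, or keep the flagged text is a separate decision that depends on the extraction task.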