
Table 1 Types of features — both term and concept-based — extracted from death certificates

From: Automatic classification of diseases from free-text death certificates for real-time surveillance

 

| Category | Feature type | Description | Example certificate extract | Resulting feature values |
|---|---|---|---|---|
| Term | TokenStem | A token stem, i.e., the stemmed version of a word. | Acute chronic renal failure | Acut, chronic, renal, failur. |
| Term | TokenStem n-gram | The n-gram formed by n adjacent token stems. | chronic renal failure | Chronic renal, renal failur. |
| Concept | SCTConceptId | SNOMED CT concept identifier (as extracted by the Medtex system). | chronic renal failure | 90688005. |

  1. (Stemming is a process of removing and replacing word suffixes to arrive at a common root form of the word.)
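
The term-based rows of the table can be illustrated with a minimal sketch. It assumes a deliberately crude stand-in stemmer (the hypothetical `toy_stem`, which only lowercases a word and drops one trailing "e"); the source does not say which stemming algorithm was used, so this is not the authors' method. The function names `token_stems` and `stem_ngrams` are likewise illustrative only. The sketch happens to reproduce the stems shown for the example extracts, apart from capitalisation.

```python
import re
from typing import List


def toy_stem(word: str) -> str:
    """Crude stand-in for a real stemmer (assumption, not the paper's stemmer):
    lowercase the word and drop a single trailing 'e' from longer words."""
    word = word.lower()
    if len(word) > 3 and word.endswith("e"):
        return word[:-1]
    return word


def token_stems(text: str) -> List[str]:
    """Split text into alphabetic tokens and stem each one (TokenStem features)."""
    return [toy_stem(tok) for tok in re.findall(r"[A-Za-z]+", text)]


def stem_ngrams(stems: List[str], n: int = 2) -> List[str]:
    """Join each run of n adjacent stems into one feature (TokenStem n-gram)."""
    return [" ".join(stems[i:i + n]) for i in range(len(stems) - n + 1)]


if __name__ == "__main__":
    print(token_stems("Acute chronic renal failure"))
    print(stem_ngrams(token_stems("chronic renal failure"), n=2))
```

In practice a full stemming algorithm would replace `toy_stem`; the n-gram step stays the same regardless of which stemmer is used.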