Table 18 Results from a Cohort Identification Experimentᵃ

From: Complexities, variations, and errors of numbering within clinical notes: the potential impact on information extraction and cohort-identification

| (a) Phrase 1 (containing the Arabic numerical variant) | (b) Number of patients with Phrase 1 only | (c) % of patients missed if searching only for Phrase 1 | (d) Number of patients with both Phrase 1 and Phrase 2 | (e) Number of patients with Phrase 2 only | (f) % of patients missed if searching only for Phrase 2 | (g) Phrase 2 (containing the Roman numerical variant) |
| --- | --- | --- | --- | --- | --- | --- |
| citrullinemia type 1 | 2 | 25.0 | 1 | 1 | 50.0 | citrullinemia type I |
| type 2 diabetes mellitus | 43,777 | 10.5 | 7919 | 6053 | 75.8ᵇ | type II diabetes mellitus |
| type 1 neurofibromatosis | 181 | 24.5 | 56 | 77 | 57.6ᵇ | type I neurofibromatosis |
| Tanner Stage 3 | 7639 | 57.8ᵇ | 1373 | 12,367 | 35.7 | Tanner Stage III |
| grade 3 anaplastic astrocytoma | 42 | 36.7 | 27 | 40 | 38.5 | grade III anaplastic astrocytoma |
| stage 3 chronic kidney disease | 615 | 67.4ᵇ | 446 | 2190 | 18.9 | stage III chronic kidney disease |
| factor 9 deficiency | 14 | 68.1ᵇ | 51 | 139 | 6.9 | factor IX deficiency |
| class 3 malocclusion | 135 | 81.2ᵇ | 115 | 1079 | 10.2 | class III malocclusion |
| phase 1 clinical trial | 320 | 66.5ᵇ | 263 | 1158 | 18.4 | phase I clinical trial |
| Mallampati score: 4 | 121 | 27.8 | 1 | 47 | 71.6ᵇ | Mallampati score: IV |
  1. ᵃ Results from a cohort identification exercise for 10 diagnoses and clinical findings in the clinical notes, including counts of the number of patients identified by searching for phrases containing either the Arabic or Roman numeral variants, or both. The percentage of patients potentially missed by searching for only one of the variants is displayed.
  2. ᵇ Cells with percentages > 50%
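
The table does not state how the "% of patients missed" columns are derived. The minimal sketch below assumes each percentage is the number of patients found only under the other variant, divided by all patients found under either variant. This assumption reproduces the figures in columns (c) and (f), but it is an inference from the counts, not a formula given in the source. The function name `missed_percentages` and its parameter names are hypothetical.

```python
def missed_percentages(only_1, both, only_2):
    """Return (% missed searching only Phrase 1, % missed searching only Phrase 2).

    Assumed formula (inferred from the table, not stated in the source):
    patients missed by one phrase are those who appear only under the
    other phrase, as a share of all patients matched by either phrase.
    """
    total = only_1 + both + only_2           # columns (b) + (d) + (e)
    missed_if_phrase_1 = 100 * only_2 / total  # column (c)
    missed_if_phrase_2 = 100 * only_1 / total  # column (f)
    return round(missed_if_phrase_1, 1), round(missed_if_phrase_2, 1)


# Counts taken from the "citrullinemia type 1" row: (b)=2, (d)=1, (e)=1
print(missed_percentages(2, 1, 1))  # (25.0, 50.0)
```

For example, for "type 2 diabetes mellitus" the call `missed_percentages(43777, 7919, 6053)` gives 10.5 and 75.8, matching the row in the table.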