Table 2 Interrater reliability and kappa scores for the internal (n = 980) and multicenter (n = 996) validation set

From: Contextual property detection in Dutch diagnosis descriptions for uncertainty, laterality and temporality

| Property (n)/dataset | Interrater reliability (%)a, internal validation set (n = 980) | Interrater reliability (%)a, multicenter validation set (n = 996) | Kappa score, internal validation set (n = 980) | Kappa score, multicenter validation set (n = 996) |
|---|---|---|---|---|
| Laterality | 98.0 (NESK = 245, NFJP = 233) | 98.1 (NESK = 288, NFJP = 293) | 0.94 | 0.95 |
| Temporality | 97.7 (NESK = 163, NFJP = 157) | 98.1 (NESK = 96, NFJP = 85) | 0.91 | 0.88 |
| Uncertainty | 96.1 (NESK = 135, NFJP = 107) | 97.5 (NESK = 98, NFJP = 87) | 0.82 | 0.85 |
| Removal of uncertainty | 98.3 (NESK = 11, NFJP = 14) | 99.3 (NESK = 8, NFJP = 11) | 0.36 | 0.63 |
  1. Rows are sorted in descending order of kappa score.
  2. a. NESK is the number of records assigned to the corresponding property by annotator ESK, and NFJP is the number of records assigned to it by annotator FJP.
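
The table reports both raw agreement and a kappa score, and for a rare property such as removal of uncertainty the two diverge sharply (98.3% agreement but kappa 0.36). The sketch below illustrates why, assuming the kappa in the table is Cohen's kappa for two annotators; the table itself does not state which variant was used. The function names and the toy annotations are hypothetical and are not taken from the study, so the numbers are not expected to reproduce the table.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Fraction of records on which both annotators gave the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(labels_a)
    p_o = percent_agreement(labels_a, labels_b)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Chance agreement: probability that both annotators pick the same
    # label if each labels independently at their own observed rates.
    categories = set(count_a) | set(count_b)
    p_e = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        # Both annotators used one identical label throughout; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy data (not from the study): 100 records, a rare label.
# Each annotator marks 3 records as positive, but only 1 record overlaps.
annotator_a = [1, 1, 1, 0, 0] + [0] * 95
annotator_b = [1, 0, 0, 1, 1] + [0] * 95

agreement = percent_agreement(annotator_a, annotator_b)
kappa = cohen_kappa(annotator_a, annotator_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.3f}")
```

In this toy case the annotators agree on 96 of 100 records, yet kappa is only about 0.31, because nearly all of that agreement is already expected by chance when the label is rare. The same effect plausibly explains the low kappa for removal of uncertainty, where each annotator flagged only 8 to 14 records.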