
Table 6 Sensitivity, Specificity and Positive Predictive Value of Full Model

From: Use of name recognition software, census data and multiple imputation to predict missing data on ethnicity: application to cancer registry records

Ethnic group     Sensitivity   Specificity   Positive predictive value
White            99.7%         56.0%         98.2%
South Asian      94.7%         99.8%         90.4%
Black            20.4%         99.8%         63.6%
Chinese/Other    21.0%         99.9%         57.6%
Mixed            0%            100%
  1. A multinomial logistic regression model was used to predict ethnic group. The model was developed on a randomly selected 50% sample of the 85,352 cases whose ethnicity was recorded in the HES dataset. The remaining 50% of cases were used to validate the model and derive the above estimates. The predictors used in the model were: ethnicity derived from name recognition software; Census estimates of the ethnic distribution of the population; number of hospital admissions; year of diagnosis; patient seen outside the NHS (yes/no); screen-detected cancer (yes/no); death certificate only cancer registration (yes/no); cancer treatment type (surgery/radiotherapy/chemotherapy); deprivation score; gender; age at diagnosis; cancer site; death during the follow-up period (all-cause and due to primary cancer, separately); and time to death/censoring (Nelson-Aalen cumulative hazard).
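
For readers who want to see how per-group figures of this kind are typically computed, the following is a minimal sketch in Python. It assumes the standard one-vs-rest definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), positive predictive value = TP / (TP + FP)); the source does not state the exact computation used. The function name `per_class_metrics` and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def per_class_metrics(true_labels, predicted_labels):
    """Return {group: (sensitivity, specificity, ppv)} using one-vs-rest counts.

    Hypothetical helper for illustration; not the authors' code.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true and predicted labels must have the same length")

    pair_counts = Counter(zip(true_labels, predicted_labels))
    total = len(true_labels)
    groups = sorted(set(true_labels) | set(predicted_labels))

    metrics = {}
    for group in groups:
        # True positives: recorded as this group and predicted as this group.
        tp = pair_counts[(group, group)]
        # False negatives: recorded as this group but predicted as another.
        fn = sum(n for (t, p), n in pair_counts.items() if t == group and p != group)
        # False positives: recorded as another group but predicted as this one.
        fp = sum(n for (t, p), n in pair_counts.items() if t != group and p == group)
        # True negatives: everything else.
        tn = total - tp - fn - fp
        metrics[group] = (
            _ratio(tp, tp + fn),  # sensitivity
            _ratio(tn, tn + fp),  # specificity
            _ratio(tp, tp + fp),  # positive predictive value
        )
    return metrics


# Toy validation labels (invented for illustration only, not study data).
recorded = ["White"] * 6 + ["South Asian"] * 3 + ["Mixed"]
predicted = ["White"] * 6 + ["South Asian", "South Asian", "White"] + ["White"]

metrics = per_class_metrics(recorded, predicted)
for group, (sens, spec, ppv) in metrics.items():
    print(group, sens, spec, ppv)
```

In this toy example no case is predicted as Mixed, so TP + FP is zero and the positive predictive value is undefined (returned as `None`). If the 0% sensitivity and 100% specificity reported for the Mixed group are exact rather than rounded, the same situation would explain why no positive predictive value is given for that row.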