
Table 6 Sensitivity, Specificity and Positive Predictive Value of Full Model

From: Use of name recognition software, census data and multiple imputation to predict missing data on ethnicity: application to cancer registry records

Ethnic group     Sensitivity   Specificity   Positive predictive value
White            99.7%         56.0%         98.2%
South Asian      94.7%         99.8%         90.4%
Black            20.4%         99.8%         63.6%
Chinese/Other    21.0%         99.9%         57.6%
Mixed            0%            100%
  1. A multinomial logistic regression model was used to predict ethnic group. The model was developed on a randomly selected 50% sample of the 85,352 cases whose ethnicity was recorded in the HES dataset. The remaining 50% of cases were used to validate the model and derive the estimates above. The predictors used in the model were: ethnicity derived from name recognition software; Census estimates of the ethnic distribution of the population; number of hospital admissions; year of diagnosis; patient seen outside the NHS (yes/no); screen-detected cancer (yes/no); death-certificate-only cancer registration (yes/no); cancer treatment type (surgery/radiotherapy/chemotherapy); deprivation score; gender; age at diagnosis; cancer site; death during the follow-up period (all-cause and due to primary cancer, separately); and time to death/censoring (Nelson-Aalen cumulative hazard).
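As an illustration of how per-group figures of this kind are typically obtained, the sketch below computes one-vs-rest sensitivity, specificity and positive predictive value from a multiclass confusion matrix built on a validation sample. It is a minimal sketch, not the authors' code: the function name `one_vs_rest_metrics`, the group labels and the counts are invented for illustration and are not the study's data. The toy third group is never predicted, so its positive predictive value is undefined (0/0), which may be why a row with 0% sensitivity and 100% specificity shows no positive predictive value.

```python
from typing import Dict, List, Optional


def one_vs_rest_metrics(
    confusion: List[List[int]], labels: List[str]
) -> Dict[str, Dict[str, Optional[float]]]:
    """Per-class sensitivity, specificity and PPV.

    confusion[i][j] = number of cases whose true class is i
    and whose predicted class is j.
    """
    total = sum(sum(row) for row in confusion)
    results: Dict[str, Dict[str, Optional[float]]] = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]                         # true k, predicted k
        fn = sum(confusion[k]) - tp                  # true k, predicted other
        fp = sum(row[k] for row in confusion) - tp   # true other, predicted k
        tn = total - tp - fn - fp                    # true other, predicted other
        results[label] = {
            "sensitivity": tp / (tp + fn) if tp + fn else None,
            "specificity": tn / (tn + fp) if tn + fp else None,
            # PPV is undefined when the class is never predicted.
            "ppv": tp / (tp + fp) if tp + fp else None,
        }
    return results


# Toy counts, invented for illustration (NOT the study's data).
labels = ["A", "B", "C"]
confusion = [
    [90, 10, 0],  # true A
    [5, 15, 0],   # true B
    [4, 1, 0],    # true C: never predicted as C
]

metrics = one_vs_rest_metrics(confusion, labels)
for label in labels:
    print(label, metrics[label])
```

In this toy example the dominant group A has high sensitivity but low specificity, because most misclassified cases from the smaller groups are assigned to it. The table shows the same pattern for the White group.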