Estimating the re-identification risk of clinical data sets

Dankar, Fida Kamal; El Emam, Khaled; Neisa, Angelica; Roffey, Tyson

doi:10.1186/1472-6947-12-66

BMC Medical Informatics and Decision Making

Table 1 The data sets that will be included in our simulation

Description	Quasi-identifiers	No. Records
Adult		32,561
The adult dataset from the UC Irvine machine learning data repository. This is an extract from the US census and has common demographics and socio-economic status variables: ftp://ftp.ics.uci.edu/pub/machine-learning-databases/adult	· Age
	· Profession
	· Education
	· Marital status
	· Race
	· Sex
	· Country
FARS		43,330
Department of Transportation Fatal crash information: http://www-fars.nhtsa.dot.gov/main.cfm	· Age
	· Race
	· Month of Death
	· Day of Death
CUP		95,412
Data from the Paralyzed Veterans Association on veterans with spinal cord injuries or disease: http://kdd.ics.uci.edu/databases/kddcup98/kddcup98.html	· ZIP code
	· Age
	· Gender
	· Income
Pharm		16,424
Prescription records from the Children’s Hospital of Eastern Ontario pharmacy from July 2006 to March 2009. The data cover inpatients only and exclude acute cases. A de-identified version of this data was disclosed to commercial data aggregators [67].	· Age
	· Postal code (FSA)
	· Admission date
	· Discharge date
	· Sex
ED		108,344
Emergency department records from Children’s Hospital of Eastern Ontario from 1 June 2007 to 1 June 2009. This data is disclosed for the purpose of disease outbreak surveillance.	· Admission date
	· Postal Code
	· Date of Birth
	· Sex
Niday		637,964
A registry of all newborns in Ontario from 1 April 2004 to 31 March 2009. This data set is used frequently for research purposes: http://www.bornontario.ca	· Maternal postal code
	· Baby DoB
	· Mother DoB
	· Baby sex

Each data set is treated as a population. The table shows the size of each data set and the quasi-identifiers included in the analysis.
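
To illustrate what treating a data set as a population means, the sketch below groups records by their quasi-identifier values and reports the share of records that are unique on those values. It is a minimal illustration and not the authors' estimator: the function names `equivalence_class_sizes` and `proportion_unique` are hypothetical, the choice of proportion of unique records as the summary is an assumption, and the toy records are invented rather than drawn from any data set in the table.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(rec[q] for q in quasi_identifiers) for rec in records)


def proportion_unique(records, quasi_identifiers):
    """Return the fraction of records that are alone in their equivalence class."""
    if not records:
        return 0.0
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    n_unique = sum(1 for size in sizes.values() if size == 1)
    return n_unique / len(records)


# Toy records, invented for illustration only (hypothetical field names).
toy = [
    {"age": 34, "sex": "F", "postal_code": "K1H"},
    {"age": 34, "sex": "F", "postal_code": "K1H"},
    {"age": 51, "sex": "M", "postal_code": "K2P"},
    {"age": 8, "sex": "F", "postal_code": "K1H"},
]
qi = ["age", "sex", "postal_code"]

# Two of the four toy records are unique on (age, sex, postal_code).
print(proportion_unique(toy, qi))
```

Adding more quasi-identifiers, or more precise ones such as a full date of birth, generally shrinks the equivalence classes and raises the proportion of unique records.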

ISSN: 1472-6947