From: Estimating the re-identification risk of clinical data sets
Description | Quasi-identifiers | No. Records |
---|---|---|
Adult |  | 32,561 |
The adult dataset from the UC Irvine machine learning data repository. This is an extract from the US census and has common demographic and socio-economic status variables: ftp://ftp.ics.uci.edu/pub/machine-learning-databases/adult | · Age |  |
 | · Profession |  |
 | · Education |  |
 | · Marital status |  |
 | · Race |  |
 | · Sex |  |
 | · Country |  |
FARS |  | 43,330 |
Department of Transportation fatal crash information: http://www-fars.nhtsa.dot.gov/main.cfm | · Age |  |
 | · Race |  |
 | · Month of Death |  |
 | · Day of Death |  |
CUP |  | 95,412 |
Data from the Paralyzed Veterans Association on veterans with spinal cord injuries or disease: http://kdd.ics.uci.edu/databases/kddcup98/kddcup98.html | · ZIP code |  |
 | · Age |  |
 | · Gender |  |
 | · Income |  |
Pharm |  | 16,424 |
Prescription records from the Children’s Hospital of Eastern Ontario pharmacy from July 2006 to March 2009. This is for inpatients only and excludes acute cases. A de-identified version of this data was disclosed to commercial data aggregators [67]. | · Age |  |
 | · Postal code (FSA) |  |
 | · Admission date |  |
 | · Discharge date |  |
 | · Sex |  |
ED |  | 108,344 |
Emergency department records from Children’s Hospital of Eastern Ontario from 1st June 2007 to 1st June 2009. This data is disclosed for the purpose of disease outbreak surveillance. | · Admission date |  |
 | · Postal Code |  |
 | · Date of Birth |  |
 | · Sex |  |
Niday |  | 637,964 |
A registry of all newborns in Ontario from 1st April 2004 to 31st March 2009. This data set is used frequently for research purposes: http://www.bornontario.ca | · Maternal postal code |  |
 | · Baby DoB |  |
 | · Mother DoB |  |
 | · Baby sex |  |
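
The quasi-identifier column lists, for each data set, the variables that could single out a record when combined. As a rough illustration of that idea, the minimal sketch below counts how many records share each combination of quasi-identifier values. It assumes hypothetical field names modelled on the Niday quasi-identifiers and uses invented toy records, not data from any of the data sets in the table. It is not the estimation method used in the article.

```python
from collections import Counter

# Hypothetical field names modelled on the Niday quasi-identifiers.
QUASI_IDENTIFIERS = ("maternal_postal_code", "baby_dob", "mother_dob", "baby_sex")

# Invented toy records for illustration only.
records = [
    {"maternal_postal_code": "K1A", "baby_dob": "2005-03-01",
     "mother_dob": "1978-06-12", "baby_sex": "F"},
    {"maternal_postal_code": "K1A", "baby_dob": "2005-03-01",
     "mother_dob": "1978-06-12", "baby_sex": "F"},
    {"maternal_postal_code": "K2B", "baby_dob": "2006-11-20",
     "mother_dob": "1981-01-05", "baby_sex": "M"},
]


def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


sizes = group_sizes(records, QUASI_IDENTIFIERS)

# Records whose quasi-identifier combination appears exactly once in the data.
unique_records = sum(1 for count in sizes.values() if count == 1)

for combination, count in sizes.items():
    print(combination, count)
print("records unique on the quasi-identifiers:", unique_records)
```

A combination that appears only once marks a record that is unique on those variables within the data set. Adding more quasi-identifiers, or finer-grained ones such as a full date of birth, generally makes such unique combinations more common.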