Table 5 Comparison of unbalanced data modeling methods.

From: A method for managing re-identification risk from small geographic areas in Canada

Model Evaluation for the 5% Uniqueness Threshold
Method          AUC      Sensitivity   Specificity
Down-Sampling   0.9849   0.87          0.996
KZ              0.9849   0.449         0.992

Model Evaluation for the 20% Uniqueness Threshold
Method          AUC**    Sensitivity   Specificity
Down-Sampling   0.947    0.74          0.98
KZ              0.949    0.59          0.949
** We tested the difference between the AUC values of the two methods; the difference was statistically significant only at the 20% uniqueness threshold (alpha level of 0.05).
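
For readers unfamiliar with the sensitivity and specificity columns, the following is a minimal Python sketch of how these two metrics are conventionally computed from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the paper's data and do not reproduce any value in the table.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp / fn: positive cases classified correctly / incorrectly
    tn / fp: negative cases classified correctly / incorrectly
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Hypothetical counts for illustration only (not from the paper).
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=90, fp=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With unbalanced data the positive class is small, so sensitivity can differ sharply between methods even when specificity and AUC are nearly identical, which is the pattern the table shows.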