Table 4 Rules to use for constraining the search for the de-identification solution

From: De-identifying a public use microdata file from the Canadian national discharge abstract database

| Relationship between maximum acceptable generalization (M) and adversary's background knowledge (Q) | Generalization level to use for suppression (S) | Generalization level to release data at |
| --- | --- | --- |
| M ≥ Q (analysts could accept a generalization equal to or less detailed than the adversary's knowledge) | S = M (suppress at level M) | Only include data generalized to level S = M |
| M < Q (analysts need a version that is more detailed than the adversary's knowledge) | S = Q | The PUMF can have data at level M in the generalization hierarchy, except when it generalizes to a suppressed value at level S = Q. |

  1. The generalizations in the right-hand column can be applied to each quasi-identifier separately. M denotes the highest generalization level in the hierarchy that is acceptable for analysis, Q denotes the level representing the adversary's background knowledge, and S denotes the level at which suppression should be performed. In our example, M = 1 and Q = 2, so the M < Q condition is met: we should suppress at level Q and include partially suppressed data at level M.
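
The two rules in the table can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation. It assumes that hierarchy levels are integers and that a larger integer means a coarser generalization, which is consistent with the example where M = 1 and Q = 2 satisfy M < Q. The names `constraint_rule`, `release_value` and `generalize_age`, the age bands, and the set of suppressed values are all hypothetical.

```python
from typing import Callable, Optional, Set, Tuple


def constraint_rule(m: int, q: int) -> Tuple[int, int]:
    """Return (S, release level) for one quasi-identifier.

    Assumption: larger level numbers are coarser generalizations.
    """
    if m >= q:
        # Analysts accept data at least as coarse as the adversary's
        # knowledge: suppress and release at level M.
        return m, m
    # Analysts need more detail than the adversary has: suppress at
    # level Q, but release the surviving values at level M.
    return q, m


def generalize_age(age: int, level: int) -> str:
    """Hypothetical hierarchy: 0 = exact age, 1 = 5-year band, 2 = 10-year band."""
    width = {0: 1, 1: 5, 2: 10}[level]
    if width == 1:
        return str(age)
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def release_value(
    value: int,
    generalize: Callable[[int, int], str],
    m: int,
    q: int,
    suppressed_at_s: Set[str],
) -> Optional[str]:
    """Return the value to publish, or None if it must be suppressed."""
    s, release_level = constraint_rule(m, q)
    # A value is withheld when its level-S generalization was suppressed.
    if generalize(value, s) in suppressed_at_s:
        return None
    return generalize(value, release_level)


# Example from the footnote: M = 1, Q = 2, so S = Q = 2 and release at M = 1.
print(constraint_rule(1, 2))
print(release_value(37, generalize_age, 1, 2, {"90-99"}))
print(release_value(93, generalize_age, 1, 2, {"90-99"}))
```

In this sketch, an age of 37 is released as a 5-year band (level M), while an age of 93 is withheld because its 10-year band (level S = Q) is in the hypothetical suppressed set. Since the rules apply to each quasi-identifier separately, the same function would be called once per quasi-identifier with its own M and Q.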