Table 4 Statistics about predictive properties obtained for the high-dimensional datasets

From: Efficient and effective pruning strategies for health data de-identification

| Attributes | Transformations | Checked | Inserts | Hits | Antichain |
|---:|---:|---:|---:|---:|---:|
| 3 | 96 | 12 (12.50 %) | 4 | 17.39 % | 75.00 % |
| 4 | 480 | 50 (10.42 %) | 18 | 20.87 % | 55.56 % |
| 5 | 1,440 | 89 (6.18 %) | 34 | 22.84 % | 44.12 % |
| 6 | 4,320 | 177 (4.10 %) | 61 | 25.13 % | 36.07 % |
| 7 | 12,960 | 449 (3.46 %) | 157 | 28.06 % | 31.85 % |
| 8 | 38,880 | 820 (2.11 %) | 284 | 29.99 % | 24.65 % |
| 9 | 116,640 | 3,872 (3.32 %) | 1,187 | 34.26 % | 22.16 % |
| 10 | 466,560 | 15,858 (3.40 %) | 4,486 | 36.80 % | 22.78 % |
| 11 | 1,399,680 | 32,507 (2.32 %) | 10,119 | 37.70 % | 20.43 % |
| 12 | 4,199,040 | 76,679 (1.83 %) | 25,211 | 38.36 % | 18.84 % |
| 13 | 12,597,120 | 265,762 (2.11 %) | 85,303 | 38.74 % | 19.59 % |
| 14 | 37,791,360 | 626,383 (1.66 %) | 199,747 | 39.15 % | 20.75 % |
| 15 | 113,374,080 | 1,634,751 (1.44 %) | 514,863 | 39.31 % | 20.17 % |

We report the size of the solution space, the number and percentage of transformations checked, as well as the number of inserts, the number of hits, and the maximal size of the antichain for the predictive property insufficient quality. The size of the antichain is expressed relative to the number of inserts.
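As a quick sanity check on how the Checked column reads, the following minimal Python sketch recomputes the percentage of transformations checked from the raw counts. The helper name `checked_share` and the selection of rows are illustrative assumptions, not part of the original study; the counts themselves are copied from the table.

```python
# Rows copied from Table 4: (attributes, transformations, checked).
# Only a few rows are included to keep the sketch short.
ROWS = [
    (3, 96, 12),
    (4, 480, 50),
    (5, 1_440, 89),
    (15, 113_374_080, 1_634_751),
]


def checked_share(transformations: int, checked: int) -> float:
    """Return the checked transformations as a percentage of the solution space.

    Hypothetical helper for illustration; it assumes the percentage in the
    Checked column is simply checked / transformations.
    """
    return 100.0 * checked / transformations


for attributes, transformations, checked in ROWS:
    share = checked_share(transformations, checked)
    print(f"{attributes:>2} attributes: {share:.2f} % of {transformations:,} checked")
```

Under that assumption, the recomputed values match the percentages printed in the table for these rows, which suggests the share of the solution space that has to be checked shrinks as the number of attributes grows.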