Table 11 Best window sizes for Dataset-A and Dataset-C

From: A proficient cost reduction framework for de-duplication of records in data integration

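| Windowing variant | Window size for Dataset-A (SDX) | Window size for Dataset-A (SB4) | Window size for Dataset-C (SDX) | Window size for Dataset-C (SB4) |
|---|---|---|---|---|
| Multipass Windowing – MPW (Highest matches) | 3–6 | 3–6 | 21–24 | 6–9 |
| Composite Key Windowing – CKW (Least comparisons) | 21–24 | 6–9 | 30 | 30 |
| Single Key Windowing – SKW | 21–24 | 6–9 | 30 | 21–24 |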