From: A proficient cost reduction framework for de-duplication of records in data integration
Dataset name   No. of fields   No. of records   No. of original records   No. of duplicate records
Dataset-A      12              1000             500                       500
Dataset-C      12              1000             600                       400
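
The counts in the table are related by simple arithmetic: the total number of records is the number of original records plus the number of duplicates. A minimal sketch of that check follows, assuming (this is our reading of the table, not something the source states outright) that both datasets contain 1000 records. The names `datasets` and `duplicate_stats` are hypothetical and used only for illustration.

```python
# Record counts as read from the table. Treating 1000 as the total for
# both datasets is an assumption made for this sketch.
datasets = {
    "Dataset-A": {"records": 1000, "original": 500},
    "Dataset-C": {"records": 1000, "original": 600},
}


def duplicate_stats(records, original):
    """Return (duplicate count, duplicate share of all records)."""
    duplicates = records - original
    return duplicates, duplicates / records


for name, counts in datasets.items():
    dup, share = duplicate_stats(counts["records"], counts["original"])
    print(f"{name}: {dup} duplicates ({share:.0%} of records)")
```

Under these assumptions, Dataset-C would have a lower share of duplicates than Dataset-A, which is one way the two datasets differ as de-duplication test cases.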