From: An automated data cleaning method for Electronic Health Records by incorporating clinical knowledge
Test name | Completeness, missing values (%): original | Completeness, missing values (%): after preprocessing | Completeness, missing values (%): after unit change | Completeness, missing values (%): after all steps | Correctness, normal values (%): original | Correctness, normal values (%): after all steps | Number of observations |
---|---|---|---|---|---|---|---|
Hemoglobin | 0 | 0.02 | 0.02 | 0.03 | 92.68 | 92.71 | 1,061,333 |
Lymphocyte | 0 | 0.03 | 0.03 | 0.04 | 14.23 | 54.10 | 1,060,664 |
Eosinophils | 0 | 0.03 | 0.03 | 3.87 | 15.71 | 96.43 | 1,055,109 |
Monocyte | 0 | 0.03 | 0.03 | 0.26 | 14.12 | 69.93 | 1,053,768 |
Basophil | 0 | 0.06 | 0.06 | 0.06 | 17.91 | 73.09 | 1,027,615 |
Hematocrit | 0 | 0.02 | 0.02 | 0.02 | 94.02 | 97.17 | 1,025,484 |
Erythrocyte | 0 | 0.08 | 0.08 | 0.09 | 70.61 | 74.82 | 1,025,399 |
Leukocyte | 0 | 0.10 | 0.10 | 9.70 | 34.04 | 98.21 | 1,013,012 |
MCV | 0 | 0.03 | 0.03 | 0.24 | 92.53 | 92.80 | 1,003,326 |
MCH | 0 | 0.03 | 0.03 | 0.24 | 84.16 | 84.42 | 997,867 |
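
The two metrics in the table header can be illustrated with a minimal sketch. It assumes that a missing value is represented as `None`, that "normal" means falling inside a closed reference range, and that the correctness percentage is taken over non-missing values; the source does not state the denominator, so that last point is an assumption. The function names, the toy values, and the range `12.0` to `17.5` are hypothetical placeholders, not clinical reference values and not the authors' implementation.

```python
from typing import Optional, Sequence


def completeness_missing_pct(values: Sequence[Optional[float]]) -> float:
    """Percentage of observations whose value is missing (None)."""
    if not values:
        raise ValueError("no observations")
    missing = sum(1 for v in values if v is None)
    return 100.0 * missing / len(values)


def correctness_normal_pct(
    values: Sequence[Optional[float]], low: float, high: float
) -> float:
    """Percentage of non-missing observations inside [low, high].

    Assumption: the denominator excludes missing values.
    """
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no non-missing observations")
    normal = sum(1 for v in present if low <= v <= high)
    return 100.0 * normal / len(present)


# Toy data for illustration only (not taken from the table).
toy = [13.5, 14.2, None, 9.8, 15.1]
missing_pct = completeness_missing_pct(toy)
normal_pct = correctness_normal_pct(toy, low=12.0, high=17.5)
print(f"missing: {missing_pct:.2f}%  normal: {normal_pct:.2f}%")
```

Under these assumptions, a cleaning step that discards implausible values raises the missing percentage while also raising the normal percentage, which is the pattern visible in the Leukocyte and Eosinophils rows.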