Table 3 Challenges and solutions

| Challenges | Solution by D-ETL Approach |
| --- | --- |
| Heterogeneity in source data sets | • ETL specifications • Rule-based D-ETL engine • Native SQL code acceptance • Custom rule mechanism |
| Data extraction interferes with source EHR | • CSV file format |
| Efficiency | • Integrated D-ETL engine • Query optimization |
| Duplicate and overlapping data | • Automated data de-duplication and incremental data loading |
| Data quality | • Input data: Extracted data validation • Output data: Data profiling and visualization |
| Human expertise | • Explicit rule structure • Effective rule testing and debugging mechanism |
| Resumption (ability to continue from a point where an error previously occurred) | • Modular ETL process |
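
The "Resumption" row can be illustrated with a minimal sketch of a modular pipeline that records which steps have completed and skips them on a rerun. This is not the D-ETL engine's implementation: the names `run_pipeline`, `STEPS`, `extract`, `transform`, and `load`, the in-memory `completed` set, and the `fail` flag used to simulate an error are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple

# A step is a (name, function) pair; the function mutates a shared state dict.
Step = Tuple[str, Callable[[Dict], None]]


def run_pipeline(steps: List[Step], state: Dict, completed: Set[str]) -> Set[str]:
    """Run steps in order, skipping any whose name is already in `completed`.

    `completed` is updated in place after each successful step, so a caller
    that keeps (or persists) it can resume after a failure.
    """
    for name, step in steps:
        if name in completed:
            continue  # finished in an earlier run; do not repeat it
        step(state)  # may raise; later steps are then not attempted
        completed.add(name)
    return completed


# Hypothetical steps; `calls` records the order in which they execute.
calls: List[str] = []


def extract(state: Dict) -> None:
    calls.append("extract")
    state["rows"] = [1, 2, 2, 3]


def transform(state: Dict) -> None:
    calls.append("transform")
    if state.get("fail"):
        raise RuntimeError("simulated rule error")
    # Stand-in for a transformation rule: drop duplicate rows.
    state["rows"] = sorted(set(state["rows"]))


def load(state: Dict) -> None:
    calls.append("load")
    state["loaded"] = list(state["rows"])


STEPS: List[Step] = [("extract", extract), ("transform", transform), ("load", load)]
```

If `run_pipeline(STEPS, state, done)` is first called with `state = {"fail": True}`, it stops at `transform` with only `extract` recorded in `done`. Clearing the flag and calling it again with the same `state` and `done` resumes at `transform` without repeating the extraction. A real system would presumably persist the completed-step record (for example in a database table) rather than hold it in memory.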