From: Dynamic-ETL: a hybrid approach for health data extraction, transformation and loading
Challenges | Solution by D-ETL Approach |
---|---|
Heterogeneity in source data sets | • ETL specifications • Rule-based D-ETL engine • Native SQL code acceptance • Custom rule mechanism |
Data extraction interferes with source EHR | • CSV file format |
Efficiency | • Integrated D-ETL engine • Query optimization |
Duplicate and overlapping data | • Automated data de-duplication and incremental data loading |
Data quality | • Input data: Extracted data validation • Output data: Data profiling and visualization |
Human expertise | • Explicit rule structure • Effective rule testing and debugging mechanism |
Resumption (ability to continue from a point where an error previously occurred) | • Modular ETL process |
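
The "Duplicate and overlapping data" row can be illustrated with a minimal sketch of de-duplication combined with incremental loading. This is not the D-ETL engine's implementation: the function name `incremental_load`, the in-memory dictionary used as the target, and the field names `person_id`, `visit_date`, and `site` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]
Key = Tuple[str, ...]


def incremental_load(
    target: Dict[Key, Record],
    batch: Iterable[Record],
    key_fields: List[str],
) -> Dict[str, int]:
    """Load a batch into target, skipping exact duplicates.

    Hypothetical helper: records are identified by the values of
    key_fields. A record with a new key is inserted, a record with a
    known key but different content replaces the old one, and an
    identical record is skipped.
    """
    stats = {"inserted": 0, "updated": 0, "skipped": 0}
    for rec in batch:
        key = tuple(rec[f] for f in key_fields)
        existing = target.get(key)
        if existing is None:
            target[key] = dict(rec)
            stats["inserted"] += 1
        elif existing != rec:
            target[key] = dict(rec)
            stats["updated"] += 1
        else:
            stats["skipped"] += 1
    return stats


if __name__ == "__main__":
    # Assumed sample data; the field names are illustrative only.
    target: Dict[Key, Record] = {}
    first_batch = [
        {"person_id": "1", "visit_date": "2015-01-02", "site": "A"},
        {"person_id": "1", "visit_date": "2015-01-02", "site": "A"},
        {"person_id": "2", "visit_date": "2015-01-03", "site": "A"},
    ]
    print(incremental_load(target, first_batch, ["person_id", "visit_date"]))
```

Keying on a tuple of identifying fields means a later extract that overlaps an earlier one only changes the target where the content actually differs, which is the behaviour the table attributes to incremental loading.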