Table 1 Variation Assessment Table for Data Abstraction

From: Assessment of the impact of EHR heterogeneity for clinical research through a case study of silent brain infarction

| Variation Type | Definition | Potential Implication | Example of Assessment Method |
| --- | --- | --- | --- |
| Institutional variation | Variation in practice patterns, outcomes, and patient sociodemographic characteristics | Inconsistent phenotype definition; unbalanced concept distribution | (1) Compare clinical guideline, protocol, and definition. (2) Calculate the number of eligible patients divided by screening population. (3) Calculate the ratio of the proportion of the persons with the disease over the proportion with the exposure. |
| EHR system variation | Variation in data type and format caused by different EHR system infrastructure | Inconsistent data type; different data collection processes | (1) Compare data type, document structure, and metadata. (2) Conduct a semi-structured interview to obtain information about the context of use. |
| Documentation variation | Variation in reporting schemes during the processes of generating clinical narratives | Noisy data | (1) Compare the cosine similarity between two documents represented by vectors. (2) Conduct a sub-language analysis to assess syntactic variation. |
| Process variation | Variation in data collection and corpus annotation process | Poor data reliability, validity, and reproducibility | (1) Calculate the degree of agreement among abstractors. (2) Conduct a semi-structured interview to obtain information about the context of use. |
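
Two of the quantitative methods in the table, cosine similarity between documents and the degree of agreement among abstractors, can be illustrated with a minimal Python sketch. The table does not specify how documents are vectorized or which agreement statistic is used, so the sketch makes two assumptions: documents are represented as whitespace-tokenized term-frequency vectors, and agreement is measured with Cohen's kappa for two abstractors. The function names and the sample inputs are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine similarity of two documents as term-frequency vectors.

    Assumption: lowercase whitespace tokenization; the source table does
    not state which vector representation is intended.
    """
    vec_a = Counter(doc_a.lower().split())
    vec_b = Counter(doc_b.lower().split())
    # Only terms present in both documents contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction to compare
    return dot / (norm_a * norm_b)


def cohens_kappa(labels_a: list, labels_b: list) -> float:
    """Cohen's kappa for two abstractors labelling the same items.

    Assumption: kappa is one common choice of agreement statistic; the
    source table only says "degree of agreement".
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each abstractor's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        freq_a[k] * freq_b[k] for k in freq_a.keys() & freq_b.keys()
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both abstractors used a single identical label
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical report snippets and abstractor labels, for illustration.
    print(cosine_similarity("no acute infarct seen", "no acute infarct identified"))
    print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

A similarity near 1 suggests that two sites document the same finding in similar terms, while a low kappa flags abstraction steps where the annotation guideline may need refinement. Both thresholds would need to be set for the study at hand.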