The primary finding of this study was that structured diagnoses within the EHR for HTN, HLD, and DM can have a marked delay in being recorded compared with the time a diagnosis can be computed from other EHR data. For three common diseases, the average time at which a diagnosis could be computed from laboratory values preceded the manual recording of a structured diagnosis by as much as 389 days. In addition, even one year after a diagnosis can be computed, a large percentage of patients do not have an equivalent structured diagnosis recorded in the EHR. Therefore, while the EHR has several potential advantages over other sources of RWD and can be accessed in near real-time from a technical perspective, the recording of clinical information within the structured history and problem list may be less sensitive than, and delayed relative to, computed diagnoses for certain conditions. Thus, for studies based on RWD, the approach to extracting information from the EHR may affect its quality.
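The two quantities discussed above, the lag between a computable diagnosis and its structured recording and the share of patients still lacking a structured diagnosis after one year, can be illustrated with a minimal sketch. The patient records, field layout, and function names below are hypothetical and are not the study's actual pipeline or data; the sketch also assumes every patient has at least one year of follow-up after the computed date.

```python
from datetime import date
from statistics import mean

# Hypothetical records: patient id -> (date a diagnosis became computable
# from measurements, date a structured diagnosis was recorded, or None).
records = {
    "p1": (date(2015, 1, 10), date(2016, 3, 1)),
    "p2": (date(2016, 5, 2), None),
    "p3": (date(2017, 2, 20), date(2017, 4, 21)),
    "p4": (date(2014, 9, 1), date(2014, 8, 15)),
}


def recording_lag_days(computed, structured):
    """Days from computable diagnosis to structured diagnosis.

    Returns None when no structured diagnosis was ever recorded. A negative
    value means the structured diagnosis came first.
    """
    if structured is None:
        return None
    return (structured - computed).days


def summarize(records, window_days=365):
    """Mean lag among recorded diagnoses and fraction missing at the window."""
    lags = [recording_lag_days(c, s) for c, s in records.values()]
    recorded = [lag for lag in lags if lag is not None]
    # Missing at the window: never recorded, or recorded after window_days.
    missing = sum(1 for lag in lags if lag is None or lag > window_days)
    return {
        "mean_lag_days": mean(recorded) if recorded else None,
        "fraction_missing_at_window": missing / len(lags),
    }


summary = summarize(records)
print(summary)
```

In this toy example the mean lag is computed only over patients who eventually receive a structured diagnosis, so it understates the delay whenever many patients are never coded; reporting the fraction missing at a fixed window alongside the mean is one way to make that visible.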
The EHR contains a detailed record of a patient's clinical history, but extracting this history from the structured and unstructured fields in which data can reside remains a challenge. Our work extends the prior literature, which has focused on the completeness and accuracy of the problem list compared to manual adjudication or next generation phenotyping approaches [22, 23, 25, 30, 37, 38]. The delay in recording a structured diagnosis has the potential to impact the development of cohorts and outcome ascertainment from RWD because analyses based on structured diagnostic codes could preferentially identify patients with a longer history of disease. In addition, analyses limited to patients with more recent data are likely to have a lower prevalence of disease than cohorts with a longer history, which may bias historic comparisons with synthetic or external control arms. Finally, if data are obtained from multiple institutions with varying local diagnostic patterns, additional biases may be introduced to multi-site studies.
While concerns about EHR data completeness are often described as data collection and quality issues, this is, in many cases, primarily a concern when the data are used for secondary research purposes [6, 39, 40]. When assessed from a clinical perspective, information related to a disorder, such as blood pressure measurements or documentation within an unstructured clinical note, can be used by a healthcare provider to draw equivalent conclusions, despite the high potential to be missed during automated digital extraction. Therefore, EHR data may not be missing or of low quality; rather, they are collected for clinical, not research, purposes. Even with these limitations, EHR data can add significant value when analyzed appropriately. For example, others have demonstrated that what may often be described as noise within EHR data, such as frequency of measurements or presence of repeat diagnoses, can actually be used to predict patient outcomes and the temporality of clinical conditions [31, 34, 41].
Despite the potential concerns related to the use of EHR data described here, it is important to acknowledge that similar issues can also be found in clinical research and manually adjudicated data sets, such as disease registries. Several studies have shown significant variability in the accuracy and inter-rater reliability of manual data abstraction. One case study by the Office of the Inspector General for the Department of Health and Human Services found that manual nurse review identified 78% (93 of 120) of adverse events in the study population. Similarly, patient report, a common source for clinical research studies, has been found to over- or under-represent even major healthcare events, such as readmission, in nearly 30% of cases. Therefore, strategies to better understand and use RWD to augment data collected through traditional methods have the potential to increase the accuracy and completeness of patient history, clinical events, and healthcare outcomes.
This study has several limitations. First, data were collected from a single site within a healthcare system. We did not assess the possible impact of data quality issues in the EHR and its mapping to the PCORnet Common Data Model, a complex issue we consider to be beyond the scope of this study. However, our findings for the number of missing diagnoses for those with HTN were consistent with previously published studies [26, 27]. This work also focused on only three phenotypes that could be reliably identified from clinical measurements and laboratory testing, all of which were chronic diseases, and did not assess more complex diagnoses or variations in thresholds for diagnosis. Finally, we did not assess the cause or clinical impact of delayed or missing structured diagnoses.
While strategies to assess data quality and account for variations in data collection for clinical research data have been developed, access to and use of RWD remains a new and rapidly evolving field. Like diagnostic laboratory tests, methods to extract data from the EHR can be viewed as assays with varying sensitivity, specificity, and window periods. Work by the Electronic Medical Records and Genomics (eMERGE) and OHDSI [38, 45] networks, among others, to create standardized next generation phenotypes will continue to improve our ability to identify clinical cohorts and outcomes. While no single approach may be effective for every study, determining whether RWD and specific computed phenotypes are fit-for-purpose will require standardized strategies and ongoing, likely use case-specific, assessment.
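To make the assay analogy concrete, the sketch below scores one extraction method against a reference standard in terms of sensitivity and specificity. The patient identifiers, the reference labels, and the function name are illustrative assumptions, not results or methods from this study, and window periods are not modeled here.

```python
def sensitivity_specificity(all_patients, reference_positive, extracted_positive):
    """Score an extraction method against a reference standard.

    all_patients: set of patient ids in the cohort.
    reference_positive: ids with the condition per the reference standard.
    extracted_positive: ids flagged by the extraction method under test.
    Returns (sensitivity, specificity); either is None if undefined.
    """
    reference_negative = all_patients - reference_positive
    true_pos = len(reference_positive & extracted_positive)
    false_neg = len(reference_positive - extracted_positive)
    false_pos = len(reference_negative & extracted_positive)
    true_neg = len(reference_negative - extracted_positive)

    sensitivity = true_pos / (true_pos + false_neg) if reference_positive else None
    specificity = true_neg / (true_neg + false_pos) if reference_negative else None
    return sensitivity, specificity


# Hypothetical cohort: the reference standard finds a, b, and c positive;
# the extraction method flags a, c, and e.
cohort = {"a", "b", "c", "d", "e"}
reference = {"a", "b", "c"}
flagged = {"a", "c", "e"}

sens, spec = sensitivity_specificity(cohort, reference, flagged)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The same comparison could be repeated at different times after the reference date to approximate a window period, since a method that relies on structured codes may score poorly early and improve as diagnoses are recorded.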