
The validity of synthetic clinical data: a validation study of a leading synthetic data generator (Synthea) using clinical quality measures



Clinical data synthesis aims at generating realistic data for healthcare research, system implementation and training. It protects patient confidentiality, deepens our understanding of the complexity in healthcare, and is a promising tool for situations where real world data is difficult to obtain or unnecessary. However, its validity has not been fully examined, and no previous study has validated it from the perspective of healthcare quality, a critical aspect of a healthcare system. This study fills this gap by calculating clinical quality measures using synthetic data.


We examined Synthea, an open-source, well-documented synthetic data generator that incorporates the key advancements in this emerging technique. We selected a representative 1.2-million Massachusetts patient cohort generated by Synthea. Four quality measures, Colorectal Cancer Screening, Chronic Obstructive Pulmonary Disease (COPD) 30-Day Mortality, Rate of Complications after Hip/Knee Replacement, and Controlling High Blood Pressure, were selected based on clinical significance. Calculated rates were then compared with publicly reported rates based on real-world data for Massachusetts and the United States.


Of the total Synthea Massachusetts population (n = 1,193,439), 394,476 were eligible for the “colorectal cancer screening” quality measure, and 248,433 (63%) were considered compliant, compared with publicly reported Massachusetts and national rates of 77.3% and 69.8%, respectively. Of the 409 eligible patients, 0.7% died within 30 days after COPD exacerbation, versus 7% reported in Massachusetts and 8% nationally. Using an expanded logic, this rate increased to 5.7%. No Synthea residents had complications after Hip/Knee Replacement (Massachusetts: 2.9%, national: 2.8%) or had their blood pressure controlled after being diagnosed with hypertension (Massachusetts: 74.52%, national: 69.7%). Results show that Synthea is quite reliable in modeling demographics and the probabilities of services being offered in an average healthcare setting. However, its capability to model heterogeneous health outcomes after services is limited.


Synthea and other synthetic patient generators do not currently model for deviations in care and the potential outcomes that may result from care deviations. To output a more realistic data set, we propose that synthetic data generators should consider important quality measures in their logic and model when clinicians may deviate from standard practice.



Clinical data synthesis is an emerging technique that has the potential to boost clinical research, system implementation and training, while protecting patient privacy [1]. However, its validity has not been fully examined, which poses questions for its broader adoption.

Access to data is essential for research, implementation and training across disciplines. However, obtaining real-world data can be costly and often presents ethical challenges such as privacy concerns. This is particularly challenging in healthcare, where health records contain highly sensitive information and are strictly protected by laws and organizational policies [1].

To circumvent these challenges, some organizations and individuals have developed different approaches to synthesize clinical data. These approaches are usually based on probability-based logic and completely bypass the use of real patient-level data. As a result, they pose no risk of revealing personally identifiable information. Example products include Patient Generator [2], EMRBots [3], and Synthea [1]. Synthea, for example, emphasizes the use of publicly available health statistics (e.g., census) and clinical guidelines, and attempts to make the synthetic data sufficiently realistic but not real [1].

Before synthetic data is used more widely, its validity needs to be tested, e.g. how closely synthetic data resembles real data [4]. Researchers in this field usually refer to this property as the “realism” of synthetic data [5]. Researchers in the broader simulation community call it operational validity, and call the process of assuring this property operational validation [6], which consists of a variety of methods. A recent article [5] found that there was no consensus on the methods most appropriate for operational validation of synthetic clinical data, and that only a few studies have actually done it. This shortcoming is not unique to synthetic clinical data. Earlier reviews found that validation of healthcare simulation in general was lacking [6, 7].

Because quality of care is one of the primary goals and characteristics of a healthcare system [8], we consider it critically important to have synthetic data presenting the same level of care quality as real data. Therefore, we will use clinical quality measures to validate the synthetic clinical data. Quality measures are evidence-based metrics to quantify the processes and outcomes of healthcare. They are widely used to indicate the level of effectiveness, safety and timeliness of the services that a healthcare provider or organization offers [9].

After reviewing the three synthetic data products, we decided to focus on Synthea because it is open-source and well-documented in peer-reviewed journal articles and online documentation. Patient Generator is a commercial product that builds its core modules upon Synthea, so our understanding of Synthea would largely apply to Patient Generator. We determined that EMRBots was ineligible for our study because, as described later, most of the quality measures chosen focus on health outcomes, which is not an aspect that EMRBots considers in its design. According to the creator of EMRBots, this is because it doesn’t model time-dependent interactions between patient factors and clinical outcomes [3]. Synthea models care processes after clinical guidelines and models care outcomes after literature and clinical expertise. Synthea currently models 38 clinical conditions and their progressions, simulating patient-provider encounters, lab data, medication prescription and more [1]. In this respect, we find Synthea to be the most comprehensive open-source synthetic patient generator that is freely available for our validation study. Quality measures might be effective in uncovering some unrealistic aspects of Synthea because Synthea is modeled mainly after clinical guidelines, which describe what ideally should happen, while quality measures are “the other side of the coin” and spotlight suboptimal care. So far, although it has been suggested [10], quality measures have never been used in the few existing validation studies [5]. We present the first study using this method.

By doing this study, we hope to contribute to the healthcare community from three perspectives. From the perspective of synthetic data developers, we hope to provide an external validation of a representative product, shedding light on potential areas of improvement. From the perspective of synthetic data users (researchers, system implementers, teachers, health policy developers), we hope to provide some insights into the use cases for which synthetic data would be a reliable replacement for real data and the use cases for which it would not. Lastly, from the perspective of the broader healthcare community, we argue that improving a general-purpose synthetic data generator such as Synthea would essentially improve our understanding of how the healthcare system works. As mentioned above, healthcare quality is an important pillar of a healthcare system. Explaining any difference between the quality scores derived from synthetic data and those from real-world data could help us better understand the contributing factors to real-world healthcare quality.


The SyntheticMass dataset

The dataset we used is called SyntheticMass, which contains more than one million “synthetic residents” of Massachusetts pre-generated using Synthea and ready for free download [11]. The goal of this synthetic population was to statistically mirror the Massachusetts population regarding demographics, disease burdens, vaccinations, medical visits and social determinants [11]. To achieve this goal, the Synthea model was initialized with real demographic data of Massachusetts residents at the census tract, town and county levels. Demographic variables included population, percentage of different races, median age, median household income, and percentage of college graduates. After the synthetic patients were created, they went through their clinical journeys per disease modules. As explained in detail below, a disease module essentially simulates patients’ progression through a series of clinical processes per recommendations from clinical guidelines and projects their care outcomes per findings from literature or input from clinical experts. If a disease module is set up correctly, it should imitate real-world health care phenomena, including the quality of care.

We found this population to be the most appropriate for our study for two reasons. Firstly, it attempts to mimic the characteristics of the entire population of Massachusetts, which would make our quality measure results comparable to those that are publicly reported. Secondly, because Synthea adopts the Monte Carlo simulation technique, it generates a slightly different population every time the software is run [10]. Using a large, representative, pre-generated population, on the other hand, makes it easier for other researchers to replicate our work.

The SyntheticMass dataset contains a series of tables to mimic typical information from an electronic health record system. Within these tables, we mainly focused on the “encounter”, “condition” and “patient” tables. The encounter table describes patients’ encounters with health facilities, including the service date, encounter type and principal diagnosis. The condition table provides information on onset and end dates for clinical conditions (signs, symptoms and diagnoses). The condition table accumulates all identifiable conditions that a patient has, even if a condition is not the main reason why a patient seeks care in a particular encounter. The patient table provides demographic information such as identifiers, address, birth date, death date (if applicable), and gender. Modeled after electronic health records, diagnoses and procedures in Synthea are coded using Systematized Nomenclature of Medicine -- Clinical Terms (SNOMED-CT).

Disease modules and quality measures

We started by selecting quality measures relevant to the clinical modules available in Synthea. Then we obtained the publicly reported rates of the selected measures for Massachusetts and the United States as our real-world reference. We then obtained the specifications of those measures and calculated rates using the SyntheticMass dataset, so that we could compare the results from real data to those from synthetic data.

A clinical module is the basic unit in Synthea to model “clinical” and “control” events (or “states” in technical terms) in a clinical domain. “Clinical states” affect disease progression and care, while “control states” manage flow control. Figure 1 is a simplified example of childhood ear infection provided by Synthea. Children have varying likelihoods of developing an ear infection based on their age, which then triggers a non-urgent pediatric admission. During the admission a patient has a certain chance of taking either an antibiotic or a painkiller. The example stops here, but other modules usually include a process to model outcomes after the treatment (e.g., a certain chance of recovery). However, as discussed later, the modeling of outcomes might be a shortcoming in Synthea. Currently there are 38 modules in Synthea, ranging from allergies and chronic diseases (e.g. asthma) to social circumstances (e.g. homelessness).
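
To make the state-based structure concrete, the following is a minimal Python sketch of a module as a probabilistic state machine. It is not Synthea's actual module format or engine: the state names, the `run_module` helper and all transition probabilities are hypothetical placeholders, and the age dependence described above is omitted for brevity.

```python
import random

# Hypothetical, simplified module: state name -> list of (next_state, probability).
# The probabilities are illustrative placeholders, not values used by Synthea.
EAR_INFECTION_MODULE = {
    "Initial": [("Ear_Infection", 0.3), ("Terminal", 0.7)],
    "Ear_Infection": [("Pediatric_Encounter", 1.0)],
    "Pediatric_Encounter": [("Prescribe_Antibiotic", 0.6),
                            ("Prescribe_Painkiller", 0.4)],
    "Prescribe_Antibiotic": [("Terminal", 1.0)],
    "Prescribe_Painkiller": [("Terminal", 1.0)],
}


def run_module(module, rng, start="Initial", end="Terminal"):
    """Walk one synthetic patient through the module; return the visited states."""
    path = [start]
    state = start
    while state != end:
        next_states, weights = zip(*module[state])
        # Pick the next state at random according to the transition probabilities.
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path


rng = random.Random(42)  # fixed seed so the run is repeatable
paths = [run_module(EAR_INFECTION_MODULE, rng) for _ in range(1000)]
infected = sum("Ear_Infection" in p for p in paths)
print(f"{infected} of {len(paths)} simulated children developed an ear infection")
```

Note that the sketch, like the simplified example, ends at the treatment step: no state models what happens to the patient afterwards.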

Fig. 1 An example disease module reproduced from [26] with permission to use from [1]

Table 1 Information on Selected Measures

Our selection criteria were that a quality measure needed to correspond to a clinical module available in Synthea and to be publicly reported in quality reporting programs. After reviewing measures in the Healthcare Effectiveness Data and Information Set (HEDIS), Hospital Compare and Star Ratings, we found four measures eligible with Synthea’s modules: Colorectal Cancer Screening, Chronic Obstructive Pulmonary Disease (COPD) 30-Day Mortality, Complications after Hip/Knee Replacement, and Controlling High Blood Pressure. Table 1 provides information on each of the selected measures. HEDIS is operated by the U.S. National Committee for Quality Assurance (NCQA) and is a collection of performance measures geared towards health insurance plans [12]. Hospital Compare [13] is a website operated by the U.S. federal agency Centers for Medicare and Medicaid Services (CMS) that provides performance data from participating hospitals. Star Ratings is also operated by CMS and is a rating tool that uses mostly health insurance claims data to grade Medicare Advantage plans based on quality measures and other metrics [14]. Both are public programs designed to let patients compare hospitals and health plans in their area with others. Although Hospital Compare and Star Ratings use a variety of data sources, the measures that we selected for this study all use administrative claims data [15]. These measures correspond to the Colorectal Cancer, COPD, Total Joint Replacement, and Metabolic Syndrome Disease Modules in Synthea [16]. The clinical significance of each measure is elaborated below. As a set, they cover both preventive service and chronic disease management, both ambulatory care and hospital care, and both medical service and surgical service.

Colorectal cancer screening

The first quality measure we calculated using synthetic data is Colorectal Cancer Screening, which requires patients aged 50 to 75 to have appropriate screening for colorectal cancer. This measure is important because treatment for colorectal cancer in its earliest stage can lead to a high survival rate, and colorectal cancer screening for adults in the 50–75 age group can help detect potentially cancerous polyps or colorectal cancer early [17]. We included patients who were alive in 2015 and 2016. Five tests were considered appropriate in this measure: Colonoscopy (recommended every 10 years), Flexible Sigmoidoscopy (every 5 years), Computed Tomography Colonography (every 5 years), Fecal Occult Blood Test (every year), and Stool DNA Test (every 3 years).
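
The eligibility and compliance logic of this measure can be sketched as follows. This is a minimal Python illustration, not the official measure specification or the code used in this study. The function names and test labels are hypothetical, and two simplifying assumptions are made: age is evaluated at the end of the measurement period, and each look-back window is counted in whole calendar years ending with the measurement year.

```python
from datetime import date

# Look-back window in years for each screening test, as listed above.
LOOKBACK_YEARS = {
    "colonoscopy": 10,
    "flexible_sigmoidoscopy": 5,
    "ct_colonography": 5,
    "fecal_occult_blood_test": 1,
    "stool_dna_test": 3,
}


def age_on(birth_date, on_date):
    """Age in completed years on a given date."""
    years = on_date.year - birth_date.year
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def is_eligible(birth_date, period_end):
    """Denominator: aged 50 to 75 (assumed here to be evaluated at period end)."""
    return 50 <= age_on(birth_date, period_end) <= 75


def is_compliant(procedures, period_end):
    """Numerator: at least one appropriate test inside its look-back window.

    procedures is an iterable of (test_name, date_performed) pairs.
    """
    for test, performed in procedures:
        window = LOOKBACK_YEARS.get(test)
        if window is None or performed > period_end:
            continue  # not a qualifying test, or performed after the period
        earliest = date(period_end.year - window + 1, 1, 1)
        if performed >= earliest:
            return True
    return False


period_end = date(2016, 12, 31)
history = [("colonoscopy", date(2008, 3, 1))]
print(is_eligible(date(1960, 6, 1), period_end), is_compliant(history, period_end))
```

A dataset that records only some of the five tests can still be scored this way, but patients screened by an unrecorded test would be counted as non-compliant.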

COPD 30-day mortality

The COPD 30-day mortality measure counts patients who died during, or within 30 days of, an index admission with a principal diagnosis of COPD exacerbation. This measure is important because the mortality of patients hospitalized for COPD exacerbations is significantly affected by the quality of care given. Mortality is used because it indicates the overall efficacy of individual care processes that are more difficult to measure [17].

As shown later in the results, the denominator identified by strictly following the measure specification is small. As a sensitivity analysis, we expanded the definition and examined how that might influence the result. The strictly-defined calculation only used encounters with a principal diagnosis of COPD as part of the denominator/numerator criteria. The expanded calculation also included other encounters where the COPD condition was active on the encounter date.
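
The difference between the strict and the expanded calculation can be sketched in Python as below. This is an illustrative sketch only, not the code used in this study: the record layout (plain dictionaries with `patient_id`, `date`, `principal_dx`, `code`, `onset` and `end` keys) and the function names are assumptions, and steps such as restricting to inpatient admissions or to the measurement year are left out.

```python
from datetime import date, timedelta


def died_within_30_days(admit_date, death_date):
    """True if the patient died on or within 30 days after the admission date."""
    return (death_date is not None
            and admit_date <= death_date <= admit_date + timedelta(days=30))


def copd_mortality(encounters, conditions, death_dates, copd_codes, expanded=False):
    """Return (numerator, denominator) for the strict or the expanded logic.

    encounters:  dicts with patient_id, date, principal_dx
    conditions:  dicts with patient_id, code, onset, end (None while active)
    death_dates: patient_id -> death date, or None if the patient is alive
    """
    numerator = denominator = 0
    for enc in encounters:
        # Strict logic: COPD must be the principal diagnosis of the encounter.
        matched = enc["principal_dx"] in copd_codes
        if not matched and expanded:
            # Expanded logic: any COPD condition active on the encounter date.
            matched = any(
                c["patient_id"] == enc["patient_id"]
                and c["code"] in copd_codes
                and c["onset"] <= enc["date"]
                and (c["end"] is None or enc["date"] <= c["end"])
                for c in conditions
            )
        if matched:
            denominator += 1
            if died_within_30_days(enc["date"], death_dates[enc["patient_id"]]):
                numerator += 1
    return numerator, denominator


codes = {"185086009"}  # chronic obstructive bronchitis, one code used in Synthea
death_dates = {"a": date(2016, 3, 20), "b": None, "c": date(2016, 5, 10)}
encounters = [
    {"patient_id": "a", "date": date(2016, 3, 1), "principal_dx": "185086009"},
    {"patient_id": "b", "date": date(2016, 4, 1), "principal_dx": "185086009"},
    {"patient_id": "c", "date": date(2016, 5, 1), "principal_dx": "OTHER"},
]
conditions = [
    {"patient_id": "c", "code": "185086009", "onset": date(2015, 1, 1), "end": None},
]
print(copd_mortality(encounters, conditions, death_dates, codes))
print(copd_mortality(encounters, conditions, death_dates, codes, expanded=True))
```

In the toy data, patient "c" enters the denominator only under the expanded logic, which is how the expanded definition enlarges the eligible population.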

Complication rate for hip/knee replacement

This measure looks for the occurrence of 8 complications within specific time periods after hip or knee replacement surgery. With an aging population with high rates of osteoarthritis, Total Hip Replacement and Total Knee Replacement complications have been identified as a priority area for outcome measure development [18]. A patient is counted in the numerator if heart attack, pneumonia, or sepsis, septicemia or shock occurs within 7 days of the admission; if surgical site bleeding, pulmonary embolism, or death occurs within 30 days; or if mechanical complications, or periprosthetic joint infection or wound infection, occur within 90 days of admission. The index admission date is defined as the encounter date with one of two SNOMED-CT codes: Total Knee Replacement (609588000) and Total Hip Replacement (52734007).
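
The time windows above lend themselves to a simple lookup, sketched here in Python. The complication labels and the `has_complication` helper are hypothetical names chosen for illustration; a real implementation would match SNOMED-CT or ICD-10 codes from the measure's value sets rather than plain labels.

```python
from datetime import date, timedelta

# Days after the index admission within which each complication counts.
COMPLICATION_WINDOW_DAYS = {
    "acute_myocardial_infarction": 7,
    "pneumonia": 7,
    "sepsis_septicemia_shock": 7,
    "surgical_site_bleeding": 30,
    "pulmonary_embolism": 30,
    "death": 30,
    "mechanical_complication": 90,
    "joint_or_wound_infection": 90,
}


def has_complication(index_date, events):
    """True if any (label, event_date) pair falls inside its window."""
    for label, event_date in events:
        window = COMPLICATION_WINDOW_DAYS.get(label)
        if window is None:
            continue  # not one of the 8 tracked complications
        if index_date <= event_date <= index_date + timedelta(days=window):
            return True
    return False


surgery = date(2016, 6, 1)
print(has_complication(surgery, [("pneumonia", date(2016, 6, 5))]))
print(has_complication(surgery, [("pneumonia", date(2016, 6, 20))]))
```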

Controlling high blood pressure

The measure for controlling high blood pressure looks for patients with a diagnosis of hypertension whose subsequent blood pressure measurements are below 140/90 mmHg (for patients aged 18–59, or aged 60–85 with a diagnosis of diabetes) or 150/90 mmHg (for patients aged 60–85 with no diagnosis of diabetes). This measure is important for population health because hypertension increases a patient’s risk for heart disease and stroke, both of which are leading causes of death in the United States [19]. In this calculation, we looked for patients in the synthetic population who had an active condition of hypertension (SNOMED-CT: 38341003) and whose blood pressure was adequately controlled to below the defined limits after the condition onset date.
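
The age- and diabetes-dependent thresholds can be written as a small decision function. The Python sketch below is illustrative only: the function names are hypothetical, and it checks a single blood pressure reading, whereas the actual measure specification defines which reading during the measurement period is to be used.

```python
def bp_limits(age, has_diabetes):
    """Return the (systolic, diastolic) limits in mmHg, or None if out of scope."""
    if 18 <= age <= 59 or (60 <= age <= 85 and has_diabetes):
        return (140, 90)
    if 60 <= age <= 85:
        return (150, 90)
    return None  # outside the 18-85 age range covered by the measure


def is_controlled(age, has_diabetes, systolic, diastolic):
    """True if the reading is below both limits; None if the patient is out of scope."""
    limits = bp_limits(age, has_diabetes)
    if limits is None:
        return None
    return systolic < limits[0] and diastolic < limits[1]


# The same reading is controlled for a 65-year-old without diabetes,
# but not for a 65-year-old with diabetes.
print(is_controlled(65, False, 145, 85), is_controlled(65, True, 145, 85))
```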

Calculated rate using Synthea

A quality measure consists of value sets and clinical logic. A value set is a list of medical codes, of the kind used in administrative claims and electronic medical records to define diagnoses and procedures, that identifies numerator and denominator eligibility for a clinical quality measure. For example, a value set for influenza may contain ICD-10 codes that are related to an influenza diagnosis. The value sets for each quality measure are defined in the technical specifications of each measure, and were obtained from the Value Set Authority Center (VSAC). Mapping between coding systems or terminologies was necessary because these measures were mostly developed for administrative data, in which International Classification of Diseases, Version 10 (ICD-10) and Current Procedural Terminology (CPT) codes are used to codify diseases and procedures, respectively. When needed, we mapped ICD-10 or CPT codes to SNOMED-CT, which, as mentioned above, is the terminology used in Synthea, in accordance with documentation obtained from VSAC.

After obtaining quality measure compliance rates for the SyntheticMass population, we carried out statistical bootstrapping in SPSS (IBM, Armonk, New York) with 1000 resamples to obtain a 95% confidence interval for each measure.
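
For readers unfamiliar with bootstrapping, the idea can be sketched in a few lines of Python. This is a generic percentile bootstrap written for illustration; it is an assumption that it resembles the SPSS procedure, whose exact settings are not described here. The function name and the fixed seed are hypothetical choices.

```python
import random


def bootstrap_rate_ci(outcomes, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a proportion.

    outcomes is a list of 0/1 values, one per denominator case
    (1 = numerator compliant).
    """
    rng = random.Random(seed)
    n = len(outcomes)
    rates = []
    for _ in range(n_resamples):
        # Resample the cases with replacement and recompute the rate.
        resample = rng.choices(outcomes, k=n)
        rates.append(sum(resample) / n)
    rates.sort()
    lower = rates[int((alpha / 2) * n_resamples)]
    upper = rates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Example shaped like the strict COPD result: 3 deaths among 409 admissions.
outcomes = [1] * 3 + [0] * 406
low, high = bootstrap_rate_ci(outcomes)
print(f"point estimate {3 / 409:.4f}, 95% CI ({low:.4f}, {high:.4f})")
```

With a numerator this small the interval is wide relative to the point estimate, and when the numerator is zero every resample also yields zero, so the bootstrap interval collapses to a single point.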

Publicly reported rate

For the COPD 30-Day Mortality and Complications after Hip/Knee Replacement measures, we used publicly reported rates found on Hospital Compare as the reference for comparison. For the Colorectal Cancer Screening and Controlling High Blood Pressure measures, the real-world rates for Massachusetts were aggregated from publicly reported Star Ratings data of nine health insurance companies based in Massachusetts. National rates were obtained from the 2017 Healthcare Effectiveness Data and Information Set report [20, 21].


As shown in Table 2, SyntheticMass has a population size of around one sixth of the real Massachusetts population. Apart from average weight and BMI, the synthetic population has demographic characteristics comparable to those of the Massachusetts population. The similarity in demographic data between the synthetically generated population and the Massachusetts population might be attributed to the use of the real-world population as a reference for calibration when developing Synthea’s data generation process. However, we observed that the SyntheticMass population is, on average, considerably more obese (as measured by BMI) than its real-world counterpart.

Table 2 Comparison of Demographics between SyntheticMass and Real World Populations [27,28,29]

Our calculation for Colorectal Cancer Screening identified 394,476 eligible members (denominator), with 248,433 of them numerator compliant (63.0%). This is below the average for the Massachusetts population (77.3%) but closer to the nationally reported rate (69.8%). An interesting observation is that Synthea models only two of the five eligible tests in its modules: Colonoscopy and Fecal Occult Blood Test.

For the COPD 30-Day Mortality measure, our calculation under the strictly-defined specification returned a total of 409 encounters with a principal diagnosis of COPD during the measurement year of 2016. Of the 409 eligible admissions, three had associated patient deaths within 30 days of the index admission date (0.7%). Out of concern that this logic might be too stringent, we expanded the inclusion criteria to include all admissions of patients with COPD-related conditions. Using the expanded specification, the number of encounters in the denominator increased substantially to 181,458, of which 10,373 were followed by death within 30 days. The expanded rate for this measure is 5.7%, which is lower than the national rate of 8% and the state average for Massachusetts hospitals of 7% (1233/17636, range 5.2–9.3%). Another interesting observation is that only two SNOMED-CT codes related to COPD were used in Synthea: 185086009 (Chronic Obstructive Bronchitis) and 87433001 (Pulmonary Emphysema). Compared to the 20 ICD-10 diagnosis codes included in the Hospital Compare value sets, these two codes might not be sufficient to describe all the different types and nuances of COPD.
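
The percentages in this paragraph follow directly from the reported counts, as the short Python check below shows. The dictionary labels are ours; the counts are the ones quoted above.

```python
# (numerator, denominator) pairs quoted in the paragraph above.
REPORTED_COUNTS = {
    "SyntheticMass, strict logic": (3, 409),
    "SyntheticMass, expanded logic": (10_373, 181_458),
    "Massachusetts hospitals": (1_233, 17_636),
}

for label, (numerator, denominator) in REPORTED_COUNTS.items():
    rate = 100 * numerator / denominator
    print(f"{label}: {numerator}/{denominator} = {rate:.1f}%")
```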

Our Hip/Knee complication measure identified zero patients who met the numerator criteria. Within the entire SyntheticMass population, only 207 synthetic patients had a hip/knee replacement during the measurement period of 2016 (denominator). None of them had an admission with heart attack or pneumonia within 7 days after the procedure, or died within 30 days after the procedure. Even after expanding the parameters to include any patient who had a condition onset date within 7 or 30 days after the procedure, our calculation yielded no patients for the numerator. The Massachusetts average rate calculated from real data is 2.92% (700/23949, range 1.9–4.4%). Although the difference between 0 and 2.92% is small arithmetically, it raises serious concern over Synthea’s capability to model postoperative complications. A potential explanation is that, of the 8 complications, only one code (Pneumonia, SNOMED-CT 233604007) is directly used in Synthea. We made every effort to identify these complications using different codes, including using Myocardial Infarction (SNOMED-CT 22298006) to replace Acute Myocardial Infarction in the original specification, but we still could not identify even one numerator case.

As shown in Table 3, the Colorectal Cancer Screening rate of SyntheticMass is close to the national rate. This is expected since this measure is procedure-based, and Synthea’s strength is precisely that it can model the probability of certain services being offered in different phases of care. However, the rate is much lower than the Massachusetts rate. This suggests that Synthea might not model regional variation in quality. The other two outcome-based measures, on the other hand, are much more complex to model as there are many factors involved in the services that may impact the outcomes. Even with expanded logic, the rates of SyntheticMass are still much lower than the Massachusetts or national rates.

Table 3 Quality Measure Rates of SyntheticMass versus State/National Rates

Our measure for Controlling High Blood Pressure resulted in zero patients meeting the numerator criteria. Of the total SyntheticMass population, 241,311 synthetic adult patients had hypertension (29.91%). We hypothesized that a certain percentage of hypertensive patients would have their blood pressure controlled after treatment. After analyzing Synthea’s data, this does not seem to be the case. This may indicate that although Synthea is modeled to simulate a realistic percentage of the population with hypertension, it does not realistically model the outcomes that may occur after a diagnosis of hypertension as a result of intervention.


In this paper, we attempt to validate the realism of Synthea, a synthetic clinical data generator, by calculating clinical quality measures and comparing the results with real-world rates.

Our analysis shows that, apart from average weight and BMI, the synthetic population has demographic characteristics comparable to those of the Massachusetts population. We speculate that there are two reasons for the discrepancy in weight and BMI. Firstly, similar to the issue of hypertension (once a patient becomes hypertensive, the condition is never controlled), once a patient becomes obese, he or she might never lose weight, which inflates the overall obesity rate in the entire population. Secondly, it could also be due to the lack of reference data to accurately calibrate the model, or a difficulty in simulating height and weight simultaneously to create an accurate BMI distribution.

Results of validation using quality measures indicate that Synthea has both strengths and weaknesses in its approach. On one hand, as evident in the Colorectal Cancer Screening result, Synthea presents a high level of reliability in modeling the probabilities of certain services being offered in an average healthcare setting. On the other hand, as evident in the other outcome measures and a variance analysis between hospitals, its capability to model heterogeneous post-intervention health outcomes is limited. We cannot draw a conclusion on Synthea’s possible limitations in modeling regional variance in quality, because there is little variance between the national and state reported rates for all measures except Colorectal Cancer Screening. This is indeed a difficult task, as attested by the creator of another synthetic data generator, EMRBots [3].

This is the first study that uses quality measures to validate synthetic data. Results highlight the importance of incorporating quality measures in the synthesis logic, which, as far as we know, has not been considered in any existing products. The assumption that all care processes will follow clinical guidelines is not realistic. In real-world clinical practices, noncompliance with clinical guidelines is very common and diverse variations in healthcare utilization have been observed for decades [22]. In the long term, an ideal synthesis logic should account for all the factors that influence compliance with clinical guidelines, including the guideline’s quality itself, clinicians’ attitude to behavioral changes, an organization’s resources and many more [23]. Quality measures could serve as one way to explore and verify our understanding of these factors. Such a thinking process could also apply to other aspects of realism in synthetic data besides “quality”, such as “cost” or “access”.

This leads to an important viewpoint on the contribution of synthetic data in general. It is not merely creating another source of data we could safely play with. By researching all the underlying mechanisms that could increase its realism, we could gradually parameterize the factors and interactions that make our health system the way it is now. Quoting Epstein’s famous article Why Model? (Epstein, 2008, [24]), the development and calibration of a simulation model could offer explicit explanation of real-world phenomena, guide data collection, illuminate core dynamics, raise new questions and more. All of these are critical to enhance our understanding of the complexity in health care [25].

Our study has a few limitations. Firstly, we could only identify publicly reported quality measures for four clinical modules in Synthea. This might undervalue Synthea, which might present higher realism in other, less complicated, modules. Secondly, for the selected clinical modules, we only had one quality measure each for validation, which is not optimal since “quality” is a multi-faceted concept. However, the quality measures examined in this paper have been widely adopted in national quality programs and represent fundamental facets of quality in those clinical domains. We believe they are the basic ones that Synthea needs to model to improve its realism. Thirdly, the original specifications of these quality measures are designed mostly for administrative data, which is different from the clinical data (electronic health records) that Synthea generates. Although this may have an impact on the calculated rates, the differences between the synthetic rates and real-world rates are so large for three measures that we believe they could not be attributed solely to the features of different data sources.


Before synthetic clinical data can be used more widely, its realism needs to be tested. Clinical quality measures could serve as an effective validation tool because it is critical that synthetic data presents the same level of care quality as real data. Applying quality measures to Synthea uncovered its strengths and weaknesses, especially its limited capability to model heterogeneous health outcomes after major interventions. To improve their realism, Synthea and other synthetic data generators need to model factors that make clinical practice deviate from standard guidelines and introduce variations in healthcare quality. Doing so could contribute to our overall understanding of the complexity in healthcare. Future validation studies should continue to identify eligible quality measures to validate new modules available in Synthea, and identify publicly reported rates based on electronic medical records. If Synthea and other synthetic data generators could be continuously improved, expanded and rigorously validated with variations in health care quality in mind, we are optimistic about the future of synthetic clinical data.



CMS: Centers for Medicare and Medicaid Services

COPD: Chronic Obstructive Pulmonary Disease

CPT: Current Procedural Terminology

HEDIS: Healthcare Effectiveness Data and Information Set

ICD-10: International Classification of Diseases and Related Health Problems, revision 10

SNOMED-CT: Systematized Nomenclature of Medicine -- Clinical Terms

VSAC: Value Set Authority Center


  1. Walonoski J, Kramer M, Nichols J, Quina A, Moesel C, Hall D, et al. Synthea: an approach, method, and software mechanism for generating synthetic patients and the synthetic electronic health care record. J Am Med Inform Assoc. 2018;25(3):230–8.


  2. Pletcher, T. (2017). MiHIN annual report 2017. Retrieved from The Michigan Health Information Network Shared Services:

  3. Kartoun, U. (2016). A methodology to generate virtual patient repositories. Computing Research Repository.


  4. McLachlan S. Realism in synthetic data. Palmerston North: Massey University; 2017.


  5. McLachlan S, Dube K, Gallagher T, Daley B, Walonoski J. The ATEN Framework for creating the realistic synthetic electronic health record. In: 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018). Funchal: Science and Technology Publications, LDA; 2018. p. 220–30.


  6. Sargent R. Verification And Validation Of Simulation Models. In: Proceedings of the 2010 Winter Simulation Conference; 2010. p. 167–83.


  7. Fone D, Hollinghurst S, Temple M, Round A, Lester N, Weightman AL, Palmer S. Systematic review of the use and value of computer simulation modelling in population health and health care delivery. J Public Health Med. 2003;25(4):325–35.

  8. Berwick DM, Nolan TW, Whittington J. The Triple Aim: Care, Health, and Cost. Health Aff. 2008;27(3):759–69.

  9. U.S. Centers for Medicare and Medicaid Services. (2017). Quality measures. Retrieved from Quality Measures:

  10. Open Source Electronic Health Record Agent. (2018). Synthetic Patient Data Project Group. Video File. Retrieved from

  11. SyntheticMass. Retrieved October 10, 2018 from

  12. HEDIS and Performance Measurement. Retrieved October 10, 2018 from

  13. What is Hospital Compare. Retrieved October 10, 2018 from

  14. Star Ratings. Retrieved October 10, 2018 from

  15. Centers for Medicare and Medicaid Services. (n.d.). Hospital Compare: Data Sources. Retrieved October 10, 2018 from The Official U.S. Government Site for Medicare:

  16. Hall, D. (2018). Module Gallery. Retrieved from Synthea Github:

  17. National Quality Measures Clearinghouse. Chronic obstructive pulmonary disease (COPD): hospital 30-day, all-cause, risk-standardized mortality rate following COPD hospitalization. Rockville: Agency for Healthcare Research and Quality. Retrieved from National Quality Measures Clearinghouse; 2017.

  18. Centers for Medicare & Medicaid Services. (2018). #1550 hospital-level risk-standardized complication rate (RSCR) following elective primary total hip arthroplasty (THA). National Quality Forum.

  19. Centers for Disease Control and Prevention. (2018). About high blood pressure (hypertension). Retrieved from Centers for Disease Control and Prevention:

  20. National Committee for Quality Assurance. (2017). Colorectal cancer screening. Retrieved from health care accreditation, Health Plan Accreditation Organization - NCQA:

  21. National Committee for Quality Assurance. (2017). Controlling high blood pressure. Retrieved from health care accreditation, Health Plan Accreditation Organization - NCQA:

  22. Wennberg J, Gittelsohn A. Small area variations in health care delivery: a population-based health information system can guide planning and regulatory decision-making. Science. 1973;182(4117):1102–8.

  23. Quaglini S. Compliance with clinical practice guidelines. Studies in Health Technology and Informatics. 2008;139:160–79.

  24. Epstein JM. Why model? Journal of Artificial Societies and Social Simulation. 2008;11(4):12.

  25. Plsek PE, Greenhalgh T. The challenge of complexity in health care. BMJ. 2001;323:625.

  26. Generic Module Framework. (2018). Retrieved from Synthea Github:

  27. U.S. Census Bureau. (2018). 2016 American community survey 1-year estimates. Retrieved from American Fact Finder:

  28. Centers for Disease Control and Prevention. (2017). Massachusetts state nutrition, physical activity, and obesity profile. Retrieved from Centers for Disease Control and Prevention:

  29. National Center for Health Statistics. Health, United States, 2016: with Chartbook on long-term trends in health. Hyattsville: Centers for Disease Control and Prevention; 2017.



We would like to thank Mr. Jason Walonoski for helping us understand how to use the Synthea software and datasets.

Availability of data and materials

The dataset supporting the conclusions of this article is available in the SyntheticMass website,


Not Applicable.

Author information

Authors and Affiliations



Study Conception and Design: JC, DC. Acquisition of Data: JC, DC. Analysis and Interpretation of Data: JC, DC. Drafting of Manuscript: JC, DC. Critical Revision: JC, DC, MP, EC, JJ. All authors read and approved the final manuscript.

Corresponding author

Correspondence to David Chun.

Ethics declarations

Ethics approval

Not Applicable.

Consent for publication

Not Applicable.

Competing interests

Evolent Health is a commercial company that engages in the provision of health care delivery and payment services. It is not a pharmaceutical company and does not sponsor clinical trials. The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Chen, J., Chun, D., Patel, M. et al. The validity of synthetic clinical data: a validation study of a leading synthetic data generator (Synthea) using clinical quality measures. BMC Med Inform Decis Mak 19, 44 (2019).
