Skip to main content
  • Research article
  • Open access
  • Published:

Methods for identifying 30 chronic conditions: application to administrative data

A Correction to this article was published on 04 September 2019

This article has been updated



Multimorbidity is common and associated with poor clinical outcomes and high health care costs. Administrative data are a promising tool for studying the epidemiology of multimorbidity. Our goal was to derive and apply a new scheme for using administrative data to identify the presence of chronic conditions and multimorbidity.


We identified validated algorithms that use ICD-9 CM/ICD-10 data to ascertain the presence or absence of 40 morbidities. Algorithms with both positive predictive value and sensitivity ≥70% were graded as “high validity”; those with positive predictive value ≥70% and sensitivity <70% were graded as “moderate validity”. To show proof of concept, we applied identified algorithms with high to moderate validity to inpatient and outpatient claims and utilization data from 574,409 people residing in Edmonton, Canada during the 2008/2009 fiscal year.


Of the 40 morbidities, we identified 30 that could be identified with high to moderate validity. Approximately one quarter of participants had identified multimorbidity (2 or more conditions), one quarter had a single identified morbidity and the remaining participants were not identified as having any of the 30 morbidities.


We identified a panel of 30 chronic conditions that can be identified from administrative data using validated algorithms, facilitating the study and surveillance of multimorbidity. We encourage other groups to use this scheme, to facilitate comparisons between settings and jurisdictions.


Management of chronic disease is the major challenge facing health systems worldwide [1]. Many people with chronic disease have multiple chronic conditions, which is termed multimorbidity [2]. It is clear that multimorbidity is common and associated with worse clinical outcomes and higher health care costs, compared to good health or to the presence of a single chronic condition [3-6]. However, there are key knowledge gaps concerning the basic epidemiology of multimorbidity [7]; its clinical and economic consequences; and how it contributes to disparities in health [8]. This information is prerequisite to mitigating the impact of multimorbidity and chronic disease [9].

Multimorbidity has been identified as a key research priority by the Public Health Agency of Canada, and is crucial to inform programming and resource forecasting. Knowledge of secular changes in the incidence and prevalence of multimorbidity is required, and would be facilitated by methods for identifying the presence of multimorbidity using administrative data.

Identifying morbidity using administrative data can be simple (e.g., based on a single hospitalization and using only a small number of codes) or complex (e.g., including inpatient and outpatient encounters and long lists of codes). Once developed, algorithms may be validated against a suitable gold standard (e.g., chart reviews; other previously validated algorithms).

Previous work by Barnett et al [3] identified 40 morbidities and were informed by a systematic review of multimorbidity measures [10], the Quality and Outcomes Framework of the UK General Practice contract, and health service planning by NHS Scotland. However, these authors used administrative data sources that are unique to the United Kingdom to identify the presence or absence of these conditions. A corresponding scheme based on the more widely used ICD-9 CM/ICD-10 system is not available.

Therefore, we first identified previously validated algorithms that use the ICD-9 CM/ICD-10 system for ascertaining the presence of chronic conditions using inpatient and outpatient claims and utilization data. We then showed proof of concept for using administrative data to study multimorbidity, by applying these previously validated algorithms to a population of adults residing in Edmonton, Canada between April 2008 and March 2009.


The institutional review boards at the University of Alberta (Pro00038795) and the University of Calgary (E22590) approved the study.


We did a focused literature search for validated algorithms that use ICD-9 CM/ICD-10 codes in administrative data from inpatient and outpatient encounters to ascertain the presence or absence of the 40 morbidities identified by Barnett et al [3]. We searched MEDLINE using combinations of the following MeSH subject headings together with the specific morbidity of interest: ‘International Classification of Diseases’, ‘Reproducibility of Results’, and ‘Sensitivity and Specificity’. Based on an a priori decision, we considered algorithms to be of high validity if they had both positive predictive value (PPV) and sensitivity ≥70% as compared to an acceptable gold standard such as chart review. We considered algorithms to be of moderate validity if they had PPV ≥70% but sensitivity <70%. The cut-off values for PPV and sensitivity were based on previous validation studies of administrative data [11]. We did not consider negative predictive value or specificity, because these parameters are generally >90% in studies of chronic diseases among the general population [11,12]. The definition of multimorbidity required the coexistence of two or more of the morbidities. In a secondary analysis we used a more restrictive definition that required three or more morbidities to be present.

ICD codes

Canadian hospital discharge abstract data are coded with ICD-10 CA, which essentially increases specificity compared to the ICD-10 system by adding more digits [11]. All of the ICD-10 codes from the included algorithms are consistent with ICD-10 CA codes, and thus we used ICD-10 and ICD-10 CA codes interchangeably throughout this manuscript. When ICD-10 codes were not given in the primary papers, we used the Canadian Institute for Health Information (CIHI; conversion table to convert ICD-9 CM codes to ICD-10 codes. Many algorithms required multiple codes within a specified time period to determine incidence of morbidity (Table 1). In each case the index date for the disease was considered to be the date of the first code. For example, in order to determine presence of asthma, we searched for ICD-9 CM 493 and ICD-10 J45 codes in hospitalizations and outpatient encounters. We considered asthma to have developed at the first instance of a single hospitalization with either of these codes, or a single outpatient encounter followed by two further outpatient encounters with either of these codes within two years. In either case, we considered the participant to have asthma for the duration of follow-up.

Table 1 Administrative algorithms for 30 morbidities

Proof of concept

We applied identified algorithms with high or moderate validity to a population-based administrative dataset from Alberta Health (AH; the provincial health ministry) and Alberta clinical laboratories. Details of this administrative dataset including claims, hospitalizations and Ambulatory Care Classification System (ACCS) utilization are given in Figure 1 and have been reported elsewhere [13]. We assembled a cohort of adults aged ≥18 years who resided in the city of Edmonton, Alberta between April 2008 and March 2009, and included all people registered with AH. All Alberta residents are eligible for insurance coverage by AH, and >99% participate in this coverage. The dataset included demographic information such as postal code of residence, laboratory data, and medication in those aged ≥65 years [13]. We identified Edmonton residents from the AH registry file using the community name variable from the Statistics Canada Postal Code 2008 Conversion file [14] (

Figure 1
figure 1

Development of the cohort. ICD International Classification for Diseases, CKD chronic kidney disease. Barnett K, Mercer SW, Norbury M, Watt G, Wyke S, Guthrie B. Epidemiology of multimorbidity and implications for health care, research, and medical education: a cross-sectional study. Lancet. 2012;380(9836):37-43.

To demonstrate proof of concept for applying these algorithms to a large administrative dataset, we presented a simple summary of the prevalence of morbidity and multimorbidity in the study population. Counts and percentages were presented along with a figure showing how the number of morbidities varies by age. In sensitivity analyses, we presented the prevalence of morbidity as assessed by different sources of administrative data.



We identified 16 morbidities for which the best identified algorithm was of high validity: asthma, atrial fibrillation, metastatic cancer, chronic heart failure, chronic kidney disease, chronic pain, cirrhosis, diabetes, hypertension, irritable bowel syndrome, multiple sclerosis, myocardial infarction, peripheral vascular disease, psoriasis, schizophrenia and severe constipation. We identified an additional 14 morbidities (including two algorithms for other types of cancer and one algorithm for another type of liver disease) for which the best identified algorithm was of moderate validity: alcohol misuse, lymphoma, non-metastatic cancer (breast, cervical, colorectal, lung, and prostate), chronic pulmonary disease, chronic viral hepatitis B, dementia, depression, epilepsy, hypothyroidism, inflammatory bowel disease, Parkinson’s disease, peptic ulcer disease, rheumatoid arthritis, and stroke or transient ischemic attack. We excluded the remaining morbidities for which no suitable algorithm could be identified (Additional file 1 Table S1). Thus we identified 30 conditions using administrative algorithms (including ICD-9 CM and ICD-10 codes) that are summarized in the Table 1. Of these 30 algorithms, half were validated for both ICD-9 CM and ICD-10 codes.

We identified all conditions exclusively using ICD-9 CM and ICD-10 data with the exception of chronic kidney disease, for which we used a validated algorithm applied to ICD-9 CM and ICD-10 data [15] and supplemented using serum creatinine and albuminuria data [16]. We considered chronic kidney disease to be present if a participant met either the administrative or laboratory criteria.

In some cases, we made minor changes to the published algorithms to improve anticipated diagnostic performance, to increase consistency between algorithms used for the different conditions, and to include application to the outpatient setting. First, the original publications by Quan et al [11,17] required one hospitalization to identify the presence of a chronic condition; based on input from the first author of that paper, we modified this algorithm to allow either one inpatient code or two outpatient codes within two years to define the presence of these conditions. Second, to improve mapping of ICD-9 codes from the original (published) algorithm into ICD-10, we combined the ‘highly likely’ and the ‘likely’ codes from the original algorithm for chronic pain [18]. Third, for consistency, we modified algorithms that defined conditions as present if participants had two codes within any duration of follow-up (no matter how long) to require that the two codes occur within a three year period [19,20]. Fourth, we expanded the criteria for presence of atrial fibrillation, epilepsy, irritable bowel syndrome, and severe constipation to include two outpatient codes within two years for these conditions. However, to ensure that secondary (post-surgical) bowel complications were not incorrectly classified as chronic bowel conditions, we excluded any hospitalization for surgery when assessing the presence or absence of these conditions [21]. Fifth, we expanded the criteria for presence of stroke or TIA to include one outpatient code. Sixth, we reviewed all algorithms for overlapping codes (situations where the same code was used to identify more than one condition), and modified the algorithms to avoid double-counting of morbidities (see footnotes in the Table 1 for specific details).

Application of the algorithms to the Edmonton cohort

The study cohort included 574,409 participants (Figure 1, Table 2). Almost two-thirds were less than 50 years of age. Ten percent were 70 years of age or older and the proportion of men and women was similar. Approximately half of all participants were not identified as having any of the 30 morbidities for which high or moderate validity algorithms existed. Approximately one quarter were identified as having one of these 30 morbidities. Another quarter were identified as having 2 or more of these 30 morbidities (meeting the primary criterion for multimorbidity), whereas 12% had three or more (meeting the secondary criterion for multimorbidity).

Table 2 Morbidity characteristics of participants residing in Edmonton, Alberta during the April 2008 to March 2009 fiscal year

The apparent prevalence of most morbidities was greatly reduced (often by 50% or more) when we assessed their presence or absence using hospitalization data only (Table 2). The addition of ACCS data to hospitalization and claims data made little difference to prevalence estimates, with the possible exceptions of chronic pain, hepatitis B and cirrhosis (the prevalences of which all changed by >20%). In most cases, the algorithms as originally validated resulted in a prevalence that was intermediate between the most inclusive approach (using hospitalization, claims and ACCS data) and the most restrictive approach (using hospitalization data only). As expected, adding gold standard laboratory data for kidney function (eGFR and albuminuria) resulted in substantial increases in the apparent prevalence of chronic kidney disease as compared to administrative data alone, regardless of which administrative data sources were used.

Figure 2 depicts the percentage of participants with multimorbidity by age group. After hypertension (a prevalence of 23%), 21% had chronic kidney disease, 9% had diabetes, and 9% had depression.

Figure 2
figure 2

Number of morbidities by age in Edmonton, Alberta during the April 2008 to March 2009 fiscal year.


From a published list of chronic conditions [3], we identified a total of 30 validated algorithms including 3 algorithms for different types of cancer and 2 algorithms for liver disease. We applied the algorithms to ICD codes from claims and utilization data, and identified the presence or absence of these conditions in a cohort of 574,409 adults residing in Edmonton, Alberta between April 2008 and March 2009 (Figure 1). The overall prevalence of multimorbidity in this cohort was 26%, which is similar to the prevalence as reported in the Barnett study [3]. Our findings demonstrate proof of concept for using administrative data as a surveillance tool for multimorbidity in settings with systems for reliably capturing population-based claims and utilization data.

Multiple prior studies have ascertained the presence of various chronic conditions in the context of assessing multimorbidity [4,7,22-34]. Although there is no universally accepted definition of multimorbidity (or a list of conditions that should be used to assess the presence of multimorbidity) there appears to be consensus on several issues. First, health conditions used to define multimorbidity should be chronic but not necessarily permanent. Second, two or more concomitant conditions should be required to identify a person as having multimorbidity. Third, an attempt should be made to standardize definitions across studies to facilitate comparisons between populations [7,9,34,35]. At the same time, it is important that algorithms selected for use with administrative data should be validated against a gold standard – and demonstrate acceptable diagnostic properties so as to ensure reasonably accurate classification of individuals with respect to morbidity status. We focused on validated algorithms with positive predictive value and sensitivity ≥70%, compared to an acceptable gold standard such as chart review. Because we had access to laboratory data allowing a gold standard assessment of kidney function, we primarily assessed the presence of chronic kidney disease using estimated glomerular filtration rate (eGFR) and albuminuria rather than administrative data.

To our knowledge, this is the most comprehensive panel of validated algorithms yet applied to administrative data for the study of multimorbidity. Other studies have used reasonable but unvalidated algorithms, a more limited list of candidate chronic conditions or both. Although there are undoubtedly other chronic conditions that could be identified using administrative data, we focused on those for which available algorithms appear to have adequate sensitivity as well as positive predictive value. We will use the set of algorithms described herein as the foundation for a series of studies describing the epidemiology of multimorbidity in Alberta, Canada.

Besides the various definitions of multimorbidity that they have used, existing studies in this area have several other limitations [36]. First, population-based studies are rare (especially in Canadian settings); most studies have captured patients followed by a particular centre and are vulnerable to referral bias. Second, most studies have been unable to assess the link between multimorbidity and clinical outcomes in subgroups defined by age, sex, or low socioeconomic status. Third, little is known about the relative frequency of individual chronic conditions within the multimorbidity syndrome – or about which clusters of conditions are most common and/or clinically significant. Fourth, studies examining the economic consequences of multimorbidity have typically used relatively unsophisticated methods and/or studied only select populations. The scheme outlined in the current manuscript will allow our group to do future studies that close these knowledge gaps – informing policy and practice. We are optimistic that the scheme will also be used by other researchers from other jurisdictions with similar datasets – facilitating comparisons between studies. Future studies should test the relative importance of the morbidities identified in the current manuscript, as well as considering other potentially important morbidities for inclusion.

Limitations of the current approach include those common to all studies using administrative data. For example, we do not have information on potential confounders related to lifestyle (e.g., diet, smoking, exercise) or on measured blood pressure, which may be confounders when examining the association between multimorbidity and outcomes or costs. However, this limitation would not be expected to affect feasibility of applying the algorithms or the prevalence estimates reported here. Second, identification of some of the chronic conditions we studied might have been enhanced by simultaneous consideration of medication data [3]. We decided against including publicly funded medication data to define these conditions because medication coverage in Alberta is limited to people aged ≥65 years; those of lower SES; or with high annual medication costs. Thus, using medication data to define conditions would have biased towards a higher incidence of multimorbidity in older, poorer and sicker participants. We decided against restricting the cohort to people aged ≥65 years, because multimorbidity is relatively common in younger participants – and there might be important differences in the nature and implications of multimorbidity by age. Therefore, we will include all adult Albertans in our forthcoming analyses. Third, since participants must use medical services to be diagnosed with chronic conditions, our findings underestimate the true population burden of multimorbidity – especially for conditions that are less likely to lead to hospitalization but which may still significantly impact quality of life and other important outcomes. Fourth, we did not identify appropriate algorithms for all of the 40 target conditions, possibly because our searches were not exhaustive, and other important conditions such as obesity were not considered in this study. Therefore, our results likely underestimate the true prevalence of multimorbidity. Finally, although we focused on validated algorithms, the diagnostic performance of algorithms may vary between settings, based on coding practices and the reliability of data capture – and we did not systematically evaluate the quality of the original studies. Therefore (despite the lack of an a priori reason to suspect worse performance in our dataset), it is possible that some algorithms that were of high or moderate validity in other jurisdictions may perform less well when applied to Alberta data, especially with the modifications as described herein.


In summary, we identified a panel of 30 chronic conditions that can be identified from administrative data using validated algorithms, facilitating the study and surveillance of multimorbidity. We encourage other groups to use this scheme, to facilitate comparisons of data on multimorbidity between settings and jurisdictions.

Change history

  • 04 September 2019

    Following publication of the original manuscript [1], the authors noted several errors in Table 1. Details of the requested corrections are shown below:


  1. Beaglehole R, Bonita R, Horton R, Adams C, Alleyne G, Asaria P, et al. Priority actions for the non-communicable disease crisis. Lancet. 2011;377(9775):1438–47.

    Article  PubMed  Google Scholar 

  2. Fortin M, Lapointe L, Hudon C, Vanasse A. Multimorbidity is common to family practice. Is it commonly researched? Can Fam Physician. 2005;51:244–5.

    PubMed  Google Scholar 

  3. Barnett K, Mercer SW, Norbury M, Watt G, Wyke S, Guthrie B. Epidemiology of multimorbidity and implications for health care, research, and medical education: a cross-sectional study. Lancet. 2012;380(9836):37–43.

    Article  PubMed  Google Scholar 

  4. Fortin M, Hudon C, Haggerty J, van den Akker M, Almirall J. Prevalence estimates of multimorbidity: a comparative study of two sources. BMC Health Serv Res. 2010;10:111.

    Article  PubMed  PubMed Central  Google Scholar 

  5. Perruccio AV, Katz JN, Losina E. Health burden in chronic disease: multimorbidity is associated with self-rated health more than medical comorbidity alone. J Clin Epidemiol. 2012;65(1):100–6.

    Article  PubMed  Google Scholar 

  6. Lehnert T, Heider D, Leicht H, Heinrich S, Corrieri S, Luppa M, et al. Review: health care utilization and costs of elderly persons with multiple chronic conditions. Med Care Res Rev. 2011;68:387–420.

    Article  PubMed  Google Scholar 

  7. Fortin M, Stewart M, Poitras ME, Almirall J, Maddocks H. A Systematic Review of Prevalence Studies on Multimorbidity: Toward a More Uniform Methodology. Ann Fam Med. 2012;10(2):142–51.

    Article  PubMed  PubMed Central  Google Scholar 

  8. Parekh AK, Goodman RA, Gordon C, Koh HK, HHS Interagency Workgroup on Multiple Chronic Conditions. Managing multiple chronic conditions: a strategic framework for improving health outcomes and quality of life. Public Health Rep. 2011;126(4):460–71.

    Article  PubMed  PubMed Central  Google Scholar 

  9. Salisbury C. Multimorbidity: redesigning health care for people who use it. Lancet. 2012. [Epub ahead of print].

  10. Diederichs C, Berger K, Bartels DB. The measurement of multiple chronic diseases–a systematic review on existing multimorbidity indices. J Gerontol A Biol Sci Med Sci. 2011;66(3):301–11.

    Article  PubMed  Google Scholar 

  11. Quan H, Li B, Saunders LD, Parsons GA, Nilsson CI, Alibhai A, et al. Assessing validity of ICD-9-CM and ICD-10 administrative data in recording clinical conditions in a unique dually coded database. Health Serv Res. 2008;43(4):1424–41.

    Article  PubMed  PubMed Central  Google Scholar 

  12. Altman DG, Bland JM. Diagnostic tests 2: Predictive values. BMJ. 1994;309(6947):102.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  13. Hemmelgarn BR, Clement F, Manns BJ, Klarenbach S, James MT, Ravani P, et al. Overview of the Alberta Kidney Disease Network. BMC Nephrol. 2009;10:30.

    Article  PubMed  PubMed Central  Google Scholar 

  14. Statistics Canada: Postal Code Conversion File [; Accessed January 13.

  15. Ronksley PE, Tonelli M, Quan H, Manns BJ, James MT, Clement FM, et al. Validating a case definition for chronic kidney disease using administrative data. Nephrol Dial Transplant. 2012;27(5):1826–31.

    Article  CAS  PubMed  Google Scholar 

  16. Stevens PE, Levin A. Kidney Disease: Improving Global Outcomes Chronic Kidney Disease Guideline Development Work Group Members: Evaluation and management of chronic kidney disease: synopsis of the kidney disease: improving global outcomes 2012 clinical practice guideline. Ann Intern Med. 2013;158(11):825–30.

    Article  PubMed  Google Scholar 

  17. Quan H, Sundararajan V, Halfon P, Fong A, Burnand B, Luthi JC, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130–9.

    Article  PubMed  Google Scholar 

  18. Tian TY, Zlateva I, Anderson DR. Using electronic health records data to identify patients with chronic pain in a primary care setting. J Am Med Inform Assoc. 2013.

  19. Liu L, Allison JE, Herrinton LJ. Validity of computerized diagnoses, procedures, and drugs for inflammatory bowel disease in a northern California managed care organization. Pharmacoepidemiol Drug Saf. 2009;18(11):1086–93.

    Article  PubMed  Google Scholar 

  20. Marrie RA, Fisk JD, Stadnyk KJ, Yu BN, Tremlett H, Wolfson C, et al. The incidence and prevalence of multiple sclerosis in Nova Scotia, Canada. Can J Neurol Sci. 2013;40(6):824–31.

    Article  PubMed  Google Scholar 

  21. Sands BE, Duh M-S, Cali C, Ajene A, Bohn RL, Miller D, et al. Algorithms to identify colonic ischemia, complications of constipation and irritable bowel syndrome in medical claims data: development and validation. Pharmacoepidemiol Drug Saf. 2006;15(1):47–56.

    Article  PubMed  Google Scholar 

  22. Fortin M, Bravo G, Hudon C, Vanasse A, Lapointe L. Prevalence of multimorbidity among adults seen in family practice. Ann Fam Med. 2005;3:223–8.

    Article  PubMed  PubMed Central  Google Scholar 

  23. Fortin M, Hudon C, Dubois M-F, Almirall J, Lapointe L, Soubhi H. Comparative assessment of three different indices of multimorbidity for studies on health-related quality of life. Health Qual Life Outcomes. 2005;3:74.

    Article  PubMed  PubMed Central  Google Scholar 

  24. Fortin M, Soubhi H, Hudon C, Bayliss EA, Van den Akker M. Multimorbidity in primary health care: Creating a research agenda; Preconference Workshop. Vancouver BC: NAPCRG meeting; 2007.

    Google Scholar 

  25. Valderas JM, Starfield B, Roland M. Multimorbidity's many challenges: A research priority in the UK. BMJ. 2007;334:1128.

    Article  PubMed  PubMed Central  Google Scholar 

  26. Britt HC, Harrison CM, Miller GC, Knox SA. Prevalence and patterns of multimorbidity in Australia. Med J Aust. 2008;189:72–7.

    PubMed  Google Scholar 

  27. Marengoni A, Winblad B, Karp A, Fratiglioni L. Prevalence of chronic diseases and multimorbidity among the elderly population in Sweden. Am J Public Health. 2008;98:1198–200.

    Article  PubMed  PubMed Central  Google Scholar 

  28. Uijen AA, van de Lisdonk EH. Multimorbidity in primary care: prevalence and trend over the last 20 years. Eur J Gen Pract. 2008;1(14 Suppl):28–32.

    Article  Google Scholar 

  29. Mercer SW, Smith SM, Wyke S, O'Dowd T, Watt GC. Multimorbidity in primary care: developing the research agenda. Fam Pract. 2009;2:79–80.

    Article  Google Scholar 

  30. Caughey GE, Roughead EE. Multimorbidity research challenges: where to go from here? J Comorbidity. 2011;1:8–10.

    Article  Google Scholar 

  31. Marengoni A, Angleman S, Melis R, Mangialasche F, Karp A, Garmen A, Meinow B, Fratiglioni L: Aging with multimorbidity: A systematic review of the literature. Ageing Res Rev 2011, [Epub ahead of print].

  32. Starfield B, Kinder K. Multimorbidity and its measurement. Health Policy. 2011. [Epub ahead of print].

  33. Agborsangaya CB, Lau D, Lahtinen M, Cooke T, Johnson JA. Multimorbidity prevalence and patterns across socioeconomic determinants: a cross-sectional survey. BMC Public Health. 2012;12:201.

    Article  PubMed  PubMed Central  Google Scholar 

  34. France EF, Wyke S, Gunn JM, Mair FS, McLean G, Mercer SW. Multimorbidity in primary care: a systematic review of prospective cohort studies. Br J Gen Pract. 2012;62(597):297–307.

    Article  Google Scholar 

  35. Tinetti ME, Fried TR, Boyd CM. Designing health care for the most common chronic condition-multimorbidity. JAMA. 2012;307(23):2493–4.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  36. U.S. Department of Health & Human Services. Multiple Chronic Conditions - A Strategic Framework: Optimum Health and Quality of Life for Individuals with Multiple Chronic Conditions. In Washington, D.C. 2010.

  37. Gershon AS, Wang C, Guan J, Vasilevska-Ristovska J, Cicutto L, To T. Identifying patients with physician-diagnosed asthma in health administrative databases. Can Respir J. 2009;16(6):183–8.

    Article  PubMed  PubMed Central  Google Scholar 

  38. Alonso A, Agarwal SK, Soliman EZ, Ambrose M, Chamberlain AM, Prineas RJ, et al. Incidence of atrial fibrillation in whites and African-Americans: the Atherosclerosis Risk in Communities (ARIC) study. Am Heart J. 2009;158(1):111–7.

    Article  PubMed  PubMed Central  Google Scholar 

  39. Penberthy L, McClish D, Pugh A, Smith W, Manning C, Retchin S. Using hospital discharge files to enhance cancer surveillance. Am J Epidemiol. 2003;158(1):27–34.

    Article  PubMed  Google Scholar 

  40. Mahajan R, Moorman AC, Liu SJ, Rupp L, Klevens RM, Chronic Hepatitis Cohort Study investigators. Use of the International Classification of Diseases, 9th revision, coding in identifying chronic hepatitis B virus infection in health system data: implications for national surveillance. J Am Med Inform Assoc. 2013;20(3):441–5.

    Article  PubMed  PubMed Central  Google Scholar 

  41. Goldberg D, Lewis J, Halpern S, Weiner M, Lo Re 3rd V. Validation of three coding algorithms to identify patients with end-stage liver disease in an administrative database. Pharmacoepidemiol Drug Saf. 2012;21(7):765–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  42. Hux JE, Ivis F, Flintoft V, Bica A. Diabetes in Ontario: determination of prevalence and incidence using a validated administrative data algorithm. Diabetes Care. 2002;25(3):512–6.

    Article  PubMed  Google Scholar 

  43. Jette N, Reid AY, Quan H, Hill MD, Wiebe S. How accurate is ICD coding for epilepsy? Epilepsia. 2010;51(1):62–9.

    Article  PubMed  Google Scholar 

  44. Quan H, Khan N, Hemmelgarn BR, Tu K, Chen G, Campbell N, et al. Validation of a case definition to define hypertension using administrative data. Hypertension. 2009;54(6):1423–8.

    Article  CAS  PubMed  Google Scholar 

  45. Austin PC, Daly PA, Tu JV. A multicenter study of the coding accuracy of hospital discharge administrative data for patients admitted to cardiac care units in Ontario. Am Heart J. 2002;144(2):290–6.

    Article  PubMed  Google Scholar 

  46. Noyes K, Liu H, Holloway R, Dick AW. Accuracy of Medicare claims data in identifying Parkinsonism cases: comparison with the Medicare current beneficiary survey. Mov Disord. 2007;22(4):509–14.

    Article  PubMed  Google Scholar 

  47. Fan J, Arruda-Olson AM, Leibson CL, Smith C, Liu G, Bailey KR, et al. Billing code algorithms to identify cases of peripheral artery disease from administrative data. J Am Med Inform Assoc. 2013;20(e2):e349–54.

    Article  PubMed  PubMed Central  Google Scholar 

  48. Asgari MM, Wu JJ, Gelfand JM, Salman C, Curtis JR, Harrold LR, et al. Validity of diagnostic codes and prevalence of psoriasis and psoriatic arthritis in a managed care population, 1996-2009. Pharmacoepidemiol Drug Saf. 2013;22(8):842–9.

    Article  PubMed  PubMed Central  Google Scholar 

  49. Lurie N, Popkin M, Dysken M, Moscovice I, Finch M. Accuracy of diagnoses of schizophrenia in Medicaid claims. Hosp Community Psychiatry. 1992;43(1):69–71.

    CAS  PubMed  Google Scholar 

  50. Moscovice MF, Lurie N. Minnesota: Plan Choice by the Mentally Ill in Medicaid Prepaid Health Plans. Adv Health Econ Health Serv Res. 1989;10:265–78.

    CAS  PubMed  Google Scholar 

  51. Kokotailo RA, Hill MD. Coding of stroke and stroke risk factors using international classification of diseases, revisions 9 and 10. Stroke. 2005;36(8):1776–81.

    Article  PubMed  Google Scholar 

Download references


The authors of this report are grateful to Ghenette Houston for administrative support, and to Sophanny Tiv for database development.

This work was supported by the Canadian Institutes for Health Research (MOP 133582), by a team grant to the Interdisciplinary Chronic Disease Collaboration from the Alberta Heritage Foundation for Medical Research (AHFMR) and a Leaders Opportunity Fund grant from the Canada Foundation for Innovation. MT, HQ, and SK are supported by career salary awards from Alberta Innovates-Health Solutions. PR is supported by a post-doctoral fellowship award from the Canadian Institutes for Health Research.

Author information

Authors and Affiliations



Corresponding author

Correspondence to Marcello Tonelli.

Additional information

Competing interests

There are no competing interests. This study is based in part by data provided by Alberta Health and Alberta Health Services. The interpretation and conclusions are those of the researchers and do not represent the views of the Government of Alberta. None of the study sponsors had a role in the study design; collection, analysis, and interpretation of data; writing the report; or in the decision to submit the report for publication.

Authors’ contributions

MT conceived the study. MT, NW and HQ designed and drafted the manuscript. NW performed the statistical analyses. All authors have made substantial contributions to the development of the manuscript, and have all been involved in revising it for important intellectual content and approved the final version.

Additional file

Additional file 1: Table S1.

Validated algorithms for the original 40 morbidities.

Rights and permissions

Open Access  This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit

The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Tonelli, M., Wiebe, N., Fortin, M. et al. Methods for identifying 30 chronic conditions: application to administrative data. BMC Med Inform Decis Mak 15, 31 (2016).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: