A novel method for studying the temporal relationship between type 2 diabetes mellitus and cancer using the electronic medical record
© Onitilo et al.; licensee BioMed Central Ltd. 2014
Received: 12 July 2013
Accepted: 29 April 2014
Published: 9 May 2014
We developed an algorithm for the identification of patients with type 2 diabetes and ascertainment of the date of diabetes onset for examination of the temporal relationship between diabetes and cancer using data in the electronic medical record (EMR).
The Marshfield Clinic EMR was searched for patients who developed type 2 diabetes between January 1, 1995 and December 31, 2009 using a combination of diagnostic codes and laboratory data. Subjects without diabetes were also identified and matched to subjects with diabetes by age, gender, smoking history, residence, and date of diabetes onset/reference date.
The final cohort consisted of 11,236 subjects with and 54,365 subjects without diabetes. Stringent requirements for laboratory values resulted in a decrease in the number of potential subjects by nearly 70%. Mean observation time in the EMR was similar for both groups with 13–14 years before and 5–7 years after the reference date. The two cohorts were largely similar except that BMI and frequency of healthcare encounters were greater in subjects with diabetes.
The cohort described here will be useful for the examination of the temporal relationship between diabetes and cancer and is unique in that it allows for determination of the date of diabetes onset with reasonable accuracy.
Keywords: Type 2 diabetes mellitus, Cancer, Pre-diabetes, Electronic medical record, Method
The National Cancer Institute estimates that approximately 13.7 million Americans with a history of cancer were alive on January 1, 2012, with over 1.5 million additional cases diagnosed each year. Diabetes mellitus is even more prevalent, affecting 25.8 million people, or 8.3% of the population, in the United States. Accordingly, it is not uncommon for the same individual to be diagnosed with both conditions, potentially compounding both illnesses [4, 5]. Diagnosis of cancer may make management of diabetes more difficult or, conversely, diabetes may be predictive of poorer cancer outcomes [4, 6–9]. Understanding the relationship between cancer and diabetes and the impact that one disease may have on the other may provide important insight regarding both health and survival and has become a key research priority.
Recent studies have shed considerable light on the potential physiological and clinical relationship between diabetes and cancer [10–12]. Diabetes and cancer share several important risk factors, and attempts to define the relationship between the two diseases are additionally confounded by demographic and lifestyle characteristics as well as exposure to diabetes medications. Cancer tends to be somewhat easier to study using information available in the electronic medical record (EMR) and various cancer registries. Studies of diabetes generally prove to be more difficult. Diabetes develops gradually and is characterized by progressive insulin resistance and hyperinsulinemia during the pre-diabetes phase followed by increasing hyperglycemia after clinical onset. Lifestyle modifications, medication use, and other treatment options are not usually initiated until after clinical diagnosis, and many patients with diabetes go unrecognized for long periods of time. Reliance on administrative data, which indicate only when diabetes was recognized and diagnosed, not necessarily when it began, has precluded careful temporal analyses of the relationship between diabetes and cancer. Even so, EMRs have served as an important data source in initial studies of the relationship between cancer and diabetes. Limitations of such studies include imprecision in capture of diabetes onset date, inaccuracies in electronic data, difficulty in distinguishing between type 1 and type 2 diabetes, and biases inherent to retrospective and observational studies. Due in part to these limitations, the temporal and causal relationship between diabetes and cancer, if any, remains difficult to explore.
Numerous individual studies and meta-analyses have yielded important information regarding cancer risk following diabetes onset. However, recent evidence suggests that the hyperglycemia characteristic of overt diabetes may be less important in promoting cancer risk than the hyperinsulinemia characteristic of the pre-diabetes phases [13, 14]. Due to the powerful effects of insulin as a growth factor and the potential for hyperinsulinemia to impact cancer development, a number of studies have attempted to correlate insulin levels with cancer risk, finding some effect for certain cancer types [15–18]. However, little attention has been paid to the pre-diabetes phase specifically in patients known to progress to diabetes, and the long-term temporal relationship between the two diseases remains unclear. The purpose of this paper is to describe a unique method for determining date of onset of type 2 diabetes, even when onset of disease occurs prior to clinical recognition. This algorithm leverages the EMR to draw upon clinical, administrative, and laboratory data to accurately pinpoint the date of diabetes onset, exclude potential subjects with type 1 diabetes, and examine additional confounding factors, such as glycated hemoglobin (HbA1c) levels and medication exposure. Our study algorithm and methods are described in detail and compared to those published by other authors. Limitations and potential biases are also discussed.
Marshfield Clinic is a multi-specialty, regional healthcare system in Wisconsin, USA. The Marshfield Clinic EMR contains data dating back to the 1960s and provides comprehensive information regarding all encounters with the Marshfield Clinic and cooperating hospitals, including St. Joseph’s Hospital in Marshfield, WI. In 2007, Wilke et al. published an electronic algorithm for identifying patients with diabetes mellitus in the EMR. However, this algorithm was focused on a specific subset of Marshfield Clinic patients enrolled in the Personalized Medicine Research Project (PMRP) and could not accurately pinpoint date of clinical diabetes onset. In the present study, we took this algorithm a step further and developed matched cohorts of patients with and without type 2 diabetes who received care at the Marshfield Clinic to retrospectively examine the temporal relationship between diabetes and three different types of cancer, including breast, prostate, and colon cancer, as well as medication exposure and glycemic control. The study was approved by the Marshfield Clinic Scientific Review Committee and the Institutional Review Board and a waiver of subject consent was granted [study ID: ONI10711/78037].
Patients diagnosed with type 2 diabetes between January 1, 1995 and December 31, 2009 were eligible for inclusion in the study. All potential subjects were required to be 30 years of age or older by the end of the study period and could not have any diabetes-related diagnoses or medication use prior to the study period. The pool of potential subjects was then divided based on whether or not they had any diabetes-related diagnostic codes during the study period. Patients with one or more diabetes-related codes during the study period comprised the pool of potential subjects for the cohort with diabetes. Patients with no diabetes-related diagnoses prior to the end of the study period comprised the pool of potential subjects for the cohort without diabetes.
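The eligibility rules in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Patient` record, its field names, and the `assign_pool` function are hypothetical and are not part of the study's actual implementation, and medication records during the study period are not used for pool assignment here.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

STUDY_START = date(1995, 1, 1)
STUDY_END = date(2009, 12, 31)
MIN_AGE = 30  # years of age required by the end of the study period


@dataclass
class Patient:
    """Hypothetical minimal record; field names are illustrative only."""
    birth_date: date
    diabetes_code_dates: List[date] = field(default_factory=list)
    diabetes_med_dates: List[date] = field(default_factory=list)


def age_on(birth: date, when: date) -> int:
    """Completed years of age on a given date."""
    had_birthday = (when.month, when.day) >= (birth.month, birth.day)
    return when.year - birth.year - (0 if had_birthday else 1)


def assign_pool(p: Patient) -> Optional[str]:
    """Return 'diabetes', 'no_diabetes', or None if the patient is ineligible."""
    # Must be 30 years of age or older by the end of the study period.
    if age_on(p.birth_date, STUDY_END) < MIN_AGE:
        return None
    # No diabetes-related diagnosis or medication before the study period.
    all_dates = p.diabetes_code_dates + p.diabetes_med_dates
    if any(d < STUDY_START for d in all_dates):
        return None
    # Pool membership depends on diabetes-related codes during the study period.
    in_period = [d for d in p.diabetes_code_dates if STUDY_START <= d <= STUDY_END]
    return "diabetes" if in_period else "no_diabetes"
```

Membership in a pool is only the first step; the laboratory criteria described later in the paper are applied afterward and remove most potential subjects with diabetes.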
Data sources included Marshfield Clinic’s comprehensive EMR system and cancer registry. Data were collected electronically and verified through manual chart abstraction of targeted samples. Reference dates for all subjects fell within the 15-year study period from 1995 through 2009, with follow-up through 2011 and observation before the reference date as far back as the patient’s history in the Marshfield Clinic EMR. Based on the need for extensive follow-up information, subjects were required to have received sufficient care through the Marshfield Clinic system so that diagnosis dates for diabetes and/or breast, prostate, or colon cancer could be determined with reasonable accuracy. All subjects were required to have at least one non-diabetes diagnosis or electronic code documenting a well-visit from a Marshfield Clinic provider in at least one of the three calendar years prior to the reference date. Observation times were censored prior to any large gap in the EMR, which was defined as four or more consecutive calendar years.
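The recent-care requirement and the gap-censoring rule described above can be illustrated with a short Python sketch. The function names and the representation of a record as a list of calendar years with at least one qualifying encounter are assumptions made for this example; the study's own code is not shown in the paper.

```python
from typing import Iterable, Tuple

GAP_YEARS = 4  # a "large gap": four or more consecutive calendar years without encounters


def has_recent_care(encounter_years: Iterable[int], reference_year: int) -> bool:
    """True if any encounter falls in one of the three calendar years before the reference year."""
    return any(reference_year - 3 <= y <= reference_year - 1 for y in encounter_years)


def observation_window(encounter_years: Iterable[int], reference_year: int) -> Tuple[int, int]:
    """Return (first_year, last_year) of observation around the reference year,
    stopping in each direction at the first large gap."""
    years = sorted(set(encounter_years) | {reference_year})
    idx = years.index(reference_year)
    first = last = reference_year
    # Walk backward in time; stop when the empty stretch is GAP_YEARS or longer.
    for y in reversed(years[:idx]):
        if first - y - 1 >= GAP_YEARS:
            break
        first = y
    # Walk forward in time with the same rule.
    for y in years[idx + 1:]:
        if y - last - 1 >= GAP_YEARS:
            break
        last = y
    return first, last
```

For example, with encounters in 1980, 1990, 1992, 1995, 1996, 2000, and 2006 and a reference year of 1995, this sketch keeps 1990 through 2000: the 1981–1989 and 2001–2005 stretches each contain at least four empty calendar years and therefore end the window.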
Cancer diagnoses required two documented diagnoses by ICD-9 code within the EMR. The first date on which the ICD-9 code was used was considered the date of cancer diagnosis, and data were merged with data from the Cancer Registry to validate diagnoses and provide additional information. Several covariates with the potential to influence cancer risk were also examined, including comorbidities and clinical risk factors, as well as use of chemotherapy and radiation during cancer treatment. Cancer treatment data were only available for subjects in the local cancer registry, which limited analyses using these data. Comorbidities of interest included myocardial infarction, coronary heart disease, peripheral vascular disease, cardiovascular disease, chronic pulmonary disease, rheumatic heart disease, and renal insufficiency/renal failure, which were summarized using a modified Charlson score (excluding cancer and diabetes). Comorbidities were established by interrogating the EMR for relevant diagnostic codes, requiring at least two documented diagnoses in the subject’s medical record. The EMR was also interrogated for body mass index (BMI), smoking history, and insurance status at reference date as well as frequency of healthcare visits before and after the reference date. For subjects with diabetes, exposure to three classes of diabetes medications including insulin, metformin, and sulfonylurea drugs was ascertained.
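The two-code rule used for cancer and comorbidity ascertainment can be written as a small helper. This is a sketch only: the function name is hypothetical, and counting two codes recorded on the same day as a single documented diagnosis is an assumption, since the paper does not say how same-day codes were handled.

```python
from datetime import date
from typing import Iterable, Optional


def confirmed_diagnosis_date(code_dates: Iterable[date], min_codes: int = 2) -> Optional[date]:
    """Return the first date a code was used if it was documented at least
    `min_codes` times, otherwise None.

    Assumption: codes on the same calendar day count once.
    """
    unique_dates = sorted(set(code_dates))
    if len(unique_dates) < min_codes:
        return None
    return unique_dates[0]


# Usage: two codes on different days confirm the diagnosis at the earlier date.
example = confirmed_diagnosis_date([date(2004, 6, 2), date(2004, 3, 15)])
```

A date returned by such a helper would still need to be reconciled with the cancer registry, as the paragraph above describes.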
The process of participant selection and matching is summarized in Figure 2. Of note, application of our algorithm including laboratory parameters to the pool of potential subjects with diabetes resulted in exclusion of approximately 70% of patients. Less than 10% of potential subjects were lost when those with other diabetes-related diagnoses or abnormal glucose values > 3 years prior to diabetes diagnosis were excluded. An additional 40% of remaining potential subjects were excluded because they did not have at least two high glucose or HbA1c levels, and another 50% of potential subjects were excluded because they did not have a normal HbA1c or glucose value recorded within 3 years prior to diabetes diagnosis. After application of inclusion and exclusion criteria, there were 11,236 patients included in the final cohort with diabetes. After assigning reference dates, 54,365 participants without diabetes remained. Losses in the matching process resulted in a final matched cohort with 4.8 subjects without diabetes for each patient with diabetes, rather than the target ratio of 5:1. Despite the smaller final sample size, we believe a well-defined cohort is likely to be more informative and better suited for analysis than a larger but less well-defined cohort.
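The attrition and matching figures above can be checked with simple arithmetic. The sketch below assumes that each stated percentage applies to the subjects remaining after the previous step and uses 10% as an upper bound for the first step (the text says "less than 10%"); both are assumptions of this example, not statements from the paper.

```python
# Fraction of potential subjects retained at each exclusion step (assumed sequential).
retained_per_step = [
    0.90,  # at most 10% lost: earlier diabetes-related diagnoses or abnormal glucose > 3 years prior
    0.60,  # 40% lost: fewer than two high glucose or HbA1c values
    0.50,  # 50% lost: no normal HbA1c or glucose within 3 years prior
]

remaining = 1.0
for fraction in retained_per_step:
    remaining *= fraction

cumulative_excluded = 1.0 - remaining  # about 0.73 under these assumptions

# Achieved matching ratio from the final cohort sizes reported in the text.
n_diabetes = 11_236
n_no_diabetes = 54_365
matching_ratio = n_no_diabetes / n_diabetes  # about 4.84 controls per case
```

Under these assumptions the cumulative exclusion is roughly 73%, in line with the "approximately 70%" reported, and the cohort sizes give about 4.8 subjects without diabetes per subject with diabetes.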
In our final validation sample, we manually abstracted evidence of diabetes diagnosis in 70 patient charts. If a diabetes diagnosis was present (N = 50), we verified the date with laboratory values for HbA1c and glucose, office notes, and medications listed. Prior records were checked to ensure that the diagnosis had not been mentioned previously but not coded. In patients in whom no diagnosis of diabetes was evident (N = 20), we verified the absence of any diabetes diagnoses on problem lists, verified that there were no high HbA1c or glucose levels, verified that no diabetes medications were listed, and verified that diabetes was not mentioned in the notes for a recent office visit or history and physical. In this validation sample, the observed predictive value for control subjects (NPV) was 100% (20/20). It is important to note that controls can always become cases, and this was observed in one control subject who developed diabetes in 2011, 7 years after the assigned reference date in 2004, but this has no bearing on algorithm validity. The predictive value for case status (PPV) was 96% (48/50), with two subjects appearing to be incorrectly identified. However, upon arbitration, one of the two subjects was found to have a diagnosis of diabetes during the study period, increasing the positive predictive value to 98%. Overall sensitivity of the algorithm for detecting type 2 diabetes was 96% (95% CI 86.3–99.4%) and overall specificity was 95% (95% CI 75.1–99.2%). The date of diabetes onset determined by manual chart review was within 6 months of the study-assigned date of onset in over 70% of subjects with diabetes.
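The predictive values quoted above follow directly from the chart-review counts. The short Python sketch below reproduces them; the function name is illustrative, and confidence intervals are deliberately not recomputed because the paper does not state which interval method was used.

```python
def predictive_value(confirmed_by_chart: int, flagged_by_algorithm: int) -> float:
    """Proportion of algorithm-assigned subjects whose status was confirmed on chart review."""
    return confirmed_by_chart / flagged_by_algorithm


ppv_initial = predictive_value(48, 50)       # 0.96: two apparent misclassifications
ppv_after_review = predictive_value(49, 50)  # 0.98: one subject reclassified on arbitration
npv = predictive_value(20, 20)               # 1.00: all 20 controls confirmed
```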
Subject descriptive characteristics by type 2 diabetes status
Columns: Diabetes (N = 11,236); No diabetes (N = 54,365)
Rows:
- Mean age (years) (IQR)
- Diabetes diagnosis period
- Mean BMI (kg/m²) (IQR)
- Visit frequency during 2 years before diabetes
- Visit frequency during 2 years after diabetes
- Mean observation time (IQR): years before diabetes onset; years after diabetes onset
- Comorbidities: coronary heart disease; peripheral vascular disease; chronic pulmonary disease; rheumatic heart disease
Several studies have examined the influence of diabetes on cancer risk and the general consensus suggests that diabetes increases cancer risk, with the notable exception of prostate cancer. Diabetes is a progressive disease and physiological changes begin to occur long before clinical onset of disease. During the pre-diabetes phase, patients undergo a prolonged period of increasing insulin resistance and hyperinsulinemia that ultimately results in the progressive hyperglycemia characteristic of diabetes itself. Recent evidence suggests that the hyperinsulinemia characteristic of the pre-diabetes phase is more important for promoting cancer risk than the hyperglycemia present after clinical onset [13, 14]. Despite this evidence, examining cancer risk in the pre-diabetes phase is difficult and the temporal relationship between the two diseases has remained largely unexplored. We developed an electronic algorithm that calls upon administrative, laboratory, and clinical data to accurately identify patients with type 2 diabetes and to determine the date of clinical onset for over 10,000 patients. A cohort without diabetes was also generated and includes over 50,000 patients with assigned reference dates. Together, the cohort of approximately 65,000 patients with over 16 years of follow-up after diabetes onset and 6–7 years of observation before provides a resource for the temporal examination of the relationship between diabetes and cancer risk.
Comparison of algorithms using electronic medical record data for identification of patients with type 2 diabetes
≥ 1 short-stay hospital, skilled nursing facility, or home health agency claim or ≥ 2 physician/supplier claims with diabetes diagnosis
1 – 2 year identification period
1 hospital discharge abstract or 2 physician services claims showing diabetes
2 year period
250.X0, 250.X2, 357.2, 362.0X, 583.81
≥ 1 high HbA1c or random glucose or ≥ 2 random glucose tests
Metformin, sulfonylurea, or insulin
250.X0, 250.X2, or 362.XX (no insulin)
≥ 1 high HbA1c or ≥ 2 high fasting or random glucose tests
Diagnosis ≥ 30 days after first office visit
First appearance in problem list
On problem list (coded or free text) ≥ 2 times in 2 years
≥ 2 high fasting glucose tests in 1 year or any high HbA1c
High HbA1c, fasting, or random glucose
Number unique anti-diabetes medications
≥ 2 250.X0 or 250.X2
High HbA1c or fasting glucose test
Insulin or oral hypoglycemic agents except metformin
Surveillance in real time
No type 1 code, ≥ 2 type 2 codes
Abnormal glucose or HbA1c
Type 2 diabetes medications
≥ 1 250.X0 or 250.X2, ≥ 1 year before any type 1 code
≥ 2 high HbA1c or glucose test and ≥ 1 normal HbA1c or glucose test
Excluded for diabetes medication > 30 days before diagnosis
Normal and abnormal labs within 3 years
Earliest of first diagnosis or second high lab
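The final rows of the comparison above, read together with the exclusion steps described earlier (at least two high and at least one normal HbA1c or glucose value, with onset taken as the earliest of the first diagnosis or the second high laboratory value), can be sketched as one rule. This is a hedged illustration, not the study's code: the `Lab` record and `onset_date` function are hypothetical, the thresholds that make a value "high" are not given in this text and are therefore represented by a precomputed flag, and anchoring the 3-year look-back on the candidate onset date is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

LOOKBACK = timedelta(days=3 * 365)  # approximate 3-year window for a prior normal value


@dataclass
class Lab:
    """Hypothetical HbA1c or glucose result; `high` is assumed to be flagged upstream."""
    drawn: date
    high: bool


def onset_date(first_diagnosis: date, labs: List[Lab]) -> Optional[date]:
    """Return the estimated onset date, or None if the laboratory criteria are not met."""
    high_dates = sorted(lab.drawn for lab in labs if lab.high)
    if len(high_dates) < 2:
        return None  # fewer than two high values: excluded
    # Earliest of the first diagnostic code or the second high laboratory value.
    candidate = min(first_diagnosis, high_dates[1])
    # Require a normal value in the look-back window before the candidate date.
    has_prior_normal = any(
        (not lab.high) and candidate - LOOKBACK <= lab.drawn < candidate
        for lab in labs
    )
    return candidate if has_prior_normal else None
```

Requiring a normal value shortly before onset is what bounds the onset date from below; without it, a first diagnostic code could postdate the true onset by years. The sketch omits the other exclusions listed above, such as type 1 codes and earlier medication use.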
Despite meticulous selection of patients with and without diabetes for study inclusion, our cohort is nevertheless subject to biases inherent to retrospective and observational studies as well as certain time-related biases common in observational studies. In addition, the data available in an EMR are only as good as the data input during routine patient care. As such, for healthcare systems in which several healthcare choices are available nearby, laboratory values and diagnoses captured outside of the healthcare system may not be available. Marshfield Clinic serves a relatively rural, agriculture-based population with little turnover and little choice of healthcare provider. As such, our EMR serves as a robust source of data, but we recognize that certain data points may be missing. Ascertainment bias is inherent to retrospective studies. In the current cohort, ascertainment bias may result from the fact that patients with diabetes have more frequent contact with the healthcare system. Additionally, HbA1c screening was not recommended by the ADA until 2010, after the study period, and in the 3 years prior it is estimated that only 10–20% of adults without diabetes underwent HbA1c testing, which may introduce an additional source of ascertainment bias. As with other laboratory tests, methods for measuring HbA1c have also changed over time. Use of reference period as a matching criterion in cohort development is likely to minimize the effects of any such change, however. Similarly, selection bias may result from the unintentional selection for differing characteristics among patients who, for example, receive a particular diabetes treatment. Additionally, labs drawn to assess HbA1c and glucose levels may be more likely to be performed in patients with a higher risk for cancer, which is of concern in both groups.
The effects of selection bias are minimized to some extent by the cohort design, which uses an extensive matching process to account for age, gender, residence, smoking history, and reference period. In future work using the cohort described here, the potential for additional confounding as a result of selection bias will be minimized via proportional hazards regression modeling with adjustment for relevant covariates. Time-related biases, including immortal time bias, time-window bias, and time-lag bias, will be of particular concern when considering the effect of exposure to diabetes medications on cancer risk. While elimination of such biases may not be realistic, efforts to minimize their effects include use of time-varying analyses, assessment of follow-up time, and examination of both exposure and duration in analyses of diabetes medications. Importantly, reference period was used as a matching criterion in cohort development and follow-up time before and after the reference date or date of diabetes onset was similar.
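One common way to set up a time-varying exposure analysis is to split each subject's follow-up at the date of first medication exposure, so that time before the first prescription is counted as unexposed rather than credited to the exposed group (the source of immortal time bias). The sketch below illustrates that general idea only; it is not the authors' stated method, and the function name and tuple layout are hypothetical.

```python
from datetime import date
from typing import List, Optional, Tuple

Segment = Tuple[date, date, bool]  # (interval start, interval end, exposed?)


def split_person_time(start: date, end: date, first_exposure: Optional[date]) -> List[Segment]:
    """Split follow-up into unexposed and exposed segments at first exposure."""
    if first_exposure is None or first_exposure >= end:
        return [(start, end, False)]  # never exposed during follow-up
    if first_exposure <= start:
        return [(start, end, True)]   # exposed for the whole of follow-up
    return [
        (start, first_exposure, False),  # pre-exposure time stays unexposed
        (first_exposure, end, True),
    ]
```

Each segment would then enter a proportional hazards model as its own row with the exposure flag as a time-varying covariate.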
Murdoch and Detsky recently reported on the inevitable application of the massive amount of data captured by the EMR to health care, emphasizing the potential value of using information generated in the course of routine care to answer important questions and to improve the quality of care. Here we demonstrate an example using data abstracted electronically from the EMR to develop patient cohorts for the careful examination of the temporal relationship between diabetes and cancer. To date, the cohort described here has been used to examine the temporal relationship between diabetes and breast cancer in women, prostate cancer in men, and colon cancer, as well as the effects of glycemic control and medication exposure on cancer risk. In the future, we plan to use this cohort to examine tumor severity and survival as well as the effects of additional disease conditions and comorbidities, such as sleep apnea, on cancer risk in patients with diabetes.
EMR: Electronic medical record
PMRP: Personalized Medicine Research Project
MESA: Marshfield Epidemiologic Study Area
ICD-9: International Classification of Diseases, version 9
BMI: Body mass index
This study was made possible by an internal research grant [ONI10711; 78037] provided by the Marshfield Clinic. The authors also wish to thank Marie Fleisner for editorial assistance and Kim Hill for protocol development.
1. Siegel R, DeSantis C, Virgo K, Stein K, Mariotto A, Smith T, Cooper D, Gansler T, Lerro C, Fedewa S, Lin C, Leach C, Cannady RS, Cho H, Scoppa S, Hachey M, Kirch R, Jemal A, Ward E: Cancer treatment and survivorship statistics, 2012. CA Cancer J Clin. 2012, 62 (4): 220-241. 10.3322/caac.21149.
2. American Cancer Society: Cancer Facts and Figures 2012. 2012, Atlanta (GA): American Cancer Society
3. Centers for Disease Control and Prevention: National diabetes fact sheet: national estimates and general information on diabetes and prediabetes in the United States, 2011. 2011, Atlanta, GA: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, http://www.cdc.gov/diabetes/pubs/pdf/ndfs_2011.pdf
4. Stava CJ, Beck ML, Feng L, Lopez A, Busaidy N, Vassilopoulou-Sellin R: Diabetes mellitus among cancer survivors. J Cancer Surviv. 2007, 1 (2): 108-115. 10.1007/s11764-007-0016-z.
5. Vigneri P, Frasca F, Sciacca L, Pandini G, Vigneri R: Diabetes and cancer. Endocr Relat Cancer. 2009, 16 (4): 1103-1123. 10.1677/ERC-09-0087.
6. Ko CY, Maggard M, Livingston EH: Evaluating health utility in patients with melanoma, breast cancer, colon cancer, and lung cancer: a nationwide, population-based assessment. J Surg Res. 2003, 114 (1): 1-5. 10.1016/S0022-4804(03)00167-7.
7. Thong MS, van de Poll-Franse L, Hoffman RM, Albertsen PC, Hamilton AS, Stanford JL, Penson DF: Diabetes mellitus and health-related quality of life in prostate cancer: 5-year results from the Prostate Cancer Outcomes Study. BJU Int. 2011, 107 (8): 1223-1231. 10.1111/j.1464-410X.2010.09861.x.
8. Edgington A, Morgan MA: Looking beyond recurrence: comorbidities in cancer survivors. Clin J Oncol Nurs. 2011, 15 (1): E3-E12. 10.1188/11.CJON.E3-E12.
9. Khan NF, Mant D, Carpenter L, Forman D, Rose PW: Long-term health outcomes in a British cohort of breast, colorectal and prostate cancer survivors: a database study. Br J Cancer. 2011, 105 (Suppl 1): S29-S37.
10. Onitilo AA, Engel JM, Glurich I, Stankowski RV, Williams GM, Doi SA: Diabetes and cancer I: risk, survival, and implications for screening. Cancer Causes Control. 2012, 23 (6): 967-981. 10.1007/s10552-012-9972-3.
11. Ferguson RS, Gallagher EJ, Scheinman EJ, Damouni RR, LeRoith D: The epidemiology and molecular mechanisms linking obesity, diabetes, and cancer. Vitam Horm. 2013, 93: 51-98.
12. Orgel E, Mittelman SD: The links between insulin resistance, diabetes, and cancer. Curr Diab Rep. 2013, 13 (2): 213-222. 10.1007/s11892-012-0356-6.
13. Onitilo AA, Engel JM, Glurich I, Stankowski RV, Williams GM, Doi SA: Diabetes and cancer II: role of diabetes medications and influence of shared risk factors. Cancer Causes Control. 2012, 23 (7): 991-1008. 10.1007/s10552-012-9971-4.
14. Johnson JA, Bowker SL: Intensive glycaemic control and cancer risk in type 2 diabetes: a meta-analysis of major trials. Diabetologia. 2011, 54 (1): 25-31. 10.1007/s00125-010-1933-3.
15. Balkau B, Kahn HS, Courbon D, Eschwège E, Ducimetière P, Paris Prospective Study: Hyperinsulinemia predicts fatal liver cancer but is inversely associated with fatal cancer at some other sites: the Paris Prospective Study. Diabetes Care. 2001, 24 (5): 843-849. 10.2337/diacare.24.5.843.
16. Ahern TP, Hankinson SE, Willett WC, Pollak MN, Eliassen AH, Tamimi RM: Plasma C-peptide, mammographic breast density, and risk of invasive breast cancer. Cancer Epidemiol Biomarkers Prev. 2013, 22 (10): 1786-1796. 10.1158/1055-9965.EPI-13-0375.
17. Wolpin BM, Bao Y, Qian ZR, Wu C, Kraft P, Ogino S, Stampfer MJ, Sato K, Ma J, Buring JE, Sesso HD, Lee IM, Gaziano JM, McTiernan A, Phillips LS, Cochrane BB, Pollak MN, Manson JE, Giovannucci EL, Fuchs CS: Hyperglycemia, insulin resistance, impaired pancreatic β-cell function, and risk of pancreatic cancer. J Natl Cancer Inst. 2013, 105 (14): 1027-1035. 10.1093/jnci/djt123.
18. Eliassen AH, Tworoger SS, Mantzoros CS, Pollak MN, Hankinson SE: Circulating insulin and c-peptide levels and risk of breast cancer among predominantly premenopausal women. Cancer Epidemiol Biomarkers Prev. 2007, 16 (1): 161-164. 10.1158/1055-9965.EPI-06-0693.
19. Wilke RA, Berg RL, Peissig P, Kitchner T, Sijercic B, McCarty CA, McCarty DJ: Use of an electronic medical record for the identification of research subjects with diabetes mellitus. Clin Med Res. 2007, 5 (1): 1-7. 10.3121/cmr.2007.726.
20. American Diabetes Association: Diagnosis and classification of diabetes mellitus. Diabetes Care. 2010, 33 (Suppl 1): S62-S69.
21. DeStefano F, Eaker ED, Broste SK, Nordstrom DL, Peissig PL, Vierkant RA, Konitzer KA, Gruber RL, Layde PM: Epidemiologic research in an integrated regional medical care system: the Marshfield Epidemiologic Study Area. J Clin Epidemiol. 1996, 49 (6): 643-652. 10.1016/0895-4356(96)00008-X.
22. Bertram MY, Vos T: Quantifying the duration of pre-diabetes. Aust N Z J Public Health. 2010, 34 (3): 311-314. 10.1111/j.1753-6405.2010.00532.x.
23. Hebert PL, Geiss LS, Tierney EF, Engelgau MM, Yawn BP, McBean AM: Identifying persons with diabetes using Medicare claims data. Am J Med Qual. 1999, 14 (6): 270-277. 10.1177/106286069901400607.
24. Hux JE, Ivis F, Flintoft V, Bica A: Diabetes in Ontario: determination of prevalence and incidence using a validated administrative data algorithm. Diabetes Care. 2002, 25 (3): 512-516. 10.2337/diacare.25.3.512.
25. Kudyakov R, Bowen J, Ewen E, West SL, Daoud Y, Fleming N, Masica A: Electronic health record use to classify patients with newly diagnosed versus preexisting type 2 diabetes: infrastructure for comparative effectiveness research and population health management. Popul Health Manag. 2012, 15 (1): 3-11. 10.1089/pop.2010.0084.
26. Greiver M, Keshavjee K, Martin K, Aliarzadeh B: Who are your patients with diabetes?: EMR case definitions in the Canadian primary care setting. Can Fam Physician. 2012, 58 (7): 804, e421-e422.
27. Kandula S, Zeng-Treitler Q, Chen L, Salomon WL, Bray BE: A bootstrapping algorithm to improve cohort identification using structured data. J Biomed Inform. 2011, 44 (Suppl 1): S63-S68.
28. Klompas M, Eggleston E, McVetta J, Lazarus R, Li L, Platt R: Automated detection and classification of type 1 versus type 2 diabetes using electronic health record data. Diabetes Care. 2013, 36 (4): 914-921. 10.2337/dc12-0964.
29. Pacheco JA, Thompson W, Kho A: Automatically detecting problem list omissions of type 2 diabetes cases using electronic medical records. AMIA Annu Symp Proc. 2011, 2011: 1062-1069.
30. Newton KM, Peissig PL, Kho AN, Bielinski SJ, Berg RL, Choudhary V, Basford M, Chute CG, Kullo IJ, Li R, Pacheco JA, Rasmussen LV, Spangler L, Denny JC: Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network. J Am Med Inform Assoc. 2013, 20 (e1): e147-e154. 10.1136/amiajnl-2012-000896.
31. Greiver M, Aliarzadeh B, Moineddin R, Meaney C, Ivers N: Diabetes screening with hemoglobin A1c prior to a change in guidelines recommendations: prevalence and patient characteristics. BMC Fam Pract. 2011, 12: 91. 10.1186/1471-2296-12-91.
32. Suissa S, Azoulay L: Metformin and the risk of cancer: time-related biases in observational studies. Diabetes Care. 2012, 35 (12): 2665-2673. 10.2337/dc12-0788.
33. Murdoch TB, Detsky AS: The inevitable application of big data to health care. JAMA. 2013, 309 (13): 1351-1352. 10.1001/jama.2013.393.
34. Onitilo AA, Stankowski RV, Berg RL, Engel JM, Glurich I, Williams GM, Doi SA: Breast cancer incidence before and after diagnosis of type 2 diabetes mellitus in women: increased risk in the prediabetes phase. Eur J Cancer Prev. 2013, 23 (2): 76-83.
35. Onitilo AA, Berg RL, Engel JM, Stankowski RV, Glurich I, Williams GM, Doi SA: Prostate cancer risk in pre-diabetic men: a matched cohort study. Clin Med Res. 2013, 11 (4): 201-209. 10.3121/cmr.2013.1160.
36. Onitilo AA, Berg RL, Engel JM, Glurich I, Stankowski RV, Williams G, Doi SA: Increased risk of colon cancer in men in the pre-diabetes phase. PLoS One. 2013, 8 (8): e70426. 10.1371/journal.pone.0070426.
37. Onitilo AA, Stankowski RV, Berg RL, Engel JM, Glurich I, Williams GM, Doi SA: Type 2 diabetes mellitus, glycemic control, and cancer risk. Eur J Cancer Prev. 2013, 23 (2): 134-140.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1472-6947/14/38/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.