Is pathology necessary to predict mortality among men with prostate-cancer?

Background Statistical models developed using administrative databases are powerful and inexpensive tools for predicting survival. Conversely, data abstraction from chart review is time-consuming and costly. Our aim was to determine the incremental value of pathological data obtained from chart abstraction in addition to information acquired from administrative databases in predicting all-cause and prostate cancer (PC)-specific mortality. Methods We identified a cohort of men with diabetes and PC utilizing population-based data from Ontario. We used the c-statistic and net-reclassification improvement (NRI) to compare two Cox- proportional hazard models to predict all-cause and PC-specific mortality. The first model consisted of covariates from administrative databases: age, co-morbidity, year of cohort entry, socioeconomic status and rural residence. The second model included Gleason grade and cancer volume in addition to all aforementioned variables. Results The cohort consisted of 4001 patients. The accuracy of the admin-data only model (c-statistic) to predict 5-year all-cause mortality was 0.7 (95% CI 0.69-0.71). For the extended model (including pathology information) it was 0.74 (95% CI 0.73-0.75). This corresponded to a change in category of predicted probability of survival among 14.8% in the NRI analysis. The accuracy of the admin-data model to predict 5-year PC specific mortality was 0.76 (95% CI 0.74-0.78). The accuracy of the extended model was 0.85 (95% CI 0.83-0.87). Corresponding to a 28% change in the NRI analysis. Conclusions Pathology chart abstraction, improved the accuracy in predicting all-cause and PC-specific mortality. The benefit is smaller for all-cause mortality, and larger for PC-specific mortality.


Background
Administrative databases are often used to create models to predict clinical outcomes, in particular survival [1][2][3][4]. Most cancers are fast growing, and once diagnosed have an enormous impact on survival. Therefore, commonly, models to predict survival among these subjects include detailed oncologic information [5][6][7]. However, earlier cancer diagnoses and advances in treatment have been associated with reduced cancer mortality, such that in 2003 there were an estimated 10 million cancer survivors in the United States [8]. Consequently, patients are living longer after a diagnosis of cancer to the point where existing comorbidities may have a substantial impact on their overall survival.
Prostate cancer is the most common form of non skin cancer diagnosed in men, with three quarters of cases occurring in men aged 65 years and older [9,10]. Prostate cancer is slow growing. Accordingly, death among prostate cancer patients is more likely to be associated with a subject's comorbidities than prostate cancer itself [11][12][13]. This is particularly true among patients with diabetes [14,15].
Capturing information from pathology data is labor intensive and expensive. Therefore, if the addition of these pathology clinical variables to a predictive model with variables attained solely from administrative data does not enhance model performance, their inclusion should be avoided. The objective of this study was to quantify the impact of adding Gleason grade and cancer volume (obtained from chart review) to a predictive model for mortality among a cohort of elderly men with incident diabetes and prostate cancer. We further aimed to distinguish all-cause mortality from prostate-cancerspecific mortality. We hypothesized that pathology data may have a great impact on disease-specific mortality, but a smaller or even a null effect on all-cause mortality.

Overview
This cohort was used in a prior study to examine the impact of diabetes on prostate cancer survival [16]. We are therefore able to use it to determine the incremental utility of pathological data, obtained from chart abstraction, in addition to demographic and co-morbidity information, acquired from administrative databases, in predicting all-cause and prostate-cancer-specific mortality among diabetic men with prostate cancer. The current study was approved by the institutional review boards at Sunnybrook and Princess Margret Hospital, University Health Network, Toronto, Canada.

Data sources
The province of Ontario has a population of approximately 13 million. All residents are covered under a universal health insurance plan. Individuals aged 65 or older are additionally eligible for prescription drug coverage. We used a variety of electronic health data resources linkable using an encrypted unique identifier. The Ontario Cancer Registry (OCR) is a computerized database of Ontario residents newly diagnosed with cancer (except non-melanoma skin cancer), which is estimated to be more than 95% complete [17]. The Ontario Diabetes Database (ODD) -a validated, administrative dataderived registry of diabetes cases in Ontario [18]. The Ontario Health Insurance Plan (OHIP) database includes claims paid to physicians, groups, laboratories, and out-ofprovince providers [19]. The Canadian Institute for Health Information (CIHI) Discharge Abstract Database (DAD) contains records for each hospital stay [20]. The CIHI National Ambulatory Care Reporting System (NACRS) captures information on ambulatory care, including day surgery, outpatient clinics, and emergency departments. CIHI DAD, NACRS and OHIP were used to capture comorbidities. The Registered Persons Data Base (RPDB) contains demographics and vital status.

Study population and cohort definition
Our study cohort consisted of men aged 66 years or older with incident diabetes and subsequent PC, identified using the following steps. First, we used the ODD to identify all patients aged 66 years or older with incident diabetes in Ontario between April 1, 1994 andMarch 31, 2008 (160,867 subjects). We then crossed referenced with OCR to identify patients with incident PC after diabetes diagnosis (ICD-9 code 185, and ICD-10 code C61). We excluded patients diagnosed with PC prior to diabetes and patients with non-prostate malignancies (25,882 subjects). We then performed a detailed chart review of all pathology reports available from OCR and excluded patients without pathology data (855 subjects). Cohort entry date was defined as date of PC diagnosis. We followed up eligible individuals until they experienced an event (PC-specific or all-cause mortality), or among those who did not die, a last health services contact in Ontario (for those who lost health contact for at least 6 months) or March 31, 2009, whichever came first.

Pathology chart abstraction
For the purpose of this study we performed a pathology chart abstraction of all subjects with available pathology reports at OCR. Chart abstraction was performed by an expert Uro-Oncologist (DM). The variables abstracted were Gleason grade (primary and secondary), volume of tumor (in biopsy the volume was defined as number of positive cores/ number of cores taken; in prostatectomy specimens the volume as recorded by the pathologist), PSA value at biopsy and procedure (i.e. prostate biopsy or transurethral prostatectomy). These variables were linked deterministically to our dataset using the unique OCR identifier. Since only 1312 (28.6%) had a PSA value recorded this variable was not included in the analysis.
For subjects with repeat biopsies during follow-up the first biopsy was considered the "diagnostic biopsy". Any pathology report from an external consultant was documented and included in the analysis. Subjects who had a transurethral prostatectomy and subsequently a prostate biopsy were considered diagnosed by transurethral resection (TUR) but the Gleason score and tumor volume were assigned based on prostate biopsy.

Outcome definitions
We measured 2 outcomes: (1). PC-specific mortality, which was derived from the cause of death recorded in the OCR. The cause of death variable in OCR has been validated in several previous studies [21][22][23]. (2). All-cause mortality was derived from death certificate data in RPDB. The date of death for both outcomes was taken from the RPDB.

Covariates and statistical analysis
In this study we compared two nested Cox-proportional hazard models to model the hazard of all-cause and prostate cancer-specific mortality. The first model consisted of covariates that were derived from administrative databases including: age at prostate cancer diagnosis, Johns Hopkins Clinical Groups Case Mix system weighted sum of Adjusted Diagnostic Group [24,25], year of cohort entry (to adjust for temporal changes in the management of diabetes and prostate cancer), socioeconomic status (SES) and rural housing. Socioeconomic status and housing status were assessed using the RPDB. The RPDB uses information from Stats Canada and links each individual postal code to an appropriate defined neighborhood. Statistics Canada reports the median income for each neighborhood, and ranks them into quintiles (5 refer to the highest socioeconomic status whereas quintile 1 is the lowest). The second model included Gleason grade and cancer volume (obtained from pathology abstraction) in addition to all aforementioned variables. We used the c-statistic described by Pencina et al. [26][27][28] as a measure of the accuracy of the two models. We also used net-reclassification improvement (NRI) as a measure of the utility of the added pathology variables. Using the NRI method [26,29], we first used the fitted model that used administrative data only, to classify persons' predicted probability of 5-year mortality into low (less the 10%), intermediate (10-50%) or high (more than 50%) risk. We chose these categories as we believe they represent clinically meaningful categories; similar categories were used by others to asses all-cause mortality among prostate cancer patients [30]. Subsequently, we repeated the same classification process using the extended model (i.e. with the addition of Gleason grade and cancer volume). If the first model categorized the individual's predicted risk of 5 year mortality to the intermediate category (10-50%), for example, and the second model categorized the same individual into the high category ( more than 50%), the new model moves the predicted risk score 'UP'. If, however, the new model categorized the individual into the low risk category, the model moves the predicted risk score 'DOWN'. We repeated the same analysis for all-cause and prostate-cancer specific mortality. The c-statistic and NRI were calculated using a macro provided Chambless et al. [31] and 95% CI were estimated using 200 bootstrap samples. All analyses were conducted with SAS version 9.2 (SAS Institute).

Results and discussion
Overall, 4856 incident diabetic men older than 66 who subsequently developed prostate cancer were identified. Pathology reports were available for 4001 (82%) who were included in the analysis. The median (IQR) age at PC diagnosis was 75 (72-79) years (Table 1) The multivariable models to predict all-cause mortality are described in Tables 2 and 3 (Table 2-the model consisting of variables derived from administrative data only; Table 3-the model that included Gleason grade and cancer volume in addition to variables derived from administrative data). In both models age, year of cohort entry, rural residence and comorbidities were independent predictors of all-cause mortality. Higher Gleason grade and cancer volume (Table 3) were also associated with increased all-cause mortality, even after controlling for the effect of other variables. The accuracy of the models (i.e. c-statistic) to predict 5-year mortality were 0.7 (95% CI 0.69-0.71) and 0.74 (95% CI 0.73-0.75) for the admin-data only and the extended model (including pathology information), respectively. This corresponded to an incremental increase of 0.04 (95% CI 0.03-0.05) in the c-statistic.
Using the NRI method, we first used the model based on administrative data only, to classify persons' predicted probability of 5-year mortality into low (less the 10%), intermediate (10-50%) or high (more than 50%) risk. The risk category of predicted probability of 5-year mortality did not change in 85.2% of patient when the extended model was applied (Table 4). Among the 214 (5.3%) subjects with low predicted probability of 5-year mortality, the extended model reclassified 31 (14.5%) subjects to a higher risk group (Table 4). Among the 353 (8.8%) patients with a high predicted probability of 5-year mortality, the extended model moved 122 (34.6%) patients to a lower risk group.
Of the 3432 patients classified to 10-50% risk, the extended model moved 221 (6.4%) to the lower risk group and 219 (6.4%) to the higher risk group.
The multivariable models to predict prostate-cancer specific mortality are described in Tables 5 and 6  (Table 5-the model consisting of variable derived from  administrative data only; Table 6-the model that included Gleason grade and cancer volume in addition to the variables in the administrative data model). Higher Gleason grade and cancer volume were important predictors of prostate cancer specific mortality. The accuracy of the models (i.e. c-statistic) to predict 5-year mortality were 0.76 (95% CI 0.74-0.78) and 0.85 (95% CI 0.83-0.87) for the admin-data only and the extended model (including pathology information), respectively. This corresponded to an incremental increase of 0.09 (95% CI 0.07-1.1) in the c-statistic.
Using the NRI method, we first used the model based on administrative data only, to classify persons' predicted probability of 5-year mortality into low (less the 10%), intermediate (10-50%) or high (more than 50%) risk. The risk category of predicted probability of 5-year prostate cancer specific mortality of 928 subjects (28%) in our cohort changed when the extended model was applied (Table 7). Among the 2981 subjects with low predicted probability of 5-year prostate cancer specific mortality (less than 10%), the extended model reclassified 378 (14.5%) to a higher risk group (Table 7). Among the 28 patients with a high predicted probability of 5-year prostate cancer specific mortality (more than 50%), the extended model moved 18 (64.3%) of to a lower risk group. Of the 990 patients classified to 10-50% risk the extended model moved 469 (47.4%) to the lower risk group and 58 (5.9%) to the higher risk group.
Using data from 4001 elderly male diabetic patients who subsequently developed prostate cancer, we demonstrated that pathology data obtained by chart abstraction improved the accuracy in predicting all-cause and prostate-cancer specific mortality. The benefit in predicting all-cause mortality was modest, evident by only a 0.04 difference in the c-statistic, and by the fact that the extended model (with Gleason grade and cancer volume) changed the risk category of predicted probability of survival for 14.8% of the men in our cohort. In contrast to this, pathology data modified the accuracy in predicting prostate cancer specific mortality considerably. The extended model demonstrated a c-statistic of 0.85 (95% CI 0.83-0.87) compared to 0.76 (95% CI 0.74-0.78) when only administrative data was used. Furthermore, the extended model changed the risk category of predicted probability of 5 years survival for 28% of the men in our cohort.
Most predictive models in prostate cancer rely on clinical data such as PSA, Gleason grade, cancer volume and stage [32][33][34]. Detailed clinical information is often missing in large administrative data, and therefore these models have limited use among policy makers and  health service researchers. Deriving missing data from administrative database can be accomplished in several methods. One could abstract a random sample of medical records and use that abstraction to develop an algorithm to infer the data [35]. However this rarely applies to pathology. There is no unified clinical pathway that may identify Gleason grade or volume of tumor. For example two patients with a similar Gleason grade and stage may receive different treatments, and vice versa patients with different pathology may receive similar treatment. Though less efficient, we believe that chart review is the most reliable method for pathology data gathering.
In this study we demonstrated that the value of chart review and detailed pathology data depends on the research question and the accuracy threshold that is acceptable. Changes in NRI risk categories correspond to change in sensitivity at the higher threshold plus change in sensitivity at the lower threshold (personal communication Michael Pencina). For example, if 10% accuracy is needed then chart abstraction is required regardless of outcome. However, in a study were 20% accuracy is acceptable, we demonstrated that chart review is needed only for the outcome of prostate cancer mortality; and not for all-cause mortality.
In our study, we reviewed over 5000 pathology reportsif one assumes a trained chart abstractor (that costs approximately $25 an hour) can review a report in 10minutes, the total time dedicated to chart review was 840 hours. The total cost associated with this endeavor was $21,000. This may not be feasible for a larger cohort.
Two key aspects of prediction model performance are discrimination and calibration [36]. Discrimination refers to the ability of a prediction model to distinguish between patients. A typical measure of discrimination is the c-statistic; the c-statistic provides the probability that, for a randomly selected pair of subjects, the model gives a higher probability to the subject who had the event, or who had the shorter survival time. However, one limitation of the c-statistic is that a strong risk predictor may have limited impact [31,36]. Furthermore, the c-statistic is difficult to interpret clinically. Therefore, in our study we used NRI to assess the difference in calibration between the models. Calibration This table depicts the classification differences between the extended model (with pathology) to the model without pathology. Using a cut-off of less than 10% risk of 5 year all cause mortality. The extended model reclassified 31 of 214 (14.5%) to a higher risk group (between 10-50% risk). Using a cutoff of more than 50% the extended model moved 122 (34.6%) of 353 patients to a lower risk group. Of the 3432 patients classified to 10-50% risk the extended model moved 221 (6.4%) to the lower risk group and 219 (6.4%) to the higher risk group.  measures whether predicted probabilities agree with observed proportions. Reclassification can directly compare the clinical impact of two models by determining how many individuals would be reclassified into clinically relevant risk strata. Adding Gleason grade and cancer volume to the prediction model for all-cause mortality moved approximately 15% of subjects-however adding the same variables moved approximately 30% of subjects in the model predicting prostate-cancer specific mortality. There are many other tests that may be used to measure predictive accuracy such as calibration plots, decision curves, and integrated discrimination improvement [37][38][39][40]. Since the objective of our study was to establish the importance of pathological data and not form the ideal prediction model we did not utilize all these measures. In our NRI analysis we used several cut-offs to assess reclassification. We considered the 5-year probability of mortality as low (less the 10%), intermediate (10-50%) or high (more than 50%) risk. Although risk is a continuum, clinicians ultimately have to make binary choices, such as whether or not to treat a subject. This entails consideration of how high a risk is "high enough" to necessitate action. A risk cut-point depends on the relationship between the harms of an event and the harms of needless treatment. We therefore set our cut-points to be clinically relevant. Although the intermediate risk category range is rather large (10-50% risk), these patients are often grouped together and treatment decisions are made at the extremes. Since we aim to assess the utility of adding clinical variables to a prediction model, we believe that these categories are sufficient.
There are several limitations to our study. First, our population was older, diabetic, and had worse Gleason grade distribution than the general population of PC patients, thus generalizability to a contemporary PC cohort is tempered. It is possible that among a younger cohort pathology may have a greater impact on both all-cause mortality and prostate cancer-specific mortality. However, nearly three quarters of men with prostate cancer are aged 65 and older at the time of diagnosis and most of these patients have other comorbidities such as hypertension, ischemic heart disease, diabetes, etc. [9,13,15] Restricting our cohort to incident diabetics who subsequently develop prostate cancer served to create a more homogeneous cohort and minimize the possibility for misclassification of comorbidities a common problem with administrative data [41,42]. Second, we did not include treatment (such as radiation or surgery) in our predictive models. Since these variables are post baseline (i.e. they occur after the diagnosis) they need to be modeled as time-dependent covariants. If they are not modeled appropriately that introduces immortal bias, since in order to receive a treatment one obviously needs to survive until that time. Since there is no known way of calculating the c-statistic or NRI for Cox proportional models with time dependent covariates we have decided not to include them in our model. Furthermore, since treatment would have been included in both the administrative data and the extended models, their exclusion should not have changed our results. Finally, we lack data on severity of diabetes, body mass index, and prostate cancer stage even in the model including pathology. Further studies are needed to address the impact of adding these variables.

Conclusions
Administrative data are relatively inexpensive to obtain. In contrast, abstracting clinical data from patients' medical records is expensive, time consuming and labor intensive. The current study demonstrates that the additional cost associated with the use of detailed clinical data may not be justified for an outcome of all-cause mortality, but is more important to study prostate-cancer specific mortality. This information is useful to inform administrative health researchers and policy-makers about the proper allocation of funding resources.

Competing interests
The authors declare that they have no competing interests.
Authors' contributions DM had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. DM carried out data collection and assembly, drafted the initial version of the manuscript, and participated in data and statistical analysis, interpretation of the data, and critical revision of the manuscript for important intellectual content. DU, LL, CB, This table depicts the classification differences between the extended model (with pathology) to the model without pathology. Using a cut-off of less than 10% risk of 5 year prostate-cancer specific mortality. The extended model reclassified 378 of 2981 (12.68%) to a higher risk group (between 10-50% risk). Using a cutoff of more than 50% the extended model moved 18 (64.3.6%) of 28 patients to a lower risk group. Of the 990 patients classified to 10-50% risk the extended model moved 469 (47.37%) to the lower risk group and 58 (5.9%) to the higher risk group.
GK, JB, and NF participated in analysis and interpretation of the data, critical revision of the manuscript for important intellectual content, and study supervision. PA conceived the study and its design and participated in data and statistical analysis, interpretation of the data, critical revision of the manuscript for important intellectual content, and study supervision. DM, DU, LL, JB, CN, and NF obtained funding. All authors read and approved the final version of this manuscript.