Table 1 Summary of the risk factors extracted from different datasets

From: An ontology-guided semantic data integration framework to support integrative data analysis of cancer survival

| Level | Risk factor | Data source | Reference ontology |
| --- | --- | --- | --- |
| Individual level | Race; Gender; Ethnicity; Marital status; Smoking status; Insurance payer; Residency: county and census tract<sup>a</sup>; Age at diagnosis; Year of diagnosis; Tumor stage; Tumor type; Treatment procedure | Florida Cancer Data System (FCDS) | NCIt, TEO, OCRV |
| Contextual level | Census tract SVI<sup>b</sup> household composition and disability; Census tract SVI minority status and language; Census tract SVI housing and transportation; Census tract SVI socioeconomic status | Agency for Toxic Substances & Disease Registry (ATSDR) | OCRV |
| | Census tract high school completion rates; Census tract family poverty rates<sup>c</sup> | United States Census Bureau | OCRV |
| | Census tract rurality status<sup>d</sup> | | OCRV, NCIt |
| | County adult mental and physical health status<sup>e</sup>; County density of primary care physicians<sup>f</sup> | County Health Rankings & Roadmaps | OCRV |
| | County smoking rate; County alcohol consumption rate | Behavioral Risk Factor Surveillance System (BRFSS) | OCRV, NCIt |
  1. <sup>a</sup> The individual's residency at the county and census tract levels (i.e., the county and census tract in which the individual lives); these are the linkage variables used to connect individuals with contextual-level risk factors
  2. <sup>b</sup> The Social Vulnerability Index (SVI) reflects the resilience of communities when confronted by external stresses on human health, such as disasters or disease outbreaks
  3. <sup>c</sup> The percentage of all families whose income in the past 12 months was below the poverty level
  4. <sup>d</sup> The rurality status of each census tract is based on its Rural-Urban Commuting Area (RUCA) code
  5. <sup>e</sup> The average number of days on which a county's adult respondents reported that their mental/physical health was not good during the past 30 days
  6. <sup>f</sup> The ratio of the population to total primary care physicians
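
Footnote a describes county and census tract as the linkage variables that connect individuals with contextual-level risk factors. The following is a minimal sketch of that kind of key-based linkage. All field names, identifiers, and values are hypothetical and made up for illustration; none come from FCDS or the other data sources, and this is not the framework's ontology-guided implementation.

```python
# Hypothetical individual-level records; field names and values are illustrative only.
individuals = [
    {"id": 1, "county": "C001", "census_tract": "T0001", "tumor_stage": "II"},
    {"id": 2, "county": "C002", "census_tract": "T0002", "tumor_stage": "I"},
]

# Hypothetical contextual-level tables, keyed by the linkage variables.
tract_factors = {
    "T0001": {"tract_family_poverty_rate": 0.12},
    "T0002": {"tract_family_poverty_rate": 0.05},
}
county_factors = {
    "C001": {"county_smoking_rate": 0.18},
    # No row for "C002": the sketch leaves missing context absent rather than guessing.
}


def link_contextual(person, tract_table, county_table):
    """Return a copy of `person` with tract- and county-level factors attached."""
    linked = dict(person)  # copy so the original record is not modified
    linked.update(tract_table.get(person["census_tract"], {}))
    linked.update(county_table.get(person["county"], {}))
    return linked


linked_records = [
    link_contextual(p, tract_factors, county_factors) for p in individuals
]
```

Each linked record keeps its individual-level fields and gains whatever contextual factors exist for its census tract and county. A record whose county or tract has no contextual row simply lacks those keys, which makes gaps in the contextual data easy to detect downstream.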