
Table 1 Overview of data types and value ranges for data elements covered by the core data model for minimum variant level data

From: Variant information systems for precision oncology

| Class | Attribute | Value range | Example |
|---|---|---|---|
| Allele descriptive | | | |
| Gene | Gene ID | Internal ID | G0002V5Z |
| | Gene name | HGNC gene symbols | KRAS |
| | Chromosome | 1..22, X, Y | 12 |
| | Entrez gene ID | Entrez gene IDs | 3845 |
| | Ensembl gene ID | Ensembl gene IDs | ENSG00000133703 |
| | RefSeq gene ID | RefSeq gene IDs | NG_007524 |
| Gene transcript | Gene ID | Internal ID | G0002V5Z |
| | Gene transcript ID | Internal ID | T0006OOW |
| | RefSeq transcript ID | RefSeq transcript IDs | NM_033360 |
| | RefSeq protein ID | RefSeq protein IDs | NP_203524 |
| | Ensembl transcript ID | Ensembl transcript IDs | ENST00000256078 |
| | UniProt ID | UniProt IDs | P01116 |
| Gene position | Gene ID | Internal ID | G0002V5Z |
| | Genome version | Genome build IDs | GRCh37.p13 |
| | DNA position | Genomic coordinate | 12p12.1 |
| Gene pathway | Gene ID | Internal ID | G0002V5Z |
| | Pathway ID | Internal ID | P003V724 |
| Gene pathway | Pathway ID | Internal ID | P003V724 |
| | Common name | Activation of RAS in B cells | |
| | KEGG ID | KEGG IDs | map04014 |
| | Reactome ID | Reactome IDs | R-HSA-1169092 |
| | PathwayCommons ID | PathwayCommons IDs | R-HSA-1169092 |
| Allele interpretive | | | |
| Variant | Variant ID | Internal ID | V0000LBB |
| | Variant type | “Single nucleotide variant (SNV)”, “multinucleotide variant (MNV)”, “insertion (INS)”, “deletion (DEL)” | SNV |
| Variant position | Variant ID | Internal ID | V0000LBB |
| | Genome version | Genome build IDs | GRCh37.p13 |
| | DNA sub. & position | HGVS genomic coordinate | NC_000012.11:g.25398284C>G |
| Gene variant | Gene ID | Internal ID | G0002V5Z |
| | Variant ID | Internal ID | V0000LBB |
| | Variant consequence | “Non-sense”, “missense”, “silent”, “frame shift”, “in-frame”, “3UTR”, “5UTR”, “splice”, “splice-region”, “intronic”, “upstream”, “downstream” | missense |
| Gene variant transcript | Gene ID | Internal ID | G0002V5Z |
| | Variant ID | Internal ID | V0000LBB |
| | Gene transcript ID | Internal ID | T0006OOW |
| | Protein sub. & position | HGVS formatted variants | NM_033360.3(KRAS):c.35G>C (p.Gly12Ala) |
| | Protein domain | Descriptive name of protein domain | Small GTP-binding protein domain |
| | Variant consequence | “Expression”, “amplification”, “deletion”, “fusion”, “loss of function”, “missense” | missense |
| | Risk score | FATHMM, SIFT, PolyPhen | 0.98468, 0, 0.97 |
| Somatic interpretive | | | |
| Cancer type | Cancer type ID | Internal ID | C000WQFL |
| | Cancer type name | NCI thesaurus or Oncotree IDs | Colorectal cancer |
| | UMLS ID | UMLS concept IDs | C1527249 |
| | HPO ID | HPO concept IDs | HP:0003003 |
| Cancer variant | Cancer variant ID | Internal ID | CV00XBQW |
| | Variant ID | Internal ID | V0000LBB |
| | Cancer type ID | Internal ID | C000WQFL |
| | Biomarker class | “Diagnostic”, “prognostic”, “predictive”, “predisposing”, “pharmacogenomic” | predictive |
| | Clinical relevance level | “Tier 1”, “Tier 2”, “Tier 3” [8] | Tier 2 |
| Cancer variant sample | Cancer variant ID | Internal ID | CV00XBQW |
| | Sample ID | Internal ID | SXBQW0A7 |
| | Somatic classification | “Confirmed somatic”, “confirmed germline”, “unknown” | somatic |
| | Allele frequency | Allele frequency in global population | 0.00001647 |
| Sample specimen | Sample ID | Internal ID | SXBQW0A7 |
| | Tumor purity | Ratio | 0.763 |
| | TNM status | TNM values | T2N1M1 |
| | Primary / relapse | Primary or relapse | primary |
| Cancer variant drug | Cancer variant ID | Internal ID | CV00XBQW |
| | Drug ID | Internal ID | D00000Z9 |
| Cancer variant drug effect | Cancer variant ID | Internal ID | CV00XBQW |
| | Drug ID | Internal ID | D00000Z9 |
| | Effect | “Resistant”, “responsive”, “non-responsive”, “sensitive”, “reduced sensitivity”, “other” | Resistance or non-response |
| | Level of evidence | See Table 6 | C |
| | Sublevel of evidence | See Table 6 | 3A |
| Drug | Drug ID | Internal ID | D00000Z9 |
| | Substance name | FDA approved or DrugBank substance names | Panitumumab |
| | DrugBank ID | DrugBank IDs | DB01269 |
| | PharmGKB ID | PharmGKB IDs | PA162373091 |
| | FDA ID | FDA IDs | 125147 |
| Drug mechanism | Drug ID | Internal ID | D00000Z9 |
| | Molecular mechanism | Description | Binds to the epidermal growth factor receptor (EGFR) on both normal and tumor cells[...] |
  1. Example data for evidence recording is given in Additional file 3
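
To illustrate how a few classes of Table 1 might map onto typed records, the following minimal Python sketch models the Gene, Variant, and Cancer variant classes and enforces some of the listed value ranges. It is an illustration under stated assumptions, not the implementation used in the article: the Python class names, field names, and validation logic are hypothetical, and only the attribute names, value ranges, and example values are taken from the table.

```python
from dataclasses import dataclass
from enum import Enum


class VariantType(Enum):
    """Value range of 'Variant type' in Table 1."""
    SNV = "single nucleotide variant"
    MNV = "multinucleotide variant"
    INS = "insertion"
    DEL = "deletion"


class BiomarkerClass(Enum):
    """Value range of 'Biomarker class' in Table 1."""
    DIAGNOSTIC = "diagnostic"
    PROGNOSTIC = "prognostic"
    PREDICTIVE = "predictive"
    PREDISPOSING = "predisposing"
    PHARMACOGENOMIC = "pharmacogenomic"


# Value ranges copied from the table: chromosomes 1..22, X, Y and Tier 1-3.
VALID_CHROMOSOMES = {str(n) for n in range(1, 23)} | {"X", "Y"}
VALID_TIERS = {"Tier 1", "Tier 2", "Tier 3"}


@dataclass(frozen=True)
class Gene:
    gene_id: str      # internal ID, e.g. "G0002V5Z"
    gene_name: str    # HGNC gene symbol, e.g. "KRAS"
    chromosome: str   # one of VALID_CHROMOSOMES

    def __post_init__(self):
        # Hypothetical check: reject chromosomes outside the table's range.
        if self.chromosome not in VALID_CHROMOSOMES:
            raise ValueError(f"chromosome out of range: {self.chromosome!r}")


@dataclass(frozen=True)
class Variant:
    variant_id: str             # internal ID, e.g. "V0000LBB"
    variant_type: VariantType


@dataclass(frozen=True)
class CancerVariant:
    cancer_variant_id: str      # internal ID, e.g. "CV00XBQW"
    variant_id: str             # references Variant.variant_id
    cancer_type_id: str         # references the Cancer type class
    biomarker_class: BiomarkerClass
    clinical_relevance_level: str

    def __post_init__(self):
        # Hypothetical check: only the three tiers listed in the table.
        if self.clinical_relevance_level not in VALID_TIERS:
            raise ValueError(
                f"unknown clinical relevance level: {self.clinical_relevance_level!r}"
            )


# Records built from the example column of Table 1.
kras = Gene("G0002V5Z", "KRAS", "12")
variant = Variant("V0000LBB", VariantType.SNV)
cancer_variant = CancerVariant(
    "CV00XBQW", variant.variant_id, "C000WQFL",
    BiomarkerClass.PREDICTIVE, "Tier 2",
)
```

Linking records through the internal IDs, as the table does, keeps each class small; a relational schema with foreign keys would be an equally reasonable reading of the same model.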