Table 1 The dataset used in this study

From: Prognostic factor analysis for breast cancer using gene expression profiles

Dataset    Grade 1  Grade 2  Grade 3  Age <40  Age 40~60  Age >60  ER+   ER-  TNBC  Total  Platform
METABRIC   170      775      952      118      754        1109     1505  435  317   1981   Illumina HT 12v3
GSE25066   32       180      259      85       327        96       297   205  178   508    Affymetrix HG U133A
GSE2034    NA       NA       NA       NA       NA         NA       209   77   NA    286    Affymetrix HG U133A
GSE3494    67       128      54       16       90         145      213   34   NA    251    Affymetrix HG U133A
GSE2109    31       113      136      NA       NA         NA       NA    NA   47    351    Affymetrix HG U133A
  1. Gene expression profiles from 1981, 508, 286, and 251 samples were used from the METABRIC, GSE25066, GSE2034, and GSE3494 datasets, respectively. The METABRIC dataset was used for training, and the three GSE datasets (GSE25066, GSE2034, and GSE3494) were used for validation. The METABRIC, GSE25066, and GSE2109 datasets were used to find differentially expressed genes (DEGs) between TNBC and non-TNBC samples. The numbers in the table are sample counts for each breast cancer characteristic.
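
The dataset roles described in the footnote can be summarized in a short sketch. The counts below are copied from Table 1. The dictionary layout, the names `DATASETS`, `DEG_DATASETS`, and `total_samples`, and the role label given to GSE2109 are illustrative assumptions, not part of the original study.

```python
# Sample counts copied from Table 1. The structure and names are hypothetical,
# chosen only to illustrate how the datasets are partitioned.
DATASETS = {
    "METABRIC": {"total": 1981, "tnbc": 317, "role": "training"},
    "GSE25066": {"total": 508, "tnbc": 178, "role": "validation"},
    "GSE2034": {"total": 286, "tnbc": None, "role": "validation"},  # TNBC is NA
    "GSE3494": {"total": 251, "tnbc": None, "role": "validation"},  # TNBC is NA
    # Assumption: GSE2109 is listed only for the DEG analysis in the footnote.
    "GSE2109": {"total": 351, "tnbc": 47, "role": "deg_only"},
}

# Datasets the footnote names for the TNBC vs. non-TNBC DEG analysis.
DEG_DATASETS = ("METABRIC", "GSE25066", "GSE2109")


def total_samples(role):
    """Sum the 'total' sample counts over all datasets with the given role."""
    return sum(d["total"] for d in DATASETS.values() if d["role"] == role)


training_total = total_samples("training")
validation_total = total_samples("validation")
# Number of TNBC samples across the three DEG datasets.
deg_tnbc_total = sum(DATASETS[name]["tnbc"] for name in DEG_DATASETS)

print(training_total, validation_total, deg_tnbc_total)
```

Under these assumptions, the sketch gives one training cohort and a pooled validation count across the three GSE validation datasets. It does not derive non-TNBC counts, because the table does not state how many samples have unknown TNBC status.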