Table 4 Sample dataset. The bolded column (Cat. 50) represents the category to predict.

From: Predicting disease risks from highly imbalanced data using random forest

           Cat. 1  Cat. 2  Cat. 3  ....  Cat. 50  ....  Cat. 257  Cat. 258  Cat. 259  Age  Race  Sex
Patient 1  0       0       0       ....  1        ....  0         1         1         69   3     0
.          .       .       .             .              .         .         .         .    .     .
Patient N  1       0       0       ....  0        ....  1         0         0         55   1     1
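
As a rough illustration of the layout in Table 4, the sketch below separates the category to predict (Cat. 50) from the remaining columns for the two rows shown. It is a minimal sketch, not the authors' code: the helper name `split_features_target` is hypothetical, the elided columns ("....") are left out rather than guessed, and the numeric codes for Race and Sex are copied as they appear, since the table does not explain them.

```python
# Columns visible in Table 4; the elided columns ("....") are omitted, not guessed.
COLUMNS = ["Cat. 1", "Cat. 2", "Cat. 3", "Cat. 50",
           "Cat. 257", "Cat. 258", "Cat. 259", "Age", "Race", "Sex"]
TARGET = "Cat. 50"  # the category to predict, per the table caption

# The two rows shown in the table, values copied as printed.
ROWS = {
    "Patient 1": [0, 0, 0, 1, 0, 1, 1, 69, 3, 0],
    "Patient N": [1, 0, 0, 0, 1, 0, 0, 55, 1, 1],
}


def split_features_target(row, columns=COLUMNS, target=TARGET):
    """Hypothetical helper: return (features, label) for one patient row.

    features is a dict of every column except the target;
    label is the value of the target column.
    """
    record = dict(zip(columns, row))
    label = record.pop(target)
    return record, label


if __name__ == "__main__":
    for patient, row in ROWS.items():
        features, label = split_features_target(row)
        print(patient, "label:", label, "features:", features)
```

Keeping the target out of the feature set matters here: if Cat. 50 stayed among the inputs, a classifier would trivially see the answer it is supposed to predict.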