Table 4 Sample dataset. The bolded column (Cat. 50) represents the category to predict.

From: Predicting disease risks from highly imbalanced data using random forest

 

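|  | Cat. 1 | Cat. 2 | Cat. 3 | .... | **Cat. 50** | .... | Cat. 257 | Cat. 258 | Cat. 259 | Age | Race | Sex |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Patient 1 | 0 | 0 | 0 | .... | **1** | .... | 0 | 1 | 1 | 69 | 3 | 0 |
| . | . | . | . | . | . | . | . | . | . | . | . | . |
| . | . | . | . | . | . | . | . | . | . | . | . | . |
| Patient N | 1 | 0 | 0 | .... | **0** | .... | 1 | 0 | 0 | 55 | 1 | 1 |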
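
The layout of the table above (one row per patient, binary category indicators plus Age, Race and Sex, with Cat. 50 as the label) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: it uses just the columns and the two rows the table shows in full, omits the elided columns marked "....", and the names `COLUMNS`, `TARGET`, `rows` and `split_features_target` are hypothetical.

```python
# Only the columns visible in Table 4; the elided ones ("....") are left out.
COLUMNS = ["Cat. 1", "Cat. 2", "Cat. 3", "Cat. 50",
           "Cat. 257", "Cat. 258", "Cat. 259", "Age", "Race", "Sex"]
TARGET = "Cat. 50"  # the category to predict, per the table caption

# The two rows that the table shows in full.
rows = {
    "Patient 1": [0, 0, 0, 1, 0, 1, 1, 69, 3, 0],
    "Patient N": [1, 0, 0, 0, 1, 0, 0, 55, 1, 1],
}


def split_features_target(values, columns=COLUMNS, target=TARGET):
    """Return (features, label) for one patient row.

    features is a dict of every column except the target;
    label is the value of the target column.
    """
    record = dict(zip(columns, values))
    label = record.pop(target)
    return record, label


for patient, values in rows.items():
    features, label = split_features_target(values)
    print(patient, "label:", label, "features:", features)
```

Separating the label from the remaining columns in this way gives the usual input for a classifier: the features describe the patient, and the Cat. 50 value is what the model is asked to predict.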