Table 2 Optimum hyper-parameter settings for experiments

From: Maintaining proper health records improves machine learning predictions for novel 2019-nCoV

| Setting | AdaBoost | Bagging | Extra-Trees | Decision Tree | k-NN |
|---|---|---|---|---|---|
| Base estimator | None | None | NA | NA | NA |
| # Estimators | 100 | 10 | 100 | NA | NA |
| Learning rate | 2 | NA | NA | NA | NA |
| Algorithm | SAMME.R | Bagging | Gini | Gini | KDTree |
| Metric | Mean label accuracy | Mean label accuracy | Gini impurity | Gini impurity | Euclidean distance |
| Random state | None | Random generation | None | Random generation | NA |
| Max. samples to train base estimator | NA | 1 | NA | NA | NA |
| Out-of-bag samples to estimate generalization error | NA | None | None | NA | NA |
| Use whole ensemble to fit | NA | Yes | Yes | NA | NA |
| # Jobs to run in parallel | NA | 1 | 1 | NA | 1 |
| Random resampling | NA | 3141 | 12 | NA | NA |
| Min. samples to be a leaf | NA | NA | 2 | 2 | NA |
| Sample weighting | NA | NA | All equal, weight of 1 | All equal, weight of 1 | NA |
| # Features for best split | NA | NA | Square root of the # of features | Max. features = # of features | NA |
| Max. number of leaf nodes | NA | NA | Unlimited | NA | NA |
| Split criteria | NA | NA | Impurity level > 0 | NA | NA |
| Reuse previous call to fit and add more estimators to ensemble | NA | No | Yes | NA | NA |
| Number of neighbours | NA | NA | NA | NA | 1 |
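The article does not name its implementation, but the table's terminology (SAMME.R, KDTree, Gini impurity) matches scikit-learn's estimator APIs. As a hedged, illustrative sketch only — not the authors' actual code — the settings above could be expressed as scikit-learn constructors; parameters the table leaves as NA are left at library defaults, and the seed values (3141, 12) are assumed here to be `random_state` values for the resampling rows:

```python
# Illustrative mapping of Table 2's hyper-parameters onto scikit-learn
# estimators (an assumption: the paper does not state which library it used).
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              ExtraTreesClassifier)
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

models = {
    # Table lists algorithm SAMME.R, which was the default in older
    # scikit-learn releases (the `algorithm` option was later removed),
    # so it is not passed explicitly here.
    "AdaBoost": AdaBoostClassifier(n_estimators=100, learning_rate=2.0),
    # max_samples=1.0 reads the table's "1" as the default full-sample
    # fraction; 3141 is assumed to be the resampling seed.
    "Bagging": BaggingClassifier(n_estimators=10, max_samples=1.0,
                                 bootstrap=True, oob_score=False,
                                 n_jobs=1, random_state=3141),
    "Extra-Trees": ExtraTreesClassifier(n_estimators=100, criterion="gini",
                                        min_samples_leaf=2,
                                        max_features="sqrt",  # sqrt of # features
                                        max_leaf_nodes=None,  # unlimited
                                        warm_start=True,      # reuse previous fit
                                        n_jobs=1, random_state=12),
    "Decision Tree": DecisionTreeClassifier(criterion="gini",
                                            min_samples_leaf=2,
                                            max_features=None),  # all features
    "k-NN": KNeighborsClassifier(n_neighbors=1, algorithm="kd_tree",
                                 metric="euclidean", n_jobs=1),
}
```

Each estimator is only instantiated here; fitting would proceed with the usual `fit`/`predict` calls on the study's health-record features.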