Table 2 Optimum hyper-parameter settings for experiments

From: Maintaining proper health records improves machine learning predictions for novel 2019-nCoV

| Setting | AdaBoost | Bagging | Extra-Trees | Decision Tree | k-NN |
| --- | --- | --- | --- | --- | --- |
| Base estimator | None | None | NA | NA | NA |
| No. of estimators | 100 | 10 | 100 | NA | NA |
| Learning rate | 2 | NA | NA | NA | NA |
| Algorithm | SAMME.R | Bagging | Gini | Gini | KDTree |
| Metric | Mean label accuracy | Mean label accuracy | Gini impurity | Gini impurity | Euclidean distance |
| Random state | None | Random generation | None | Random generation | NA |
| Max. samples to train each base estimator | NA | 1 | NA | NA | NA |
| Out-of-bag samples to estimate generalization error | NA | None | None | NA | NA |
| Use whole ensemble to fit | NA | Yes | Yes | NA | NA |
| No. of jobs to run in parallel | NA | 1 | 1 | NA | 1 |
| Random resampling | NA | 3141 | 12 | NA | NA |
| Min. samples to be a leaf | NA | NA | 2 | 2 | NA |
| Sample weighting | NA | NA | All equal, weight of 1 | All equal, weight of 1 | NA |
| No. of features for best split | NA | NA | Square root of the no. of features | Max. features = no. of features | NA |
| Max. number of leaf nodes | NA | NA | Unlimited | NA | NA |
| Split criterion | NA | NA | Impurity level > 0 | NA | NA |
| Reuse previous call to fit and add more estimators to ensemble | NA | No | Yes | NA | NA |
| Number of neighbours | NA | NA | NA | NA | 1 |
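The settings above map closely onto scikit-learn estimator parameters. The sketch below is a hypothetical reconstruction of the five configurations under that assumption; the paper does not name its library, so parameter names, the toy dataset, and the defaults filled in where the table says NA are illustrative (e.g. SAMME.R was the default AdaBoost variant in older scikit-learn releases and is not passed explicitly here).

```python
# Hypothetical reconstruction of the Table 2 hyper-parameter settings,
# assuming scikit-learn; values not given in the table (NA) use defaults.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              ExtraTreesClassifier)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

models = {
    # AdaBoost: 100 estimators, learning rate 2; the table's SAMME.R
    # boosting variant was the default in older scikit-learn versions.
    "AdaBoost": AdaBoostClassifier(n_estimators=100, learning_rate=2.0),
    # Bagging: 10 estimators, full sample fraction per base estimator,
    # single job, fixed random state 3141 for the resampling.
    "Bagging": BaggingClassifier(n_estimators=10, max_samples=1.0,
                                 n_jobs=1, random_state=3141),
    # Extra-Trees: Gini impurity, min. 2 samples per leaf, sqrt(#features)
    # per split, warm start to reuse a previous fit, random state 12.
    "Extra-Trees": ExtraTreesClassifier(n_estimators=100, criterion="gini",
                                        min_samples_leaf=2,
                                        max_features="sqrt",
                                        warm_start=True, n_jobs=1,
                                        random_state=12),
    # Decision tree: Gini impurity, min. 2 samples per leaf, all features
    # considered at each split (max_features = no. of features).
    "Decision Tree": DecisionTreeClassifier(criterion="gini",
                                            min_samples_leaf=2,
                                            max_features=None),
    # k-NN: single neighbour, KD-tree index, Euclidean distance.
    "k-NN": KNeighborsClassifier(n_neighbors=1, algorithm="kd_tree",
                                 metric="euclidean", n_jobs=1),
}

# Illustrative toy data, standing in for the paper's 2019-nCoV records.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
for name, model in models.items():
    model.fit(X, y)
    print(f"{name}: train accuracy = {model.score(X, y):.2f}")
```

With `n_neighbors=1`, the k-NN model memorises the training set, so its training accuracy is 1.00 by construction; the ensembles trade some training accuracy for generalisation via resampling and leaf-size constraints.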