Diabetes mellitus risk prediction in the presence of class imbalance using flexible machine learning methods

Table 2 Optimal hyper-parameter values selected by grid search with stratified fivefold cross-validation

Model	Hyper-parameters
DNN	Number of layers = 4, number of nodes in each layer = (100,75,50,1), dropout rate in each layer = (0.5,0.5,0.25), activation function in each layer = (ReLU, ReLU, ReLU, sigmoid)
XGBoost	Learning rate = 0.3, maximum depth of each tree = 3, minimum loss reduction to split each node = 1, regularization term on weights = 20, subsample ratio of columns for each tree = 0.5
Random forest	Number of trees in the forest = 1500, maximum depth of each tree = 19, minimum number of samples to split each node = 8
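
To make the table easier to reuse, the following minimal sketch transcribes the values above into plain Python dictionaries. The key names are assumptions: they follow the naming conventions of scikit-learn and the XGBoost Python package, which the table does not confirm were the libraries used. In particular, mapping "regularization term on weights" to `reg_lambda` (L2) rather than `reg_alpha` (L1) is a guess, and the helper `dnn_layer_specs` is hypothetical. Since the table lists three dropout rates for four layers, the sketch assumes the output layer has no dropout.

```python
# Table 2 values transcribed as plain data (standard library only).
# All key names below are assumptions, not taken from the source table.

RANDOM_FOREST_PARAMS = {
    "n_estimators": 1500,      # number of trees in the forest
    "max_depth": 19,           # maximum depth of each tree
    "min_samples_split": 8,    # minimum samples needed to split a node
}

XGBOOST_PARAMS = {
    "learning_rate": 0.3,
    "max_depth": 3,            # maximum depth of each tree
    "gamma": 1,                # minimum loss reduction to split a node
    "reg_lambda": 20,          # assumption: L2 term; the table does not say L1 or L2
    "colsample_bytree": 0.5,   # subsample ratio of columns for each tree
}

DNN_UNITS = (100, 75, 50, 1)
DNN_ACTIVATIONS = ("relu", "relu", "relu", "sigmoid")
DNN_DROPOUT = (0.5, 0.5, 0.25)  # three rates for four layers


def dnn_layer_specs(units=DNN_UNITS, activations=DNN_ACTIVATIONS, dropout=DNN_DROPOUT):
    """Pair each layer with its activation and dropout rate.

    Assumption: layers without a listed dropout rate (here, the output
    layer) get None, meaning no dropout is applied.
    """
    if len(units) != len(activations):
        raise ValueError("units and activations must have the same length")
    if len(dropout) > len(units):
        raise ValueError("more dropout rates than layers")
    specs = []
    for i, (n_units, activation) in enumerate(zip(units, activations)):
        rate = dropout[i] if i < len(dropout) else None
        specs.append({"units": n_units, "activation": activation, "dropout": rate})
    return specs


if __name__ == "__main__":
    for spec in dnn_layer_specs():
        print(spec)
```

If scikit-learn and XGBoost are installed, the two dictionaries could be unpacked directly, for example `RandomForestClassifier(**RANDOM_FOREST_PARAMS)` and `XGBClassifier(**XGBOOST_PARAMS)`. The layer specifications could drive whichever deep learning framework is preferred.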

ISSN: 1472-6947