Method | **MCC** | F_{1} score | Accuracy | TP rate | TN rate | PR AUC | ROC AUC
---|---|---|---|---|---|---|---
Random forests | **+0.384**^{∗} | 0.547 | 0.740^{∗} | 0.491 | 0.864 | 0.657 | 0.800^{∗}
Decision tree | **+0.376** | 0.554^{∗} | 0.737 | 0.532^{∗} | 0.831 | 0.506 | 0.681
Gradient boosting | **+0.367** | 0.527 | 0.738 | 0.477 | 0.860 | 0.594 | 0.754
Linear regression | **+0.332** | 0.475 | 0.730 | 0.394 | 0.892 | 0.495 | 0.643
One rule | **+0.319** | 0.465 | 0.729 | 0.383 | 0.892 | 0.482 | 0.637
Artificial neural network | **+0.262** | 0.483 | 0.680 | 0.428 | 0.815 | 0.750^{∗} | 0.559
Naïve Bayes | **+0.224** | 0.364 | 0.696 | 0.279 | 0.898 | 0.437 | 0.589
SVM radial | **+0.159** | 0.182 | 0.690 | 0.122 | 0.967 | 0.587 | 0.749
SVM linear | **+0.107** | 0.115 | 0.684 | 0.072 | 0.981^{∗} | 0.594 | 0.754
*k*-nearest neighbors | **−0.025** | 0.148 | 0.624 | 0.121 | 0.866 | 0.323 | 0.493

- MCC: Matthews correlation coefficient. TP rate: true positive rate (sensitivity, recall). TN rate: true negative rate (specificity). PR: precision–recall curve. ROC: receiver operating characteristic curve. AUC: area under the curve.
- Confusion matrix threshold for MCC, F_{1} score, accuracy, TP rate, and TN rate: *τ* = 0.5. MCC: worst value = −1, best value = +1. F_{1} score, accuracy, TP rate, TN rate, PR AUC, ROC AUC: worst value = 0, best value = 1. Formulas for all scores: Additional file 1 ("Binary statistical rates" section).
- Gradient boosting: eXtreme Gradient Boosting (XGBoost). SVM radial: Support Vector Machine with radial Gaussian kernel. SVM linear: Support Vector Machine with linear kernel.
- Our hyper-parameter grid search optimization most often selected *k* = 3 for *k*-nearest neighbors (10 runs out of 100), *C* = 10 for the SVM with radial Gaussian kernel (56 runs out of 100), *C* = 0.1 for the SVM with linear kernel (50 runs out of 100), and 1 hidden layer with 100 hidden units for the artificial neural network (74 runs out of 100).
- We mark with ^{∗} the top performer result for each score.
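As an illustrative sketch of how the scores in this table can be computed, the snippet below uses scikit-learn on toy data: the arrays `y_true` and `y_prob` are invented placeholders, not the study's dataset. The threshold-based scores (MCC, F_{1}, accuracy, TP rate, TN rate) binarize predicted probabilities at *τ* = 0.5 as in the table note, while the two AUC scores are computed from the probabilities directly.

```python
# Sketch: computing the table's scores from predicted probabilities.
# y_true / y_prob are toy placeholder values, not the study's data.
import numpy as np
from sklearn.metrics import (
    matthews_corrcoef, f1_score, accuracy_score,
    recall_score, average_precision_score, roc_auc_score,
)

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 1])
y_prob = np.array([0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3, 0.2, 0.55])

tau = 0.5                                   # confusion-matrix threshold
y_pred = (y_prob >= tau).astype(int)        # binarized predictions

mcc = matthews_corrcoef(y_true, y_pred)     # in [-1, +1]
f1 = f1_score(y_true, y_pred)               # in [0, 1]
acc = accuracy_score(y_true, y_pred)
tp_rate = recall_score(y_true, y_pred)                # sensitivity, recall
tn_rate = recall_score(y_true, y_pred, pos_label=0)   # specificity

# AUC scores are threshold-free: they use the raw probabilities.
pr_auc = average_precision_score(y_true, y_prob)
roc_auc = roc_auc_score(y_true, y_prob)
```

Note that PR AUC and ROC AUC summarize performance over all thresholds, which is why a classifier can rank well on ROC AUC while its *τ* = 0.5 confusion-matrix scores remain modest (as the SVMs do in the table).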