Method | **MCC** | F_{1} score | Accuracy | TP rate | TN rate | PR AUC | ROC AUC |
---|---|---|---|---|---|---|---|
Random forests | **+0.418*** | 0.754* | 0.585* | 0.541 | 0.855* | 0.541 | 0.698 |
Gradient boosting | **+0.414** | 0.750 | 0.585* | 0.550* | 0.845 | 0.673* | 0.792* |
SVM radial | **+0.348** | 0.720 | 0.543 | 0.519 | 0.816 | 0.494 | 0.667 |

- MCC: Matthews correlation coefficient. TP rate: true positive rate (sensitivity, recall). TN rate: true negative rate (specificity). Confusion matrix threshold for MCC, F_{1} score, accuracy, TP rate, TN rate: *τ* = 0.5. PR: precision-recall curve. ROC: receiver operating characteristic curve. AUC: area under the curve. MCC: worst value = –1 and best value = +1. F_{1} score, accuracy, TP rate, TN rate, PR AUC, ROC AUC: worst value = 0 and best value = 1. MCC, F_{1} score, accuracy, TP rate, TN rate, PR AUC, ROC AUC formulas: Additional file 1 (“Binary statistical rates” section). Gradient boosting: eXtreme Gradient Boosting (XGBoost). SVM radial: Support Vector Machine with radial Gaussian kernel. The top result for each score is marked with an asterisk (*).
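As an illustration of how the threshold-dependent scores in the table relate to a confusion matrix, the following minimal sketch applies the standard textbook definitions of MCC, F_{1} score, accuracy, TP rate, and TN rate at *τ* = 0.5. It is not the authors' code: the function names `confusion_counts` and `binary_rates`, the toy labels and scores, the choice to treat a score equal to *τ* as positive, and the convention of returning 0 when a denominator is zero are all assumptions made for this example. PR AUC and ROC AUC are omitted because they do not depend on a single threshold.

```python
import math


def confusion_counts(y_true, y_score, tau=0.5):
    """Threshold scores at tau and count (TP, TN, FP, FN).

    Assumption: a score >= tau is predicted positive (label 1).
    """
    tp = tn = fp = fn = 0
    for truth, score in zip(y_true, y_score):
        pred = 1 if score >= tau else 0
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def _safe_div(num, den):
    # Assumed convention for this sketch: undefined ratios are reported as 0.
    return num / den if den else 0.0


def binary_rates(tp, tn, fp, fn):
    """Compute the threshold-dependent scores from confusion matrix counts."""
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "mcc": _safe_div(tp * tn - fp * fn, mcc_den),        # range [-1, +1]
        "f1": _safe_div(2 * tp, 2 * tp + fp + fn),           # range [0, 1]
        "accuracy": _safe_div(tp + tn, tp + tn + fp + fn),   # range [0, 1]
        "tp_rate": _safe_div(tp, tp + fn),                   # sensitivity, recall
        "tn_rate": _safe_div(tn, tn + fp),                   # specificity
    }


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
    counts = confusion_counts(y_true, y_score, tau=0.5)
    for name, value in binary_rates(*counts).items():
        print(f"{name}: {value:+.3f}")
```

On the toy data the counts are TP = 3, TN = 3, FP = 1, FN = 1, which gives MCC = +0.5 and 0.75 for each of the other four scores.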