
Table 4 Survival prediction results on all clinical features – mean of 100 executions

From: Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone

| Method | MCC | F1 score | Accuracy | TP rate | TN rate | PR AUC | ROC AUC |
|---|---|---|---|---|---|---|---|
| Random forests | +0.384* | 0.547 | 0.740* | 0.491 | 0.864 | 0.657 | 0.800* |
| Decision tree | +0.376 | 0.554* | 0.737 | 0.532* | 0.831 | 0.506 | 0.681 |
| Gradient boosting | +0.367 | 0.527 | 0.738 | 0.477 | 0.860 | 0.594 | 0.754 |
| Linear regression | +0.332 | 0.475 | 0.730 | 0.394 | 0.892 | 0.495 | 0.643 |
| One rule | +0.319 | 0.465 | 0.729 | 0.383 | 0.892 | 0.482 | 0.637 |
| Artificial neural network | +0.262 | 0.483 | 0.680 | 0.428 | 0.815 | 0.750* | 0.559 |
| Naïve Bayes | +0.224 | 0.364 | 0.696 | 0.279 | 0.898 | 0.437 | 0.589 |
| SVM radial | +0.159 | 0.182 | 0.690 | 0.122 | 0.967 | 0.587 | 0.749 |
| SVM linear | +0.107 | 0.115 | 0.684 | 0.072 | 0.981* | 0.594 | 0.754 |
| k-nearest neighbors | −0.025 | 0.148 | 0.624 | 0.121 | 0.866 | 0.323 | 0.493 |
  1. MCC: Matthews correlation coefficient. TP rate: true positive rate (sensitivity, recall). TN rate: true negative rate (specificity). Confusion matrix threshold for MCC, F1 score, accuracy, TP rate, and TN rate: τ = 0.5. PR: precision-recall curve. ROC: receiver operating characteristic curve. AUC: area under the curve. MCC: worst value = −1, best value = +1. F1 score, accuracy, TP rate, TN rate, PR AUC, ROC AUC: worst value = 0, best value = 1. Formulas for MCC, F1 score, accuracy, TP rate, TN rate, PR AUC, and ROC AUC: Additional file 1 ("Binary statistical rates" section). Gradient boosting: eXtreme Gradient Boosting (XGBoost). SVM radial: Support Vector Machine with radial Gaussian kernel. SVM linear: Support Vector Machine with linear kernel. Our hyper-parameter grid search optimization for k-nearest neighbors selected k = 3 most often (10 runs out of 100). Our hyper-parameter grid search optimization for the Support Vector Machine with radial Gaussian kernel selected C = 10 most often (56 runs out of 100). Our hyper-parameter grid search optimization for the Support Vector Machine with linear kernel selected C = 0.1 most often (50 runs out of 100). Our hyper-parameter grid search optimization for the Artificial Neural Network selected 1 hidden layer with 100 hidden units most often (74 runs out of 100). The top performer result for each score is marked with an asterisk (*).
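The paper's own formulas for these rates are in Additional file 1; as a rough illustration of the standard definitions behind the table's first five columns, they can be sketched in Python from raw confusion-matrix counts (the function name `binary_rates` is ours, not from the paper, and assumes predictions have already been thresholded at τ = 0.5):

```python
import math

def binary_rates(tp, fp, tn, fn):
    """Standard confusion-matrix scores: MCC, F1 score, accuracy,
    TP rate (sensitivity/recall), and TN rate (specificity)."""
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    f1 = 2 * tp / (2 * tp + fp + fn)            # harmonic mean of precision and recall
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # fraction of correct predictions
    tp_rate = tp / (tp + fn)                    # sensitivity, recall
    tn_rate = tn / (tn + fp)                    # specificity
    return {"MCC": mcc, "F1": f1, "accuracy": accuracy,
            "TP rate": tp_rate, "TN rate": tn_rate}
```

Note that MCC ranges over [−1, +1] while the other four scores range over [0, 1], which is why the table reports MCC values with an explicit sign.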