Table 5 Results using 10-fold cross-validation for diabetes classification

From: A data-driven approach to predicting diabetes and cardiovascular disease with machine learning

| Lab | Year & Case | Model | AUC | Precision | Recall | F1 |
|---|---|---|---|---|---|---|
| No lab | 1999-2014, Diab. Case I | Logistic Reg. | 0.827 | 0.75 | 0.75 | 0.75 |
| | | SVM | 0.849 | 0.77 | 0.77 | 0.77 |
| | | Random Forest | 0.855 | 0.78 | 0.78 | 0.78 |
| | | XGBoost | 0.862 | 0.78 | 0.78 | 0.78 |
| | | Ensemble | 0.859 | 0.78 | 0.78 | 0.78 |
| | 1999-2014, Diab. Case II | Logistic Reg. | 0.732 | 0.67 | 0.67 | 0.67 |
| | | SVM | 0.734 | 0.68 | 0.68 | 0.68 |
| | | Random Forest | 0.731 | 0.67 | 0.67 | 0.67 |
| | | XGBoost | 0.734 | 0.67 | 0.67 | 0.67 |
| | | Ensemble | 0.737 | 0.68 | 0.68 | 0.68 |
| | 2003-2014, Diab. Case I | Logistic Reg. | 0.800 | 0.72 | 0.72 | 0.72 |
| | | SVM | 0.822 | 0.75 | 0.75 | 0.75 |
| | | Random Forest | 0.841 | 0.77 | 0.76 | 0.76 |
| | | XGBoost | 0.837 | 0.75 | 0.75 | 0.75 |
| | | Ensemble | 0.834 | 0.75 | 0.75 | 0.75 |
| | 2003-2014, Diab. Case II | Logistic Reg. | 0.718 | 0.66 | 0.66 | 0.66 |
| | | SVM | 0.716 | 0.66 | 0.66 | 0.66 |
| | | Random Forest | 0.719 | 0.67 | 0.67 | 0.66 |
| | | XGBoost | 0.725 | 0.67 | 0.67 | 0.67 |
| | | Ensemble | 0.725 | 0.66 | 0.66 | 0.66 |
| With lab | 1999-2014, Diab. Case I | Logistic Reg. | 0.866 | 0.79 | 0.79 | 0.79 |
| | | SVM | 0.887 | 0.81 | 0.81 | 0.81 |
| | | Random Forest | 0.937 | 0.86 | 0.86 | 0.86 |
| | | XGBoost | 0.957 | 0.89 | 0.89 | 0.89 |
| | | Ensemble | 0.944 | 0.87 | 0.87 | 0.87 |
| | 1999-2014, Diab. Case II | Logistic Reg. | 0.724 | 0.67 | 0.67 | 0.67 |
| | | SVM | 0.737 | 0.68 | 0.68 | 0.68 |
| | | Random Forest | 0.738 | 0.68 | 0.68 | 0.68 |
| | | XGBoost | 0.802 | 0.74 | 0.74 | 0.74 |
| | | Ensemble | 0.783 | 0.71 | 0.71 | 0.71 |
| | 2003-2014, Diab. Case I | Logistic Reg. | 0.877 | 0.80 | 0.80 | 0.80 |
| | | SVM | 0.882 | 0.81 | 0.80 | 0.80 |
| | | Random Forest | 0.939 | 0.86 | 0.86 | 0.86 |
| | | XGBoost | 0.962 | 0.89 | 0.89 | 0.89 |
| | | Ensemble | 0.948 | 0.88 | 0.88 | 0.88 |
| | 2003-2014, Diab. Case II | Logistic Reg. | 0.738 | 0.68 | 0.68 | 0.68 |
| | | SVM | 0.737 | 0.68 | 0.68 | 0.68 |
| | | Random Forest | 0.740 | 0.68 | 0.68 | 0.67 |
| | | XGBoost | 0.834 | 0.75 | 0.75 | 0.75 |
| | | Ensemble | 0.798 | 0.72 | 0.72 | 0.72 |

AUC: Area Under the Curve. \(Precision = \frac{TP}{TP + FP}\) and \(Recall = \frac{TP}{TP + FN}\), where TP, FP, and FN denote true positives, false positives, and false negatives. \(F1 = 2\,\frac{Precision \cdot Recall}{Precision + Recall}\). Boldface signifies the best-performing model result.
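The metric definitions in the table note can be illustrated with a minimal Python sketch. The function name `precision_recall_f1` and the confusion counts in the usage example are hypothetical, chosen only for illustration and not taken from the paper. Returning 0.0 for an undefined ratio is likewise an assumption of this sketch, not something the source specifies.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from confusion counts.

    tp, fp, fn are the numbers of true positives, false positives,
    and false negatives. Undefined ratios (zero denominator) are
    reported as 0.0, which is an assumption made for this sketch.
    """
    # Precision = TP / (TP + FP)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall = TP / (TP + FN)
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    # F1 = 2 * precision * recall / (precision + recall)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    p, r, f = precision_recall_f1(tp=78, fp=22, fn=22)
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

When precision and recall are equal, F1 takes that same value, because F1 is their harmonic mean.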