Table 4 The structure of the datasets used for diabetes and cardiovascular classification

From: A data-driven approach to predicting diabetes and cardiovascular disease with machine learning

| Year | Case | Observations | Variables | No. of 0s | No. of 1s |
|---|---|---|---|---|---|
| 1999-2014 | Case I | 21,131 | 123 | 15,599 | 5,532 |
| 1999-2014 | Case II | 16,426 | 123 | 9,944 | 6,482 |
| 2003-2014 | Case I | 16,443 | 168 | 11,977 | 4,466 |
| 2003-2014 | Case II | 12,636 | 168 | 7,503 | 5,133 |
| 2007-2014 | Cardio | 8,459 | 131 | 7,012 | 1,447 |
  1. The Case I and Case II datasets are for diabetes classification; the Cardio dataset is for cardiovascular disease (CVD) classification. In the class columns, 1 denotes positive records for the disease and 0 denotes negative records.
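
The class counts in Table 4 can be checked and summarized with a short script. The following is a minimal sketch, not part of the original article: the names `TABLE_4` and `positive_rate` are illustrative assumptions, and the rows are transcribed by hand from the table. It confirms that the 0 and 1 counts sum to the number of observations in each dataset and prints the share of positive records, which gives a quick view of class imbalance.

```python
# Rows transcribed from Table 4 (hypothetical variable name):
# (years, case, observations, variables, n_negative, n_positive)
TABLE_4 = [
    ("1999-2014", "Case I", 21131, 123, 15599, 5532),
    ("1999-2014", "Case II", 16426, 123, 9944, 6482),
    ("2003-2014", "Case I", 16443, 168, 11977, 4466),
    ("2003-2014", "Case II", 12636, 168, 7503, 5133),
    ("2007-2014", "Cardio", 8459, 131, 7012, 1447),
]


def positive_rate(n_negative: int, n_positive: int) -> float:
    """Fraction of records labeled 1 (positive for the disease)."""
    return n_positive / (n_negative + n_positive)


for years, case, n_obs, n_vars, n_neg, n_pos in TABLE_4:
    # Sanity check: the two class counts should add up to the observations.
    assert n_neg + n_pos == n_obs, f"count mismatch for {years} {case}"
    rate = positive_rate(n_neg, n_pos)
    print(f"{years} {case}: {n_obs} records, {n_vars} variables, {rate:.1%} positive")
```

Under these figures the Cardio dataset has the smallest share of positive records, so any accuracy reported on it is worth reading alongside a metric that is less sensitive to class imbalance.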