Table 4 The structure of the datasets used for diabetes and cardiovascular classification

From: A data-driven approach to predicting diabetes and cardiovascular disease with machine learning

Year       Case     Observations  Variables  No. of 0s  No. of 1s
1999-2014  Case I   21,131        123        15,599     5,532
1999-2014  Case II  16,426        123        9,944      6,482
2003-2014  Case I   16,443        168        11,977     4,466
2003-2014  Case II  12,636        168        7,503      5,133
2007-2014  Cardio   8,459         131        7,012      1,447
Note: The Case I and Case II datasets are for diabetes classification; the Cardio dataset is for CVD classification. 1 = positive records for the disease; 0 = negative records for the disease.
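
The class counts in Table 4 can be checked for internal consistency, and the share of positive records per dataset derived from them. The following is a minimal sketch, not part of the original study: the name `TABLE_4` and the helper `positive_rate` are hypothetical, and the rows are simply transcribed from the table above.

```python
# Rows transcribed from Table 4:
# (years, case, observations, variables, count of 0s, count of 1s)
TABLE_4 = [
    ("1999-2014", "Case I", 21131, 123, 15599, 5532),
    ("1999-2014", "Case II", 16426, 123, 9944, 6482),
    ("2003-2014", "Case I", 16443, 168, 11977, 4466),
    ("2003-2014", "Case II", 12636, 168, 7503, 5133),
    ("2007-2014", "Cardio", 8459, 131, 7012, 1447),
]


def positive_rate(zeros: int, ones: int) -> float:
    """Fraction of records labelled 1 (positive for the disease)."""
    return ones / (zeros + ones)


for years, case, n_obs, n_vars, zeros, ones in TABLE_4:
    # The two class counts should add up to the number of observations.
    assert zeros + ones == n_obs, f"count mismatch in {years} {case}"
    print(f"{years} {case}: {positive_rate(zeros, ones):.1%} positive")
```

Under these assumptions, every row's class counts sum to its observation count, and the Cardio dataset has the smallest share of positive records of the five.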