
Table 9 Comparing the tree structures between trees from the descriptive forest and a single C4.5 tree

From: Descriptive forest: experiments on a novel tree-structure-generalization method for describing cardiovascular diseases

 

All trees are built from the CVDs dataset.

| The details of comparisons | the single C4.5 tree | the {Oldpeak > 85}-tree | the {ChestPainType = ASY}-tree | the {ExerciseAngina = Y}-tree | the {ST_Slope = Flat}-tree | the {Sex = M}-tree | the {Sex = M, ChestPainType = ASY}-tree | the {Sex = M, ExerciseAngina = Y}-tree | the {Sex = M, ST_Slope = Flat}-tree | the descriptive forest | Annotation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1. Tree size | 18 | 11 | 6 | 5 | 9 | 7 | 6 | 10 | 16 | 5–16 | excludes the boundary nodes |
| 2. Number of leaf nodes | 10 | 6 | 4 | 3 | 5 | 4 | 4 | 6 | 10 | 3–10 | |
| 3. Tree depth | 4 | 3 | 2 | 2 | 4 | 3 | 2 | 4 | 3 | 2–4 | excludes the boundary nodes |
| 4. Number of related instances | 918 | 423 | 496 | 371 | 460 | 174¹ | 426 | 328 | 385 | 843 | |
| 5. Occurrence of the Age feature | - | - | - | - | - | - | - | - | 1 | 1 | |
| 6. Occurrence of the Sex feature | 1 | 1 | - | - | 1 | b | b | b | b | b,2 | b = boundary node |
| 7. Occurrence of the ChestPainType feature | - | - | b | - | - | b | b | - | 1 | b,1 | b = boundary node |
| 8. Occurrence of the RestingBP feature | 1 | - | - | - | 1 | - | - | - | - | 1 | |
| 9. Occurrence of the Cholesterol feature | - | - | - | - | - | - | - | - | 2 | 2 | |
| 10. Occurrence of the FastingBS feature | 1 | - | - | - | 1 | 1 | - | 1 | - | 3 | |
| 11. Occurrence of the RestingECG feature | - | - | - | - | - | - | - | - | 1 | 1 | |
| 12. Occurrence of the MaxHR feature | - | 1 | - | 1 | - | - | - | 2 | 1 | 5 | |
| 13. Occurrence of the ExerciseAngina feature | 2 | 2 | - | b | 1 | b | - | b | - | b,3 | b = boundary node |
| 14. Occurrence of the Oldpeak feature | 2 | b,1 | 1 | 1 | - | 1 | 1 | - | - | b,5 | b = boundary node |
| 15. Occurrence of the ST_Slope feature | r | - | 1 | - | b | b,1 | 1 | 1 | b | b,4 | r = root node, b = boundary node |

Note: ¹ Bold text denotes a special case in calculating the number of related instances. The related instances of the {Sex = M}-tree are reduced from 725 to 174 because it works with dependent trees. The related instances of the descriptive forest number 843; the remaining 75 instances in the CVD dataset can be defined as “has no risk” based on the main topics discovered from the CVD dataset.
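As a quick consistency check on the figures above, the following minimal Python sketch re-adds the table's numbers. It rests on an assumption that the source does not state: that each occurrence count in the descriptive-forest column is the sum of the counts in the eight forest trees, excluding the single C4.5 tree and ignoring boundary-node (b) markers. The variable and function names are illustrative only.

```python
# Values transcribed from Table 9. The aggregation rule checked here
# (forest count = sum over the eight forest trees) is an assumption,
# not something the source states.

TOTAL_INSTANCES = 918   # instances in the CVD dataset (single C4.5 tree)
FOREST_RELATED = 843    # related instances of the descriptive forest
NO_RISK = 75            # remaining instances, per the table note

# Column order: {Oldpeak > 85}, {ChestPainType = ASY}, {ExerciseAngina = Y},
# {ST_Slope = Flat}, {Sex = M}, {Sex = M, ChestPainType = ASY},
# {Sex = M, ExerciseAngina = Y}, {Sex = M, ST_Slope = Flat}.
# "-" and a bare "b" are recorded as 0; "b,1" is recorded as 1.
OCCURRENCES = {
    "Age":            ([0, 0, 0, 0, 0, 0, 0, 1], 1),
    "Sex":            ([1, 0, 0, 1, 0, 0, 0, 0], 2),
    "ChestPainType":  ([0, 0, 0, 0, 0, 0, 0, 1], 1),
    "RestingBP":      ([0, 0, 0, 1, 0, 0, 0, 0], 1),
    "Cholesterol":    ([0, 0, 0, 0, 0, 0, 0, 2], 2),
    "FastingBS":      ([0, 0, 0, 1, 1, 0, 1, 0], 3),
    "RestingECG":     ([0, 0, 0, 0, 0, 0, 0, 1], 1),
    "MaxHR":          ([1, 0, 1, 0, 0, 0, 2, 1], 5),
    "ExerciseAngina": ([2, 0, 0, 1, 0, 0, 0, 0], 3),
    "Oldpeak":        ([1, 1, 1, 0, 1, 1, 0, 0], 5),
    "ST_Slope":       ([0, 1, 0, 0, 1, 1, 1, 0], 4),
}

# Tree sizes of the eight forest trees (row 1); the forest column reports a range.
TREE_SIZES = [11, 6, 5, 9, 7, 6, 10, 16]


def mismatched_features():
    """Return features whose per-tree counts do not sum to the forest count."""
    return [name for name, (per_tree, forest) in OCCURRENCES.items()
            if sum(per_tree) != forest]


def instances_add_up():
    """Check the note's arithmetic: forest instances plus 'no risk' instances."""
    return FOREST_RELATED + NO_RISK == TOTAL_INSTANCES


if __name__ == "__main__":
    print("Mismatched features:", mismatched_features())
    print("843 + 75 == 918:", instances_add_up())
    print("Tree size range:", min(TREE_SIZES), "to", max(TREE_SIZES))
```

Under this assumption, an empty list of mismatched features indicates that the forest column is internally consistent with the per-tree columns as transcribed.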