Diagnostic performance of classification trees and hematological functions in hematologic disorders: an application of multidimensional scaling and cluster analysis

Background Several hematological indices have been already proposed to discriminate between iron deficiency anemia (IDA) and β-thalassemia trait (βTT). This study compared the diagnostic performance of different hematological discrimination indices with decision trees and support vector machines, so as to discriminate IDA from βTT using multidimensional scaling and cluster analysis. In addition, decision trees were used to determine the diagnostic classification scheme of patients.
Methods Consisting of 1178 patients with hypochromic microcytic anemia (708 patients with βTT and 470 patients with IDA), this cross-sectional study compared the diagnostic performance of 43 hematological discrimination indices with classification tree algorithms and support vector machines in order to discriminate IDA from βTT. Moreover, multidimensional scaling and cluster analysis were used to identify the homogeneous subgroups of discrimination methods with similar performance.
Results All the classification tree algorithms except the LOTUS tree algorithm showed acceptable accuracy measures for discrimination between IDA and βTT in comparison with other hematological discrimination indices. The results indicated that the CRUISE and C5.0 tree algorithms had better diagnostic performance and efficiency among other discrimination methods. Moreover, the AUC of CRUISE and C5.0 tree algorithms indicated more precise classification with values of 0.940 and 0.999, indicating excellent diagnostic accuracy of such models. Moreover, the CRUISE and C5.0 tree algorithms showed that mean corpuscular volume can be considered as the main variable in discrimination between IDA and βTT.
Conclusions CRUISE and C5.0 tree algorithms as powerful methods in data mining techniques can be used to develop accurate differential methods along with other laboratory parameters for the discrimination of IDA and βTT. In addition, the multidimensional scaling method and cluster analysis can be considered as the most appropriate techniques to determine the discrimination indices with similar performance for future hematological studies.
Supplementary Information The online version contains supplementary material available at 10.1186/s12911-021-01678-5.

Discriminating between these two hematologic disorders is necessary to prevent iron overload and its complications caused by misdiagnosis and inappropriate treatment, and to determine the prenatal causes of hemoglobin chain disorders. However, the differential diagnosis of IDA from βTT is a major challenge because the two disorders present similar clinical and laboratory findings [3,11,12].
In addition to the complete blood count (CBC), different tests have been used to differentiate precisely between IDA and βTT; however, they are time-consuming and expensive. The definitive diagnosis of βTT is based on an increase in HbA2 (hemoglobin A2), whereas that of IDA is based on an increase in TIBC (total iron binding capacity) and a decrease in serum iron and serum ferritin [4,11,13-16].
Due to the importance of discriminating between these types of anemia, various studies have been conducted since 1973 to identify appropriate, rapid, and low-cost indices for discriminating between IDA and βTT. A gap in this literature is that each hematological index includes only one or a few specific blood parameters. In addition, some indices, such as Nishad [33] and Matos and Carvalho [41], were derived from parametric statistical models such as discriminant analysis. Such models rely on assumptions (multivariate normality and equality of covariance matrices), and violation of these assumptions affects the results [42].
Recently, the availability of powerful statistical software has paved the way for applying advanced statistical models, such as data mining techniques, to the differential diagnosis of IDA from βTT. However, few studies have employed such advanced statistical methods and data mining techniques for the differential diagnosis of hematological data [40,43-52]. Therefore, this study was intended to compare tree algorithms, as powerful machine-learning methods, and support vector machines (SVM) with hematological indices in differentiating between IDA and βTT. Tree-based methods can identify homogeneous subgroups of patients who need different treatment strategies or diagnostic tests, which makes these methods useful for subgroup analysis [53-56].
Tree-based methods are nonparametric and need no assumptions about the functional form of the data. Besides, they can handle high-dimensional datasets, high-order interactions, and nonlinear relationships. These methods are invariant to monotone transformations of the predictor variables and are robust to outliers, missing values, and multicollinearity. These algorithms can identify the cutoff points of important predictors for discriminating between patients. In addition, tree algorithms are easy to interpret because they display results graphically, making the results understandable without statistical experience. These methods can also assist the clinician in decision making [57-62].
The CART (Classification and Regression Tree) algorithm is the best-known classic tree algorithm [63], though it suffers from problems such as greediness and bias in split rule selection. Tree growing in CART is based on a greedy search, which is not guaranteed to find a global optimum [64]. The splitting method in CART is biased toward independent variables with more distinct values [65,66]. Several tree algorithms have been proposed to solve these problems. The Evtree algorithm (Evolutionary learning of globally optimal classification and regression trees) [64] was proposed to solve the greediness problem. Tree algorithms such as Quick, Unbiased and Efficient Statistical Tree (QUEST) [67], Classification Rule with Unbiased Interaction Selection and Estimation (CRUISE) [68], Generalized, Unbiased, Interaction Detection and Estimation (GUIDE) [69], Conditional Inference Trees (Ctree) [70], and Logistic Tree with Unbiased Selection (LOTUS) [62] were, in turn, suggested to solve the bias in split rule selection.
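The greedy search described above can be illustrated with a minimal sketch of one split step: every candidate threshold on a single predictor is tried and the one with the lowest weighted Gini impurity is kept. This is an illustration only, not the rpart implementation used in the study; the function names and the MCV values are hypothetical.

```python
from typing import List, Tuple


def gini(labels: List[str]) -> float:
    """Gini impurity of a set of class labels (0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values: List[float], labels: List[str]) -> Tuple[float, float]:
    """Return (threshold, weighted Gini) of the best rule 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_thr, best_imp = float("nan"), float("inf")
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        imp = (len(left) * gini(left) + len(right) * gini(right)) / n
        if imp < best_imp:
            best_thr, best_imp = (lo + hi) / 2, imp
    return best_thr, best_imp


# Hypothetical MCV values (fL) and diagnoses, for illustration only.
mcv = [62.0, 64.5, 66.0, 70.0, 72.5, 75.0]
dx = ["bTT", "bTT", "bTT", "IDA", "IDA", "IDA"]
threshold, impurity = best_split(mcv, dx)
```

Because each split is chosen locally, the resulting tree need not be globally optimal, and a predictor with many distinct values offers more candidate thresholds, which is one way to see the selection bias that the unbiased algorithms address.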
This study aimed to compare the diagnostic performance of the CART algorithm, the tree algorithms proposed to remedy its disadvantages, and SVM with that of hematological discrimination indices in discriminating between IDA and βTT, using accuracy measures such as true positive rate (TPR or sensitivity), true negative rate (TNR or specificity), false positive rate (FPR), false negative rate (FNR), accuracy, Youden's index, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), F-measure, and area under the curve (AUC).
Besides, the multidimensional scaling and cluster analysis were applied to extract homogeneous subgroups of hematological discriminating indices and classification tree algorithms with a similar performance according to the accuracy measures used.

Sample and disease type
This study included 1178 patients with hypochromic microcytic anemia from Boghrat clinical center in Tehran, Iran. CBC analysis of EDTA-K2 anti-coagulated blood samples was performed using Sysmex kx-21

Inclusion criteria
Patients with hypochromic microcytic anemia (MCV < 80 fL, MCH < 27 pg) and Hb < 12 g/dL for women or Hb < 13 g/dL for men were included in the study. Among them, 708 patients were diagnosed with βTT (HbA2 > 3.5%), and 470 patients were diagnosed with IDA (serum ferritin < 15 ng/ml) according to the World Health Organization (WHO) [71,72].

Exclusion criteria
Patients with simultaneous presentation of both diseases, severe anemia (Hb < 8 g/dL), anemia due to chronic disease, infectious disease, chronic inflammation, pregnancy or other hemoglobinopathies were excluded.

Statistical analysis
Descriptive statistics and univariate analysis
Descriptive statistics (mean, standard deviation, median and interquartile range) were computed for the different blood parameters. Normality of the data was assessed using the Shapiro-Wilk test. The Mann-Whitney U test was used to compare the hematological parameters between the two groups (IDA and βTT). P < 0.05 was considered statistically significant.

Hematological discriminating indices for discriminating between IDA and βTT
Hematological indices for discrimination between IDA and βTT were computed for each patient according to their formulas and cutoffs. These indices and their formulas are shown in Additional file 1: Table S1.
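As an illustration of how an index with a formula and a cutoff turns CBC values into a predicted diagnosis, the sketch below uses the widely known Mentzer index (MCV divided by RBC count, with values below 13 suggesting βTT). Whether this index appears in Table S1 in exactly this form is an assumption; the function names and patient values are hypothetical.

```python
def mentzer_index(mcv_fl: float, rbc_million_per_ul: float) -> float:
    """Mentzer index: MCV (fL) divided by RBC count (10^6 cells per microlitre)."""
    return mcv_fl / rbc_million_per_ul


def classify_by_mentzer(mcv_fl: float, rbc_million_per_ul: float,
                        cutoff: float = 13.0) -> str:
    """Return 'bTT' when the index is below the cutoff, otherwise 'IDA'."""
    index = mentzer_index(mcv_fl, rbc_million_per_ul)
    return "bTT" if index < cutoff else "IDA"


# Hypothetical patients, for illustration only.
patient_a = classify_by_mentzer(mcv_fl=65.0, rbc_million_per_ul=5.8)
patient_b = classify_by_mentzer(mcv_fl=72.0, rbc_million_per_ul=4.2)
```

Each index reduces to a rule of this kind, which is why a single index can use only the one or few blood parameters that appear in its formula.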

Accuracy measures
The diagnostic performance of the discrimination indices was compared with that of the classification tree algorithms using accuracy measures such as sensitivity, specificity, FPR, FNR, PPV, NPV, Youden's index (sensitivity + specificity - 1), accuracy, PLR, NLR, DOR, F-measure and AUC. A discrimination method performs better as its sensitivity, specificity, PPV, NPV, Youden's index, accuracy, F-measure and AUC approach 1. Likewise, a discrimination method with PLR > 10, NLR < 0.1 and a high DOR discriminates well between IDA and βTT [76,77]. Receiver operating characteristic (ROC) curve analysis was used to compute the AUC and to compare the AUC values of the discrimination methods [78].
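The accuracy measures listed above all follow from the four cells of a 2 x 2 table. The sketch below uses the standard textbook definitions; it is not the epiR code used in the study, and the counts are hypothetical. Which disorder is treated as the "positive" class is not stated here and is left to the caller.

```python
import math


def _ratio(num: float, den: float) -> float:
    """Divide, returning inf (or nan for 0/0) instead of raising on a zero denominator."""
    if den == 0:
        return math.inf if num else math.nan
    return num / den


def accuracy_measures(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard diagnostic accuracy measures from true/false positives and negatives."""
    sens = _ratio(tp, tp + fn)   # true positive rate
    spec = _ratio(tn, tn + fp)   # true negative rate
    fpr = _ratio(fp, fp + tn)
    fnr = _ratio(fn, fn + tp)
    ppv = _ratio(tp, tp + fp)
    npv = _ratio(tn, tn + fn)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "fpr": fpr,
        "fnr": fnr,
        "ppv": ppv,
        "npv": npv,
        "youden": sens + spec - 1,
        "accuracy": _ratio(tp + tn, tp + fn + fp + tn),
        "plr": _ratio(sens, fpr),          # sensitivity / (1 - specificity)
        "nlr": _ratio(fnr, spec),          # (1 - sensitivity) / specificity
        "dor": _ratio(tp * tn, fp * fn),   # equals PLR / NLR
        "f_measure": _ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical counts, for illustration only.
measures = accuracy_measures(tp=90, fn=10, fp=20, tn=80)
```

With these hypothetical counts the method would not meet the PLR > 10 criterion, and its NLR would be above 0.1, even though its sensitivity is 0.9.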

Multidimensional scaling
Multidimensional scaling was used to create a map, based on the Euclidean distance, that shows the similarity or dissimilarity between observations. This map can have one, two, three or more dimensions. A smaller distance between two observations indicates greater similarity, and vice versa. This study used a two-dimensional map to show the similarity or dissimilarity among pairs of discrimination methods in terms of accuracy measures such as sensitivity, specificity, PPV, NPV, Youden's Index, accuracy, PLR, NLR, F-measure, and AUC [79].
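To make the idea of a distance-based map concrete, the following is a minimal sketch of classical (Torgerson) multidimensional scaling written with the standard library only. It is an assumption that this variant resembles what was run in the study, which may have used a different (for example, non-metric) procedure; the method names and accuracy values are hypothetical, and only two measures are used so that the two-dimensional map reproduces the distances exactly.

```python
import math


def classical_mds(dist, dims=2, iters=500):
    """Embed n items in `dims` dimensions from an n x n Euclidean distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centring of the squared distances gives the inner-product matrix B.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Power iteration for the leading eigenpair of the (deflated) matrix.
        v = [math.sin(i + 1.0 + k) for i in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-15:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))
        if lam <= 1e-12:
            break  # no further positive eigenvalue to use
        for i in range(n):
            coords[i][k] = math.sqrt(lam) * v[i]
        # Deflation removes the component just extracted.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords


# Hypothetical (sensitivity, specificity) pairs for four methods.
methods = {"A": (0.95, 0.90), "B": (0.90, 0.92), "C": (1.00, 0.30), "D": (0.40, 0.95)}
points = list(methods.values())
distances = [[math.dist(p, q) for q in points] for p in points]
embedding = classical_mds(distances)
```

Methods that land close together on the resulting map have similar accuracy profiles; with more than two accuracy measures the two-dimensional distances are only an approximation of the original ones.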

Cluster analysis
Cluster analysis is a method for extracting homogeneous subgroups of observations, and different algorithms have been proposed for it. This study used a complete-linkage hierarchical algorithm to determine homogeneous subgroups of methods with a similar diagnostic performance according to the accuracy measures. The optimal number of clusters was assessed using 30 indices, and the final number was selected by the majority rule [80].
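A minimal sketch of complete-linkage agglomerative clustering, followed by a majority vote over candidate cluster numbers, is given below. It illustrates the general technique rather than the NbClust procedure used in the study; the accuracy profiles and the votes are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def complete_linkage(points, k):
    """Merge clusters until k remain; cluster distance is the largest pairwise distance."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        return max(math.dist(points[i], points[j]) for i in c1 for j in c2)

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]))
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return [sorted(c) for c in clusters]


def majority_rule(votes):
    """Return the number of clusters proposed most often."""
    return Counter(votes).most_common(1)[0][0]


# Hypothetical (sensitivity, specificity) profiles of six discrimination methods.
profiles = [(0.95, 0.90), (0.93, 0.92), (1.00, 0.30),
            (0.98, 0.35), (0.40, 0.95), (0.45, 0.90)]
# Hypothetical numbers of clusters suggested by different validity indices.
votes = [3, 3, 2, 3, 4, 3]
groups = complete_linkage(profiles, k=majority_rule(votes))
```

Complete linkage tends to produce compact groups, because two clusters are merged only when even their most dissimilar members are close.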

Software programs and checklists
Data analysis was performed using R 4.0.0. The packages epiR and pROC were used to compute the accuracy measures and to perform the ROC curve analysis, respectively. Classification tree algorithms such as CART, J48, Ctree, Evtree, and C5.0 were fitted using the packages rpart, RWeka, party, evtree, and C50, respectively. Software for tree algorithms such as QUEST, CRUISE, GUIDE, and LOTUS was obtained from http://pages.stat.wisc.edu/~loh/research.html. The SVM algorithm and multidimensional scaling were fitted using the packages e1071 and MASS, respectively. The optimal number of clusters, that is, of homogeneous groups of discrimination methods with a similar diagnostic performance, was determined using the package NbClust. This study was conducted in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement (guidelines for reporting observational studies) and the Standards for Reporting Studies of Diagnostic Accuracy (STARD). These checklists can be obtained from www.equator-network.org.

Results
This study included 1178 patients with hypochromic microcytic anemia (708 patients with βTT and 470 patients with IDA) to compare the diagnostic performance of hematological discrimination indices with that of classification tree algorithms and SVM in discriminating IDA from βTT. Data balance was assessed using Shannon entropy [81,82]. Additional file 1: Table S2 shows the descriptive statistics of the hematological parameters by type of hypochromic microcytic anemia (IDA and βTT). According to this table, all variables differed significantly between the groups (P < 0.001). The CRUISE, C5.0, CART, and GUIDE algorithms can calculate the normalized importance (%) of each predictor variable, and they produced a similar ranking of the hematological parameters. In this study, the normalized importance of variables was reported based on the classification tree algorithms with the best diagnostic performance (CRUISE and C5.0). These algorithms showed that MCV and HCT had the highest and lowest importance, respectively, for discrimination between IDA and βTT (Additional file 1: Table S2). Figures 1 and 2 show that all predictor variables except HCT and RDW were used to split the nodes of the trees. The first split of all tree-based methods except Evtree, Ctree, and LOTUS was based on MCV, with a similar splitting rule. The GUIDE and CART algorithms produced the same tree structure.
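One common way to express class balance with Shannon entropy is to divide the entropy of the class proportions by its maximum, log(k) for k classes, so that 1 means perfectly balanced and 0 means a single class. The sketch below applies this to the group sizes reported above; the normalization and the function name are assumptions, since the exact formula used in the study is not given here.

```python
import math


def shannon_balance(counts):
    """Normalized Shannon entropy of class counts: 1 = balanced, 0 = one class only."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    k = len(counts)
    return entropy / math.log(k) if k > 1 else 0.0


# Group sizes reported in the study: 708 patients with bTT and 470 with IDA.
balance = shannon_balance([708, 470])
```

For these group sizes the value is roughly 0.97, which under this definition indicates only a mild imbalance between the two classes.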
Additional file 1: Table S3 displays the values of accuracy measures such as sensitivity, specificity, FPR, FNR, PPV and NPV for each discrimination method. None of the discrimination methods was fully specific for discrimination between IDA and βTT. The Janel index and the CRUISE tree algorithm had the lowest FPR and the highest TNR and PPV. The lowest TNR belonged to the Telmissani-MCHD index, while the lowest PPV belonged to the Bessman (RDW) index. The Shine and Lal index and the Roth index showed perfect TPR (100%) and NPV (100%) as compared to other discrimination methods. These indices also showed the lowest FNR and the highest FPR. The lowest TPR (the highest FNR) belonged to the Bessman (RDW) index, while the lowest NPV belonged to the Pornprasert (MCHC) index. All tree classification algorithms and SVM showed good performance in discriminating between IDA and βTT based on accuracy measures such as TPR, TNR, PPV and NPV in comparison with the other hematological discrimination methods.
The values of accuracy measures such as Youden's index, accuracy, PLR, NLR, and DOR for each discrimination method are shown in Table 1. According to this table, the highest Youden's index and accuracy belonged to the CRUISE and C5.0 tree algorithms, while the lowest Youden's index and accuracy belonged to the MCHC index. The highest DOR and F-measure also belonged to the CRUISE and C5.0 tree algorithms, whereas the Roth index and the Bessman (RDW) index had the lowest DOR and F-measure. Table 1 indicated that only the CRUISE tree algorithm had PLR > 10. The discrimination methods with NLR < 0.1 were all tree algorithms except the C5.0 tree algorithm, as well as indices such as Shine and Lal, Bordbar, Sehgal, and Kerman I.
The AUC of each discrimination method for discrimination between IDA and βTT is shown in Table 2. The ROC analysis showed that the CRUISE and C5.0 tree algorithms had the highest AUC. According to the AUC, the CRUISE and C5.0 tree algorithms indicated excellent diagnostic accuracy, whereas the MCHC index was not useful for discrimination between IDA and βTT. Table 2 indicated that the AUC of all indices except Ricerca, Telmissani-MCHD, Huber-Herklotz, Zaghloul1, Zaghloul2 and Kandhrol1 was significantly greater than 0.5, and the AUC of the RDW and MCHC indices was significantly less than 0.5 (P < 0.001).
Table 1 Youden's index, accuracy, positive likelihood ratio (PLR), negative likelihood ratio (NLR) and diagnostic odds ratio (DOR) of each hematological index and classification tree algorithm for differentiation between iron deficiency anemia (IDA) and β-thalassemia trait (βTT) with their 95% confidence interval
The AUC values of the classification tree algorithms were compared with that of the Ehsani index, the hematological index with the best diagnostic performance, and the difference was statistically significant (P < 0.05). In this regard, the classification tree algorithms had significantly higher AUC than this index. The CRUISE and C5.0 tree algorithms also had significantly higher AUC than the other classification tree algorithms, but there was no significant difference between the AUC values of the Ctree and CART algorithms (P > 0.05).
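The AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition; it is not the pROC routine used in the study, and the scores are hypothetical.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical binary predictions from an index with a single cutoff:
# 9 of 10 positives flagged (sensitivity 0.9), 2 of 10 negatives flagged (specificity 0.8).
positives = [1] * 9 + [0] * 1
negatives = [1] * 2 + [0] * 8
auc_binary = auc_from_scores(positives, negatives)
```

For a method that outputs only a binary decision, this value equals (sensitivity + specificity) / 2, and an AUC below 0.5 means that the scores order the two groups in the wrong direction.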
Overall, the results showed that the CRUISE and C5.0 tree algorithms had a better performance for discrimination between IDA and βTT in comparison to all indices and other classification tree methods. The CRUISE tree algorithm extracted six homogeneous subgroups of patients (Fig. 1). According to the tree structure of CRUISE tree algorithm, it can be concluded that patients with MCV > 67. 65
In addition, the multidimensional scaling method extracted three subgroups of methods. The diagram of this analysis is shown in Fig. 3. One group included hematological discrimination indices such as Pornprasert, RDW, Kandhrol1, Huber-Herklotz, Sirachainan, Hameed, Zaghloul1, and Zaghloul2, while the other group included Shine and Lal, Roth, Ricerca, and Telmissani-MCHD. The third group included the classification tree algorithms, SVM, and some of the hematological discrimination indices.
Like multidimensional scaling, cluster analysis extracted three homogeneous groups of discrimination methods. The diagram of this analysis is shown in Fig. 4.

Discussion
The two common types of microcytic anemia are IDA and βTT, which have similar clinical and experimental conditions [3,11,12]. Discriminating between these two disorders is clinically important but requires time-consuming and expensive tests such as HbA2, serum iron, serum ferritin and TIBC [4,11,13-16]. Several hematological indices have been proposed for rapid and low-cost discrimination between IDA and βTT, but they are not fully sensitive and specific for differential diagnosis.
This study used classification tree algorithms to discriminate between IDA and βTT. These are efficient and low-cost detection methods for extracting homogeneous subgroups of patients [53-56]. Thus, the diagnostic performance of hematological indices was compared with that of tree-based methods in differentiating IDA from βTT using various accuracy measures.
Additionally, multidimensional scaling was used to extract homogeneous subgroups of methods with a similar performance based on the mentioned criteria.
The findings showed that none of the discrimination methods is fully sensitive and specific in discriminating between IDA and βTT. Tree-based methods performed well in differential diagnosis in comparison with the hematological indices. The CRUISE tree algorithm performed better than the other discrimination methods according to the values of accuracy measures such as Youden's index, accuracy, PLR, NLR, DOR, F-measure and AUC. These criteria combine sensitivity and specificity and therefore reflect the diagnostic performance of a discrimination method more accurately than other criteria. Thus, this algorithm can help physicians make better clinical decisions.
Although sensitivity of hematological discrimination methods such as Ricerca, Telmissani-MCHD, Bordbar, Roth, and Shine and Lal (S&L) was higher than that of CRUISE tree algorithm, these hematological indices had a high false positive rate as compared to the CRUISE tree algorithm. Moreover, with respect to the other measurements, these indices had poor performance in discriminating between IDA and βTT.
Consistent with the findings of this study, other studies demonstrated that the Ehsani index performed well in discriminating between these two disorders in comparison with other hematological indices [83,84]. Meta-analyses indicated that the Bessman (RDW) index had a low AUC in comparison with other hematological indices [85,86]. Overall, the findings showed that the CRUISE tree algorithm performed better in discriminating between IDA and βTT than all hematological discrimination indices and the other classification tree methods. Moreover, the AUC of the CRUISE and C5.0 tree algorithms differed significantly from that of the Ehsani index, the index with the best diagnostic performance among the hematological indices (P < 0.001); the CRUISE and C5.0 tree algorithms had significantly higher AUC than this index. Indeed, all accuracy measures indicated that the CRUISE and C5.0 tree algorithms had the best diagnostic performance among the discrimination methods used.
Tree-based methods were fitted using hematological parameters as predictor variables. Based on the results obtained from the CRUISE and C5.0 tree methods, MCV was the main hematological predictor in differentiating between the types of hypochromic microcytic anemia. In this regard, patients with βTT were found to have lower values of MCV. In a previous study that used different decision trees for discrimination between IDA and βTT, the first split of all algorithms was based on MCV, indicating that MCV was an important predictor variable in the discrimination of IDA and βTT [47].
Several studies proposed various tree-based methods for the differential diagnosis of microcytic anemias [43,44,47,50-52]. For instance, Bellinger et al. used classification algorithms such as the J48 decision tree, support vector machines (SVM), k-nearest neighbours (K-NN), multilayer perceptron (MLP) and naïve Bayes (NB) to discriminate between patients with IDA, βTT, or both [50]. In another study, Setsirichok evaluated the classification of blood characteristics by a C4.5 decision tree, an NB classifier and an MLP for classifying eighteen classes of thalassemia abnormality [43]. Likewise, Jahangiri et al. (2017) used classification tree algorithms to construct a differential scheme and to investigate the performance of several tree algorithms for the differential diagnosis of IDA from βTT. In agreement with this study, Jahangiri et al. (2017) reported that the CRUISE tree algorithm had the highest AUC, that MCV was an important predictor variable in the discrimination of observations into IDA and βTT, and that the first split of all algorithms was based on MCV [47]. Moreover, Chakraborty et al. (2017) utilized the AdaBoost algorithm to generate multiple decision trees using the C4.5 decision tree for the classification of erythrocytes or anemia detection. Their proposed approach showed accuracy, specificity and sensitivity of 97.81%, 99.7% and 97.33%, respectively, in detecting abnormal erythrocytes [51].
Comparing the diagnostic performance of several algorithms such as J48, K-NN, artificial neural networks and NB for identifying β-thalassemia carriers, AlAgha concluded that naïve Bayes performed best in differentiating between normal individuals and β-thalassemia carriers [52]. Overall, the CRUISE and C5.0 tree algorithms, which performed best in this study, showed better performance than the tree algorithms in the previous studies [43,87].
Using advanced methods such as tree-based methods in addition to the differential indices can be useful for discriminating between these two hematologic disorders. Whereas each index includes only one or a few specific blood parameters, machine learning methods can consider the effects of all blood parameters simultaneously for prediction and exploratory modeling. Besides, using decision trees for discrimination between IDA and βTT may help avoid expensive, time-consuming, and complicated laboratory procedures in cases where hematological indices discriminate unsatisfactorily between these two disorders.
The application of methods such as multidimensional scaling and cluster analysis is deemed useful for identifying classification methods with similar diagnostic performance. In previous hematological studies, such indices were compared subjectively based on the accuracy measures. Therefore, multidimensional scaling and cluster analysis are proposed for identifying hematological discrimination indices with similar performance in future hematological studies.

Application in practice for medical studies
In medical diagnostic processes, decision making with high diagnostic performance is very important. Tree-based methods can be considered appropriate for decision making because they generate differential diagnoses with high accuracy measures (sensitivity, specificity, PPV, NPV, PLR, NLR, DOR, accuracy, and AUC) in comparison with the discrimination indices. In addition, tree algorithms display results graphically, making the results understandable without statistical expertise. These algorithms can thus be useful for building diagnostic classification schemes of patients in medical studies. This study considered the discrimination between IDA and βTT in order to prevent iron overload and its complications caused by misdiagnosis and inappropriate treatment, and also to determine the prenatal causes of hemoglobin chain disorders.

Conclusions
Given their diagnostic performance, the CRUISE and C5.0 tree algorithms are considered appropriate methods for the differential diagnosis of patients in comparison with other methods. Moreover, tree-based methods are useful along with other parameters for discriminating between IDA and βTT. Considering the advantages of tree algorithms, they can help physicians make better clinical decisions. The results showed that multidimensional scaling and cluster analysis are appropriate techniques for identifying discrimination indices with similar performance in future studies. In addition, the tree-based methods were identified as good methods for extracting homogeneous subgroups of observations in medical studies.
Additional file 1. Table S1. Discrimination indices for differentiation between iron deficiency anemia (IDA) and β-thalassemia trait (βTT). Table S2. Descriptive statistics of blood parameters of the study groups and normalized importance (%) of hematological parameters based on the CRUISE tree algorithm (SD: standard deviation and IQR: interquartile range). Table S3. Sensitivity (TPR), specificity (TNR), false positive rate (FPR), false negative rate (FNR), positive predictive values (PPV) and negative predictive values (NPV) of each hematological index and classification tree algorithm for differentiation between iron deficiency anemia (IDA) and β-thalassemia trait (βTT) with their 95% confidence interval.