Multi-criteria decision making to validate performance of RBC-based formulae to screen β-thalassemia trait in heterogeneous haemoglobinopathies

Background
India has the largest number of children with thalassemia major worldwide, and about 10,000-15,000 children with the disease are born every year. Scaling up e-health initiatives in rural areas with a cost-effective digital tool that gives all sections of the population access to healthcare remains a challenge for governmental and semi-governmental institutions and agencies.

Methods
We compared the performance of a recently developed formula, SCS$_{BTT}$, and its web application SUSOKA with 42 discrimination formulae presently available in the literature. A total of 6,388 samples were collected from the Postgraduate Institute of Medical Education and Research, Chandigarh, in North-Western India. The performance of the formulae was evaluated by eight different measures: sensitivity, specificity, Youden's Index, AUC-ROC, accuracy, positive predictive value, negative predictive value, and false omission rate. Three multi-criteria decision-making (MCDM) methods, TOPSIS, COPRAS, and SECA, were implemented to rank the formulae while ensuring a trade-off among the eight measures.

Results
The MCDM methods revealed that Shine & Lal and SCS$_{BTT}$ were the best-performing formulae. Further, a modification of the SCS$_{BTT}$ formula was proposed, and validation was conducted with a data set containing 939 samples collected from Nil Ratan Sircar (NRS) Medical College and Hospital, Kolkata, in Eastern India. Our two-step approach restricts the need for molecular diagnosis to a smaller portion of the population. SCS$_{BTT}$ together with the condition MCV ≤ 80 fl was recommended for a highly heterogeneous population. It was found that SCS$_{BTT}$ can classify all BTT samples with 100% sensitivity when MCV ≤ 80 fl.

Conclusions
We addressed the issue of how to integrate the higher-ranked formulae into mass screening to ensure higher performance through the MCDM approach. In real-life practice, it is sufficient for a screening algorithm to flag a particular sample as requiring or not requiring further specific confirmatory testing. Implementing discriminant functions in routine screening programs allows early identification; consequently, the cost will decrease, and the turnaround time in everyday workflows will also improve. Our proposed two-step procedure expedites such a process. It is concluded that, for mass screening of BTT in a heterogeneous data set, SCS$_{BTT}$ and its web application SUSOKA can provide 100% sensitivity when MCV ≤ 80 fl.

Supplementary Information
The online version contains supplementary material available at 10.1186/s12911-023-02388-w.

Appendix A Descriptive statistics of RBC parameters

Appendix B The overview of MCDM methods: Entropy weights method
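For weight generation in TOPSIS and COPRAS, we used the entropy weights method developed by Shannon and Weaver. The method is based on probability theory and is used to quantify uncertain information, or entropy. The weights are generated in the following steps.

Step 1: We start with the decision matrix $X = (x_{ij})_{m \times n}$, with $m = 35$ alternative formulae $A_1, A_2, \ldots, A_m$ and $n = 8$ criteria (obtained from Table 2 after exclusion). Consequently, the following decision matrix was constructed:

$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix} \tag{B1}$$

Step 2: The decision matrix is normalized column-wise, and the entropy value $e_j$, which measures the amount of decision information contained in each criterion of the normalized matrix, is given by

$$p_{ij} = \frac{x_{ij}}{\sum_{i=1}^{m} x_{ij}}, \qquad e_j = -\frac{1}{\ln m} \sum_{i=1}^{m} p_{ij} \ln p_{ij}, \qquad \forall (i, j) \in \{1, \ldots, m\} \times \{1, \ldots, n\}.$$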
Step 3: The weight $w_j$ for each criterion is determined from the degree of divergence $d_j$, using the following relations: (i) $d_j = 1 - e_j$, and (ii) $w_j = d_j / \sum_{k=1}^{n} d_k$. From Table 2, we obtain the weights presented in Table B3, which were used to implement the TOPSIS and COPRAS methods.
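The three steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard column-sum normalization and the convention $0 \ln 0 = 0$; the function name `entropy_weights` and the toy matrix are hypothetical and are not taken from the study data.

```python
import math

def entropy_weights(X):
    """Entropy-based criterion weights for an m x n decision matrix X.

    Rows are alternatives (formulae), columns are criteria.
    Assumes all entries are non-negative and every column has a positive sum.
    """
    m, n = len(X), len(X[0])
    divergence = []
    for j in range(n):
        col = [X[i][j] for i in range(m)]
        total = sum(col)
        p = [x / total for x in col]  # column-wise normalization p_ij
        # Entropy e_j, scaled by ln(m); terms with p = 0 contribute 0 by convention.
        e_j = -sum(v * math.log(v) for v in p if v > 0) / math.log(m)
        divergence.append(1.0 - e_j)  # degree of divergence d_j
    d_sum = sum(divergence)
    return [d / d_sum for d in divergence]  # weights w_j

# Hypothetical toy matrix: 3 alternatives x 3 criteria (illustrative values only).
X = [
    [0.90, 0.80, 0.10],
    [0.85, 0.60, 0.20],
    [0.95, 0.40, 0.40],
]
w = entropy_weights(X)
print([round(v, 4) for v in w])
```

A criterion whose values barely differ across alternatives carries little information and receives a weight close to zero, while a criterion with widely spread values receives a larger weight.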

B.1 TOPSIS
The TOPSIS method is one of the most widely used MCDM methods because it consists of simple computational steps and is easy to explain. The method ranks the alternatives according to their distances from the positive and negative ideal solutions: the alternative chosen is expected to be close to the ideal solution and far from the anti-ideal solution. The TOPSIS method consists of the following steps.

Step 1: From the decision matrix $X = (x_{ij})_{m \times n}$ in Equation (B1), we determine the normalized decision matrix $r = (r_{ij})_{m \times n}$, where

$$r_{ij} = \frac{x_{ij}}{\sqrt{\sum_{i=1}^{m} x_{ij}^2}}.$$

Using the weights $w_j$, the weighted normalized matrix is calculated as $v = r \, \mathrm{diag}(w)$, where $\mathrm{diag}(w)$ is a diagonal matrix whose diagonal elements are the weights $w_j$ obtained from Shannon entropy (Table B3).
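Step 2: Next, we determine the ideal ($A^+$) and anti-ideal ($A^-$) solutions from the weighted normalized values $v_{ij}$ using the following formulas:

$$A^+ = \{v_1^+, \ldots, v_n^+\} = \left\{ \left( \max_i v_{ij} \mid j \in J_1 \right), \left( \min_i v_{ij} \mid j \in J_2 \right) \right\},$$

$$A^- = \{v_1^-, \ldots, v_n^-\} = \left\{ \left( \min_i v_{ij} \mid j \in J_1 \right), \left( \max_i v_{ij} \mid j \in J_2 \right) \right\},$$

where $J_1$ and $J_2$ are the sets of benefit and loss criteria, respectively. The ideal solution therefore represents the best performance, and the anti-ideal solution the worst. Note that in our analysis, seven criteria are benefit criteria, and only the false omission rate (FOR) is a loss criterion.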
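Step 3: The Euclidean distances of each alternative from $A^+$ and $A^-$ are calculated as follows:

$$S_i^+ = \sqrt{\sum_{j=1}^{n} \left( v_{ij} - v_j^+ \right)^2}, \qquad S_i^- = \sqrt{\sum_{j=1}^{n} \left( v_{ij} - v_j^- \right)^2},$$

where $v_j^+$ and $v_j^-$ denote the $j$-th components of $A^+$ and $A^-$. Using these distance measures, the relative closeness $C_i^* \in [0, 1]$ is computed for each alternative, where

$$C_i^* = \frac{S_i^-}{S_i^+ + S_i^-}.$$

By sorting the closeness measures $C_i^*$ in descending order, the alternatives are ranked from best to worst (Table 3).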
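As an illustration of the three TOPSIS steps, the following sketch ranks a small set of alternatives. It assumes vector (Euclidean) normalization, as is standard for TOPSIS; the function name `topsis`, the weights, and the toy values are hypothetical and do not reproduce Table 3.

```python
import math

def topsis(X, weights, benefit):
    """Relative closeness C*_i for each row (alternative) of X.

    weights: one weight per criterion (for example, entropy weights).
    benefit: list of booleans, True for a benefit criterion, False for a loss criterion.
    """
    m, n = len(X), len(X[0])
    # Step 1: vector normalization followed by weighting.
    norms = [math.sqrt(sum(X[i][j] ** 2 for i in range(m))) for j in range(n)]
    V = [[weights[j] * X[i][j] / norms[j] for j in range(n)] for i in range(m)]
    # Step 2: ideal and anti-ideal solutions.
    ideal, anti = [], []
    for j in range(n):
        col = [V[i][j] for i in range(m)]
        if benefit[j]:
            ideal.append(max(col))
            anti.append(min(col))
        else:
            ideal.append(min(col))
            anti.append(max(col))
    # Step 3: Euclidean distances and relative closeness.
    closeness = []
    for i in range(m):
        s_plus = math.sqrt(sum((V[i][j] - ideal[j]) ** 2 for j in range(n)))
        s_minus = math.sqrt(sum((V[i][j] - anti[j]) ** 2 for j in range(n)))
        closeness.append(s_minus / (s_plus + s_minus))
    return closeness

# Hypothetical example: columns are sensitivity, specificity, FOR (a loss criterion).
X = [
    [0.95, 0.90, 0.05],
    [0.80, 0.85, 0.15],
    [0.70, 0.60, 0.30],
]
scores = topsis(X, weights=[0.4, 0.4, 0.2], benefit=[True, True, False])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```

In this toy example the first alternative is best on every criterion, so it coincides with the ideal solution and its closeness equals 1, while the last alternative coincides with the anti-ideal solution and its closeness equals 0.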

B.2 COPRAS (Complex Proportional Assessment)
The COPRAS method was introduced in [63] to evaluate the superiority of one alternative over another and to make it possible to compare alternatives. The method ranks and evaluates the alternatives step by step in terms of their importance and utility degree [66]. The steps of the COPRAS method are as follows.

Step 1: From the decision matrix $X = (x_{ij})_{m \times n}$ in Equation (B1), determine the normalized matrix

$$\bar{x}_{ij} = \frac{x_{ij}}{\sum_{i=1}^{m} x_{ij}}.$$

Step 2: Determine the weighted normalized matrix $y_{ij} = w_j \bar{x}_{ij}$, where $w_j$ are the entropy weights (Table B3).
Step 3: Determine the sums of the weighted normalized values for both the beneficial (seven) and non-beneficial (one) criteria. These sums were calculated using Equation (B2):

$$K_{+j} = \sum_{i} y_{+ij}, \qquad K_{-j} = \sum_{i} y_{-ij}, \tag{B2}$$

where $y_{+ij}$ and $y_{-ij}$ are the weighted normalized values of the beneficial and non-beneficial criteria, respectively; here $j$ indexes the alternative and the sums run over the corresponding criteria. Therefore, higher $K_{+j}$ and lower $K_{-j}$ values represent better alternatives.
Step 4: The importance of the alternatives is determined from the characteristics of the positive sums $K_{+j}$ and the negative sums $K_{-j}$. The priorities of the candidate alternatives ($C_j$) are then calculated with the following formula:

$$C_j = K_{+j} + \frac{K_{-\min} \sum_{j=1}^{m} K_{-j}}{K_{-j} \sum_{j=1}^{m} \left( K_{-\min} / K_{-j} \right)},$$

where $K_{-\min}$ is the minimum of the $K_{-j}$ values. Finally, the $C_j$ values are used for ranking.
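The COPRAS steps can be sketched as follows. This is an illustrative implementation that assumes column-sum normalization and takes $K_{-\min}$ as the minimum of the non-beneficial sums, as in the standard formulation of the method; the function name `copras` and the toy values are hypothetical. In the code, rows of `X` are alternatives and columns are criteria.

```python
def copras(X, weights, benefit):
    """COPRAS priority score for each row (alternative) of X.

    Assumes every alternative has a strictly positive non-beneficial sum.
    """
    m, n = len(X), len(X[0])
    # Steps 1-2: column-sum normalization and weighting.
    col_sums = [sum(X[i][j] for i in range(m)) for j in range(n)]
    Y = [[weights[j] * X[i][j] / col_sums[j] for j in range(n)] for i in range(m)]
    # Step 3: sums over beneficial and non-beneficial criteria per alternative.
    k_plus = [sum(Y[i][j] for j in range(n) if benefit[j]) for i in range(m)]
    k_minus = [sum(Y[i][j] for j in range(n) if not benefit[j]) for i in range(m)]
    # Step 4: priority of each alternative.
    k_min = min(k_minus)
    total_minus = sum(k_minus)
    denom = sum(k_min / k for k in k_minus)
    return [
        k_plus[i] + (k_min * total_minus) / (k_minus[i] * denom)
        for i in range(m)
    ]

# Hypothetical example: columns are sensitivity, specificity, FOR (non-beneficial).
X = [
    [0.95, 0.90, 0.05],
    [0.80, 0.85, 0.15],
    [0.70, 0.60, 0.30],
]
priority = copras(X, weights=[0.4, 0.4, 0.2], benefit=[True, True, False])
utility = [100.0 * c / max(priority) for c in priority]  # utility degree in percent
print(priority, utility)
```

When the weights sum to one, the priorities also sum to one, which gives a quick sanity check on an implementation.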

B.3 Simultaneous Evaluation of Criteria and Alternatives (SECA)
Unlike TOPSIS and COPRAS, where the weights are determined through entropy calculation, SECA determines the weights and the ranking simultaneously. The final ranking is obtained by executing the following steps.

Step 1: As with TOPSIS, we start with the same decision matrix $X = (x_{ij})_{m \times n}$, and all the criteria are divided into two subcategories: beneficial criteria (BC) and non-beneficial criteria (NC). BCs have a positive effect, and growth in their values improves the decision-making function, whereas NCs have a negative effect, and growth in their values has a reverse effect on the objective function. Note that all the criteria used in our model are BC. The normalized decision matrix $X^N$ is determined by using the following formula:

$$x_{ij}^N = \begin{cases} \dfrac{x_{ij}}{\max_k x_{kj}}, & j \in BC, \\[2ex] \dfrac{\min_k x_{kj}}{x_{ij}}, & j \in NC. \end{cases}$$

Step 2: Determine the standard deviation $\sigma_j$ ($j = 1, 2, \ldots, n$) of each criterion and the correlation $\pi_{jk}$ between each pair of criteria to capture the variation within and between criteria.
Step 3: Compute the conflict $\pi_j$ between each criterion and the other criteria, where $\pi_j = \sum_{k=1}^{n} (1 - \pi_{jk})$. Note that an increase in the variation within a criterion intensifies the objective importance of that criterion.
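Step 4: Normalize $\sigma_j$ and $\pi_j$ to obtain the reference points by using the following relations:

$$\sigma_j^N = \frac{\sigma_j}{\sum_{l=1}^{n} \sigma_l}, \qquad \pi_j^N = \frac{\pi_j}{\sum_{l=1}^{n} \pi_l}.$$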
For example, we obtained the following results for MLAs as shown in Table B5.
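Step 5: Finally, the weight $w_j$ for each criterion is obtained by solving the following non-linear optimization problem:

$$\begin{aligned} \max \quad & Z = \lambda_a - \beta (\lambda_b + \lambda_c) \\ \text{s.t.} \quad & \lambda_a \le S_i, \quad \forall i \in \{1, \ldots, m\}, \\ & S_i = \sum_{j=1}^{n} w_j x_{ij}^N, \quad \forall i \in \{1, \ldots, m\}, \\ & \lambda_b = \sum_{j=1}^{n} \left( w_j - \sigma_j^N \right)^2, \\ & \lambda_c = \sum_{j=1}^{n} \left( w_j - \pi_j^N \right)^2, \\ & \sum_{j=1}^{n} w_j = 1, \\ & \epsilon \le w_j \le 1, \quad \forall j \in \{1, \ldots, n\}, \end{aligned} \tag{B5}$$

where $\sigma_j^N$ and $\pi_j^N$ are the normalized reference points from Step 4. In Equation (B5), the objective is to maximize the performance of each formula by considering the overall performance score of the alternatives ($\lambda_a$) and the variation within and between criteria through $\lambda_b$ and $\lambda_c$, respectively. The coefficient $\beta \ge 0$ is used to merge the objectives, and it affects the importance of reaching the reference points of the criteria weights. Note that the model constraints ensure that the weights sum to one, and $\epsilon = 0.001$ is the lower bound for each weight. We find the final weights based on the results presented in Table B4.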
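Steps 1 to 4 and the objective of Equation (B5) can be illustrated with the sketch below. It is not a solver: it only computes the normalized matrix and the reference points, then evaluates $Z$ for a candidate weight vector, which a non-linear optimizer would then maximize subject to the constraints. The sketch assumes the population standard deviation and the Pearson correlation; the function names `seca_reference_points` and `seca_objective` and the toy data are hypothetical.

```python
import math

def seca_reference_points(X, benefit):
    """Return (XN, sigma_N, pi_N): normalized matrix and normalized reference points."""
    m, n = len(X), len(X[0])
    # Step 1: linear normalization (max-based for BC, min-based for NC).
    XN = [[0.0] * n for _ in range(m)]
    for j in range(n):
        col = [X[i][j] for i in range(m)]
        for i in range(m):
            XN[i][j] = col[i] / max(col) if benefit[j] else min(col) / col[i]
    cols = [[XN[i][j] for i in range(m)] for j in range(n)]
    means = [sum(c) / m for c in cols]
    # Step 2: population standard deviation (an assumption) and Pearson correlation.
    sigma = [math.sqrt(sum((v - mu) ** 2 for v in c) / m) for c, mu in zip(cols, means)]

    def corr(j, k):
        cov = sum((a - means[j]) * (b - means[k]) for a, b in zip(cols[j], cols[k])) / m
        return cov / (sigma[j] * sigma[k])

    # Step 3: conflict of each criterion with the others.
    pi = [sum(1.0 - corr(j, k) for k in range(n)) for j in range(n)]
    # Step 4: normalize to obtain the reference points.
    sigma_N = [s / sum(sigma) for s in sigma]
    pi_N = [p / sum(pi) for p in pi]
    return XN, sigma_N, pi_N

def seca_objective(w, XN, sigma_N, pi_N, beta):
    """Objective Z of Equation (B5) for a candidate weight vector w."""
    scores = [sum(wj * x for wj, x in zip(w, row)) for row in XN]
    lam_a = min(scores)  # largest lambda_a satisfying lambda_a <= S_i for all i
    lam_b = sum((wj - s) ** 2 for wj, s in zip(w, sigma_N))
    lam_c = sum((wj - p) ** 2 for wj, p in zip(w, pi_N))
    return lam_a - beta * (lam_b + lam_c)

# Hypothetical toy data: 4 alternatives x 3 criteria, the last one non-beneficial.
X = [
    [0.95, 0.90, 0.05],
    [0.80, 0.85, 0.15],
    [0.70, 0.60, 0.30],
    [0.90, 0.70, 0.10],
]
XN, sigma_N, pi_N = seca_reference_points(X, benefit=[True, True, False])
w_equal = [1.0 / 3.0] * 3
z = seca_objective(w_equal, XN, sigma_N, pi_N, beta=3.0)
print(sigma_N, pi_N, z)
```

Larger values of `beta` pull the optimal weights toward the reference points, while `beta = 0` reduces the problem to maximizing the worst overall performance score alone.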

Table A1: Descriptive statistics of RBC parameters in different groups (mean ± standard deviation) for the test data set

Table A2: Descriptive statistics of RBC parameters in different groups (mean ± standard deviation) in the validation data set tested by SCS$_{BTT}$