
Table 9 Qualitative assessment of urological Expert Systems

From: A systematic review of the applications of Expert Systems (ES) and machine learning (ML) in clinical urology

| Article | Model | Validation methods | Credibility | Evaluation | Validation | Verification | Strength and bias |
|---|---|---|---|---|---|---|---|
| [27] | RBR | Patients' evaluation | No | Yes | Yes | No | Only qualitative evaluation |
| [18] | RBR | Blinded comparison against 4 experts with independent experts rating and 3 centres RCT pilot trial | Yes | Yes | Yes | No | Consideration of system evaluation with real time testing but small number |
| [21] | FRB | Improve practitioner accuracy | No | No | No | No | Insufficient info on development and validation |
| [15] | RBR | RCT reliability and validity by experts’ reviews | Yes | Yes | Yes | No | Small number in the study and short duration of follow-up |
| [95] | ANN | ROC, Sp, Se | No | No | Yes | No | Small number for validation |
| [63] | FSS | ROC, Sp, Se | No | No | Yes | No | 2 methods for validation, compared to experts and data |
| [143] | ANN | Compare to histology results | No | No | Yes | No | No comparison to human to demonstrate usability, no p value or CI |
| [103] | FNM | ROC, LR, RMS | No | No | Yes | No | p value calculated to compare all models |
| [103] | ANN | ROC, LR, RMS | No | No | Yes | No | p value calculated to compare all models, the effect of combining HK p53 with other variables |
| [102] | ANN | ROC, Sp, Se | No | No | Yes | No | No p value |
| [76] | ANN | Correlation coefficient | No | No | Yes | No | Correlation coefficient between expert and system? Kappa more accurate |
| [40] | FRB | Not published | No | No | No | No | Not validated |
| [68] | ANN | AUC ROC | No | No | Yes | No | p value calculated vs LR |
| [19] | RBR | Feedback from patients with no control group | No | Yes | No | No | No validation but user (patient) evaluation |
| [29] | FRB | Comparison to experts and non-experts | No | No | Yes | No | Expert as gold standard |
| [25] | RBR | PPV 62%, NPV 100%, Se 100%, Sp 33% | No | No | Yes | No | Small number, low specificity |
| [55] | ANN | ROC AUC then compare with LR, kappa stats | No | No | Yes | No | Multimodal validation |
| [99] | ANN | ROC, Sp, Se | No | No | Yes | No | No long-term follow-up |
| [43] | ANN | ROC (0.74 and 0.86) | No | No | Yes | No | TRUS finding from expert panel, human as gold standard |
| [105] | FNM | ROC, LR | No | No | Yes | No | p value calculated to compare all models |
| [105] | ANN | Kaplan-Meier for survival | No | No | Yes | No | p for comparison ANN and FNM calculated |
| [145] | kNN | Comparison to other classifiers and ROC | No | Yes | Yes | No | Evaluated the usability of the product and was found to have less than significant effect |
| [129] | ANN | ROC, Se, Sp | No | No | Yes | No | Sensitivity analysis of input variables |
| [22] | ANN | ROC 0.7, accuracy 79% | No | No | Yes | No | Compare to experts without accounting for human error |
| [85] | FRB | ROC, Se, Sp | No | No | Yes | No | No user evaluation |
| [24] | FRB | Ac 0.76, Se 0.79, Sp 0.75 | No | No | Yes | No | Expert as gold standard |
| [109] | ANN | ROC, compare to LR | No | No | Yes | No | CI calculated |
| [12] | FRB | Ac 0.93, Se 0.97, Sp 0.99 | No | No | Yes | No | Expert as gold standard |
| [110] | ANN | Prediction error percent | No | No | Yes | No | Experimental results |
| [48] | SVM | ROC AUC | No | No | Yes | No | p value calculated to compare all models |
| [146] | ANN | Overlap measure (segmented by experts) | No | No | Yes | No | Expert as gold standard |
| [23] | ANN | Ac 0.84, Se 0.93, Sp 0.33 | No | No | Yes | No | Experts verified data, no account for human error |
| [30] | FNM | Accuracy 86.8% | No | No | Yes | No | Guidelines as gold standard |
| [20] | RBR | Evaluation by experts, 95 retrospective | No | No | Yes | No | Expert as gold standard, qualitative evaluation |
| [26] | HYB, FUZZY, ONT | Kappa vs experts, k = 0.89 | No | No | Yes | No | Kappa limitation prospective, randomisation |
| [16] | RBR | Se 0.95, Sp 0.72, Bayesian analysis S&S, usability of system by Likert scale (Cronbach’s alpha 0.9) | Yes | Yes | Yes | No | Full system evaluation but nurse as gold standard, no attempts to eliminate error |
| [91] | ANN | ROC AUC compare with Partin nomogram and LR | No | No | Yes | No | No correlation with user |
| [17] | FNM | Kappa vs experts, Se 0.95, Sp 0.92 | No | No | Yes | No | Human expert as gold standard and no qualitative evaluation (weight of error) |
| [60] | ANN | Ac 60% (testing), 75% (training) | No | No | Yes | No | Compare to gold standard, urodynamic |
| [117] | ANN | PPV 100% | No | No | Yes | No | No calculation of NPV and overall accuracy |
| [32] | FNM | Correlation coefficient = 0.99 | No | No | Yes | No | Small number of cases for validation |
| [150] | FCM | OR 86.3% | No | No | Yes | No | Comparison with experts as gold standard than mapping to histology |
| [141] | ANN | ROC, Se 64.2%, Sp 59.6%, PPV 61.6%, NPV 62.2%, AUC 0.6852 | No | No | Yes | No | Similar to urodynamic as research tool |
| [54] | FRB | None | No | No | Yes | No | No validation |

  1. The development of every system was qualitatively assessed against the common industrial steps in the development pathway described by O'Keefe and Benbasat. With the exception of system validation, the rest of the cycle was defective, with no explanation given. The strength of validation varied; for data-driven systems it commonly relied on the receiver operating characteristic (ROC) curve to estimate the area under the curve (AUC).
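
The validation measures that recur in the table (Se, Sp, PPV, NPV, accuracy, kappa and ROC AUC) can be illustrated with a minimal Python sketch. The counts and scores below are hypothetical and are not taken from any reviewed study; the names `Confusion`, `metrics` and `auc` are assumptions made for this example only. The AUC is computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as half, which is equivalent to the area under the empirical ROC curve.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """2x2 table of system output against a reference standard (hypothetical counts)."""
    tp: int  # system positive, reference positive
    fp: int  # system positive, reference negative
    fn: int  # system negative, reference positive
    tn: int  # system negative, reference negative


def metrics(c: Confusion) -> dict:
    """Return Se, Sp, PPV, NPV, accuracy and Cohen's kappa for a 2x2 table."""
    n = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / n
    # Agreement expected by chance, from the marginal totals
    p_both_pos = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    p_both_neg = ((c.fn + c.tn) / n) * ((c.fp + c.tn) / n)
    expected = p_both_pos + p_both_neg
    return {
        "Se": c.tp / (c.tp + c.fn),
        "Sp": c.tn / (c.tn + c.fp),
        "PPV": c.tp / (c.tp + c.fp),
        "NPV": c.tn / (c.tn + c.fn),
        "Ac": accuracy,
        "kappa": (accuracy - expected) / (1 - expected),
    }


def auc(pos_scores, neg_scores) -> float:
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical example: 100 cases scored against a reference standard
result = metrics(Confusion(tp=40, fp=10, fn=5, tn=45))
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.4])
```

Because kappa subtracts the agreement expected by chance, it is a stricter measure of agreement with experts than raw accuracy or a correlation coefficient, which is consistent with the table's remark that kappa would be more accurate for study [76].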