A systematic review of the applications of Expert Systems (ES) and machine learning (ML) in clinical urology

Abstract

Background

Testing a hypothesis for ‘factors-outcome effect’ is a common quest, but standard statistical regression analysis tools are rendered ineffective by data contaminated with too many noisy variables. Expert Systems (ES) can provide an alternative methodology in analysing data to identify variables with the highest correlation to the outcome. By applying their effective machine learning (ML) abilities, significant research time and costs can be saved. The study aims to systematically review the applications of ES in urological research and their methodological models for effective multi-variate analysis. Their domains, development and validity will be identified.

Methods

The PRISMA methodology was applied to formulate an effective method for data gathering and analysis. The search covered the seven most relevant information sources: WEB OF SCIENCE, EMBASE, BIOSIS CITATION INDEX, SCOPUS, PUBMED, Google Scholar and MEDLINE. Articles were eligible if they applied one of the known ML models to a clear urological research question involving multivariate analysis. Only articles with pertinent research methods in ES models were included. The analysed data included the system model, applications, input/output variables, target user, validation, and outcomes. Both ML models and the variable analysis were comparatively reported for each system.

Results

The search identified n = 1087 articles from all databases, of which n = 712 were eligible for examination against the inclusion criteria. A total of 168 systems were finally included and systematically analysed, demonstrating a recent increase in the uptake of ES in academic urology, in particular artificial neural networks (31 systems). Most of the systems were applied in urological oncology (prostate cancer = 15, bladder cancer = 13), where diagnostic, prognostic and survival predictor markers were investigated. Due to the heterogeneity of the models and their statistical tests, a meta-analysis was not feasible.

Conclusion

ES offer effective ML capabilities, and their applications in research have demonstrated a valid model for multivariate analysis. The complexity of their development can challenge their uptake in urological clinics, whilst the limitations of the statistical tools in this domain have created a gap for further research. Integration of computer scientists in academic units has promoted the use of ES in clinical urological research.

Introduction

In the 1950s, J McCarthy at Stanford University and A Turing at Cambridge University proposed the concept of machine simulation of human learning and intelligence [1, 2]. Being keen mathematicians, they advanced basic mathematical logic into programming languages, enabling machines to perform more complex functions. E Shortliffe advanced those systems to develop MYCIN, which successfully simulated the reasoning of a human microbiologist in diagnosing and treating patients with microbial infection [3]. This model introduced Expert Systems (ES) to the scientific literature, and a ten-year review by Liao et al. demonstrated their wide prevalence in industrial fields, with extensive applications including health care [4]. In contrast to Liao’s review, other studies questioned their real-time implementation in health care and suggested a lack of uptake and integration in health care systems [5]. This is despite evidence from systematic reviews demonstrating the positive impact of computer-aided systems on patient outcomes and health care [6, 7].

This study aimed to systematically review published ES in urological health care, with the primary aim of demonstrating their availability, progression, testing and applications. The secondary aim was to evaluate their development life cycle against the standards suggested by O’Keefe and Benbasat in their review articles on ES development [8, 9]. The latter would evaluate the gap between their development and their implementation in health care.

Methods

The study methodology followed the recommendations outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Fig. 1). No ethical approval was required because the type of the study waives this requirement.

Fig. 1 PRISMA flow chart for the systematic review of articles included in the review of expert systems in urology

Search

Information sources including WEB OF SCIENCE, EMBASE, BIOSIS CITATION INDEX, SCOPUS, PUBMED, Google Scholar and MEDLINE were searched using the keywords listed in Table 1. Articles published between 1960 and 2016 were considered and examined against the inclusion criteria. The search was tailored to each literature database: within each domain the keywords were combined with “OR”, and the domains were then combined with “AND”.
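The OR-within-domain, AND-across-domains combination can be sketched in a few lines of Python. This is only an illustration: the function name `build_query` and the keyword lists are hypothetical stand-ins (the actual keywords are those in Table 1), and each database has its own query syntax that would need further tailoring.

```python
def build_query(domains):
    """Join keywords with OR inside each domain, then join the domains with AND."""
    groups = []
    for keywords in domains:
        # Quote multi-word phrases so that they are searched as a unit.
        terms = [f'"{k}"' if " " in k else k for k in keywords]
        groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)


# Hypothetical keyword domains, for illustration only (not the Table 1 keywords).
example_domains = [
    ["expert system", "neural network", "fuzzy logic"],
    ["urology", "prostate", "bladder"],
]
query = build_query(example_domains)
print(query)
# ("expert system" OR "neural network" OR "fuzzy logic") AND (urology OR prostate OR bladder)
```

Adding a keyword to a domain broadens the search, whereas adding a new domain narrows it.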

Table 1 Keywords used for literature search

Eligibility criteria

For the primary aim, the search results were collected and then analysed according to pre-planned eligibility criteria based on the system model, year of production, type and outcome of its validation, functional domain application, input and output variables, target user and domain. These selection criteria were designed to identify expert system studies and to demonstrate their prevalence, testing, and applications in clinical urology. Only articles and studies written in English were included.

Further qualitative analysis was required to meet the secondary aim of the study. For this, further data were gathered on credibility (user perception of the system), evaluation (system usability), validation (building the right system) and verification (building the system right), and then compared against the standards reported in [8, 9].

Data filtering

The reference list of each included article was checked to identify any potentially eligible items that had not been retrieved by the initial search. All retrieved articles were collated in a final reference list in reference management software (EndNote X8), and duplicate studies were then removed from the list.

Once more than one hundred articles had been included, the remaining eligible articles were meticulously compared with those already included and excluded where they showed clear similarity. This was done to avoid expanding the size of the data without adding to the study analysis.

Results

ANN was the commonest model to be applied in Urological ES (Fig. 2). The rest of the models demonstrated diversity which is consistent with other published industrial systems [4].

Fig. 2 Analysis of Expert Systems (ES) by model (n = 169). ANN was the most common, but other models were applied in different domains, such as the fuzzy neural model (FNM), rule-based system (RBS), fuzzy rule-based system (FRB), support vector machine (SVM), Bayesian network (BN) and decision trees (DT)

Prostate cancer was the commonest domain for urological ES, with most of the systems focusing on cancer diagnosis. These systems were applied to various domains (Fig. 3), and they were further stratified and analysed according to their core functional application, as outlined in the methodology.

Fig. 3 Urological domains (n = 168) to which Expert Systems (ES) were applied. Prostate cancer (CaP) was the commonest domain, followed by bladder cancer (Bca), then other diseases such as benign prostatic disease (BPD), pelvi-ureteric junction obstruction (PUJ), urinary tract infection (UTI), renal cell cancer (RCC) and vesico-ureteric reflux (VU reflux)

Quantitative analysis

Decision support systems

The main objective of ES in this domain was to facilitate clinical decision making by identifying key elements from patients’ clinical and laboratory examinations and then refining a theoretical diagnostic or treatment strategy [10]. They can guide the expert to find the right answer [11], take over the decision making to support the non-expert, as in [12], or even replace both to interact with the patient directly [13].

They have supported various aspects of urological decision making such as diagnosis, investigations analysis, radiotherapy dose calculation, the delivery of behavioural treatment and therapeutic dialogues.

Domains

Urinary dysfunction (U Dys) was the commonest domain covered by decision support systems (n = 9), which could be further categorised into U Dys diagnostic, investigation analysis and therapeutic systems. They demonstrated a range of methodologies, validation approaches and target users (Table 2) applicable to decision support systems in the urological domain. For instance, Keles et al. [14] designed an ES to support junior nurses in diagnosing urinary elimination dysfunction in a selected group of patients, while the systems in [15, 16] were able to support any medical user in diagnosing urinary incontinence with an accuracy higher than 90%. The target users of these systems were predominantly medical health care workers, including both experts and non-experts, with the exception of [13, 17], which can be used directly by patients to receive an assessment of their urinary elimination dysfunction followed by a tailored treatment plan.

Table 2 Decision support systems in urological domain

Prostate diseases were represented in 6 systems, 3 of which [10, 12, 20] were modelled to diagnose both benign and malignant prostatic disease, the latter being prostate cancer (CaP).

All systems in this domain were diagnosis support systems, with the exception of [19], which also provided treatment for benign prostatic hyperplasia (BPH), and [11], which calculated the required radiotherapy dose for treating CaP.

Sexual dysfunctions were modelled in 3 systems: [21] diagnosed male sexual dysfunction with an accuracy of 89%, while [22] added a therapeutic model for the same disease with an overall accuracy of 79%. Sexpert [23], the third system in this category, was developed in 1988 and is the oldest ES identified by our search in any urological domain. Interestingly, this RB system was designed to interact directly with couples suffering from sexual dysfunction, responding to their queries with a tailored therapeutic dialogue for treating their problem.

Urinary tract infection (UTI) was diagnosed and treated by a hybrid fuzzy neural model (FNM) developed by [24], with an accuracy of 86.8%.

Diagnosis prediction

In this domain, ES quantify the probability of a clinical diagnosis with a defined margin of error. They simulate a second expert opinion, and it has been suggested that their use could eliminate unnecessary invasive investigations; for example, the ANN applied in [26] could avoid up to 68% of repeat TRUS biopsies for diagnosing CaP.

Domains

Prostate cancer was the main domain for this application, with 19 systems out of 20. Most of them were designed to predict organ confinement before radical surgical excision of the prostate (Tables 3, 4). The target population was patients with clinically localised CaP, and accuracy reached high estimates, as in [28], where the system was able to predict 98% of the group at low risk of lymph node involvement using preoperatively available data (PSA, clinical stage and Gleason score).

Table 3 Diagnosis prediction application of Expert Systems (ES) in Urology
Table 4 Disease stage prediction

Chiu et al. [29] modelled a system with clinical variables for patients undergoing nuclear bone scintigraphy to predict skeletal metastasis. The system was able to predict metastatic disease in the test group with a sensitivity (Se) of 87.5% and a specificity (Sp) of 83.3%.
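Sensitivity and specificity figures like these are derived from the counts of correct and incorrect predictions in the test group. The sketch below shows the standard calculation; the function name and the toy labels are illustrative assumptions and are not data from [29].

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = disease, 0 = no disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one disease-free case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy test group (hypothetical): 4 patients with disease, 6 without.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
se, sp = sensitivity_specificity(truth, predicted)
print(f"Se = {se:.1%}, Sp = {sp:.1%}")  # Se = 75.0%, Sp = 83.3%
```

Sensitivity is the proportion of diseased patients the system detects, while specificity is the proportion of disease-free patients it correctly clears.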

Non-seminoma testicular cancer was the other domain in this application, with the system in [27] able to predict the cancer disease stage (Table 4) with an accuracy reaching 87%.

Treatment outcome prediction

In this application, ES combined disease- and patient-related factors to estimate the success of a specific treatment or intervention. Examples include [30, 38, 64, 69], where the system predicted the outcome of extracorporeal shock wave lithotripsy (ESWL) for treating kidney stones, and [74, 75], which estimated cancer recurrence after radical surgical treatment of prostate cancer.

Domains

Prostate cancer was also a common domain in this application (n = 23). Potter [74, 75] described 4 models developed using data acquired from patients with clinically localised CaP who had undergone radical prostatectomy with curative intent. The variables included clinical and histological findings of the surgical specimen, and the models were able to predict up to 81% of those who did not have evidence of biochemical failure (rising PSA) during follow-up. The models of Hamid et al. [76] and Gomha [77] were not restricted to the clinically localised CaP cohort, and their study populations included patients at different disease stages and on any treatment pathway. Their models included 2 experimental histological markers (the tumour suppressor gene p53 and the proto-oncogene bcl-2) among the input variables, and the estimated accuracy in predicting patient response to treatment reached 68% and 80% (p < 0.00001) respectively.

Nephrolithiasis treatment was addressed by 6 other systems applying the treatment outcome prediction concept. Cummings et al. targeted this group with their ANN [78], training the network with data from patients with ureteric stones treated at the emergency services of 3 centres, to identify patients failing conservative management and requiring further intervention. When tested on a different set of 55 cases, the system correctly predicted 100% of the patients who passed the stone spontaneously, with an overall accuracy of 76%.

Extracorporeal shock wave lithotripsy (ESWL) is one of the favoured interventions in the nephrolithiasis treatment domain. The stone receives strong external shock waves, which can reduce it to small fragments and eliminate the need for direct instrumentation of the renal tract. Reported success rates for ESWL can only provide a generalised prediction of outcome for the individual case, and ANN provided an alternative multivariate analytical tool in the 4 models developed by [30, 38, 64, 69]. These models were reported to be highly accurate (Table 5); in [64], for example, the system predicted 97% of the patients who were confirmed to be stone free following ESWL for ureteric stones.

Table 5 Treatment outcome prediction

Paediatric pelvi-ureteric junction obstruction is primarily treated conservatively unless there is any evidence of renal function compromise, recurring infection or worsening radiological findings. For the failing group, pyeloplasty is the second line of treatment and [81] developed an ANN to estimate the success rate of this procedure for each individual case by predicting the post-operative degree of hydronephrosis with a reported 100% accuracy in the small tested sample.

Vesico-ureteric reflux, or reflux uropathy, is another paediatric disease, characterised by back flow of urine from the bladder into the ureter through an incompetent vesico-ureteric functional valve. Treatment is primarily conservative, as the disease can be self-limiting; the alternatives are surgery to reimplant the ureters or endoscopic injection of a bulking agent at the ureteric orifices [80]. The study authors trained a neural network using 261 cases who had received endoscopic injection, and the system predicted 94% of the patients who did not benefit from the treatment [80].

Laparoscopic partial and radical nephrectomy were the domain of the system in [82], which was developed using multi-institutional case data (age, co-morbidities, tumour size, and extension) from patients undergoing laparoscopic partial or radical nephrectomy. The system was able to predict the length of postoperative hospital stay with an accuracy of 72%.

Bladder cancer can be treated with complete bladder excision and [79] developed systems to predict the cure rate with an accuracy of 83%.

Recurrence and survival prediction

The ES in this domain aimed to provide individualised risk analysis tools, estimating disease-specific mortality and recognising the group who may benefit from more aggressive or adjuvant treatment.

Domains

Bladder cancer survival and recurrence prediction following radical cystectomy (RC) with curative intent was the commonest domain in this application (24 out of 26 total systems). Lymph node involvement is highly predictive of recurrence, and these patients are considered for adjuvant or neoadjuvant systemic chemotherapy. The node-free cohort will include high-risk patients who are not identified by the conventional linear stratification system. Catto et al. developed an FNM system to identify this high-risk group in the node-free cohort by predicting the disease recurrence rate (Se 81%, Sp 85%) and survival with a median error of 8.15 months [92]. The high-risk group identified by this model can benefit from systemic treatment post cystectomy to improve disease-related morbidity and mortality [95, 96]. Five-year survival post cystectomy was the output of 2 other ANNs, with high prediction efficacy of 77% and 90% respectively (Table 6) [97, 99].

Table 6 Recurrence and progression prediction

Renal cell cancer is primarily treated with partial or radical nephrectomy for clinically localised disease, with systemic therapy for metastatic disease. There is still a degree of uncertainty in stratifying individual disease risk in order to predict the indication for, and outcome of, systemic therapy in the group with distant metastasis. Vukicevic et al. [98] attempted to clarify this uncertainty by training a neural network with data from patients who had undergone nephrectomy (partial or radical) and received systemic therapy. The mature model predicted the patients who survived the disease at 3 years with an overall accuracy of 95% (CI 0.878–0.987).

Five-year recurrence of non-seminoma testicular cancer was the domain of the ANN in [118]. The system was trained with multicentre data and, in its testing phase, predicted 100% of the patients who did not suffer disease recurrence at 5 years, with an overall predictive accuracy of 94% (AUC = 87%).

Predicting research variables

In academia, testing a hypothesis for a ‘factors-outcome effect’ is a popular quest, and standard statistical regression analysis tools may not be effective for data contaminated by irrelevant variables [119]. AI can provide an alternative methodology for identifying variables with high correlation to the outcome by applying machine learning, as in ANN. The area under the curve (AUC) is estimated for the system’s predictive accuracy using all researched variables. Those research variables can then be given random values or randomised, and the AUC is re-estimated for comparison with the original [120]. Only variables that decrease the AUC are considered significant, and the wider the discrepancy in the AUC, the more significant they are (Table 7).
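The randomise-and-compare procedure can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `toy_model` is a fixed scoring function standing in for a trained ANN, the data are invented, and the AUC is computed as the proportion of positive-negative pairs ranked correctly rather than by any particular package used in the reviewed studies.

```python
import random


def auc(labels, scores):
    """AUC as the chance that a random positive case outscores a random negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_drop_when_randomised(model, rows, labels, column, seed=0):
    """Shuffle one variable across cases and return baseline AUC minus the new AUC."""
    rng = random.Random(seed)
    baseline = auc(labels, [model(r) for r in rows])
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    return baseline - auc(labels, [model(r) for r in permuted])


def toy_model(row):
    """Hypothetical stand-in for a trained network: scores a case by variable 0 only."""
    return row[0]


# Invented data: variable 0 separates the classes, variable 1 is noise.
rows = [[0.9, 3], [0.8, 1], [0.7, 4], [0.6, 1], [0.4, 5], [0.3, 9], [0.2, 2], [0.1, 6]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
drops = {c: auc_drop_when_randomised(toy_model, rows, labels, c) for c in (0, 1)}
print(drops)
```

Randomising the noise variable leaves the AUC unchanged because the stand-in model never reads it, whereas randomising variable 0 can only lower the AUC from its perfect baseline. In practice the shuffle would be repeated several times and the drops averaged.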

Table 7 Research variable prediction

Domains

Prostate cancer was a common domain in this application, with a total of 15 systems analysing predictive factors for cancer diagnosis, response to treatment and quality of life with prostatic disease. One of the hot topics in urological cancer is the discovery of alternative CaP diagnostic markers, since serum PSA is not sensitive for distinguishing benign from malignant disease. Stephan et al. investigated the diagnostic value of three markers in this domain: macrophage inhibitory cytokine-1, macrophage inhibitory factor and human kallikrein 11 [108]. These were used as variables (nodes) in ANN models, whose accuracy was compared with that of linear regression of %fPSA. They reported that only the ANN model including all three variables was more accurate (AUC 91%, Se 90%, Sp 80%) than all other models, supporting their hypothesis that the markers are only relevant when combined.

Similarly, another study estimated the predictive value of serum PSA precursors (-5, -7 proPSA) in diagnosing prostate cancer and compared their accuracy with %fPSA [107]. The -5, -7 proPSA were only significant in the cohort with PSA between 4 and 10 µg/l and did not improve the predictive accuracy when added to %fPSA. The same author tested this hypothesis on another free PSA precursor (-2 proPSA) by developing an ANN with %p2PSA (-2 proPSA: fPSA) among other disease variables, which improved the system accuracy (AUC 85% from 75%) [120].

Three systems evaluated the presence of bcl-2 and p53 (a proto-oncogene and a tumour suppressor gene, respectively) as predictive variables for response to prostate cancer treatment [76, 77]. Their combination was reported to be significant (Ac 85%, p < 0.00001) in [77], but [76] found that only bcl-2 was relevant in the other two models (accuracy 63–68%).

Bladder cancer diagnosis and disease progression was the second most common domain, with 13 systems. Kolasa et al. [110] modelled an ANN with three novel urine markers (urine levels of nuclear matrix protein-22, monocyte chemoattractant protein-1 and urinary intercellular adhesion molecule-1) to predict the diagnosis of bladder cancer, and it succeeded in predicting all cancer-free patients when the three variables were used as a group. Catto et al. [119] developed two AI models (ANN and FNM) performing microarray analysis on genes associated with bladder cancer progression. Their models narrowed these genes down from 200 to 11 progression-associated genes ([OR] 0.70; 95% [CI] 0.56–0.87), which were found to be more accurate than regression analysis when compared with the specimen immunohistology results.

The model of Kolasa et al. [110] predicted the pre-histology diagnosis of malignancy based on urine levels of novel tumour markers. Their ANN was found to be more accurate (Se 100%, Sp 75.7%) than haematuria diagnosed on urine dipstick (Se 92.6%, Sp 51.8%) and atypical urine cytology (Se 66.7%, Sp 81%).

ESWL of renal stones was the research domain of [30, 69], which aimed to identify significant variables correlated with the treatment outcome (stone free) and to develop a predictive model. The model of Chiu et al. [69] did not recognise residual fragments following ESWL as a significant risk for triggering further stone growth, while [30] identified the following factors as predictive of lower pole stone breaking and clearance: positive BMI, infundibular width (IW) 5 mm, and infundibular ureteropelvic angle (IUPA) 45% or more.

Benign prostatic hyperplasia was modelled in one system [114] to link disease-specific clinical and radiological factors with disease progression in patients with mild disease (IPSS < 7) who were not receiving any treatment. The ANN identified obstructive symptoms (Oss), PSA of more than 1.5 ng/ml and transitional zone volume of more than 25 cm3 as correlated with disease progression, and could accurately predict 78% of the cohort who would need further treatment.

The accuracy of diagnosing urinary dysfunction from clinical symptoms was compared with urodynamic findings in female patients with pelvic organ prolapse by [115]. Neither the linear regression nor the ANN model could establish a relation between the symptoms and the urodynamic-based diagnosis, dismissing the hypothesis that clinical symptoms alone can yield an accurate diagnosis and replace the need for urodynamic studies.

Hypogonadism (Hgon) was represented by the system in [133], where the diagnosis was made based on the patient’s age, erectile dysfunction and depression, with an AUC of 70% (p < 0.01).

Image analysis

This is one of the advancing applications of AI in medicine, where the system either analyses variables from reported medical images as data input or identifies these variables through a separate image analyser, without the need for an expert to report the scan or images. The first category was included among the systems mentioned above, as in the diagnosis prediction domain, where [47] included different variables from TRUS in the system input to predict CaP diagnosis. In this domain, we focused on the second group, where the images are presented to the machine in the form of raw data translated by the image analyser, and the system then applies machine learning to identify the cause-effect pattern (Table 8).

Table 8 Image analysis

Domains

Prostate cancer image analysis was modelled in 10 systems to enhance diagnostic accuracy, as in [126], and to predict disease progression, as in [128]. The first system represented each TRUS image pixel as one variable or neuron in a pulse coupled neural network (PCNN), and was trained with 212 prostate cancer images to segment the prostate gland boundary, with an average overlap accuracy (overlap measure = difference between the PCNN boundary and the expert’s) of 81% for ten images [126].

The other 4 systems analysed histological images from a cohort of patients with clinically localised CaP after RP to predict disease progression. The histological images were colour-coded and analysed by the system, which used variables such as the percentage of epithelial cells and glandular lumina to identify the group at high risk of disease recurrence with an accuracy reaching 90% [128].

LUT disease urine cytology images were analysed by 2 models in [123], which identified all patients with benign disease with an overall accuracy of 97%.

Nephrolithiasis stone biochemistry analysis can be achieved through expert analysis of infrared spectroscopy, which was simulated by [124]: the infrared spectra wavelength numbers were modelled as input variables, and the system’s prediction of the expert-analysed stone specimens had a root mean square error of 3.471.
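For readers unfamiliar with the metric, the root mean square error summarises how far a set of predictions lies from the reference values, in the same units as the measurements. The short sketch below shows the calculation on invented numbers; it does not reproduce the data or units of [124].

```python
import math


def rmse(predicted, observed):
    """Root mean square error between paired predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Invented example: two predictions that miss the reference values by 3 and 4 units.
error = rmse([3.0, 0.0], [0.0, 4.0])
print(round(error, 3))  # 3.536
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.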

Qualitative analysis

The same articles were considered for the qualitative analysis against the four stages (validation, verification, evaluation and credibility) reported in O’Keefe’s industrial survey [8] and Benbasat’s article [9]. None of the included systems demonstrated completion of all four stages examined in this qualitative analysis. It is possible that some of the missing stages were performed but not published in the scientific literature.

Validation was performed by almost all the systems (166 out of 169), with varying degrees of study strength, bias, and limitations (Table 9). Most of the data-driven systems (ANN, SVM, BN, kNN and FNM) were validated by ROC and AUC, using a training and validation set, cross-validation or the leave-one-out technique. Samli et al. enhanced the validity of their system by estimating the kappa statistic alongside the ROC [134].
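As an illustration of the leave-one-out technique mentioned above, the sketch below holds out each case in turn, fits on the remaining cases and checks the prediction for the held-out case. A one-nearest-neighbour classifier is used purely as a simple stand-in for the reviewed models, and the one-variable data set is invented.

```python
def nearest_label(train_x, train_y, x):
    """Label of the training case closest to x (1-nearest-neighbour, squared distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_x]
    return train_y[dists.index(min(dists))]


def leave_one_out_accuracy(xs, ys):
    """Hold out each case once, predict it from all the others, return the hit rate."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_label(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Invented data: a single input variable with two well-separated groups.
xs = [[1.0], [1.2], [0.9], [3.0], [3.1], [2.8]]
ys = [0, 0, 0, 1, 1, 1]
accuracy = leave_one_out_accuracy(xs, ys)
print(accuracy)  # 1.0
```

Leave-one-out makes the most of small data sets, which are common in single-centre studies, at the cost of refitting the model once per case.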

Table 9 Qualitative assessment of urological Expert Systems

Evaluation was performed by only a small fraction of these systems (n = 6). It was aimed at the user or the expert but rarely both. There is no evidence that these evaluations were performed at an early stage to determine the substantiality of the system to the user.

System credibility and verification were never performed. It would be implied that the verification was performed to an extent but not reported as it is a technical part of the development.

‘System development limitation and bias evaluation’ demonstrated an overall acceptable validation methodology with valid statistical analysis. However, a few limitations were observed (Table 9), the commonest being the consideration of human opinion as the gold standard (n = 9). For instance, the gold standard in diagnosing prostate cancer is tissue biopsy confirmation. Treating the expert’s clinical diagnosis as the gold standard reference can lead to statistical errors and invalidate the study.

Discussion

Expert Systems are widely available in Urological domains, with a large range of models, applications, domains, and target users including patients, students, non-experts, experts, and researchers. The number of published systems has risen over the years but with a consistent lack of publications reporting their real time testing or healthcare implementation (Fig. 4).

Fig. 4 Expert System (ES) analysis by year of publication showing an upward trend and increase in number of publications. Systems were included according to the keywords for expert system models and applied in urological domains

There is increasing interest in analysing this gap, which is reflected in the scope of historic AI review articles that aimed only to familiarise readers with the existence and application of ES [33, 125]. In fact, the majority had a relatively narrow scope, covering the evolution and application of one ES model (artificial neural network) in prostate cancer diagnosis. Recently, similar to our research, there has been more interest in AI validation and in the lack of uptake despite faith in their ability. Therefore, in this study we quantified ES progression and applications in urology while examining their developmental life cycle.

It was evident that CaP was the commonest domain in almost all applications, contributing more than two thirds of the systems (91 systems in total). Different aspects of this domain have been simulated by these systems, including diagnosis, therapeutics, prediction of disease progression or treatment outcome, research variables and medical image analysis. Most of these systems simulated the urologist’s cognitive function, with little guidance on their benefits and how they can be implemented to improve cancer decision making.

In industry, this is usually addressed before system development by evaluating the system’s usability from the user’s perspective. This step has been lacking or not acknowledged in the published studies and is possibly a core reason for the lack of integration in urological health care. Furthermore, none of these systems has been subjected to live testing in a well-designed study to prove its efficacy over standard tools, or in the clinical context to prove its validity and justify its complex structure to health care professionals who are AI novices. The qualitative analysis demonstrated that validation is the only stage of the development cycle applied by most of the systems, and that there is a lack of system evaluation, credibility, and verification. Evaluation can be subdivided into usability (usually by the average user), utility and system quality (by experts) [9]. Although this is a crucial stage of ES development, the published articles have paid little attention to integrating it into the development life cycle. This can mean that the whole system fails, and it also challenges its uptake [8].

An example can be drawn from this review, where the majority of the systems focused on CaP diagnosis and treatment. Their implementation would be challenged by the standard decision-making tools of the cancer multidisciplinary team and by the ethical concerns of relying on ANN to make such life-changing and expensive decisions. Utility analysis of those ES would have been essential for tailoring their development to real-time applications where they can be more substantial to the user. One example is the lack of community-based systems for the initial referral of suspected cancer patients and follow-up of stable disease, where NICE has identified a need for such decision support models [152, 153].

There was a wide diversity of modelling in urological ES, with ANN being the most common model in this review. ANN bypass the need for direct learning from experts and the exhaustive process of knowledge acquisition, which is a core requirement for knowledge-based systems to attest the whole system progress [55]. However, their analytical hidden layer of nodes (the “black box phenomenon”) has been a subject of wide criticism and rejection from clinicians due to the lack of transparency and understanding of its function.

Stephan et al. suggested a statistical solution for identifying the significance of variables by performing sensitivity analysis [154]. This estimates the variation in the AUC with the introduction or elimination of each variable. It can only reflect the significance of each variable; it does not explain how the cases are being solved, nor does it quantify this for the user as a standard statistical value. It can be useful in research for identifying significant variables in a large data set, and has been successfully applied in academic urology, as in [119], where the system identified the relevant gene signature for bladder cancer progression, saving the time and cost of microarray analysis of all suspected genes.

Holzinger et al. emphasised the importance of the explicability of AI models, especially in medicine, which is a clear challenge for machine learning because of its complex reasoning [155]. Their study attempted to simplify explanation by classifying systems as post-hoc or ante-hoc. In post-hoc systems, explanations are provided for a specific decision, as in model-agnostic frameworks where the black box reasoning can be explained through transparent approximations of the mathematical models and variables [156, 157]. These explanations are reproduced on demand for a specific problem rather than for the whole system, which can shed more light on the system function. It is not certain whether they can be easily interpreted by the AI-novice clinician, but the approach has provided more explicit models for tackling the black box phenomenon.
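
A post-hoc explanation for a single decision can be pictured as a local, transparent approximation of the opaque model around one case. The sketch below is a deliberately simplified illustration, not the frameworks cited in [156, 157]: it assumes a hypothetical `black_box` scoring function and builds a local linear approximation from finite differences.

```python
import math

def black_box(x):
    # Hypothetical opaque model: returns a risk score in (0, 1).
    z = 1.5 * x[0] - 0.8 * x[1] + 0.5 * x[0] * x[1]
    return 1.0 / (1.0 + math.exp(-z))

def local_linear_explanation(f, x, h=1e-4):
    """Estimate the local slope of f for each input by central differences."""
    slopes = []
    for j in range(len(x)):
        up = x[:j] + [x[j] + h] + x[j + 1:]
        down = x[:j] + [x[j] - h] + x[j + 1:]
        slopes.append((f(up) - f(down)) / (2 * h))
    return f(x), slopes

def surrogate(fx, slopes, x, z):
    """Transparent approximation: f(x) + sum_j slope_j * (z_j - x_j)."""
    return fx + sum(s * (zj - xj) for s, zj, xj in zip(slopes, z, x))

case = [0.2, 1.0]                      # the single case being explained
fx, slopes = local_linear_explanation(black_box, case)
nearby = [0.25, 0.95]                  # a slightly different case
approx = surrogate(fx, slopes, case, nearby)
print("score:", round(fx, 4), "local slopes:", [round(s, 4) for s in slopes])
print("surrogate vs model nearby:", round(approx, 4), round(black_box(nearby), 4))
```

The slopes apply only near the chosen case. A different case yields a different explanation, which is why such explanations are reproduced on demand rather than describing the whole system.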

Knowledge-based systems can be explained by ante-hoc models, in which the reasoning of the whole system can be represented. Those systems rely on expert knowledge in their development and face the bottleneck phenomenon in their applications. Furthermore, they are not always successful in identifying and mapping multilinear mathematical rules, for which machine learning is mandatory or at least more efficient [155]. Bologna and Hayashi et al. suggested that machine learning is more successful in complex problem solving, with an inverse relation between the machine's performance and its built-in transparency [158].

Another common aspect lacking in these articles was the coupling of the system development methodology with the medical device registration requirements. This is essential, as ES often function as standalone software with no human supervision of their calculations. This categorises the system as a medical device, with a mandatory prerequisite to register with the relevant authority, such as the Medicines and Healthcare products Regulatory Agency (MHRA) in the UK [5].

Cabitza et al. compared AI validation with that of other medical interventions, such as drugs, and emphasised considering the “software as a medical device” [159]. Unlike other devices or drugs, AI models in healthcare are unique in being more dynamic, which should be reflected in their validation cycle. They also quoted the known term “techno-vigilance” to learn from other medical device validation pathways. They recommended a different outlook on validation, in which it is broken down into statistical (efficacy), relational (usability), pragmatic (effectiveness) and ecological (cost-effectiveness) validation, with available standards for those steps (ISO 5725, ISO 9241 and ISO 14155). The last of these, ecological validation, is viewed as a novel standard for evaluating the cost benefits of applying a specific AI model in healthcare, which would require longitudinal modelling of health economics [159]. This was evidently lacking in the articles included in our review; in fact, most of the studies were non-randomised and retrospective.

Similarly, Nagendran et al. systematically analysed studies that compared AI performance with that of experts in classifying medical imaging as diseased or non-diseased. They concluded that AI performance was non-inferior to that of human experts, with potential for out-performing them [160]. Their 10-year review identified 2 randomised clinical trials and 9 prospective non-randomised trials, extracted from a total of 10 and 81 studies, respectively. The review assessed the risk of bias using the PROBAST (prediction model risk of bias assessment tool) criteria for non-randomised studies. The tool is designed to identify the risk of bias by analysing four domains (participants, predictors, outcome and analysis) [161], and is applicable to systematic reviews analysing prediction models with a target outcome.
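
How four domain judgements might be combined into one overall rating can be sketched as follows. The aggregation rule used here (any domain rated high gives an overall high rating; otherwise any unclear domain gives unclear; otherwise low) is an assumption based on the general PROBAST guidance, not something reported by Nagendran et al., and the function name and rating labels are hypothetical.

```python
DOMAINS = ("participants", "predictors", "outcome", "analysis")
RATINGS = {"low", "high", "unclear"}

def overall_risk_of_bias(ratings):
    """Combine per-domain ratings into one overall judgement (assumed rule)."""
    missing = [d for d in DOMAINS if d not in ratings]
    if missing:
        raise ValueError(f"missing domain ratings: {missing}")
    values = [ratings[d] for d in DOMAINS]
    if any(v not in RATINGS for v in values):
        raise ValueError("ratings must be 'low', 'high' or 'unclear'")
    if "high" in values:
        return "high"       # one high-risk domain is enough
    if "unclear" in values:
        return "unclear"    # no high-risk domain, but at least one unclear
    return "low"            # every domain rated low

# Illustrative study appraisal (made-up ratings).
example = {"participants": "low", "predictors": "low",
           "outcome": "unclear", "analysis": "high"}
print(overall_risk_of_bias(example))
```

Articles that leave checklist items unreported would be rated unclear in most domains under such a rule, which illustrates why a risk of bias tool adds little when reporting gaps are wide.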

In our study, as there was no unified outcome for the included prediction tools, the focus was on the role of validation rather than the outcome. Therefore, tools for assessing the risk of bias were not utilised, because the included articles left wide gaps against the tool checklist. Such heterogeneity in study design and data was also evident in Nagendran et al. and, as in our study, data synthesis was not possible. This poses a challenge to reinforcing the application of AI models in healthcare, because of the lack of level 1 evidence, which is mandatory in healthcare for accepting a novel intervention.

Finally, the quality of the data analysis was beyond the scope of our systematic review, despite being essential for developing quality AI systems. Cabitza et al. examined this gap and focused on data governance [161]. There has been very limited evidence on data quality appraisal and standards, with calls for further research and the allocation of more resources, especially in healthcare, where the data are notoriously limited by errors or discordance.

The potential application of AI in urology, with a focus on its future application, has recently been discussed by Eminaga et al. [162]. They reported an increasing interest in AI in urology research, but with a challenged mechanistic update due to the model complexity and the end user's lack of understanding of its design and function. Furthermore, they identified a discrepancy between AI engineering and clinical application, which reflects some lack of communication between the two disciplines.

This can be either a consequence or a cause of the lack of clinical utility testing, which increases the need for research in this domain to be incorporated into software development [163]. In fact, it has been recommended to perform the utility test before developing the system, in order to tailor its application [164, 165]. Although those studies used a different methodology from our systematic review, their recommendations were similar, with strong emphasis on the lack of utility testing and its impact on AI uptake in healthcare [166,167,168].

Conclusion

ES have been advancing in urology with demonstrated versatility and efficacy. They have suffered from a lack of formality in their development, testing and methodology for registration, which has limited their uptake. Future research is recommended to identify criteria for successful functional domain applications and knowledge engineering, and to integrate system development with the registration requirements for future implementation in health care systems.

Availability of data and material

For access to data and supporting materials, please contact the authors.

Abbreviations

Ac: Accuracy
AI: Artificial intelligence
ANN: Artificial neural networks
AP: Acute prostatitis
Bca: Bladder cancer
BC: Backward chaining
BCF: Biochemical failure
BCG: Bacille Calmette–Guérin
BP: Back propagation neural network
BPD: Benign prostatic disease
BPH: Benign prostatic hyperplasia
CAD: Computer aided diagnosis
CBR: Case based reasoning
CP: Chronic prostatitis
CV: Cross validation
Dom: Domain
DRE: Digital rectal exam
ED: Erectile dysfunction
ES: Expert Systems
FC: Forward chaining
Fert: Fertility
FH: Family history
FLS: Fuzzy logic systems
F-ONT: Fuzzy ontology
FNM: Fuzzy neural modelling
FRB: Fuzzy rule-based systems
FSH: Follicle-stimulating hormone level
GA: Genetic algorithm
Gl: Gleason score
Hgon: Hypogonadism
Hk11: Human kallikrein 11
Incont: Incontinence
IS: Information systems
ISS: Irritative symptoms
IT: Information technology
IUPA: Infundibular ureteropelvic angle
IW: Infundibular width
KA: Knowledge acquisition
KMSP: Kaplan Meier Survival Plot
KE: Knowledge engineer
Lap: Laparoscopy
LH: Luteinising hormone level
LOO: Leave one out
LUT: Lower urinary tract
LVQ: Learning vector quantizer
MIC-1: Macrophage inhibitory cytokine-1
MIF: Macrophage inhibitory factor
MH: Medical history
ML: Machine learning
MHRA: Medicines and Healthcare products Regulatory Agency
Mdl: Model
Nep: Nephrectomy
Nlt: Nephrolithiasis
NICE: National Institute for Health and Care Excellence
Nomo: Nomogram
NPV: Negative predictive value
Nsc: Non-seminoma testicular cancer
Oss: Obstructive symptoms
Pop: Pelvic organ prolapse
Pca: Prostate cancer
PPV: Positive predictive value
PRL: Prolactin level
PSAd: PSA density
PSAv: PSA velocity
PVR: Post void residual
Qmax: Maximum flow rate
RA: Requirement analysis
RBR: Rule based reasoning
RC: Radical cystectomy
RCC: Renal cell carcinoma
Recur: Recurrence
Res: Response
ROC: Receiver operating characteristic
RP: Radical prostatectomy
Sc: Single centre
Se: Sensitivity
SPC: Stable prostate cancer
Sp: Specificity
tPSA: Total PSA
TPV: Total prostatic volume
TRUS: Trans rectal ultrasound scan
TT: Total Testosterone
TZD: Transitional zone PSA density
TZV: Transitional zone volume
U Dyn: Urodynamic study
U Dys: Urinary dysfunction
UTI: Urinary tract infection
V&V: Verification and validation
VU rflx: Vesico-ureteric reflux
%fPSA: Percentage free/total PSA
%p2PSA: Percentage p2PSA/fPSA
p2PSA: -2 ProPSA
U incont: Urinary incontinence

References

  1. McCarthy J, Minsky ML, Shannon CE. A proposal for the Dartmouth summer research project on artificial intelligence—August 31, 1955. Ai Mag. 2006;27(4):12–4.

  2. Turing A. Computing machinery and intelligence. In: Epstein R, Roberts G, Beber G, editors. Parsing the turing test. Netherlands: Springer; 2009. p. 23–65.

  3. Shortliffe EH, et al. Computer as a consultant for selection of antimicrobial therapy for patients with bacteremia. Clin Res. 1975;23(3):A385.

  4. Jackson P. Introduction to expert systems. Boston: Addison-Wesley; 1999.

  5. Liao SH. Expert system methodologies and applications—a decade review from 1995 to 2004. Expert Syst Appl. 2005;28(1):93–103.

  6. Ammenwerth E, et al. Clinical decision support systems: need for evidence, need for evaluation. Artif Intell Med. 2013;59(1):1–3.

  7. Garg AX, et al. Effects of computerized clinical decision support systems on practitioner performance and patient outcomes—a systematic review. J Am Med Assoc: JAMA. 2005;293(10):1223–38.

  8. Kawamoto K, et al. Improving clinical practice using clinical decision support systems: a systematic review of trials to identify features critical to success. BMJ. 2005;330(7494):765.

  9. O'Keefe RM, O'Leary DE. Expert system verification and validation: a survey and tutorial. Artif Intell Rev. 1993;7(1):3–42.

  10. Benbasat I, Dhaliwal JS. A framework for the validation of knowledge acquisition. Knowl Acquis. 1989;1(2):215–33.

  11. Pandey B, Mishra RB. Knowledge and intelligent computing system in medicine. Comput Biol Med. 2009;39(3):215–30.

  12. Koutsojannis C, et al. Using machine learning techniques to improve the behaviour of a medical decision support system for prostate diseases. In: 2009 9th international conference on intelligent systems design and applications. 2009. p. 341–6.

  13. Petrovic S, Mishra N, Sundar S. A novel case based reasoning approach to radiotherapy planning. Expert Syst Appl. 2011;38(9):10759–69.

  14. Keles A, et al. Neuro-fuzzy classification of prostate cancer using NEFCLASS-J. Comput Biol Med. 2007;37(11):1617–28.

  15. Gorman R. Expert system for management of urinary incontinence in women. In: Proceedings of the annual symposium on computer application in medical care. 1995. p. 527–31.

  16. Hao ATH, et al. Nursing process decision support system for urology ward. Int J Med Inform. 2013;82(7):604–12.

  17. Lopes M, et al. Fuzzy cognitive map in differential diagnosis of alterations in urinary elimination: a nursing approach. Int J Med Inform. 2013;82(3):201–8.

  18. Petrucci K, et al. Evaluation of UNIS: urological nursing information systems. In: Proceedings of the annual symposium on computer application [sic] in medical care. Symposium on computer applications in medical care. 1991.

  19. Boyington AR, et al. Development of a computer-based system for continence health promotion. Nurs Outlook. 2004;52(5):241–7.

  20. Koutsojannis C, Lithari C, Hatzilygeroudis I. Managing urinary incontinence through hand-held real-time decision support aid. Comput Methods Programs Biomed. 2012;107(1):84–9.

  21. Sucevic D, Ilic I. Uncertain knowledge processing in urology diagnostic problems based Expert System. In: 6th Mediterranean electrotechnical conference, proceedings vols 1 and 2. 1991. p. 741–3.

  22. Altunay S, et al. A new approach to urinary system dynamics problems: evaluation and classification of uroflowmeter signals using artificial neural networks. Expert Syst Appl. 2009;36(3):4891–5.

  23. Gil D, Johnsson M. Using support vector machines in diagnoses of urological dysfunctions. Expert Syst Appl. 2010;37(6):4713–8.

  24. Koutsojannis C, Tsimara M, Nabil E. HIROFILOS: a medical expert system for prostate diseases. In: Zaharim A, Mastorakis N, Gonos I, editors. Proceedings of the 7th Wseas international conference on computational intelligence, man-machine systems and cybernetics. 2008. p. 254–9.

  25. Pereira M, Schaefer M, Marques JB. Remote expert system of support the prostate cancer diagnosis. In: Conference proceedings of the annual international conference of the IEEE engineering in medicine and biology society. IEEE engineering in medicine and biology society. Conference, vol 5. 2004. p. 3412–5.

  26. Torshizi AD, et al. A hybrid fuzzy-ontology based intelligent system to determine level of severity and treatment recommendation for Benign Prostatic Hyperplasia. Comput Methods Programs Biomed. 2014;113(1):301–13.

  27. Binik YM, et al. Intelligent computer-based assessment and psychotherapy - an expert system for sexual dysfunction. J Nerv Ment Dis. 1988;176(7):387–400.

  28. Beligiannis G, et al. A GA driven intelligent system for medical diagnosis. In: Knowledge-based intelligent information and engineering systems, Pt 1, proceedings, vol 4251. 2006. p. 968–75.

  29. Koutsojannis C, Hatzilygeroudis L. FESMI: a fuzzy expert system for diagnosis and treatment of male impotence. In: Knowledge-based intelligent information and engineering systems, Pt 2, proceedings, vol 3214. 2004. p. 1106–13.

  30. Papageorgiou EI. Fuzzy cognitive map software tool for treatment management of uncomplicated urinary tract infection. Comput Methods Programs Biomed. 2012;105(3):233–45.

  31. Arlen AM, Alexander SE, Wald M, Cooper CS. Computer model predicting breakthrough febrile urinary tract infection in children with primary vesicoureteral reflux. J Pediatr Urol. 2016;12(5):288.e1–5.

  32. Goyal NK, et al. Prediction of biochemical failure in localized carcinoma of prostate after radical prostatectomy by neuro-fuzzy. Indian J Urol. 2007;23(1):14–7.

  33. Ronco AL, Fernandez R. Improving ultrasonographic diagnosis of prostate cancer with neural networks. Ultrasound Med Biol. 1999;25(5):729–33.

  34. Babaian RJ, et al. Performance of a neural network in detecting prostate cancer in the prostate-specific antigen reflex range of 2.5 to 4.0 ng/ml. Urology. 2000;56(6):1000–6.

  35. Finne P, et al. Predicting the outcome of prostate biopsy in screen-positive men by a multilayer perceptron network. Urology. 2000;56(3):418–22.

  36. Stephan C, et al. Multicenter evaluation of an artificial neural network to increase the prostate cancer detection rate and reduce unnecessary biopsies. Clin Chem. 2002;48(8):1279–87.

  37. Djavan B, et al. Novel artificial neural network for early detection of prostate cancer. J Clin Oncol. 2002;20(4):921–9.

  38. Remzi M, et al. An artificial neural network to predict the outcome of repeat prostate biopsies. Urology. 2003;62(3):456–60.

  39. Kalra P, et al. A neurocomputational model for prostate carcinoma detection. Cancer. 2003;98(9):1849–54.

  40. Saritas I, Allahverdi N, Sert IU. A fuzzy expert system design for diagnosis of prostate cancer. In: Proceedings of the 4th international conference on computer systems and technologies: e-Learning (CompSysTech '03). New York: Association for Computing Machinery; 2003. p. 345–51.

  41. Matsui Y, et al. The use of artificial neural network analysis to improve the predictive accuracy of prostate biopsy in the Japanese population. Jpn J Clin Oncol. 2004;34(10):602–7.

  42. Porter CR, et al. Model to predict prostate biopsy outcome in large screening population with independent validation in referral setting. Urology. 2005;65(5):937–41.

  43. Lee HJ, et al. Role of transrectal ultrasonography in the prediction of prostate cancer—artificial neural network analysis. J Ultrasound Med. 2006;25(7):815–21.

  44. Benecchi L. Neuro-fuzzy system for prostate cancer diagnosis. Urology. 2006;68(2):357–61.

  45. Stephan C, et al. Comparison of two different artificial neural networks for prostate biopsy indication in two different patient populations. Urology. 2007;70(3):596–601.

  46. Kawakami S, et al. Development, validation, and head-to-head comparison of logistic regression-based nomograms and artificial neural network models predicting prostate cancer on initial extended biopsy. Eur Urol. 2008;54(3):601–11.

  47. Stephan C, et al. A -2 proPSA-based artificial neural network significantly improves differentiation between prostate cancer and benign prostatic diseases. Prostate. 2009;69(2):198–207.

  48. Lee HJ, et al. Image-based clinical decision support for transrectal ultrasound in the diagnosis of prostate cancer: comparison of multiple logistic regression, artificial neural network, and support vector machine. Eur Radiol. 2010;20(6):1476–84.

  49. Meijer RP, et al. The value of an artificial neural network in the decision-making for prostate biopsies. World J Urol. 2009;27(5):593–8.

  50. Saritas I, Ozkan IA, Sert IU. Prognosis of prostate cancer by artificial neural networks. Expert Syst Appl. 2010;37(9):6646–50.

  51. Lawrentschuk N, et al. Predicting prostate biopsy outcome: artificial neural networks and polychotomous regression are equivalent models. Int Urol Nephrol. 2011;43(1):23–30.

  52. Ecke TH, et al. Outcome prediction for prostate cancer detection rate with artificial neural network (ANN) in daily routine. Urol Oncol Semin Orig Investig. 2012;30(2):139–44.

  53. Filella X, et al. The influence of prostate volume in prostate health index performance in patients with total PSA lower than 10 μg/L. Clin Chim Acta. 2014;436:303–7.

  54. Yuksel, et al. Application of soft sets to diagnose the prostate cancer risk. J Inequal Appl. 2013;2013:229.

  55. Samli MM, Dogan I. An artificial neural network for predicting the presence of spermatozoa in the testes of men with nonobstructive azoospermia. J Urol. 2004;171(6, Part 1):2354–7.

  56. Powell CR, et al. Computational models for detection of endocrinopathy in subfertile males. Int J Impot Res. 2007;20(1):79–84.

  57. Ramasamy R, et al. A comparison of models for predicting sperm retrieval before microdissection testicular sperm extraction in men with nonobstructive azoospermia. J Urol. 2013;189(2):638–42.

  58. Paya AS, et al. Development of an artificial neural network for helping to diagnose diseases in urology. In: Proceedings of the 1st international conference on bio inspired models of network, information and computing systems (BIONETICS '06). New York: Association for Computing Machinery; 2006. p. 9–es.

  59. Gil D, et al. Application of artificial neural networks in the diagnosis of urological dysfunctions. Expert Syst Appl. 2009;36(3):5754–60.

  60. Wadie BS, et al. Application of artificial neural network in prediction of bladder outlet obstruction: a model based on objective, noninvasive parameters. Urology. 2006;68(6):1211–4.

  61. Wadie BS, Badawi AM, Ghoneim MA. The relationship of the international prostate symptom score and objective parameters for diagnosing bladder outlet obstruction. Part II: the potential usefulness of artificial neural networks. J Urol. 2001;165(1):35–7.

  62. Tewari A, Narayan P. Novel staging tool for localized prostate cancer: a pilot study using genetic adaptive neural networks. J Urol. 1998;160(2):430–6.

  63. Chang PL, et al. Evaluation of a decision-support system for preoperative staging of prostate cancer. Med Decis Making. 1999;19(4):419–27.

  64. Batuello JT, et al. Artificial neural network model for the assessment of lymph node spread in patients with clinically localized prostate cancer. Urology. 2001;57(3):481–5.

  65. Han M, et al. Evaluation of artificial neural networks for the prediction of pathologic stage in prostate carcinoma. Cancer. 2001;91(8):1661–6.

  66. Mattfeldt T, et al. Prediction of postoperative prostatic cancer stage on the basis of systematic biopsies using two types of artificial neural networks. Eur Urol. 2001;39(5):530–6.

  67. Matsui Y, et al. Artificial neural network analysis for predicting pathological stage of clinically localized prostate cancer in the Japanese population. Jpn J Clin Oncol. 2002;32(12):530–5.

  68. Zlotta AR, et al. An artificial neural network for prostate cancer staging when serum prostate specific antigen is 10 ng/ml or less. J Urol. 2003;169(5):1724–8.

  69. Chiu JS, et al. Artificial neural network to predict skeletal metastasis in patients with prostate cancer. J Med Syst. 2009;33(2):91–100.

  70. Kim SY, et al. Pre-operative prediction of advanced prostatic cancer using clinical decision support systems: accuracy comparison between support vector machine and artificial neural network. Korean J Radiol. 2011;12(5):588–94.

  71. Regnier-Coudert O, et al. Machine learning for improved pathological staging of prostate cancer: a performance comparison on a range of classifiers. Artif Intell Med. 2012;55(1):25–35.

  72. Veltri RW, et al. Comparison of logistic regression and neural net modeling for prediction of prostate cancer pathologic stage. Clin Chem. 2002;48(10):1828–34.

  73. Cosma G, et al. Prediction of pathological stage in patients with prostate cancer: a neuro-fuzzy model. PLoS ONE. 2016;11(6):e0155856.

  74. Moul JW, et al. Neural-network analysis of quantitative histological factors to predict pathological stage in clinical stage-I nonseminomatous testicular cancer. J Urol. 1995;153(5):1674–7.

  75. Poulakis V, et al. Prediction of clearance of inferior caliceal calculi with extracorporeal shock wave lithotripsy. Using an artificial neural network analysis. Urol A. 2002;41(6):583–95.

  76. Hamid A, et al. Artificial neural networks in predicting optimum renal stone fragmentation by extracorporeal shock wave lithotripsy: a preliminary study. BJU Int. 2003;91(9):821–4.

  77. Gomha MA, et al. Can we improve the prediction of stone-free status after extracorporeal shock wave lithotripsy for ureteral stones? A neural network or a statistical model? J Urol. 2004;172(1):175–9.

  78. Michaels EK, et al. Use of a neural network to predict stone growth after shock wave lithotripsy. Urology. 1998;51(2):335–8.

  79. Naguib RNG, et al. Neural network analysis of combined conventional and experimental prognostic markers in prostate cancer: a pilot study. Br J Cancer. 1998;78(2):246–50.

  80. Potter SR, et al. Genetically engineered neural networks for predicting prostate cancer progression after radical prostatectomy. Urology. 1999;54(5):791–5.

  81. Porter C, et al. Artificial neural network model to predict biochemical failure after radical prostatectomy. Mol Urol. 2001;5(4):159–62.

  82. Seker H, et al. A fuzzy logic based-method for prognostic decision making in breast and prostate cancers. IEEE Trans Inf Technol Biomed. 2003;7(2):114–22.

  83. Poulakis V, et al. Preoperative neural network using combined magnetic resonance imaging variables, prostate-specific antigen, and Gleason score to predict positive surgical margins. Urology. 2004;64(3):516–21.

  84. Poulakis V, et al. Preoperative neural network using combined magnetic resonance imaging variables, prostate specific antigen and Gleason score to predict prostate cancer stage. J Urol. 2004;172(4):1306–10.

  85. de Paula Castanho MJ, et al. Fuzzy expert system: an example in prostate cancer. Appl Math Comput. 2008;202(1):78–85.

  86. Botoca C, et al. Prediction of prostate capsule penetration using neural networks. In: Proceedings of the 8th Wseas international conference on computational intelligence, man-machine systems and cybernetics (Cimmacs '09). 2009. p. 108–11.

  87. Castanho MJP, et al. Fuzzy expert system for predicting pathological stage of prostate cancer. Expert Syst Appl. 2013;40(2):466–70.

  88. Hu XH, et al. Risk prediction models for biochemical recurrence after radical prostatectomy using prostate-specific antigen and Gleason score. Asian J Androl. 2014;16(6):897–901.

  89. Tewari A, et al. Genetic adaptive neural network to predict biochemical failure after radical prostatectomy: a multi-institutional study. Mol Urol. 2001;5(4):163–9.

  90. Borque A, et al. The use of neural networks and logistic regression analysis for predicting pathological stage in men undergoing radical prostatectomy: a population based study. J Urol. 2001;166(5):1672–8.

  91. Tsao CW, et al. Artificial neural network for predicting pathological stage of clinically localized prostate cancer in a Taiwanese population. J Chin Med Assoc. 2014;77(10):513–8.

  92. Cummings JM, et al. Prediction of spontaneous ureteral calculous passage by an artificial neural network. J Urol. 2000;164(2):326–8.

  93. Dal Moro F, et al. A novel approach for accurate prediction of spontaneous passage of ureteral stones: support vector machines. Kidney Int. 2006;69(1):157–60.

  94. Sun CC, Chang P. Prediction of unexpected emergency room visit after extracorporeal shock wave lithotripsy for urolithiasis - an application of artificial neural network in hospital information system. AMIA Annu Symp Proc. 2006;2006:1113.

  95. Bagli DJ, et al. Artificial neural networks in pediatric urology: Prediction of sonographic outcome following pyeloplasty. J Urol. 1998;160(3):980–3.

  96. Seçkiner I, et al. Use of artificial neural networks in the management of antenatally diagnosed ureteropelvic junction obstruction. Can Urol Assoc J. 2011;5(6):E152.

  97. Parekattil SJ, et al. Multi-institutional validation study of neural networks to predict duration of stay after laparoscopic radical/simple or partial nephrectomy. J Urol. 2005;174(4):1380–4.

  98. Vukicevic AM, et al. Evolutionary assembled neural networks for making medical decisions with minimal regret: application for predicting advanced bladder cancer outcome. Expert Syst Appl. 2014;41(18):8092–100.

  99. Serrano-Durba A, et al. The use of neural networks for predicting the result of endoscopic treatment for vesico-ureteric reflux. BJU Int. 2004;94(1):120–2.

  100. Naguib RNG, Qureshi KN, Hamdy FC, Neal DE. Neural network analysis of prognostic markers in bladder cancer. In: Proceedings of the 19th annual international conference of the IEEE engineering in medicine and biology society. Magnificent milestones and emerging opportunities in medical engineering, vol 3. 1997. p. 1007–9.

  101. Qureshi KN, et al. Neural network analysis of clinicopathological and molecular markers in bladder cancer. J Urol. 2000;163(2):630–3.

  102. Fujikawa K, et al. Predicting disease outcome of non-invasive transitional cell carcinoma of the urinary bladder using an artificial neural network model: results of patient follow-up for 15 years or longer. Int J Urol. 2003;10(3):149–52.

  103. Catto JWF, et al. Artificial intelligence in predicting bladder cancer outcome: a comparison of neuro-fuzzy modeling and artificial neural networks. Clin Cancer Res. 2003;9(11):4172–7.

  104. Abbod MF, et al. Artificial intelligence for the prediction of bladder cancer. Biomed Eng Appl Basis Commun. 2004;16(02):49–58.

  105. Catto JWF, et al. Neuro-fuzzy modeling: an accurate and interpretable method for predicting bladder cancer progression. J Urol. 2006;175(2):474–9.

  106. Cai T, et al. Artificial intelligences in urological practice: the key to success? Ann Oncol. 2007;18(3):604-U10.

  107. Bassi P, et al. Prognostic accuracy of an artificial neural network in patients undergoing radical cystectomy for bladder cancer: a comparison with logistic regression analysis. BJU Int. 2007;99(5):1007–12.

  108. Catto JWF, et al. Neurofuzzy Modeling to determine recurrence risk following radical cystectomy for nonmetastatic urothelial carcinoma of the bladder. Clin Cancer Res. 2009;15(9):3150–5.

  109. El-Mekresh M, et al. Prediction of survival after radical cystectomy for invasive bladder carcinoma: risk group stratification, nomograms or artificial neural networks? J Urol. 2009;182(2):466–72.

  110. Kolasa M, et al. Application of artificial neural network to predict survival time for patients with bladder cancer. Comput Med Act. 2009;65:113–22.

  111. Buchner A, et al. Prediction of outcome in patients with urothelial carcinoma of the bladder following radical cystectomy using artificial neural networks. Eur J Surg Oncol. 2013;39(4):372–9.

  112. Wang G, et al. Prediction of mortality after radical cystectomy for bladder cancer by machine learning techniques. Comput Biol Med. 2015;63:124–32.

  113. Cai T, et al. Artificial intelligence for predicting recurrence-free probability of non-invasive high-grade urothelial bladder cell carcinoma. Oncol Rep. 2007;18(4):959–64.

  114. Buchner A, et al. Outcome assessment of patients with metastatic renal cell carcinoma under systemic therapy using artificial neural networks. Clin Genitourin Cancer. 2012;10(1):37–42.

  115. Marszall MP, et al. ANN as a prognostic tool after treatment of non-seminoma testicular cancer. Cent Eur J Med. 2012;7(5):672–9.

  116. Kuo R-J, et al. Application of a two-stage fuzzy neural network to a prostate cancer prognosis system. Artif Intell Med. 2015;63(2):119–33.

  117. Tanthanuch M, Tanthanuch S. Prediction of upper urinary tract calculi using an artificial neural network. J Med Assoc Thai. 2004;87(5):515–8.

  118. Cancer Research UK, https://www.cancerresearchuk.org/health-professional/cancer-statistics/statistics-by-cancer-type/bladder-cancer#heading-Zero. Accessed May 2021.

  119. Herr HW, et al. Defining optimal therapy for muscle invasive bladder cancer. J Urol. 2007;177(2):437–43.

  120. von der Maase H, et al. Long-term-survival results of a randomized trial comparing gemcitabine plus cisplatin, with methotrexate, vinblastine, doxorubicin, plus cisplatin in patients with bladder cancer (Retracted article See vol. 16, pg. 1481, 2011). J Clin Oncol. 2005;23(21):4602–8.

  121. Krongrad A, et al. Predictors of general quality of life in patients with benign prostate hyperplasia or prostate cancer. J Urol. 1997;157(2):534–8.

  122. Han M, et al. A neural network predicts progression for men with Gleason score 3+4 versus 4+3 tumors after radical prostatectomy. Urology. 2000;56(6):994–9.

  123. Parekattil SJ, Fisher HAG, Kogan BA. Neural network using combined urine nuclear matrix protein-22, monocyte chemoattractant protein-1 and urinary intercellular adhesion molecule-1 to detect bladder cancer. J Urol. 2003;169(3):917–20.

  124. Djavan B, et al. Longitudinal study of men with mild symptoms of bladder outlet obstruction treated with watchful waiting for four years. Urology. 2004;64(6):1144–8.

  125. Kshirsagar A, et al. Predicting hypogonadism in men based upon age, presence of erectile dysfunction, and depression. Int J Impot Res. 2006;18(1):47–51.

  126. Stephan C, et al. Clinical utility of human glandular kallikrein 2 within a neural network for prostate cancer detection. BJU Int. 2005;96(4):521–7.

  127. Abbod MF, et al. Artificial intelligence technique for gene expression profiling of urinary bladder cancer. In: 2006 3rd international IEEE conference on intelligent systems. 2006.

  128. Stephan C, et al. A (-5,-7) ProPSA based artificial neural network to detect prostate cancer. Eur Urol. 2006;50(5):1014–20.

  129. Stephan C, et al. Improved prostate cancer detection with a human kallikrein 11 and percentage free PSA-based artificial neural network. Biol Chem. 2006;387(6):801–5.

  130. Stephan C, et al. An artificial neural network for five different assay systems of prostate-specific antigen in prostate cancer diagnostics. BJU Int. 2008;102(7):799–805.

  131. Cinar M, et al. Early prostate cancer diagnosis by using artificial neural networks and support vector machines. Expert Syst Appl. 2009;36(3):6357–61.

  132. Stephan C, et al. Internal validation of an artificial neural network for prostate biopsy outcome. Int J Urol. 2010;17(1):62–8.

  133. Catto JWF, et al. The application of artificial intelligence to microarray data: identification of a novel gene signature to identify bladder cancer progression. Eur Urol. 2010;57(3):398–406.

  134. Serati M, et al. Urinary symptoms and urodynamic findings in women with pelvic organ prolapse: is there a correlation? results of an artificial neural network analysis. Eur Urol. 2011;60(2):253–60.

  135. Gil D, et al. Predicting seminal quality with artificial intelligence methods. Expert Syst Appl. 2012;39(16):12564–73.

  136. Girela JL, et al. Semen parameters can be predicted from environmental factors and lifestyle using artificial intelligence methods. Biol Reprod. 2013;88(4):99–1.

  137. Stephan C, et al. Multicenter evaluation of [-2]proprostate-specific antigen and the prostate health index for detecting prostate cancer. Clin Chem. 2013;59(1):306–14.

  138. Cai T, et al. Clinical importance of lymph node density in predicting outcome of prostate cancer patients. J Surg Res. 2011;167(2):267–72.

  139. Kim M, et al. Factors influencing nonabsolute indications for surgery in patients with lower urinary tract symptoms suggestive of benign prostatic hyperplasia: analysis using causal Bayesian networks. Int Neurourol J. 2014;18(4):198–205.

  140. Green WJF, et al. KI67 and DLX2 predict increased risk of metastasis formation in prostate cancer-a targeted molecular approach. Br J Cancer. 2016;115(2):236–42.

  141. Logvinenko T, Chow JS, Nelson CP. Predictive value of specific ultrasound findings when used as a screening test for abnormalities on VCUG. J Pediatr Urol. 2015;11(4):176.e1-176.e7.

  142. Wells DM, Niederer J. A medical expert system approach using artificial neural networks for standardized treatment planning. Int J Radiat Oncol Biol Phys. 1998;41(1):173–82.

  143. Loch T, et al. Artificial neural network analysis (ANNA) of prostatic transrectal ultrasound. Prostate. 1999;39(3):198–204.

  144. Mattfeldt T, et al. Prediction of prostatic cancer progression after radical prostatectomy using artificial neural networks: a feasibility study. BJU Int. 1999;84(3):316–23.

  145. Llobet R, et al. Computer-aided detection of prostate cancer. Int J Med Inform. 2007;76(7):547–56.

  146. Hassanien AE, Al-Qaheri H, El-Dahshan ESA. Prostate boundary detection in ultrasound images using biologically-inspired spiking neural network. Appl Soft Comput. 2011;11(2):2035–41.

  147. Matulewicz L, et al. Anatomic segmentation improves prostate cancer detection with artificial neural networks analysis of 1H magnetic resonance spectroscopic imaging. J Magn Reson Imaging. 2014;40(6):1414–21.

  148. Gatidis S, et al. Combined unsupervised–supervised classification of multiparametric PET/MRI data: application to prostate cancer. NMR Biomed. 2015;28(7):914–22.

  149. Pantazopoulos D, et al. Comparing neural networks in the discrimination of benign from malignant lower urinary tract lesions. Br J Urol. 1998;81(4):574–9.

  150. Xiao D, et al. 3D detection and extraction of bladder tumors via MR virtual cystoscopy. Int J Comput Assist Radiol Surg. 2016;11(1):89–97.

  151. Hurst RE, et al. Neural net-based identification of cells expressing the p300 tumor-related antigen using fluorescence image analysis. Cytometry. 1997;27(1):36–42.

  152. Volmer M, et al. Artificial neural-network predictions of urinary calculus compositions analyzed with infrared-spectroscopy. Clin Chem. 1994;40(9):1692–7.

  153. Pantazopoulos D, et al. Back propagation neural network in the discrimination of benign from malignant lower urinary tract lesions. J Urol. 1998;159(5):1619–23.

  154. Lamb DJ, Niederberger CS. Artificial-intelligence in medicine and male-infertility. World J Urol. 1993;11(2):129–36.

  155. Holzinger A, et al. Causability and explainability of artificial intelligence in medicine. Wiley Interdiscip Rev Data Min Knowl Discov. 2019;9(4):e1312.

  156. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44.

  157. Lakkaraju H, Kamar E, Caruana R, Leskovec J. Interpretable and explorable approximations of black box models. 2017. arXiv:1707.01154.

  158. Bologna G, Hayashi Y. Characterization of symbolic rules embedded in deep DIMLP networks: a challenge to transparency of deep learning. J Artif Intell Soft Comput Res. 2017;7(4):265–86.

  159. Cabitza F, Zeitoun JD. The proof of the pudding: in praise of a culture of real-world validation for medical artificial intelligence. Ann Transl Med. 2019;7(8):161.

  160. Nagendran M, et al. Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies. BMJ. 2020;368:m689.

  161. Moons KGM, et al. PROBAST: a tool to assess risk of bias and applicability of prediction model studies: explanation and elaboration. Ann Intern Med. 2019;170(1):W1–33.

  162. Cabitza F, Campagner A, Balsano C. Bridging the “last mile” gap between AI implementation and operation: “data awareness” that matters. Ann Transl Med. 2020;8(7):501.

  163. Kattan MW, Cowen ME, Miles BJ. Computer modeling in urology. Urology. 1996;47(1):14–21.

  164. Eminaga O, Liao JC. Chapter 16—prospect and adversity of artificial intelligence in urology. In: Xing L, Giger ML, Min JK, editors. Artificial intelligence in medicine. London: Academic Press; 2021. p. 309–37.

  165. Chang TC, et al. Current trends in artificial intelligence application for endourology and robotic surgery. Urol Clin N Am. 2021;48(1):151–60.

  166. NICE. Prostate cancer: diagnosis and treatment CG175. National Institute for Health and Care Excellence. 2014.

  167. NICE. Prostate cancer: diagnosis and treatment CG28. 2008.

  168. Eminaga O, et al. Diagnostic classification of cystoscopic images using deep convolutional neural networks. JCO Clin Cancer Inform. 2018;2:1–8.

Acknowledgements

Not applicable.

Funding

No sources of funding or any form of financial support to disclose.

Author information

Contributions

All listed authors contributed sufficiently to take responsibility for the whole content of the manuscript, following the ICMJE criteria for authorship rights and responsibilities. HS: conceptualisation, literature review, data curation, formal analysis, methodology, and original writing, review, and editing. DS and JNL: supervision, and writing review and editing. AA: field investigation, validation, and draft review and editing. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Amir Awwad.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests, and no exclusive licenses were used in preparing this manuscript. The authors indicated no potential conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Salem, H., Soria, D., Lund, J.N. et al. A systematic review of the applications of Expert Systems (ES) and machine learning (ML) in clinical urology. BMC Med Inform Decis Mak 21, 223 (2021). https://doi.org/10.1186/s12911-021-01585-9

  • DOI: https://doi.org/10.1186/s12911-021-01585-9