A systematic review of the applications of Expert Systems (ES) and machine learning (ML) in clinical urology

Salem, Hesham; Soria, Daniele; Lund, Jonathan N.; Awwad, Amir

doi:10.1186/s12911-021-01585-9

Research
Open access
Published: 22 July 2021

BMC Medical Informatics and Decision Making volume 21, Article number: 223 (2021)

Abstract

Background

Testing a hypothesis for ‘factors-outcome effect’ is a common quest, but standard statistical regression analysis tools are rendered ineffective by data contaminated with too many noisy variables. Expert Systems (ES) can provide an alternative methodology in analysing data to identify variables with the highest correlation to the outcome. By applying their effective machine learning (ML) abilities, significant research time and costs can be saved. The study aims to systematically review the applications of ES in urological research and their methodological models for effective multi-variate analysis. Their domains, development and validity will be identified.

Methods

The PRISMA methodology was applied to formulate an effective method for data gathering and analysis. The search covered the seven most relevant information sources: WEB OF SCIENCE, EMBASE, BIOSIS CITATION INDEX, SCOPUS, PUBMED, Google Scholar and MEDLINE. Articles were eligible if they applied one of the known ML models to a clear urological research question involving multivariate analysis. Only articles with pertinent research methods in ES models were included. The analysed data included the system model, applications, input/output variables, target user, validation, and outcomes. Both ML models and the variable analysis were comparatively reported for each system.

Results

The search identified n = 1087 articles from all databases, of which n = 712 were eligible for examination against the inclusion criteria. A total of 168 systems were finally included and systematically analysed, demonstrating a recent increase in the uptake of ES in academic urology, in particular artificial neural networks with 31 systems. Most of the systems were applied in urological oncology (prostate cancer = 15, bladder cancer = 13), where diagnostic, prognostic and survival predictor markers were investigated. Due to the heterogeneity of models and their statistical tests, a meta-analysis was not feasible.

Conclusion

ES offer effective ML potential, and their applications in research have demonstrated a valid model for multi-variate analysis. The complexity of their development can challenge their uptake in urological clinics, whilst the limitations of the statistical tools in this domain have created a gap for further research studies. Integration of computer scientists in academic units has promoted the use of ES in clinical urological research.

Introduction

In the 1950s, J McCarthy at Stanford University and A Turing at Cambridge University proposed the concept of machine simulation of human learning and intelligence [1, 2]. Being keen mathematicians, they advanced basic mathematical logic into programming languages, enabling machines to perform more complex functions. E Shortliffe advanced those systems to develop MYCIN, which successfully simulated the reasoning of a human microbiologist in diagnosing and treating patients with microbial infection [3]. This model introduced Expert Systems (ES) to the scientific literature, and a ten-year review by Liao et al. demonstrated their wide prevalence in industrial fields with immense applications, including health care [4]. In contrast to Liao’s review, other studies questioned their real time implementation in health care and suggested a lack of uptake and integration in health care systems [5]. This is despite evidence from systematic reviews demonstrating the positive impact of computer-aided systems on patients’ outcomes and health care [6, 7].

This study aimed to systematically review published ES in urological health care with a primary aim to demonstrate their availability, progression, testing and applications. The secondary aim was to evaluate their development life cycle against the standards suggested by O’Keefe and Benbasat in their review articles on ES development [8, 9]. The latter would evaluate the gap between their development and implementation in health care.

Methods

The study methodology followed the recommendations outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Fig. 1). No ethical approval was required because the type of the study waives this requirement.

Search

Information sources including WEB OF SCIENCE, EMBASE, BIOSIS CITATION INDEX, SCOPUS, PUBMED, Google Scholar and MEDLINE were searched using the keywords in Table 1. Articles published between 1960 and 2016 were considered and examined against the inclusion criteria. The search was tailored to each literature database: within each domain the keywords were combined with “OR”, and the domains were then combined with “AND”.
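The OR-within-domain, AND-across-domains rule can be illustrated with a minimal Python sketch. The keyword domains below are hypothetical placeholders (the actual terms are those in Table 1), and the exact query syntax accepted by each database is an assumption that would need tailoring.

```python
def build_query(domains):
    """Join keywords with OR inside each domain, then join domains with AND."""
    groups = []
    for keywords in domains.values():
        # Quote multi-word phrases so they are searched as a single unit.
        terms = [f'"{k}"' if " " in k else k for k in keywords]
        groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)


# Hypothetical keyword domains, for illustration only (see Table 1 for the real terms).
example_domains = {
    "method": ["expert system", "neural network", "fuzzy logic"],
    "specialty": ["urology", "prostate", "bladder"],
}

query = build_query(example_domains)
print(query)
```

Adding a keyword to a domain broadens the search, whereas adding a new domain narrows it, which is why the two operators are applied at different levels.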

Table 1 Keywords used for literature search

Eligibility criteria

For the primary aim, the search results were analysed against pre-planned eligibility criteria based on the system model, year of production, type and outcome of its validation, functional domain application, variables for input and output, target user and domain. These selection criteria were designed to identify expert system studies and demonstrate their prevalence, testing, and applications in clinical urology. Only articles and studies written in English were included.

Further qualitative analysis was required to meet the secondary aim of the study. For this, further data were gathered on credibility (user perception of the system), evaluation (system usability), validation (building the right system) and verification (building the system right), then compared against the standards reported in [8, 9].

Data filtering

The reference list of each included article was checked to identify potentially eligible items that had not been retrieved by the initial search. All retrieved articles were collated in a final reference list using reference management software (EndNote X8), then duplicate studies were removed from the list.

Once more than one hundred articles had been included, the remaining eligible articles were meticulously compared with those already included and excluded if they showed clear similarity. This was applied to avoid expanding the size of the data without adding to the study analysis.

Results

ANN was the commonest model to be applied in Urological ES (Fig. 2). The rest of the models demonstrated diversity which is consistent with other published industrial systems [4].

Prostate cancer was the commonest domain for urological ES, with most of the systems focusing on cancer diagnosis. The systems were applied to various domains (Fig. 3), and they were further stratified and analysed according to their core functional application as outlined in the methodology.

Quantitative analysis

Decision support systems

The main objective of ES in this domain was to facilitate clinical decision making by identifying key elements from patients’ clinical and laboratory examinations and then refining a theoretical diagnostic or treatment strategy [10]. They can guide the expert to find the right answer [11], take over the decision making to support the non-expert as in [12], or even replace both to interact with the patient directly [13].

They have supported various aspects of urological decision making such as diagnosis, investigations analysis, radiotherapy dose calculation, the delivery of behavioural treatment and therapeutic dialogues.

Domains

Urinary dysfunction (U Dys) was the commonest domain covered in the decision support system application (n = 9), and could be further categorised into U Dys diagnostic, investigation analysis and therapeutic systems. These systems demonstrated a range of methodologies, validation, and target users (Table 2) applicable to decision support systems in the urological domain. For instance, Keles et al. [14] designed an ES to support junior nurses in diagnosing urinary elimination dysfunction in a selected group of patients, while the systems in [15, 16] were able to support any medical user in diagnosing urinary incontinence with an accuracy higher than 90%. The target users of most of these systems were medical health care workers, including both experts and non-experts, with the exception of [13, 17], which can be used directly by patients to receive an assessment of their urinary elimination dysfunction followed by a tailored treatment plan.

Table 2 Decision support systems in urological domain

Prostate diseases were represented in 6 systems, 3 of which [10, 12, 20] were modelled for diagnosing both benign and malignant prostatic disease, namely prostate cancer (CaP).

All systems in this domain were diagnosis support systems, with the exception of [19], which also provided treatment for benign prostatic hyperplasia (BPH), and [11], which calculated the required radiotherapy dose for treating CaP.

Sexual dysfunctions were modelled in 3 systems, where [21] diagnosed male sexual dysfunction with an accuracy of 89%, while [22] added a therapeutic model for the same disease with an overall accuracy of 79%. Sexpert by [23], developed in 1988, was the third system in this category and in fact the oldest ES identified by our search across all urological domains. Interestingly, this RB system was designed to interact directly with couples suffering from sexual dysfunction, responding to their queries with a tailored therapeutic dialogue for treating their problem.

Urinary tract infection (UTI) was diagnosed and treated by one of the hybrid fuzzy systems FNM developed by [24] with an accuracy of 86.8%.

Diagnosis prediction

In this domain, ES quantify the probability of a clinical diagnosis with a defined margin of error. They simulate a second expert opinion, and it has been suggested that their use could eliminate unnecessary invasive investigations: the application of ANN by [26] could reduce repeated TRUS biopsies to diagnose CaP by up to 68%.

Domains

Prostate cancer was the main domain for this application with 19 systems out of 20. Most of them were designed to predict organ confinement before radical surgical excision of the prostate (Tables 3, 4). The target population was patients with clinically localised CaP, and accuracy reached high estimates, as in [28], where the system was able to predict 98% of the low risk group for lymph node involvement using preoperatively available data (PSA, clinical stage and Gleason score).

Table 3 Diagnosis prediction application of Expert Systems (ES) in Urology

Table 4 Disease stage prediction

Chiu et al. [29] modelled a system with clinical variables for patients undergoing nuclear bone scintigraphy for predicting skeletal metastasis. The system was able to predict metastatic disease in the test group with Se 87.5%, Sp 83.3%.
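For readers less familiar with the Se and Sp figures quoted throughout this section, the following Python sketch shows how sensitivity and specificity are derived from binary predictions. The labels are made-up illustrative values, not data from any of the reviewed systems.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (Se, Sp) for binary labels where 1 = disease present, 0 = absent."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    se = tp / (tp + fn)  # proportion of diseased cases correctly flagged
    sp = tn / (tn + fp)  # proportion of disease-free cases correctly cleared
    return se, sp


# Illustrative labels only: 4 diseased and 5 disease-free cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]
se, sp = sensitivity_specificity(y_true, y_pred)
print(f"Se = {se:.1%}, Sp = {sp:.1%}")
```

Reporting both values matters because a system can reach high sensitivity simply by flagging most cases as positive, at the expense of specificity.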

Non-seminoma testicular cancer was the other domain in this application, with the system in [27] able to predict the cancer disease stage (Table 4) with accuracy reaching 87%.

Treatment outcome prediction

In this application, ES combined disease and patient related factors to estimate the success of a specific treatment or intervention. Examples include [30, 38, 64, 69], where the systems predicted the outcome of extracorporeal shock wave lithotripsy (ESWL) for treating kidney stones, and [74, 75], which provided an estimation of cancer recurrence after radical surgical treatment of prostate cancer.

Domains

Prostate cancer was also a common domain in this application (n = 23). Potter [74, 75] described 4 models developed using data acquired from patients with clinically localised CaP who had radical prostatectomy with curative intent. The variables included clinical and histological findings of the surgical specimen, and the models were able to predict up to 81% of patients who did not have evidence of biochemical failure (rising PSA) during follow up. The models of Hamid et al. [76] and Gomha [77] were not restricted to the clinically localised CaP cohort, and their study population included patients at different disease stages and on any treatment pathway. Their models included 2 experimental histological markers (tumour suppressor gene p53 and the proto-oncogene bcl-2) in their input variables, and the estimated predictive accuracy of the patient response to treatment reached 68% and 80% (p < 0.00001) respectively.

Nephrolithiasis treatment was addressed by 6 other systems applying the treatment outcome prediction concept. Cummings et al. targeted this group with an ANN [78], training the network with data from patients with ureteric stones treated at the emergency service of 3 centres, to identify patients failing conservative management and requiring further intervention. When tested on a different set of 55 cases, the system correctly predicted 100% of the patients who passed the stone spontaneously with an overall accuracy of 76%.

Extracorporeal shockwave lithotripsy (ESWL) is one of the favoured interventions in the nephrolithiasis treatment domain. The stone receives strong external shock waves, which can reduce it into small fragments and eliminate the need for direct instrumentation of the renal tract. The reported success rates can only provide a generalised prediction of outcome for the individual case, and ANN provided an alternative multivariate analytical tool in the 4 models developed by [30, 38, 64, 69]. These models reported high accuracy (Table 5); in [64], for example, the system predicted 97% of the patients who were confirmed to be stone free following ESWL for ureteric stones.

Table 5 Treatment outcome prediction

Paediatric pelvi-ureteric junction obstruction is primarily treated conservatively unless there is any evidence of renal function compromise, recurring infection or worsening radiological findings. For the failing group, pyeloplasty is the second line of treatment and [81] developed an ANN to estimate the success rate of this procedure for each individual case by predicting the post-operative degree of hydronephrosis with a reported 100% accuracy in the small tested sample.

Vesicoureteric reflux, or reflux uropathy, is another paediatric disease, characterised by back flow of urine from the bladder into the ureter through an incompetent vesicoureteric functional valve. Treatment is primarily conservative, as it can be a self-limiting disease; the alternatives are surgery to reimplant the ureters or endoscopic injection of a bulking agent at the ureteric orifices [80]. The study authors trained a neural network using 261 cases who had received endoscopic injection, and the system predicted 94% of the patients who did not benefit from the treatment [80].

Laparoscopic partial and radical nephrectomy were the domain of the system in [82], which was developed using multi-institutional case data (age, co-morbidities, tumour size, and extension) from patients having laparoscopic partial or radical nephrectomy. The system was able to predict the length of their postoperative hospital stay with an accuracy of 72%.

Bladder cancer can be treated with complete bladder excision and [79] developed systems to predict the cure rate with an accuracy of 83%.

Recurrence and survival prediction

The ES in this domain aimed to provide individualised risk analysis tools estimating the disease specific mortality and recognising the group who may benefit from more aggressive or adjuvant treatment.

Domains

Bladder cancer survival and recurrence prediction following radical cystectomy (RC) with curative intention was the commonest domain in this application (24 out of 26 total systems). Lymph node involvement is highly predictive of recurrence, and these patients are considered for adjuvant or neoadjuvant systemic chemotherapy. The node free cohort will include high-risk patients who are not identified by the conventional linear stratification system. Catto et al. developed an FNM system to identify this high risk group in the node free cohort by predicting the disease recurrence rate (Se 81%, Sp 85%) and survival with a median error of 8.15 months [92]. The high-risk group identified by this model can benefit from systemic treatment post cystectomy to improve their disease related morbidity and mortality [95, 96]. Five-year survival post cystectomy was the output of 2 other ANN, with high prediction efficacy of 77% and 90% respectively (Table 6) [97, 99].

Table 6 Recurrence and progression prediction

Renal cell cancer is primarily treated with partial or radical nephrectomy for clinically localised disease, with systemic therapy for metastatic disease. There is still a degree of uncertainty in stratifying individual disease risk in order to predict the indication and outcome of systemic therapy in the group with distant metastasis. Vukicevic et al. [98] attempted to clarify this uncertainty by training a neural network with data from patients who had nephrectomy (partial or radical) and received systemic therapy. The mature model predicted the patients who survived the disease at 3 years with an overall accuracy of 95% (CI 0.878–0.987).

Five-year recurrence of non-seminoma testicular cancer was the domain of the ANN in [118]. The system was trained with multicentre data and, in its testing phase, predicted 100% of the patients who did not suffer disease recurrence at 5 years, with an overall predictive accuracy of 94% (AUC = 87%).

Predicting research variables

In academia, testing a hypothesis for ‘factors-outcome effect’ is a popular quest and the standard statistical regression analysis tools may not be effective for data contaminated by irrelevant variables [119]. AI can provide an alternative methodology for identifying variables with high correlation to the outcome by applying machine learning, as in ANN. The area under the curve (AUC) is first estimated for the system’s predictive accuracy using all researched variables. Each research variable can then be given random values or randomised, and the AUC is re-estimated for comparison with the original [120]. Only variables whose randomisation decreases the AUC are considered significant, and the larger the decrease in AUC the more significant they are (Table 7).
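The randomise-and-re-estimate procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions: randomisation is implemented here as shuffling one variable's values across cases, the "model" is a hypothetical fixed scoring function standing in for a trained ANN, and the data are invented.

```python
import random


def auc(labels, scores):
    """AUC as the chance that a positive case outscores a negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_auc_drop(model, rows, labels, column, repeats=20, seed=0):
    """Mean decrease in AUC when the values of one variable are shuffled across cases."""
    rng = random.Random(seed)
    baseline = auc(labels, [model(r) for r in rows])
    drops = []
    for _ in range(repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
        drops.append(baseline - auc(labels, [model(r) for r in permuted]))
    return sum(drops) / repeats


def toy_model(row):
    # Hypothetical stand-in for a trained network: it relies on variable 0 only.
    return row[0]


# Invented data: variable 0 separates the classes, variable 1 is noise.
rows = [[0.9, 3], [0.8, 1], [0.7, 4], [0.6, 2], [0.4, 3], [0.3, 1], [0.2, 4], [0.1, 2]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

drop_signal = permutation_auc_drop(toy_model, rows, labels, column=0)
drop_noise = permutation_auc_drop(toy_model, rows, labels, column=1)
print(f"AUC drop, variable 0: {drop_signal:.3f}; variable 1: {drop_noise:.3f}")
```

In this toy setting the informative variable produces a positive AUC drop while the noise variable produces none, mirroring the rule that only variables whose randomisation lowers the AUC are treated as significant.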

Table 7 Research variable prediction

Domains

Prostate cancer was a common domain in this application, with a total of 15 systems analysing predictive factors for diagnosis of cancer, response to treatment and quality of life with prostatic disease. One of the hot topics in urological cancer is discovering alternative CaP diagnostic markers, since serum PSA is not sensitive for distinguishing benign from malignant disease. Stephan et al. investigated the diagnostic value of three markers in this domain: macrophage inhibitory cytokine-1, macrophage inhibitory factor and human kallikrein 11 [108]. These were used as variables (nodes) in ANN models, whose accuracy was compared to the linear regression of %fPSA. They reported that only the ANN model including all three variables was more accurate (AUC 91%, Se 90%, Sp 80%) than all other models, supporting their hypothesis that the markers are only relevant when combined.

Similarly, another study estimated the predictive value of serum PSA precursors (-5, -7 proPSA) in diagnosing prostate cancer by comparing their accuracy to that of %fPSA [107]. The -5, -7 proPSA were only significant in the cohort with PSA between 4 and 10 µg/l and did not improve the predictive accuracy when added to the %fPSA. The same author tested this hypothesis on another free PSA precursor (-2 proPSA) by developing ANN with the %p2PSA (-2 proPSA: fPSA) among other disease variables, which improved the system accuracy (AUC 85% from 75%) [120].

Three systems evaluated the presence of bcl-2 and p53 (a proto-oncogene and a tumour suppressor gene, respectively) as predictive variables for response to prostate cancer treatment [76, 77]. Their combination was reported to be significant (Ac 85%, p < 0.00001) in [77], but [76] found that only bcl-2 was relevant in the other two models (accuracy 63–68%).

Bladder cancer diagnosis and disease progression was the second most common domain with 13 systems. Kolasa et al. [110] modelled an ANN with three novel urine markers (urine levels of nuclear matrix protein-22, monocyte chemoattractant protein-1 and urinary intercellular adhesion molecule-1) to predict the diagnosis of bladder cancer, and it succeeded in predicting all cancer free patients when the three variables were used as a group. Catto et al. [119] developed two AI models (ANN and FNM) performing microarray analysis on genes associated with bladder cancer progression. Their models narrowed these down from 200 genes to 11 progression-associated genes ([OR] 0.70; 95% [CI] 0.56–0.87), which were found to be more accurate than the regression analysis when compared to the specimen immunohistology results.

The model of Kolasa et al. [110] predicted the pre-histology diagnosis of malignancy based on urine levels of novel tumour markers. Their ANN was found to be more accurate (Se 100%, Sp 75.7%) than haematuria diagnosed on urine dipstick (Se 92.6%, Sp 51.8%) and atypical urine cytology (Se 66.7%, Sp 81%).

ESWL of renal stones was the research domain of [30, 69], which aimed to identify significant variables correlated with the treatment outcome (stone free) and to develop a predictive model. The model of Chiu et al. [69] did not recognise residual fragments following ESWL as a significant risk for triggering further stone growth, and [30] identified the following factors as predictive of lower pole stone breaking and clearance: positive BMI, infundibular width (IW) 5 mm, and infundibular ureteropelvic angle (IUPA) 45% or more.

Benign prostatic hyperplasia was modelled in a system [114] linking disease specific clinical and radiological factors with disease progression in patients with mild disease (IPSS < 7) who were not receiving any treatment. The ANN identified obstructive symptoms (Oss), PSA of more than 1.5 ng/ml and transitional zone volume of more than 25 cm³ as correlated with disease progression, and could accurately predict 78% of the cohort who would need further treatment.

Urinary dysfunction diagnosis by clinical symptoms was compared to urodynamic findings in female patients with pelvic organ prolapse by [115]. Neither the linear regression nor the ANN model could establish a relation between the symptoms and the urodynamic based diagnosis, hence dismissing the hypothesis that clinical symptoms alone can yield an accurate diagnosis and replace the need for a urodynamic study.

Hypogonadism (Hgon) was represented in the system of [133], where the diagnosis was made based on the patient’s age, erectile dysfunction and depression, with an AUC of 70% (p < 0.01).

Image analysis

This is one of the advancing applications of AI in medicine, where the system either analyses variables from reported medical images as data input or identifies these variables through a separate image analyser, without the need for an expert to report the scan or images. The first category was included among the systems mentioned above, as in the diagnosis prediction domain, where [47] included different variables from TRUS in the system input to predict CaP diagnosis. In this domain, we focused on the other group, where the images are presented to the machine in the form of raw data translated by the image analyser and the system then applies machine learning to identify the cause-effect pattern (Table 8).

Table 8 Image analysis

Domains

Prostate cancer image analysis was modelled in 10 systems to enhance diagnostic accuracy, as in [126], and to predict disease progression, as in [128]. The first system represented each TRUS image pixel as one variable or neuron in a pulse coupled neural network (PCNN) and was trained with 212 prostate cancer images to segment the prostate gland boundary, with an average overlap accuracy (overlap measure = difference between PCNN boundary and the expert) of 81% for ten images [126].

The other 4 systems analysed histological images from a cohort of patients with clinically localised CaP post RP to predict disease progression. The histological images were colour coded and analysed by the system, which used variables such as the % of epithelial cells and glandular lumina to identify the high risk group for disease recurrence with an accuracy reaching 90% [128].

LUT disease urine cytology images were analysed by 2 models in [123], which identified all patients with benign disease with an overall accuracy of 97%.

Nephrolithiasis stone biochemistry analysis can be achieved through expert analysis of infrared spectroscopy, which was simulated by [124]: the infrared spectra wavelength numbers were modelled as input variables, and the system’s predictions of the expert-analysed stone specimens had a root mean square error of 3.471.
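As a reminder of what a root mean square error summarises, the short Python sketch below computes it for paired predicted and reference values. The numbers are invented for illustration and the units are an assumption; they are not taken from [124].

```python
import math


def rmse(predicted, reference):
    """Root mean square error between paired predicted and reference values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Invented example: predicted vs expert-assigned composition values (arbitrary units).
predicted = [52.0, 30.0, 18.0]
reference = [50.0, 33.0, 17.0]
error = rmse(predicted, reference)
print(f"RMSE = {error:.3f}")
```

Because the errors are squared before averaging, a few large deviations raise the figure more than many small ones, so it should be read alongside the scale of the quantity being predicted.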

Qualitative analysis

The same articles were considered for the qualitative analysis against the four stages (validation, verification, evaluation and credibility) reported in O’Keefe’s industrial survey [8] and Benbasat’s article [9]. None of the included systems demonstrated completion of all four stages. It is possible that some of the missing stages were performed but not published in the scientific literature.

Validation was performed by almost all the systems (166 out of 169) with varying degrees of study strength, bias, and limitations (Table 9). Most of the data driven systems (ANN, SVM, BN, kNN and FNM) were validated by ROC analysis and AUC, using a training and validation set, cross validation or the leave-one-out technique. Samli et al. enhanced the validity of their system by estimating the kappa statistic alongside the ROC [134].
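A leave-one-out validation with AUC and a kappa statistic can be sketched in Python as below. This is only an illustration: the nearest-centroid scorer is a hypothetical stand-in for the reviewed models, the single-variable data are invented, and computing Cohen's kappa between predicted and reference labels is an assumption about how the statistic might be applied.

```python
def auc(labels, scores):
    """AUC as the chance that a positive case outscores a negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cohen_kappa(a, b):
    """Agreement between two binary ratings, corrected for chance agreement."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    return (observed - expected) / (1 - expected)  # undefined if expected == 1


def centroid_score(train_x, train_y, x):
    """Hypothetical scorer: positive when x lies closer to the positive-class mean."""
    pos = [v for v, y in zip(train_x, train_y) if y == 1]
    neg = [v for v, y in zip(train_x, train_y) if y == 0]
    mu_pos, mu_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    return abs(x - mu_neg) - abs(x - mu_pos)


def leave_one_out_scores(xs, ys):
    """Score each case with a model fitted on all the other cases."""
    scores = []
    for i in range(len(xs)):
        train_x, train_y = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        scores.append(centroid_score(train_x, train_y, xs[i]))
    return scores


# Invented single-variable data set.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]

scores = leave_one_out_scores(xs, ys)
preds = [1 if s > 0 else 0 for s in scores]
loo_auc = auc(ys, scores)
kappa = cohen_kappa(ys, preds)
print(f"leave-one-out AUC = {loo_auc:.2f}, kappa = {kappa:.2f}")
```

Leave-one-out uses every case for testing exactly once, which suits the small samples common in this literature, while kappa adds a chance-corrected view of agreement that the AUC alone does not provide.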

Table 9 Qualitative assessment of urological Expert Systems

Evaluation was only performed by a small fraction of these systems (n = 6). Their evaluation was aimed at the user or the expert but rarely both. There is no evidence that these evaluations were performed at early stages to determine the substantiality of the system to the user.

System credibility and verification were never performed. It may be inferred that verification was performed to some extent but not reported, as it is a technical part of the development.

‘System development limitation and bias evaluation’ demonstrated an overall acceptable validation methodology with valid statistical analysis. However, a few limitations were observed (Table 9), the most common being the consideration of human opinion as a gold standard (n = 9). For instance, the gold standard in diagnosing prostate cancer is tissue biopsy confirmation. Interpreting the expert’s clinical diagnosis as the gold standard reference can lead to statistical errors and invalidate the study.

Discussion

Expert Systems are widely available in Urological domains, with a large range of models, applications, domains, and target users including patients, students, non-experts, experts, and researchers. The number of published systems has risen over the years but with a consistent lack of publications reporting their real time testing or healthcare implementation (Fig. 4).

There is increasing interest in analysing this gap, which is reflected in the scope of historic AI review articles that aimed only to familiarise readers with the existence and application of ES [33, 125]. In fact, the majority had a relatively narrow scope, covering the evolution and application of one ES model (artificial neural network) in prostate cancer diagnosis. Recently, as in our research, there has been more interest in AI validation and in the lack of uptake despite the faith in their ability. Therefore, in this study we quantified ES progression and applications in urology while examining their developmental life cycle.

It was evident that CaP was the commonest domain in almost all applications, contributing more than two thirds of the systems (91 systems in total). Different aspects of this domain have been simulated by these systems, including diagnosis, therapeutics, prediction of disease progression or treatment outcome, researching variables and medical image analysis. Most of these systems simulated the urologist’s cognitive function, with little guidance on their benefits and how they can be implemented to improve cancer decision making.

In industry, this is usually performed before the system development by evaluating the system usability from the user perspective. This part has been lacking or not acknowledged in the published studies and is possibly a core reason for the lack of integration of these systems in urological health care. Furthermore, none of these systems has been subjected to live testing in a well-designed study to prove its efficacy over standard tools, or in the clinical context to prove its validity and justify its complex structure to AI novice health care professionals. The qualitative analysis demonstrated that validation is the only stage of the development cycle applied by most of the systems, and there is a lack of system evaluation, credibility, and verification. Evaluation can be subdivided into usability (usually by the average user), utility and system quality (by experts) [9]. Although this is a crucial stage of ES development, the published articles have paid little attention to integrating it into the development life cycle. This can cause the whole system to fail and also challenge its uptake [8].

An example can be drawn from this review, where the majority of the systems focused on CaP diagnosis and treatment. Their implementation would be challenged by the standard decision-making tools of the cancer multidisciplinary team and by the ethical concerns of relying on ANN in making such life changing and expensive decisions. A utility analysis of those ES would have been essential for tailoring their development to real time applications where they can be more substantial to the user. One example is the lack of community-based systems for the initial referral of suspected cancer patients and follow up of stable disease, where NICE have identified a need for such decision support models [152, 153].

There was a wide diversity of modelling in urological ES, with ANN being the most common model in this review. ANN bypass the need for direct learning from experts and the exhaustive process of knowledge acquisition, which is a core requirement for knowledge-based systems to attest the whole system progress [55]. However, their analytical hidden layer of nodes (the “black box phenomenon”) has been subject to wide criticism and rejection from clinicians due to a lack of transparency and understanding of its function.

Stephan et al. suggested a statistical solution to identify the variables significance by performing sensitivity analysis [154]. This estimates the variation of the AUC with introduction or elimination of each variable. This can only reflect the significance of each variable but does not explain how the cases are being solved nor quantify this to the user in a standard statistical value. This can be useful in research as they can identify significant variables in a large set data and has been successfully applied in the field of academic urology as in [119] where the system successfully identified the relevant gene signature for bladder cancer progression which saved time and cost of microarray analysis of all suspected genes.

Holzinger et al. emphasised the importance of the explicability of AI models, especially in medicine, which is a clear challenge for machine learning because of its complex reasoning [155]. Their study attempted to simplify explanation by classifying systems as post-hoc or ante-hoc. In post-hoc systems, explanations are provided for a specific decision, as in model-agnostic frameworks, where the black box reasoning can be explained through transparent approximations of the mathematical models and variables [156, 157]. These explanations are reproduced on demand for a specific problem rather than for the whole system, which can shed more light on the system function. It is not certain whether they can be easily interpreted by the AI-novice clinician, but they have provided more explicit models for tackling the black box phenomenon.
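A post-hoc, model-agnostic explanation of a single decision can be sketched as a local linear surrogate: the opaque model is queried around one case, and a weighted linear fit approximates its behaviour there. This is a simplified illustration in the spirit of local surrogate methods, not necessarily the approach of [156, 157]. The function `black_box`, its two inputs and all parameter values are hypothetical.

```python
import math
import random


def black_box(x):
    """Hypothetical opaque model returning a risk score in (0, 1)."""
    z = 3.0 * x[0] - 0.5 * x[1] + x[0] * x[1]
    return 1.0 / (1.0 + math.exp(-z))


def solve(A, b):
    """Solve the linear system A c = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    coef = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (M[i][n] - tail) / M[i][i]
    return coef


def local_surrogate(model, case, n_samples=500, scale=0.3, seed=0):
    """Weighted least-squares linear fit to the model around one case."""
    rng = random.Random(seed)
    d = len(case)
    A = [[0.0] * (d + 1) for _ in range(d + 1)]
    b = [0.0] * (d + 1)
    for _ in range(n_samples):
        delta = [rng.gauss(0.0, scale) for _ in range(d)]
        x = [ci + di for ci, di in zip(case, delta)]
        # Perturbations closer to the case receive more weight.
        weight = math.exp(-sum(di * di for di in delta) / (2 * scale ** 2))
        row = [1.0] + delta  # intercept plus offsets from the case
        y = model(x)
        for i in range(d + 1):
            b[i] += weight * row[i] * y
            for j in range(d + 1):
                A[i][j] += weight * row[i] * row[j]
    return solve(A, b)  # [local baseline, local slope per variable]


coefs = local_surrogate(black_box, [0.0, 0.0])
print("local baseline:", round(coefs[0], 3))
print("local slopes:", [round(c, 3) for c in coefs[1:]])
```

The slopes describe only the neighbourhood of the chosen case, which matches the point made above: the explanation is produced on demand for a specific problem rather than for the whole system.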

Knowledge-based systems can be explained by ante-hoc models, in which the reasoning of the whole system can be represented. Those systems rely on expert knowledge in their development and face the bottleneck phenomenon in their applications. Furthermore, they are not always successful in identifying and mapping multilinear mathematical rules, for which machine learning is mandatory or at least more efficient [155]. Bologna and Hayashi suggested that machine learning is more successful in complex problem solving, with an inverse relation between the machine's performance and its built-in transparency [158].

Another common aspect lacking in these articles was the coupling of their system development methodology with the medical device registration requirements. This is essential as ES often function as standalone software with no human supervision of their calculations. This categorises the system as a medical device, with a mandatory prerequisite to register with the relevant authorities, such as the Medicines and Healthcare products Regulatory Agency (MHRA) in the UK [5].

Cabitza et al. compared AI validation with that of other medical interventions, such as drugs, and emphasised the need to consider “software as a medical device” [159]. Unlike other devices or drugs, AI models in healthcare are unique in being more dynamic, which should be reflected in their validation cycle. They also quoted the known term “techno-vigilance” as a way to learn from other medical device validation pathways. They recommended a different outlook on validation, in which it is broken down into statistical (efficacy), relational (usability), pragmatic (effectiveness) and ecological (cost-effectiveness) components, with standards available for those steps (ISO 5725, ISO 9241 and ISO 14155). The ecological component is viewed as a novel standard for evaluating the cost benefits of applying a specific AI model in healthcare, which would require longitudinal modelling of health economics [159]. This was evidently lacking in the articles included in our review; in fact, most of the studies were non-randomised and retrospective.

Similarly, Nagendran et al. systematically analysed studies that compared AI performance with that of experts in classifying medical imaging into diseased and non-diseased. They concluded that AI performance was non-inferior to that of human experts, with potential for outperforming them [160]. Their 10-year review identified 2 randomised clinical trials and 9 prospective non-randomised trials in the literature, extracted from a total of 10 and 81 studies, respectively. Their review assessed the risk of bias using the PROBAST (prediction model risk of bias assessment tool) criteria for non-randomised studies. The tool is designed to identify the risk of bias by analysing four domains (participants, predictors, outcome and analysis) [161], and is applicable to systematic reviews analysing prediction models with a target outcome.

In our study, as there was no unified outcome for the included prediction tools, the focus was on the role of validation rather than the outcome. Therefore, tools for assessing the risk of bias were not used, because of the wide gaps between the included articles with respect to the tool checklist. Such heterogeneities in study design and data were also evident in Nagendran et al., and, as in our study, data synthesis was not possible. This poses a challenge to reinforcing the application of AI models in healthcare, because of the lack of level 1 evidence, which is mandatory in healthcare for accepting a novel intervention.

Finally, the quality of the data analysis was beyond the scope of our systematic review, despite being essential for developing quality AI systems. Cabitza et al. examined this gap and focused on data governance [161]. There has been very limited evidence on data quality appraisal and standards, with calls for further research and the allocation of more resources, especially in healthcare, where data are notoriously limited and prone to errors or discordance.

The potential application of AI in urology, with a focus on its future application, has recently been discussed by Eminaga et al. [162]. They have shown an increasing interest in urology research, but with a challenged mechanistic update due to the model complexity and the end user's lack of understanding of its design and function. Furthermore, they identified a discrepancy between AI engineering and clinical application, which reflects some lack of communication between the two disciplines.

This can be either a consequence or a cause of the lack of clinical utility testing, which increases the need for research in this domain to be incorporated into software development [163]. In fact, it has been recommended that the utility test be performed before developing the system in order to tailor its application [164, 165]. Although those studies used different methodologies from our systematic review, their recommendations were similar, with strong emphasis on the lack of utility testing and its impact on AI uptake in healthcare [166, 167, 168].

Conclusion

ES have been advancing in urology with demonstrated versatility and efficacy. They have suffered from a lack of formality in their development, testing and methodology for registration, which has limited their uptake. Future research is recommended to identify criteria for successful functional domain applications and knowledge engineering, and to integrate system development with the registration requirements for future implementation in health care systems.

Availability of data and material

Data and supporting materials are available from the authors on request.

Abbreviations

Ac:: Accuracy
AI:: Artificial intelligence
ANN:: Artificial neural networks
AP:: Acute prostatitis
Bca:: Bladder cancer
BC:: Backward chaining
BCF:: Biochemical failure
BCG:: Bacille Calmette–Guérin
BP:: Back propagation neural network
BPD:: Benign prostatic disease
BPH:: Benign prostatic hyperplasia
CAD:: Computer aided diagnosis
CBR:: Case based reasoning
CP:: Chronic prostatitis
CV:: Cross validation
Dom:: Domain
DRE:: Digital rectal exam
ED:: Erectile dysfunction
ES:: Expert Systems
FC:: Forward chaining
Fert:: Fertility
FH:: Family history
FLS:: Fuzzy logic systems
F-ONT:: Fuzzy ontology
FNM:: Fuzzy neural modelling
FRB:: Fuzzy rule-based systems
FSH:: Follicle-stimulating hormone level
GA:: Genetic algorithm
Gl:: Gleason score
Hgon:: Hypogonadism
Hk11:: Human kallikrein 11
Incont:: Incontinence
IS:: Information systems
ISS:: Irritative symptoms
IT:: Information technology
IUPA:: Infundibular ureteropelvic angle
IW:: Infundibular width
KA:: Knowledge acquisition
KMSP:: Kaplan–Meier survival plot
KE:: Knowledge engineer
Lap:: Laparoscopy
LH:: Luteinising hormone level
LOO:: Leave one out
LUT:: Lower urinary tract
LVQ:: Learning vector quantizer
MIC-1:: Macrophage inhibitory cytokine-1
MIF:: Macrophage inhibitory factor
MH:: Medical history
ML:: Machine learning
MHRA:: Medicines and Healthcare products Regulatory Agency
Mdl:: Model
Nep:: Nephrectomy
Nlt:: Nephrolithiasis
NICE:: National Institute for Health and Care Excellence
Nomo:: Nomogram
NPV:: Negative predictive value
Nsc:: Non-seminoma testicular cancer
Oss:: Obstructive symptoms
Pop:: Pelvic organ prolapse
Pca:: Prostate cancer
PPV:: Positive predictive value
PRL:: Prolactin level
PSAd:: PSA density
PSAv:: PSA velocity
PVR:: Post void residual
Qmax:: Maximum flow rate
RA:: Requirement analysis
RBR:: Rule based reasoning
RC:: Radical cystectomy
RCC:: Renal cell carcinoma
Recur:: Recurrence
Res:: Response
ROC:: Receiver operating characteristic
RP:: Radical prostatectomy
Sc:: Single centre
Se:: Sensitivity
SPC:: Stable prostate cancer
Sp:: Specificity
tPSA:: Total PSA
TPV:: Total prostatic volume
TRUS:: Trans rectal ultrasound scan
TT:: Total Testosterone
TZD:: Transitional zone PSA density
TZV:: Transitional zone volume
U Dyn:: Urodynamic study
U Dys:: Urinary dysfunction
UTI:: Urinary tract infection
V&V:: Verification and validation
VU rflx:: Vesico-ureteric reflux
%fPSA:: Percentage free/total PSA
%p2PSA:: Percentage p2PSA/fPSA
p2PSA:: -2 ProPSA
U incont:: Urinary incontinence

References

McCarthy J, Minsky ML, Shannon CE. A proposal for the Dartmouth summer research project on artificial intelligence—August 31, 1955. Ai Mag. 2006;27(4):12–4.
Turing A. Computing machinery and intelligence. In: Epstein R, Roberts G, Beber G, editors. Parsing the Turing test. Netherlands: Springer; 2009. p. 23–65.
Shortliffe EH, et al. Computer as a consultant for selection of antimicrobial therapy for patients with bacteremia. Clin Res. 1975;23(3):A385–A385.
Jackson P. Introduction to expert systems. Boston: Addison-Wesley; 1999.
Liao SH. Expert system methodologies and applications—a decade review from 1995 to 2004. Expert Syst Appl. 2005;28(1):93–103.
Ammenwerth E, et al. Clinical decision support systems: need for evidence, need for evaluation. Artif Intell Med. 2013;59(1):1–3.
Garg AX, et al. Effects of computerized clinical decision support systems on practitioner performance and patient outcomes—a systematic review. J Am Med Assoc: JAMA. 2005;293(10):1223–38.
Kawamoto K, et al. Improving clinical practice using clinical decision support systems: a systematic review of trials to identify features critical to success. BMJ. 2005;330(7494):765.
O'Keefe RM, O'Leary DE. Expert system verification and validation—a survey and tutorial. Artif Intell Rev. 1993;7(1):3–42.
Benbasat I, Dhaliwal JS. A framework for the validation of knowledge acquisition. Knowl Acquis. 1989;1(2):215–33.
Pandey B, Mishra RB. Knowledge and intelligent computing system in medicine. Comput Biol Med. 2009;39(3):215–30.
Koutsojannis C, et al. Using machine learning techniques to improve the behaviour of a medical decision support system for prostate diseases. In: 2009 9th international conference on intelligent systems design and applications. 2009. p. 341–6.
Petrovic S, Mishra N, Sundar S. A novel case based reasoning approach to radiotherapy planning. Expert Syst Appl. 2011;38(9):10759–69.
Keles A, et al. Neuro-fuzzy classification of prostate cancer using NEFCLASS-J. Comput Biol Med. 2007;37(11):1617–28.
Gorman R. Expert system for management of urinary incontinence in women. In: Proceedings of the annual symposium on computer application in medical care. 1995. p. 527–31.
Hao ATH, et al. Nursing process decision support system for urology ward. Int J Med Inform. 2013;82(7):604–12.
Lopes M, et al. Fuzzy cognitive map in differential diagnosis of alterations in urinary elimination: a nursing approach. Int J Med Inform. 2013;82(3):201–8.
Petrucci K, et al. Evaluation of UNIS: urological nursing information systems. In: Proceedings of the annual symposium on computer application in medical care. 1991.
Boyington AR, et al. Development of a computer-based system for continence health promotion. Nurs Outlook. 2004;52(5):241–7.
Koutsojannis C, Lithari C, Hatzilygeroudis I. Managing urinary incontinence through hand-held real-time decision support aid. Comput Methods Programs Biomed. 2012;107(1):84–9.
Sucevic D, Ilic I. Uncertain knowledge processing in urology diagnostic problems based Expert System. In: 6th Mediterranean electrotechnical conference, proceedings vols 1 and 2. 1991. p. 741–3.
Altunay S, et al. A new approach to urinary system dynamics problems: evaluation and classification of uroflowmeter signals using artificial neural networks. Expert Syst Appl. 2009;36(3):4891–5.
Gil D, Johnsson M. Using support vector machines in diagnoses of urological dysfunctions. Expert Syst Appl. 2010;37(6):4713–8.
Koutsojannis C, Tsimara M, Nabil E. HIROFILOS: a medical expert system for prostate diseases. In: Zaharim A, Mastorakis N, Gonos I, editors. Proceedings of the 7th Wseas international conference on computational intelligence, man-machine systems and cybernetics. 2008. p. 254–9.
Pereira M, Schaefer M, Marques JB. Remote expert system of support the prostate cancer diagnosis. In: Conference proceedings of the annual international conference of the IEEE engineering in medicine and biology society, vol 5. 2004. p. 3412–5.
Torshizi AD, et al. A hybrid fuzzy-ontology based intelligent system to determine level of severity and treatment recommendation for Benign Prostatic Hyperplasia. Comput Methods Programs Biomed. 2014;113(1):301–13.
Binik YM, et al. Intelligent computer-based assessment and psychotherapy - an expert system for sexual dysfunction. J Nerv Ment Dis. 1988;176(7):387–400.
Beligiannis G, et al. A GA driven intelligent system for medical diagnosis. In: Knowledge-based intelligent information and engineering systems, Pt 1, proceedings, vol 4251. 2006. p. 968–75.
Koutsojannis C, Hatzilygeroudis L. FESMI: a fuzzy expert system for diagnosis and treatment of male impotence. In: Knowledge-based intelligent information and engineering systems, Pt 2, proceedings, vol 3214. 2004. p. 1106–13.
Papageorgiou EI. Fuzzy cognitive map software tool for treatment management of uncomplicated urinary tract infection. Comput Methods Programs Biomed. 2012;105(3):233–45.
Arlen AM, Alexander SE, Wald M, Cooper CS. Computer model predicting breakthrough febrile urinary tract infection in children with primary vesicoureteral reflux. J Pediatr Urol. 2016;12(5):288.e1–5.
Goyal NK, et al. Prediction of biochemical failure in localized carcinoma of prostate after radical prostatectomy by neuro-fuzzy. Indian J Urol. 2007;23(1):14–7.
Ronco AL, Fernandez R. Improving ultrasonographic diagnosis of prostate cancer with neural networks. Ultrasound Med Biol. 1999;25(5):729–33.
Babaian RJ, et al. Performance of a neural network in detecting prostate cancer in the prostate-specific antigen reflex range of 2.5 to 4.0 ng/ml. Urology. 2000;56(6):1000–6.
Finne P, et al. Predicting the outcome of prostate biopsy in screen-positive men by a multilayer perceptron network. Urology. 2000;56(3):418–22.
Stephan C, et al. Multicenter evaluation of an artificial neural network to increase the prostate cancer detection rate and reduce unnecessary biopsies. Clin Chem. 2002;48(8):1279–87.
Djavan B, et al. Novel artificial neural network for early detection of prostate cancer. J Clin Oncol. 2002;20(4):921–9.
Remzi M, et al. An artificial neural network to predict the outcome of repeat prostate biopsies. Urology. 2003;62(3):456–60.
Kalra P, et al. A neurocomputational model for prostate carcinoma detection. Cancer. 2003;98(9):1849–54.
Saritas I, Allahverdi N, Sert IU. A fuzzy expert system design for diagnosis of prostate cancer. In: Proceedings of the 4th international conference on Computer systems and technologies: e-Learning (CompSysTech '03). Association for Computing Machinery, New York, NY, USA, 345–351.
Matsui Y, et al. The use of artificial neural network analysis to improve the predictive accuracy of prostate biopsy in the Japanese population. Jpn J Clin Oncol. 2004;34(10):602–7.
Porter CR, et al. Model to predict prostate biopsy outcome in large screening population with independent validation in referral setting. Urology. 2005;65(5):937–41.
Lee HJ, et al. Role of transrectal ultrasonography in the prediction of prostate cancer—artificial neural network analysis. J Ultrasound Med. 2006;25(7):815–21.
Benecchi L. Neuro-fuzzy system for prostate cancer diagnosis. Urology. 2006;68(2):357–61.
Stephan C, et al. Comparison of two different artificial neural networks for prostate biopsy indication in two different patient populations. Urology. 2007;70(3):596–601.
Kawakami S, et al. Development, validation, and head-to-head comparison of logistic regression-based nomograms and artificial neural network models predicting prostate cancer on initial extended biopsy. Eur Urol. 2008;54(3):601–11.
Stephan C, et al. A -2 proPSA-based artificial neural network significantly improves differentiation between prostate cancer and benign prostatic diseases. Prostate. 2009;69(2):198–207.
Lee HJ, et al. Image-based clinical decision support for transrectal ultrasound in the diagnosis of prostate cancer: comparison of multiple logistic regression, artificial neural network, and support vector machine. Eur Radiol. 2010;20(6):1476–84.
Meijer RP, et al. The value of an artificial neural network in the decision-making for prostate biopsies. World J Urol. 2009;27(5):593–8.
Saritas I, Ozkan IA, Sert IU. Prognosis of prostate cancer by artificial neural networks. Expert Syst Appl. 2010;37(9):6646–50.
Lawrentschuk N, et al. Predicting prostate biopsy outcome: artificial neural networks and polychotomous regression are equivalent models. Int Urol Nephrol. 2011;43(1):23–30.
Ecke TH, et al. Outcome prediction for prostate cancer detection rate with artificial neural network (ANN) in daily routine. Urol Oncol Semin Orig Investig. 2012;30(2):139–44.
Filella X, et al. The influence of prostate volume in prostate health index performance in patients with total PSA lower than 10 μg/L. Clin Chim Acta. 2014;436:303–7.
Yuksel et al. Application of soft sets to diagnose the prostate cancer risk. J Inequal Appl. 2013;2013:229.
Samli MM, Dogan I. An artificial neural network for predicting the presence of spermatozoa in the testes of men with nonobstructive azoospermia. J Urol. 2004;171(6, Part 1):2354–7.
Powell CR, et al. Computational models for detection of endocrinopathy in subfertile males. Int J Impot Res. 2007;20(1):79–84.
Ramasamy R, et al. A comparison of models for predicting sperm retrieval before microdissection testicular sperm extraction in men with nonobstructive azoospermia. J Urol. 2013;189(2):638–42.
Paya AS, et al. Development of an artificial neural network for helping to diagnose diseases in urology. In Proceedings of the 1st international conference on Bio inspired models of network, information and computing systems (BIONETICS '06). Association for Computing Machinery, New York, NY, USA, 9–es.
Gil D, et al. Application of artificial neural networks in the diagnosis of urological dysfunctions. Expert Syst Appl. 2009;36(3):5754–60.
Wadie BS, et al. Application of artificial neural network in prediction of bladder outlet obstruction: a model based on objective, noninvasive parameters. Urology. 2006;68(6):1211–4.
Wadie BS, Badawi AM, Ghoneim MA. The relationship of the international prostate symptom score and objective parameters for diagnosing bladder outlet obstruction. Part II: the potential usefulness of artificial neural networks. J Urol. 2001;165(1):35–7.
Tewari A, Narayan P. Novel staging tool for localized prostate cancer: a pilot study using genetic adaptive neural networks. J Urol. 1998;160(2):430–6.
Chang PL, et al. Evaluation of a decision-support system for preoperative staging of prostate cancer. Med Decis Making. 1999;19(4):419–27.
Batuello JT, et al. Artificial neural network model for the assessment of lymph node spread in patients with clinically localized prostate cancer. Urology. 2001;57(3):481–5.
Han M, et al. Evaluation of artificial neural networks for the prediction of pathologic stage in prostate carcinoma. Cancer. 2001;91(8):1661–6.
Mattfeldt T, et al. Prediction of postoperative prostatic cancer stage on the basis of systematic biopsies using two types of artificial neural networks. Eur Urol. 2001;39(5):530–6.
Matsui Y, et al. Artificial neural network analysis for predicting pathological stage of clinically localized prostate cancer in the Japanese population. Jpn J Clin Oncol. 2002;32(12):530–5.
Zlotta AR, et al. An artificial neural network for prostate cancer staging when serum prostate specific antigen is 10 ng/ml or less. J Urol. 2003;169(5):1724–8.
Chiu JS, et al. Artificial neural network to predict skeletal metastasis in patients with prostate cancer. J Med Syst. 2009;33(2):91–100.
Kim SY, et al. Pre-operative prediction of advanced prostatic cancer using clinical decision support systems: accuracy comparison between support vector machine and artificial neural network. Korean J Radiol. 2011;12(5):588–94.
Regnier-Coudert O, et al. Machine learning for improved pathological staging of prostate cancer: a performance comparison on a range of classifiers. Artif Intell Med. 2012;55(1):25–35.
Veltri RW, et al. Comparison of logistic regression and neural net modeling for prediction of prostate cancer pathologic stage. Clin Chem. 2002;48(10):1828–34.
Cosma G, et al. Prediction of pathological stage in patients with prostate cancer: a neuro-fuzzy model. PLoS ONE. 2016;11(6):e0155856.
Moul JW, et al. Neural-network analysis of quantitative histological factors to predict pathological stage in clinical stage-I nonseminomatous testicular cancer. J Urol. 1995;153(5):1674–7.
Poulakis V, et al. Prediction of clearance of inferior caliceal calculi with extracorporeal shock wave lithotripsy. Using an artificial neural network analysis. Urol A. 2002;41(6):583–95.
Hamid A, et al. Artificial neural networks in predicting optimum renal stone fragmentation by extracorporeal shock wave lithotripsy: a preliminary study. BJU Int. 2003;91(9):821–4.
Gomha MA, et al. Can we improve the prediction of stone-free status after extracorporeal shock wave lithotripsy for ureteral stones? A neural network or a statistical model? J Urol. 2004;172(1):175–9.
Michaels EK, et al. Use of a neural network to predict stone growth after shock wave lithotripsy. Urology. 1998;51(2):335–8.
Naguib RNG, et al. Neural network analysis of combined conventional and experimental prognostic markers in prostate cancer: a pilot study. Br J Cancer. 1998;78(2):246–50.
Potter SR, et al. Genetically engineered neural networks for predicting prostate cancer progression after radical prostatectomy. Urology. 1999;54(5):791–5.
Porter C, et al. Artificial neural network model to predict biochemical failure after radical prostatectomy. Mol Urol. 2001;5(4):159–62.
Seker H, et al. A fuzzy logic based-method for prognostic decision making in breast and prostate cancers. IEEE Trans Inf Technol Biomed. 2003;7(2):114–22.
Poulakis V, et al. Preoperative neural network using combined magnetic resonance imaging variables, prostate-specific antigen, and Gleason score to predict positive surgical margins. Urology. 2004;64(3):516–21.
Poulakis V, et al. Preoperative neural network using combined magnetic resonance imaging variables, prostate specific antigen and Gleason score to predict prostate cancer stage. J Urol. 2004;172(4):1306–10.
de Paula Castanho MJ, et al. Fuzzy expert system: an example in prostate cancer. Appl Math Comput. 2008;202(1):78–85.
Botoca C, et al. Prediction of prostate capsule penetration using neural networks. In: Proceedings of the 8th Wseas international conference on computational intelligence, man-machine systems and cybernetics (Cimmacs '09). 2009. p. 108–11.
Castanho MJP, et al. Fuzzy expert system for predicting pathological stage of prostate cancer. Expert Syst Appl. 2013;40(2):466–70.
Hu XH, et al. Risk prediction models for biochemical recurrence after radical prostatectomy using prostate-specific antigen and Gleason score. Asian J Androl. 2014;16(6):897–901.
Tewari A, et al. Genetic adaptive neural network to predict biochemical failure after radical prostatectomy: a multi-institutional study. Mol Urol. 2001;5(4):163–9.
Borque A, et al. The use of neural networks and logistic regression analysis for predicting pathological stage in men undergoing radical prostatectomy: a population based study. J Urol. 2001;166(5):1672–8.
Tsao CW, et al. Artificial neural network for predicting pathological stage of clinically localized prostate cancer in a Taiwanese population. J Chin Med Assoc. 2014;77(10):513–8.
Cummings JM, et al. Prediction of spontaneous ureteral calculous passage by an artificial neural network. J Urol. 2000;164(2):326–8.
Dal Moro F, et al. A novel approach for accurate prediction of spontaneous passage of ureteral stones: support vector machines. Kidney Int. 2006;69(1):157–60.
Sun CC, Chang P. Prediction of unexpected emergency room visit after extracorporeal shock wave lithotripsy for urolithiasis - an application of artificial neural network in hospital information system. AMIA Annu Symp Proc. 2006;2006:1113.
Bagli DJ, et al. Artificial neural networks in pediatric urology: Prediction of sonographic outcome following pyeloplasty. J Urol. 1998;160(3):980–3.
Seçkiner I, et al. Use of artificial neural networks in the management of antenatally diagnosed ureteropelvic junction obstruction. Can Urol Assoc J. 2011;5(6):E152.
Parekattil SJ, et al. Multi-institutional validation study of neural networks to predict duration of stay after laparoscopic radical/simple or partial nephrectomy. J Urol. 2005;174(4):1380–4.
Vukicevic AM, et al. Evolutionary assembled neural networks for making medical decisions with minimal regret: application for predicting advanced bladder cancer outcome. Expert Syst Appl. 2014;41(18):8092–100.
Serrano-Durba A, et al. The use of neural networks for predicting the result of endoscopic treatment for vesico-ureteric reflux. BJU Int. 2004;94(1):120–2.
Naguib RNG, Qureshi KN, Hamdy FC, Neal DE. Neural network analysis of prognostic markers in bladder cancer. In: Proceedings of the 19th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Magnificent Milestones and Emerging Opportunities in Medical Engineering (Cat. No.97CH36136), 1997, vol.3, pp. 1007–9.
Qureshi KN, et al. Neural network analysis of clinicopathological and molecular markers in bladder cancer. J Urol. 2000;163(2):630–3.
Fujikawa K, et al. Predicting disease outcome of non-invasive transitional cell carcinoma of the urinary bladder using an artificial neural network model: results of patient follow-up for 15 years or longer. Int J Urol. 2003;10(3):149–52.
Catto JWF, et al. Artificial intelligence in predicting bladder cancer outcome: a comparison of neuro-fuzzy modeling and artificial neural networks. Clin Cancer Res. 2003;9(11):4172–7.
Abbod MF, et al. Artificial intelligence for the prediction of bladder cancer. Biomed Eng Appl Basis Commun. 2004;16(02):49–58.
Catto JWF, et al. Neuro-fuzzy modeling: an accurate and interpretable method for predicting bladder cancer progression. J Urol. 2006;175(2):474–9.
Cai T, et al. Artificial intelligences in urological practice: the key to success? Ann Oncol. 2007;18(3):604-U10.
Bassi P, et al. Prognostic accuracy of an artificial neural network in patients undergoing radical cystectomy for bladder cancer: a comparison with logistic regression analysis. BJU Int. 2007;99(5):1007–12.
Catto JWF, et al. Neurofuzzy Modeling to determine recurrence risk following radical cystectomy for nonmetastatic urothelial carcinoma of the bladder. Clin Cancer Res. 2009;15(9):3150–5.
El-Mekresh M, et al. Prediction of survival after radical cystectomy for invasive bladder carcinoma: risk group stratification, nomograms or artificial neural networks? J Urol. 2009;182(2):466–72.
Kolasa M, et al. Application of artificial neural network to predict survival time for patients with bladder cancer. Comput Med Act. 2009;65:113–22.
Buchner A, et al. Prediction of outcome in patients with urothelial carcinoma of the bladder following radical cystectomy using artificial neural networks. Ejso. 2013;39(4):372–9.
Wang G, et al. Prediction of mortality after radical cystectomy for bladder cancer by machine learning techniques. Comput Biol Med. 2015;63:124–32.
Cai T, et al. Artificial intelligence for predicting recurrence-free probability of non-invasive high-grade urothelial bladder cell carcinoma. Oncol Rep. 2007;18(4):959–64.
Buchner A, et al. Outcome assessment of patients with metastatic renal cell carcinoma under systemic therapy using artificial neural networks. Clin Genitourin Cancer. 2012;10(1):37–42.
Marszall MP, et al. ANN as a prognostic tool after treatment of non-seminoma testicular cancer. Cent Eur J Med. 2012;7(5):672–9.
Kuo R-J, et al. Application of a two-stage fuzzy neural network to a prostate cancer prognosis system. Artif Intell Med. 2015;63(2):119–33.
Tanthanuch M, Tanthanuch S. Prediction of upper urinary tract calculi using an artificial neural network. J Med Assoc Thai. 2004;87(5):515–8.
Cancer Research UK, https://www.cancerresearchuk.org/health-professional/cancer-statistics/statistics-by-cancer-type/bladder-cancer#heading-Zero. Accessed May 2021.
Herr HW, et al. Defining optimal therapy for muscle invasive bladder cancer. J Urol. 2007;177(2):437–43.
von der Maase H, et al. Long-term-survival results of a randomized trial comparing gemcitabine plus cisplatin, with methotrexate, vinblastine, doxorubicin, plus cisplatin in patients with bladder cancer (Retracted article See vol. 16, pg. 1481, 2011). J Clin Oncol. 2005;23(21):4602–8.
Krongrad A, et al. Predictors of general quality of life in patients with benign prostate hyperplasia or prostate cancer. J Urol. 1997;157(2):534–8.
Han M, et al. A neural network predicts progression for men with Gleason score 3+4 versus 4+3 tumors after radical prostatectomy. Urology. 2000;56(6):994–9.
Parekattil SJ, Fisher HAG, Kogan BA. Neural network using combined urine nuclear matrix protein-22, monocyte chemoattractant protein-1 and urinary intercellular adhesion molecule-1 to detect bladder cancer. J Urol. 2003;169(3):917–20.
Djavan B, et al. Longitudinal study of men with mild symptoms of bladder outlet obstruction treated with watchful waiting for four years. Urology. 2004;64(6):1144–8.
Kshirsagar A, et al. Predicting hypogonadism in men based upon age, presence of erectile dysfunction, and depression. Int J Impot Res. 2006;18(1):47–51.
Stephan C, et al. Clinical utility of human glandular kallikrein 2 within a neural network for prostate cancer detection. BJU Int. 2005;96(4):521–7.
Abbod MF, et al. Artificial intelligence technique for gene expression profiling of urinary bladder cancer. In: 2006 3rd international IEEE conference on intelligent systems. 2006.
Stephan C, et al. A (-5,-7) ProPSA based artificial neural network to detect prostate cancer. Eur Urol. 2006;50(5):1014–20.
Stephan C, et al. Improved prostate cancer detection with a human kallikrein 11 and percentage free PSA-based artificial neural network. Biol Chem. 2006;387(6):801–5.
Stephan C, et al. An artificial neural network for five different assay systems of prostate-specific antigen in prostate cancer diagnostics. BJU Int. 2008;102(7):799–805.
Cinar M, et al. Early prostate cancer diagnosis by using artificial neural networks and support vector machines. Expert Syst Appl. 2009;36(3):6357–61.
Stephan C, et al. Internal validation of an artificial neural network for prostate biopsy outcome. Int J Urol. 2010;17(1):62–8.
Catto JWF, et al. The application of artificial intelligence to microarray data: identification of a novel gene signature to identify bladder cancer progression. Eur Urol. 2010;57(3):398–406.
Serati M, et al. Urinary symptoms and urodynamic findings in women with pelvic organ prolapse: is there a correlation? results of an artificial neural network analysis. Eur Urol. 2011;60(2):253–60.
Gil D, et al. Predicting seminal quality with artificial intelligence methods. Expert Syst Appl. 2012;39(16):12564–73.
Girela JL, et al. Semen parameters can be predicted from environmental factors and lifestyle using artificial intelligence methods. Biol Reprod. 2013;88(4):99–1.
Stephan C, et al. Multicenter evaluation of -2 proprostate-specific antigen and the prostate health index for detecting prostate cancer. Clin Chem. 2013;59(1):306–14.
Cai T, et al. Clinical importance of lymph node density in predicting outcome of prostate cancer patients. J Surg Res. 2011;167(2):267–72.
Kim M, et al. Factors influencing nonabsolute indications for surgery in patients with lower urinary tract symptoms suggestive of benign prostatic hyperplasia: analysis using causal Bayesian networks. Int Neurourol J. 2014;18(4):198–205.
Green WJF, et al. KI67 and DLX2 predict increased risk of metastasis formation in prostate cancer-a targeted molecular approach. Br J Cancer. 2016;115(2):236–42.
Logvinenko T, Chow JS, Nelson CP. Predictive value of specific ultrasound findings when used as a screening test for abnormalities on VCUG. J Pediatr Urol. 2015;11(4):176.e1-176.e7.
Wells DM, Niederer J. A medical expert system approach using artificial neural networks for standardized treatment planning. Int J Radiat Oncol Biol Phys. 1998;41(1):173–82.
Loch T, et al. Artificial neural network analysis (ANNA) of prostatic transrectal ultrasound. Prostate. 1999;39(3):198–204.
Mattfeldt T, et al. Prediction of prostatic cancer progression after radical prostatectomy using artificial neural networks: a feasibility study. BJU Int. 1999;84(3):316–23.
Llobet R, et al. Computer-aided detection of prostate cancer. Int J Med Inform. 2007;76(7):547–56.
Hassanien AE, Al-Qaheri H, El-Dahshan ESA. Prostate boundary detection in ultrasound images using biologically-inspired spiking neural network. Appl Soft Comput. 2011;11(2):2035–41.
Matulewicz L, et al. Anatomic segmentation improves prostate cancer detection with artificial neural networks analysis of H-1 magnetic resonance spectroscopic imaging. J Magn Reson Imaging. 2014;40(6):1414–21.
Gatidis S, et al. Combined unsupervised–supervised classification of multiparametric PET/MRI data: application to prostate cancer. NMR Biomed. 2015;28(7):914–22.
Pantazopoulos D, et al. Comparing neural networks in the discrimination of benign from malignant lower urinary tract lesions. Br J Urol. 1998;81(4):574–9.
Xiao D, et al. 3D detection and extraction of bladder tumors via MR virtual cystoscopy. Int J Comput Assist Radiol Surg. 2016;11(1):89–97.
Hurst RE, et al. Neural net-based identification of cells expressing the p300 tumor-related antigen using fluorescence image analysis. Cytometry. 1997;27(1):36–42.
Volmer M, et al. Artificial neural-network predictions of urinary calculus compositions analyzed with infrared-spectroscopy. Clin Chem. 1994;40(9):1692–7.
Pantazopoulos D, et al. Back propagation neural network in the discrimination of benign from malignant lower urinary tract lesions. J Urol. 1998;159(5):1619–23.
Lamb DJ, Niederberger CS. Artificial-intelligence in medicine and male-infertility. World J Urol. 1993;11(2):129–36.
Holzinger A, et al. Causability and explainability of artificial intelligence in medicine. Wiley Interdiscip Rev Data Min Knowl Discov. 2019;9(4):e1312.
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44.
Lakkaraju H, Kamar E, Caruana R, Leskovec J. Interpretable and explorable approximations of black box models. 2017. arXiv:1707.01154.
Bologna G, Hayashi Y. Characterization of symbolic rules embedded in deep DIMLP networks: a challenge to transparency of deep learning. J Artif Intell Soft Comput Res. 2017;7(4):265–86.
Cabitza F, Zeitoun JD. The proof of the pudding: in praise of a culture of real-world validation for medical artificial intelligence. Ann Transl Med. 2019;7(8):161.
Nagendran M, et al. Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies. BMJ. 2020;368:m689.
Moons KGM, et al. PROBAST: a tool to assess risk of bias and applicability of prediction model studies: explanation and elaboration. Ann Intern Med. 2019;170(1):W1–33.
Cabitza F, Campagner A, Balsano C. Bridging the “last mile” gap between AI implementation and operation: “data awareness” that matters. Ann Transl Med. 2020;8(7):501.
Kattan MW, Cowen ME, Miles BJ. Computer modeling in urology. Urology. 1996;47(1):14–21.
Eminaga O, Liao JC. Chapter 16—prospect and adversity of artificial intelligence in urology. In: Xing L, Giger ML, Min JK, editors. Artificial intelligence in medicine. London: Academic Press; 2021. p. 309–37.
Chang TC, et al. Current trends in artificial intelligence application for endourology and robotic surgery. Urol Clin N Am. 2021;48(1):151–60.
NICE. Prostate cancer: diagnosis and treatment CG175. National Institute for Health and Care Excellence. 2014.
NICE. Prostate cancer: diagnosis and treatment CG28. 2008.
Eminaga O, et al. Diagnostic classification of cystoscopic images using deep convolutional neural networks. JCO Clin Cancer Inform. 2018;2:1–8.

Acknowledgements

Not applicable.

Funding

No sources of funding or any form of financial support to disclose.

Author information

Authors and Affiliations

Urological Department, NIHR Nottingham Biomedical Research Centre, School of Medicine, University of Nottingham, Nottingham, NG7 2UH, UK
Hesham Salem
University Hospitals of Derby and Burton NHS Foundation Trust, Royal Derby Hospital, University of Nottingham, Derby, DE22 3DT, UK
Hesham Salem & Jonathan N. Lund
School of Computer Science and Engineering, University of Westminster, London, W1W 6UW, UK
Daniele Soria
NIHR Nottingham Biomedical Research Centre, Sir Peter Mansfield Imaging Centre, School of Medicine, University of Nottingham, Nottingham, NG7 2UH, UK
Amir Awwad
Department of Medical Imaging, London Health Sciences Centre, University Hospital, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada
Amir Awwad

Authors

Hesham Salem
Daniele Soria
Jonathan N. Lund
Amir Awwad

Contributions

All listed authors have read and approved the final manuscript. All listed authors contributed sufficiently to take responsibility for the whole content of the manuscript, following the ICMJE criteria for authorship rights and responsibilities. HS: conceptualisation, literature review, data curation, formal analysis, methodology, and writing (original draft, review, and editing). DS and JNL: supervision and writing (review and editing). AA: field investigation, validation, and draft review and editing.

Corresponding author

Correspondence to Amir Awwad.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests or potential conflicts of interest. No exclusive licenses were used in preparing this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Salem, H., Soria, D., Lund, J.N. et al. A systematic review of the applications of Expert Systems (ES) and machine learning (ML) in clinical urology. BMC Med Inform Decis Mak 21, 223 (2021). https://doi.org/10.1186/s12911-021-01585-9

Received: 21 March 2021
Accepted: 08 July 2021
Published: 22 July 2021
DOI: https://doi.org/10.1186/s12911-021-01585-9
