From: Deep learning approach to detection of colonoscopic information from unstructured reports
| Year | Author | NLP method (tool) | NLP category | Setting | Dataset | Performance |
|---|---|---|---|---|---|---|
| Current study | Seong et al. | Bi-LSTM-CRF, BioBERT | Deep learning-based NLP | Samsung Medical Center (SMC) | 280,668 colonoscopy reports; training and test: 1,000–5,000; embedding: 280,668 | F1 score: 0.9564–0.9862 |
| 2022 | Bae et al. [13] | SmartTA | Rule-based NLP (commercial software) | Seoul National University Hospital | 54,562 colonoscopy reports and pathology reports; training: 2,000; test: 1,000 | Accuracy: 0.99–1.0 |
| 2021 | Vadyala et al. [41] | Bio-Bi-LSTM-CRF | Deep learning-based NLP | Veterans Affairs Medical Centers (VA) | 4,000 colonoscopy reports and pathology reports; training: 3,200; test: 400; validation: 400 | F1 score: 0.85–0.964 |
| 2020 | Fevrier et al. [40] | SAS Perl regular expressions | Rule-based NLP (commercial software) | Kaiser Permanente Northern California (KPNC) | 401,566 colonoscopy reports and pathology reports; training: 1,000; validation: 3,000; test: 397,566 | Cohen's κ: 0.93–0.99 |
| 2020 | Karwa et al. [12] | Prolog | Rule-based NLP (logic programming language) | Cleveland Clinic | 2,439 colonoscopy reports; validation: 263 | Accuracy: 1.0 |
| 2019 | Lee et al. [11] | Linguamatics I2E [42] | Rule-based NLP (commercial software) | Kaiser Permanente Northern California (KPNC) | 500 colonoscopy reports; validation: 300 | Accuracy: 0.893–1.0 |
| 2017 | Hong et al. [10] | SAS ECC [43] | Rule-based NLP (commercial software) | Samsung Medical Center (SMC) | 49,450 colonoscopy reports and pathology reports | Precision: 0.9927; recall: 0.9983 |
| 2017 | Carrell et al. [44] | HITEx [45] | Statistical NLP (clinical NLP framework) | University of Pittsburgh Medical Center (UPMC) | 3,178 colonoscopy reports and 1,799 pathology reports; training: 1,051; validation: 2,127 | F-measure: 0.57–0.99 |
| 2015 | Raju et al. [46] | CAADRR | Rule-based NLP | MD Anderson | 12,748 colonoscopy reports and pathology reports; validation: 343 | Positive predictive value: 0.913 |
| 2014 | Gawron et al. [47] | UIMA [48] | Statistical NLP (NLP framework) | Northwestern University | 34,998 colonoscopy reports and 10,186 pathology reports; validation: 200 | F1 score: 0.81–0.95 |
| 2013–2015 | | cTAKES [50] | Statistical NLP (clinical NLP framework) | Veterans Administration medical center | 42,569 colonoscopy reports and pathology reports; training: 250; test: 500 | Accuracy: 0.87–0.998 |
| 2011 | Harkema et al. [51] | GATE [52] | Statistical NLP (NLP framework) | University of Pittsburgh Medical Center (UPMC) | 453 colonoscopy reports and 226 pathology reports | Accuracy: 0.89 (0.62–1.0); F-measure: 0.74 (0.49–0.89); Cohen's κ: 0.62 (0.09–0.86) |