Table 1 Previous studies on clinical NLP for colonoscopy reports

From: Deep learning approach to detection of colonoscopic information from unstructured reports

| Year | Author | NLP method (tool) | NLP category | Setting | Dataset | Performance |
| --- | --- | --- | --- | --- | --- | --- |
| Current study | Seong et al. | Bi-LSTM-CRF, BioBERT | Deep learning-based NLP | Samsung Medical Center (SMC) | 280,668 colonoscopy reports; training and test: 1,000–5,000; embedding: 280,668 | F1 score: 0.9564–0.9862 |
| 2022 | Bae et al. [13] | SmartTA | Rule-based NLP (commercial software) | Seoul National University Hospital | 54,562 colonoscopy and pathology reports; training: 2,000; test: 1,000 | Accuracy: 0.99–1.0 |
| 2021 | Vadyala et al. [41] | Bio-Bi-LSTM-CRF | Deep learning-based NLP | Veterans Affairs Medical Centers (VA) | 4,000 colonoscopy and pathology reports; training: 3,200; test: 400; validation: 400 | F1 score: 0.85–0.964 |
| 2020 | Fevrier et al. [40] | SAS PERL regular expression | Rule-based NLP (commercial software) | Kaiser Permanente Northern California (KPNC) | 401,566 colonoscopy and pathology reports; training: 1,000; validation: 3,000; test: 397,566 | Cohen's κ: 0.93–0.99 |
| 2020 | Karwa et al. [12] | Prolog | Rule-based NLP (logic programming language) | Cleveland Clinic | 2,439 colonoscopy reports; validation: 263 | Accuracy: 1.0 |
| 2019 | Lee et al. [11] | Linguamatics I2E [42] | Rule-based NLP (commercial software) | Kaiser Permanente Northern California (KPNC) | 500 colonoscopy reports; validation: 300 | Accuracy: 0.893–1.0 |
| 2017 | Hong et al. [10] | SAS ECC [43] | Rule-based NLP (commercial software) | Samsung Medical Center (SMC) | 49,450 colonoscopy and pathology reports | Precision: 0.9927; Recall: 0.9983 |
| 2017 | Carrell et al. [44] | HITEX [45] | Statistical NLP (clinical NLP framework) | University of Pittsburgh Medical Center (UPMC) | 3,178 colonoscopy reports and 1,799 pathology reports; training: 1,051; validation: 2,127 | F-measure: 0.57–0.99 |
| 2015 | Raju et al. [46] | CAADRR | Rule-based NLP | MD Anderson | 12,748 colonoscopy and pathology reports; validation: 343 | Positive predictive value: 0.913 |
| 2014 | Gawron et al. [47] | UIMA [48] | Statistical NLP (NLP framework) | Northwestern University | 34,998 colonoscopy reports and 10,186 pathology reports; validation: 200 | F1 score: 0.81–0.95 |
| 2013–2015 | Imler et al. [8, 9, 49] | cTAKES [50] | Statistical NLP (clinical NLP framework) | Veterans Administration medical center | 42,569 colonoscopy and pathology reports; training: 250; test: 500 | Accuracy: 0.87–0.998 |
| 2011 | Harkema et al. [51] | GATE [52] | Statistical NLP (NLP framework) | University of Pittsburgh Medical Center (UPMC) | 453 colonoscopy reports and 226 pathology reports | Accuracy: 0.89 (0.62–1.0); F-measure: 0.74 (0.49–0.89); Cohen's κ: 0.62 (0.09–0.86) |