Table 3 Measured precision, recall and F1-score performance on the three NLP tasks implemented in the pipeline, evaluated on test sets

From: LiSA: an assisted literature search pipeline for detecting serious adverse drug events with deep learning

 

|                  | AE-Drug relationship classification |      |      | Named Entity Recognition |      |      | Seriousness classification |      |      |
| Model            | P        | R        | F1       | P        | R        | F1       | P        | R        | F1       |
|------------------|----------|----------|----------|----------|----------|----------|----------|----------|----------|
| UMLSBERT         | 0.94     | **0.93** | **0.93** | 0.94     | **0.96** | 0.95     | 0.89     | 0.87     | 0.88     |
| bioBERT          | 0.91     | **0.93** | 0.92     | 0.96     | 0.95     | 0.95     | 0.89     | 0.90     | **0.89** |
| blueBERT         | 0.93     | 0.89     | 0.91     | 0.96     | 0.93     | 0.94     | 0.73     | 0.83     | 0.78     |
| sciBERT          | 0.94     | 0.92     | **0.93** | 0.95     | 0.95     | 0.95     | **0.92** | 0.81     | 0.86     |
| Bio_ClinicalBERT | 0.94     | 0.92     | **0.93** | **0.97** | 0.92     | 0.94     | 0.68     | **0.93** | 0.79     |
| BERT             | 0.90     | 0.89     | 0.90     | 0.95     | 0.92     | 0.93     | 0.76     | 0.74     | 0.75     |
| PubMedBERT       | **0.95** | 0.90     | 0.92     | 0.96     | 0.95     | **0.96** | 0.87     | 0.91     | **0.89** |
  1. The best value per column is in bold. For the drug/AE entity recognition task, the displayed metrics concern only the AE class. The best model was selected for each task: PubMedBERT for NER and seriousness classification, and UMLSBERT for AE-Drug relationship classification.
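
P, R and F1 denote the standard precision, recall and F1-score. As an illustrative sketch only (not code from the paper; the label arrays below are made-up placeholders), metrics of this kind can be computed for a binary task such as seriousness classification with scikit-learn:

```python
# Hypothetical example: precision/recall/F1 for a binary classification task.
# The gold labels and predictions are placeholders, not data from the study.
from sklearn.metrics import precision_recall_fscore_support

y_true = [1, 0, 1, 1, 0, 1, 0, 0]  # gold labels (1 = serious, 0 = non-serious)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # model predictions

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary"
)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```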