Deep learning approach to detection of colonoscopic information from unstructured reports

Table 8 Comparison between one-hot encoding and pre-trained word embedding

Labels	Bi-LSTM-CRF + one-hot encoding			Bi-LSTM-CRF + pre-trained word embedding
Labels	Precision	Recall	F1 score	Precision	Recall	F1 score
PROCEDURE NOTE ^a
SEDATION	0.9881	0.9953	0.9916	0.9888	0.9950	0.9918
SEDATIONLEVEL	0.9987	0.9938	0.9962	0.9985	0.9958	0.9971
MEDICATION	0.9991	0.9954	0.9972	1	0.9959	0.9980
DOSAGE	0.9929	0.9897	0.9913	0.9959	0.9920	0.9939
ANTISPASMODICS	0.9962	1	0.9981	0.9978	1	0.9989
DRE	0.9967	0.9990	0.9978	0.9958	0.9989	0.9973
PREPARATION	0.9892	0.9914	0.9903	0.9879	0.9928	0.9904
DEVICE	0.9991	0.9991	0.9991	0.9980	0.9979	0.9979
EXTENT	0.9883	0.9951	0.9916	0.9960	0.9967	0.9963
COLONOSCOPIC FINDINGS ^b
LESION	0.9881	0.9953	0.9916	0.9888	0.9950	0.9918
LOCATION	0.9987	0.9938	0.9962	0.9985	0.9958	0.9971
SHAPE	0.9991	0.9954	0.9972	1	0.9959	0.9980
COLOR	0.9929	0.9897	0.9913	0.9959	0.9920	0.9939
SIZE	0.9962	1	0.9981	0.9978	1	0.9989
NUMBER	0.9967	0.9990	0.9978	0.9958	0.9989	0.9973
BIOPSY	0.9892	0.9914	0.9903	0.9879	0.9928	0.9904
NEGATION	0.9991	0.9991	0.9991	0.9980	0.9979	0.9979
MICROAVG	0.9883	0.9951	0.9916	0.9960	0.9967	0.9963

^aProcedure notes were written as semi-structured text.
^bColonoscopic findings were written as free text.
The best results are marked in bold.
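The F1 scores in Table 8 can be sanity-checked against the reported precision and recall. The sketch below assumes F1 is the standard harmonic mean of precision and recall; the table itself does not state the formula, and the function name `f1_score` is illustrative. Because the inputs are rounded to four decimals, recomputed values may differ from the reported ones in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (label, precision, recall, reported F1), copied from the
# one-hot encoding columns of Table 8.
rows = [
    ("SEDATION", 0.9881, 0.9953, 0.9916),
    ("MEDICATION", 0.9991, 0.9954, 0.9972),
    ("EXTENT", 0.9883, 0.9951, 0.9916),
]

for label, p, r, reported in rows:
    computed = f1_score(p, r)
    # Differences in the fourth decimal are expected from input rounding.
    print(f"{label}: computed={computed:.4f} reported={reported:.4f} "
          f"diff={abs(computed - reported):.4f}")
```

A difference much larger than the rounding error for any row would suggest a transcription problem in that row rather than a different metric definition.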

ISSN: 1472-6947