BMC Medical Informatics and Decision Making

Table 4 The data summary of the training and testing datasets

From: A pattern learning-based method for temporal expression extraction and normalization from multi-lingual heterogeneous clinical texts

Language	Dataset	#texts	#sentences	#clauses	#temp. exp.	#avg. temp. exp./text
Chinese	Training	276	7747	23,423	3525	12.77
	Testing	134	4196	11,827	1778	13.27
	Total	400	11,943	35,250	5303	13.26
English	Training	100	257	694	155	1.55
	Testing	300	787	1703	398	1.33
	Total	400	1044	2397	553	1.38


ISSN: 1472-6947
