BMC Medical Informatics and Decision Making

Table 2 The corpora used to generate the embeddings

From: Recent advances in Swedish and Spanish medical entity recognition in clinical texts using deep neural approaches

Corpora	Swedish size	Swedish vocabulary size	Spanish size	Spanish vocabulary size
Out-of-domain (gen)	2.89 GB	1 040 025	8.3 GB	1 000 655
General medical (genMed)	130 MB	118 683	176 MB	168 500
EHR	1.2 GB	300 825	1.1 GB	286 986

ISSN: 1472-6947
