Fig. 1 | BMC Medical Informatics and Decision Making

Fig. 1

From: Ontology-driven and weakly supervised rare disease identification from clinical notes

A pipeline for rare disease identification from clinical notes with ontologies and weak supervision. The upper horizontal lines show the proposed pipeline, which is based on clinical notes (e.g. discharge summaries and radiology reports in US MIMIC-III and UK NHS Tayside) and ontologies and consists of two steps (Text-to-UMLS and UMLS-to-ORDO). No annotated data are needed: the pipeline relies on a UMLS extraction tool, SemEHR, and on weak supervision (WS) based on customised rules and BERT-based contextual representations (see details on WS in Fig. 2). The admission ID and ICD-9 codes (linked with dotted lines) are only available for the MIMIC-III data. The lower, dotted lines show a baseline approach based purely on manually assigned ICD codes, also enhanced with ontology matching. (Figure adapted from [7])
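
The two steps named in the caption can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy lexicon stands in for a UMLS extraction tool such as SemEHR, the identifiers (`CUI_A`, `ORDO_1`) are placeholders rather than real UMLS or ORDO codes, and the single length-based rule is a hypothetical stand-in for the customised rules. The BERT-based part of the weak supervision is omitted.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicon standing in for a UMLS extraction tool.
# Keys are lower-cased mentions, values are placeholder concept IDs.
TEXT_TO_UMLS: Dict[str, str] = {
    "example rare syndrome": "CUI_A",
    "ers": "CUI_A",  # short, ambiguous abbreviation of the same concept
    "common condition": "CUI_B",
}

# Hypothetical ontology matching from UMLS concepts to ORDO entries.
# Only rare-disease concepts have an entry.
UMLS_TO_ORDO: Dict[str, str] = {"CUI_A": "ORDO_1"}


def text_to_umls(note: str) -> List[Tuple[str, str]]:
    """Step 1 (Text-to-UMLS): find lexicon mentions as whole words."""
    lowered = note.lower()
    return [
        (mention, cui)
        for mention, cui in TEXT_TO_UMLS.items()
        if re.search(r"\b" + re.escape(mention) + r"\b", lowered)
    ]


def passes_weak_rule(mention: str, min_length: int = 4) -> bool:
    """Hypothetical rule: treat very short mentions as likely false positives."""
    return len(mention) >= min_length


def identify_rare_diseases(note: str) -> Set[str]:
    """Run both steps and return the placeholder ORDO IDs found in a note."""
    ordo_ids: Set[str] = set()
    for mention, cui in text_to_umls(note):
        if not passes_weak_rule(mention):
            continue  # filtered out by the weak rule
        # Step 2 (UMLS-to-ORDO): keep only concepts that map to ORDO.
        if cui in UMLS_TO_ORDO:
            ordo_ids.add(UMLS_TO_ORDO[cui])
    return ordo_ids


if __name__ == "__main__":
    print(identify_rare_diseases("History of example rare syndrome."))
    print(identify_rare_diseases("Patient with ERS and a common condition."))
```

In this sketch the second note yields no ORDO entry: the abbreviation is dropped by the length rule, and the remaining concept has no ORDO mapping.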
