Fig. 2 | BMC Medical Informatics and Decision Making

From: A classification framework for exploiting sparse multi-variate temporal features with application to adverse drug event detection in medical records

Fig. 2

Simple subsequence evaluation. A graphical representation of how the subsequence s = 'abba' is evaluated and how the corresponding optimal distance threshold \(\delta_{osp}(abba)\) is selected. At the top of the figure, a training set \(\hat{\mathcal{L}}\) is represented as a collection of negative (red squares) and positive (blue circles) sequences; within each class, sequences are arranged in alphabetical order. The total label entropy of \(\hat{\mathcal{L}}\) is \(I\left(\hat{\mathcal{L}}\right) = 0.918\). At the bottom, the sequences in \(\mathcal{D}\) are arranged on a horizontal axis according to their distance Dist(abba,·) from the subsequence. Among all candidate thresholds δ∈[1,4], the two best are reported in the figure, namely δ=3 and δ=4, which yield an information gain of 0.459 and 0.317, respectively. Therefore, \(\delta_{osp}(abba) := 3\) is chosen as the optimal splitting distance for the subsequence 'abba'.
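The threshold selection described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the toy labels, and the toy distances are all hypothetical, and the split convention (a sequence falls on the "near" side when its distance is strictly below δ) is an assumption. The toy values are invented, not read from the figure, although they were chosen so that the entropy and the two gains match the numbers quoted in the caption.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(labels, distances, delta):
    """Entropy reduction obtained by splitting the sequences at threshold delta.

    Assumed convention: 'near' holds sequences with distance < delta,
    'far' holds the rest.
    """
    near = [y for y, d in zip(labels, distances) if d < delta]
    far = [y for y, d in zip(labels, distances) if d >= delta]
    n = len(labels)
    remainder = (len(near) / n) * entropy(near) + (len(far) / n) * entropy(far)
    return entropy(labels) - remainder


def best_threshold(labels, distances, candidates):
    """Return the candidate threshold with the highest information gain."""
    return max(candidates, key=lambda delta: information_gain(labels, distances, delta))


# Hypothetical toy data: six sequences, two positive and four negative,
# with made-up distances from the subsequence 'abba'.
labels = ["neg", "neg", "neg", "neg", "pos", "pos"]
distances = [1, 2, 2, 3, 3, 4]

print(round(entropy(labels), 3))
for delta in [1, 2, 3, 4]:
    print(delta, round(information_gain(labels, distances, delta), 3))
print(best_threshold(labels, distances, [1, 2, 3, 4]))
```

With these toy values, the class balance of one positive for every two negatives gives a label entropy of about 0.918, and the candidate δ=3 separates a pure negative group from a mixed group, which is why it wins over δ=4.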