Fig. 3 | BMC Medical Informatics and Decision Making

Fig. 3

From: A classification framework for exploiting sparse multi-variate temporal features with application to adverse drug event detection in medical records

Advanced subsequence evaluation. A graphical example of how lr works. The scenario is similar to that of Fig. 2. At the top, a labeled sequence dataset \(\hat{\mathcal{L}}\) is depicted, containing both actual and empty sequences; the latter are marked by ∅. The total entropy of \(\hat{\mathcal{L}}\) is 0.881. Below the dataset representation, the sequences are arranged along a horizontal line according to their distance from the shapelet 'abba'. lr considers two cases, marked A and B. In A, lr places all ∅s on the left side of the split by assigning \(Dist(\texttt{abba},\emptyset) := 0\). Conversely, in B, empty sequences are placed on the right side, at a distance from the shapelet equal to \(Dist(\texttt{abba},\emptyset) := \max_{\mathcal{S}_{\alpha} \in \hat{\mathcal{L}}} Dist(\texttt{abba},\mathcal{S}_{\alpha}) = 4\). The optimal distance threshold is the one with the highest split gain among those obtained in A and B. For both cases, the figure reports two example thresholds, δ=3 and δ=4; δ=4 turns out to be the best option because of the gain (0.193) it yields in A. Therefore, \(\delta_{osp}(\texttt{abba}) := 4\).
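The two-case evaluation described in the caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`entropy`, `split_gain`, `best_split`) are hypothetical, the split convention (distance strictly below δ goes left) is assumed, and the toy distances and labels are invented rather than read off the figure. The 3-versus-7 class mix is chosen only so that the total entropy comes out near 0.881.

```python
from math import log2

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

def split_gain(dists, labels, delta):
    """Information gain of splitting at threshold delta.
    Assumed convention: dist < delta goes left, dist >= delta goes right."""
    left = [y for d, y in zip(dists, labels) if d < delta]
    right = [y for d, y in zip(dists, labels) if d >= delta]
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))

def best_split(dists, labels):
    """Try both placements of empty sequences (None entries in dists):
    case A assigns them distance 0, case B the largest observed distance.
    Returns (gain, delta, case) for the highest-gain threshold."""
    observed = [d for d in dists if d is not None]
    d_max = max(observed)
    best = (-1.0, None, None)
    for case, fill in (("A", 0), ("B", d_max)):
        filled = [fill if d is None else d for d in dists]
        for delta in sorted(set(observed)):
            gain = split_gain(filled, labels, delta)
            if gain > best[0]:
                best = (gain, delta, case)
    return best

# Invented toy data: None marks an empty sequence.
labels = ["+", "+", "+", "-", "-", "-", "-", "-", "-", "-"]
dists = [4, 4, 4, None, None, 1, 2, 3, 0, 2]

gain, delta, case = best_split(dists, labels)
print(f"total entropy = {entropy(labels):.3f}")
print(f"best: case {case}, delta = {delta}, gain = {gain:.3f}")
```

On this toy data, placing the empty sequences on the left (case A) with δ=4 separates the two classes completely, so that case wins; with other data, case B may yield the higher gain instead.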
