GatewayNet: a form of sequential rule mining

Background The gateway hypothesis (and particularly the prediction of developmental stages in drug abuse) has been a subject of protracted debate since the 1970s. Extensive research has gone into this subject, but has yielded contradictory findings. We propose an algorithm for detecting both association and causation relationships given a discrete sequence of events, which we believe will be useful in addressing the validity of the gateway hypothesis. To assess the gateway hypothesis, we developed the GatewayNet algorithm, a refinement of sequential rule mining called initiation rule mining. After a brief mathematical definition, we describe how to perform initiation rule mining and how to infer causal relationships from its rules (“gateway rules”). We tested GatewayNet against data for which relationships were known. After constructing a transaction database using a first-order Markov chain, we mined it to produce a gateway network. We then discuss various incarnations of the gateway network. We then evaluated the performance of GatewayNet on urine drug screening data collected from the emergency department at LSU Health Sciences Center in Shreveport. A de-identified database of urine drug screenings ordered by the department between August 1998 and June 2011 was collected and then restricted to patients having at least one screening succeeding their first positive drug screening result. Results In the synthetic data, a chain of gateway rules was found in the network which demonstrated causation. We did not find any evidence of gateway rules in the empirical data, but we were able to isolate two documented transitions into benzodiazepine use. Conclusions We conclude that GatewayNet may show promise not only for substance use data, but other data involving sequences of events. We also express future goals for GatewayNet, including optimizing it for speed. Electronic supplementary material The online version of this article (10.1186/s12911-019-0810-3) contains supplementary material, which is available to authorized users.

initiation risk may be partially conditional on age and that risk progresses in stages [2]. At the height of the crack cocaine epidemic, Kandel and Yamaguchi reformed their model to account for its sudden appearance and found that a) cocaine precedes crack cocaine, and b) models using cocaine or crack cocaine exclusively had a poorer fit than those containing both [3].
O'Donnell and Clayton directly claimed a causal connection between marijuana and heroin use [4]. To support this, they note that marijuana and heroin are statistically associated, that marijuana precedes heroin use, and that this association is not spurious. O'Donnell and Clayton alleged that a large cohort of sociologists were skeptical of the gateway hypothesis at the time, and they argued that marijuana causes heroin use according to how sociologists understand causation [4].
Early criticism of causal predictions of the gateway hypothesis takes two major forms: that the evidence does not support the assertion or that the assertion is structurally flawed. In an attempt to replicate Kandel's work, Baumrind obtained a different pathway which implicated that tobacco succeeded cannabis (though both found that the use of socially-accepted substances precedes that of the unacceptable), noting that drug initiation order may be influenced by sociocultural aspects [5].
An additional form of criticism arose in the way that the conclusion itself was being formulated. In an article warning against drawing false conclusions of causation, Baumrind cites O'Donnell and Clayton as an exemplar of this [6]; she later comments that Guttman scales cannot be extrapolated into a sequence of development stages as was done in Kandel's work [5]. Vanyukov et al. argue that the gateway hypothesis may lack falsifiability and that the concept itself is vague [7].
Nonetheless, contemporary support of the gateway hypothesis is mixed. It is known that rats exposed to 9-tetrahydrocannabinol (THC, the primary active compound in cannabis) will increase self-administration of nicotine, heroin, and morphine [8][9][10], showing that cannabis can operate as a gateway drug outside of any particular cultural context. Conversely, it has been argued that the apparent progression is one of several, and that common liability to addiction may be enough to explain patterns in substance use [7]. One longitudinal study of New Zealand children concluded that although there was strong association with a diverse use of other drugs and that this may support a causal model, the underlying causal mechanisms are not well understood [11].
There are two major approaches involving longitudinal data used to assess drug use in human subjects: through self-reporting and through urine drug screening (UDS). In self-reporting studies, subjects are asked to inform investigators about their drug history. This method frequently tracks subjects from adolescence into adulthood to determine both trends in usage and initiation. However, it may be influenced by response bias common to interviews and surveys [12,13].
UDS detects metabolites associated with certain drugs use (usually via a panel assay). This offers a major advantage over self-reporting: it is possible to collect information that would otherwise be withheld in a selfreporting study. It also becomes possible to collect data from subjects who are unable to participate in interviews, such as infants (who are unlikely to consciously participate in drug use, but which may reveal drug use by parents).
The main disadvantage of this method is false positive results arising from misidentification of metabolites in urine. For instance, it is known that quinolone antimicrobials can create false positives for opiate presence [14,15]. Several forms of medication (both prescribed and overthe-counter) are known to trigger false positives in drug tests; ibuprofen, a common analgesic, may trigger false positives for phencyclidine (PCP), cannabinoids, and barbiturates in some screening panels [15].

Previous approaches
The goal of GatewayNet is to predict initiation events and select those relationships which may be causal; therefore, it is important to consider past approaches to this problem. It should be noted that the causation referred to here is not deterministic causation: observation does not support the idea that a gateway drug is always followed by its target. Instead, the idea of probabilistic causation (i.e., event a is likely to cause b) is considered [16].
Statistical treatment of this problem has been attempted in the literature. A simple method uses a linear probability model [11,17], such as the one suggested by Beenstock and Rahav to predict how cigarettes influenced cannabis use Eq. 2, where S nt and C nt are indicators of cigarettes and cannabis respectively by sample n at time t, X is a vector of personality characteristics, D y is the birth cohort for year y, and u nt accounts for unobserved error. The gateway hypothesis predicts that if C is a gateway into S, then β > 0 [17].
Hazard analysis has also been used to assess this problem [17]. In relation to the gateway hypothesis, hazard analysis attempts to ascertain the risk of initiating the use of another drug. Recently, latent transition analysis has been used to assess gateway relationships [18].
Bayesian inference is often used to assess claims of causation. For instance, a Bayesian method was applied to assess data from Norwegian young adults and yielded the conclusion that proneness and accessibility are important contributing factors to hard drug use [19]. Another potential avenue might be in the form of a Bayesian Belief Network (BBN), a directed acyclic graph describing the probability of condition b occurring given condition a [20]; however, the literature does not record such an application of BBNs to the gateway hypothesis.

Association rule mining
Association Rule Mining (ARM) is a well-known method where a set of items called a transaction can be mined to produce association rules of the form a → b, which is a prediction that when a is present, b will co-occur. A related strategy, known as sequential rule mining (SRM), can be used to predict that a will precede b in sequence. Algorithms which use SRM include the Co-occurrence Maps with Sequence PAttern Mining using Equivalent class (CM-SPADE) [21], Sequential PAttern Mining (SPAM) [22], and Closed Sequential Patterns (ClaSP) [23] algorithms.
Sequential rule mining is applicable to a problem such as the Gateway Hypothesis because the latter predicts a causal relationship; if a causes b, then it is necessary for a to precede b. Causation also implies that the first instance of b will not precede the first instance of a.
We claim three contributions to the literature: i) the application of sequential rule mining to the assessment of the Gateway hypothesis, ii) the use of these rules to construct a gateway network describing interaction between, and iii) the introduction of the certainty measure.

Implementation
To better understand the extent to which the Gateway Hypothesis manifests itself in drug use trends, we developed GatewayNet, an algorithm that constructs a directed, weighted graph of drug initiation events derived from a form of association rule mining. We then performed an evaluation against two data sets: a synthetic data set, and an empirical data set derived from UDS data.

Mathematical model
In the following paragraphs, the mathematical basis for GatewayNet (and in particular, initiation rule mining) are described. How this model is defined is critical to interpreting GatewayNet's results, so it is described in detail here.

Precedence Relations
Let E denote a set of events, S : t ∈ Z + → E denote a sequence of events called the history such that S t ⊆ E is the set of events occurring at some time t, a ⊆ E, and b ⊆ e. The predicate a ≺ b means "a precedes b" and is defined in Eq. 3.
It should be noted that a ≺ a may yield true under this definition. The operand a is called the antecedent, while b is called the subsequent.
For brevity, S * t = S 1 t and S * = S * |S| . The predicate a ⊆ b means "a initiates b" (an instance thereof being called an initiation rule) and is defined in Eq. 4. An initiation rule a ⊆ b has a degree which is the maximum between the number of elements in a and the number of elements in b Eq. 5.
Note that (unlike precedence relations) the initiation relation a → a is universally false. This relation can be further generalize d into windowed initiation. Let z ∈ Z+, z = z−1, and a z → b denote an initiation rule within window z. In this generalization, only the most recent z time points are searched for the antecedent in every time-step.
Because a 0 → b is trivially false according to Eq. 6, it has been redefined Eq. 7.
The purpose behind this generalization is to account for large gaps of time between two events. For instance, if an event occurs in S 1 and is not recorded thereafter, can it be said to be associated with an event a time τ ? With windowed initiation a τ − → b, this question can be answered.
It is trivial to show that the set of initiation z-windowed rules is a subset of the set of all initiation rules: the set of rules a 0 → b are equivalent to a → b and is vacuously a subset, and because S t−z t ⊆ S * t by definition, all initiation rules for z > 0 are also initiation rules. Thus, the rule a

Initiation rule mining
We elicit initiation rules de novo using a method we call initiation rule mining (IRM). IRM is similar in design to ARM: candidate rules are proposed, then based off of their support in a transaction database, are assessed for their validity. The primary difference is that rather than looking within the same transaction, IRM mines rules by looking between different transactions contained in a single history.
Let T represent a set of histories (the transaction database) and S ∈ T. One possible incarnation of T (the incarnation used by GatewayNet) is illustrated in Table 1.  Each record within the table is a Criteria must exist for candidate rules to be accepted or rejected, and several are traditionally used in ARM that apply here. Count Eq. 8 and support (Eqs. 9 and 10) are perhaps the most basic and may be used to filter out rules which run the risk of being statistically invalid [20,24]; however, high limits may preclude many relationships from being discovered. Confidence is a measure of how likely the rule occurs when its antecedent occurs Eq. 11 and may be a more suitable measure for this purpose. Lift Eq. 12 is a measure of interest which considers the case where a and b are independent [20]. Finally, conviction is the frequency that the rule makes an incorrect prediction Eq. 13 [20]. Thresholds for inclusion are expressed as l count , l sup , l conf , l lift , l conv , and h conv respectively.
The subset of candidate initiation rules for which these criteria met are called the set of mined rules. A rule is an element of the mined rules if and only if: As with ARM, the a priori principle may be used with IRM to reduce the number of candidates that must be considered when testing a proposed initiation rule for inclusion. An item set X is considered frequent if a) count(X) ≥ l count and b) sup(X) ≥ l sup . Let d max ∈ Z + be the maximum degree for which to mine rules. Thus, rule proposal can be implemented as shown in Fig. 1, where X ⊗ Y is the outer product of X and Y.

Gateway rules
Recall that the gateway hypothesis predicts that the probability that b will arise out of a is greater than the probability that it would happen due to some other circumstance. When when we say this, we say that a is a gateway into b and denote that relationship using a b. An initiation rule is known as a gateway rule (denoted a b) whenever the probability that a → b Eq. 14 is greater than the probability that any combination of the remaining antecedents will initiate b. This is equivalent to positing that a (either directly or indirectly) causes b.
A simple way of ensuring this condition is to calculate the proposed rule's certainty Eq. 1. The condition cert(a → b) = 1 means that the probability that the subsequent arose out of a is precisely 50%, or alternatively that 50% of the remaining instances arose out of a b. Therefore, by Eq. 1, we posit that a is the most likely cause of b when cert(a → b) > 1. When the limit of p(a → b) approaches 1, cert a → b approaches ∞: absolute certainty means that we posit b arises only from a This test is necessary (albeit not sufficient) for the assertion that a given event is the singular cause of another. Even in [1], this degree of causation is not predicted: cigarettes or alcohol leads to cannabis. This method could only be used to therefore test the idea that cannabis singularly leads to other illicit drugs.
To test the most general form of the gateway hypothesis, it must be the case that the association occurs by greater probability than chance alone. Thus, we suggest that that gateway rules can be established using the condition cert(a → b) > l cert , which is the maximum certainty for which we will reject a → b as causal. To satisfy this hypothesis, the l cert must be at least the threshold where we would admit chance occurrence Eq. 16.
Consider the transaction database in Table 1; if one calculates the count for all of the item sets and degree one rules in the transaction database, then the values provided in Table 2 can used to calculate support; for instance, I 1 has a support of 0.8 because it is involved in 4 or 5 histories. Initiation rule support can be calculated by finding all histories where Eq. 4 holds; because of this, I 1 → I 3 has a support of 0.6 ( Table 2).
Using this table, it is possible to derive the aforementioned metrics: for instance, lift( To determine whether or not this I 1 → I 3 is also a gateway rule, one calculates cert(I 1 → I 3 ) = (0.6)/(0.6−0.6) = 0.6/0. Although this value is undefined, it can be interpreted as approaching ∞; thus, I 1 I 3 can be said to hold.
rules over window z = 2 (Table 3). Because the history for ID 3 does not initiate I 2 within the window, the support for → I 2 ) ≤ 1 and therefore does not meet Eq. 1.

Visualization
The final phase that GatewayNet performs is visualization. This produces a directed graph which depicts relationships between initiation rules. Let G be a weighted digraph G = E , R , where events E constitute the graph's vertices, and rules R constitute edges between events. Let r ∈ R be a quadruple such that r ≡ e 1 ∈ E , e 2 ∈ E , w, c . Then for rule a → b, vertices {a, b} ∈ E , edge r a→b is defined by Eq. 17, and membership of a → b in G is defined by Eq. 18.

Synthetic data
To better characterize GatewayNet's behavior, we created a synthetic data set (Additional file 2) for which interaction is well characterized. This data set is explicitly constructed so that a complete history is available for each subject in the data set. The synthetic data was generated according to a mathematical model described in the following paragraphs. This was done for the purposes of validation; although we also tested against empirical data, it is important that we verify Table 3 Counts and supports for all item sets and initiation rules for z = 2 and of degree 1 in that GatewayNet is well-behaved. The forthcoming model describes a population which is fixated on events it considers preferential, but allows for experimentation with other events. Let E consist of events {I 1 , I 2 , ..., I n }, where n ∈ Z + , E 0 ≡ { } ∪ E, e 1 ∈ E 0 , and e 2 ∈ E 0 . A Markov chain of order 1 P (Additional file 1) is randomly constructed to represent transition probabilities from e 1 to e 2 (Table 4). Two real parameters are provided: the affinity f and interest s. Affinity represents the probability that a subject will be satisfied with e 1 and will ensure that the event occurs at time t + 1. Interest is a weight that represents the likelihood that the subject would independently ensure e 2 will occur. A special event, , represents the null event, which represents a transition from no event.
Naturally, each row in P must add to exactly 1.0; however, care must be taken to ensure that this criterion is met. Let P represent an |E 0 | × |E 0 | matrix. Each element of P is populated using Eq. 19: an event's self-transition e 1 → e 1 is simply represented by its affinity, while any other transition is randomly distributed from the remaining probability. Because the row sum may not add up to exactly 1, each element is then normalized across the row Eq. 20.
Because the generated history is considered to be a complete one, the initial state is always . Thus, time-point t = 1 is considered to be the first opportunity for which an initiation event can occur. At each time-point, between two and three initiations may occur. Each history may have up to 12 records in it; in total, we generated 56,578 simulated transactions over 8192 histories. Most of the events had support above 10% (Table 5). In total, 29,412 events were generated, corresponding to an average of 2.49 events per history.
We generated two gateway networks for the synthetic data: one for l sup = 0.20 (Fig. 3), and one for l sup = 0.025   (Fig. 4). In both instances, l conf = 0.5, and l lift = 1. Versions of the network without gateway rule highlighting, with gateway rule highlighting, and just the gateway rules were generated. Additionally, gateway networks were generated with window sizes of z = 1, z = 2, and z = 3 (Fig. 5). This was done to determine whether windowing had an effect on the synthetic data.

Empirical Data
Synthetic data is useful for evaluating the performance of GatewayNet since it is expected that some structures should arise within the output (therefore providing a method of validation). However, it should be noted that synthetic data does not necessarily model the real world; to test performance in that environment, an empirical data set was used. UDS data obtained from 71,312 patient between August 1998 and June 2011 over nearly 111,359 emergency room visits at LSU Health Sciences Center (the hospital portion now belongs to University Health) in Shreveport, LA. This hospital is a Level I trauma center that serves Fig. 3 The synthetic data set expressed as a gateway network. This network is generated for l sup = 0.05, l conf = 0.5, and l lift = 1.0. It is not apparent if any events qualify as gateway rules in this graph Fig. 4 The gateway network for l sup = 0.025. Expectedly, this network generates additional interactions. Color gradient represents the certainty of gateway rules the 7 parishes in LERN Region 7 (including the Shreveport/Bossier City area) [25]. Because Caddo Parish (where Shreveport resides) is adjacent to both the Texas and Arkansas borders, patients from east Texas and southern Arkansas are also frequently served.
Prior to processing, we removed demographic data and then assigned each patient a random identifier (ID) by first shuffling the list of patients, then assigning each patient in the shuffled list a sequential ID. Additionally, screening dates were converted to their corresponding Lilian day number. The day number was then scaled by 1440 (the number of minutes in the day) and the time of screening in minutes was added to the date. Finally, each patient's screening time was calibrated to the first by subtracting the first screening's timestamp.
This was done for to ensure that the screening time is expressed as an integer. Additionally, because methamphetamine and MDMA are amphetamines and methadone is an opiate, any instance of either was converted to this category prior to any processing. Because many patients only visited once or did not test positive for any drugs, we restricted the list of histories to those with at least two time-points and at least one positive result. Finally, a history was only accepted if there was at least one more time-point following the time-point of the first positive result. In total, 11,364 histories over 42,745 time-points remained.
This data was first processed using unwindowed IRM (Fig. 6). We set the parameters l count = 30, l sup = 0, l conf = 0.25. Minimum count was used instead of support because of the relatively few number of histories involving drug use ( Table 6).  In addition to performing unwindowed mining, we mined initiation rules within a window of 525,600 minutes (1 year) (Fig. 7). This was done to remove rules which were primarily supported by spurious positives. Opiates were sometimes administered to incoming patients or as a result of emergency surgery. Because of usage this arising from medical intervention rather than choice, we further removed rules of the form x → OPIATES (Fig. 8).

Results
Because we evaluated two data sets, we discuss the results for each data set separately. In the following subsection, we will discuss the results of applying GatewayNet to the synthetic data set. Afterwards, we discuss the results with respect to the empirical data from LSUHSC-S.

Synthetic data
A relatively simple network with multiple interactions was generated (Fig. 3). It would appear that with our synthetic data set, event I 1 initiates I 3 , I 5 and I 6 , while I 5 initiates I 6 and I 3 initiates I 5 . In this data set, I 1 , I 3 , and I 5 are gateway events.   Fig. 3, there appear to be no clear gateway; however, this is misleading. The set of actual gateway events predicted by our algorithm are I 1 , I 3 , and I 5 , (Fig. 9). This may not be very surprising: these are also the most frequent events (Table 5). This can be more easily seen by removing the nodes which do not correspond to gateway rules (Fig. 10).
It can be clearly seen that I 3 I 5 with a high degree of certainty (32.3254). However, it is also true that I 3 I 5 Fig. 9 The gateway network with gateway rules highlighted. From this graph, we can tell that I 1 , 1 3 , and I 5 are gateway events. Because of their blue color, I 1 serves as a strong gateway into I 6 and I 3 into I 5 .
Color gradient represents the certainty of gateway rules (albeit with a weaker certainty of 19.4058) Fig. 5. Additionally, Both I 1 (32.8393) and I 5 (4.0736) are gateways into I 6 . The 2 nd degree rules {I 1 , I 3 } → I 5 and {I 1 , I 5 } → I 6 were mined.
In this case, windowing did not effect the vertices in the gateway network; however, the edges reported did change (Fig. 5). Notably, the number of edges associated with I 2 and I 7 increased with the window size. However, some of the rules associated with these events are not mined in the unwindowed sample, possibly due to diminishing support as maximum count increases.

Empirical data
No gateway rules were observed in the data; however, some trends could be observed. In particular, tricyclic antidepressants (TCAs) and barbiturates both initiate benzodiazepine use (Fig. 6). This is not surprising; TCAs and barbiturates were once regularly prescribed, but have been replaced benzodiazepines (which have fewer risks). Although this does not qualify as a gateway event, it confirms that a known initiation even can be captured; the TCA relationship is also filtered out in the year-long window (Fig. 7).
Curiously, cocaine, opiates, and benzodiazepines initiate cannabinoids according to the data (Figs. 6, 7). In addition to barbiturates, amphetamines also initiate benzodiazepines (Figs. 6, 7). Before filtering out initiation rules involving opiates in the subsequent, cocaine, benzodiazepines, and amphetamines were also found to initiate opiates (Figs. 6, 7 and 8). The only drugs removed between the unwindowed and windowed variants were TCAs (Fig. 7).

Fig. 10
The gateway network with all non-gateway rules removed. Color gradient represents the certainty of gateway rules

Discussion
As with the "Results" section, this section discusses the results for the synthetic data and empirical data separately. In the following subsection, we will discuss whether or not mining the synthetic data yielded the expected results. Afterwards, we discuss observations noted with respect to mining the LSUHSC-S UDS data.

Synthetic data
The synthetic data showed a peculiar phenomenon: because l cert = 1, it might be expected that one (and only one) event may serve as a gateway into another. As it turns out, this expectation is unwarranted: a high certainty means that a may be necessary to explain an event, not that it is sufficient to do so. In this case, two hidden gateway rules {I 1 , I 3 } I 5 and {I 1 , I 5 } I 6 were discovered. But what does this mean?
Let us first consider {I 1 , I 3 } I 5 . One possibility is that I 1 and I 3 are co-requisite for the event. Consider the model from which the synthetic data is derived. In this model, an event for which there is high affinity will be quickly retained and will become recurrent; however, there is no reason that this might be the only event to occur within that time-point. Because of this, a history will show frequent experiments with other events: each event is an avenue for other events to occur alongside it.
With respect to drug use, this model is perhaps pessimistic: it predicts co-usage of one drug with other drugs, even when the user has high affinity with another one. In our model, frenetic experimentation seems to occur as the search for other events with high affinity continues. Is this a reasonable model of drug use? It is known that comorbidities in drug use often exist in reality. In Australasian countries, rates of experimentation of around 40% have been observed [11,26].
If this does reflect trends in drug use, then it is an interesting result, as our method would be able to detect this phenomenon. In the synthetic data, I 1 → I 5 at a probability of 95.099%, whereas I 3 → I 5 with a probability of 96.999%. This concedes the possibility that they are used in combination.
But as it turns out, that is not the only possible explanation, because without any windowing, a gateway rule may be formed so long as I 1 and I 2 precede I 5 at any time in the past. We should also note that I 1 I 2 with a probability of 94.069%. It possible that I 3 is directly responsible for the transition into I 5 , and in reality, it is probably so: the first-order Markov chain used to model the relationships between events cannot explicitly express {I 1 , I 3 } → I 5 , nor does it actually encode I 1 → I 3 → I 5 because it lacks the required history. Because of this, the latter is probably a better explanation.

Empirical data
The LSUHSC-S data did not exhibit any gateway rules, and there are several potential reasons for this. It could simply be that the gateway hypothesis does not manifest itself in the population (or at least this sample). This represents acceptance of the null hypothesis that the drugs involved do not progress in development stages as predicted by Kandel and Yamaguchi [1][2][3].
However, it must be stated that there is another source of error which is likely to be present in the data: the drug screenings are collected during trauma center visits, and this offers an incomplete usage history of each patient. In many countries, illicit drug use is a criminal offense, and even if it were not, drug use is commonly voluntary and this means that intoxication is probably a desirable state for the user. Because of this, we can expect that patients will not seek the help of the trauma center merely because they have consumed an illicit drug; instead, we would expect to see that these patients will do so because of the perceived risk of dying or due to circumstantially related incidents.
Because of this, we do not know the patient's drug history between intervention, and there may be biases due to the time it takes to clear metabolites from their systems. It is known, for instance, that some drug metabolites are excreted at different rates than others [27]. The cannabinoid metabolite 11-nor-9-carboxy-9tetrahydrocannabinol (THC-COOH) has a urinary halflife of about two days [28] compared to the 7.5 hour half-life of the cocaine metabolite benzoylecgonine [29]. This may mean that some initiation events will not be captured, particularly whenever screenings are separated by months or years.
Additionally, it was previously mentioned that all of the drugs tested by the panel (with the exception of cannabinoids) had a recognized medicinal use at the time. This true of opiates (for instance) to such a degree that initiations into opiates had to be filtered out of our data. However, we do not know if or when certain drugs were administered legitimately to patients (e.g., via prescription or surgical intervention prior to screening), and had gateway rules been found, they would have been suspect because of this.
We did, however, observe what is very likely to be this phenomenon in action. By finding that barbiturates and TCAs initiate benzodiazepines (Fig. 6), we were able to observe a known transition in medical practice. This initiation event was lost when initiation rules were limited to a year-long window (Fig. 7); since barbiturates and TCAs and benzodiazepines are antidepressants and anxiolytic respectively, it may have been the case that these were administered to patients en route as the result of psychiatric intervention. One may therefore predict that additional emergency intervention was simply not required within this window.
By removing rules with opiates in the rule's subsequent, we notice that the transition into cannabinoids remains (Fig. 8). Interestingly, this is suggested by Kandel because she initially hypothesizes that the association is bidirectional [1]; in fact, we generally see this prediction also holding for benzodiazepines and cocaine. However, it must be restated that we did not find any gateway rules and that this is association. One potential explanation for this is that cannabinoids may have be easier to obtain illicitly over the sampling period than alternatives.

Comparison to existing software
Several implementations of SRM are mentioned the "Discussion" section which may be compared to Gate-wayNet. As far as we are aware, no software utilizes the certainty measure, so any calculation thereof is extrinsic to other SRM software. However, a comparison can be made with existing software as long as support information of each item set mined is available.
To supply the other algorithms, we utilized SPMF, which implements both the SPADE and SPAM algorithms [30]. SPMF takes slightly different input than GatewayNet, and the input data was converted by assigning item labels to integer values and by combining each history into a single line. As SPMF does not have an option to set minimum count, this was achieved by using setting minimum support to 0.0005166 (30/58067 transactions). Both SPAM and SPADE produced equivalent output that varies only in output order and consists of 56,767,617 individual item sets. Because of this, the remaining analysis occurs on the SPAM output.
Unsurprisingly, both algorithms calculate the same number of 1-sets as GatewayNet does; however, special care must be taken to make the results comparable to GatewayNet's because SPMF outputs its item sets as a series of time-points. In order to perform this comparison, we first "trimmed" SPMF's item sets so that the repeated sequences at either extrema are truncated; then, each such item set with the maximum support is counted. In total, this reduced the number of item sets to 16,415,526. From this set, we calculated all frequent subsets in the data. This was first done by mining all subsets of length k for each SPMF item set S, for 0 < k < |S|. A rules et was then generated with a minimum support of 0.025% and propagated to GatewayNet's visualization software, gatewaynet-links. We found that no rule had a lift of 1 or greater; the graph has also reduced the set of gateway rules to I 1 I 3 and I 1 I 5 (Fig. 11). This might be explained by a loss of records due to SPMF's output. In that output, {A, B} represents a single transaction where this set of events occurs, and it is distinct from {{A}, ..., {B}, ...} or {{B}, ..., {A}, ...}. In contrast, Gate-wayNet considers such histories to correspond to the item Fig. 11 The gateway network in Fig. 6, except processed using SPMF. Two gateway rules, I 1 I 3 and I 1 I 5 remain highlighted, but other rules have dropped out because of incomplete item set counts set {A, B}. Because of this, many small sequences of low support may be culled by the support threshold, therefore altering each item set's frequency. Nonetheless, using SPMF as a basis for item set mining approaches the results of GatewayNet.

Conclusions
IRM (as implemented by GatewayNet) shows promise to demystify the Gateway Hypothesis, but it may also be useful in the prediction of any event (as our synthetic set demonstrates). The quality of data provided to Gate-wayNet will strongly affect its output; however, with good data, it may not only be able to highlight initiation events, but also actual gateway events as well.
The ability of GatewayNet to predict initiation rules is expectedly dependent on data quality. In our case, emergency room UDS yielded no support for the gateway hypothesis; however, due to the nature of the screening, a full patient history might not be available.
It should be noted that GatewayNet is not presently optimized for speed. The a priori algorithm is known to be sub-optimal for association rule mining, and other SRM algorithms have explored optimization techniques regarding speed. In the future, we would like to explore techniques such as FP-growth as avenues for improving runtime speed.