Main findings
This is the first study to find evidence of automation bias in the presence of e-prescribing CDS alerts. We found that when CDS was correct, it reduced overall prescribing errors by 58.8%. This is consistent with prior literature showing that e-prescribing CDS can reduce prescribing errors [3–5]. However, when CDS was incorrect, it increased errors by 86.6%. This increase was due to AB, that is, the capacity of incorrect CDS to adversely influence participants’ prescribing decisions.
We found evidence of participants making omission errors: they failed to detect 28.7% more prescribing errors when CDS failed to provide alerts, compared to a control condition with no CDS. This finding was significant across all levels of task complexity and is of particular concern because the missed prescribing errors ranged from significant to potentially lethal in severity, with most classified as serious.
Likewise, participants made commission errors by acting on clinically incorrect, false-positive alerts, failing to prescribe 56.9% more necessary medicines than in the control condition. This effect was also significant across all levels of task complexity.
These findings are consistent with, and add to, research on automation bias in healthcare, which has found evidence of omission errors in the computer-aided detection of cancers in screening mammography [15, 16], and of commission errors in the computerized interpretation of EKGs [17], in answering clinical questions assisted by CDS [18], and in deciding what to prescribe for clinical scenarios [19].
Interestingly, while participants over-relied on automation, there was also evidence of disagreement with the CDS provided to them. Participants overrode correct alerts and, in doing so, made the very prescribing errors the CDS was warning them to avoid. They also failed to prescribe medicines that contained no errors and for which no alerts were presented. Reasons given for overriding correct CDS alerts commonly referred to the condition the medicine was intended to treat (e.g. “VTE risk and pain management”, “vomiting”) or indicated that the medicine was regularly taken by the patient (e.g. “patient usual dose”). Participants commonly cited the lack of a true contraindication as the reason for overriding incorrect CDS alerts, with many referring to the drug information. For example, “There is not any interaction listed on the drug information”. However, regular patient medicines and the condition treated were also mentioned as reasons for overriding incorrect CDS alerts. This suggests that participants not only had trouble determining when CDS was wrong; some also had trouble recognizing when it was right and that the alerts, or lack thereof, were beneficial and should be heeded.
Interruptions and task complexity did not impact automation bias
Interruptions did not affect the rate of AB errors, nor did they affect error rates in the control condition. However, interruptions are a complex phenomenon in which multiple variables, including the characteristics of primary tasks, an individual’s cognitive state, the interruptions themselves, and the environment, may influence their impact on clinical tasks and errors [22]. Despite clear evidence that interruptions can disrupt clinical tasks, their effects are complex and may not always be detected [32].
We did not detect any impact of interruptions on prescribing errors in our experiment, replicating earlier results [33]. Upon task resumption, participants had ample time to recall their next action, and the task environment provided cues to aid resumption; for example, partly completed orders remained visible on screen. One possible reason for not detecting an effect of interruptions is thus that disruption was minimized by these cues within the user interface [34]. This is consistent with observations from other studies of interruptions to computer-based tasks, where participants were aided by the screen environment and were able to resume an interrupted task [35, 36]. Performance under the cognitive load of more demanding competing tasks in a clinical environment may have produced a different outcome.
Contrary to expectations, the task complexity manipulation also had no effect on AB errors. This is in stark contrast to the findings of Bailey and Scerbo [25], who found that performance on a system monitoring task deteriorated with increased task complexity, which they defined in terms of the cognitive demands placed on the participant. Their monitoring tasks required the identification of critical deviations outside the normal operating range. In the less complex tasks, participants monitored analogue gauges with marked critical regions; in the more complex tasks, they monitored a display of raw numbers and had to remember the critical values for four different types of parameters.
Had the complexity manipulation altered the difficulty of the prescribing task, we would have expected a higher error rate in the high-complexity control conditions. However, the observed difference was small and non-significant. This contrasts with the findings of Goddard et al. [19], who found a significant effect of task difficulty, as classified by a panel of practitioners, on decision accuracy without CDS between medium and difficult scenarios. However, they found that task difficulty had no effect on commission errors.
The high error rate at both levels of complexity in the control conditions, with participants missing nearly half of all prescribing errors, suggests that the difference in complexity between the two conditions may not have been large enough for differences in error rates to emerge.
Implications
When clinical decision support is correct, it provides an important opportunity to detect and recover from prescribing errors, and can thereby reduce them. However, the finding of automation bias suggests that this additional layer of defence weakens, or at worst replaces, clinicians’ own efforts at error detection, with that task delegated to CDS without adequate oversight.
An intuitive solution to the problem of AB is to produce CDS systems that are less prone to error. While this may reduce the overall error rate, highly accurate automation is known to increase the rate of AB [25]. In other words, when automation does fail, the clinician will be even less able to detect it.
A key problem is that users seem to have difficulty determining when CDS should and should not be relied on. Indeed, human factors research reports an inverse relationship between measures of verification, such as viewing drug references, and AB commission errors [10, 11]. So far, interventions to counter AB have had little success [37–39]. These include several specifically targeted at verification, such as exposure to automation failures [10], training about AB, and prompts to verify [40]. Compounding the problem are findings of a looking-but-not-seeing effect, or inattentional blindness, in which participants made AB errors despite accessing sufficient information to determine that the automation was incorrect [12, 13].
Verification, the means by which a user can determine whether the CDS they receive is correct, is key to the mitigation of AB. However, the lack of successful interventions indicates that more research is needed on how best to assist users with this crucial task.
This study has established that there is a risk of automation bias in electronic prescribing among senior medical students, who will soon enter clinical practice as junior doctors. In doing so, we have also demonstrated a methodology for detecting AB in e-prescribing. The true rates and effects of AB in working clinical settings will require further study and are likely to vary with clinicians’ experience and familiarity with medications, clinical setting, patient complexity, and the particular decision support system used. All of this is future work. Likewise, the lack of an effect of task complexity, even in control conditions, was surprising and is something future studies will need to address. This might be achieved by varying clinicians’ experience with prescribing and e-prescribing systems. Complexity could also incorporate familiarity with medicines, varying from simple, commonly used regimens to complex, rarely prescribed ones.
Clinicians need to be mindful that CDS can and does fail [6]. Ideally, clinicians should make every effort to detect prescribing errors themselves, allowing CDS to function as an independent check rather than relying on it as a replacement for their own error detection efforts.
Limitations
Several limitations arise from the design of this study. While participants were instructed to approach the task as if they were treating a real patient, exercising all due care, the prescribing task was simulated, and prescribing errors were without consequence.
Also, as this was an experiment, we cannot make inferences about the true effect size or rate of AB in clinical settings, as these will vary with the user, the tasks being performed, and the accuracy of the decision support provided. Likewise, the nature and incidence of the opportunities for prescribing errors we provided may not be representative of those encountered in clinical practice.
The lack of a difference in prescribing errors between the low and high complexity control scenarios limited our ability to assess the impact of task complexity on AB.
Finally, the use of medical students with little experience in either prescribing medicines or using e-prescribing systems provides an indication of how CDS will affect new clinicians entering practice, but limits the generalisability of our findings to experienced prescribers or clinicians with e-prescribing experience.