Lost in the crowd? Using eye-tracking to investigate the effect of complexity on attribute non-attendance in discrete choice experiments

Background The provision of additional information is often assumed to improve consumption decisions, allowing consumers to more accurately weigh the costs and benefits of alternatives. However, increasing the complexity of decision problems may prompt changes in information processing. This is particularly relevant for experimental methods such as discrete choice experiments (DCEs) where the researcher can manipulate the complexity of the decision problem. The primary aims of this study are (i) to test whether consumers actually process additional information in an already complex decision problem, and (ii) consider the implications of any such ‘complexity-driven’ changes in information processing for design and analysis of DCEs.
Methods A discrete choice experiment (DCE) is used to simulate a complex decision problem; here, the choice between complementary and conventional medicine for different health conditions. Eye-tracking technology is used to capture the number of times and the duration that a participant looks at any part of a computer screen during completion of DCE choice sets. From this we can analyse what has become known in the DCE literature as ‘attribute non-attendance’ (ANA). Using data from 32 participants, we model the likelihood of ANA as a function of choice set complexity and respondent characteristics using fixed and random effects models to account for repeated choice set completion. We also model whether participants are consistent with regard to which characteristics (attributes) they consider across choice sets.
Results We find that complexity is the strongest predictor of ANA when other possible influences, such as time pressure, ordering effects, survey specific effects and socio-demographic variables (including proxies for prior experience with the decision problem) are considered. We also find that most participants do not apply a consistent information processing strategy across choice sets.
Conclusions Eye-tracking technology shows promise as a way of obtaining additional information from consumer research, improving DCE design, and informing the design of policy measures. With regards to DCE design, results from the present study suggest that eye-tracking data can identify the point at which adding complexity (and realism) to DCE choice scenarios becomes self-defeating due to unacceptable increases in ANA. Eye-tracking data therefore has clear application in the construction of guidelines for DCE design and during piloting of DCE choice scenarios. With regards to design of policy measures such as labelling requirements for CAM and conventional medicines, the provision of additional information has the potential to make difficult decisions even harder and may not have the desired effect on decision-making.
Electronic supplementary material The online version of this article (doi:10.1186/s12911-016-0251-1) contains supplementary material, which is available to authorized users.


Background
The use of discrete choice experiments (DCEs) in health care has increased dramatically over the past decade [1][2][3][4]. Arising from the disciplines of psychology and economics, the theoretical basis for DCEs can be found in random utility theory (RUT), developed by McFadden [5] and later Hanemann [6]. There is increasing evidence suggesting that decision making of the type emulated by DCEs is prone to departures from the underlying theory [7,8], which assumes that consumers are both fully informed and make 'rational' (optimising) decisions.
Mainstream economic models typically assume that consumption choices can be improved simply by providing people with more and better information. There are, however, many situations where this assumption may not hold due to limits on information-processing capacity. For very complex problems, consumers may be boundedly (rather than fully) rational [9,10] and there is evidence to suggest that consumers attempting to evaluate all available information and all available options are increasingly likely to make mistakes through this process [11]. Many consumers will instead employ a 'satisficing' [12] or 'fast and frugal' [13] heuristic: when the mental task of calculating the costs and consequences of all possible options is overwhelming, they take mental short-cuts to make decisions easier [14]. Recent findings from behavioural economics confirm that increases in the complexity of decision tasks may paralyse decision-making [15], although others argue that it is the nature of the information that is important, rather than the absolute amount [16]. One area of recent research activity focuses on so-called 'attribute non-attendance' (ANA) [17,18], which in simple terms means that individuals may either ignore certain product characteristics or attach threshold values to them before considering them. In the presence of ANA, DCE data may not characterise the preferences of affected individuals and standard approaches to analysis may produce biased estimates of the relative importance of product attributes [19].
Empirically, two main methods have been employed to assess the existence and extent of ANA: (i) using qualitative methods such as think-aloud protocols alongside stated-preference surveys [20], in-depth interviews and other supplementary questioning [21] to directly question the respondent about their cognitive processing strategy in answering stated-preference surveys; and (ii) using quantitative models that allow the researcher some latitude for inference, such as latent-class models, to analyse stated-preference data [22][23][24][25]. From this growing literature, it does appear that ANA may in fact be important when assessing the validity of stated-preference studies [17,26] and that modelled coefficients should be adjusted accordingly. However, both methods have limitations when used to reliably assess the presence and extent of any ANA.
Eye-tracking technology provides a novel alternative capable of directly measuring ANA without interfering with the decision making process or being constrained by computational limitations. First described in the 1970s [27], eye-tracking technology allows the researcher to record where and for how long a respondent to a computer-based survey focuses their eyes. This means that researchers can assess if, and for how long, each attribute or choice is focused on relative to all else, including the sequence of focusing. If this information can be meaningfully interpreted, it may be used to determine whether attribute non-attendance is directly evident, whether systematic departures from the underlying theory can be identified, and ultimately, to inform how the predictive power of choice models can be improved to account for violations of the underlying assumptions.
A small number of research groups have begun exploring use of eye-tracking technology to understand decision-making. For example, Rasch et al. [28] use a combination of eye-tracking and facial electromyography to study affect in DCE decision making as it relates to marketing decisions; Arieli et al. [29] look at decision-making under conditions of uncertainty (not in a DCE context); and, most relevant to the discussion here, Balcombe et al. [30] studied ANA within a DCE context as it relates to food nutrition labels. All of these studies found evidence of deviation from the underlying assumptions, while acknowledging that work in this area is just beginning and there is still much to learn about the extent and effect that such deviations might have on the predictive ability of choice modelling.
Here, we make use of eye-tracking in simulated consumption decisions using a DCE framework to understand the process of consumer decision making in a complex, yet familiar, health environment: the purchase of medicine to treat a minor ailment. To begin with, we assess the presence of ANA under varying conditions of complexity and framing (different ailments). We then look at whether particular product attributes are more prone to ANA than others. Next, we focus on the potential determinants of any ANA found. As suggested by Lagarde [25], information processing is "… likely to be influenced by the decision problem itself (e.g. its complexity), respondent specific characteristics (e.g. familiarity to the choice task, cognitive skills) and the broader context in which the choice task is taken (e.g. time pressure)". Using this framework, we aim to model ANA as a function of these influences in an attempt to identify their relative importance. Finally, we test the assumption made in previous work in this area [23] that respondents are consistent with their information processing rules, that is, "the decision on which attributes to consider does not change over the choices made by the same respondent" (page 205).

Study context
Our data were collected alongside the pilot study of a DCE which tests the effect of providing consumers with additional information in the form of (i) regulatory statements; and/or (ii) summary information in the form of a 'traffic light' logo, on the label of both 'complementary' (natural) and 'conventional' (pharmaceutical) medicines. Two different decision 'frames' were tested in the form of two common ailments: sleep problems and joint pain [see Additional file 1: Figures A1 and A2]. The design of the main DCE aimed to address a real and current policy issue: whether consumers make better (or different) medicine purchasing decisions if compulsory labelling changes are implemented in an attempt to simplify the purchasing decision [31].
Different wordings of the proposed regulatory statement have appeared in the literature or the media [32][33][34] [see Additional file 1: Figure A1 for descriptions]. Thus, we aim to test the potential effect on information processing of adding such statements to the already large amount of information that must be processed by consumers. As an alternative to regulatory statements, we also investigate the addition of a traffic-light advisory system, similar to what is being used on many foods [35,36], as a way of highlighting key information for consumers [see Additional file 1: Figure A2].

Participants
As geographical proximity was required (the eye-trackers were located at Monash University, Melbourne), a local recruitment strategy was necessary. Members of the University staff (both academic and administrative) were invited to participate through a regular university e-newsletter. We focused on staff rather than undergraduate students (although PhD students were allowed to participate) so as to gain a more representative group in terms of demographics such as age and health status. Ethics approval was granted by Monash University [CF11/2535 - 2011001482] and all participants provided written informed consent.

Choice scenarios
A DCE is one way of simulating the consumption choice and estimating how consumers may behave when characteristics (attributes) of the different choices (alternatives) are altered. By accounting statistically for the different levels of attributes presented, researchers can estimate the relative contributions of the different attributes towards the chosen alternative. The intention of the present study is not, however, to estimate part-worth utilities and we were not constrained by considerations of efficiency or orthogonality that would motivate use of a formal experimental design when constructing choice scenarios. In the present study, we manually constructed choice scenarios (described below) to simulate the effect of complexity on decision-making and to allow observation and recollection of decision-processes using eye-tracking and semi-structured interviews.¹ Methods and results from the larger DCE using an experimental design (permitting efficient estimation of part-worth utilities) are reported elsewhere [31].
The online survey included eight choice scenarios per respondent, split equally across the two health conditions. To test the influence of complexity of the choice scenario (and cognitive burden), we allowed the number of attributes presented in choice scenarios to vary from three to eight (see Table 2). Half the participants were presented with an increasing number of attributes (increasing complexity); the other half was shown a decreasing number of attributes (decreasing complexity). In an attempt to minimise unthinking / mechanical choice, levels of attributes were varied across choice scenarios to obtain as much attribute balance as possible within the constraints of the study design.
Participants in the present study were asked to consider one of two scenarios, both of which describe mild health conditions (insomnia or joint pain) for which a range of self-care options are available. These two conditions were chosen due to their prevalence in the general population as well as the availability of both complementary and conventional medicines for self-selection and treatment. Within each condition, participants were asked to choose between three alternatives: a conventional medicine, a complementary medicine and 'neither of these' (opt-out option).
This study forms part of a larger, multi-disciplinary project focused on complementary and alternative medicine (CAM) use in people with chronic illness. The identification of attributes and levels for inclusion in the DCE choice scenarios therefore drew on qualitative work completed as part of the broader project, as well as a survey in the target population (N = 2,915) describing motivations for and use of CAM alongside conventional medicine [37][38][39]. A summary of all identified attributes and levels tested in the pilot is available in the Additional file 1: Table A1.
Some of the attributes, such as 'who recommended the product' and 'where it is available', were arranged (formatted) in a number of boxes underneath the initial health scenario description and above the product label. The remaining attributes, apart from price, were displayed as part of a product label, designed to be as realistic as possible and to group related attributes. Price was displayed under the labels, to represent how items are usually displayed on shop shelves. An example scenario is available in Additional file 1: Table A2. Choice scenarios were uploaded as an online survey. Participants were asked to complete the online survey on specialised computers with eye-tracking capabilities as their first task. No specific training materials were provided to participants apart from a general introduction and a practice DCE choice set (using a transport scenario), and no mention of the traffic light or regulatory statements was made before the survey commenced.

Measurement of attribute non-attendance (ANA)
Eye-tracking technology has evolved rapidly in recent years. Earlier prototypes required participants to wear bulky headwear and/or electrodes and stay in relatively uncomfortable positions for periods of time. Newer eye-trackers can be installed into regular-looking desktop computers and do not require the use of additional external hardware. For the present study, there was no requirement for headwear or electrodes. Apart from completing a short calibration of each individual's eyes to the screen (about 30 s) and being asked to remain as still as possible during the survey to maximise the likelihood of being detected by the eye-tracker, participants should have remained relatively unaware that they were working on anything other than a regular computer. Informed consent was obtained from all participants to use the eye-tracking technology. Here we used a Tobii T120 eye-tracker and associated software (Studio Version 2.3.2.0) to generate the raw data, which were then exported and analysed in Stata 13 statistical software [40]. The eye-tracking data so obtained consist of fixations (unique observations for each time a participant focuses or fixates on anything within the calibrated screen) and saccades, and allow identification of the area, duration and order of fixations. Data for pupil dilation were also available but are not used in this analysis.
Using the specialised Tobii software, we can build a matrix of "areas of interest" (AOI) overlaying the image for each choice set. Each AOI represents one cell and here the cells of interest are alternative-specific attributes. An example of an AOI coded choice set is provided in Additional file 1: Figure A5. The software can then calculate a number of metrics for each AOI including the number of times each attribute was visited, how long each 'fixation'² (look) lasted and the size of the pupil. Given the large amount of data available, we limit our analysis here to the number of times an attribute was visited. From this we can calculate the inverse, that is, whether the attribute was fixated at all during the choice set. As the level of an attribute can only influence attendance to an attribute if that attribute is first fixated, here we leave aside attribute levels as predictors of ANA.
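The step from AOI fixation counts to an ANA count can be illustrated with a minimal Python sketch. It is not the authors' Tobii or Stata workflow: the data layout (a dictionary keyed by alternative and attribute), the function names and the example attribute 'brand' are all assumptions made for illustration. The sketch treats an attribute as attended if it has at least one fixation, counted either across both alternatives combined or within each alternative.

```python
from collections import defaultdict

def ana_combined(fixations):
    """Count attributes with zero fixations across all alternatives.

    `fixations` maps (alternative, attribute) -> number of fixations on
    that AOI in one choice set (hypothetical layout, for illustration).
    """
    totals = defaultdict(int)
    for (_alternative, attribute), count in fixations.items():
        totals[attribute] += count
    return sum(1 for count in totals.values() if count == 0)

def ana_by_alternative(fixations):
    """Count attributes with zero fixations within each alternative."""
    missed = defaultdict(int)
    for (alternative, _attribute), count in fixations.items():
        missed[alternative] += 1 if count == 0 else 0
    return dict(missed)

# Hypothetical fixation counts for one participant and one choice set.
example = {
    ("conventional", "price"): 2, ("cm", "price"): 0,
    ("conventional", "brand"): 0, ("cm", "brand"): 0,
    ("conventional", "dosage"): 1, ("cm", "dosage"): 3,
}
print(ana_combined(example))        # attributes never fixated in either alternative
print(ana_by_alternative(example))  # attributes never fixated, per alternative
```

In this made-up example, price counts as attended overall because it was fixated in the conventional alternative, even though it was missed in the CM alternative.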

Statistical analysis
(i) Description of the existence, extent and variation of attribute non-attendance (ANA) across questions and attributes: We summarise the eye-tracking data to show the extent of attendance to each attribute across different questions and for questions with different levels of complexity (number of attributes). We present results for whether ANA occurs across both alternatives (CAM and conventional), before considering whether ANA occurs for each alternative taken individually. The ANA data is then disaggregated to describe ANA by attribute.
(ii) Determination of the most likely contributors to ANA: Following Lagarde [25], we hypothesise that complexity has an independent and direct effect on ANA (increased complexity is associated with increased ANA). To test this hypothesis, we regress ANA on complexity while controlling for other characteristics of the decision problem (condition and direction³), context (time pressure⁴), and respondent characteristics. We estimate the effect of complexity on attribute non-attendance using both fixed and random effects panel regressions. Equation (1) specifies the model, where ANA_ij (attribute non-attendance) is the number of attributes with zero fixations for participant i in choice-set j; α_i captures individual-specific fixed/random effects controlling for observed and unobserved respondent characteristics; complexity_j is the number of attributes present in choice-set j; condition_j is a dummy indicator coded as 1 if choice-set j relates to the joint pain scenario (and 0 for the insomnia scenario); time_pressure is a dummy indicator of whether the appointment time was late (after 5.30 pm)⁵; direction_i is a dummy indicator of whether the participant received choice-sets ordered in increasing (forward) or decreasing (reverse) complexity; W_i is the matrix of respondent characteristics; and ε_ij is an idiosyncratic error.
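One form of Equation (1) that is consistent with the definitions above can be sketched as follows. Only δ is named in the text; the remaining coefficient symbols (γ, β_1 to β_3, θ), the participant subscript on time pressure, and the inclusion of the quadratic complexity term directly in this expression are assumptions made for illustration.

```latex
\begin{equation}
\mathrm{ANA}_{ij} = \alpha_i
  + \delta\,\mathrm{complexity}_j
  + \gamma\,\mathrm{complexity}_j^{2}
  + \beta_1\,\mathrm{condition}_j
  + \beta_2\,\mathrm{time\_pressure}_i
  + \beta_3\,\mathrm{direction}_i
  + W_i\,\theta
  + \varepsilon_{ij}
\end{equation}
```

In a fixed effects specification, terms that do not vary within a participant (here direction_i, W_i and, if it is fixed per appointment, time pressure) are absorbed by α_i, so their coefficients are separately identified only in the random effects version.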
The intention here is not to estimate part-worth utilities and the parameter of primary interest is δ. Where δ is positive and significant, attribute non-attendance increases with complexity (as hypothesised). We also include complexity as a quadratic term to allow a non-linear relationship between ANA and complexity. Included in the matrix of respondent characteristics are a dummy variable for gender; a continuous measure for age (and age squared to allow for non-linear effects); a dummy variable coded 1 for education levels below university level⁶; and a dummy variable coded 1 for post-graduate students.⁷ Also included are dummy variables indicating whether the participant reported using different CM products in the previous 12 months, to account for prior experience and to proxy for a priori preferences. Three variables are included:
i. vitamin (self-selected) = taken a vitamin, mineral or herbal supplement not prescribed by a medical doctor in the past 12 months;
ii. vitamin (prescribed) = taken a vitamin, mineral or herbal supplement prescribed by a medical doctor in the past 12 months;
iii. other CAM = used other complementary and alternative medicine products or therapies (here it includes Western herbal medicines; Chinese medicines; acupuncture or indigenous or traditional folk therapies).
We hypothesised that participants' a priori preferences may make them more inclined towards choosing particular alternatives, and as the alternatives here are labelled (that is, they are specified to be 'conventional' and 'complementary' medicines rather than a generic option of 'Medicine A' versus 'Medicine B'), we may also expect ANA to vary between alternatives, as well as between attributes. To account for this potential labelling effect, we also run the regression specified in Equation (1), but with ANA now 'alternative specific'; that is, the dependent variable is now the number of attributes not attended to within an alternative, rather than across all alternatives.
This is expressed in Equations (2) and (3); definitions of explanatory variables remain consistent with Equation (1).
(iii) Consistency with which decision rules are applied: Finally, we also test a previous assumption made by others investigating ANA [41] whereby participants are consistent with regard to which attributes they consider across choice sets (and by implication, which to ignore). To do this, we construct a measure of 'consistency' for individual i, detailed in Equation (4), where s is the proportion of attributes attended to in choice set j by individual i and S_i is the mean of s for individual i. Here, a higher value indicates less consistency across choice sets and more deviation in terms of the number of available attributes attended/not attended to. We then regress consistency as the dependent variable on the same set of explanatory variables detailed in Equations (1), (2) and (3), with the exclusion of complexity and condition (which are invariant when considering consistency across choice-sets), as detailed in Equation (5).
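To make the consistency measure concrete, the following Python sketch computes one plausible version of it. The exact functional form of Equation (4) is not reproduced in this text, so the sketch assumes the measure is the mean squared deviation of s from S_i across an individual's choice sets, that is, (1/J) Σ_j (s_ij − S_i)². The function name and the example inputs are hypothetical.

```python
def consistency(attended, available):
    """Mean squared deviation of the attended share across choice sets.

    `attended[j]` is the number of attributes fixated in choice set j and
    `available[j]` is the number of attributes shown. The variance-style
    form used here is an assumption, not the paper's confirmed formula.
    A value of 0 means the same share was attended in every choice set.
    """
    if not attended or len(attended) != len(available):
        raise ValueError("need one attended count per choice set")
    shares = [a / n for a, n in zip(attended, available)]
    mean_share = sum(shares) / len(shares)
    return sum((s - mean_share) ** 2 for s in shares) / len(shares)

# Hypothetical participant who attends to everything in every choice set.
print(consistency([3, 4, 5], [3, 4, 5]))
# Hypothetical participant who attends to all of 3, then only 2 of 4.
print(consistency([3, 2], [3, 4]))
```

Under this assumed form the measure is bounded between 0 and 0.25, and higher values indicate less consistency, which matches the interpretation given in the text.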

Results
Thirty-nine participants completed the survey using the eye-tracking technology. However, the quality of eye-tracking data was insufficient in the case of seven participants, and their data are excluded from this analysis,⁸ leaving 32 participants. Table 1 details the participant characteristics. The majority of participants are female (75 %), highly educated and in higher income groups. The majority (75 %) also report having taken a self-selected vitamin, mineral or herbal product in the previous 12 months. We summarise attribute attendance by question in Table 2, first across the two alternatives combined and then for each alternative separately. For example, in the first line of Table 2 (for question 1) it can be seen that 32 participants (100 %) attended to all attributes in at least one of the alternatives but not all participants attended to every attribute in every alternative. 28 participants (88 %) attended to all attributes in the conventional medicine alternative and 29 participants (91 %) attended to all attributes in the complementary medicine alternative. It can be seen that attendance is relatively high for the first four questions, but drops from 100 % (all attributes attended to when considering combined alternatives) in question 1 down to 50 % in question 8. A similar pattern can be seen when considering each alternative separately, with the proportion declining as we move from question 1 to 8.
Across all participants, the mean number of attributes not attended to across all choice sets is 0.45 (sd 0.93, skewness 2.50, kurtosis 9.61). For the conventional alternative the mean is 0.74 (sd 1.18, skewness 1.87, kurtosis 6.39) and for the CM alternative 0.75 (sd 1.12, skewness 1.82, kurtosis 6.04). The paired t-test for the mean difference of the two alternatives is significant (p = 0.05).
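For readers unfamiliar with the paired comparison used here, the following Python sketch shows how a paired t statistic is computed from matched observations (for instance, ANA in the conventional and CM alternatives for the same participant and choice set). The numbers are invented for illustration and are not the study data; the function name is hypothetical.

```python
import math

def paired_t(x, y):
    """Return the paired t statistic and its degrees of freedom.

    x[k] and y[k] must be matched observations from the same unit.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two matched pairs")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("differences have zero variance")
    return mean_diff / math.sqrt(var_diff / n), n - 1

# Made-up matched ANA counts, not taken from the study.
t_stat, df = paired_t([1, 2, 3, 4], [0, 2, 1, 3])
print(t_stat, df)
```

The statistic is compared against a t distribution with n − 1 degrees of freedom to obtain the p-value.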
The effect of viewing the questions in forward (increasing complexity) compared to reverse order is shown in Figure 1. Mean ANA is zero for questions 1 and 3 and is lower in all questions framed by the 'joint' scenario as compared with the corresponding 'insomnia' question (that is, mean ANA is less in Q1 cf Q2, Q3 cf Q4, Q5 cf Q6 and Q7 cf Q8 in the forward order and the contrary is true for the reverse order). There is slightly less ANA at question 8 by those participants who completed the survey in reverse order, however for questions 3 to 6 there is higher mean ANA for reverse order participants. In general, there is higher ANA for the questions with more attributes, irrespective of the order in which the survey was seen. Mean ANA by alternative is shown in Figure 2. Both figures show a relatively large ANA increase/drop between questions 4 and 5 (or, for reverse order, between questions 5 and 4) which is where the product labels appear/disappear for the first time, greatly increasing/decreasing the amount of information to be considered. The trends in ANA across questions for both alternatives are similar. The mean time taken to answer each choice set is shown in Figure 3 and shows that, on average, more time was spent on answering question 1 if the survey was shown in forward order, and more time on question 8 if the survey was seen in reverse order. The total curve (forward and reverse order curves combined) is broadly u-shaped, with the time taken dropping steeply if the survey is seen in forward order (from question 1 to 2) or in reverse order (from question 8 to 7).
We then look to see if there are particular attributes which are more prone to ANA than others and this is presented in Table 3. Notably, price was missed by just over 16 % of participants on average for the 5 questions in which it was available, a phenomenon that has been found by others [25] and a concern for willingness-to-pay estimates from DCEs. Other attributes that appeared more likely to be missed included where the product was available (by up to 19 % of participants) and the caution and warnings on the labels (by up to 22 and 31 % of participants, respectively). The traffic light was missed by 15 and 22 % of participants in questions 7 and 8, respectively.
Results from the main regressions are presented in Table 4. Our main interest is the relationship between ANA and complexity, which shows a positive and significant main effect for models 1-4, with a negative and significant quadratic term (that is, ANA is increasing with complexity but at a diminishing rate over the number of attributes we tested here). The fixed and random effects models (models 1 and 2, respectively) provided similar estimates and tests for the appropriateness of using the random effects model did not reject the null that results are consistent (see the footnote to Table 4 for details). We also re-run the model after centring the mean of complexity at zero and although the beta coefficients on complexity differ, the sign and significance are unchanged.
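The interpretation of the linear and quadratic complexity terms, and the reason the linear coefficient changes after centring while the conclusions do not, can be seen from a short derivation. Writing c for complexity, δ for the linear coefficient and γ for the quadratic coefficient (γ and the mean complexity \bar{c} are notation assumed here for illustration):

```latex
\begin{align}
\frac{\partial\,\mathrm{ANA}}{\partial c} &= \delta + 2\gamma c,
  \qquad \text{positive for } c < -\frac{\delta}{2\gamma}
  \text{ when } \delta > 0,\ \gamma < 0, \\
\delta c + \gamma c^{2}
  &= \bigl(\delta\bar{c} + \gamma\bar{c}^{2}\bigr)
   + \bigl(\delta + 2\gamma\bar{c}\bigr)\tilde{c}
   + \gamma\tilde{c}^{2},
  \qquad \tilde{c} = c - \bar{c}.
\end{align}
```

After centring, the coefficient on the linear term becomes δ + 2γ\bar{c}, the marginal effect of complexity evaluated at its mean, while the quadratic coefficient γ is unchanged and the constant term absorbs the remainder. The fitted relationship is therefore the same; only the point at which the linear coefficient is evaluated differs.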
ANA was less likely for the joint scenario and more likely for participants who had a late appointment (both significant at the 10 % level in model 2), although the effect of the late appointment was not robust to different cut-off times. The order in which the survey was completed was not found to be associated with ANA. Some variation was shown in the relationship between sociodemographic variables and alternative-specific ANA: lower levels of education were associated with higher ANA in the conventional medicine alternative and those who had taken a vitamin prescribed by a medical doctor in the previous 12 months were more likely to miss attributes in the CM alternative.
The mean for the measure of consistency across the sample was 0.016 (sd 0.020, skewness 1.76, kurtosis 5.84), with 10 participants having a mean of zero (that is, they were entirely consistent in terms of how many attributes were missed across all choice sets). In terms of the consistency regression (model 5), younger age was associated with greater consistency, although as shown by the positive and significant coefficient on the corresponding quadratic term, this effect decreases as age increases.
Table notes: For a participant to have attended to an attribute, they had to have one or more fixations on that attribute, irrespective of whether they looked at the levels of the attribute in both choices. The 'neither of these' option did not have any attributes specified and is excluded from this analysis.
Figure 1 Mean attribute non-attendance by question order

Discussion and conclusions
This paper adds to the growing literature regarding attribute non-attendance in DCEs and to our knowledge is the first to explicitly focus on the relationship between complexity and ANA for decisions regarding health service utilisation. Our results show there is a strong positive and statistically significant relationship between ANA and complexity and that this relationship is robust to a number of different model specifications. Importantly, we find that complexity is the strongest predictor of ANA when other possible influences, such as time pressure, ordering effects, survey specific effects and socio-demographic variables (including proxies for prior experience of the decision problem) are considered. We also find that ANA, as well as the consistency with which attribute attendance is applied across choice sets, does show some evidence of heterogeneity across different socioeconomic variables, specifically for education and age. Like others, we do find considerable departure from the assumptions underpinning RUT, which assumes consumers maximise their utility based on all available information [25,30]. Similar to Balcombe et al. [30], we found that full attendance to all attributes across all choice sets is unusual; however, ANA was significantly less present for choice sets with fewer attributes.⁹ The interpretation of this finding should be taken within the context of this particular study. In general, participants reported being engaged with the survey and although many stated that the choice sets with more information took longer to process, the information itself was not difficult to understand. Most also reported that they thought all attributes were potentially relevant to their decision and there were no recommendations to remove particular attributes (only to change one of the levels of one of the attributes).
What has yet to be clearly determined in the literature is whether, and the extent to which, utility functions should be adjusted for ANA. The present study was conducted alongside the pilot for a DCE and varied the number of attributes across choice sets to identify the effect of complexity on ANA. As a consequence, we observed limited variation across attribute levels for some attributes and could not account for the effect of all attributes when estimating utility functions. Lagarde [25] found that whilst willingness-to-pay estimates were sensitive to ANA, the behavioural prediction of DCE models was not affected by ANA. One explanation for this may be that consumers are so accustomed to using heuristics or decision rules in complex or uncertain situations that they are well practised in seeking out information that will be useful to them in their final decision (in essence, conferring zero utility for any attributes superfluous to their needs). Thus, reading attribute and alternative labels may be sufficient for some consumers to decide if the subsequent information available is worthwhile attending to or not. We did, however, find evidence that ANA differed across alternatives, although the mean effect was shown to be small.
Table notes: a The corresponding questions, whether seen in forward or reverse order, are combined here and presented as if the forward order has been seen by the participant (i.e. question 1 data in the forward order and question 8 data in the reverse order has been aggregated). b Dosage was considered to be a fixed attribute (the levels did not change); it was included for realism. c Denominator is 31 participants in question 8 due to missing eye-tracking data for participant 124.
While we cannot rule out here that this effect may also represent left-right logographical ordering, differences in socio-demographic determinants of alternative-specific ANA, such as prior use of a prescribed vitamin, are perhaps more consistent with a CM-CAM effect than a left-right effect. In any event, the effect of alternative-specific ANA on utility functions, as compared to 'total' ANA for a given attribute, is worthy of further consideration (regardless of whether it represents a CM-CAM or left-right effect). Alternative-specific ANA may also offer additional insights into the decision processing strategy used by participants during DCEs.
Other results were also interesting. As seen in Fig. 1, ANA was consistently lower for the questions framed by the 'joint' scenario (questions 1, 3, 5 & 7 in the forward order) compared with the corresponding 'insomnia' questions. This may indicate a framing effect, whereby participants were more likely to not attend to attributes in the insomnia questions, perhaps due to strongly formed opinions as to how each ailment 'should' be treated (prior experience) or strong preferences for natural or conventional medicines in specific contexts. Aside from the framing effect, the general trend for more ANA in questions with more attributes supports the notion that increased complexity is linked with more ANA irrespective of the order in which questions were seen. The time taken to answer each question (Fig. 3) broadly displays a 'U' shape for the combined forward and reverse order surveys (total sample line), perhaps suggesting a learning effect whereby the time taken decreases to a point before fatigue starts to increase it again. However, the forward curve consistently shows longer times taken for questions 1, 3, 5 & 7 (joint scenario questions) compared with the corresponding insomnia questions (which, interestingly, corresponds to lower ANA for the joint questions compared with the insomnia questions in Fig. 1). This is not seen for the reverse order curve, which shows consistently decreasing times taken for questions 8 to 2, increasing slightly again for the final question 1. It is not apparent why a framing effect might be present only in the forward order survey and this is worthy of further consideration. The finding that 'consistency' with regard to the number of attributes attended to across choice sets decreased with age may be explained by a decrease in cognitive function over time, although this cannot be tested here.
Results are not consistent with the assumption made by Hole [41] that the decision of which attribute/s to consider is stable across choice sets and are instead more supportive of the notion that this varies over choice sets, as suggested by others [26].
This study also has some important implications for the design of DCEs measuring health and health-care preferences more generally. This study, which also acted as a pilot for a larger DCE, highlights the design complexity of some of the scenarios encountered by health researchers and raises further questions about how the qualitative properties of the survey, such as the description of attributes and levels, presentation of choice sets and clarity of instructions may impact on ANA. When combined with findings regarding the effect of ANA on utility estimates, our findings regarding the effect of complexity on ANA should permit identification of the point at which adding complexity (and realism) to DCE choice scenarios becomes self-defeating.
One of the obvious limitations of this analysis is the small and unrepresentative sample. Despite avoiding an entire undergraduate student population, the recruited sample remained better educated and from higher socioeconomic circumstances than the general population. The majority (75 %) of participants reported self-selection of a vitamin, mineral or herbal product in the previous 12 months, which is higher than reports in the literature for Australian populations [42]. For this presumably less 'boundedly' rational sample, we might expect additional information to evoke fewer changes in information processing than for the general population [43]. Therefore, our results are likely to underestimate ANA in the general population. Additionally, we only tested complexity over a range of 3-8 attributes, where 8 is the upper limit of attributes reported to be routinely included in DCEs in the health setting [2]. It must be remembered that some attributes are only seen in two questions (for example, the regulatory statements and traffic light logos are only seen in questions 7 & 8). Thus, caution should be exercised in drawing conclusions regarding the effect of additional attributes in other DCE studies. Further, we did not set out to test the effect of the location (page orientation) of attributes as it relates to ANA, whereby there may be a systematic difference due to orientation alone (e.g. the bottom of the page may be more prone to ANA).
The rapid advancements in eye-tracking technology over recent years mean that this technology is likely to be used more extensively to investigate questions of information processing across a range of disciplines, including health economics. Alongside this, methodological questions also need to be answered regarding the use of other available metrics (fixations, saccades, pupil dilation), the definitions applied (for example, of ANA) and how these may be linked to neurological processes to provide greater insight into decision-making. Recent progress on this front suggests that the full potential of combining eye-tracking data with more familiar qualitative and quantitative data is yet to be realised.

Endnotes
1 This study was also used to pilot test the attributes and levels of the DCE for use in a subsequent study using a larger sample size.
2 The eye-tracker collects raw data every 16.7 milliseconds and assigns to each data point a location. A fixation filter is then applied to determine if each data point is a 'fixation' or 'saccade' (for two points to be considered as part of the same fixation, the distance between two data points must be below a minimal threshold). We used the default 'Clear-View' settings for the I-VT (Velocity Threshold Identification) fixation filter [Tobii Studio 2.X, Release 2.2, User Manual (2010). http://www.tobii.com/].
3 Dummy variables indicating (i) whether the choice set relates to the joint pain scenario or to the insomnia scenario and (ii) whether the survey was seen in increasing order of complexity (forward) or decreasing order of complexity (reverse) were included in the model.
4 The time taken to complete each choice set was recorded during the experiment; however, this measure is likely to reflect complexity and respondent characteristics rather than context. A suitable proxy for time pressure was therefore identified. As the appointment time for each participant varied, we reasoned that appointments later in the day were more likely to be associated with greater time pressure as changes in traffic conditions and outside work activities are more likely to be given higher consideration around this time.
5 This cut-off was chosen as it is a time when most people have finished work for the day. Only three individuals were classified as having a late appointment using this definition. The robustness of the cut-off is tested during the analysis and reported in the results section.
6 Due to the sample being drawn from a university, this variable is also likely to indicate professional (non-academic) staff status.
7 Undergraduate students were excluded from participating.
8 The eye-tracking software provides a percentage of the time over the duration of the survey for which eye-tracking data was collected. If participants did not remain still enough, for example, and data was not able to be captured for some of the time, the percentage was less than 100 %. As a general rule, we excluded participants for this analysis if their percentage tracked was 50 % or less; however, this is an overall figure which includes time spent on both the DCE choice sets and introduction/demographics sections, and it was relaxed in the case of six participants where it was deemed there was sufficient data capture during the DCE section for them to be included.
9 It is also important to note that Balcombe used a different definition of ANA whereby meeting or exceeding the threshold of two fixations per attribute defined attendance, whereas we used the stricter definition of zero fixations to define non-attendance.
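The difference between the two definitions of non-attendance contrasted in endnote 9 can be illustrated with a minimal sketch. It is not the code used in this study: the function name `non_attended`, the parameter `min_fixations`, the attribute labels and the fixation log are all hypothetical, and the sketch assumes each recorded fixation has already been assigned to exactly one attribute.

```python
from collections import Counter


def non_attended(fixated_attributes, all_attributes, min_fixations=1):
    """Return the attributes classed as non-attended in one choice set.

    fixated_attributes: one attribute label per recorded fixation (hypothetical input format)
    all_attributes: labels of every attribute shown in the choice set
    min_fixations: number of fixations needed to count as attended
        (1 reproduces a zero-fixation definition of non-attendance,
         2 reproduces a two-fixation attendance threshold)
    """
    counts = Counter(fixated_attributes)
    # Counter returns 0 for attributes that were never fixated
    return [a for a in all_attributes if counts[a] < min_fixations]


# Hypothetical fixation log for one participant on one choice set
fixations = ["price", "price", "efficacy", "side_effects", "price", "efficacy"]
attributes = ["price", "efficacy", "side_effects", "dosage"]

zero_rule = non_attended(fixations, attributes, min_fixations=1)
two_rule = non_attended(fixations, attributes, min_fixations=2)

print(zero_rule)  # attributes with zero fixations
print(two_rule)   # attributes with fewer than two fixations
```

In this illustrative log, an attribute that received a single fixation counts as attended under the zero-fixation definition but as non-attended under the two-fixation threshold, so the threshold-based definition will generally record at least as much ANA for the same data.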