1. Neuroscience

Choice history biases subsequent evidence accumulation

  1. Anne E Urai  Is a corresponding author
  2. Jan Willem de Gee
  3. Konstantinos Tsetsos
  4. Tobias H Donner  Is a corresponding author
  1. University Medical Center Hamburg-Eppendorf, Germany
  2. University of Amsterdam, Netherlands
Research Article
Cite this article as: eLife 2019;8:e46331 doi: 10.7554/eLife.46331

Abstract

Perceptual choices depend not only on the current sensory input but also on the behavioral context, such as the history of one’s own choices. Yet, it remains unknown how such history signals shape the dynamics of later decision formation. In models of decision formation, it is commonly assumed that choice history shifts the starting point of accumulation toward the bound reflecting the previous choice. We here present results that challenge this idea. We fit bounded-accumulation decision models to human perceptual choice data, and estimated bias parameters that depended on observers’ previous choices. Across multiple task protocols and sensory modalities, individual history biases in overt behavior were consistently explained by a history-dependent change in the evidence accumulation, rather than in its starting point. Choice history signals thus seem to bias the interpretation of current sensory input, akin to shifting endogenous attention toward (or away from) the previously selected interpretation.

https://doi.org/10.7554/eLife.46331.001

Introduction

Decisions are not isolated events, but are embedded in a sequence of choices. Choices, or their outcomes (e.g. rewards), exert a large influence on subsequent decisions (Sutton and Barto, 1998; Sugrue et al., 2004). This holds even for low-level perceptual choices (Fernberger, 1920; Rabbitt and Rodgers, 1977; Treisman and Williams, 1984). In most perceptual choice tasks used in the laboratory, the decision should only be based on current sensory input, the momentary ‘evidence’ for the decision. Thus, most work on the computational and neurophysiological mechanisms of perceptual choice has focused on the transformation of sensory evidence into choice (Shadlen and Kiani, 2013). Yet, perceptual decisions are strongly influenced by experimental history: whether or not previous choices led to positive outcomes (Rabbitt and Rodgers, 1977; Dutilh et al., 2012), the confidence in them (Desender et al., 2018), and the content of the previous choice (i.e. which stimulus category was selected; Akaishi et al., 2014; Fründ et al., 2014; Urai et al., 2017). The latter type of sequential effect, which we call ‘choice history bias’, refers to the selective tendency to repeat (or alternate) previous choices. It is distinct and dissociable from effects of reward, performance feedback or subjective error awareness in previous trials.

Choice history biases are prevalent in human (Fründ et al., 2014; Urai et al., 2017), monkey (Gold et al., 2008) and rodent (Busse et al., 2011; Odoemene et al., 2018) perceptual decision-making. Remarkably, this holds even for environments lacking any correlations between stimuli presented on successive trials – the standard in psychophysical laboratory experiments. Choice history biases vary substantially across individuals (Abrahamyan et al., 2016; Urai et al., 2017). Neural signals reflecting previous choices have been found across the sensorimotor pathways of the cerebral cortex, from sensory to associative and motor regions (Gold et al., 2008; de Lange et al., 2013; Akaishi et al., 2014; Pape and Siegel, 2016; Purcell and Kiani, 2016a; St John-Saaltink et al., 2016; Thura et al., 2017; Hwang et al., 2017; Scott et al., 2017).

By which mechanism are choice history signals incorporated into the formation of a decision? Current models of perceptual decision-making posit the temporal accumulation of sensory evidence, resulting in an internal decision variable that grows with time (Bogacz et al., 2006; Gold and Shadlen, 2007; Ratcliff and McKoon, 2008; Brody and Hanks, 2016). When this decision variable reaches one of two decision bounds, a choice is made and the corresponding motor response is initiated. In this framework, a bias can arise in two ways: (i) by shifting the starting point of accumulation toward one of the bounds or (ii) by selectively changing the rate at which evidence for one versus the other choice alternative is accumulated. Figure 1 illustrates these two biasing mechanisms for a simple and widely used form of accumulation-to-bound model: the drift diffusion model (DDM). Similar principles apply to more complex accumulation-to-bound models. The starting point shift can be thought of as adding an offset to the perceptual interpretation of the current sensory evidence. By contrast, the evidence accumulation bias corresponds to biasing that perceptual interpretation toward one of the two stimulus categories.
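The two biasing mechanisms can be reproduced directly in simulation. The Python sketch below (all parameter values are illustrative, not fitted to any of the paper's datasets) uses a simple Euler-Maruyama discretization of the DDM, with both a starting-point offset and a stimulus-independent drift bias:

```python
import numpy as np

def simulate_ddm(n_trials=500, drift=1.0, bound=1.0, start_frac=0.5,
                 drift_bias=0.0, dt=0.005, noise_sd=1.0, seed=0):
    """Euler-Maruyama simulation of a drift diffusion model.

    start_frac: starting point as a fraction of the bound separation
                (0.5 = unbiased; > 0.5 = shifted toward the upper bound).
    drift_bias: stimulus-independent constant added to the mean drift
                ('drift criterion' in Ratcliff and McKoon, 2008).
    All parameter values are illustrative.
    """
    rng = np.random.default_rng(seed)
    stimulus = rng.choice([-1, 1], size=n_trials)   # true stimulus category
    choices = np.empty(n_trials, dtype=int)
    rts = np.empty(n_trials)
    for i in range(n_trials):
        y = start_frac * bound                      # biased iff start_frac != 0.5
        t = 0.0
        while 0.0 < y < bound:
            # the drift carries the stimulus; drift_bias does not
            y += (stimulus[i] * drift + drift_bias) * dt \
                 + noise_sd * np.sqrt(dt) * rng.standard_normal()
            t += dt
        choices[i] = 1 if y >= bound else -1
        rts[i] = t
    return stimulus, choices, rts
```

With either `start_frac > 0.5` or `drift_bias > 0`, the fraction of upper-bound choices rises above 0.5; the two manipulations differ in how that bias is distributed over RTs, which is what Figure 1c exploits.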

Two biasing mechanisms within the DDM.

The DDM postulates that noisy sensory evidence is accumulated over time, until the resulting decision variable y reaches one of two bounds (dashed black lines at y = 0 and y = a) for the two choice options. Repeating this process over many trials yields RT distributions for both choices (plotted above and below the bounds). Gray line: example trajectory of the decision variable from a single trial. Black lines: mean drift and resulting RT distributions under unbiased conditions. (a) Choice history-dependent shift in starting point. Green lines: mean drift and RT distributions under biased starting point. Gray-shaded area indicates those RTs for which starting point leads to choice bias. (b) Choice history-dependent shift in drift bias. Blue lines: mean drift and RT distributions under biased drift. Gray shaded area indicates those RTs for which drift bias leads to choice bias. (c) Both mechanisms differentially affect the shape of RT distributions. Conditional bias functions (White and Poldrack, 2014), showing the fraction of biased choices as a function of RT, demonstrate the differential effect of starting point and drift bias shift.

https://doi.org/10.7554/eLife.46331.002

It is currently unknown which of those two principal mechanisms accounts for the choice history biases observed in overt behavior. Previous theoretical accounts have postulated a shift in the starting point of the decision variables toward the bound of the previous choice (Yu and Cohen, 2008; Zhang et al., 2014; Glaze et al., 2015). This is based on the assumption that the representation of the decision variable decays slowly, leaving a trace of the observer’s choice in the next trial (Cho et al., 2002; Gao et al., 2009; Bonaiuto et al., 2016). However, choice history biases might also originate from a slower (i.e. tens of seconds) across-trial accumulation of internal decision variables – analogous to the accumulation of external outcomes in value-based decisions (Sutton and Barto, 1998; Sugrue et al., 2004). Previous experimental work on perceptual choice history biases either did not analyze the within-trial decision dynamics (Busse et al., 2011; de Lange et al., 2013; Akaishi et al., 2014; Fründ et al., 2014; Urai et al., 2017; Braun et al., 2018), or only tested for starting point biases, not accumulation biases (Cho et al., 2002; Gold et al., 2008; Yu and Cohen, 2008; Gao et al., 2009; Wilder et al., 2009; Bode et al., 2012; Jones et al., 2013; Zhang et al., 2014).

Here, we untangle how history-dependent changes in evidence accumulation and starting point contribute to history biases in overt choice behavior. Across a range of perceptual choice tasks, we found that individual differences in choice repetition are explained by history-dependent biases in accumulation, not starting point. Thus, the interaction between choice history and decision formation seems to be more complex than previously thought: choices may bias later evidence accumulation processes towards (or away from) the previous chosen perceptual interpretation of the sensory input.

Results

We fit different bounded-accumulation models to human behavioral data: choices and response times (RTs). The DDM estimates model parameters from the joint distributions of choices and RTs, and provides good fits to behavioral data from a large array of two-choice tasks (Ratcliff and McKoon, 2008). We estimated the following parameters: non-decision time (the time needed for sensory encoding and response execution), starting point of the decision variable, separation of the decision bounds, mean drift rate, and a stimulus-independent constant added to the mean drift. We refer to the latter parameter (termed ‘drift criterion’ by Ratcliff and McKoon, 2008) as ‘drift bias’.

Within the DDM, choice behavior can be selectively biased toward repetition or alternation by two mechanisms: shifting the starting point, or biasing the drift toward (or away from) the bound for the previous choice (Figure 1). These biasing mechanisms are hard to differentiate based on the proportion of choices alone, but they are readily distinguishable by the relationship between choice bias and RT (Figure 1c). Specifically, the conditional bias function (White and Poldrack, 2014) shows the fraction of choice repetitions as a function of their RT (binned in quantiles). A shift in starting point is most influential early in the decision process: it affects the leading edge of the RT distribution and shifts its mode. It predicts that the majority of history-dependent choice biases occur on trials with fast RTs (Figure 1c, green). A drift bias is instead accumulated along with the evidence and therefore grows as a function of elapsed time. Thus, drift bias strongly affects the trailing edge of the RT distribution with only a minor effect on the mode, altering choice fractions across the whole range of RTs (Figure 1c, blue). History-dependent changes in bound separation or mean drift rate may also occur, but they can only change overall RT and accuracy: those changes are by themselves not sufficient to bias the accumulation process toward one or the other bound, and thus toward choice repetition or alternation (see Figure 4—figure supplement 1).
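The conditional bias function is straightforward to compute from choices and RTs. A minimal sketch (the function name and quantile binning details are ours):

```python
import numpy as np

def conditional_bias_function(rts, is_biased_choice, n_bins=5):
    """Fraction of bias-consistent choices per RT quantile bin.

    rts: response times; is_biased_choice: boolean array, True when the
    choice matched the observer's overall bias direction (repeat or
    alternate), as in White & Poldrack (2014).
    """
    rts = np.asarray(rts)
    is_biased_choice = np.asarray(is_biased_choice)
    edges = np.quantile(rts, np.linspace(0, 1, n_bins + 1))
    # digitize against the inner edges assigns each RT to a quantile bin
    bins = np.digitize(rts, edges[1:-1])
    return np.array([is_biased_choice[bins == b].mean() for b in range(n_bins)])
```

A flat function at some value above 0.5 is the signature of a drift bias; a function that starts high and decays toward 0.5 is the signature of a starting point shift (Figure 1c).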

We fit different variants of the DDM (Figure 3—figure supplement 1) to data from six experiments. These covered a range of task protocols and sensory modalities commonly used in studies of perceptual decision-making (Figure 2a): two alternative forced-choice, two interval forced-choice, and yes-no (simple forced choice) tasks; RT and so-called fixed duration tasks; visual motion direction and coherence discrimination, visual contrast and auditory detection; and experiments with and without single-trial performance feedback. As found in previous work (Fründ et al., 2014; Abrahamyan et al., 2016; Urai et al., 2017), observers exhibited a wide range of idiosyncratic choice history biases across all experiments (Figure 2b,c). To ensure that the DDM is an appropriate (simplified) modeling framework for these data, we first fit a basic version of the DDM that contained the above-described parameters, without allowing bias parameters to vary with choice history. We then fit the DDM while also allowing starting point, drift bias, or both to vary as a function of the observer’s choice on the previous trial.

Behavioral tasks and individual differences.

(a) Schematics of perceptual decision-making tasks used in each dataset. See also Materials and methods section Datasets: behavioral tasks and participants. (b) Distributions of individual choice history biases for each dataset. Gray bars show individual observers, with colored markers indicating the group mean. (c) Each individual’s tendency to repeat their choices after correct vs. error trials. The position of each observer in this space reflects their choice- and outcome-dependent behavioral strategy.

https://doi.org/10.7554/eLife.46331.003

The DDM fits matched several aspects of the behavioral data (Figure 3—figure supplement 1). First, RT distributions matched the model predictions reasonably well (shown separately for each combination of stimuli and choices in Figure 3—figure supplement 1, darker colors indicate predicted RTs obtained through model simulations). Second, for the fits obtained with a hierarchical Bayesian fitting procedure (see Figure 3—figure supplement 1 and Materials and methods), used for Figures 3–5, the Gelman–Rubin statistic R̂ for group-level parameters ranged between 0.9997 and 1.0406 across datasets, indicating good convergence of the sampling procedure (Wiecki et al., 2013). Third, individual drift rate estimates correlated with individual perceptual sensitivity (d’, Figure 3—figure supplement 1a) and monotonically increased with stronger sensory evidence (Figure 3—figure supplement 1a). In fixed duration tasks, the decision-maker does not need to set a bound for terminating the decision (Bogacz et al., 2006), so the bounded diffusion process described by the DDM might seem inappropriate. Yet, the success of the DDM in fitting these data was consistent with previous work (e.g. Ratcliff, 2006; Bode et al., 2012; Jahfari et al., 2012) and might have reflected the fact that observers set implicit decision bounds also when they do not control the stimulus duration (Kiani et al., 2008; but see Tsetsos et al., 2015).
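The R̂ convergence diagnostic reported above can be computed directly from the sampled chains. A minimal sketch of the classic Gelman–Rubin formula (without the split-chain refinement used by some modern samplers):

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor R-hat for an (m chains, n samples) array.

    Values close to 1 indicate that the chains have mixed; the paper reports
    group-level values between 0.9997 and 1.0406.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    var_plus = (n - 1) / n * W + B / n       # pooled variance estimate
    return np.sqrt(var_plus / W)
```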

History-dependent accumulation bias, not starting point bias, explains individual differences in choice repetition behavior

Models with history-dependent biases better explained the data than the baseline model without such history dependence (Figure 3a), corroborating the observation that observers’ behavior showed considerable dependence on previous choices (Figure 2b,c). The model with both history-dependent starting point and drift bias provided the best fit to five out of six datasets (Figure 3a), based on the Akaike Information Criterion (AIC; Akaike, 1974); note that we obtained the same results when instead using the hierarchical Deviance Information Criterion.
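The AIC bookkeeping behind this comparison is simple; the log-likelihoods and parameter counts below are hypothetical, purely to illustrate the calculation:

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: penalizes fit quality by model complexity."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fits of the four model variants to one dataset
models = {
    "no history":     (-10500.0, 5),   # (log-likelihood, number of parameters)
    "starting point": (-10480.0, 6),
    "drift bias":     (-10440.0, 6),
    "hybrid (both)":  (-10432.0, 7),
}
baseline = aic(*models["no history"])
# Delta-AIC relative to the no-history baseline; lower is better,
# and a difference of ~10 is the conventional decision threshold
delta_aic = {name: aic(ll, k) - baseline for name, (ll, k) in models.items()}
```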

Figure 3 with 2 supplements see all
Model comparison and simulations.

(a) For each dataset, we compared the AIC between models where drift bias, starting point bias or both were allowed to vary as a function of previous choice. The AIC for a model without history dependence was used as a baseline for each dataset. Lower AIC values indicate a model that is better able to explain the data, taking into account the model complexity; a ΔAIC of 10 is generally taken as a threshold for considering one model a sufficiently better fit. (b) Conditional bias functions (Figure 1c; White and Poldrack, 2014). For the history-dependent starting point, drift bias and hybrid models, as well as the observed data, we divided all trials into five quantiles of the RT distribution. Within each quantile, the fraction of choices in the direction of an individual’s history bias (repetition or alternation) indicates the degree of choice history bias. Error bars indicate mean ± s.e.m. across datasets. (c) Choice bias on slow response trials can be captured only by models that include history-dependent drift bias. Black error bars indicate mean ± s.e.m. across datasets, bars indicate the predicted fraction of choices in the first and last RT quantiles.

https://doi.org/10.7554/eLife.46331.004

The above model comparison pointed to the importance of including history dependence in the model. We further examined the ability of each model to explain specific diagnostic features in the data (Palminteri et al., 2017) that distinguished starting point from drift bias. A history-dependent shift in the starting point leads to biased choices primarily when responses are fast (early RT quantiles), whereas a history-dependent shift in drift leads to biased choices across all trials, including those with slow responses (Figure 1). We simulated choices and RTs from the three different model variants and computed so-called ‘conditional bias functions’ (White and Poldrack, 2014): the fraction of choices in line with each observer’s choice repetition tendency (i.e. repetition probability), in each quantile of their RT distribution. For observers whose choice repetition probability was >0.5, this was the fraction of repetitions; for the other observers, this was the fraction of alternations. Consistent with a shift in drift bias, observers exhibited history-dependent choice biases across the entire range of RTs (Figure 3b). In particular, the biased choices on slow RTs could only be captured by models that included a history-dependent shift in drift bias (Figure 3c, blue and dark green bars).

We used the parameter estimates obtained from the full model (with both history-dependent starting point and drift bias) to investigate how history-dependent variations in starting point and drift bias related to each individual’s tendency to repeat their previous choices. We call each bias parameter’s dependence on the previous choice its ‘history shift’. For instance, in the left vs. right motion discrimination task, the history shift in starting point was computed as the difference between the starting point estimate for previous ‘left’ and previous ‘right’ choices, irrespective of the category of the current stimulus. The history shift in drift bias, but not the history shift in starting point, was robustly correlated to the individual probability of choice repetition (Figure 4a, significant correlations indicated with solid regression lines). In five out of six datasets, the correlation with the history shift in drift bias was significantly stronger than the correlation with the history shift in starting point (Figure 4b, Δρ values).
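In code, the ‘history shift’ and its relation to repetition behavior might look like the sketch below. The per-observer estimates are simulated (not real data), constructed to mirror the reported pattern: a drift-bias shift that tracks P(repeat) and a starting-point shift that does not:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_obs = 32
p_repeat = rng.uniform(0.3, 0.7, size=n_obs)   # hypothetical repetition probabilities

# Hypothetical per-observer parameter estimates from the hybrid fit:
# one value after previous 'left' choices, one after previous 'right'.
drift_after_left = (p_repeat - 0.5) + 0.05 * rng.standard_normal(n_obs)
drift_after_right = -(p_repeat - 0.5) + 0.05 * rng.standard_normal(n_obs)
start_after_left = 0.05 * rng.standard_normal(n_obs)
start_after_right = 0.05 * rng.standard_normal(n_obs)

# History shift: difference between the estimates after 'left' and after
# 'right' choices, irrespective of the current stimulus
shift_drift = drift_after_left - drift_after_right
shift_start = start_after_left - start_after_right

rho_drift, _ = stats.spearmanr(p_repeat, shift_drift)
rho_start, _ = stats.spearmanr(p_repeat, shift_start)
```

Under this construction, `rho_drift` is large and `rho_start` hovers near zero, the qualitative pattern shown in Figure 4.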

Figure 4 with 5 supplements see all
Individual choice history biases are explained by history-dependent changes in drift bias, not starting point.

(a) Relationship between individual choice repetition probabilities, P(repeat), and history shift in starting point (left column, green) and drift bias (right column, blue). Parameter estimates were obtained from a model in which both bias terms were allowed to vary with previous choice. Horizontal and vertical lines, unbiased references. Thick black crosses, group mean ± s.e.m. in both directions. Black lines: best fit of an orthogonal regression (only plotted for correlations significant at p<0.05). (b) Summary of the correlations (Spearman’s ρ) between individual choice repetition probability and the history shifts in starting point (green; left) and drift bias (blue; right). Error bars indicate the 95% confidence interval of the correlation coefficient. Δρ quantifies the degree to which the two DDM parameters are differentially able to predict individual choice repetition (p-values from Steiger’s test). The black diamond indicates the mean correlation coefficient across datasets. The Bayes factor (BF10) quantifies the relative evidence for the alternative over the null hypothesis, with values < 1 indicating evidence for the null hypothesis of no correlation, and >1 indicating evidence for a correlation.

https://doi.org/10.7554/eLife.46331.007

We quantified the total evidence by computing a Bayes factor for each correlation (Wetzels and Wagenmakers, 2012), and multiplying these across datasets (Scheibehenne et al., 2016). This further confirmed that individual choice history biases were not captured by history shifts in starting point, but consistently captured by history shifts in drift (Figure 4b). Specifically, the Bayes factor for the history shift in starting point approached zero, indicating strong evidence for the null hypothesis of no correlation. The Bayes factor for the history shift in drift indicated strong evidence for a correlation (Kass and Raftery, 1995).
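Because the datasets are independent, combining evidence amounts to multiplying the per-dataset Bayes factors. The sketch below uses Jeffreys’ classic approximate Bayes factor for a correlation with a uniform prior on ρ, as a simple stand-in for the default test of Wetzels and Wagenmakers (2012); the correlations and sample sizes are hypothetical:

```python
import numpy as np
from scipy.integrate import quad

def bf10_correlation(r, n):
    """Jeffreys' (1961) approximate Bayes factor for a Pearson correlation r
    from n observations, with a uniform prior on rho over [-1, 1]."""
    def integrand(rho):
        # approximate likelihood ratio p(r | rho) / p(r | rho = 0)
        return (1 - rho**2) ** ((n - 1) / 2) / (1 - rho * r) ** (n - 1.5)
    bf, _ = quad(integrand, -1, 1)
    return 0.5 * bf   # the factor 0.5 is the uniform prior density

# Hypothetical (correlation, sample size) pairs for three datasets
datasets = [(0.52, 32), (0.61, 26), (0.45, 20)]
combined_bf = np.prod([bf10_correlation(r, n) for r, n in datasets])
```

Values of BF10 below 1 favor the null hypothesis of no correlation; multiplying across datasets can turn several moderate BF10s into strong combined evidence.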

Correlations between estimated history shifts in starting point and drift bias were generally negative (mean ρ: −0.2884, range −0.4130 to −0.0757), but reached statistical significance (p<0.05) in only one dataset. The combined Bayes factor (BF10) was 0.0473, indicating strong evidence for H0. We thus remain agnostic about the relationship between the history shifts of both parameters.

The same qualitative pattern of results was obtained with an alternative fitting procedure (non-hierarchical G2 optimization, Figure 4—figure supplement 2a), as well as a model that allowed for additional across-trial variability in non-decision time (Figure 4—figure supplement 2b). Letting non-decision time vary with each level of sensory evidence strength (in the two datasets including multiple such levels) did not change the pattern of model comparison and correlation results (Figure 4—figure supplement 2c). These findings are thus robust to specifics of the model and fitting method. The Visual motion 2IFC #2 dataset also included pharmacological interventions in two sub-groups of participants (see Materials and methods); we found the same effects for both drug groups as well as the placebo group (Figure 4—figure supplement 3). A significant positive correlation between history shift in drift bias and P(repeat) was present for two sub-groups of participants, defined as ‘repeaters’ and ‘alternators’ (based on P(repeat) being larger or smaller than 0.5, respectively; Figure 4—figure supplement 4).

The lack of a correlation between history-dependent starting point shifts and individual choice repetition is surprising in light of previous accounts (Yu and Cohen, 2008; Gao et al., 2009). History shifts in starting point were mostly negative (a tendency toward choice alternation) across participants, regardless of their individual tendency toward choice repetition or alternation (Figure 4—figure supplement 5, significant in two out of six datasets). This small but consistent effect likely explains why our formal model comparison favored a model with both history-dependent drift and starting point over one with only drift bias (see also Discussion). Critically, only the history-dependent shift in drift bias accounted for individual differences in choice repetition (Figure 4).

History-dependent accumulation bias explains individual choice repetition behavior irrespective of previous choice outcome

In four out of six tasks, participants received explicit outcome feedback (correct, error) after each choice. It is possible that participants experienced positive feedback as rewarding and (erroneously) assumed that a rewarded choice is more likely to be rewarded on the next trial. Manipulations of reward (probability or magnitude) have been found to change starting point (Voss et al., 2008; Leite and Ratcliff, 2011; Mulder et al., 2012), but might also bias drift (Liston and Stone, 2008; Afacan-Seref et al., 2018; Fan et al., 2018). Given that there were far more correct (i.e. rewarded) choices than errors, the history-dependent drift bias could reflect the expectation of reward for the choice that was correct on the previous trial.

Two findings refute this idea. First, the same result holds in the two datasets without single-trial outcome feedback (Figure 4a, bottom row), implying that external feedback is not necessary for history shifts in drift bias. Second, we found similar results when separately estimating the model parameters (history shift in starting point and drift bias) and model-free measures (choice repetition probability) after both correct and error trials (Figure 5a). Across datasets, individual repetition probability was best explained by history shifts in drift bias, not starting point, after both correct (Figure 5b) and error (Figure 5c) trials. Thus, even erroneous choices bias evidence accumulation on the next trial, in the same direction as correct choices. Indeed, most participants were predominantly biased by their previous choice (95 ‘stay’, 30 ‘switch’), while a third was biased by a combination of the previous choice and its correctness (26 ‘win-stay lose-switch’, 42 ‘win-switch lose-stay’; Figure 2c).
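Classifying observers into these four strategy groups only requires their repetition probabilities after correct and after error trials. A sketch (the classification rule and labels follow the text; the implementation details are ours):

```python
import numpy as np

def classify_strategy(choices, correct):
    """Classify an observer as stay/switch vs. win-stay lose-switch, from
    repetition probabilities conditioned on the previous trial's outcome."""
    choices = np.asarray(choices)
    correct = np.asarray(correct, dtype=bool)
    repeated = choices[1:] == choices[:-1]     # did trial t repeat trial t-1?
    prev_correct = correct[:-1]
    p_rep_correct = repeated[prev_correct].mean()
    p_rep_error = repeated[~prev_correct].mean()
    if (p_rep_correct - 0.5) * (p_rep_error - 0.5) > 0:
        # biased in the same direction after wins and losses: choice-driven
        return "stay" if p_rep_correct > 0.5 else "switch"
    # opposite directions: bias depends on the previous outcome
    return "win-stay lose-switch" if p_rep_correct > 0.5 else "win-switch lose-stay"
```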

History shift in drift bias explains individual choice behavior after both error and correct decisions.

As in Figure 4, but separately following correct (black) and error (red) trials. Post-correct trials were randomly subsampled to match the trial numbers of post-error trials. (a) Relationship between repetition probability and history shifts in starting point and drift bias, separately computed for trials following correct (black circles) and error (red squares) responses. (b) Summary of correlations (as in Figure 4c) for trials following a correct response. Error bars indicate the 95% confidence interval of the correlation coefficient. (c) Summary of correlations (as in Figure 4c) for trials following an error response. (d) Difference in correlation coefficient between post-correct and post-error trials, per dataset and parameter. Δρ quantifies the degree to which the two DDM parameters are differentially able to predict individual choice repetition (p-values from Steiger’s test). The black diamond indicates the mean correlation coefficient across datasets. The Bayes factor (BF10) quantifies the relative evidence for the alternative over the null hypothesis, with values < 1 indicating evidence for the null hypothesis of no correlation, and >1 indicating evidence for a correlation.

https://doi.org/10.7554/eLife.46331.013

Correlations tended to be smaller for previous erroneous choices. However, directly comparing the correlation coefficients between post-correct and post-error trials (after subsampling the former to ensure equal trial numbers per participant) allowed us neither to confirm nor to refute a difference (Figure 5d). In sum, history-dependent drift biases did not require external feedback about choice outcome and were predominantly induced by the previous choice. These choice history-dependent biases in evidence accumulation were accompanied by effects on drift rate and boundary separation (Figure 4—figure supplement 1), in line with previous work on post-error slowing (Dutilh et al., 2012; Goldfarb et al., 2012; Purcell and Kiani, 2016a).

Accumulation bias correlates with several past choices

Does the history shift in evidence accumulation depend on events from one past trial only? Recent work has exposed long-lasting choice history biases that span several trials and tens of seconds (Urai et al., 2017; Braun et al., 2018; Hermoso-Mendizabal et al., 2018). We thus estimated the influence of past events on the evidence accumulation process in a more comprehensive fashion. We fit a family of models in which correct and incorrect choices from up to six previous trials were used as predictors, and estimated their contribution to current starting point and drift bias.

Inclusion of further lags improved the model’s ability to account for the data, up to a lag of 2–4 after which model fits (ΔAIC) began to deteriorate (Figure 6—figure supplement 1). In 4/6 datasets, the best-fitting model contained only history-dependent changes in drift, not starting point, over a scale of the previous 2–4 trials. In the other two datasets, the best-fitting model was a hybrid where both drift and starting point varied as a function of choice history, up to two trials into the past (Figure 6—figure supplement 1). We computed ‘history kernels’ across the different lags, separately for starting point and drift bias. These are analogous to the kernels obtained from a history-dependent regression analysis of the psychometric function that ignores decision time (Fründ et al., 2014), and which have been widely used in the recent literature on choice history biases (Fründ et al., 2014; Urai et al., 2017; Braun et al., 2018). To interpret these group-level kernels in light of substantial individual variability, we expressed each regression weight with respect to individual repetition probability at lag 1 (i.e. switching the sign for alternators).
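The lagged predictors for such a regression can be built as follows. The coding scheme (+1/−1 for the two choice options, with separate regressors for previously correct and previously incorrect trials) follows the description above, but the implementation details are our assumption:

```python
import numpy as np

def lagged_history_regressors(choices, correct, max_lag=6):
    """Build per-trial regressors coding past choices at lags 1..max_lag.

    choices: array coded -1/+1 for the two options.
    correct: boolean array, whether each choice was correct.
    Returns an (n_trials, 2 * max_lag) design matrix: one column per lag
    for previously correct choices, one per lag for previous errors.
    """
    choices = np.asarray(choices)
    correct = np.asarray(correct, dtype=bool)
    n = len(choices)
    X = np.zeros((n, 2 * max_lag))
    for lag in range(1, max_lag + 1):
        past_choice = np.roll(choices, lag)
        past_correct = np.roll(correct, lag)
        past_choice[:lag] = 0   # the first trials have no history at this lag
        X[:, lag - 1] = np.where(past_correct, past_choice, 0)
        X[:, max_lag + lag - 1] = np.where(~past_correct, past_choice, 0)
    return X
```

Fitting starting point and drift bias as linear functions of these columns yields one regression weight per lag and outcome, i.e. the raw material for the history kernels.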

Previous choices shifted drift bias in line with individual history bias across several trials, whereas starting point did not consistently shift in the direction of history bias. The hybrid models showed that the effect of choice history on drift bias decayed over approximately three past trials (Figure 6a), with a slower decay than for starting point (Figure 6a). The regression weights for past trials (from lag two through each dataset’s best-fitting lag) for drift bias (but not starting point) significantly correlated with the probability of repeating past choices at these same lags (Figure 6b). This was true after both correct and error trials (Figure 6b), similarly to the effects at lag 1 (Figure 5b–c).
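The exponential decay fit to the kernels, V(t) = A·e^(−t/τ), can be obtained by standard nonlinear least squares. The kernel values below are synthetic, generated from a known decay so the recovery can be checked:

```python
import numpy as np
from scipy.optimize import curve_fit

def exp_decay(t, A, tau):
    """Exponential history kernel V(t) = A * exp(-t / tau)."""
    return A * np.exp(-t / tau)

# Synthetic group-average drift-bias kernel over six lags, with a true
# amplitude of 0.5 and a true decay constant of 3 trials plus small noise
lags = np.arange(1, 7)
rng = np.random.default_rng(1)
drift_kernel = 0.5 * np.exp(-lags / 3.0) + 0.01 * rng.standard_normal(6)

(A_hat, tau_hat), _ = curve_fit(exp_decay, lags, drift_kernel, p0=(0.5, 2.0))
```

A larger fitted τ for the drift-bias kernel than for the starting-point kernel is the quantitative form of the claim that the accumulation bias decays more slowly.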

Figure 6 with 1 supplement see all
Choice history affects drift bias over multiple trials.

(a) History kernels, indicating different parameters’ tendency to go in the direction of each individual’s history bias (i.e. sign-flipping the parameter estimates for observers with P(repeat)<0.5). For each dataset, regression weights from the best-fitting model (lowest AIC, Figure 6—figure supplement 1) are shown in thicker lines; thin lines show the weights from the largest model we fit. Black errorbars show the mean ± s.e.m. across models, with white markers indicating timepoints at which the weights are significantly different from zero across datasets (p<0.05, FDR corrected). Black lines show an exponential fit Vt=Ae-t/τ to the average. (b) Correlations between individual P(repeat) and regression weights, as in Figure 5b–c. Regression weights for the history shift in starting point and drift bias were averaged from lag two until each dataset’s best-fitting lag. P(repeat) was corrected for expected repetition at longer lags given individual repetition, and averaged from lag two to each dataset’s best-fitting lag. Δρ quantifies the degree to which the two DDM parameters are differentially able to predict individual choice repetition (p-values from Steiger’s test). The black diamond indicates the mean correlation coefficient across datasets. The Bayes factor (BF10) quantifies the relative evidence for the alternative over the null hypothesis, with values < 1 indicating evidence for the null hypothesis of no correlation, and >1 indicating evidence for a correlation.

https://doi.org/10.7554/eLife.46331.014

In sum, the biasing effect of choice history on evidence accumulation is long-lasting (longer than the effects on starting point), dependent on preceding choices several trials into the past, but independent of their correctness. This analysis corroborates the previous findings from our simpler models focusing on only the preceding trial, and further dissociates the effects of choice history on starting point and evidence accumulation.

History-dependent accumulation bias explains individual choice repetition behavior irrespective of specifics of bounded-accumulation models

We next set out to test the generality of our conclusions and gain deeper mechanistic insight into the nature of the dynamic (i.e. time-increasing) bias. We used a variety of bounded-accumulation models with more complex dynamics than the standard DDM. We focused on the preceding trial only, which our previous analyses had identified as exerting the same effect on history bias as the longer lags (Figure 6). These models included variants of the DDM (i.e. a perfect accumulator) with more complex dynamics of the bias or the decision bounds, as well as variants of a leaky accumulator (Busemeyer and Townsend, 1993; Usher and McClelland, 2001; Brunton et al., 2013). We focused on the Visual motion 2AFC (FD) dataset because it entailed small random dot stimuli (diameter 5° of visual angle), leading to large within- and across-trial fluctuations in the sensory evidence which we estimated through motion energy filtering (Adelson and Bergen, 1985; Urai and Wimmer, 2016; Figure 7—figure supplement 1). These fluctuating motion energy estimates were used as time-varying sensory input to the models, providing key additional constraints over and above nominal sensory evidence levels, choices and RT distributions (Brunton et al., 2013).
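A leaky accumulator driven by a time-varying evidence stream (such as the motion energy estimates described above) can be sketched as below. This is a generic discretization for illustration, not the paper's fitted model:

```python
import numpy as np

def leaky_accumulation(evidence, lam=-2.0, input_bias=0.0, dt=0.01,
                       noise_sd=0.1, seed=0):
    """Integrate a time-varying evidence stream with leak lambda:
    dy = (input_bias + evidence + lam * y) dt + noise.

    lam < 0 gives leaky (forgetful) integration that saturates; lam > 0
    gives unstable, attractor-like dynamics. Parameter values illustrative.
    """
    rng = np.random.default_rng(seed)
    y = np.zeros(len(evidence) + 1)
    for t, e in enumerate(evidence):
        dy = (input_bias + e + lam * y[t]) * dt \
             + noise_sd * np.sqrt(dt) * rng.standard_normal()
        y[t + 1] = y[t] + dy
    return y
```

For constant input and negative λ, the decision variable converges to −input/λ rather than growing without bound; a history-dependent `input_bias` shifts that asymptote, which is one way an accumulation bias can be realized in a leaky model.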

We first re-fit the standard DDM where the two biasing parameters were allowed to vary with previous choice (see Figure 1), now using single-trial motion energy estimates and a non-hierarchical fitting procedure (see Materials and methods). This made these fits directly comparable to both the hierarchical fits in Figures 3–4, and the more complex models described below. As expected (Figure 3a), the data were better explained by a history-dependent bias in the drift, rather than the starting point (Figure 7b1). In these non-hierarchical fits, the hybrid DDM (i.e. both bias terms free to vary as a function of previous choice) lost against the drift bias-only model (indicated by its higher AIC). Yet this hybrid model allowed for a direct comparison of the correlations between these (jointly fit) bias parameters and individual choice repetition probability. As in our previous analysis (Figure 4), individual choice repetition probability was more strongly predicted by drift than starting point bias (Figure 7c1).

Figure 7 with 3 supplements
Extended dynamic models of biased evidence accumulation.

(a) Model schematics. In the third panel from the left, the stimulus-dependent mean drift is shown in black, overlaid by the biased mean drift in color (as in Figure 1a,b). (b) AIC values for each history-dependent model, as compared to a standard (left) or dynamic (right) DDM without history. The winning model (lowest AIC value) within each model class is shown with a black outline. (c) Correlation (Spearman’s ρ) of parameter estimates with individual repetition behavior, as in Figure 4b. Error bars, 95% confidence interval. ***p<0.0001, **p<0.01, n.s. p>0.05. (d) Within-trial time courses of effective bias (cumulative bias as a fraction of the decision bound) for the winning leaky accumulator model. Effective bias time courses are indistinguishable between both dynamical regimes (λ < 0 and λ > 0) and are averaged here.

https://doi.org/10.7554/eLife.46331.016

A previous study of reward effects on speeded decisions reported that reward asymmetries induced supra-linear bias dynamics (Afacan-Seref et al., 2018). Temporal integration of a constant drift bias produces a linearly growing effective bias in the decision variable (Figure 1b), whereas integration of a ramping drift bias produces a supra-linear growth of effective bias (Figure 7a, yellow). In our data, a standard DDM with constant drift bias provided a better fit than DDMs with either a ramping drift bias, or a combination of constant and ramping drift bias (Figure 7b2). Furthermore, in the latter (hybrid) model, the constant drift bias best predicted individual choice repetition behavior (Figure 7c2), in line with the constant accumulation bias inferred from the standard DDM fits. For the fits shown in Figure 7b2/c2, we used the same fitting protocol as for the standard DDM, in which the time-varying sensory evidence fluctuations during stimulus presentation were replaced by their average over time to compute a single-trial drift rate (called ‘default protocol’, Materials and methods section Extended bounded accumulation models: General assumptions and procedures). The same qualitative pattern of results also held for another fitting protocol (‘dynamic protocol’, see Materials and methods), in which the time-varying sensory evidence was fed into the integrator (ΔAIC relative to no-history model: −1103, −985, −995, for constant drift bias, ramping drift bias, and hybrid, respectively; correlation with P(repeat): ρ(30) = 0.5458, p=0.0012; ρ(30) = 0.3600, p=0.0429 for constant and ramping drift bias, respectively). We next used this dynamic protocol for a set of more complex dynamical models.
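The distinction between constant and ramping drift bias can be made concrete with a few lines of numerical integration (the bias magnitudes are arbitrary, chosen only for illustration): integrating a constant bias yields a linearly growing cumulative bias, while integrating a ramping bias yields supra-linear (quadratic) growth.

```python
import numpy as np

dt = 0.01
t = np.arange(0, 0.75, dt)               # 750 ms stimulus, as in this dataset
constant_bias = np.full_like(t, 0.2)     # constant drift bias (arbitrary magnitude)
ramping_bias = 0.533 * t                 # ramping drift bias, zero at stimulus onset

# Temporal integration of the bias terms (cumulative bias in the
# decision variable, before adding stimulus-driven drift and noise):
cum_constant = np.cumsum(constant_bias) * dt   # grows linearly with time
cum_ramp = np.cumsum(ramping_bias) * dt        # grows supra-linearly (quadratically)
```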

It has been proposed that decision bounds might collapse over time, implementing an ‘urgency signal’ (Figure 7a, middle; Churchland et al., 2008; Cisek et al., 2009). Indeed, adding collapsing bounds substantially improved our model fits (Figure 7b3). This indicates the presence of a strong urgency signal in this task, which entailed a relatively short stimulus presentation (750 ms) and a tight response deadline (1.25 s after stimulus offset). Critically, also in the DDM with collapsing bounds, a history-dependent drift bias best fit the data (Figure 7b3) and captured individual choice repetition behavior (Figure 7c3). In other words, while there is evidence for collapsing bounds in this dataset, our conclusion about the impact of history bias on decision formation does not depend on their inclusion in the model.
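For intuition, one common parameterization of such a collapsing bound is a hyperbolic decay toward zero; a minimal sketch (b0 and tau are illustrative values, not the fitted parameters):

```python
import numpy as np

def hyperbolic_bound(t, b0=1.0, tau=0.5):
    """Hyperbolically collapsing decision bound: starts at b0 at trial
    onset and shrinks toward zero, so that less cumulative evidence is
    required to commit to a choice as time elapses (an urgency signal)."""
    return b0 * tau / (tau + np.asarray(t, dtype=float))

t = np.linspace(0, 2.0, 201)   # trial time in seconds (stimulus + deadline)
b = hyperbolic_bound(t)        # monotonically decreasing bound height
```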

In the brain, a neural representation of the momentary sensory evidence feeds into a set of accumulators. These consist of circuits of excitatory and inhibitory populations of cortical neurons, which give rise to persistent activity and competitive winner-take-all dynamics (Usher and McClelland, 2001; Wang, 2002). Under certain parameter regimes, these circuit dynamics can be reduced to lower-dimensional models (Bogacz et al., 2006; Wong, 2006). In such models, the effective accumulation time constant 1/λ (with λ being the effective leak) results from the balance of leak within each accumulator (due to self-excitation and passive decay) and mutual inhibition between two accumulators encoding different choices (Usher and McClelland, 2001). Evidence accumulation can then be biased through an internal representation of the sensory input, or through the way this sensory representation is accumulated (Figure 7a, right). We here used a reduced competing accumulator model, where the decision variable was computed as the difference of two leaky accumulators (Busemeyer and Townsend, 1993; Zhang and Bogacz, 2010; see also Brunton et al., 2013) to compare these two accumulation biases and a biased accumulator starting point.

We fit a family of bounded, leaky accumulator models, in which the starting point of the accumulators, their input, or their effective leak λ could be biased as a function of previous choice (Figure 7a, right). Note that a bias of the accumulator starting point would also translate into an accumulation bias, due to the model dynamics (see Materials and methods section Extended bounded accumulation models: General assumptions and procedures). Even so, comparing this regime with the other two biasing mechanisms was informative. Also note that we here use the term ‘leaky accumulator model’ to denote that the model dynamics included a free effective leak parameter λ, without implying that λ < 0 (corresponding to activation decay). Our fits allowed λ to take either negative (‘forgetful’ regime) or positive (‘unstable’ regime) values (Figure 7—figure supplement 1d; see also Brunton et al., 2013). Critically, in order to test for a choice history-dependent accumulation bias, we allowed λ of each accumulator to vary as a function of the previous choice, before computing the difference between the two accumulator activations. Choice history-dependent biases in accumulator starting point or accumulator input were directly applied to the accumulator difference (akin to starting point and drift bias within the DDM). Due to the simplicity of its dynamics, the DDM cannot distinguish between input and leak bias. Indeed, when simulating behavior of leaky accumulator models with either of these two accumulation biases and fitting it with the DDM, both input and λ bias loaded onto DDM drift bias (Figure 7—figure supplement 2). Critically, the leaky accumulator with biased accumulator input best explained the data, among all the models considered (Figure 7b4). Furthermore, the individually estimated input bias predicted individual choice repetition (Figure 7c4).
This suggests that choice history might specifically bias the internal representation of sensory evidence feeding into the evidence accumulation process.
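A minimal sketch of the reduced (one-dimensional) leaky accumulator dynamics illustrates why the bias types differ; all parameter values are illustrative. With λ < 0 (forgetful regime), a starting-point bias decays away over the trial, whereas an input bias keeps driving the decision variable toward one bound.

```python
import numpy as np

def leaky_accumulation(evidence, lam=-2.0, input_bias=0.0, start_bias=0.0, dt=0.01):
    """Euler integration of a reduced leaky accumulator:
    dy = (lam * y + evidence + input_bias) * dt, where y stands in for
    the difference between two accumulators. lam < 0: 'forgetful'
    (leaky) regime; lam > 0: 'unstable' regime. Returns the time course
    of the decision variable y."""
    y = float(start_bias)
    trace = []
    for e in evidence:
        y += (lam * y + e + input_bias) * dt
        trace.append(y)
    return np.array(trace)

ev = np.zeros(200)  # zero-mean evidence stream, to isolate the bias dynamics
decay = leaky_accumulation(ev, start_bias=0.3)  # starting-point bias leaks away
push = leaky_accumulation(ev, input_bias=0.3)   # input bias accumulates over time
```

Under these dynamics the starting-point bias shrinks toward zero while the input bias builds up toward its steady state, which is why the two mechanisms leave different signatures in choices and RT distributions.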

Dynamics of effective bias signal approximates rational combination of prior information with current evidence

Taken together, fits and simulations of more complex models provided additional insight into the mechanism underlying choice history bias. They also corroborated the conclusion that choice history biases are mediated by a biased accumulation of evidence, rather than a biased starting point. As a final step, we estimated the time course of the effective bias, computed as the ratio of the cumulative bias signal to the bound height (Hanks et al., 2011). We simulated this signal based on the group average parameters for the best-fitting leaky accumulator model (Figure 7d). In this leaky accumulator (with collapsing bound), the effective bias accelerated over the course of the trial (Figure 7d).
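The accelerating effective bias can be reproduced with a toy computation that combines a linearly growing cumulative bias (from a constant input bias) with a hyperbolically collapsing bound; all numerical values are illustrative, not the fitted group-average parameters.

```python
import numpy as np

dt = 0.01
t = np.arange(dt, 0.76, dt)          # time from stimulus onset (s)
cum_bias = 0.2 * t                   # cumulative bias from a constant input bias
bound = 1.0 * 0.5 / (0.5 + t)        # hyperbolically collapsing bound height
effective_bias = cum_bias / bound    # cumulative bias as a fraction of the bound
# Because the numerator grows while the denominator shrinks, the
# effective bias accelerates (supra-linear growth) within the trial.
```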

The reader may notice that these (supra-linear) effective bias dynamics are similar to those predicted by the DDM with a ramping drift bias (Figure 7a, left). Thus, the observation that the latter model lost by a wide margin against the two models with more complex dynamics (Figure 7b, see also Materials and methods) is likely due to features of the data other than the (relatively small) selective history bias. Specifically, the RT distributions were strongly shaped by the urgency signal incorporated by the bound collapse. In the overall best-fitting model (leaky accumulator with collapsing bounds and input bias, Figure 7b5), the effective bias depends on the combined effect of two non-linear signals: (i) the cumulative bias resulting from the accumulation of biased input and (ii) the hyperbolically collapsing bound. In the current fits, the effective bias was dominated by the strong bound collapse, but in different circumstances (with a weaker urgency signal and for λ < 0), a biased-input leaky accumulator can produce a decelerating effective bias. Combining a biased input with some starting point and/or leak bias can further change the dynamics. The key observation is that, regardless of the modeling framework used, we identified an effective bias signal that grew steadily throughout decision formation, in line with the main conclusion drawn from the basic fits of the standard DDM.

Our results are in line with the idea that the impact of choice history bias on decision formation grows as a function of elapsed time. This observation might be surprising, as prior information (here: about the previous choice) does not change over time. Yet, previous work has identified a principled rationale for such a time-dependent combination of prior and evidence. When evidence reliability changes from trial to trial, prior information (bias) should be weighted more strongly when sensory evidence is unreliable (Hanks et al., 2011; Moran, 2015). This can be achieved by increasing the weight of the prior throughout the trial, using elapsed time as a proxy for evidence reliability. This prediction was confirmed experimentally for explicit manipulations of prior probability of the choice options (Hanks et al., 2011). Indeed, within the framework of the DDM, this way of combining prior information with current evidence maximizes reward rate (Moran, 2015; see also Drugowitsch and Pouget, 2018). Only when evidence reliability is constant across trials should prior information be incorporated as a static bias (i.e. starting point). Evidence reliability likely varied from trial to trial across all our experiments (Moran, 2015), due to variations in the external input (i.e. mean drift rate in the DDM), originating from stochastically generated stimuli, or internal factors (i.e. drift rate variability in the DDM), such as the inherent variability of sensory cortical responses (Arieli et al., 1996; Faisal et al., 2008). In particular, the dataset from Figure 7 entailed strong trial-to-trial variations in the external input (Figure 7—figure supplement 1). Thus, the dynamics of the effective bias signal uncovered in Figure 7d suggest that participants combined prior information with current evidence in a rational fashion.

Discussion

Quantitative treatments of perceptual decision-making commonly attribute trial-to-trial variability of overt choices to noisy decision computations (Shadlen et al., 1996; Renart and Machens, 2014; Wyart and Koechlin, 2016). Those accounts typically assume that systematic decision biases remain constant over time. Instead, the choice history biases studied here vary continuously over the course of the experiment, as a function of the previous choices (and choice outcome information). Our current results indicate that choice history explains trial-to-trial variability specifically in evidence accumulation, in a number of widely used perceptual choice tasks. Ignoring such trial-to-trial variations will lead to an overestimation of the noise in the evidence accumulation process and resulting behavior.

History biases in perceptual choice have long been known in perceptual psychophysics (Fernberger, 1920) and neuroscience (Gold et al., 2008). However, the underlying dynamic mechanisms have remained elusive. We here show that individual differences in overt choice repetition behavior are explained by the degree to which choices bias the evidence accumulation, not the starting point, of subsequent decisions. This accumulation bias is associated with choices made several trials into the past, and it grows steadily as the current decision unfolds. This insight calls for a revision of current models of choice history biases (Yu and Cohen, 2008; Zhang et al., 2014).

It is instructive to relate our results to previous studies manipulating the probability of the occurrence of a particular category (i.e. independently of the sequence of categories) or the asymmetry between rewards for both choices. Most of these studies explained the resulting behavioral biases in terms of starting point shifts (Leite and Ratcliff, 2011; Mulder et al., 2012; White and Poldrack, 2014; Rorie et al., 2010; Gao et al., 2011; but only for decisions without time pressure, see Afacan-Seref et al., 2018). Yet, one study with variations of evidence strength found an effect of asymmetric target probability on accumulation bias (Hanks et al., 2011) similar to the one we here identified for choice history. In all this previous work, biases were under experimental control: probability or reward manipulations were signaled via explicit task instructions or single-trial cues (in humans) or block structure (in animals). By contrast, the choice history biases we studied here emerge spontaneously and in an idiosyncratic fashion (Figure 2e), necessitating our focus on individual differences.

Our modeling addressed the question of how prior information is combined with new evidence during decision formation (see in particular the section Dynamics of effective bias signal approximates rational combination of prior information with current evidence). But why did participants use choice history as a prior for their decisions? In all our experiments, the sensory evidence was uncorrelated across trials – as is the case in the majority of perceptual choice tasks used in the literature. Thus, any history bias can only reduce performance below the level that could be achieved, given the observer’s sensitivity. It may seem irrational that people use history biases in such settings. However, real-world sensory evidence is typically stable (i.e. auto-correlated) across various timescales (Yu and Cohen, 2008). Thus, people might (erroneously) apply an internal model of this environmental stability to randomized laboratory experiments (Yu and Cohen, 2008), which will push them toward choice repetition or alternation (Glaze et al., 2015). Indeed, people flexibly adjust their choice history biases to environments with different levels of stability (Glaze et al., 2015; Kim et al., 2017; Braun et al., 2018), revealing the importance of such internal models for perceptual decision-making. In sum, together with our conclusions from the time course of the effective bias signal, these considerations suggest that participants may have applied a rational strategy, but based on erroneous assumptions about the structure of the environment.

While we found that choice history-dependent variations of accumulation bias were generally more predictive of individual choice repetition behavior, the DDM starting point was consistently shifted away from the previous response for a majority of participants (i.e. negative values along x-axis of Figure 4a). This shift was statistically significant in three out of six datasets (Figure 4—figure supplement 5a), and might explain the advantage of the dual parameter model over the pure drift-bias model in our model comparisons (Figure 3a). The starting point shift may be due to at least two scenarios, which are not mutually exclusive. First, it might reflect a stereotypical response alternation tendency originating from neural dynamics in motor cortex – for example, a post-movement ‘rebound’ of beta-band oscillations (Pfurtscheller et al., 1996). Indeed, previous work found that beta rebound is related to response alternation in a perceptual choice task, which precluded (in contrast to our tasks) motor planning during evidence accumulation (Pape and Siegel, 2016). This stereotypical response alternation tendency (via starting point) may have conspired with the more flexible history bias of evidence accumulation (via drift bias) to shape choice behavior. Because starting point shifts will predominantly contribute to fast decisions, this scenario is consistent with the average choice alternation tendency we observed for RTs < 600 ms (Figure 4—figure supplement 5c). Because the response alternation tendency in motor cortex is likely to be induced only by the immediately preceding response, this scenario is also consistent with the shorter timescales we estimated for the starting point effects (1.39 trials) than the drift rate effects (2.38 trials; Figure 6a, exponential fits).
Second, the starting point shift may also reflect decision dynamics more complex than described by the standard DDM: non-linear drift biases (Figure 7—figure supplement 2, third column) or biases in the leak of decision accumulators (Figure 7—figure supplement 3, third column). Both give rise to opposite effects on drift bias and starting point bias when fit with the standard DDM, thus yielding negative correlations between DDM starting point and drift bias estimates. Such negative correlations were present in our data, but weak and not statistically significant (Spearman’s rho −0.4130 to −0.0757, combined BF10= 0.0473). It is possible that both of the scenarios discussed here conspired to yield the starting point effects observed in model comparisons and individual parameter estimates. Future work is needed to illuminate this issue, for example through manipulating decision speed and/or the delays between subsequent motor responses, and modeling choice-related neural dynamics in motor cortex.

We propose that choice history biases evidence accumulation, but there are alternative scenarios. First, it is possible that participants’ choices were due to computations altogether different from those incorporated in the bounded accumulation models assessed here. All our models imply simple neural accumulators with persistent activity. At least on a subset of trials, participants may make fast guesses (Noorbaloochi et al., 2015), or engage in automatic decision processing (Servant et al., 2014; Ulrich et al., 2015) or post-accumulation biases (Erlich et al., 2015). The decision computation may also entail noise-driven attractor dynamics (Wang, 2002; Braun and Mattia, 2010) possibly with sudden ‘jumps’ between neural activity states (Latimer et al., 2015), instead of linear accumulation to a threshold level. Even if the accumulation dynamics postulated in our models cannot be reduced to the dynamics of single neurons, the history-dependent accumulation bias we inferred here would constitute a valid description of the collective computational properties of the neural system producing choice behavior. Second, within bounded accumulation models, any directed change in the decision variable can be mimicked by some selective (i.e. asymmetric) change in one of the decision bounds. For example, combining the DDM with a linearly collapsing bound for the favored choice and a linearly expanding bound for the other choice has the same effect on choice fractions and RT distributions as a drift bias. We are not aware of any empirical evidence for such asymmetric changes in decision bounds. Decision-related cortical ramping activity seems to always reach a fixed level just before motor response, irrespective of prior probabilities (Hanks et al., 2011) or speed-accuracy trade-offs (Hanks et al., 2014; Murphy et al., 2016). Instead, the build-up of this activity is biased by prior information (Hanks et al., 2011).

A plausible mechanism underlying the choice history-dependent shift in accumulation bias is a bias of the neural representations of the sensory evidence towards (or away from) a previously selected category (Nienborg and Cumming, 2009; St John-Saaltink et al., 2016; Urai and Wimmer, 2016). This is precisely the ‘input bias’ scenario entailed in our best fitting model (Figure 7). The primate brain is equipped with powerful machinery to bias sensory representations in a top-down fashion (Desimone and Duncan, 1995; Reynolds and Heeger, 2009). In the laboratory, these top-down mechanisms have been probed by explicitly instructing subjects to shift their attention to a particular sensory feature or location. Such instructions induce biased activity states in regions of prefrontal and parietal association cortex, which are propagated down the cortical hierarchy to sensory cortex via selective feedback projections, where they boost the corresponding feature representations and suppress others (Desimone and Duncan, 1995). The same prefrontal and parietal regions accumulate sensory evidence and seem to carry choice history signals. It is tempting to speculate that choice history signals in these regions cause the same top-down modulation of sensory cortex as during explicit manipulations of attention. In other words, agents’ choices might be one factor directing their top-down attention under natural conditions, in a way analogous to explicit attention cues in laboratory tasks. An alternative, but related possibility is that the direction of selective attention fluctuates spontaneously during the course of a perceptual choice experiment, preferentially sampling features supporting one choice for a streak of trials, and then switching to sampling support for the other category. The corresponding top-down modulations would bias evidence accumulation and choice in a serially correlated fashion. 
These ideas are not mutually exclusive and can be tested by means of multi-area neurophysiological recordings combined with local perturbations.

A growing body of evidence points to the interplay of multiple timescales for neural computation in the cortex. One line of behavioral work has revealed effective (within-trial) evidence accumulation over timescales ranging from a few hundred milliseconds (Kiani et al., 2008; Tsetsos et al., 2015) to several seconds (Tsetsos et al., 2012; Wyart et al., 2012; Cheadle et al., 2014). Another line of work, including the current study, revealed the slow accumulation of internal decision variables or external outcome information across trials (tens of seconds) to build up time-varying biases, or priors (Sugrue et al., 2004; Abrahamyan et al., 2016; Purcell and Kiani, 2016b; Braun et al., 2018). Relatedly, neurophysiological work on ongoing activity has inferred multiple hierarchically organized timescales in different cortical regions (Honey et al., 2012; Murray et al., 2014; Chaudhuri et al., 2015; Runyan et al., 2017; Scott et al., 2017). The history-dependent evidence accumulation biases that we have uncovered here might index the interplay between these different effective timescales, with long-timescale accumulators at higher stages biasing short-timescale accumulators at intermediate stages of the cortical hierarchy.

Materials and methods

Datasets: behavioral tasks and participants


We analyzed six different datasets, four of which were previously published. These spanned different modalities (visual or auditory), decision-relevant sensory features (motion direction, contrast, tone presence, motion coherence), and tasks (detection or discrimination). In each dataset, the number of participants was determined to allow for robust estimation of the original effects of interest. No participants were excluded from the analyses.

Those tasks where the decision-relevant sensory evidence was presented until the observer generated a response were called response time (RT) tasks; those tasks where the sensory evidence was presented for a fixed duration, and its offset cued the observer’s response, were called fixed duration (FD) tasks, in line with the terminology from Mazurek et al. (2003). These two protocols have also been termed ‘free response protocol’ and ‘interrogation protocol’ (Bogacz et al., 2006). In all datasets, stimulus strength (i.e., decision difficulty) was kept constant, or varied systematically across levels, within all main experimental sessions that were used for fitting the DDM.

2AFC visual motion direction discrimination task (RT)


These data were previously published (Murphy et al., 2014), and are available at https://doi.org/10.5061/dryad.tb542. The study was approved by the ethics committee of the Leiden University Cognitive Psychology department, and all subjects provided written informed consent before taking part. Twenty-six observers (22 women and 4 men, aged 18–29) performed a motion direction (left vs. right) discrimination task. Stationary white dots were presented on a black screen for an interval of 4.3–5.8 s. After this fixation interval, the decision-relevant sensory evidence was presented: some percentage of dots (the ‘motion coherence’ level) moved to the left or the right. The coherence was individually titrated to yield an accuracy level of 85% correct (estimated from a psychometric function fit) before the start of the main experiment, and kept constant afterwards. The moving dots were presented until observers indicated their choice with a button press. After the response, the fixation cross changed color for 700 ms to indicate single-trial feedback. Each observer performed 500 trials of the task in one session. We refer to this task as ‘Visual motion 2AFC (RT)’.

2AFC visual motion direction discrimination task (FD)

Participants and informed consent


Thirty-two participants (aged 19–35 years, 43 women and 21 men) participated in the study after giving their informed consent. The experiment was approved by the ethical review board of the University Medical Center Hamburg-Eppendorf (PV4714).

Task and procedure


Observers performed a fixed duration version of the random dot motion discrimination (up vs. down) task in the MEG scanner. White dots were displayed on a gray background screen, with a density of 6 dots/degree², resulting in 118 dots on the screen at each frame. The stimuli were confined to a circle of 2.5° radius, which was placed in the lower half of the visual field at 3.5° from fixation. After a fixation interval of 0.75–1.5 s, random dot motion stimuli (0, 3, 9, 27 or 81% motion coherence) were displayed for 750 ms. Signal dots moved at a speed of 11.5 degrees/s, and noise dots were randomly displaced within the circle on each frame. We used the single-trial dot coordinates to construct time courses of fluctuating external evidence (see Materials and methods section Motion energy filtering and psychophysical kernels; Figure 7—figure supplement 1a–c). Observers received auditory feedback 1.5–2.5 s after their response, and the ISI started 2–2.5 s after feedback. Observers performed 1782 trials over three sessions, in which the stimulus transition probability varied (0.2, 0.5 or 0.8) between blocks of 99 trials. To maximize trial counts for the non-hierarchical leaky accumulator fits, we here collapsed across blocks. We refer to this task as ‘Visual motion 2AFC (FD)’.

Visual motion coherence discrimination 2IFC task (FD): dataset 1


These data were previously published in Urai et al. (2017), and are available at http://dx.doi.org/10.6084/m9.figshare.4300043. The ethics committee at the University of Amsterdam approved the study, and all observers gave their informed consent before participation. Twenty-seven observers (17 women and 10 men, aged 18–43) performed a two-interval motion coherence discrimination task. They viewed two consecutive intervals of random dot motion, containing coherent motion signals in a constant direction towards one of the four diagonals (counterbalanced across participants) and judged whether the second test interval (variable coherence) contained stronger or weaker motion than the first reference (constant coherence) interval. After a fixation interval of 0.5–1 s, they viewed two consecutive intervals of 500 ms each, separated by a delay of 300–700 ms. The decision-relevant sensory evidence (i.e. the difference in motion coherence between intervals), was chosen pseudo-randomly for each trial from the set (0.625, 1.25, 2.5, 5, 10, 20, 30%). Observers received auditory feedback on their choice after a delay of 1.5–2.5 s. After continuing to view noise dots for 2–2.5 s, stationary dots indicated an inter-trial interval. Observers self-initiated the start of the next trial (range of median inter-trial intervals across observers: 0.68–2.05 s). Each observer performed 2500 trials of the task, divided over five sessions. We refer to this task as ‘Visual motion 2IFC (FD) #1’.

2IFC visual motion coherence discrimination task (FD): dataset 2

Participants and informed consent


Sixty-two participants (aged 19–35 years, 43 women and 19 men) participated in the study after screening for psychiatric, neurological or medical conditions. All subjects had normal or corrected to normal vision, were non-smokers, and gave their informed consent before the start of the study. The experiment was approved by the ethical review board of the University Medical Center Hamburg-Eppendorf (PV4648).

Task protocol


Observers performed five sessions, of which the first and the last took place in the MEG scanner (600 trials divided over 10 blocks per session) and the three sessions in between took place in a behavioral lab (1500 trials divided over 15 blocks per session). The task was as described above for ‘Visual motion 2IFC (FD) #1’, with the following exceptions. The strength of the decision-relevant sensory evidence was individually titrated to an accuracy level of 70% correct, estimated from a psychometric function fit, before the start of the main experiment and kept constant for each individual throughout the main experiment. Each stimulus was presented for 750 ms. In the MEG sessions, auditory feedback was presented 1.5–3 s after response, and an inter-trial interval with stationary dots started 2–3 s after feedback. Participants initiated the next trial with a button press (across-subject range of median inter-trial interval duration: 0.64 to 2.52 s, group average: 1.18 s). In the training sessions, auditory feedback was presented immediately after the response. This was followed by an inter-trial interval of 1 s, after which the next trial started. In this experiment, three sub-groups of observers received different pharmacological treatments prior to each session, receiving placebo, atomoxetine (a noradrenaline reuptake inhibitor), or donepezil (an acetylcholinesterase inhibitor). These groups did not differ in their choice history bias and were pooled for the purpose of the present study (Figure 4—figure supplement 3). We refer to this task as ‘Visual motion 2IFC (FD) #2’.

Visual contrast yes/no detection task (RT)


These data were previously published (de Gee et al., 2014), and are available at https://doi.org/10.6084/m9.figshare.4806559. The ethics committee of the Psychology Department of the University of Amsterdam approved the study. All participants took part after giving their written informed consent. Twenty-nine observers (14 women and 15 men, aged 18–38) performed a yes/no contrast detection task. During a fixation interval of 4–6 s, observers viewed dynamic noise (a binary noise pattern that was refreshed each frame, at 100 Hz). A beep indicated the start of the decision-relevant sensory evidence. On half the trials, a vertical grating was superimposed onto the dynamic noise; on the other half of trials, only the dynamic noise was shown. The sensory evidence (signal + noise or noise-only) was presented until the observers reported their choice ('yes', grating was present; or 'no', grating was absent), or after a maximum of 2.5 s. The signal contrast was individually titrated to yield an accuracy level of 75% correct using a method of constant stimuli before the main experiment, and kept constant throughout the main experiment. Observers performed between 480 and 800 trials over 6–10 sessions. Six observers in the original paper (de Gee et al., 2014) performed a longer version of the task in which they also reported their confidence levels and received feedback; these were left out of the current analysis, leaving 23 subjects to be included. We refer to this task as ‘Visual contrast yes/no (RT)’.

Auditory tone yes/no detection task (RT)

These data were previously published (de Gee et al., 2017) and are available at https://doi.org/10.6084/m9.figshare.4806562. All subjects gave written informed consent. The ethics committee of the Psychology Department of the University of Amsterdam approved the experiment. Twenty-four observers (20 women and four men, aged 19–23) performed an auditory tone detection task. After an inter-trial interval of 3–4 s, decision-relevant sensory evidence was presented: on half the trials, a sine wave (2 kHz) superimposed onto dynamic noise (so-called TORCs; McGinley et al., 2015) was presented; on the other half of trials only the dynamic noise was presented. The sensory evidence was presented until the participant reported their choice with a button press, or after a maximum of 2.5 s. No feedback was provided. Each individual’s signal volume was titrated to an accuracy level of 75% correct using an adaptive staircase procedure before the start of the main experiment, and kept constant throughout the main experiment. Participants performed between 1320 and 1560 trials each, divided over two sessions. We refer to this task as ‘Auditory yes/no (RT)’.

Model-free analysis of sensitivity and choice history bias

We quantified perceptual sensitivity in terms of signal detection-theoretic d’ (Green and Swets, 1966):

(1) d′ = Φ⁻¹(H) − Φ⁻¹(FA)

where Φ⁻¹ was the inverse of the normal cumulative distribution function, H was the fraction of hits and FA the fraction of false alarms. In the 2AFC and 2IFC datasets, one of the two stimulus categories was arbitrarily treated as signal absent. Both H and FA were bounded between 0.001 and 0.999 to allow for computation of d′ in case of near-perfect performance (Stanislaw and Todorov, 1999). We estimated d′ separately for each individual and, for the two datasets with varying difficulty levels, for each level of sensory evidence.
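In code, Equation 1 with the rate bounding amounts to the following minimal sketch (the function name, trial counts and the use of NumPy/SciPy are ours, not part of the original analysis code):

```python
import numpy as np
from scipy.stats import norm

def dprime(hits, fas, n_signal, n_noise):
    """Signal detection-theoretic d' (Equation 1). Hit and false-alarm
    rates are clipped to [0.001, 0.999] to handle near-perfect performance."""
    H = np.clip(hits / n_signal, 0.001, 0.999)
    FA = np.clip(fas / n_noise, 0.001, 0.999)
    return norm.ppf(H) - norm.ppf(FA)

d = dprime(hits=80, fas=20, n_signal=100, n_noise=100)          # moderate sensitivity
d_perfect = dprime(hits=100, fas=0, n_signal=100, n_noise=100)  # finite thanks to clipping
```

Without the clipping, a perfect session (H = 1 or FA = 0) would yield an infinite d′; the bounds cap it at a large but finite value.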

We quantified individual choice history bias in terms of the probability of repeating a choice, termed P(repeat), regardless of the category of the (previous or current) stimulus. This yielded a measure of bias that ranged between 0 (maximum alternation bias) and 1 (maximum repetition bias), whereby 0.5 indicated no bias.
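P(repeat) is simply the fraction of trials on which the current choice matches the previous one; a sketch (choice coding as −1/1 is assumed for illustration):

```python
import numpy as np

def p_repeat(choices):
    """Probability of repeating the previous choice, P(repeat):
    0 = maximum alternation bias, 1 = maximum repetition bias, 0.5 = no bias."""
    choices = np.asarray(choices)
    return float(np.mean(choices[1:] == choices[:-1]))

p_alt = p_repeat([1, -1, 1, -1, 1, -1])   # strict alternator
p_rep = p_repeat([1, 1, 1, 1, 1, 1])      # strict repeater
```

Note that the stimulus category never enters the computation, so P(repeat) isolates the dependence on the previous choice.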

Drift diffusion model (DDM) fits

General

This section describes the general DDM, with a focus on the biasing mechanisms described in Results and illustrated in Figure 1 (Ratcliff and McKoon, 2008). Ignoring non-decision time, drift rate variability, and starting point variability (see below), the DDM describes the accumulation of noisy sensory evidence:

(2) dy = s·v·dt + c·dW

where y is the decision variable (gray example traces in Figure 1), s is the stimulus category (coded as −1,1), v is the drift rate, and c·dW is Gaussian distributed white noise with mean 0 and variance c²·dt (Bogacz et al., 2006). In the unbiased case, the starting point of the decision variable, y0 = z, is situated midway between the two decision bounds 0 and a:

(3) y0 = z = a/2

where a is the separation between the two decision bounds. A bias in the starting point is implemented by an additive offset zbias from the midpoint between the two bounds (Figure 1a):

(4) y0 = z = a/2 + zbias

A drift bias can be implemented by adding a stimulus-independent constant vbias, also referred to as drift criterion (Ratcliff and McKoon, 2008), to the (stimulus-dependent) mean drift (Figure 1b). This adds a bias to the decision variable that grows linearly with time:

(5) dy = (s·v + vbias)·dt + c·dW

We allowed both bias parameters to vary as a function of observers’ previous choice. These two biasing mechanisms result in the same (asymmetric) fraction of choices, but they differ in terms of the resulting shapes of RT distributions (Figure 1). In previous work, zbias and vbias have also been referred to as ‘prior’ and ‘dynamic’ bias (Moran, 2015) or ‘judgmental’ and ‘perceptual’ bias (Liston and Stone, 2008).
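The two mechanisms can be illustrated with a simple Euler-forward simulation of Equations 2–5 (all parameter values below are hypothetical, chosen only to make the contrast visible; the stimulus-driven drift is set to 0 to isolate the bias terms):

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_ddm(n=2000, a=1.0, v=0.0, z_bias=0.0, v_bias=0.0, c=1.0, dt=0.005):
    """Euler-forward simulation of Equations 2-5. Returns choices
    (+1 = upper bound, -1 = lower bound) and decision times."""
    choices, rts = np.empty(n), np.empty(n)
    for i in range(n):
        y, t = a / 2 + z_bias, 0.0              # Equation 4: biased starting point
        while 0.0 < y < a:
            y += (v + v_bias) * dt + c * np.sqrt(dt) * rng.normal()  # Equation 5
            t += dt
        choices[i] = 1.0 if y >= a else -1.0
        rts[i] = t
    return choices, rts

ch_z, rt_z = simulate_ddm(z_bias=0.2)   # starting point bias toward the upper bound
ch_v, rt_v = simulate_ddm(v_bias=2.0)   # drift bias toward the upper bound
```

Both settings shift choices toward the upper bound, but in the starting point variant the biased (upper-bound) choices are, on average, faster than the unbiased ones, mirroring the RT-distribution signature in Figure 1.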

Estimating HDDM Bias parameters

We used hierarchical drift diffusion modeling as implemented in the HDDM toolbox (Wiecki et al., 2013) to fit the model and estimate its parameters. As recommended by the HDDM toolbox, we specified 5% of responses to be contaminants, meaning that they arise from a process other than the accumulation of evidence - for example, a lapse in attention (Ratcliff and Tuerlinckx, 2002). We fit the DDM to RT distributions for the two choice categories, conditioned on the stimulus category for each trial (s in Equation 2) - a procedure referred to as ‘stimulus coding’. This deviates from the widely used ‘accuracy coding’ approach, in which RT distributions for correct and incorrect choices are fit. Only stimulus coding can capture decision biases towards one choice over the other.

First, we estimated a model without history-dependence. Overall drift rate, boundary separation, non-decision time, starting point, and drift bias were estimated for each individual (Figure 3—figure supplement 1). Across-trial variability in drift rate and starting point were estimated at the group-level only (Ratcliff and Childers, 2015). For the datasets including variations of sensory evidence strength (Visual motion 2AFC (FD) and Visual motion 2IFC (FD) #1), we separately estimated drift rate for each level of evidence strength. This model was used to confirm that the DDM was able to fit all datasets well, and to serve as a baseline for model comparison.

Second, we estimated three different models of history bias, allowing (i) starting point, (ii) drift or (iii) both to vary as a function of the observer’s immediately preceding choice (thus capturing only so-called first-order sequential effects; cf Gao et al., 2009; Wilder et al., 2009). The effect of the preceding choice on each bias parameter was then termed its ‘history shift’. For example, for the visual motion direction discrimination task we separately estimated the starting point parameter for trials following ‘left’ and ‘right’ choices. The difference between these two parameters then reflected individual observers’ history shift in starting point, computed such that a positive value reflected a tendency towards repetition and a negative value a tendency towards alternation. The history shift in drift bias was computed in the same way.

HDDM regression models

We estimated the effect of up to six previous stimuli and choices on history bias using a HDDM regression model. We first created a design matrix X with dimensions trials × (2 · lags), which included pairs of regressors coding for previous stimuli and choices (coded as −1,1), up to (and including) each model’s lag. Two distinct replicas of X were then used as design matrices to predict drift bias (Xv) and starting point (Xz). Drift bias was defined as v ~ 1 + s + Xv, where 1 captured an overall bias for one choice over the other and s indicated the signed stimulus strength. Starting point was defined as z ~ 1 + Xz, with a logistic link function 1/(1 + e^(−X)).

After fitting, parameter estimates were recombined to reflect the effect of previous correct (choice + stimulus) or error (choice − stimulus) trials. We sign-flipped the weight values for alternators (i.e. those participants with a lag-1 repetition probability < 0.5); this makes all the panels in Figure 6a interpretable as a change in each parameter in the direction of individual history bias.

HDDM model fitting procedures

The HDDM (Wiecki et al., 2013) uses Markov-chain Monte Carlo sampling for generating posterior distributions over model parameters. Two features of this method deviate from more standard model optimization. First, the Bayesian MCMC generates full posterior distributions over parameter estimates, quantifying not only the most likely parameter value but also the uncertainty associated with that estimate. Second, the hierarchical nature of the model assumes that all observers in a dataset are drawn from a group, with specific group-level prior distributions that are informed by the literature (Figure 3—figure supplement 1; Wiecki et al., 2013). In practice, this results in more stable parameter estimates for individual subjects, who are constrained by the group-level inference. Note that we also repeated our model fits with more traditional G2 optimization (Ratcliff and Tuerlinckx, 2002) and obtained similar results (Figure 4—figure supplement 2a).

For each variant of the model, we ran 30 separate Markov chains with 5000 samples each. Of those, half were discarded as burn-in and every second sample was discarded for thinning, reducing autocorrelation in the chains. This left 1250 samples per chain, which were concatenated across chains. Individual parameter estimates were then derived from the posterior distributions across the resulting 37,500 samples. All group-level chains were visually inspected to ensure convergence. Additionally, we computed the Gelman-Rubin R̂ statistic (which compares within-chain and between-chain variance) and checked that all group-level parameters had an R̂ between 0.9997 and 1.0406.

Formal comparison between the different model variants was performed using the Akaike Information Criterion (Akaike, 1974): AIC = −2ℒ + 2k, where ℒ is the total log-likelihood of the model and k denotes the number of free parameters. The AIC was computed for each observer, and summed across them. Lower AIC values indicate a better fit, while taking into account the complexity of each model. A difference in AIC values of more than 10 is considered strong evidence that the winning model captures the data better. The conclusions drawn from AIC also hold when using the Deviance Information Criterion for the hierarchical models.
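The AIC comparison reduces to a few lines; here with hypothetical log-likelihoods for two observers under a baseline model (k = 7 free parameters) and a history model (k = 9):

```python
def aic(loglik, k):
    # Akaike Information Criterion: AIC = -2*logL + 2*k
    return -2.0 * loglik + 2 * k

# hypothetical per-observer log-likelihoods, summed across observers
base = aic(-1210.5, 7) + aic(-980.2, 7)   # baseline model
hist = aic(-1190.1, 9) + aic(-962.8, 9)   # history model, two extra parameters
delta = base - hist                        # > 10 counts as strong evidence
```

The two extra parameters cost 2 × 2 = 4 AIC points per observer, so the history model only wins if its improvement in log-likelihood more than offsets that penalty.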

Conditional bias functions

For each variant of the model and each dataset, we simulated data using the best-fitting parameters. Specifically, we simulated 100 responses (choices and RTs) for each trial performed by the observers. These predicted patterns for the ‘baseline model’ (without history-dependence) were first used to compare the observed and predicted patterns of choices and RTs (Figure 3—figure supplement 2).

We used these simulated data, as well as the participants’ choices and RTs, to visualize specific features in our data that distinguish the different biased models (Palminteri et al., 2017). Specifically, we computed conditional bias functions (White and Poldrack, 2014) that visualize choice history bias as a function of RT. Each choice was recoded as a repetition (1) or alternation (0) of the previous choice. We then expressed each choice as being either in line with, or against, the observer’s individual bias (observers being classified as ‘repeaters’ or ‘alternators’ depending on their choice repetition probability). Note that, given this transformation of the data (sign-flipping the bias data for alternators in order to merge the two groups), the fact that the average P(bias) > 0.5 is trivial, and would occur for any generative model of history bias. Conditional bias functions instead focus on the effect of choice history bias as a function of time within each trial, the shape of which distinguishes between different bias sources (Figure 1c).

To generate these conditional bias functions, we divided each (simulated or real) observer’s RT distribution into five quantiles (0.1, 0.3, 0.5, 0.7 and 0.9) and computed the fraction of biased choices within each quantile. The shape of the conditional bias functions for models with zbias and vbias confirms that zbias predominantly produces biased choices with short RTs, whereas vbias leads to biased choices across the entire range of RTs (Figure 3b).
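A minimal sketch of this computation, using five equal-count RT bins as a simple stand-in for the exact quantile-based binning, and a toy observer whose bias is expressed only in fast choices (the starting-point signature):

```python
import numpy as np

def conditional_bias_function(rts, biased, n_bins=5):
    """Fraction of choices in line with the observer's individual bias,
    within successive RT bins (cf. White and Poldrack, 2014)."""
    order = np.argsort(rts)
    chunks = np.array_split(np.asarray(biased, dtype=float)[order], n_bins)
    return np.array([chunk.mean() for chunk in chunks])

# toy data: all choices faster than 0.9 s are bias-consistent, slower ones are not
rts = np.linspace(0.2, 2.0, 100)
biased = rts < 0.9
cbf = conditional_bias_function(rts, biased)
```

A flat conditional bias function above 0.5 would instead be the drift-bias signature: biased choices across the whole RT range.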

Motion energy filtering and psychophysical kernels

For the Visual motion 2AFC (FD) dataset, we used motion energy filtering (using the filters described in Urai and Wimmer, 2016) to reconstruct the time course of fluctuating sensory evidence over each individual trial, averaging over the spatial dimensions of the display (Figure 7—figure supplement 1a, b). These single-trial traces then served as the time-resolved input to a set of extended DDM and leaky accumulator models (Figure 7). Specifically, filtering the stimuli at 60 Hz (the refresh rate of the LCD projector) resulted in 45 discrete samples for the 750 ms viewing period of each trial. The first 13 samples of the motion energy filter output (first 200 ms of the viewing interval) corresponded to the ‘rise time’ of the filter (Kiani et al., 2008), yielding outputs that were a poor representation of the actual motion energy levels (see also Figure 7—figure supplement 1a). In order to prevent those uninterpretable filter outputs from contributing, we discarded the first 15 samples (250 ms) before model fitting (see below). Using constant interpolation, we expanded the remaining 30 samples onto 150 samples, which, given that the simulation Euler step was 5 ms (dt = 0.005), corresponded to a 750 ms long input time series. In the model descriptions below, we denote the input time series by M = {Mt : t ∈ T}, with T = {1, 2, …, 150}.

We also used these motion energy traces to construct so-called psychophysical kernels. Within each stimulus identity (motion direction and coherence, excluding the easiest 81% coherence trials), we subtracted the average motion energy traces corresponding to ‘up’ vs. ‘down’ choices. The resulting trace represents the excess motion energy that drives choices, over and above the generative stimulus coherence (Figure 7—figure supplement 1c).
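The kernel computation is a choice-conditioned average of the residual motion energy; a self-contained toy version (the simulated observer, trial counts and primacy weighting are illustrative assumptions, not the empirical data):

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_samples = 500, 30   # hypothetical: 30 motion-energy samples per trial

# residual motion energy after removing the generative coherence (zero-mean noise)
energy = rng.normal(0.0, 1.0, (n_trials, n_samples))

# toy observer whose choices are driven by the early samples (primacy)
choice_up = energy[:, :10].mean(axis=1) > 0

# psychophysical kernel: excess motion energy for 'up' minus 'down' choices
kernel = energy[choice_up].mean(axis=0) - energy[~choice_up].mean(axis=0)
```

For this primacy-weighted observer, the kernel is elevated over the early samples and falls back to zero for the samples the observer ignores, which is the diagnostic use of such kernels.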

Extended bounded accumulation models

General assumptions and procedures

In the 2AFC (FD) visual motion experiment participants viewed the stimulus for 0.75 s (hereafter called ‘viewing period’) and could respond only after the stimulus offset. This required specifying the input to the evidence accumulation process. In the models described below, we used separate simulation protocols, based on different assumptions about this input. In the ‘dynamic’ protocol, where the input was the time-varying sensory evidence from each trial, the accumulation process was assumed to start at stimulus onset, and responses could happen during the motion viewing interval. The average activity of the accumulator(s) at stimulus offset served as input for accumulation during the post-offset period. For fitting models using this protocol, empirical RTs were calculated relative to the stimulus onset. Motion energy estimates were used as time-resolved input to the model.

By contrast, in the ‘default’ protocol, the motion energy fluctuations were averaged across the viewing interval excluding the filter rise time (i.e. from 250 to 750 ms after stimulus onset), and the average motion energy was then used as a single-trial drift rate for the accumulation process. In other words, the accumulation-to-bound dynamics only took place during the post-offset period. Accordingly, when fitting models with this protocol, the empirical RTs were calculated relative to stimulus offset. Using this protocol was necessary for replicating our basic result from the standard DDM fits: in the ‘dynamic’ protocol, any starting point bias would turn into a drift bias, because it would feed into the accumulation process after stimulus offset, precluding the comparison between the two forms of bias. Thus, we used only the default protocol for the standard DDM fits, which aimed at differentiating between starting point and accumulation biases. For comparison, we also used the same simulation protocol when fitting an extended DDM with both a constant and a ramping component in the drift bias (see below). We then switched to the more realistic dynamic protocol for the subsequent models with more complex dynamics.

The AIC scores of models using the default protocol were generally lower (i.e. better) compared to the respective models that used the dynamic protocol. This difference is likely due to the fact that the dynamic protocol is more constrained by using as input to the models the exact motion energy traces rather than just their mean for each trial. AIC is blind to such latent flexibility differences that do not map onto differences in number of parameters. Thus, AIC may have ‘under-penalized’ models in the default protocol relative to those in the dynamic protocol.

In all models and in both simulation protocols, model predictions were derived via Monte Carlo simulation. The variance of the processing noise was set to c² = 1. One simulation time-step corresponded to 5 ms (Euler step, dt = 0.005). Finally, in the default protocol the accumulation process could last for a maximum of 300 time-steps (or 1500 ms) and in the dynamic protocol for a maximum of 450 time-steps (or 2250 ms). After these time points, the process timed out and a response was assigned according to the state of the diffusion variable (e.g. in the standard DDM, right if y > a/2 and left if y < a/2).

DDM variants with default simulation protocol

For all basic DDM variants described in this section, we used the default simulation protocol: the time-averaged motion energy for each trial provided the drift-rate (v) driving the subsequent diffusion process. DDM models had five generic parameters: threshold (a), noise scaling (g), non-decision time (Ter), drift-rate variability (sv) and starting-point variability (sz).

Naïve DDM. We denote with y the state of the diffusion variable. At time 0:

(6) y0 = z = a/2 + U(−sz, sz)

where U was a uniform random variable (rectangular distribution) in the (-sz,sz) range. The evolution of y was described by:

(7) dy = g·v̈·dt + c·dW

Above, g was the scaling parameter that controls the signal-to-noise ratio (given that c is fixed at 1). The variable v̈ was the effective drift rate, that is, a Gaussian variable N(m, sv²), where sv was the drift-rate variability and m was the average of the motion energy on each trial. A response was generated when the decision variable y exceeded a (right choice) or fell below 0 (left choice). The moment that either of these boundaries was crossed, plus a non-decision time Ter, determined the per-trial RT.

Starting point DDM. This model was the same as the naïve model but with an extra parameter zbias such that at time 0:

(8) y0 = a/2 + U(−sz, sz) + zbias·prev

The variable prev here encoded the previous choice (1: right, -1: left). If zbias was positive the model implemented repetition and if negative it implemented alternation.

Drift bias DDM. Same as the naïve model but with an extra biasing parameter vbias such that:

(9) dy = (g·v̈ + vbias·prev)·dt + c·dW

Hybrid DDM. This version combined the starting point DDM and drift bias DDM using two biasing parameters.

Simple Ramping DDM. This model was the same as the naïve model but with an extra parameter sramp such that:

(10) dy = (g·v̈ + sramp·prev·(t/tmax))·dt + c·dW

where t denoted time elapsed in terms of Monte-Carlo time-steps and tmax = 300 time-steps, which was the maximum duration that a given trial could run for.

Hybrid Ramping DDM. Same as the naïve model but with 2 extra parameters sramp and sconstant such that:

(11) dy = (g·v̈ + (sconstant + sramp·(t/tmax))·prev)·dt + c·dW

This model thus implemented a drift bias that was nonzero at the start of the trial (sconstant) and increased linearly until the end of the trial (with slope sramp).
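The deterministic part of the bias term in Equation 11 is easy to visualize; a sketch with hypothetical values for sconstant and sramp:

```python
import numpy as np

t_max = 300                       # maximum trial duration in Monte Carlo time-steps
s_constant, s_ramp = 0.3, 0.8     # hypothetical constant and ramping components
prev = 1                          # previous choice (1: right, -1: left)

t = np.arange(t_max)
bias = (s_constant + s_ramp * t / t_max) * prev   # Equation 11, bias term only
```

The bias starts at sconstant and grows linearly toward sconstant + sramp over the trial; with prev = −1 the whole time course flips sign, biasing toward the other bound.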

Extended models with dynamic simulation protocol

For all subsequently described models, we used the dynamic simulation protocol (see section General Assumptions and Procedures), with the motion energy time courses serving as input to the accumulation process. To illustrate the details of the dynamic protocol, we next describe how the decision variable was updated in the case of the naïve DDM. The decision variable during the viewing period evolved according to the following differential equation:

(12) dy(t) = g·Mt·dt + c·dW

where Mt was the value of the input signal at time t. Following stimulus offset (at time T, i.e. after 150 time-steps), the diffusion variable carried on being updated as follows:

(13) dy(t) = (y(T)/T)·dt + c·dW

In other words, after the stimulus disappeared, accumulation was driven by the average evidence accumulated up to the point of stimulus offset. This post-stimulus accumulation could continue for a maximum of 300 extra time-steps, at which point the process timed-out.

Simple and Hybrid Ramping DDM. This model was the same as the above Simple and Hybrid Ramping DDMs, only now fit by using the dynamic simulation protocol (i.e. the ramping drift-criterion bias is applied for the viewing period only and, following stimulus offset, the decision variable is updated according to Equation 13).

Dynamic DDM with collapsing bounds

In the ‘collapsing bounds’ DDM models, a response was generated when the diffusion variable (y) exceeded bup (right choice) or fell below bdown (left choice). The two thresholds, bup and bdown, varied over time as follows:

(14.1) bup(t) = clamp(a − a·t/(t + c), a/2, a)
(14.2) bdown(t) = clamp(a·t/(t + c), 0, a/2)

In the above, clamp(x, min, max) indicates that x was clamped such that x ∈ [min, max].

The moment that either of these boundaries was reached, plus a non-decision time Ter, determined the per-trial RT. The dynamic DDM model had five basic parameters: threshold initial value (a), threshold collapse rate (c), noise scaling (g), non-decision time (Ter), and starting-point variability (sz).
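The bound time courses of Equations 14.1–14.2 can be sketched as follows (the values a = 1 and collapse rate c = 3 are illustrative):

```python
import numpy as np

def clamp(x, lo, hi):
    return np.minimum(np.maximum(x, lo), hi)

def collapsing_bounds(t, a, c):
    """Time-varying bounds (Equations 14.1-14.2): both bounds collapse
    hyperbolically toward the midpoint a/2, each clamped to its half-range."""
    b_up = clamp(a - a * t / (t + c), a / 2.0, a)
    b_down = clamp(a * t / (t + c), 0.0, a / 2.0)
    return b_up, b_down

t = np.linspace(0.0, 10.0, 101)
b_up, b_down = collapsing_bounds(t, a=1.0, c=3.0)
```

Both bounds meet at a/2 once a·t/(t + c) reaches a/2 (at t = c for these parameters), after which the clamping keeps them there, forcing increasingly noise-dominated responses at long decision times.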

Starting point dynamic DDM

Here, the state of the diffusion variable was initialized according to Equation 8. Thus, the starting point model had 6 free parameters (the five basic ones plus the starting point bias, zbias).

Drift-bias dynamic DDM

The diffusion variable at time 0 was initialized according to Equation 6. During the viewing period, the diffusion variable was not updated according to Equation 12 but according to:

(15) dy(t)=(gMt+vbiasprev)dt+cdW

The drift-bias model had the five basic parameters plus the drift-bias parameter (vbias). Finally, the hybrid dynamic DDM had two biasing parameters (zbias and vbias) and overall seven free parameters. Its diffusion variable was initialized according to Equation 8, evolved during the viewing period according to Equation 15, and in the post-stimulus period according to Equation 13.

Leaky Accumulator Models – General

The leaky accumulator model was based on models described before (Busemeyer and Townsend, 1993; Zhang and Bogacz, 2010), constituting an extension of the DDM:

(16) dy = (s·v + λ·y)·dt + c·dW

where the rate of change of y now also depends on its current value, with a magnitude controlled by the additional parameter λ, the effective leak, which reflects the time constant of the accumulation process.

We defined three dynamic variants (c.f. dynamic DDM above) of the leaky accumulator model in order to account for history biases. These different biasing mechanisms were further crossed with two different bound regimes: static or collapsing bounds, as described for the DDM above.

Leaky Accumulator with Starting Point Bias

Here, the diffusion variable was initiated according to Equation 8. During the viewing period, it was updated according to:

(17.1) dy(t) = (λ·y(t) + g·Mt)·dt + c·dW

After stimulus offset, accumulation continued according to:

(17.2) dy(t) = (λ·y(t) + y(T)/T)·dt + c·dW

Leaky Accumulator with Input Bias

Here, the diffusion variable was initiated according to Equation 6. The evolution of the decision variable during the viewing period was described by:

(18) dy(t) = (λ·y(t) + g·Mt + vbias·prev)·dt + c·dW

After stimulus offset accumulation continued according to Equation 17.2. Responses were determined by a static threshold crossing mechanism, as in the standard DDM models described above.

The third leaky accumulator model we defined, the λ-bias model, accounted for history biases by introducing an asymmetry in the dynamics of evidence accumulation. In this model, we followed a different implementation in order to enable biasing the effective leak (λ) parameter: we reformulated the model to describe two separate accumulators that integrate the sensory evidence. We define the diffusion variable as y=yA-yB, with yA and yB being two independent accumulators coding for the right and left choice. The two accumulators were initialized as follows:

(19.1) yA(0) = U(−sz, sz)
(19.2) yB(0) = 0

Starting point variability was thus applied only to one accumulator, which was equivalent to applying this variability on their difference (diffusion variable y).

During the viewing period the two accumulators were updated according to:

(20.1) dyA(t) = (λA·yA(t) + g·fA(Mt))·dt + c·dW/√2
(20.2) dyB(t) = (λB·yB(t) + g·fB(Mt))·dt + c·dW/√2

The variance of the processing noise applied to each accumulator was divided by two, such that the processing variance of the accumulators’ difference (variable y) was c², as in the DDM.

The functions fA and fB were threshold linear functions, with fA setting negative values to 0 and fB setting positive values to 0. Specifically:

(20.3) fA(x) = x if x > 0; 0 if x ≤ 0
(20.4) fB(x) = 0 if x > 0; x if x ≤ 0

Thus, the yA accumulator 'listened' only to the positive values of the input stream, while yB only to the negative values. The effective leak parameters for each accumulator were defined as follows:

(20.5) λA = λ + fA(prev)·λbias
(20.6) λB = λ + fB(prev)·λbias
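A compact sketch of the input split and the leak asymmetry of Equations 20.3–20.6 (the values of λ and λbias are hypothetical):

```python
import numpy as np

def f_A(x):
    # Equation 20.3: keeps only positive input (feeds the 'right' accumulator)
    return np.maximum(x, 0.0)

def f_B(x):
    # Equation 20.4: keeps only negative input (feeds the 'left' accumulator)
    return np.minimum(x, 0.0)

lam, lam_bias = -2.5, 0.5    # hypothetical effective leak and leak bias
prev = 1                     # previous choice (1: right, -1: left)

lam_A = lam + f_A(prev) * lam_bias   # Equation 20.5
lam_B = lam + f_B(prev) * lam_bias   # Equation 20.6
```

Because prev enters through the same threshold-linear functions, only the accumulator matching the previous choice gets its leak adjusted, which is what creates the accumulation asymmetry in this model.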

Leaky Accumulator with Static Bounds

A response was initiated when the difference between the two accumulators (y) exceeded a positive threshold +a (right choice) or fell below a negative threshold −a (left choice). These leaky accumulator models each had one biasing parameter, as well as the following five basic parameters: threshold value (a), effective leak (λ), noise scaling (g), non-decision time (Ter), and starting-point variability (sz).

Leaky Accumulator with Collapsing Bounds

We implemented versions of the leaky accumulator models described above using collapsing bounds. For the input and starting point bias models, the time-varying bounds are described in Equations 14.1 and 14.2. For the λ bias model, collapsing bounds had the same functional form but their asymptote was set to 0 (mirroring the fact that in this model the neutral point of the y=yA-yB decision variable was at 0, rather than at a/2 as in all other models involving a single accumulator):

(21.1) bup(t) = clamp(a − a·t/(t + c), 0, a)
(21.2) bdown(t) = clamp(a·t/(t + c) − a, −a, 0)

Model fitting procedures

We fit the extended models using a Quantile Maximal Likelihood (QMPE) approach. Under this approach, empirical RT values are classified into bins defined by the 0.1, 0.3, 0.5, 0.7 and 0.9 quantiles of the RT distribution (six bins overall). RT quantiles were derived separately for the various coherence levels. We excluded the 81% coherence trials and pooled together the 0% and 3% coherence trials, as RT quantiles in these trials were not distinguishable. This resulted in quantiles for each of three difficulty levels (0% and 3%, 9%, and 27%), for each of the two responses (correct/error), and for two history conditions (motion direction in current trial consistent or inconsistent with the previous response), leading to 6 bins × 3 coherence × 2 response × 2 history = 72 bins per participant. Denoting the number of empirical observations in a particular bin k by nk and the probability predicted by the model to derive a response in a particular bin k by Pk, the likelihood L of the data given the model is defined as:

(22) L = ∏k Pk^nk
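In practice, Equation 22 is evaluated in log space to avoid numerical underflow (the bin counts and predicted probabilities below are hypothetical):

```python
import numpy as np

# hypothetical observed counts n_k and model-predicted bin probabilities P_k
n = np.array([12, 30, 28, 20, 8, 2])              # six RT bins for one condition
P = np.array([0.10, 0.28, 0.30, 0.20, 0.09, 0.03])

# Equation 22 in log space: log L = sum_k n_k * log(P_k)
logL = float(np.sum(n * np.log(P)))
```

The product form of Equation 22 would underflow to 0.0 in double precision for realistic trial counts, which is why the summed log form is the quantity actually optimized.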

We applied a commonly used multi-stage approach to fit our simulation-based models (e.g. Teodorescu et al., 2016). First, each fitting session started by generating 20 random parameter sets, drawn from a uniform distribution bounded by the range of each parameter. To improve the precision of likelihood estimates, we generated 10 synthetic trials for each experimental trial, replicating the trials for a given participant. We then computed the likelihood of the model parameters given the data. The parameter set with the best fit out of the initial 20 was used as the starting point for a standard optimization routine (the fminsearchbnd function in Matlab, which implements a constrained version of the Nelder-Mead simplex algorithm). In total, we ran 50 such fitting sessions, each with a different random seed. Second, we chose the best-fitting parameter set from each of the 50 sessions and recomputed the likelihood while generating 20 synthetic trials for each experimental trial. Third, the five best-fitting of these 50 sets were used as starting points for fminsearchbnd, which further refined the local minima of the fit. Fourth, we recalculated the likelihood of the single best parameter set in simulations with 30 synthetic trials for each experimental trial (see Equation 22). For each model f, AIC values were calculated at the group level:

(23) AICf = −2·Σs=1..N ln(ℒs) + 2·mf

where N is the total number of participants and s is the participant index, ℒs denotes the maximum likelihood estimate for participant s, and mf is the number of free parameters of model f.

Effective bias signal

We calculated the effective bias signal (as in Hanks et al., 2011) for the winning leaky accumulator model with collapsing bounds (Figure 7d). We assumed that the current choice is biased in the direction of the previous choice (repetition bias). We arbitrarily set the previous choice to ‘right’ (prev = 1), which means that the biasing mechanism pushes the decision variable closer to the upper boundary. The effective bias signal at time t was obtained by dividing the value of the cumulative bias signal by the value of the upper bound at that moment.

We took the average of the absolute input bias parameter, so as to emulate a repetition bias. Participants were divided into two groups based on the sign of the fitted parameter λ. We calculated the effective bias signal in two instances: a) by averaging parameters across participants with λ > 0, and b) by averaging parameters across participants with λ < 0. Because the time courses were very similar in these two cases, in Figure 7d we show the average of the two effective bias signals.

Model simulations

We simulated various biasing mechanisms within the frameworks of the DDM and the leaky accumulator models. Per biasing mechanism, we simulated 100,000 traces in time-steps of 10 ms, using Equation 2 (DDM) and Equation 18 (leaky accumulator).

For the DDM simulations (Figure 7—figure supplement 3), the main parameters were: boundary separation = 1; drift rate = 1; non-decision time = 0.1; starting point = 0.5 (expressed as a fraction of the boundary separation); drift bias = 0; drift rate variability = 0.5. We simulated three levels of starting point bias (0.56, 0.62 and 0.68), three levels of constant drift bias (0.2, 0.5 and 0.8), three levels of a time-dependent linear increase in drift bias (1.5/s, 2.5/s and 3.5/s), three levels of constant drift bias (0.2, 0.5 and 0.8) in combination with hyperbolically collapsing bounds (given by Equation 14 and using c = 3), and three levels of a rate at which one bound collapsed and the other expanded (0.2/s, 0.5/s and 0.8/s).

For the leaky accumulator simulations (Figure 7—figure supplement 2), the main parameters for each accumulator were: input = 1; boundary = 0.42; λ = −2.5; starting point = 0; input bias = 0. The negative λ’s determined that the accumulators were self-excitatory in nature (as opposed to leaky). We chose this to match the primacy effects observed in the data (Figure 7—figure supplement 1d). We simulated three levels of starting point bias (0.05, 0.10 and 0.15), three levels of input bias (0.2, 0.5 and 0.8), and three levels of λ-bias between the two accumulators (−3 vs −2, −4 vs −1, and −5 vs 0).

We then fit DDM models separately to each of the simulated datasets, estimating the following parameters: boundary separation, drift rate, non-decision time, starting point, drift bias and drift-rate variability.

Statistical tests

Request a detailed protocol

We quantified across-subject correlations between P(repeat) and the individual history components in DDM bias parameter estimates using Spearman’s rank correlation coefficient ρ. The qualitative pattern of results does not depend on the choice of a specific correlation metric. Even though individual subject parameter estimates are not independent due to the hierarchical nature of the HDDM fit, between-subject variance in parameter point estimates can reliably be correlated to an external variable - in our case, P(repeat) - without inflation of the false positive rate (Katahira, 2016). The difference between two correlation coefficients that shared a common variable, and its associated p-value, was computed using Steiger’s test (Steiger, 1980).
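For concreteness, the two statistics described here can be sketched in a few lines. This is a hypothetical illustration on synthetic data, not the analysis code; the Steiger Z formula follows the commonly used version based on the mean of the two correlations, and all variable names are ours:

```python
import math
import numpy as np

def spearman(x, y):
    """Spearman's rho via ranks (assumes no ties, as for continuous data)."""
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return np.corrcoef(rx, ry)[0, 1]

def steiger_test(r12, r13, r23, n):
    """Steiger (1980) Z-test for two dependent correlations r12 and r13
    that share variable 1, given their intercorrelation r23."""
    z12, z13 = math.atanh(r12), math.atanh(r13)
    rbar = (r12 + r13) / 2.0
    # covariance term for the two correlation estimates
    s = (r23 * (1 - 2 * rbar**2)
         - 0.5 * rbar**2 * (1 - 2 * rbar**2 - r23**2)) / (1 - rbar**2) ** 2
    z = (z12 - z13) * math.sqrt((n - 3) / (2 * (1 - s)))
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p

# synthetic example: P(repeat) correlates strongly with one DDM history
# parameter and weakly with the other
rng = np.random.default_rng(0)
p_repeat = rng.normal(size=60)
drift_shift = 0.7 * p_repeat + rng.normal(scale=0.5, size=60)
start_shift = 0.1 * p_repeat + rng.normal(size=60)

rho_v = spearman(p_repeat, drift_shift)
rho_z = spearman(p_repeat, start_shift)
z_stat, p_val = steiger_test(rho_v, rho_z,
                             spearman(drift_shift, start_shift), n=60)
```

The shared variable (here, P(repeat)) is what makes the two correlations dependent and motivates Steiger's test rather than a comparison of two independent coefficients.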

We used Bayes factors to quantify the strength of evidence across our different datasets. We first computed the Bayes factor for each correlation (between P(repeat) and the history shift in starting point, and between P(repeat) and the history shift in drift bias) (Wetzels and Wagenmakers, 2012). We then multiplied these Bayes factors across datasets to quantify the total evidence in favor of, or against, the null hypothesis of no correlation (Scheibehenne et al., 2016). BF10 quantifies the evidence in favor of the alternative versus the null hypothesis, where BF10 = 1 indicates that the data are equally likely under both hypotheses. BF10 < 1/10 or BF10 > 10 is taken to indicate substantial evidence for H0 or H1, respectively (Kass and Raftery, 1995).
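The combination rule described here is simply a product of independent Bayes factors; a minimal sketch, with made-up per-dataset BF10 values:

```python
def combine_bf10(bf10_per_dataset):
    """Multiply independent BF10 values across datasets."""
    total = 1.0
    for bf in bf10_per_dataset:
        total *= bf
    return total

def interpret_bf10(bf10):
    """Kass & Raftery (1995) style labels, as used in the text."""
    if bf10 > 10:
        return "substantial evidence for H1"
    if bf10 < 1 / 10:
        return "substantial evidence for H0"
    return "inconclusive"

# six hypothetical datasets, each only weakly favoring the alternative:
total = combine_bf10([2.1, 1.8, 3.0, 1.5, 2.4, 1.9])
label = interpret_bf10(total)  # modest per-dataset evidence compounds
```

This illustrates why replication across datasets matters: each individual BF10 is below the "substantial" threshold, but their product is not.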

Data and code availability

Request a detailed protocol

All behavioral data, model fits and analysis code are available under a CC-BY 4.0 license at https://doi.org/10.6084/m9.figshare.7268558. Analysis code is also available on GitHub (https://github.com/anne-urai/2018_Urai_choice-history-ddm; copy archived at https://github.com/elifesciences-publications/2018_Urai_choice-history-ddm; Urai and de Gee, 2019).

References

Green DM, Swets JA (1966) Signal Detection Theory and Psychophysics. John Wiley and Sons.

Kass RE, Raftery AE (1995) Bayes factors. Journal of the American Statistical Association 90:773–795. https://doi.org/10.1080/01621459.1995.10476572

Leite FP, Ratcliff R (2011) What cognitive processes drive response biases? A diffusion model analysis. Judgment and Decision Making 6:651–687.

Stanislaw H, Todorov N (1999) Calculation of signal detection theory measures. Behavior Research Methods, Instruments, & Computers 31:137–149. https://doi.org/10.3758/BF03207704

Sutton RS, Barto AG (1998) Reinforcement Learning: An Introduction. Cambridge: A Bradford Book.

White CN, Poldrack RA (2014) Decomposing bias in different types of simple decisions. Journal of Experimental Psychology: Learning, Memory, and Cognition 40:385–398. https://doi.org/10.1037/a0034851

Wiecki TV, Sofer I, Frank MJ (2013) HDDM: hierarchical Bayesian estimation of the drift-diffusion model in Python. Frontiers in Neuroinformatics 7:14. https://doi.org/10.3389/fninf.2013.00014

Wilder M, Jones M, Mozer MC (2009) Sequential effects reflect parallel learning of multiple environmental regularities. In: Bengio Y, Schuurmans D, Lafferty JD, Williams CKI, Culotta A, editors. Advances in Neural Information Processing Systems 22. Curran Associates, Inc. pp. 2053–2061.

Yu AJ, Cohen JD (2008) Sequential effects: superstition or rational behavior? Advances in Neural Information Processing Systems 21:1873–1880.

Zhang S, Huang CH, Yu AJ (2014) Sequential effects: a Bayesian analysis of prior bias on reaction time and behavioral choice. Cognitive Science Society.

Decision letter

  1. Timothy Verstynen
    Reviewing Editor; Carnegie Mellon University, United States
  2. Barbara G Shinn-Cunningham
    Senior Editor; Carnegie Mellon University, United States
  3. Timothy Verstynen
    Reviewer; Carnegie Mellon University, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Choice history biases subsequent evidence accumulation" for consideration by eLife. Your article has been reviewed by three peer reviewers, including Timothy Verstynen as the Reviewing Editor and Reviewer #3, and the evaluation has been overseen by Barbara Shinn-Cunningham as the Senior Editor.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

This manuscript by Urai and colleagues reports a thorough analysis of how choice history impacts decision processes during perceptual decision-making. Analyzing data from 6 experiments (4 previously published), the authors asked whether choice history modifies the parameters of the drift diffusion model in a reliable and meaningful way. They consistently find that the influence of past choices on future decisions is best explained by a bias in drift rate toward the previously selected decision (regardless of accuracy). While there was evidence of a minor influence of history on the starting point bias, this effect was largely dwarfed by the modulation of the drift rate. The authors contrast their results with prior work suggesting that choice history primarily shifts the starting point of the decision process.

This is an extremely well-written paper, supported by a sound, thorough, and sophisticated set of analyses and strikingly clear visualization. The authors' dedication to replicating the computational mechanism is laudable, and the (five-times-replicated) results give strongly convincing evidence that choice history biases the drift-rate in perceptual decision-making. That being said, there are several open questions that remain unresolved.

All reviewers were enthusiastic about the paper; however, they also identified several issues that need to be addressed before the study can be accepted.

Essential revisions:

1) Dual parameter models. All three reviewers thought that there needed to be a more in depth investigation of the dual parameter models.

Reviewer 1 pointed out that the authors repeatedly conclude that history shifts in drift rate, but not starting point, account for choice history effects, but they show that a model that lets both of these parameters vary with choice history fits better than either one alone in 5 out of 6 datasets. This is potentially interesting, and the paper would benefit from a more nuanced/accurate interpretation of this result. As it is currently written, the narrative is that either of these mechanisms could account for choice history effects, but only drift rate does. While drift rate seems to have a stronger effect, the authors need to present an intuition about why the model including both as free parameters fits the best. What does the starting point add, that the drift rate doesn't account for? How do the starting point and drift rate interact? Is it the case that one (starting point) accounts for choice history effects on fast RT trials, and the other (drift rate) accounts for effects on slower RT trials? Delving deeper here could make the paper richer. To this point, in Figure 3B, what is the performance of a model including both z and vbias?

Reviewer 1 also points out that, in Figure 6, for the model that incorporates trial history, it looks like only 2 of 6 datasets are best fit by a model including both z and vbias, and most of the rest favor vbias. Does this reflect their parameterization of z? Did the authors explore the possibility that z and vbias might integrate over different timescales? The Materials and methods section reports that they fit the same regressors (X) for vbias and z in the HDDM regression models; if that's not the case, the authors should clarify. z seems to help model performance when it reflects one trial back, why is that not the case for longer timescales?

Reviewer 2 pointed out that the authors appear to treat choice history effects as monolithic with respect to mechanism. That is, they test whether influences of previous choices are universally (across the tasks explored) accounted for by certain model parameters and not others, whether those relate to response repetition versus alternation. This assumption may ultimately be justified, but the paper does not offer support for it from the literature or from the data examined. It would be appropriate to include a set of analyses that split the drift and starting point biases in two, with one set of fit parameters increasing the bias towards a given response after it is chosen, and the other set decreasing that bias. (The authors should select whether to do this in the standard DDM framework or one of their better-performing models, but they certainly do not need to test this across their wide set of models.) How does such a model compare to their current version that does not distinguish between these response biases? Do the currently observed relative influences of drift and starting point hold across both directions of bias?

Finally, reviewer 3 pointed out that the original model comparisons (Figure 3) show that in 4 of the 6 experiments a dual parameter model best explains behavior. While the authors make a clear case that choice bias drives drift-rate bias changes, it could be that another factor (e.g., accuracy) is also modifying starting point bias. In fact, the previous papers cited above show that accuracy biases boundary height. Given this previous literature, the authors should consider what other factors may independently be manipulating starting point bias (and boundary height).

2) Choice history effects. All three reviewers requested a more detailed elaboration of the nature of the choice history effects.

Reviewer 1 pointed out that, in Figure 4, when the authors compare history shifts in z and vbias to each subjects' choice repetition, are they only looking at trials where the current evidence and the previous choice were congruent (meaning, if I chose right on trial t-1, and right evidence > left evidence on trial t)? If so, this should be clarified in the text. If not, is this why the z parameter does so poorly? How would these plots look for trials when previous choice and current evidence are congruent vs. incongruent?

Reviewer 2 strongly recommended revising the Introduction and the start of Results to enumerate the choice history biases that have previously been observed, and the approach of the current analyses in clustering these together (this was not clear on first read).

Reviewer 2 also pointed out that it wasn't clear what distinction the authors were drawing between the prior reported effects and the current experiments – they say that the current ones "emerge spontaneously and in an idiosyncratic fashion" but estimates of prior probability could also change spontaneously and idiosyncratically. Perhaps the distinction they are drawing (alluded to in the following paragraph) is between environments that naturally engender a rational basis for choice history biases (e.g., autocorrelation between responses) versus those that do not. This could still predict that the same rational biases that produce starting-point biases in other experiments could result in shorter term history effects in experiments like the current one, if participants treat short runs of the same response as weak evidence for autocorrelation. The current findings suggest that this is not the case for these data. Also it is unclear whether the authors find any cumulative effects of choice history if they look at how many times a given response was given prior to the current response (this is similar to the current lagged analyses but assumes a specific cumulative effect).

Reviewer 3 pointed out that the authors lack evidence to justify their claim that the accuracy of the preceding trials does not influence choice history effects on the drift rate. This inference is largely based on the fact that the drift rate correlations with repetition probability pass a significance threshold (Bayes factor and p-value) both when the previous trial was correct and when it was incorrect (Figure 5). However, the magnitudes of the likelihood ratios (i.e., BF10s) look consistently smaller for previous error trials than previous correct trials. No direct comparison of the magnitude of these effects was run; thus, the actual null hypothesis as to whether they are different was never evaluated.

3) Alternative models. Both reviewers 2 and 3 had concerns about possible alternative mechanisms.

Reviewer 2 pointed out that the authors currently interpret choice history as being causal on changes in drift across trials, but they do not address potential third variables that could commonly influence both. For instance, could autocorrelated changes in drift rate across trials (due to some unrelated generative function, e.g., slowly drifting over the experiment) drive the observed choice history biases? There isn't a strong intuition that this should be the case but it is a plausible mechanism (i.e., as an autocorrelated alternative to typical formulations of drift variability) and it would be easy enough to simulate this in order to rule it out. Similarly, could an alternate form of autocorrelation that reflects regression-to-the-mean on drift rate (i.e., larger drift rates are more likely to be followed by smaller ones) produce these biases?

Reviewer 3 pointed out that several papers have looked at how reinforcement learning mechanisms target specific parameters of accumulation-to-bound processes (see Frank et al., 2015; Pedersen et al., 2017; Dunovan and Verstynen, 2019). The critical difference between these studies and the current project is that all three found evidence that selection accuracy targets the boundary height parameter itself. Not only is it relevant to link the current study to this previous work, but it also begs the question as to why the authors did not test the boundary height parameter as well. The motivation for comparing models featuring starting point and drift-rate bias terms, respectively, makes sense. While the current analyses point to an accuracy-independent effect of choice bias (see point #2 for comments on the accuracy analysis), given the previous findings showing the dependence between boundary height and selection accuracy, a model testing this effect should be considered. Alternatively, given that the decision boundary is a relevant parameter and accuracy has been shown to modulate the boundary, the authors should give their rationale for its exclusion, in combination with a convincing response to the critique of the accuracy analysis (see below).

Frank MJ, Gagne C, Nyhus E, Masters S, Wiecki TV, Cavanagh JF, et al. fMRI and EEG predictors of dynamic decision parameters during human reinforcement learning. J Neurosci. 2015;35(2):485-494

Pedersen ML, Frank MJ, Biele G. The drift diffusion model as the choice rule in reinforcement learning. Psychonomic bulletin and review. 2017;24(4):1234-1251.

Dunovan K, and Verstynen T. "Errors in action timing and inhibition facilitate learning by tuning distinct mechanisms in the underlying decision process." J Neurosci (2019): 1924-18.

4) Leaky competing accumulators. Reviewer 3 had concerns about the LCA simulations. The extension of the findings to models with collapsing bounds is interesting as a robustness analysis. However, the incorporation of the LCA model seems less straightforward. If input bias is the LCA parameter most similar to the drift rate in the DDM, then what are we to make of model 4 (LCA without collapsing bounds) that shows λ-bias being the best model (Figure 7B)? It is completely left out of Figure 7C. All in all, this analysis seemed to simply "muddy the water" on the main results.

5) Mechanism. Reviewer 3 also requested more details on the proposed mechanism. The idea that changes in a history-dependent drift-rate bias correspond to shifts in endogenous attention from one interpretation to another could be further developed. In particular it seems to imply a more explicit (or at least non-procedural) mechanism for trial-wise changes in drift rate bias. But one could imagine many other mechanisms driving this effect as well (see papers referenced above). As it stands, the connection between shifts in attention and drift-rate bias does seem to be an intuitively plausible explanation (one of many), but a further description of the authors' line of thought would help to convince the reader.

https://doi.org/10.7554/eLife.46331.032

Author response

Essential revisions:

1) Dual parameter models. All three reviewers thought that there needed to be a more in depth investigation of the dual parameter models.

Reviewer 1 pointed out that the authors repeatedly conclude that history shifts in drift rate, but not starting point, account for choice history effects, but they show that a model that lets both of these parameters vary with choice history fits better than either one alone in 5 out of 6 datasets. This is potentially interesting, and the paper would benefit from a more nuanced/accurate interpretation of this result. As it is currently written, the narrative is that either of these mechanisms could account for choice history effects, but only drift rate does. While drift rate seems to have a stronger effect, the authors need to present an intuition about why the model including both as free parameters fits the best.

Thank you for raising this important point. We reply to each of your specific questions pertaining to this issue in the following. Upfront, we did not intend to claim that only drift bias “accounts for choice history effects” in a general sense. What we do want to claim (and what we think is strongly supported by our data) is more specific: that history-dependent shifts in drift bias, not starting point, explain individual differences in overt choice repetition behavior. Even so, the previous choice consistently shifts the average starting point towards negative values. We now realize that the latter aspect was not sufficiently reflected in our previous presentation of the results. We have now elaborated on the starting point effects by means of (i) additional analyses and (ii) a new paragraph in the Discussion section.

Specifically, we added the following new analyses (described in more detail below):

- Statistical tests of the average history shift in starting point (Figure 4—figure supplement 5A);

- The relationship between RT and overall repetition bias (rather than individual choice history bias; Figure 4—figure supplement 5C);

- The average correlation between individual drift bias and starting point estimates for each dataset (subsection “History-dependent accumulation bias, not starting point bias, explains individual differences in choice repetition behaviour”, last paragraph);

- Simulations of various DDM and leaky accumulator models, and DDM fits of these synthetic data (Figure 7—figure supplement 2 and 3).

Furthermore, we added the following Discussion paragraph:

“While we found that choice history-dependent variations of accumulation bias were generally more predictive of individual choice repetition behavior, the DDM starting point was consistently shifted away from the previous response for a majority of participants (i.e., negative values along x-axis of Figure 4A). […] Future work is needed to illuminate this issue, for example through manipulating decision speed and/or the delays between subsequent motor responses, and modeling choice-related neural dynamics in the motor cortex.”

What does the starting point add, that the drift rate doesn't account for?

Please see the above Discussion paragraph for our general take on this issue. The starting point shift might either reflect a “real mechanism” (but with little impact on individual choice repetition behavior), or actual limitations of the standard DDM. To explore the second possibility, we have extended our simulations of bounded accumulation models. Figure 7—figure supplement 2 and 3 now show, for each model:

1) The conditional bias function when simulating data with varying bias strength;

2) The recovered parameter estimates for z and vbias, when fitting a hybrid model;

3) ΔBIC values, relative to a model without history, for the z-only, vbias-only, and hybrid models;

4) Correlations between synthetic individuals’ P(bias), and the parameter estimates for z and vbias, as estimated from the hybrid models.

This analysis reveals two specific mechanisms that can produce the lowest BIC for a hybrid model together with opposite effects on DDM drift bias and starting point:

1) A nonlinearly increasing dynamic bias (i.e. ramping drift bias) in an extended perfect accumulator model (Figure 7—figure supplement 2);

2) A leak bias in an accumulator model (Figure 7—figure supplement 3), to which we point in the Discussion paragraph (quoted above).

How do the starting point and drift rate interact?

We used two approaches to illuminate this question. First, we correlated the parameter estimates for the history shift in starting point with those for the history shift in drift bias across observers. Correlations were generally negative (mean Spearman’s rho: -0.2884, range -0.4130 to 0.0757); the correlation coefficient reached significance at p < 0.05 in only one dataset. The combined Bayes Factor BF10 was 0.0473, indicating strong evidence for H0 (Wetzels and Wagenmakers, 2012). This is now reported in the section "History-dependent accumulation bias, not starting point bias, explains individual differences in choice repetition behavior”. Second, we used simulations (Figure 7—figure supplement 2 and 3) to unravel which modes of decision dynamics may in principle give rise to such a negative correlation: this pattern can arise from a non-linearly increasing dynamic bias (i.e., ramping drift bias; Figure 7—figure supplement 2, third column), or from a leaky accumulation process with a bias in effective leak (Figure 7—figure supplement 3, third column).

Is it the case that one (starting point) accounts for choice history effects on fast RT trials, and the other (drift rate) accounts for effects on slower RT trials? Delving deeper here could make the paper richer.

This is a very interesting idea, which we now elaborate on in the Discussion paragraph. All features of the current results may be explained by the combined effects of two distinct biasing mechanisms, whose effects superimpose but contribute differently to fast and slow choices: (i) a motor response alternation bias, which will load on starting point and dominate fast choices; and (ii) an accumulation bias, which will load on drift bias and dominate slow choices.

Further, the starting point effect may be stereotypical (e.g. hardwired in the motor machinery), which is why it is generally negative for most subjects as we found in our DDM fits. Such a mechanism may be explained by the selective “rebound” of beta-band activity in the motor cortex (e.g., Pape and Siegel, 2016). By contrast, the drift bias may be more adaptive, varying with subjects’ belief about the structure of the environment, and thus giving rise to the individual differences in repetition behavior we observe in our participants overall.

In all our data sets, RTs are long so that the drift bias would be expected to dominate repetition behavior, just as we found. However, this scenario makes specific predictions when sorting trials based on RT: at very short RTs, we should predominantly find choice alternation, which is invariable across all participants, due to the stereotypical starting point shift. Furthermore, the starting point shift should only be caused by the immediately preceding response, whereas the drift bias may be affected by several choices back into the past.

In Figure 4—figure supplement 5C, we now test these predictions. This shows the overall probability of choice repetition as a function of RT quantile (not correcting for individual differences in choice history bias, as we do throughout the main figures). As expected, for the dataset that included very short RTs (< 600 ms), we do observe a strong choice alternation bias on average across the group. Since not all of our datasets have many trials with such short RTs, we do not want to make overly strong claims about this pattern, and have placed the RT split analysis in the supplement. Future work should systematically test this scenario, for example through manipulating decision speed and/or the delays between subsequent motor responses, and by modeling decision-related neural dynamics in the motor cortex.

To this point, in Figure 3B, what is the performance of a model including both z and vbias?

The model prediction for the hybrid model CBF is now added into the new Figure 3B. It is overall quite similar to the vbias-only model.

Reviewer 1 also points out that, in Figure 6, for the model that incorporates trial history, it looks like only 2 of 6 datasets are best fit by a model including both z and vbias, and most of the rest favor vbias. Does this reflect their parameterization of z?

This is a very perceptive point. Indeed, we also noticed and explored this discrepancy, and concluded that it is due to differences in the model fitting approaches for the two analyses – specifically, between the standard DDM fits used for all main analyses reported in this paper, and the regression approach used for Figure 6 (Wiecki et al., 2013). The differences persist even after fitting the regression model analogously to the fits in the other figures – i.e. only based on the choice (not correct and incorrect choice) for only lag 1 (data not shown). We believe this is a technical issue that deserves attention from the field of sequential sampling models, but is beyond the scope of the present paper. We have here used the standard fitting approach for all analyses in this paper except for the analysis in Figure 6, which can only be performed with the regression approach. In general, we feel more comfortable basing our selection between competing models on the standard approach.

The purpose of the comparisons of different regression models is more methodological in nature: to find the best-fitting lag across trials. So, we have now moved this panel from main Figure 6 to Figure 6—figure supplement 1, instead focusing main Figure 6 on a conceptually more informative issue: the differences in the timescales of the history effects on drift and starting point (see reply below).

Did the authors explore the possibility that z and vbias might integrate over different timescales?

Thank you for this very interesting suggestion. We now assess this quantitatively in the main Figure 6A. We fit exponentials to these “history kernels” in order to estimate the timescale of drift bias and starting point effects. Interestingly, the time constant for starting point is lower than the one for drift bias, in particular after error trials. This difference further speaks to the mechanistically distinct nature of these two history effects. We point out this difference in timescale in the Results section (subsection “Accumulation bias correlates with several past choices”) and again refer to it in Discussion.

The Materials and methods section reports that they fit the same regressors (X) for vbias and z in the HDDM regression models; if that's not the case, the authors should clarify.

We realized that our Materials and methods section was not sufficiently clear, thank you for pointing this out. We have now clarified our description of the approach (section ‘HDDM regression models’):

“We first created a matrix X, with dimensions trials × 2 · lags, which included pairs of regressors coding for previous stimuli and choices (coded as −1, 1), until (and including) each model’s lag. […] Starting point was defined as z ~ 1 + Xz, with a logistic link function 1/(1 + e^−X).”
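The construction of the design matrix described in this quoted passage can be sketched as follows (a hypothetical numpy illustration, not the fitting code; the function and variable names are ours):

```python
import numpy as np

def lagged_design_matrix(stimuli, choices, lags):
    """Build X of shape (trials, 2 * lags): for each lag, a pair of columns
    holding the stimulus and choice from `lag` trials back (coded -1/1);
    the first `lag` rows of each pair have no history and stay 0."""
    stimuli, choices = np.asarray(stimuli), np.asarray(choices)
    X = np.zeros((len(choices), 2 * lags))
    for lag in range(1, lags + 1):
        X[lag:, 2 * (lag - 1)] = stimuli[:-lag]      # stimulus `lag` back
        X[lag:, 2 * (lag - 1) + 1] = choices[:-lag]  # choice `lag` back
    return X

stim = [1, -1, 1, 1, -1]
resp = [1, 1, -1, 1, -1]
X = lagged_design_matrix(stim, resp, lags=2)  # shape (5, 4)
```

The same X then enters both the drift bias and starting point regressions, with the logistic link applied for the latter.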

z seems to help model performance when it reflects one trial back, why is that not the case for longer timescales?

We realize that our model comparison visualization did not reflect the results sufficiently clearly. We have now improved the visualization, which shows that there is a consistent relationship between the three models: in those datasets where the hybrid model has the lowest AIC, this is true across several lags (up to lag 2 or 4).

See also our reply above (sixth response to point #1): these regression model comparisons are not important conceptually (only used to select a particular lag), and distract from the more important message on the timescale of integration. Thus, we have moved the full set of model comparison indices to Figure 6—figure supplement 1.

Reviewer 2 pointed out that the authors appear to treat choice history effects as monolithic with respect to mechanism. That is, they test whether influences of previous choices are universally (across the tasks explored) accounted for by certain model parameters and not others, whether those relate to response repetition versus alternation. This assumption may ultimately be justified, but the paper does not offer support for it from the literature or from the data examined. It would be appropriate to include a set of analyses that split the drift and starting point biases in two, with one set of fit parameters increasing the bias towards a given response after it is chosen, and the other set decreasing that bias. (The authors should select whether to do this in the standard DDM framework or one of their better-performing models, but they certainly do not need to test this across their wide set of models.) How does such a model compare to their current version that does not distinguish between these response biases? Do the currently observed relative influences of drift and starting point hold across both directions of bias?

In its current form, our model can (and does) capture biases either towards a previously given response (repetition), or away from the previously given response (alternation). Indeed, across our groups of observers, we see that the choice repetition probabilities of different individuals lie along a continuum from repeaters to alternators.

To investigate whether the effects on DDM parameters are the same within each of these two sub-groups of participants, we split the participants by their overall history bias: repeaters (P(repeat) > 0.5) and alternators (P(repeat) < 0.5). We then correlated individual repetition probability with each of the history-shifts in each of the two DDM parameters, separately within each sub-group. This was only possible for five out of the six datasets, where there were sufficient numbers of participants in both sub-groups (Figure 2B). The main result holds also within each sub-group: history shift in drift bias, not starting point, explains individual differences in choice repetition probability. This further corroborates the notion that we can treat individual choice history bias as lying on a single continuum, rather than arising from two qualitatively distinct mechanisms between these two sub-groups of individuals, or behavioral patterns.

This result is now reported in the text as follows:

“The same effect was present when individual participants were first split into “Repeaters” and “Alternators” based on P(repeat) being larger or smaller than 0.5, respectively (Figure 4—figure supplement 3)”. We hope this addresses your point.

Finally, reviewer 3 pointed out that the original model comparisons (Figure 3) show that in 4 of the 6 experiments a dual parameter model best explains behavior. While the authors make a clear case that choice bias drives drift-rate bias changes, it could be that another factor (e.g., accuracy) is also modifying starting point bias. In fact, the previous papers cited above show that accuracy biases boundary height. Given this previous literature, the authors should consider what other factors may independently be manipulating starting point bias (and boundary height).

As you suggested, we observe that the previous trial’s correctness affects both boundary separation and overall drift rate. This phenomenon of post-error slowing, and its algorithmic basis in the DDM, is present in some of our datasets (Figure 4—figure supplement 4). However, the results are not highly consistent across datasets, most likely reflecting the fact that our tasks were not conducted under speed pressure, and featured considerable sensory uncertainty.

In general, the reviews have indicated to us that we have not sufficiently addressed the link between our current work on choice history biases and previous work on other forms of sequential effects in decision-making, in particular post-error slowing. Correspondingly, we have now further unpacked the conceptual differences between these two distinct forms of sequential effects in the Introduction.

Additionally, we have now added an analysis where we allow both post-error slowing (i.e. previous correctness affecting bound height and overall drift rate) and choice history bias (i.e. previous choice affecting starting point and drift bias). These two processes seem to be relatively independent in these data: the joint fit shows largely similar results of post-error slowing (compare Figure 4—figure supplement 4B, C with D, E) as well as choice history bias (Figure 4—figure supplement 4F).

2) Choice history effects. All three reviewers requested a more detailed elaboration of the nature of the choice history effects.

Thank you for this suggestion. We realize that it was not sufficiently clear how our current approach and results differ from the previous work on sequential effects in decision making – specifically from previous work on post-error slowing. We have now changed the Introduction as follows:

“Decisions are not isolated events, but are embedded in a sequence of choices. Choices, or their outcomes (e.g. rewards), exert a large influence on subsequent choices (Sutton and Barto, 1998; Sugrue et al., 2004). […] Choice history biases vary substantially across individuals (Abrahamyan et al., 2016; Urai et al., 2017).”

Results:

“Within the DDM, choice behavior can be selectively biased toward repetition or alternation by two mechanisms: shifting the starting point or biasing the drift towards (or away from) the bound for the previous choice (Figure 1). […] History dependent changes in bound separation or mean drift rate may also occur, but they can only change overall RT and accuracy: those changes are by themselves not sufficient to bias the accumulation process toward one or the other bound, and thus towards choice repetition or alternation (see Figure 4—figure supplement 4).”
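
The two mechanisms named in the quoted passage can be illustrated with a minimal Euler simulation of the DDM (a sketch with illustrative parameters, not the fitted ones): with ambiguous evidence, shifting the starting point toward the bound of the previous choice, or adding a drift bias toward it, both raise the probability of terminating at that bound again.

```python
import numpy as np

def ddm_choice(rng, start=0.0, drift=0.0, bound=1.0, dt=0.02, noise=1.0):
    """One Euler-simulated DDM trial; +1 denotes the bound of the previous choice."""
    y = start
    while abs(y) < bound:
        y += drift * dt + noise * np.sqrt(dt) * rng.normal()
    return 1 if y > 0 else -1

def p_repeat_bound(n=1500, seed=0, **kwargs):
    """Probability of hitting the previous choice's bound on ambiguous evidence."""
    rng = np.random.default_rng(seed)
    return np.mean([ddm_choice(rng, **kwargs) == 1 for _ in range(n)])

p_neutral = p_repeat_bound()             # no bias: near 0.5
p_start = p_repeat_bound(start=0.2)      # starting point shifted toward +1
p_drift = p_repeat_bound(drift=0.3)      # drift biased toward +1
```

Both biased variants push the choice probability above 0.5, whereas (as the quoted passage notes) changing bound separation or mean drift rate alone would not.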

Reviewer 1 pointed out that, in Figure 4, when the authors compare history shifts in z and vbias to each subjects' choice repetition, are they only looking at trials where the current evidence and the previous choice were congruent (meaning, if I chose right on trial t-1, and right evidence > left evidence on trial t)? If so, this should be clarified in the text. If not, is this why the z parameter does so poorly? How would these plots look for trials when previous choice and current evidence are congruent vs. incongruent?

Our analyses did not distinguish whether previous choice and current stimulus were congruent; all trials were included. This is now clarified in the text:

“For instance, in the left vs. right motion discrimination task, the history shift in starting point was computed as the difference between the starting point estimate for previous ‘left’ and previous ‘right’ choices, irrespective of the category of the current stimulus.”

The possibility of testing whether the previous choice induces a ‘confirmation bias’ on the next trial, increasing perceptual sensitivity to stimuli that are congruent with the previous choice, is intriguing. In fact, we have previously investigated this same question in the context of a more complex sequential decision task that focused on the interaction between two successive judgments made within the same protracted decision process (Talluri et al., 2018). In that manuscript, we describe a model-based analysis approach that is specifically designed to tackle your question. However, the 2AFC structure of the current datasets is not suited for the consistency-dependent analyses used in this previous work.

Specifically, splitting trials by whether previous choices are congruent or incongruent leads to a split that is dominated by repetition vs. alternation trials. For example, in trials where the previous choice was “up” in the up/down discrimination task, and the direction of the current stimulus is also up (‘congruent’), choice repetition is correct, and choice alternation is incorrect. Similarly, for trials where the previous choice was “up” and the current stimulus motion is down (‘incongruent’), choice alternation is correct, whereas choice repetition is incorrect. Because participants chose the correct stimulus category on the majority of trials, the ‘congruent’ condition will be primarily populated by choice repetitions, whereas the ‘incongruent’ condition will be primarily populated by alternations. In sum, this analysis is confounded by participants’ above-chance performance and does not reveal any mechanism of choice history bias.

Author response image 1
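
The arithmetic of this confound can be made concrete with a toy simulation (illustrative numbers: 80% accuracy and no history dependence whatsoever). The congruency split then simply recovers the accuracy, not any bias mechanism.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20000
stim = rng.integers(0, 2, n)               # up/down stimulus categories
correct = rng.random(n) < 0.8              # 80% accuracy, independent of history
choice = np.where(correct, stim, 1 - stim)

prev_choice, cur_choice, cur_stim = choice[:-1], choice[1:], stim[1:]
congruent = prev_choice == cur_stim        # previous choice matches current stimulus
rep = cur_choice == prev_choice

p_rep_congruent = rep[congruent].mean()    # tracks accuracy: mostly repetitions
p_rep_incongruent = rep[~congruent].mean() # tracks error rate: mostly alternations
```

Even with zero history bias in the generative process, ‘congruent’ trials are dominated by repetitions (P ≈ accuracy) and ‘incongruent’ trials by alternations (P ≈ 1 − accuracy).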

While we agree that this issue is an important direction for future research, it would require different experimental designs that are beyond the scope of our current paper. Again, we point the reviewers (and interested reader) to our recent work on confirmation bias (Talluri et al., 2018).

Reviewer 2 strongly recommended revising the Introduction and the start of Results to enumerate the choice history biases that have previously been observed, and the approach of the current analyses in clustering these together (this was not clear on first read).

This has now been addressed in the two new paragraphs (from Introduction and Results sections) quoted above.

Reviewer 2 also pointed out that it wasn't clear what distinction the authors were drawing between the prior reported effects and the current experiments – they say that the current ones "emerge spontaneously and in an idiosyncratic fashion" but estimates of prior probability could also change spontaneously and idiosyncratically.

There are two crucial ways in which the choice history biases studied here differ from the choice biases examined in previous studies that applied sequential sampling (“bounded accumulation”) models of decision-making (Hanks et al., 2011; Mulder et al., 2012). First, in the previous experiments, the biases were experimentally induced (through block structure in animals, or explicit task instructions or single-trial cues in humans); by contrast, no such experimental manipulation was performed in our study. (Observers were neither asked to pay attention to, nor to use, the past experimental sequence in any way, nor was there any sequential structure on average.) Second, when prior probability was manipulated in the previous experiments, it was the probability of the occurrence of a particular target, not the conditional probability of occurrence given a previous choice (or other experimental event). This is the difference between a frequency bias and a transition bias (see, e.g. Meyniel et al., PLoS Comput. Biol., 2016).

We have now edited the Discussion paragraph to better explain this difference:

“It is instructive to relate our results with previous studies manipulating the probability of the occurrence of a particular category (i.e., independently of the sequence of categories) or the asymmetry between rewards for both choices. Most of these studies explained the resulting behavioral biases in terms of starting point shifts (Leite and Ratcliff, 2011; Mulder et al., 2012; White and Poldrack, 2014; Rorie et al., 2010; Gao et al., 2011; but only for decisions without time pressure, see Afacan-Seref et al., 2018). […] By contrast, the choice history biases we studied here emerge spontaneously and in an idiosyncratic fashion (Figure 2E), necessitating our focus on individual differences.”

Perhaps the distinction they are drawing (alluded to in the following paragraph) is between environments that naturally engender a rational basis for choice history biases (e.g., autocorrelation between responses) versus those that do not. This could still predict that the same rational biases that produce starting-point biases in other experiments could result in shorter term history effects in experiments like the current one, if participants treat short runs of the same response as weak evidence for autocorrelation. The current findings suggest that this is not the case for these data.

This distinction and the one we referred to in the quote above are different. Here, you refer to the distinction between environments with and without serial correlations in the stimuli (more precisely: stimulus categories). Let us elaborate on our assumptions about this here.

Based on insights from recent normative models (e.g. Yu and Cohen, 2009; Glaze et al., 2015) and our own empirical work (Braun et al., 2018), we suspect there is no fundamental difference between task environments that naturally engender choice history bias and those that do not. While idiosyncratic choice history biases appear across almost all task environments (even random ones, in which the experimenter intended to eliminate them), they are generally stronger and more consistent across people when the stimulus sequence exhibits correlation structure. In the framework of the above normative models, the existence of history biases in the face of random sequences can be readily explained by assuming that participants bring an internal representation of environmental stability to the lab that is biased toward repetition or alternation. If decision-makers assume that evidence is stable (i.e., repeating across trials), they should accumulate their past decision variables into a prior for the current trial; if they assume the evidence is systematically alternating, they should accumulate past decision variables with sign flips, yielding alternating priors (Glaze et al., 2015). Note that a stability assumption would make sense, because natural environments typically exhibit strong autocorrelations (Yu and Cohen, 2009), just as one finds experimentally.
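
A minimal sketch of this normative prior propagation (our rendering of the Glaze et al., 2015 update, written in probability rather than log-odds form; H is the observer's assumed hazard rate, i.e. the probability that the category switches between trials):

```python
def propagate_prior(p_post, H):
    """Prior P(category = +1) for the next trial, given the posterior p_post
    from the current trial and an assumed hazard rate H."""
    return (1 - H) * p_post + H * (1 - p_post)
```

An observer assuming stability (H < 0.5) carries a repetition prior into the next trial; an observer assuming systematic alternation (H > 0.5) carries an alternation prior; H = 0.5 discards choice history entirely.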

This is what we mean when we speculate: “… these considerations suggest that participants may have applied a rational strategy, but based on erroneous assumptions about the structure of the environment.”

Also, it is unclear whether the authors find any cumulative effects of choice history if they look at how many times a given response was given prior to the current response (this is similar to the current lagged analyses but assumes a specific cumulative effect).

We investigate these cumulative effects by plotting the probability that a trial is a repetition of the previous choice, conditioned on the sequence of choices that preceded it. So, for lag 0 this equals the average P(repeat) across all trials, for lag 1 we take only those trials following a repetition, for lag 2 we take trials following two consecutive repeats, etc. for longer sequences of repetitions. The same analyses can be performed conditioning on increasingly long sequences of alternations.

We observe that, across all observers and datasets, repetition indeed increases following longer sequences of repeats. We refer to this cumulative effect as ‘cumulative repetition bias’. This cumulative bias saturates around 4 consecutive repeats, similar to effects previously reported in humans (e.g. Yu and Cohen, 2009; Cho et al., 2002; Kirby, 1976; Soetens et al., 1985; Sommer et al., 1999) and in rats (Hermoso-Mendizabal et al., 2018, their Figure 2E).

Author response image 2
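
The conditioning analysis described above can be sketched as follows (a toy choice sequence with a fixed repetition bias of 0.6, purely to illustrate the procedure; the real data additionally show the bias growing with streak length before saturating):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
choices = np.empty(n, dtype=int)
choices[0] = rng.integers(0, 2)
for t in range(1, n):
    # toy generative rule: repeat the previous choice with probability 0.6
    choices[t] = choices[t - 1] if rng.random() < 0.6 else 1 - choices[t - 1]

rep = (choices[1:] == choices[:-1]).astype(int)  # 1 = repetition of previous choice

def p_repeat_after(rep, k):
    """P(repeat) on trials preceded by at least k consecutive repetitions."""
    ok = [t for t in range(k, len(rep)) if rep[t - k:t].all()]
    return float(np.mean(rep[ok])) if ok else float("nan")

curve = [p_repeat_after(rep, k) for k in range(5)]  # lag 0 = overall P(repeat)
```

Applied to the empirical data, an upward-bending `curve` is what we refer to as the cumulative repetition bias.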

Reviewer 3 pointed out that the authors lack evidence to justify their claim that the accuracy of the preceding trials does not influence choice history effects on the drift rate. This inference is largely based on the fact that the drift rate correlations with repetition probability pass a significance threshold (Bayes factor and p-value) both when the previous trial was correct and when it was incorrect (Figure 5). However, the magnitudes of the likelihood ratios (i.e., BF10s) look consistently smaller for previous error trials than previous correct trials. No direct comparison of the magnitude of these effects is run, thus the actual null hypothesis as to whether they are different was never evaluated.

Thank you for this perceptive point. We have now added a direct comparison between the correlation coefficients after correct and error trials (Figure 5D). In fact, we had discussed this issue among ourselves prior to submission, but decided against addressing it further due to the additional complexity of the analysis.

For the data presented in our previous Figure 4, this comparison would be confounded by the asymmetries in trial counts for correct and incorrect choices. As shown in Author response image 3, when drawing (for each synthetic subject) two random sequences of binomial coin flips with the same underlying probability, and then correlating these between synthetic subjects, the between-subject correlation decreases with fewer trials in each sequence. As an example (with n=32, as in our Visual motion 2AFC (FD) dataset), the median trial counts for error and correct trials are indicated on the left. This pattern does not disappear with a larger number of subjects, and reflects the added noise in the correlation coefficient when individual data points are less precisely estimated.

Author response image 3

To compare the correlation coefficients in an unbiased manner, we have now subsampled the post-correct trials to equate them in number to the post-error trials for each participant and dataset (throughout Figure 5). Based on Bayes factors on the difference in the resulting correlation coefficients, we can neither strongly refute nor confirm the null hypothesis of no difference (Figure 5D). In sum, we observe a significant positive correlation between P(repeat) and drift bias after both error and correct trials, showing that qualitatively similar mechanisms are at play after both outcomes. That said, we do not claim that these effects are identical.
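
The subsampling step can be sketched as follows (hypothetical variable names; one entry per trial):

```python
import numpy as np

def equate_trial_counts(post_correct, post_error, rng):
    """Randomly subsample the post-correct trials down to the post-error count,
    so both conditions are estimated from equally many trials."""
    n = min(len(post_correct), len(post_error))
    keep = rng.choice(len(post_correct), size=n, replace=False)
    return post_correct[keep], post_error

rng = np.random.default_rng(2)
post_correct = rng.normal(size=800)   # e.g. per-trial quantities after correct choices
post_error = rng.normal(size=150)     # typically far fewer post-error trials
pc_sub, pe = equate_trial_counts(post_correct, post_error, rng)
```

Any per-condition estimate (and the between-subject correlations built on it) is then based on equally precise quantities in both conditions.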

3) Alternative models. Both reviewers 2 and 3 had concerns about possible alternative mechanisms.

Reviewer 2 pointed out that the authors currently interpret choice history as being causal on changes in drift across trials, but they do not address potential third variables that could commonly influence both. For instance, could autocorrelated changes in drift rate across trials (due to some unrelated generative function, e.g., slowly drifting over the experiment) drive the observed choice history biases? There isn't a strong intuition that this should be the case but it is a plausible mechanism (i.e., as an autocorrelated alternative to typical formulations of drift variability) and it would be easy enough to simulate this in order to rule it out. Similarly, could an alternate form of autocorrelation that reflects regression-to-the-mean on drift rate (i.e., larger drift rates are more likely to be followed by smaller ones) produce these biases?

To address this question, we simulated an ARIMA process defined by a parameter c, the autocorrelation at lag 1, and with standard deviation 0.1. We then added this fluctuating per-trial value to the drift rate, rather than using a single fixed drift rate across trials. In Author response image 4, we allow c to vary positively from 0.1 to 1 (in steps of 0.1; blue, reflecting autocorrelated sequences) and negatively from -0.1 to -1 (in steps of 0.1; orange, reflecting ‘regression to the mean’ sequences). In neither case do we observe repetition or alternation biases, as assessed with P(repeat), nor a bias in starting point or drift. This rules out the notion that autocorrelation, or regression to the mean, in drift rates can produce any of the history-dependent bias effects we have quantified here.

Author response image 4
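
A condensed re-implementation sketch of the control simulation described above (illustrative parameters; v, bound and dt are not the exact values used): per-trial drift fluctuations, however strongly autocorrelated or anti-correlated, leave P(repeat) at chance.

```python
import numpy as np

def ddm_choice(drift, rng, bound=1.0, dt=0.02, noise=1.0):
    """One Euler-simulated, unbiased DDM trial; returns +1 or -1."""
    y = 0.0
    while abs(y) < bound:
        y += drift * dt + noise * np.sqrt(dt) * rng.normal()
    return 1 if y > 0 else -1

def p_repeat(c, n_trials=2000, v=1.0, seed=0):
    """P(repeat) when the drift rate carries AR(1) fluctuations (lag-1
    autocorrelation c, stationary sd 0.1) on top of the fixed mean drift v."""
    rng = np.random.default_rng(seed)
    fluct, choices = 0.0, []
    for _ in range(n_trials):
        fluct = c * fluct + 0.1 * np.sqrt(1 - c**2) * rng.normal()
        s = rng.choice([-1, 1])                 # random stimulus category
        choices.append(ddm_choice(s * (v + fluct), rng))
    choices = np.array(choices)
    return float(np.mean(choices[1:] == choices[:-1]))
```

Because the fluctuation only scales the unsigned drift, while the stimulus category s is random from trial to trial, P(repeat) stays near 0.5 for both positive and negative c.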

This result is a specific demonstration of the general point (now elaborated in the paper) that changes in mean drift rate (or boundary separation) alone cannot produce any selective bias towards one or the other choice. The mean drift rate v is an unsigned quantity that affects the decision through its multiplication with the stimulus category s:

dy = s ∙ v ∙ dt + c ∙ dW (1)

where s is coded as -1 or 1 to represent ‘up’ or ‘down’ stimuli. If v is reduced over time (e.g. due to fatigue), this slows down decisions and leads to more errors (and can cause biased estimates of post-error slowing; Dutilh et al., 2012), but it won’t by itself produce more choice repetitions or alternations. A change in either starting point or drift bias is necessary for the latter.
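
A quick numerical check of this point (Euler simulation of Equation 1 with illustrative parameters): halving v lengthens reaction times and lowers accuracy, without favoring either bound.

```python
import numpy as np

def ddm_trial(v, s, rng, bound=1.0, dt=0.02, noise=1.0):
    """One Euler-simulated trial of dy = s*v*dt + c*dW; returns (choice, RT)."""
    y, t = 0.0, 0.0
    while abs(y) < bound:
        y += s * v * dt + noise * np.sqrt(dt) * rng.normal()
        t += dt
    return (1 if y > 0 else -1), t

def accuracy_and_rt(v, n=1500, seed=0):
    rng = np.random.default_rng(seed)
    acc, rt = [], []
    for _ in range(n):
        s = rng.choice([-1, 1])               # random stimulus category
        ch, t = ddm_trial(v, s, rng)
        acc.append(ch == s)
        rt.append(t)
    return float(np.mean(acc)), float(np.mean(rt))

acc_hi, rt_hi = accuracy_and_rt(v=1.0)
acc_lo, rt_lo = accuracy_and_rt(v=0.5)    # lower drift: slower and less accurate
```

This is exactly the signature of reduced v noted above: slower, more error-prone decisions, but no repetition or alternation bias.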

Now, it is possible that a third variable causes drift bias to vary slowly over time, which would then cause choice repetition. This is a viable alternative to the scenario in which the choices themselves bias the subsequent drift, giving rise to choice repetition (or alternation). Indeed, both scenarios would give rise to autocorrelations in the (signed) drift across trials. One plausible third variable that might cause such slow changes in drift bias towards one or the other bound is selective attention (see also our reply to your point on attention below). We now elaborate on this issue in the Discussion paragraph on attention and possible underlying mechanisms.

“It is tempting to speculate that choice history signals in these regions cause the same top-down modulation of sensory cortex as during explicit manipulations of attention. […] These ideas are not mutually exclusive and can be tested by means of multiarea neurophysiological recordings combined with local perturbations.”

We think that both scenarios are interesting, and that conclusively distinguishing between them will require interventional approaches in future work. We are aware of the limitations of our current correlational approach, and we realize that some statements in the previous version of our manuscript may have suggested a strong causal interpretation. We have now toned down all statements of this kind throughout the paper. Thank you for pointing us to this important issue.

Reviewer 3 pointed out that several papers have looked at how reinforcement learning mechanisms target specific parameters of accumulation-to-bound processes (see Frank et al. 2015; Pedersen et al. 2017; Dunovan and Verstynen, 2019). The critical difference between these studies and the current project is that all three found evidence that selection accuracy targets the boundary height parameter itself. Not only is it relevant to link the current study to this previous work, but it also begs the question as to why the authors did not test the boundary height parameter as well. The motivation for comparing models featuring starting point and drift-rate bias terms, respectively, makes sense. While the current analyses point to an accuracy-independent effect of choice bias (see point #2 for comments on accuracy analysis), given the previous findings showing the dependence between boundary height and selection accuracy, a model testing this effect should be considered. Alternatively, given that the decision boundary is a relevant parameter and accuracy has been shown to modulate the boundary, the authors should give their rationale for its exclusion, in combination with a convincing response to the critique of the accuracy analysis (see below).

Frank MJ, Gagne C, Nyhus E, Masters S, Wiecki TV, Cavanagh JF, et al. fMRI and EEG predictors of dynamic decision parameters during human reinforcement learning. J Neurosci. 2015;35(2):485-494.

Pedersen ML, Frank MJ, Biele G. The drift diffusion model as the choice rule in reinforcement learning. Psychonomic Bulletin and Review. 2017;24(4):1234-1251.

Dunovan K, Verstynen T. Errors in action timing and inhibition facilitate learning by tuning distinct mechanisms in the underlying decision process. J Neurosci. 2019:1924-18.

Thank you for this important point. Figure 4—figure supplement 4 shows two analyses: first (A-C), we test for the pure effects of previous-trial accuracy: increases in boundary separation (A) and reductions in mean drift rate (v) – two established sources of post-error slowing (Purcell and Kiani, 2016). While we observe post-error slowing in some of our datasets, this mainly loads onto the drift rate parameter.

We replicate these results when adding choice history bias into the same model (D-F), indicating that choice history biases (specifically via drift bias) and post-error slowing independently shape overall decision dynamics in our data.

Post-error slowing has previously been studied within the DDM framework, whereas choice history bias has not – so our focus is on the latter. Importantly, our control analyses show that all our conclusions pertaining to the mechanisms of choice history bias do not depend on whether or not post-error slowing effects are taken into account (Figure 4—figure supplement 4F).

4) Leaky competing accumulators. Reviewer 3 had concerns about the LCA simulations. The extension of the findings to models with collapsing bounds is interesting as a robustness analysis. However, the incorporation of the LCA model seems less straightforward. If input bias is the LCA parameter most similar to the drift rate in the DDM, then what are we to make of model 4 (LCA without collapsing bounds) that shows λ-bias being the best model (Figure 7B)? It is completely left out of Figure 7C. All in all, this analysis seemed to simply "muddy the water" on the main results.

Thank you for pointing this out – we agree that the two types of LCA models were confusing. We have removed model 4 from Figure 7, focusing now on the leaky accumulator models including collapsing bounds.

That said, we view the leaky accumulator model as more than just a “robustness check“. A substantial body of theoretical work (Usher and McClelland, 2001; Wong and Wang, 2006; Roxin, 2008; Brunton et al., 2013; Ossmy et al., 2013; Glaze et al., 2015) indicates (i) that models with a non-zero leak term provide a more accurate description of decision dynamics, and (ii) that leaky evidence accumulation is, in fact, normative under general conditions in which the sensory evidence is not stationary (Ossmy et al., 2013; Glaze et al., 2015). Moreover, the leaky accumulator model is attractive in enabling the distinction between two forms of accumulation bias: a bias in the input feeding into the accumulators, or a bias in the accumulation itself. This helps the behavioral modelling generate more specific mechanistic hypotheses for future neurophysiological tests. Similar points can be made for the collapsing bounds.
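
The distinction between the two forms of accumulation bias can be sketched with a toy two-accumulator leaky race (dynamics and parameters are illustrative, not the fitted model): a bias of the input feeding one accumulator, versus a bias of that accumulator's leak (λ).

```python
import numpy as np

def leaky_race(inputs, leak=(0.2, 0.2), input_gain=(1.0, 1.0), dt=0.01):
    """Two racing leaky accumulators (no mutual inhibition, for simplicity);
    returns their final activations."""
    y = np.zeros(2)
    for step in inputs:                        # per-timestep evidence for each option
        drive = step * np.array(input_gain)
        y += dt * (drive - np.array(leak) * y)
        y = np.maximum(y, 0.0)                 # activations cannot go negative
    return y

rng = np.random.default_rng(0)
ev = rng.normal(1.0, 0.5, size=(200, 2))       # equal mean evidence for both options

balanced = leaky_race(ev)
input_biased = leaky_race(ev, input_gain=(1.2, 1.0))  # input bias toward option 0
leak_biased = leaky_race(ev, leak=(0.1, 0.2))         # λ-bias: less leak for option 0
```

Both manipulations push the favored accumulator's activation above the balanced case, but via different routes: one amplifies the momentary input, the other lets accumulated evidence persist longer.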

So, while we think the use of the DDM is well motivated by its simplicity and wide use in the field, generalizing our conclusions to models with collapsing bounds and leaky accumulation is important at a conceptual level.

5) Mechanism. Reviewer 3 also requested more details on the proposed mechanism. The idea that changes in a history-dependent drift-rate bias correspond to shifts in endogenous attention from one interpretation to another could be further developed. In particular it seems to imply a more explicit (or at least non-procedural) mechanism for trial-wise changes in drift rate bias. But one could imagine many other mechanisms driving this effect as well (see papers referenced above). As it stands, the connection between shifts in attention and drift-rate bias does seem to be an intuitively plausible explanation (one of many), but a further description of the authors' line of thought would help to convince the reader.

Thank you for this suggestion. We have now elaborated on this idea in the corresponding Discussion paragraph (seventh paragraph), part of which we have quoted above. Our findings indicate that choice history signals bias the accumulation of subsequent evidence. At a neural level, an accumulation bias may be implemented in at least three ways: through (i) a bias in the neural input (from sensory cortex) to one of the two neural accumulator populations (in association cortex) encoding either choice; (ii) a bias in the way those populations accumulate their inputs (i.e. stronger weights in the connection from input to the corresponding accumulator, or reduced leak in that accumulator); or (iii) additional, evidence-independent input to one of the accumulators. Scenarios (i) and (ii) precisely match current accounts of the neural implementation of selective attention: a selective boosting of sensory responses to certain features, at the expense of others (Desimone and Duncan, 1995); or a stronger impact of certain sensory responses on their downstream target neurons (e.g. Salinas and Sejnowski, NRN, 2001). And indeed, our best-fitting accumulator model is in line with scenario (i). Now, the selective amplification of certain sensory responses by top-down attention is commonly explained by selective feedback from prefrontal and parietal association cortex to sensory cortex (Desimone and Duncan, 1995) – the same regions that also carry choice history information (Akrami et al., 2018). So, it is tempting to speculate that choice history biases the state of these association circuits, which then feed these biases back to sensory cortex in much the same way as happens during explicit manipulations of attention.

Indeed, while attention is commonly studied in the lab by providing explicit cues, we speculate that, in natural settings where explicit cues are often not available, an agent’s own choices may be one important factor (among others) controlling the allocation of attention. Top-down (goal-directed) attention is commonly thought to be accompanied by an awareness of control. We remain agnostic as to whether or not such a sense of control accompanies the spontaneous choice history biases we have studied here. But we feel the functional analogy to attention identified here is a potentially important avenue for future research.

https://doi.org/10.7554/eLife.46331.033

Article and author information

Author details

  1. Anne E Urai

    1. Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
    2. Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
    Present address
    Cold Spring Harbor Laboratory, Cold Spring Harbor, United States
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing
    For correspondence
    anne.urai@gmail.com
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0001-5270-6513
  2. Jan Willem de Gee

    1. Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
    2. Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
    Present address
    Department of Neuroscience, Baylor College of Medicine, Houston, United States
    Contribution
    Formal analysis, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-5875-8282
  3. Konstantinos Tsetsos

    Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
    Contribution
    Formal analysis, Writing—review and editing
    Contributed equally with
    Tobias H Donner
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0003-2709-7634
  4. Tobias H Donner

    1. Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
    2. Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
    3. Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, Netherlands
    Contribution
    Conceptualization, Resources, Supervision, Writing—original draft, Writing—review and editing
    Contributed equally with
    Konstantinos Tsetsos
    For correspondence
    t.donner@uke.de
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-7559-6019

Funding

German Academic Exchange Service London (A/13/70362)

  • Anne E Urai

Deutsche Forschungsgemeinschaft (DO 1240/2-1)

  • Tobias H Donner

Deutsche Forschungsgemeinschaft (DO 1240/3-1)

  • Tobias H Donner

Deutsche Forschungsgemeinschaft (SFB 936/A7)

  • Tobias H Donner

Deutsche Forschungsgemeinschaft (SFB 936/Z1)

  • Tobias H Donner

H2020 Marie Skłodowska-Curie Actions (658581)

  • Konstantinos Tsetsos

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Gilles de Hollander and Peter Murphy for discussion. Anke Braun kindly shared behavioral data of the Visual motion 2AFC (FD) study. Christiane Reißmann, Karin Deazle, Samara Green and Lina Zakarauskaite helped with participant recruitment and data acquisition for the Visual motion 2IFC (FD) #2 study.

This research was supported by the German Academic Exchange Service (DAAD, to AEU), the EU’s Horizon 2020 research and innovation program (under the Marie Skłodowska-Curie grant agreement No 658581 to KT) and the German Research Foundation (DFG) grants DO 1240/2–1, DO 1240/3–1, SFB 936/A7, and SFB 936/Z1 (to THD). We acknowledge computing resources provided by NWO Physical Sciences.

Ethics

Human subjects: All participants gave written informed consent, and consent to publish. The ethics committees of the University of Amsterdam (Psychology Department), University Medical Center Hamburg-Eppendorf (PV4714), and Leiden University (Cognitive Psychology department) approved the study procedures.

Senior Editor

  1. Barbara G Shinn-Cunningham, Carnegie Mellon University, United States

Reviewing Editor

  1. Timothy Verstynen, Carnegie Mellon University, United States

Reviewer

  1. Timothy Verstynen, Carnegie Mellon University, United States

Publication history

  1. Received: February 28, 2019
  2. Accepted: June 11, 2019
  3. Version of Record published: July 2, 2019 (version 1)
  4. Version of Record updated: July 9, 2019 (version 2)

Copyright

© 2019, Urai et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

