Neuronal origins of reduced accuracy and biases in economic choices under sequential offers

  1. Weikang Shi
  2. Sebastien Ballesta
  3. Camillo Padoa-Schioppa (corresponding author)
  1. Department of Neuroscience, Washington University in St. Louis, United States
  2. Department of Economics, Washington University in St. Louis, United States
  3. Department of Biomedical Engineering, Washington University in St. Louis, United States

Abstract

Economic choices are characterized by a variety of biases. Understanding their origins is a long-term goal for neuroeconomics, but progress on this front has been limited. Here, we examined choice biases observed when two goods are offered sequentially. In the experiments, rhesus monkeys chose between different juices offered simultaneously or in sequence. Choices under sequential offers were less accurate (higher variability). They were also biased in favor of the second offer (order bias) and in favor of the preferred juice (preference bias). Analysis of neuronal activity recorded in the orbitofrontal cortex revealed that these phenomena emerged at different computational stages. Lower choice accuracy reflected weaker offer value signals (valuation stage), the order bias emerged during value comparison (decision stage), and the preference bias emerged late in the trial (post-comparison). By neuronal measures, each phenomenon reduced the value obtained on average in each trial and was thus costly to the monkey.

Editor's evaluation

This manuscript describes three decision biases in value-based choice paradigms. Building on previous work from the lab, the authors focus on neural coding of decision variables in the orbitofrontal cortex of rhesus monkeys, and convincingly argue that different biases arise at different stages of the decision-making process. The reviewers found the study rigorous and believe that the results will be of broad interest. Understanding the neural mechanisms that produce biases in decision-making is an important goal for the field of decision-making and neuroeconomics, and also has relevance to conditions that involve disordered decision-making.

https://doi.org/10.7554/eLife.75910.sa0

Introduction

Some of the most mysterious aspects of economic behavior are choice biases documented in behavioral economics (Camerer et al., 2003; Kahneman and Tversky, 2000; Lichtenstein and Slovic, 2006). Standard economic theory fails to account for these effects, and shedding light on their origins is a long-term goal for neuroeconomics (Camerer et al., 2005; Glimcher and Rustichini, 2004). Progress on this front has been relatively modest, largely because the neural underpinnings of (even simple) choices were poorly understood until recently. However, the last 15 years have witnessed substantial advances. An important turning point was the development of experimental protocols in which subjects choose between different goods and relative subjective values are inferred from choices. Decision variables defined from these values are used to interpret neural activity (Kable and Glimcher, 2007; Padoa-Schioppa and Assad, 2006; Plassmann et al., 2007). Studies that adopted this paradigm showed that neurons in numerous brain regions represent the values of offered and chosen goods (Amemori and Graybiel, 2012; Cai et al., 2011; Cai and Padoa-Schioppa, 2012; Hosokawa et al., 2013; Jezzini and Padoa-Schioppa, 2020; Kim et al., 2008; Lak et al., 2014; Levy et al., 2010; Louie and Glimcher, 2010; Padoa-Schioppa and Assad, 2006; Pastor-Bernier et al., 2019; Shenhav and Greene, 2010). Furthermore, recent experiments using electrical stimulation showed that offer values encoded in the orbitofrontal cortex (OFC) are causally linked to choices (Ballesta et al., 2020). These results are of high significance for three reasons.

First, the identification in OFC and other brain regions of distinct groups of neurons encoding different decision variables is essential to ultimately understand the neural circuit and the mechanisms through which economic decisions are formed.

Second, in a more conceptual sense, the results summarized above provide a long-sought validation for the construct of value. The proposal that choices entail computing and comparing subjective values was put forth by early economists such as Bernoulli and Bentham (Niehans, 1990). Although this idea has remained influential, values defined at the behavioral level suffer from a fundamental problem of circularity. On the one hand, choices supposedly maximize values; on the other hand, values cannot be measured behaviorally independent of choices (Samuelson, 1938). Because of this problem, the construct of value gradually lost centrality in economic theory. Thus, in the standard neoclassic formulation choices are ‘as if’ driven by values, but there is no commitment to the idea that agents actually compute values (Samuelson, 1947). In this perspective, the fact that neuronal firing rates in any brain region are linearly related to values defined at the behavioral level constitutes powerful evidence that choices indeed entail the computation of values (Camerer, 2008).

Third and less frequently discussed, the identification of neurons encoding offer values and other decision variables, together with some rudimentary understanding of the decision circuit, provides the opportunity to break the circularity problem described above. To appreciate this point, consider the fact that economic choices are often affected by seemingly idiosyncratic biases. For example, while choosing between two options offered sequentially, people and monkeys typically show a bias favoring the second option (Ballesta and Padoa-Schioppa, 2019; Krajbich et al., 2010; Rustichini et al., 2021). This order bias might occur for at least two reasons. (1) Subjects might assign a higher value to any given good if that good is offered second. (2) Alternatively, subjects might assign identical values independent of the presentation order, and the bias might emerge downstream of valuation, for example during value comparison. In the latter scenario, by introducing the order bias, the decision process would actually fail to maximize the value obtained by the agent. Due to the circularity problem described above, these two hypotheses are ultimately not distinguishable based on behavior alone. However, access to a credible neural measure for the offer values makes it possible, at least in principle, to disambiguate between them. The results presented in this study build on this fundamental idea.

We focused on choice biases measured when two goods are offered sequentially. In the experiments, monkeys chose between two juices offered in variable amounts. In each session, we randomly interleaved two types of trials referred to as two tasks. In Task 1, offers were presented simultaneously; in Task 2, offers were presented in sequence. Comparing choices across tasks revealed three phenomena. (1) Monkeys were substantially less accurate (higher choice variability) in Task 2 (sequential offers) compared to Task 1 (simultaneous offers). (2) Choices in Task 2 were biased in favor of the second offer (order bias). (3) Choices in Task 2 were biased in favor of the preferred juice (preference bias). These effects are especially interesting because in most daily situations offers available for choice appear or are examined sequentially. Thus, we investigated the neuronal origins of these phenomena.

Neuronal recordings focused on the OFC. Earlier work on choices under simultaneous offers identified in this area different groups of cells encoding individual offer values, the binary choice outcome (chosen juice), and the chosen value (Padoa-Schioppa, 2013; Padoa-Schioppa and Assad, 2006). Furthermore, previous analyses indicated that choices under sequential offers engage the same neuronal populations (Ballesta and Padoa-Schioppa, 2019; Shi et al., 2022a). In other words, the cell groups labeled offer value, chosen juice and chosen value can be identified in either choice task and appear to preserve their functional role. In first approximation, the variables encoded in OFC capture both the input (offer values) and the output (chosen juice, chosen value) of the choice process, suggesting that the cell groups identified in this area constitute the building blocks of a decision circuit (Padoa-Schioppa and Conen, 2017). A series of experimental (Ballesta et al., 2021; Camille et al., 2011; Rich and Wallis, 2016) and theoretical (Friedrich and Lengyel, 2016; Rustichini and Padoa-Schioppa, 2015; Solway and Botvinick, 2012; Song et al., 2017; Zhang et al., 2018) results support this view. Here, we put forth a more articulated computational framework. In our account, different groups of OFC neurons participate in value computation and value comparison, and these processes are embedded in an ensemble of mental operations taking place before, during, and after the decision itself. In this view, sensory information, memory traces, and internal states are processed upstream of OFC and integrated in the activity of offer value cells. These neurons provide the primary input to a circuit formed by chosen juice cells and chosen value cells, where values are compared. The output of this circuit feeds brain regions involved in working memory and the construction of action plans (Figure 1).

Computational framework.

Information about sensory input, stored memory, and the motivational state is integrated during the computation of offer values. In orbitofrontal cortex (OFC), offer value cells provide the primary input to a decision circuit composed of chosen juice cells and chosen value cells. The detailed structure of the decision circuit is not well understood, but previous work indicates that decisions under sequential offers rely on circuit inhibition. In essence, neurons encoding the value of the first offer (offer1) indirectly impose a negative offset on the activity of chosen juice cells associated with the second offer (offer2). Notably, this circuit might also subserve working memory. The decision output, captured by the activity of chosen juice cells, informs other brain regions that transform it into a suitable action plan. Choice measured behaviorally is ultimately defined by the motor response. This framework highlights the fact that choice biases and/or noise might emerge at multiple computational stages. The arrows indicated here capture only the primary connections.

This framework guided a series of analyses relating the activity of each cell group to the choice biases described above. Our results revealed that different phenomena emerged at different computational stages. The lower choice accuracy observed under sequential offers reflected weaker offer value signals (valuation stage). Conversely, the order bias did not have neural correlates at the valuation stage, but rather emerged during value comparison (decision stage). Finally, the preference bias did not have neural correlates at the valuation stage or during value comparison; it emerged late in the trial, shortly before the motor response.

Results

Reduced accuracy and biases in choices under sequential offers

Two monkeys participated in the experiments. In each session, they chose between two juices labeled A and B, with A preferred. Offers were represented by sets of colored squares on a monitor, and animals indicated their choice with a saccade. In each session, two choice tasks were randomly interleaved. In Task 1, offers were presented simultaneously (Figure 2A); in Task 2, offers were presented in sequence (Figure 2B). A cue displayed at the beginning of the trial revealed to the animal the task for that trial. Offers varied from trial to trial, and we indicate the quantities offered in any given trial with qA and qB. An ‘offer type’ was defined by two quantities [qA, qB], and the same offer types were used for the two tasks in each session. For Task 2, trials in which juice A was offered first and trials in which juice B was offered first are referred to as ‘AB trials’ and ‘BA trials’, respectively. The first and second offers are referred to as ‘offer1’ and ‘offer2’, respectively.

Experimental design and choice biases.

(A, B) Experimental design. Animals chose between two juices offered in variable amounts. Offers were represented by sets of color squares. For each offer, the color indicated the juice type and the number of squares indicated the juice amount. In each session, trials with Tasks 1 and 2 were randomly interleaved. In Task 1, two offers appeared simultaneously on the left and right sides of the fixation point. In Task 2, offers were presented sequentially, spaced by an interoffer delay. After a wait period, two saccade targets matching the colors of the offers appeared on the two sides of the fixation point. The left/right configuration in Task 1, the presentation order in Task 2, and the left/right position of the saccade targets in Task 2 varied randomly from trial to trial. In any session, the same set of offer types was used for both tasks. (C) Example session 1. The percent of B choices (y-axis) is plotted against the log quantity ratio (x-axis). Each data point indicates one offer type in Task 1 (gray circles) or Task 2 (red and blue triangles for AB trials and BA trials, respectively). Sigmoids were obtained from probit regressions. The relative value (ρ) and sigmoid steepness (η) measured in each task and the order bias (ε) measured in Task 2 are indicated. In this session, the animal presented all three biases. Compared to Task 1, choices in Task 2 were less accurate (ηTask2 < ηTask1) and biased in favor of juice A (ρTask2 > ρTask1; preference bias). Furthermore, choices in Task 2 were biased in favor of offer2 (ε > 0; order bias). (D) Example session 2. Same format as panel C. (E, F) Comparing relative value across choice tasks. Each data point represents one session and gray ellipses indicate 90% confidence intervals. For both monkeys, relative values in Task 2 (y-axis) were significantly higher than in Task 1 (x-axis). Furthermore, the main axis of each ellipse was rotated counterclockwise compared to the identity line. 
(G, H) Comparing the sigmoid steepness across choice tasks. For both monkeys, sigmoids were consistently shallower (smaller η) in Task 2 compared to Task 1. (I, J) Order bias, distribution across sessions. Both monkeys presented a consistent bias favoring offer2 (mean(ε) > 0). Panels C, E, G, and I are from monkey J (N = 101 sessions); panels D, F, H, and J are from monkey G (N = 140 sessions). Sessions shown in panels C and D are highlighted in yellow in panels E, F, G, and H. Triangles in panels I and J indicate the mean. Statistical tests and exact p values are indicated in each panel.

The data set included 241 sessions (101 from monkey J, 140 from monkey G; see Methods). Sessions lasted for 217–880 trials (mean ± std = 589 ± 160). For each session, we analyzed choices in the two tasks separately using probit regressions. For Task 1 (simultaneous offers), we used the following model:

$$\text{Choice B} = \Phi(X), \qquad X = a_0 + a_1 \log(q_B/q_A) \tag{1}$$

where Choice B = 1 if the animal chose juice B and 0 otherwise, Φ was the cumulative function of the standard normal distribution, and qA and qB were the quantities of juices offered on any given trial. From the fitted parameters a0 and a1, we derived measures for the relative value of the two juices ρTask1 = exp(−a0/a1) and for the sigmoid steepness ηTask1 = a1. Intuitively, the relative value was the quantity ratio qB/qA that made the animal indifferent between the two juices, and the sigmoid steepness was inversely related to choice variability.
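The relation between the fitted parameters and the derived measures can be illustrated with a minimal sketch. It assumes only Equation 1 and the definitions ρTask1 = exp(−a0/a1) and ηTask1 = a1 given above. The function names and the parameter values are hypothetical, chosen for illustration, and the fitting step itself is not shown.

```python
import math

def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal, Φ(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_choice_b(q_a, q_b, a0, a1):
    """Equation 1: P(choice B) = Φ(a0 + a1 * log(qB/qA))."""
    return std_normal_cdf(a0 + a1 * math.log(q_b / q_a))

def task1_measures(a0, a1):
    """Relative value ρ = exp(-a0/a1) and sigmoid steepness η = a1."""
    return math.exp(-a0 / a1), a1

# Hypothetical fitted parameters (illustration only, not values from the study).
a0_example, a1_example = -2.0, 2.5
rho_task1, eta_task1 = task1_measures(a0_example, a1_example)

# At qB/qA = ρ the argument of Φ is zero, so both juices are equally likely
# to be chosen: this is the indifference point.
p_at_indifference = prob_choice_b(1.0, rho_task1, a0_example, a1_example)

# A larger a1 (steeper sigmoid) moves choice probabilities away from 0.5 for
# the same offer, which corresponds to lower choice variability.
p_shallow = prob_choice_b(1.0, 2.0 * rho_task1, a0_example, a1_example)
p_steep = prob_choice_b(1.0, 2.0 * rho_task1, 2.0 * a0_example, 2.0 * a1_example)
```

In this sketch, doubling both a0 and a1 leaves ρ unchanged and doubles η, so the two sigmoids share the same indifference point but differ in steepness.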

For Task 2 (sequential offers), we used the following model:

$$\text{Choice B} = \Phi(X), \qquad X = a_2 + a_3 \log(q_B/q_A) + a_4\,(\delta_{\text{order},AB} - \delta_{\text{order},BA}) \tag{2}$$

where δorder,AB = 1 for AB trials and 0 otherwise, and δorder,BA = 1 − δorder,AB. In essence, AB trials and BA trials were analyzed separately but assuming that the two sigmoids were parallel. From the fitted parameters a2, a3, and a4, we derived measures for the relative value of the two juices ρTask2 = exp(−a2/a3), for the sigmoid steepness ηTask2 = a3, and for the order bias ε = 2 ρTask2 a4/a3. Intuitively, the order bias was a bias favoring the first or the second offer. Specifically, ε < 0 indicated a bias favoring offer1; ε > 0 indicated a bias favoring offer2. We also defined relative values specific to AB trials and BA trials as ρAB = exp(−(a2 + a4)/a3) and ρBA = exp(−(a2 − a4)/a3). Of note, the order bias was defined such that

$$\varepsilon \approx \rho_{BA} - \rho_{AB} \tag{3}$$
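The measures derived for Task 2 can be checked numerically with a short sketch. It assumes that the order term in the probit model is the difference δorder,AB − δorder,BA (+1 on AB trials, −1 on BA trials), so that the order-specific relative values are the indifference points of the two parallel sigmoids, ρAB = exp(−(a2 + a4)/a3) and ρBA = exp(−(a2 − a4)/a3). The function name and the parameter values are hypothetical.

```python
import math

def task2_measures(a2, a3, a4):
    """Derived measures for the sequential-offer probit model."""
    rho_task2 = math.exp(-a2 / a3)          # relative value across orders
    return {
        "rho_task2": rho_task2,
        "eta_task2": a3,                     # sigmoid steepness
        "epsilon": 2.0 * rho_task2 * a4 / a3,  # order bias
        "rho_ab": math.exp(-(a2 + a4) / a3),   # indifference point, AB trials
        "rho_ba": math.exp(-(a2 - a4) / a3),   # indifference point, BA trials
    }

# Hypothetical fitted parameters (illustration only, not values from the study).
measures = task2_measures(a2=-2.0, a3=2.0, a4=0.1)

# Difference between the order-specific relative values, to compare with epsilon.
rho_difference = measures["rho_ba"] - measures["rho_ab"]
approximation_error = abs(measures["epsilon"] - rho_difference)
```

The two quantities agree closely because ρBA − ρAB = 2 ρTask2 sinh(a4/a3), which reduces to 2 ρTask2 a4/a3 when a4/a3 is small. With a4 > 0, the sigmoid for BA trials has the higher indifference point, that is, more juice B is needed to offset juice A when A is offered second.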

The experimental design gave us the opportunity to compare choices across tasks independently of factors such as selective satiation or changes in the internal state. The relative values measured in the two tasks were highly correlated (Figure 2E, F). At the same time, our analyses revealed three interesting phenomena. First, for both animals, sigmoids measured in Task 2 were significantly shallower compared to Task 1 (Figure 2G, H). In other words, presenting offers in sequence reduced choice accuracy. Second, in Task 2, both animals showed a consistent order bias favoring offer2 (Figure 2I, J). Third, in both animals, relative values in Task 2 were significantly higher than in Task 1 (ρTask2 > ρTask1), and this effect increased with the relative value (Figure 2E, F). In other words, the ellipse marking the 90% confidence interval for the joint distribution of relative values lay above the identity line and was rotated counterclockwise relative to it.

To further investigate the differences in relative values measured across tasks, we quantified them separately in AB trials and BA trials in each monkey. We thus examined the relation between ρTask1 and ρTask2,AB and, separately, that between ρTask1 and ρTask2,BA (Figure 3). In both animals and in both sets of trials, the ellipse marking the 90% confidence interval was rotated counterclockwise compared to the identity line. Furthermore, the ellipse measured for BA trials was higher than that for AB trials. We quantified these observations with an analysis of covariance (ANCOVA) using the presentation order (AB and BA) as a covariate and imposing parallel lines (Figure 3C, F). In both animals, the two regression lines were significantly distinct (difference in intercept >0, p ≤ 0.002 in each animal). This result confirmed the presence of an order bias favoring offer2 in Task 2. Concurrently, in both animals the regression slope was significantly >1 (p ≤ 0.04 in each animal; ellipse rotation). This result indicated that the animals had an additional bias favoring juice A in Task 2, and that this bias increased as a function of the relative value ρ. We refer to this phenomenon as the preference bias.
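The parallel-lines regression underlying this analysis can be sketched as follows. This is a minimal illustration of a least-squares fit with one slope shared by AB and BA trials and a separate intercept for each order. It does not reproduce the significance tests, and the function name and data are hypothetical.

```python
def fit_parallel_lines(x, y, group):
    """Least-squares fit of y = intercept[g] + slope * x, one slope for all groups."""
    labels = sorted(set(group))
    means = {}
    for g in labels:
        xs = [xi for xi, gi in zip(x, group) if gi == g]
        ys = [yi for yi, gi in zip(y, group) if gi == g]
        means[g] = (sum(xs) / len(xs), sum(ys) / len(ys))
    # The shared slope is the ratio of pooled within-group sums of products.
    sxy = sum((xi - means[gi][0]) * (yi - means[gi][1])
              for xi, yi, gi in zip(x, y, group))
    sxx = sum((xi - means[gi][0]) ** 2 for xi, gi in zip(x, group))
    slope = sxy / sxx
    # Each line passes through the mean point of its own group.
    intercepts = {g: means[g][1] - slope * means[g][0] for g in labels}
    return slope, intercepts

# Hypothetical noise-free sessions: slope > 1 and a higher line for BA trials.
rho_task1 = [1.5, 2.0, 2.5, 3.0]
rho_task2_ab = [-0.3 + 1.2 * r for r in rho_task1]
rho_task2_ba = [0.1 + 1.2 * r for r in rho_task1]

x = rho_task1 + rho_task1
y = rho_task2_ab + rho_task2_ba
order = ["AB"] * len(rho_task1) + ["BA"] * len(rho_task1)

slope, intercepts = fit_parallel_lines(x, y, order)
intercept_difference = intercepts["BA"] - intercepts["AB"]
```

In this sketch, a positive intercept difference corresponds to the order bias and a slope above 1 corresponds to the preference bias.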

Order bias and preference bias.

(A–C) Monkey J (N = 101 sessions). In panels A and B, ρTask2,AB and ρTask2,BA (y-axis) are plotted against ρTask1 (x-axis). Each data point represents one session and gray ellipses indicate 90% confidence intervals. The main axis of both ellipses is rotated counterclockwise compared to the identity line (preference bias). In addition, the ellipse in panel B is displaced upwards compared to that in panel A (order bias). In panel C, the same data are pooled and color coded. The two lines are from an ANCOVA (covariate: order; parallel lines). The regression slope is significantly >1 (preference bias) and the two intercepts differ significantly from each other (order bias). (D–F) Monkey G (N = 140 sessions). Same format. The results closely resemble those for monkey J but the preference bias is weaker.

Computational framework

The following sections present a series of results on the neuronal origins of these behavioral phenomena. We begin by discussing the computational framework for the analyses.

Economic choice is thought to entail two stages: values are assigned to the available offers and a decision is made by comparing values. Importantly, in our tasks and in most circumstances, choices elicit an ensemble of mental operations taking place before, during, and after the computation and comparison of offer values. Upstream of valuation, choices examined here entail the sensory processing of visual stimuli and the retrieval from memory of relevant information (e.g., the association between color and juice type). Downstream of value comparison, the decision outcome must guide a suitable motor response. In addition, performance in Task 2 requires holding in working memory the value of offer1 until offer2, remembering the decision outcome for an additional delay, and mapping that outcome onto the appropriate saccade target (Figure 2B). In principle, choice biases could emerge at any of these computational stages. Likewise, each of these mental operations could be noisy and thus contribute to choice variability.

Neuronal activity in OFC does not capture all of these processes. However, previous work indicates that neurons in this area participate both in value computation and value comparison. In the framework proposed here (Figure 1), sensory and limbic areas feed offer value cells, where values are integrated. In turn, offer value cells provide the primary input to a neural circuit constituted by chosen juice cells and chosen value cells, where decisions are formed. Finally, the decision circuit is connected with downstream areas, such as lateral prefrontal cortex, engaged in working memory and in transforming choice outcomes into suitable action plans. This scheme reflects the anatomical connectivity of OFC and other prefrontal regions (Carmichael and Price, 1995a; Carmichael and Price, 1995b; Petrides and Pandya, 2006; Saleem et al., 2014; Takahara et al., 2012); it is motivated by neurophysiology results from OFC (Ballesta et al., 2020; Rich and Wallis, 2016) and connected areas (Cai and Padoa-Schioppa, 2014; Sasikumar et al., 2018); and it is consistent with computational models of economic decisions (Friedrich and Lengyel, 2016; Rustichini and Padoa-Schioppa, 2015; Solway and Botvinick, 2012; Song et al., 2017; Yim et al., 2019; Zhang et al., 2018).

Of note, both offer value and chosen value cells encode subjective values. However, in the framework of Figure 1, offer value cells express a pre-decision value, while chosen value cells express a value emerging during the decision process. Conversely, the activity of chosen juice cells captures the evolving commitment to a particular choice outcome. In this framework, suitable analyses of neuronal activity may reveal whether particular choice biases emerge during valuation, during value comparison, or in subsequent computational stages.

Reduced accuracy under sequential offers emerged at the valuation stage

Other things equal, choices under sequential offers (Task 2) were significantly less accurate than choices under simultaneous offers (Task 1; Figure 2). We first investigated the neural origins of this phenomenon.

The primary data set examined in this study included 183 offer value cells, 160 chosen juice cells, and 174 chosen value cells (see Methods). Comparing neuronal responses across tasks, we noted that offer value signals in Task 2 were significantly weaker than in Task 1. Figure 4A–C illustrates one example cell. In both tasks, this neuron encoded the offer value B. However, the activity range (see Methods) measured in Task 2 was smaller than that measured in Task 1. This effect was also observed at the population level. For this analysis, we pooled offer value cells associated with juices A and B, and with positive or negative encoding (see Methods). For Task 1, we focused on the post-offer time window; for Task 2, we focused on post-offer1 and post-offer2 time windows, pooling trial types from both windows. For each cell, we imposed that the response be significantly tuned in these time windows in each task, and we quantified the mean activity and the activity range (Δr, see Methods). At the population level, the mean activity did not differ significantly across tasks (p = 0.6, t-test; p = 0.4, Wilcoxon test; Figure 4D). In contrast, the activity range was significantly lower in Task 2 compared to Task 1 (ΔrTask2 < ΔrTask1; p = 0.06, t-test; p = 0.02, Wilcoxon test; Figure 4E). In other words, offer value signals were weaker in Task 2 compared to Task 1.

Figure 4 with 1 supplement
Lower choice accuracy in Task 2 reflects weaker offer value signals.

(A–C) Weaker offer value signals in Task 2, example cell. Panel A illustrates the choice pattern. Panel B illustrates the neuronal response measured in Task 1 (post-offer time window). Each data point represents one trial type. In C, two panels illustrate the neuronal responses measured in Task 2 (post-offer1 and post-offer2 time windows). Each data point represents one trial type; red and blue colors are for AB and BA trials, respectively. In panels B and C, firing rates (y-axis) are plotted against variable offer value B and gray lines are from linear regressions. Notably, the cell has lower activity range in Task 2 than in Task 1. (D, E) Weaker offer value signals in Task 2, population analysis (N = 109 offer value cells). The two panels illustrate the results for the mean activity and the activity range, respectively. In each panel, x- and y-axis represent measures obtained in Task 1 and Task 2, respectively. Each data point represents one cell. For each cell, we examined one time window (post-offer) in Task 1 and two time windows (post-offer1 and post-offer2) in Task 2. Circles and diamonds refer to post-offer1 and post-offer2 time windows, respectively. Measures of mean activity measured in the two tasks (panel D) were statistically indistinguishable. In contrast, activity ranges (panel E) were significantly reduced in Task 2 compared to Task 1. Statistical tests and exact p values are indicated in each panel. The example cell shown in panels A–C is highlighted in orange in panels D and E. (F) Offer value signals and choice accuracy (N = 109 cells). For each offer value cell, we computed the activity range Δr in each task (see Methods). Here, the difference in activity range Δr = ΔrTask2 − ΔrTask1 (y-axis) is plotted against the difference in sigmoid steepness Δη = ηTask2 − ηTask1 measured in the same session (x-axis). The two measures were significantly correlated across the population. The gray line in panel F is from a linear regression.
This analysis was restricted to 53 cells significantly tuned in the post-offer time window (Task 1) and post-offer1 time window (Task 2), and 56 cells significantly tuned in the post-offer time window (Task 1) and post-offer2 time window (Task 2).

The activity of offer value cells is causally related to choices (Ballesta et al., 2020). Furthermore, for given value range and mean activity, the activity range determines the neuronal signal-to-noise ratio. Indeed, we previously found that decreases in the encoding slope of offer value cells due to range adaptation reduce choice accuracy (Conen and Padoa-Schioppa, 2019; Rustichini et al., 2017). Along similar lines, we inquired whether the difference in choice accuracy measured across tasks (Figure 2G, H) might be explained, at least partly, by differences in neuronal activity range (Figure 4E). We thus examined the relation between the difference in sigmoid steepness (Δη = ηTask2 − ηTask1) and the difference in activity range (Δr = ΔrTask2 − ΔrTask1). The two measures were positively correlated (Spearman r = 0.2, p = 0.01; Pearson r = 0.3, p = 0.003; Figure 4F). In other words, the drop in choice accuracy observed in Task 2 compared to Task 1 correlated with weaker offer value signals. A similar analysis of chosen value cells found that the activity range Δr was reduced in Task 2 compared to Task 1. However, this effect and the drop in choice accuracy were not significantly correlated (Figure 4—figure supplement 1).
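The correlation analysis described above can be sketched in a few lines. The example computes, for each cell, the Task 2 minus Task 1 difference in sigmoid steepness and in activity range, and then the Pearson and Spearman correlation coefficients across cells. All values and function names are hypothetical, the activity range is taken as given (its definition is in Methods), and p values are not computed.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_r(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical per-cell measures (illustration only, not data from the study).
eta_task1 = [6.0, 5.0, 7.0, 4.5, 5.5]
eta_task2 = [4.0, 4.5, 3.5, 4.0, 3.0]
range_task1 = [10.0, 8.0, 12.0, 9.0, 11.0]
range_task2 = [8.5, 7.8, 8.0, 8.6, 9.5]

delta_eta = [t2 - t1 for t1, t2 in zip(eta_task1, eta_task2)]
delta_range = [t2 - t1 for t1, t2 in zip(range_task1, range_task2)]

r_pearson = pearson_r(delta_eta, delta_range)
r_spearman = spearman_r(delta_eta, delta_range)
```

A positive coefficient here means that cells with a larger drop in activity range were recorded in sessions with a larger drop in sigmoid steepness.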

In conclusion, the lower choice accuracy measured in Task 2 compared to Task 1 correlated with weaker offer value signals in OFC. Thus, this behavioral phenomenon emerged, at least partly, during valuation.

The order bias emerged during value comparison

The next series of analyses focused on the neural origins of the order bias (ε). Since this phenomenon pertains only to choices under sequential offers, we included in the analyses an additional data set recorded in the same animals performing only Task 2 (see Methods).

In the framework of Figure 1, we first inquired whether the order bias emerged during valuation. If this was the case, for any given good, offer value cells should encode a higher value when the good is presented as offer2. To test this hypothesis, we pooled offer value cells associated with the two juices. For each neuron, ‘E’ indicated the juice encoded by the cell and ‘O’ indicated the other juice. We thus refer to EO trials and OE trials. For any given cell, we compared the response recorded in EO trials (post-offer1 time window) with the response recorded in OE trials (post-offer2 time window). If the order bias emerged during valuation, the mean activity and/or the activity range should be higher for the latter (Figure 5—figure supplement 1A). Contrary to this prediction, across a population of 128 cells, we did not find any systematic difference in mean activity or activity range (Figure 5—figure supplement 1B, C). Furthermore, the difference between the activity parameters measured in OE and EO trials did not correlate with the order bias (Figure 5—figure supplement 1D). In conclusion, assigned values did not depend on the presentation order.

We next examined whether the order bias emerged during value comparison. If so, the bias should be reflected in the activity of both chosen juice and chosen value cells (Figure 1). For chosen value cells, the hypothesis might be tested noting that in post-offer1 and post-offer2 time windows these neurons encoded the value currently offered independently of the juice type (Table 1). Thus, the activity measured in these time windows in AB and BA trials provided neuronal measures for the relative values of the two juices. More specifically, for each chosen value cell, we derived the two measures ρneuronalAB and ρneuronalBA for AB trials and BA trials, respectively (Figure 5A; see Methods). We also defined the difference Δρneuronal = ρneuronalBA − ρneuronalAB. We recall that the order bias (ε) was essentially equal to the difference between the relative values measured behaviorally in BA and AB trials (Equation 3). Thus, assessing whether the activity of chosen value cells reflected the order bias amounts to testing the relation between Δρneuronal and ε.
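One way to picture these neuronal measures is sketched below. The exact definitions are given in Methods and are not reproduced here; the sketch assumes that each measure is a ratio of encoding slopes obtained by regressing firing rate on the offered quantity, with ρneuronalAB taken from AB trials (juice A in the post-offer1 window, juice B in the post-offer2 window) and ρneuronalBA taken from BA trials. The slope names, the ratios, and the data are assumptions made for illustration.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def neuronal_relative_values(theta_1a, theta_1b, theta_2a, theta_2b):
    """Assumed slope ratios; theta_1a is the slope for juice A as offer1, etc."""
    rho_ab = theta_1a / theta_2b  # AB trials: A is offer1, B is offer2
    rho_ba = theta_2a / theta_1b  # BA trials: B is offer1, A is offer2
    return rho_ab, rho_ba, rho_ba - rho_ab

# Hypothetical noise-free responses of one chosen value cell (spikes/s).
quantities = [1.0, 2.0, 3.0, 4.0]
theta_1a = ols_slope(quantities, [5.0 + 4.0 * q for q in quantities])
theta_1b = ols_slope(quantities, [5.0 + 2.0 * q for q in quantities])
theta_2a = ols_slope(quantities, [5.0 + 4.4 * q for q in quantities])
theta_2b = ols_slope(quantities, [5.0 + 2.0 * q for q in quantities])

rho_neuronal_ab, rho_neuronal_ba, delta_rho_neuronal = neuronal_relative_values(
    theta_1a, theta_1b, theta_2a, theta_2b
)
```

Under these assumptions, a cell whose slope for juice A is steeper when A is offered second yields ρneuronalBA > ρneuronalAB, and hence a positive Δρneuronal, the sign expected for an order bias favoring offer2.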

Table 1
Neuronal encoding of decision variables in the two choice tasks.

The table summarizes the results of a previous report (Shi et al., 2022a). Under simultaneous offers, different groups of orbitofrontal cortex (OFC) neurons encode different decision variables, each with positive or negative sign (indicated here with + and −). In first approximation, each cell encodes the same variable across time windows. Under sequential offers, OFC neurons encode different variables in different time windows. However, the vast majority of them present one of eight specific patterns of variables, referred to as variable ‘sequences’ and detailed here. Furthermore, there is a clear correspondence between neurons encoding a particular variable in Task 1 and neurons encoding a particular sequence in Task 2. Hence, we can refer to different cell groups in OFC using the standard nomenclature originally defined for Task 1.

Task 1              Task 2
                    Post-offer1             Post-offer2             Post-juice
offer value A +     offer value A | AB +    offer value A | BA +    chosen value A +
offer value A −     offer value A | AB −    offer value A | BA −    chosen value A −
offer value B +     offer value B | BA +    offer value B | AB +    chosen value B +
offer value B −     offer value B | BA −    offer value B | AB −    chosen value B −
chosen juice A      AB | BA +               AB | BA −               chosen juice A
chosen juice B      AB | BA −               AB | BA +               chosen juice B
chosen value +      offer value1 +          offer value2 +          chosen value +
chosen value −      offer value1 −          offer value2 −          chosen value −
Figure 5
Fluctuations in order bias and fluctuations in the activity of chosen value cells.

(A) Neuronal measures of relative value. The two panels represent in cartoon format the response of a chosen value cell in the post-offer1 and post-offer2 time windows (Task 2). In each of these time windows, chosen value cells encode the value of the offer on display. Here, the two axes correspond to the firing rate (y-axis) and to the offered juice quantity (x-axis). The two colors correspond to the two orders (AB, BA). In each time window, two linear regressions provide two slopes, proportional to the value of the two juices. From the four measures θ1A (left panel, red), θ1B (left panel, blue), θ2A (right panel, blue), and θ2B (right panel, red), we derive four neuronal measures of relative value (Methods, Equations 13–16). (B) Neuronal measures of relative value in AB trials and BA trials (N = 96 cells). The x- and y-axis correspond to ρneuronalAB and ρneuronalBA, respectively. Each data point represents one cell. The two measures are strongly correlated. The gray line is from a Deming regression. (C) Fluctuations of relative value and fluctuations in order bias (N = 96 cells). For each chosen value cell, we quantified the difference in the neuronal measure of relative value Δρneuronal = ρneuronalBA − ρneuronalAB. Here, the x-axis is the order bias (ε), the y-axis is Δρneuronal, and each data point corresponds to one cell. The gray line is from a linear regression. Statistical tests and exact p values are indicated in each panel. This analysis was restricted to 96 cells that had significant θ1A, θ1B, θ2A, and θ2B. Fluctuations of Δρneuronal correlated with fluctuations of ε across the population. Of note, the regression line has a negative intercept and the data cloud seems displaced downwards compared to what one might expect. As a result, Δρneuronal was on average close to 0. We cannot provide a clear interpretation for this observation and future work shall revisit this issue.

We conducted a population analysis of 96 chosen value cells. Confirming previous results (Padoa-Schioppa and Assad, 2006), neuronal and behavioral measures of relative value were highly correlated. Similarly, the two neuronal measures of relative value, ρneuronalAB and ρneuronalBA, were correlated with each other (Figure 5B). Most importantly, the difference Δρneuronal and the order bias ε were significantly correlated across the population (Spearman r = 0.3, p = 0.007; Pearson r = 0.2, p = 0.02; Figure 5C). Hence, session-to-session fluctuations in the activity of chosen value cells correlated with fluctuations in the order bias.
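As an illustration of how the neuronal measures of relative value might be obtained for one chosen value cell, the following minimal sketch fits the four tuning slopes and combines them. It assumes that ρneuronalAB and ρneuronalBA are ratios of the slopes measured in trials of the same order (θ1A/θ2B and θ2A/θ1B, respectively); the exact definitions are given in Methods (Equations 13–16) and are not reproduced here, so these ratios, the function names, and the toy firing rates are all hypothetical.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def neuronal_relative_values(theta_1A, theta_1B, theta_2A, theta_2B):
    """Return (rho_AB, rho_BA, delta_rho) for one chosen value cell.

    ASSUMPTION: each relative value is the ratio of the juice A slope to
    the juice B slope, taken from trials of the same presentation order.
    """
    rho_AB = theta_1A / theta_2B  # AB trials: A in post-offer1, B in post-offer2
    rho_BA = theta_2A / theta_1B  # BA trials: B in post-offer1, A in post-offer2
    return rho_AB, rho_BA, rho_BA - rho_AB


# Toy data (hypothetical): firing rate as a function of offered quantity.
quantity = [1.0, 2.0, 3.0, 4.0]
theta_1A = ols_slope(quantity, [2.0 + 4.0 * q for q in quantity])
theta_1B = ols_slope(quantity, [2.0 + 2.0 * q for q in quantity])
theta_2A = ols_slope(quantity, [2.0 + 5.0 * q for q in quantity])
theta_2B = ols_slope(quantity, [2.0 + 2.0 * q for q in quantity])

rho_AB, rho_BA, delta_rho = neuronal_relative_values(
    theta_1A, theta_1B, theta_2A, theta_2B
)
```

In this toy cell, juice A drives a steeper response when offered second, so Δρneuronal is positive. Across the population, such values would then be related to the order bias ε with a rank or linear correlation.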

Further insights on the order bias came from the analysis of chosen juice cells. Again, for each neuron, E and O indicated the juice encoded by the cell and the other juice, respectively. A previous study found that the baseline activity of chosen juice cells recorded in OE trials immediately before offer2 was negatively correlated with the value of offer1 (i.e., the value of the other juice) – a phenomenon termed circuit inhibition (Ballesta and Padoa-Schioppa, 2019). If the decision is conceptualized as the evolution of a dynamic system (Rustichini and Padoa-Schioppa, 2015; Wang, 2002), circuit inhibition sets the system’s initial conditions and is thus integral to value comparison. In this account, the evolving decision is essentially captured by the activity of chosen juice cells in OE trials, which reflects a competition between the negative offset set by the value of offer1 (initial condition) and the incoming signal encoding the value of offer2. If so, the intensity of circuit inhibition should be negatively correlated with the order bias.

We tested this prediction as follows. First, we replicated previous findings and confirmed the presence of circuit inhibition in our primary data set (Figure 6A). Second, we focused on a 300-ms time window starting 250 ms before offer2 onset. For each chosen juice cell, we regressed the firing rate against the normalized offer1 value (see Methods). Thus, the regression slope c1 quantified circuit inhibition for individual cells. Across a population of 295 chosen juice cells, mean(c1) was significantly <0 (p = 5 × 10−6, t-test; p = 9 × 10−8, Wilcoxon test; Figure 6B). Third, we examined the relation between circuit inhibition (c1) and the order bias (ε). Confirming the prediction, the two measures were significantly correlated across the population (Spearman r = 0.1, p = 0.02; Pearson r = 0.1, p = 0.02; Figure 6C). In other words, stronger circuit inhibition (more negative c1) corresponded to a weaker order bias (smaller ε).
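The slope c1 can be sketched for a single chosen juice cell as follows. This is a minimal illustration, not the analysis code used in the study: it assumes that the offer1 value is normalized by dividing by the largest value offered in the session (the actual normalization is described in Methods), and the function names and toy firing rates are hypothetical.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def circuit_inhibition_slope(offer1_value, firing_rate):
    """Slope c1 of firing rate vs. normalized offer1 value (OE trials).

    ASSUMPTION: values are normalized by the session maximum.
    A negative slope indicates circuit inhibition.
    """
    v_max = max(offer1_value)
    v_norm = [v / v_max for v in offer1_value]
    return ols_slope(v_norm, firing_rate)


# Toy OE trials (hypothetical): pre-offer2 activity drops as V(O) increases.
value_offer1 = [1.0, 2.0, 3.0, 4.0]
rate_pre_offer2 = [10.0, 9.5, 9.0, 8.5]  # spikes/s in the pre-offer2 window
c1 = circuit_inhibition_slope(value_offer1, rate_pre_offer2)
```

Here c1 is negative, as expected for a cell showing circuit inhibition. Repeating the computation for every chosen juice cell yields the distribution of slopes that is compared with ε.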

Figure 6
Order bias and circuit inhibition.

(A) Circuit inhibition in chosen juice cells (primary data set, N = 160 cells). For each chosen juice cell, E and O indicated the encoded juice and the other juice, respectively. We separated EO and OE trials, and divided each group of trials in tertiles based on the value of offer1. For EO trials, this corresponded to V(E); for OE trials, it corresponded to V(O). In this panel, Q1, Q2, and Q3 indicate low, medium, and high values of offer1. In OE trials, shortly before offer2, the activity of chosen juice cells was negatively correlated with V(O) – a phenomenon termed circuit inhibition. For a quantitative analysis of circuit inhibition, we focused on a 300-ms time window starting 250 ms before offer2 onset (black line). (B) Circuit inhibition for individual cells (N = 295 cells). For each chosen juice cell, we regressed the firing rate against the normalized V(O) (see Methods). The histogram illustrates the distribution of regression slopes (c1), which quantify circuit inhibition for individual cells. The effect was statistically significant across the population (mean = −0.95). (C) Correlation between order bias and circuit inhibition (N = 295 cells). Here, the x-axis is the order bias (ε), the y-axis is circuit inhibition (regression slope c1), and each data point represents one cell. The two measures were significantly correlated across the population. Panel A includes only the primary data set; thus circuit inhibition shown here replicates previous findings (Ballesta and Padoa-Schioppa, 2019). Panels B and C include both the primary and the additional data sets (see Methods). In panels B and C, 47 cells were excluded from the analysis because measures of order bias (ε) or circuit inhibition (c1) were detected as outliers by the interquartile criterion. Including these cells in the analysis did not substantially alter the results. Statistical tests and exact p values are indicated in panels B and C.

In conclusion, the order bias did not originate before or during valuation. Analysis of chosen juice cells and chosen value cells indicated that the order bias emerged during value comparison (decision stage).

The preference bias emerged late in the trial (post-comparison)

When offers were presented sequentially (Task 2), both monkeys showed an additional preference bias that favored juice A and was more pronounced when the relative value of the two juices was larger (Figure 3). Our last series of analyses focused on the origins of this bias.

First, we inquired whether the preference bias emerged during valuation. If this was the case, one or both of the following should be true: (1) offer value A cells encoded higher values in Task 2 than in Task 1 and/or (2) offer value B cells encoded lower values in Task 2 than in Task 1. Furthermore, these putative effects should increase as a function of the relative value. To test these predictions, we examined the tuning functions of offer value cells. For each cell group (offer value A, offer value B), we pooled neurons with positive and negative encoding. For Task 1, we focused on the post-offer time window; for Task 2, we focused on post-offer1 and post-offer2 time windows, pooling trial types from both windows. Indicating with b0 and b1 the tuning intercept and tuning slope (see Methods, Equation 8), we computed the difference in intercept Δb0 = b0,Task2 − b0,Task1 and the difference in slope Δb1 = b1,Task2 − b1,Task1 for each cell. We then examined the relation between these measures and the relative value ρ across the population, separately for each cell group. Contrary to the prediction, we did not find any correlation between neuronal measures (Δb0, Δb1) and the behavioral measure (ρ) for either offer value A or offer value B cells (Figure 7—figure supplement 1). Thus, the preference bias did not seem to emerge at the valuation stage.

We next examined chosen value cells. As discussed above, their activity provided a neuronal measure for the relative value (ρneuronal), which reflected the internal subjective values of the juices emerging during value comparison. In principle, ρneuronal might differ from the relative value derived from choices through the probit regression (ρbehavioral) because choices might be affected by systematic biases originating downstream of value comparison (Figure 1). In the light of this consideration, we examined the relation between the neuronal measure of relative value in Task 2 (ρneuronalTask2, see Methods) and the behavioral measures obtained in the two tasks (ρbehavioralTask1, ρbehavioralTask2). We envisioned two possible scenarios (Figure 7A). In scenario 1, the preference bias reflected a difference in values across tasks. In other words, the subjective values of the juices in the two tasks were different and such that the relative value of juice A was higher in Task 2 than in Task 1. If so, ρneuronalTask2 should be statistically indistinguishable from ρbehavioralTask2 and systematically larger than ρbehavioralTask1. In scenario 2, the subjective values of the juices were the same in both tasks and the preference bias reflected some neuronal process taking place downstream of value comparison. If so, ρneuronalTask2 should be statistically indistinguishable from ρbehavioralTask1 and systematically smaller than ρbehavioralTask2.

Figure 7
The preference bias does not reflect differences in the activity of chosen value cells.

(A) Hypothetical scenarios. The two panels represent in cartoon format two possible scenarios envisioned at the outset of this analysis. In both panels, the x-axis represents behavioral measures from either Task 1 (green) or Task 2 (yellow); the y-axis represents the neuronal measure from Task 2. In scenario 1, the animal assigned higher relative value to juice A in Task 2. Thus, neuronal measures of relative value derived from the activity of chosen value cells in Task 2 (ρneuronalTask2) would align with behavioral measures from the same task (ρbehavioralTask2) and be systematically higher than behavioral measures from Task 1 (ρbehavioralTask1). In scenario 2, the animal assigned the same relative values to the juices in both tasks. Thus, neuronal measures of relative value in Task 2 (ρneuronalTask2) would be systematically lower than behavioral measures from the same task (ρbehavioralTask2) and would align with behavioral measures from Task 1 (ρbehavioralTask1). (B) Empirical results (N = 52 cells). Neuronal measures derived from Task 2 (ρneuronalTask2) are plotted against behavioral measures obtained in Task 1 (ρbehavioralTask1, green) or Task 2 (ρbehavioralTask2, yellow). Lines are from linear regressions. In essence, ρneuronalTask2 was statistically indistinguishable from ρbehavioralTask1 and systematically lower than ρbehavioralTask2. Details on the statistics and exact p values are indicated in the figure. The analysis was restricted to 52 cells that had significant θ1A, θ1B, θ2A, and θ2B. For this analysis, ρneuronalTask2 was taken as equal to ρneuronaloffer2 (Equation 14). Other definitions provided similar results (data not shown).

The results of our analysis clearly conformed with scenario 2 (Figure 7B). For each chosen value cell, we computed ρneuronalTask1 in the post-offer time window and ρneuronalTask2 in the post-offer2 time window. Across the population, the two measures were statistically indistinguishable (p = 0.3, t-test; not shown). We then regressed ρneuronalTask2 onto ρbehavioralTask1. The linear relation between these measures was statistically indistinguishable from identity. Separately, we regressed ρneuronalTask2 onto ρbehavioralTask2. In this case, the regression slope was significantly <1 (p = 0.02). This result is quite remarkable. It shows that the chosen value represented in the brain in Task 2 was equal to that inferred from choices in Task 1, and significantly different from that inferred from choices in Task 2. This fact implies that the preference bias was costly for the monkey, as it reduced the value obtained on average at the end of each trial (see Discussion).

In summary, the preference bias did not reflect differences in the values assigned to individual offers (offer values). Furthermore, insofar as the activity of chosen value cells reflects the decision process (Figure 1), the preference bias did not seem to emerge during value comparison. So how can one make sense of this behavioral phenomenon? At the cognitive level, the preference bias might be interpreted as due to the higher demands of Task 2. When the two saccade targets appeared on the monitor, information about values was no longer on display (Figure 2B). If at that point the animal had not finalized its decision, or if it had failed to retain in working memory the decision outcome, the animal might have selected the target associated with the better juice (juice A). Such bias would have been especially strong when the value difference between the two juices was large. In this view, the preference bias would reflect a ‘second thought’ occurring after value comparison, in some trials.

To test this intuition, we turned to the activity of chosen juice cells. As noted above, in Task 2, the evolving decision was captured by the activity of these neurons recorded in OE trials immediately before and after offer2 onset (Figure 8A). More specifically, the state of the ongoing decision was captured by the distance between the two traces corresponding to the two possible choice outcomes (E chosen, O chosen). For any neuron, we quantified this distance with a receiver operating characteristic (ROC) analysis, which provided a choice probability (CP). In essence, CP can be interpreted as the probability with which an ideal observer may guess the eventual choice outcome based on the activity of the cell. For each chosen juice cell, we computed the CP at different times in the trial. Across the population, mean(CP) exceeded chance level starting shortly before offer2, consistent with the above discussion on circuit inhibition. We then proceeded to investigate the origins of the preference bias.
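The CP described above corresponds to the area under the ROC curve, which can be computed directly by comparing every trial in which juice E was chosen with every trial in which juice O was chosen. The sketch below is a minimal illustration under that standard equivalence; the function name and the toy firing rates are hypothetical, and the sliding-window procedure used in the study is not reproduced.

```python
def choice_probability(rates_E_chosen, rates_O_chosen):
    """Area under the ROC curve for two sets of single-trial firing rates.

    Equals the probability that a randomly drawn 'E chosen' trial has a
    higher firing rate than a randomly drawn 'O chosen' trial, counting
    ties as one half. A value of 0.5 corresponds to chance level.
    """
    wins = 0.0
    for rate_e in rates_E_chosen:
        for rate_o in rates_O_chosen:
            if rate_e > rate_o:
                wins += 1.0
            elif rate_e == rate_o:
                wins += 0.5
    return wins / (len(rates_E_chosen) * len(rates_O_chosen))


# Toy OE trials for one chosen juice cell (hypothetical spikes/s).
rates_when_E_chosen = [12.0, 15.0, 9.0, 14.0]
rates_when_O_chosen = [8.0, 11.0, 10.0, 13.0]
cp = choice_probability(rates_when_E_chosen, rates_when_O_chosen)

# Identical distributions give chance-level CP.
cp_chance = choice_probability([1.0, 2.0], [1.0, 2.0])
```

In the toy example, 12 of the 16 trial pairs have higher activity when E was chosen, so the CP is above chance. Computing this quantity in successive time windows yields a CP time course for each cell.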

Figure 8
Preference bias and choice probability (CP) in chosen juice cells.

(A) Profiles of activity and CP (N = 160 cells). On the top, separate traces are activity profiles for EO trials (dark colors) and OE trials (light colors), separately for E chosen (blue) and O chosen. On the bottom the trace is the mean(CP) computed for OE trials in 100-ms sliding time windows (25-ms steps). Red dots indicate that mean(CP) was significantly >0.5 (p < 0.001; t-test). Value comparison typically takes place shortly after the onset of offer2. (B–E) Distribution of CP in four 250-ms time windows. The time windows used for this analysis are indicated in panel A. (F–I) Correlation between CP and preference bias index. Each panel corresponds to the histogram immediately above it. CPs are plotted against the preference bias index (PBI), which quantifies the preference bias independently of the juice types. Each symbol represents one cell and the line is from a linear regression. CP and PBI were negatively correlated immediately before and after offer2 onset, but not later in the trial. This pattern suggests that the preference bias emerged late in the trial, when decisions were not finalized shortly after offer2 presentation.

We reasoned that, net of measurement noise and cell-to-cell variability, CPs ultimately quantify the animal’s commitment to the eventual choice outcome. If the preference bias emerged late in the trial – perhaps after target presentation, if animals had not already finalized their decision – the intensity of the preference bias should be inversely related to the animals’ commitment to the eventual choice outcome measured earlier in the trial. In other words, there should be a negative correlation between the preference bias and CPs computed at the time when decisions normally take place (shortly before or after offer2 onset). Our analyses supported this prediction. To quantify the preference bias intensity independent of the juice pair, we defined the preference bias index (PBI) = 2 (ρTask2 − ρTask1)/(ρTask2 + ρTask1). We then focused on four 250-ms time windows: before offer1 (control window), before and after offer2 onset, and before juice delivery (Figure 8B–E). Confirming our predictions, CP and PBI were significantly anti-correlated immediately before and during offer2 presentation, but not in the control time window or late in the trial (Figure 8F–I).
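The PBI defined above, and its relation to CP across cells, can be sketched as follows. The PBI formula is taken from the text; the Pearson correlation helper and all numerical values are hypothetical and serve only to illustrate the expected sign of the relation.

```python
def preference_bias_index(rho_task1, rho_task2):
    """PBI = 2 (rho_Task2 - rho_Task1) / (rho_Task2 + rho_Task1)."""
    return 2.0 * (rho_task2 - rho_task1) / (rho_task2 + rho_task1)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


# One session (hypothetical): relative value 2.0 in Task 1 and 2.5 in Task 2.
pbi_example = preference_bias_index(2.0, 2.5)

# Toy population (hypothetical): PBI of the session in which each cell was
# recorded, paired with that cell's CP in a window around offer2 onset.
pbi_per_cell = [preference_bias_index(2.0, r2) for r2 in (2.0, 2.2, 2.5, 3.0)]
cp_per_cell = [0.60, 0.58, 0.55, 0.53]
r_cp_pbi = pearson_r(pbi_per_cell, cp_per_cell)
```

Because the difference is divided by the mean of the two relative values, the index is comparable across juice pairs with very different ρ. In the toy population, lower CP accompanies a larger PBI, which is the sign of relation predicted by the late-emergence account.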

In conclusion, our results indicated that the preference bias did not emerge during valuation or during value comparison. Conversely, our results suggest that the preference bias emerged late in the trial, as a ‘second thought’ process that guided choices when decisions were not finalized based on offer values alone.

Discussion

Behavioral values, neuronal values, and the origins of choice biases

Early economists proposed that choices between goods entail the computation and comparison of subjective values. However, the concept of value is somewhat slippery, because values relevant to choices cannot be measured behaviorally other than from choices themselves. This circularity problem haunted generations of scholars, dominating academic debates in the 19th and 20th centuries. In the end, neoclassic economic theory came to reject (cardinal) values and to rely only on (ordinal) preferences (Niehans, 1990; Samuelson, 1947). In other words, standard economics is agnostic as to whether subjective values are computed at all. The construction of standard economic theory was a historic success, but it came at a cost: the theory cannot explain a variety of biases observed in human choices (Camerer et al., 2003; Kahneman and Tversky, 2000; Lichtenstein and Slovic, 2006). In this perspective, neuroscience results showing that neuronal activity in multiple brain regions is linearly related to values defined behaviorally (Amemori and Graybiel, 2012; Cai et al., 2011; Cai and Padoa-Schioppa, 2012; Hosokawa et al., 2013; Jezzini and Padoa-Schioppa, 2020; Kable and Glimcher, 2007; Kim et al., 2008; Lak et al., 2014; Levy et al., 2010; Louie and Glimcher, 2010; Padoa-Schioppa and Assad, 2006; Pastor-Bernier et al., 2019; Plassmann et al., 2007; Shenhav and Greene, 2010) constitute a significant breakthrough. They validate the concept of value and effectively break the circularity surrounding it. Indeed, a neuronal population whose activity is reliably correlated with values measured from choices (behavioral values) may be used to derive independent measures of subjective values (neuronal values). In most circumstances, neuronal values and behavioral values should be (and are) indistinguishable. However, in specific choice contexts, the two measures might differ somewhat. When observed, such discrepancies indicate that choices are partly determined by processes that escape the maximization of offer values. If so, suitable analyses of neuronal activity may be used to assess the origins of particular choice biases.

These considerations motivated the analyses conducted in this study. In our experiments, monkeys chose between two juices offered simultaneously or sequentially. Choices under sequential offers were less accurate, biased in favor of the second offer (order bias), and biased in favor of the preferred juice (preference bias). It is generally understood that good-based economic decisions take place in OFC (Cisek, 2012; Padoa-Schioppa, 2011; Rushworth et al., 2012) and that the encoding of decision variables in this area is categorical in nature (Hirokawa et al., 2019; Onken et al., 2019; Padoa-Schioppa, 2013). Earlier studies had identified in OFC three groups of neurons encoding individual offer values, the chosen juice, and the chosen value. Furthermore, choices under simultaneous or sequential offers were found to engage the same groups of cells (Shi et al., 2022a). Notably, the variables encoded in OFC capture both the input and the output of the decision process. This observation and a series of experimental and theoretical results led to the hypothesis that the cell groups identified in OFC constitute the building blocks of a decision circuit (Padoa-Schioppa and Conen, 2017). More specifically, we hypothesize that offer value cells provide the primary input to a circuit formed by chosen juice cells and chosen value cells, where decisions are formed. In this view, different cell groups in OFC may be associated with different computational stages: offer value cells instantiate the valuation stage; chosen value cells reflect values possibly modified by the decision process; and chosen juice cells capture the evolving commitment to a particular choice outcome. In this framework, we examined the activity of each cell group in relation to each behavioral phenomenon.

Our results may be summarized as follows. (1) Other things equal, neuronal signals encoding the offer values were weaker (smaller activity range) under sequential offers than under simultaneous offers. The reason for this discrepancy is unclear, but this neuronal effect was correlated with the difference in choice accuracy measured at the behavioral level. In other words, the drop in choice accuracy observed under sequential offers originated, at least partly, at the valuation stage. (2) The order bias did not correlate with any measure in the activity of offer value cells. However, the order bias was negatively correlated with circuit inhibition in chosen juice cells – a phenomenon seen as key to value comparison (Ballesta and Padoa-Schioppa, 2019). Furthermore, session-to-session fluctuations in the order bias correlated with fluctuations in the neuronal measure of relative value derived from chosen value cells. These findings indicate that the order bias emerged during value comparison. (3) The preference bias did not have any correlate in the activity of offer value cells or chosen value cells. Moreover, the preference bias was inversely related to a measure derived from chosen juice cells and quantifying the degree to which the decision was finalized when offer values are ‘normally’ compared (i.e., upon presentation of the second offer). These findings indicate that the preference bias emerged late in the trial. As a caveat, the hypothesis discussed above, linking different cell groups in OFC to specific decision stages, awaits further confirmation.

Two of our findings are particularly relevant to the distinction between behavioral values and neuronal values. First, the activity of offer value cells did not present any difference associated with the presentation order or with the juice preference. Second, relative values derived from chosen value cells under sequential offers differed significantly from behavioral measures obtained in the same task, and were indistinguishable from behavioral measures obtained in the other task (simultaneous offers). Thus, the order bias and the preference bias highlighted significant differences between neuronal and behavioral measures of value. These observations imply that the order bias and the preference bias emerged downstream of valuation. Importantly, they also imply that the two choice biases imposed a cost on the animals, in the sense that they reduced the (neuronal) value obtained on average in any given trial. Notably, it would be impossible to draw such a conclusion based on choices alone.

To our knowledge, this is the first study to investigate the origins of choice biases building on the distinction between behavioral values and neuronal values. At the same time, some of our results are not unprecedented. Earlier work showed that human and animal choices are affected by a bias favoring, on any given trial, the same good chosen in the previous trial (Alós-Ferrer et al., 2016; Goodwin, 1977; Padoa-Schioppa, 2013; Schoemann and Scherbaum, 2019; Senftleben et al., 2021). The origins of this phenomenon, termed choice hysteresis, are hard to pinpoint based on behavioral evidence alone. However, previous analysis of neuronal activity in OFC revealed that choice hysteresis is not reflected in the encoding of offer values. Conversely, choice hysteresis correlates with fluctuations in the baseline activity of chosen juice cells, which is partly influenced by the previous trial’s outcome (Padoa-Schioppa, 2013). Thus, similar to the order bias, choice hysteresis appears to emerge at the decision stage.

The cost of choice biases

We have noted that the three behavioral phenomena described here were detrimental to the animals. This point bears a few comments.

Let us consider, in a very general sense, choices between two goods A and B, taking place in two possible conditions 1 and 2. We may refer to the subjective values of the two goods in the two conditions as VA,1, VA,2, VB,1, and VB,2. Let us also assume the presence of a choice bias such that in condition 1 the animal consistently chooses A, while in condition 2 it consistently chooses B. In broad strokes, that might be for two reasons. Either (1) values differ across conditions such that VA,1 > VB,1 and VA,2 < VB,2, or (2) the value of each good remains unchanged across conditions (VA,1 = VA,2 and VB,1 = VB,2) and choices are affected by some other process, downstream of value computation. In case (2), in one of the two conditions the animal consistently chooses the lower value. In other words, the choice bias is detrimental to the animal. Coming to our experiments, the analysis of neuronal activity indicates that the subjective values of offered juices were independent of the choice task, and independent of the presentation order in Task 2. Thus, scenario (2) held true with respect to both the order bias and the preference bias. Consequently, both biases were detrimental to the animals.

With respect to the preference bias, one question is whether the bias affected choices in Task 1 or in Task 2 (in principle, there could be a bias favoring the unpreferred juice in Task 1). The fact that ρbehavioralTask1, ρneuronalTask1, and ρneuronalTask2 were all indistinguishable from each other while ρbehavioralTask2 differed from them (Figure 7) argues for the latter understanding.

It is interesting to speculate whether the choice biases documented here might benefit the animal in some more general sense. For example, one might wonder whether the cost imposed by the preference bias was lower than the metabolic cost the monkey would have incurred to increase its performance level and avoid that bias. If so, the preference bias would be, in fact, ecologically adaptive. Addressing this question would require quantifying the metabolic cost of increasing performance in the same value units used for the juices – a challenge open for future studies. However, independent of that assessment, our present results indicate that the putative metabolic cost of increasing performance in the task did not explicitly enter the decision process. If metabolic cost affected behavior, it did so in a meta-decision sense.

Similar considerations hold for the difference in choice accuracy measured across tasks. The fact that sigmoid functions are not infinitely steep (i.e., the presence of choice variability) means that on some trials the animal chooses the lower value. In fact, one can quantify the loss in expected payoff as a function of the sigmoid steepness (Constantinople et al., 2019; Rustichini et al., 2017). That sigmoids were shallower in Task 2 means that the average payoff was lower in that task – a detriment to the animal. Again, it is interesting to speculate whether weaker offer value signals recorded in Task 2 might also benefit the animal in some way, perhaps by reducing cognitive or metabolic costs. This question remains open for future studies. Importantly, such costs did not explicitly enter the decision process; if they affected behavior, they did so in a meta-decision sense.
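The relation between sigmoid steepness and expected payoff can be illustrated with a minimal sketch. It assumes, for illustration only, that the probability of choosing the higher-valued offer is a cumulative normal (probit) function of the value difference scaled by a steepness parameter; the behavioral model actually fitted in the study is specified in Methods and may differ. The function names, the value pairs, and the steepness values are hypothetical.

```python
import math


def probit(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_loss(value_pairs, steepness):
    """Mean value forgone per trial relative to always choosing the best offer.

    ASSUMPTION: P(choose higher value) = probit(steepness * |V_A - V_B|).
    On trials where the lower value is chosen, the loss is |V_A - V_B|.
    """
    total = 0.0
    for value_a, value_b in value_pairs:
        diff = abs(value_a - value_b)
        p_best = probit(steepness * diff)
        total += (1.0 - p_best) * diff
    return total / len(value_pairs)


# Toy offer set (hypothetical), with values in arbitrary common units.
offers = [(1.0, 2.0), (2.0, 2.5), (3.0, 1.0), (4.0, 4.5)]
loss_steep = expected_loss(offers, steepness=3.0)    # steeper sigmoid
loss_shallow = expected_loss(offers, steepness=1.0)  # shallower sigmoid
```

For any offer set with unequal values, the shallower sigmoid yields the larger expected loss, which is the sense in which lower choice accuracy in Task 2 is costly.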

Conclusions

The past two decades have witnessed a lively interest in the neural underpinnings of choice behavior. In this effort, a significant breakthrough came from the adoption of behavioral paradigms inspired by the economics literature, in which subjective values derived from choices are used to interpret neural activity. Without renouncing this approach, here we took a further step, showing that the decision process sometimes falls short of selecting the maximum offer value, and that choices are sometimes affected by processes taking place downstream of value comparison. In other words, behavioral values and neuronal values sometimes differ. These results might seem uncontroversial, but they have deep implications for economic theory and beyond. Looking forward, the framework developed here, in which the computation and comparison of offer values are central, but choices can also be affected by other processes accessible through neuronal measures, may help understand the origins of other choice biases.

Materials and methods

All the experimental procedures adhered to the NIH Guide for the Care and Use of Laboratory Animals and were approved by the Institutional Animal Care and Use Committee (IACUC) at Washington University (protocol number 190931).

Animal subjects, choice tasks, and neuronal recordings

This study presents new analyses of published data (Shi et al., 2022a). Experimental procedures for surgery, behavioral control, and neuronal recordings have been described in detail. Briefly, two male monkeys (Macaca mulatta; monkey J, 10.0 kg, 8 years old; monkey G, 9.1 kg, 9 years old) participated in the study. Under general anesthesia, we implanted on each animal a head restraining device and an oval chamber (axes 50 × 30 mm) allowing bilateral access to OFC. During the experiments, monkeys sat in an electrically insulated environment with their head fixed and a computer monitor placed at 57 cm distance. The gaze direction was monitored at 1 kHz using an infrared video camera (Eyelink, SR Research). Behavioral tasks were controlled through custom written software based on Matlab (v2016a; MathWorks Inc). The code is available online (Hwang et al., 2019; Shi et al., 2022b; https://monkeylogic.nimh.nih.gov).

In each session, the animal chose between two juices labeled A and B (A preferred) offered in variable amounts. Trials with two choice tasks, referred to as Task 1 and Task 2, were pseudorandomly interleaved. In both tasks, offers were represented by sets of colored squares displayed on the monitor. For each offer, the color indicated the juice type and the number of squares indicated the quantity. Each trial began with the animal fixating a large dot. After 0.5 s, the initial fixation point changed to a small dot or a small cross; the new fixation point cued the animal to the choice task used in that trial. In Task 1 (Figure 2A), cue fixation (0.5 s) was followed by the simultaneous presentation of the two offers. After a randomly variable delay (1–1.5 s), the center fixation point disappeared and two saccade targets appeared near the offers (go signal). The animal indicated its choice with an eye movement. It maintained peripheral fixation for 0.75 s, after which the chosen juice was delivered. In Task 2 (Figure 2B), cue fixation (0.5 s) was followed by the presentation of one offer (0.5 s), an interoffer delay (0.5 s), presentation of the other offer (0.5 s), and a wait period (0.5 s). Two colored saccade targets then appeared on the two sides of the fixation point. After a randomly variable delay (0.5–1 s), the center fixation point disappeared (go signal). The animal indicated its choice with a saccade, maintained peripheral fixation for 0.75 s, after which the chosen juice was delivered. Central and peripheral fixation were imposed within 4–6 and 5–7 degrees of visual angle, respectively. Aside from the initial cue, the choice tasks were nearly identical to those used in previous studies (Ballesta and Padoa-Schioppa, 2019; Padoa-Schioppa and Assad, 2006).

For any given trial, qA and qB indicate the quantities of juices A and B offered to the animal, respectively. An ‘offer type’ was defined by two quantities [qA qB]. In any given session, we used the same juices and the same sets of offer types for the two tasks. For Task 1, the spatial configuration of the offers varied randomly from trial to trial. For Task 2, the presentation order varied pseudorandomly and was counterbalanced across trials for any offer type. The terms ‘offer1’ and ‘offer2’ indicated, respectively, the first and second offer, independently of the juice type and amount. Trials in which juice A was offered first and trials in which juice B was offered first were referred to as ‘AB trials’ and ‘BA trials’, respectively. The spatial location (left/right) of saccade targets varied randomly. The juice volume corresponding to one square (quantum) was set equal for the two choice tasks and remained constant within each session. It varied across sessions between 70 and 100 μl for both monkeys. The association between the initial cue (small dot, small cross) and the choice task varied across sessions in blocks. Across sessions, we used 12 different juices (and colors) and 45 different juice pairs. Based on a power analysis, in most sessions the number of trials for Task 2 was set equal to 1.5 times that for Task 1.

Neuronal recordings were guided by structural MRI scans (1 mm sections) obtained before and after surgery and targeted area 13m (Ongür and Price, 2000). We recorded from both hemispheres in both monkeys. Tungsten single electrodes (100 µm shank diameter; FHC) were advanced remotely using a custom-built motorized microdrive. Typically, one motor advanced two electrodes placed 1 mm apart, and 1–2 such pairs of electrodes were advanced unilaterally or bilaterally in each session. Neural signals were amplified (gain: 10,000), bandpass filtered (300 Hz to 6 kHz; Lynx 8, Neuralynx), digitized (frequency: 40 kHz), and saved to disk (Power 1401, Cambridge Electronic Design). Spike sorting was performed offline (Spike2, v6, Cambridge Electronic Design). Only cells that appeared well isolated and stable throughout the session were included in the analysis.

Behavioral analyses

In each session, choice patterns were analyzed using probit regressions as described in the main text (Equations 1 and 2). For convenience, we repeat here the equation only for Task 1.

(4) $\text{choice B} = \Phi(X), \qquad X = a_0 + a_1 \log(q_B/q_A)$

Here, Φ indicates the cumulative function of the standard normal distribution. This model is referred to as the ‘log value ratio’ model. For Task 1 (simultaneous offers), the probit fit provided measures for the relative value ρTask1 and the sigmoid steepness ηTask1. For Task 2 (sequential offers), the probit fit provided measures for the relative value ρTask2, the sigmoid steepness ηTask2, and the order bias ε. Subsequent analyses of neuronal activity relied on these behavioral measures.
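
As a rough illustration of how Equation 4 relates to the behavioral measures, the sketch below evaluates the probit choice function and reads off the indifference point. It assumes that the relative value is the ratio qB/qA at which X = 0, that is ρ = exp(−a0/a1), and that the steepness is η = a1 (definitions taken to follow the main text, which is not reproduced here). The coefficient values and function names are hypothetical.

```python
import math

def normal_cdf(x):
    """Cumulative distribution function of the standard normal (Phi)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_choice_b(q_a, q_b, a0, a1):
    """Probability of choosing juice B under the log value ratio model (Equation 4)."""
    x = a0 + a1 * math.log(q_b / q_a)
    return normal_cdf(x)

def relative_value(a0, a1):
    """Quantity ratio qB/qA at indifference (X = 0); assumed definition of rho."""
    return math.exp(-a0 / a1)

# Hypothetical fitted coefficients, for illustration only.
a0, a1 = -3.0, 4.0
rho = relative_value(a0, a1)   # exp(0.75): one quantum of A is worth about 2.1 quanta of B
eta = a1                       # assumed definition of the sigmoid steepness
p_indiff = prob_choice_b(1.0, rho, a0, a1)  # 0.5 by construction
```

Note that the log ratio is undefined for forced choice trials (one quantity equal to zero), so a fit along these lines would have to handle or exclude those trials.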

To test the robustness of our findings, we conducted a series of control analyses. First, we fitted a probit using a ‘value difference’ model, defined as follows:

(5) $\text{choice B} = \Phi(X), \qquad X = a_0\, q_A + a_1\, q_B$

Second, we fitted a logit using a log value ratio model:

(6) $\text{choice B} = 1/(1 + e^{-X}), \qquad X = a_0 + a_1 \log(q_B/q_A)$

Third, we fitted a logit using a value difference model:

(7) $\text{choice B} = 1/(1 + e^{-X}), \qquad X = a_0\, q_A + a_1\, q_B$

Each of these fits provided measures for each of the parameters characterizing choices in the two tasks (ρTask1, ρTask2, etc.). For each session and for each model we obtained an R2. We then compared different models by computing the distribution of BIC across sessions for each pair of models. We generally found that log value ratio models provided a better fit than value difference models, consistent with theoretical considerations (Padoa-Schioppa, 2022). We also found that logit models provided a better fit than probit models, although measures of relative value, sigmoid steepness, and order bias were very similar and highly correlated. For consistency with previous studies, we report the results of neuronal analyses based on behavioral measures derived from Equations 1 and 2. However, all our results held true using measures derived from logit regressions.
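
For readers unfamiliar with the comparison, the following minimal sketch shows one way a session-level BIC could be computed from binary choices and model-predicted probabilities. It uses the standard definition BIC = k ln(n) − 2 ln(L); the exact procedure used for the analyses is not specified in this section, and the trial data, probabilities, and function names below are hypothetical.

```python
import math

def log_likelihood(choices_b, probs_b):
    """Bernoulli log-likelihood of binary choices (1 = chose B) given model probabilities."""
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for c, p in zip(choices_b, probs_b):
        p = min(max(p, eps), 1.0 - eps)
        total += math.log(p) if c else math.log(1.0 - p)
    return total

def bic(log_lik, n_params, n_trials):
    """Bayesian information criterion; lower values indicate a better fit."""
    return n_params * math.log(n_trials) - 2.0 * log_lik

# Hypothetical example: the same six trials scored under two candidate models.
choices = [0, 0, 1, 0, 1, 1]
probs_model_1 = [0.1, 0.2, 0.6, 0.3, 0.8, 0.9]
probs_model_2 = [0.3, 0.4, 0.5, 0.5, 0.6, 0.7]
bic_1 = bic(log_likelihood(choices, probs_model_1), 2, len(choices))
bic_2 = bic(log_likelihood(choices, probs_model_2), 2, len(choices))
delta_bic = bic_1 - bic_2  # negative: model 1 is preferred in this toy case
```

Collecting such differences across sessions gives a distribution that can be examined for each pair of models.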

Notably, Equation 2 describes two parallel sigmoids. In a control analysis, we relaxed this assumption and fitted choices in AB and BA trials with two independent sigmoids. Analyzing neuronal activity based on measures derived from this analysis did not substantially alter any of the results.

Finally, we defined the order bias as ε = 2 ρTask2 a4/a3. This definition is particularly convenient for the present analyses because ε equals the difference ρBA – ρAB (Equation 3). Alternative and valid definitions include ε = a4 and ε = a4/a3. Control analyses showed that using these definitions did not substantially alter any of the results.
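
The relation between ε and the difference ρBA − ρAB can be checked numerically. The sketch below assumes a particular form for Equation 2, which is not reproduced in this section: X = a2 + a3 log(qB/qA) + a4 (δorder,AB − δorder,BA), with ρTask2 = exp(−a2/a3). Under that assumption, the difference between the two indifference points matches ε to first order in a4/a3. All coefficient values are hypothetical.

```python
import math

def order_bias_measures(a2, a3, a4):
    """Relative values and order bias under an assumed form of Equation 2:
    X = a2 + a3*log(qB/qA) + a4*(d_AB - d_BA), with d_AB = 1 in AB trials."""
    rho_task2 = math.exp(-a2 / a3)
    rho_ab = math.exp(-(a2 + a4) / a3)   # indifference point in AB trials
    rho_ba = math.exp(-(a2 - a4) / a3)   # indifference point in BA trials
    epsilon = 2.0 * rho_task2 * a4 / a3  # definition used in the text
    return rho_task2, rho_ab, rho_ba, epsilon

# Hypothetical coefficients, for illustration only.
rho_task2, rho_ab, rho_ba, epsilon = order_bias_measures(a2=-3.0, a3=4.0, a4=0.2)
difference = rho_ba - rho_ab  # close to epsilon when a4/a3 is small
```

With a4 > 0, ρAB < ρBA: the animal requires less of juice B when B is offered second, which corresponds to a bias in favor of offer2.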

Preliminary analyses of neuronal activity

The present analyses build on the results of a previous study showing that both choice tasks engage the same groups of neurons in OFC (Shi et al., 2022a). Here, we briefly summarize those findings.

The original data set included 1526 neurons (672 from monkey J, 854 from monkey G) recorded in 306 sessions (115 from monkey J, 191 from monkey G). For each neuron, trials from Task 1 and Task 2 were first analyzed separately using the procedures developed in previous studies (Ballesta and Padoa-Schioppa, 2019; Padoa-Schioppa and Assad, 2006). For Task 1, we defined four time windows: post-offer (0.5 s after offer onset), late-delay (0.5–1 s after offer onset), pre-juice (0.5 s before juice onset), and post-juice (0.5 s after juice onset). A ‘trial type’ was defined by two offered quantities and a choice. For Task 2, we defined three time windows: post-offer1 (0.5 s after offer1 onset), post-offer2 (0.5 s after offer2 onset) and post-juice (0.5 s after juice onset). A ‘trial type’ was defined by two offered quantities, their order and a choice. For each task, each trial type and each time window, we averaged spike counts across trials. A ‘neuronal response’ was defined as the firing rate of one cell in one time window as a function of the trial type. Neuronal responses in each task were submitted to an analysis of variance (factor: trial type). Neurons passing the p < 0.01 criterion in ≥1 time window in either task were identified as ‘task-related’ and included in subsequent analyses.

Following earlier work (Padoa-Schioppa, 2013), neurons in Task 1 were classified in one of four groups: offer value A, offer value B, chosen juice, or chosen value. Each variable could be encoded with positive or negative sign, leading to a total of eight cell groups. Each neuronal response was regressed against each of the four variables. If the regression slope b1 differed significantly from zero (p < 0.05), the variable was said to ‘explain’ the response. In this case, we set the signed R2 as sR2 = sign(b1) R2; if the variable did not explain the response, we set sR2 = 0. After repeating the operation for each time window, we computed for each cell the sum(sR2) across time windows. Neurons explained by at least one variable in one time window, such that sum(sR2) ≠ 0, were said to be tuned; other neurons were labeled ‘untuned’. Tuned cells were assigned to the variable and sign providing the maximum |sum(sR2)|, where |·| indicates the absolute value. Thus, indicating with ‘+’ and ‘−’ the sign of the encoding, each neuron was classified in one of nine groups: offer value A+, offer value A−, offer value B+, offer value B−, chosen juice A, chosen juice B, chosen value+, chosen value−, and untuned.
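
The classification rule for Task 1 can be sketched as follows. This is an illustrative reading of the procedure, not the analysis code itself: it assumes the regressions have already been run, so each variable comes with one (slope, R2, p) triple per time window. The data structure, function names, and numbers are hypothetical.

```python
def signed_r2(slope, r2, p_value, alpha=0.05):
    """Signed R2 for one regression: sign(b1)*R2 if significant, else 0."""
    if p_value >= alpha or slope == 0:
        return 0.0
    return r2 if slope > 0 else -r2

def classify_cell(fits):
    """Assign a cell to the variable (and sign) with the largest |sum(sR2)|.

    `fits` maps each variable name to a list of (slope, R2, p) tuples,
    one per time window. Returns (variable, sign) or ('untuned', 0).
    """
    sums = {var: sum(signed_r2(*w) for w in windows)
            for var, windows in fits.items()}
    best = max(sums, key=lambda var: abs(sums[var]))
    if sums[best] == 0:
        return "untuned", 0
    return best, (1 if sums[best] > 0 else -1)

# Hypothetical regression results for one cell across four time windows.
example = {
    "offer value A": [(0.8, 0.30, 0.001), (0.5, 0.20, 0.01), (0.1, 0.02, 0.4), (0.0, 0.0, 1.0)],
    "offer value B": [(-0.1, 0.01, 0.6)] * 4,
    "chosen value":  [(0.4, 0.15, 0.02), (0.2, 0.05, 0.2), (0.1, 0.01, 0.7), (0.0, 0.0, 1.0)],
    "chosen juice":  [(0.1, 0.02, 0.5)] * 4,
}
label = classify_cell(example)  # ('offer value A', 1)
```

In this toy case only the significant windows contribute, so offer value A (sum of 0.50) wins over chosen value (0.15) and the cell is labeled offer value A+.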

Neuronal classification in Task 2 followed the procedures described in a previous study (Ballesta and Padoa-Schioppa, 2019). Under sequential offers, neuronal responses in OFC were found to encode different variables defined in relation to the presentation order (AB or BA). Specifically, the vast majority of responses were explained by one of 11 variables including 1 binary variable capturing the presentation order (AB | BA), 6 variables representing individual offer values (offer value A | AB, offer value A | BA, offer value B | AB, offer value B | BA, offer value 1, and offer value 2), 3 variables capturing variants of the chosen value (chosen value, chosen value A, and chosen value B), and a binary variable representing the binary choice outcome (chosen juice). Each of these variables could be encoded with a positive or negative sign. Most neurons encoded different variables in different time windows. In principle, considering 11 variables, 2 signs of the encoding and 3 time windows, neurons might present a very large number of variable patterns across time windows. However, the vast majority of neurons presented one of eight patterns referred to as ‘sequences’. Classification proceeded as follows. For each cell and each time window, we regressed the neuronal response against each of the variables predicted by each sequence. If the regression slope b1 differed significantly from zero (p < 0.05), the variable was said to explain the response and we set the signed R2 as sR2 = sign(b1) R2; if the variable did not explain the response, we set sR2 = 0. After repeating the operation for each time window, we computed for each cell the sum(sR2) across time windows for each of the eight sequences. Neurons such that sum(sR2) ≠ 0 for at least one sequence were said to be tuned; other neurons were untuned. Tuned cells were assigned to the sequence that provided the maximum |sum(sR2)|. 
As a result, each neuron was classified in one of nine groups: seq #1, seq #2, seq #3, seq #4, seq #5, seq #6, seq #7, seq #8, and untuned (Table 1).

The results of the two classifications were compared using analyses for categorical data. In essence, we found a strong correspondence between the cell classes identified in the two choice tasks (Shi et al., 2022a). Hence, we may refer to the different groups of cells using the standard nomenclature – offer value, chosen juice, and chosen value – independently of the choice task. Based on this result, we proceeded with a comprehensive classification based on the activity recorded in both choice tasks. For each task-related cell, we calculated the sum(sR2) for the eight variables in Task 1 (sum(sR2)Task1) and eight sequences in Task 2 (sum(sR2)Task2) as described above. We then added the corresponding sum(sR2)Task1 and sum(sR2)Task2 to obtain the final sum(sR2)final. Neurons such that sum(sR2)final ≠ 0 for at least one class were said to be tuned; other neurons were untuned. Tuned cells were assigned to the cell class that provided the maximum |sum(sR2)final|.

Data sets

In some sessions, one or both choice patterns presented complete or quasicomplete separation – that is, the animal split choices for <2 offer types in Task 1 and/or in Task 2. In these cases, the probit regression did not converge, the resulting steepness η was high and unstable, and the relative value was not unique. This issue affected the classification analyses described above only marginally, but for the present study it was critical that behavioral measures be accurate and precise. We thus restricted our analyses to stable sessions by imposing an interquartile criterion on the sigmoid steepness (Tukey, 1977). Defining IQR as the interquartile range, values below the first quartile minus 1.5*IQR or above the third quartile plus 1.5*IQR were identified as outliers and excluded. Thus, our entire data set included 1204 neurons (577 from monkey J, 627 from monkey G) recorded in 241 sessions (101 from monkey J, 140 from monkey G). In this population, the classification procedures identified 183 offer value cells, 160 chosen juice cells, and 174 chosen value cells. These neurons constitute the primary data set for this study.
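
The interquartile criterion can be sketched as below. The quartile interpolation method is an assumption (different software packages compute quartiles slightly differently), and the steepness values are made up for illustration.

```python
import statistics

def tukey_inliers(values, k=1.5):
    """Return the values inside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical steepness values; the last session mimics quasi-complete separation.
steepness = [3.1, 4.0, 4.4, 5.2, 5.9, 6.3, 7.0, 48.0]
stable = tukey_inliers(steepness)  # drops the 48.0 session, keeps the rest
```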

Most of our analyses compared choices and neuronal activity across tasks and were restricted to the primary data set. However, some analyses included only trials from Task 2 and quantified the effects due to the presentation order (AB vs. BA). In these analyses, we included an additional data set recorded previously from the same two animals performing only Task 2 (Ballesta and Padoa-Schioppa, 2019). All the procedures for behavioral control and neuronal recording were essentially identical to those described above. Furthermore, behavioral analyses and inclusion criteria were identical to those used for the primary data set. The resulting data set included 1205 neurons (414 from monkey J, 791 from monkey G) recorded in 196 sessions (51 from monkey J, 145 from monkey G). In this population, the classification procedures identified 243 offer value cells, 182 chosen juice cells, and 187 chosen value cells. We refer to these neurons as the additional data set. Importantly, the order bias was also observed in these sessions (Ballesta and Padoa-Schioppa, 2019).

The interquartile criterion was also used to identify outliers in all the analyses conducted throughout this study. In practice, this criterion became relevant only for the analyses shown in Figure 6 and Figure 5—figure supplement 1, as indicated in the respective figure legends.

Comparing tuning functions across choice tasks

Several analyses compared the tuning functions recorded in the two tasks. Tuning functions were defined by the linear regression of the firing rate r onto the encoded variable S:

(8) $r = b_0 + b_1 S$

Regression coefficients b0 and b1 were referred to as tuning intercept and tuning slope, respectively. Positive and negative encoding corresponded to b1 > 0 and b1 < 0, respectively. We also defined the mean activity and the activity range as follows. Indicating with Smax the maximum value of S, the mean activity was defined as rmean = b0 + b1Smax/2. The activity range was defined as Δr = |b1Smax|, where |·| indicates the absolute value.
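
As a worked example of these definitions, the sketch below fits Equation 8 by ordinary least squares and derives the mean activity and activity range. The function names and the noiseless toy response are hypothetical.

```python
def fit_tuning(s_values, rates):
    """Ordinary least-squares fit of r = b0 + b1*S; returns (b0, b1)."""
    n = len(s_values)
    mean_s = sum(s_values) / n
    mean_r = sum(rates) / n
    sxx = sum((s - mean_s) ** 2 for s in s_values)
    sxy = sum((s - mean_s) * (r - mean_r) for s, r in zip(s_values, rates))
    b1 = sxy / sxx
    b0 = mean_r - b1 * mean_s
    return b0, b1

def tuning_summary(s_values, rates):
    """Tuning intercept, slope, mean activity and activity range as defined above."""
    b0, b1 = fit_tuning(s_values, rates)
    s_max = max(s_values)
    r_mean = b0 + b1 * s_max / 2.0
    delta_r = abs(b1 * s_max)
    return {"b0": b0, "b1": b1, "r_mean": r_mean, "delta_r": delta_r}

# Hypothetical noiseless response: r = 2 + 3*S (spikes/s) for S = 0..4.
summary = tuning_summary([0, 1, 2, 3, 4], [2, 5, 8, 11, 14])
```

For this toy response the fit recovers b0 = 2 and b1 = 3, so with Smax = 4 the mean activity is 8 spikes/s and the activity range is 12 spikes/s.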

For any neuronal response, the tuning was considered significant if b1 differed significantly from zero (p < 0.05) and if the sign of the encoding was consistent with the cell class (e.g., b1 > 0 for offer value A+ cells). All the analyses comparing tuning functions across tasks were restricted to neuronal responses with significant tuning.

Neuronal measures of relative value

Several analyses relied on neuronal measures for the relative value of the juices (ρneuronal) derived from the activity of chosen value cells. In Task 1, these neurons encode the chosen value independently of the juice type. For each neuronal response, we performed a bilinear regression:

(9) $r = \theta_0 + \theta_A\, q_A\, \delta_{\text{choice},A} + \theta_B\, q_B\, \delta_{\text{choice},B}$

where θ0, θA, and θB were the regression coefficients, δchoice,A = 1 if the animal chose juice A and 0 otherwise, and δchoice,B = 1 − δchoice,A. If the response encodes the chosen value, θA should be proportional to the value of a quantum of juice A (uA), θB should be proportional to the value of a quantum of juice B (uB), and the ratio θA/θB should equal the value ratio – that is, the relative value of the two juices. We thus defined

(10) $\rho^{\text{neuronal}} = \theta_A/\theta_B$

Previous studies showed that this measure is statistically indistinguishable from the behavioral measure ρbehavioral derived from the probit analysis of choice patterns (Padoa-Schioppa and Assad, 2006).
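
The bilinear regression of Equation 9 and the ratio of Equation 10 can be illustrated with a small self-contained sketch. It solves the normal equations directly and omits any significance testing of the coefficients; the helper names and the noiseless toy cell are hypothetical.

```python
def solve_linear(a, b):
    """Solve a small linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def neuronal_rho(q_a, q_b, chose_a, rates):
    """Fit r = th0 + thA*qA*dA + thB*qB*dB (Equation 9) and return thA/thB (Equation 10)."""
    rows = [[1.0, qa * ca, qb * (1 - ca)] for qa, qb, ca in zip(q_a, q_b, chose_a)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, rates)) for i in range(3)]
    th0, th_a, th_b = solve_linear(xtx, xty)
    return th_a / th_b

# Hypothetical noiseless chosen value cell: r = 4 + 6*qA (A chosen) or 4 + 2*qB (B chosen).
q_a = [1, 1, 2, 2, 3, 1]
q_b = [1, 4, 2, 6, 1, 8]
chose_a = [1, 0, 1, 0, 1, 0]
rates = [4 + 6 * a * c + 2 * b * (1 - c) for a, b, c in zip(q_a, q_b, chose_a)]
rho_neuronal = neuronal_rho(q_a, q_b, chose_a, rates)  # 3.0 up to rounding
```

Here the cell's firing rate rises three times faster per quantum of A than per quantum of B, so the neuronal measure of relative value is 3.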

In Task 2, in the post-offer1 and post-offer2 time windows, chosen value cells encoded the value of the current offer, independent of the juice type (Table 1). For each neuron, we thus performed a bilinear regression for each of the two time windows:

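(11) $r_1 = \theta_{10} + \theta_{1A}\, q_A\, \delta_{\text{order},AB} + \theta_{1B}\, q_B\, \delta_{\text{order},BA}$
(12) $r_2 = \theta_{20} + \theta_{2A}\, q_A\, \delta_{\text{order},BA} + \theta_{2B}\, q_B\, \delta_{\text{order},AB}$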

where r1 and r2 were the responses recorded in the post-offer1 and post-offer2 time windows, respectively, and θ10, θ1A, θ1B, θ20, θ2A, and θ2B were regression coefficients. These coefficients provided four neuronal measures of relative value:

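(13) $\rho^{\text{neuronal}}_{\text{offer1}} = \theta_{1A}/\theta_{1B}$
(14) $\rho^{\text{neuronal}}_{\text{offer2}} = \theta_{2A}/\theta_{2B}$
(15) $\rho^{\text{neuronal}}_{AB} = \theta_{1A}/\theta_{2B}$
(16) $\rho^{\text{neuronal}}_{BA} = \theta_{2A}/\theta_{1B}$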

In essence, these four measures corresponded to the two time windows (post-offer1 and post-offer2) and to the two presentation orders (AB and BA). Importantly, all these measures were computed conditioned on θ1A, θ1B, θ2A, and θ2B differing significantly from zero (p < 0.05). The analyses illustrated in Figures 5 and 7 were restricted to neurons satisfying this criterion.

In terms of notation, we often omit the superscript in ρbehavioral and we indicate behavioral measures simply as ρ (with the relevant subscripts). We use the superscript ‘behavioral’ only when we explicitly compare behavioral and neuronal measures, for clarity. In contrast, for neuronal measures of relative value we always use the superscript ‘neuronal’.

Activity profiles of chosen juice cells

To conduct population analyses, we pooled all chosen juice cells. The juice eliciting higher firing rates was labeled ‘E’ (encoded) and the other juice was labeled ‘O’. In Task 2, we thus referred to EO trials and OE trials, depending on the presentation order.

To illustrate the activity profiles of chosen juice cells in Task 2, we aligned spike trains at offer1 and, separately, at juice delivery. For each trial, the spike train was smoothed using a kernel that mimicked the postsynaptic potential by exerting influence only forward in time (decay time constant = 20 ms) (So and Stuphorn, 2010). In Figures 6 and 8A, we used moving averages of 100 ms with 25 ms steps for display purposes.
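
A causal kernel of this kind can be sketched as follows. The text specifies only the decay time constant, so this example uses a pure exponential decay and omits any rise phase the original kernel may include; that simplification, the function name, and the spike times are assumptions made for illustration.

```python
import math

def causal_rate(spike_times_ms, t_ms, tau_ms=20.0):
    """Firing rate (spikes/s) at time t from a causal exponential kernel.

    Each spike contributes exp(-(t - ts)/tau)/tau for t >= ts and nothing
    before it occurs, so the estimate never looks ahead in time.
    """
    rate_per_ms = sum(
        math.exp(-(t_ms - ts) / tau_ms) / tau_ms
        for ts in spike_times_ms
        if ts <= t_ms
    )
    return 1000.0 * rate_per_ms  # convert from spikes/ms to spikes/s

# Hypothetical spike train (ms relative to offer1 onset), sampled every 25 ms.
spikes = [12.0, 40.0, 47.0, 130.0]
profile = [(t, causal_rate(spikes, t)) for t in range(0, 201, 25)]
```

Because the kernel acts only forward in time, activity changes cannot appear in the smoothed trace before the spikes that produce them, which matters when reading the latency of effects relative to offer onset.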

Under sequential offers, chosen juice cells encode different variables in different time windows (see Table 1). During offer1 and offer2 presentation, these cells encode in a binary way the juice type currently on display. Later, as the decision develops, these neurons gradually come to encode the binary choice outcome (i.e., the chosen juice). We previously showed that the activity of these neurons recorded in OE trials shortly before offer2 is inversely related to the value of offer1 (Ballesta and Padoa-Schioppa, 2019). This phenomenon, termed circuit inhibition, resembles the setting of a dynamic system’s initial conditions and is regarded as an integral part of the decision process (Ballesta and Padoa-Schioppa, 2019).

For a quantitative analysis of circuit inhibition, we focused on a 300-ms time window starting 250 ms before offer2 onset. We excluded forced choice trials, for which one of the two offers was null. For each neuron, we examined OE trials and we regressed the firing rates against the normalized value of offer1:

(17) $r = c_0 + c_1\, V(O)/\Delta V_O$

where ΔVO was the value range for juice O. The normalization allowed us to pool neurons recorded with different value ranges. The regression slope c1 quantified circuit inhibition for individual cells, and we studied this parameter at the population level.

The activity of chosen juice cells in OE trials captures the momentary state of the decision and thus the evolving commitment to a particular choice outcome. To quantify the momentary decision state, we conducted an ROC analysis (Green and Swets, 1966) on the activity recorded during OE trials. This analysis was conducted on raw spike counts, without kernel smoothing, time averaging or baseline correction. We restricted the analysis to offer types for which the animal split choices between the two juices and we excluded trial types with <2 trials. For each offer type, we divided trials depending on the chosen juice (E or O) and we compared the two distributions. The ROC analysis provided an area under the curve (AUC). For each neuron, we averaged the AUC across offer types to obtain the overall choice probability (CP) (Kang and Maunsell, 2012). The ROC analysis was performed in 100 ms time windows shifted by 25 ms. We also conducted the same analysis on four 250 ms time windows, namely pre-offer1 (−250 to 0 ms from offer1 onset), late offer2 (−250 to 0 ms from offer2 offset), early wait (0 to 250 ms after offer2 offset), and pre-juice (−250 to 0 ms before juice delivery) (Figure 8). In Figures 6 and 8B–I, cells were excluded because the Matlab function perfcurve.m failed to converge.
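
The ROC-based measure can be sketched in a few lines. This example computes the AUC by direct pairwise comparison (equivalent to the Mann-Whitney statistic) rather than with perfcurve.m, then averages across offer types; the data layout, function names, and spike counts are hypothetical.

```python
def roc_auc(rates_e, rates_o):
    """Area under the ROC curve: probability that a trial ending in choice E
    has a higher spike count than a trial ending in choice O (ties count 0.5)."""
    wins = 0.0
    for e in rates_e:
        for o in rates_o:
            if e > o:
                wins += 1.0
            elif e == o:
                wins += 0.5
    return wins / (len(rates_e) * len(rates_o))

def choice_probability(trials_by_offer_type, min_trials=2):
    """Average AUC across offer types with split choices.

    `trials_by_offer_type` maps an offer type to (counts_E, counts_O), the spike
    counts in trials where the encoded (E) or other (O) juice was chosen.
    """
    aucs = [
        roc_auc(e, o)
        for e, o in trials_by_offer_type.values()
        if len(e) >= min_trials and len(o) >= min_trials
    ]
    return sum(aucs) / len(aucs) if aucs else None

# Hypothetical spike counts for one cell in one time window.
data = {
    (1, 3): ([5, 7, 6], [4, 5]),
    (2, 3): ([8, 9], [6, 9, 7]),
    (1, 6): ([3], [2, 4, 3, 1]),   # skipped: fewer than 2 trials chose E
}
cp = choice_probability(data)
```

A value of 0.5 indicates that the activity carries no information about the upcoming choice, while values above 0.5 indicate higher activity before choices of the encoded juice.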

Data availability

Neuronal data and analysis scripts are deposited in GitHub: https://github.com/PadoaSchioppaLab/2022_eLife_choicebias (copy archived at swh:1:rev:a474955590576fecabc02fac14edfc2ef4e89144).

The following data sets were generated
    1. Padoa-Schioppa C
    (2022) GitHub
    ID PadoaSchioppaLab. Neuronal origins of reduced accuracy and biases in economic choices under sequential offers.

References

  1. Book
    1. Camerer C
    2. Loewenstein G
    3. Rabin M
    (2003)
    Advances in Behavioral Economics
    New York, NY; Princeton, NJ: Russell Sage Foundation - Princeton University Press.
  2. Book
    1. Camerer C
    (2008) Neuroeconomics: Using neuroscience to make economic predictions
    In: Hausman DM, editor. The Philosophy of Economics: An Anthology (3rd edition). Cambridge University Press. pp. 356–377.
    https://doi.org/10.1017/CBO9780511819025
  3. Book
    1. Green DM
    2. Swets JA
    (1966)
    Signal Detection Theory and Psychophysics
    New York: Wiley.
  4. Book
    1. Kahneman D
    2. Tversky A
    (editors) (2000) Choices, Values and Frames
    Cambridge, UK; New York, NY: Russell Sage Foundation - Cambridge University Press.
    https://doi.org/10.1017/CBO9780511803475
  5. Book
    1. Niehans J
    (1990)
    A History of Economic Theory: Classic Contributions, 1720-1980
    Baltimore: Johns Hopkins University Press.
  6. Book
    1. Samuelson PA
    (1947)
    Foundations of Economic Analysis
    Cambridge, MA: Harvard University Press.
  7. Book
    1. Tukey JW
    (1977)
    Exploratory Data Analysis
    Reading, Mass: Addison-Wesley Pub. Co.

Decision letter

  1. Erin L Rich
    Reviewing Editor; Icahn School of Medicine at Mount Sinai, United States
  2. Michael J Frank
    Senior Editor; Brown University, United States
  3. Veit Stuphorn
    Reviewer; Johns Hopkins University, United States

Our editorial process produces two outputs: i) public reviews designed to be posted alongside the preprint for the benefit of readers; ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Neuronal origins of reduced accuracy and biases in economic choices under sequential offers" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Michael Frank as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Veit Stuphorn (Reviewer #3).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Overall, reviewers found the study convincing and the results a significant advance in understanding circuit-level computations underlying decision-making. However, they raised multiple important issues to improve the clarity and specificity of the claims. The most important points to be addressed in revision are listed below, and the reviewers' individual comments are attached in full to provide additional context and suggestions.

Essential revisions:

1. The main topic that arose in reviewer consultation was the degree to which the conclusions rely on the lab's previous work parsing neural responses into discrete categories. Whereas this work has shown the reliability of these categories within the lab's own data, other literature has found conflicting results. Given that the degree to which categorical boundaries can be set within a continuum of response profiles is debatable, the reviewers think it's important to include discussion of the issue and acknowledgement of the explicit assumptions of the authors' methods and framework. Importantly, it's not just the categorization that the interpretation relies on: it's the assumption that each plays specific decision roles (valuation vs. comparison). (R1/R2/R3)

2. Conceptualizing the two tasks as computationally identical may not be accurate. The authors should discuss potential roles of working memory in task 2, and assess their model fits to behavior in each task independently. (R2/R3)

3. The idea of optimality, bias, and what constitutes reduced accuracy was also raised, and should be addressed by clarifications of the text: e.g. describe how exactly the different choice effects are 'detrimental', whether effects should be considered biases versus stochasticity, and how the findings here relate to classic effects in economic choice. (R2)

4. Please describe the rationale behind the order bias calculation, as opposed to using the a4 coefficient. In addition, consider how the signs of the neural order biases predict a consistent behavioral order bias. (R1/R2)

5. Please address why output signals do not reflect the reduced value sensitivity of the input signals. (R3)

Reviewer #1 (Recommendations for the authors):

1. The core premise of their interpretation/model is that the functional cell types they define (offer value, chosen value cells) are (mostly?) non-overlapping populations, if they contribute to distinct aspects of behavior. Is that strictly true, or are there subsets of cells that belong to both groups? In the methods section, it sounds like they force neurons to belong to only one group ("Tuned cells were assigned to the cell class that provided the maximum |sum(sR2 )final|"). This decision seems arbitrary but central to the interpretation that these really are different functional cell types.

2. On page 7, they define the order bias as ε = 2 ρTask2 a4/a3. Why don't they just evaluate the magnitude of the a4 coefficient? If it were 0, wouldn't that indicate no order bias? The sign would indicate the bias for first vs. second, and the magnitude the strength of the bias? Perhaps I am missing something, but that seems like a much more straightforward way to obtain an estimate of order bias from this model.

3. The authors show that offer value cells exhibit reduced responses for sequential compared to simultaneous offers. Is this true for both offer1 and offer2? They show an example neuron for both offer1 and offer2, but the population activity pools those responses together.

4. Related to point #1, it sounds like they combine data from task 1 and task 2 when they determine whether a cell is an "offer cell" or a "chosen cell." Are most of the offer value cells categorized based on their task 1 responses (i.e., is the maximum |sum(sR2 )final| the task1 offer value)? I am wondering the following: because there is a pre-stimulus cue at the beginning of each trial that tells the monkey which task it is performing, it's possible that the monkey/OFC treats them as two different tasks, and that slightly different populations of neurons encode offer value in each case. If that were the case, then using the task 1 responses to define the cells as encoding offer value could trivially result in the first finding, that there are reduced offer value responses in task 2. What if the authors determined whether cells were offer value cells based on the task 2 responses only? Does the result hold that the offer value responses are reduced, relative to task 1?

5. Figure 4F, the correlation is not very compelling. Is there something different about the sessions in which the sigmoid steepness and δ activity range go in opposite directions (i.e., the upper left and lower right quadrants?)

6. Throughout the paper, it is hard to remember what the different Greek letters refer to. For instance, on page 13, when ε reappears, I couldn't remember what that referred to. For clarity, it might help to use interpretable terms when possible (e.g., order bias instead of ε).

Reviewer #2 (Recommendations for the authors):

Much of my feedback can likely be gleaned from the public review, but I'll be specific (and probably repetitive) here:

(1) It would be helpful if the authors provided a more detailed framing of existing behavioral economic choice biases and how the current choice effects relate. For terminology, it is a bit confusing that these previous findings are labelled "biases" but that one of the effects discussed here is increased stochasticity (reduced accuracy) rather than a bias – can the authors find some way of being more general about choice effects?

(2) For the analysis of choice behavior and the use of probit regression, it would help if the authors could quantify the goodness of fit of their choice functions (probit model using log amount ratios), specifically against alternative models (logit, and/or models with amount difference rather than log ratio). The big issue is that the relevant measures (accuracy, order bias, preference bias) depend largely on parameters derived from these fits, so it is important to see that the fits are good and the model is appropriate. For the order effect in particular, have the authors examined the data while relaxing the assumption of a constant sigmoid?

(3) Re: the documentation of the preference bias, I would like the authors to be clearer in the text about the statistical test of the basic preference effect (higher rho in Task 1 vs. Task 2) – the effect is rather strong, and just needs to be clearly conveyed in the text. I do think that the dependence of the effect on relative value (i.e. rotation of the ellipse) seems to be less well supported: beyond clearing up the presentation of the significance of the results, perhaps this could be placed in a clearer context vis-à-vis the basic preference bias – I don't think it is necessary to the general findings.

(4) Please be clearer about how exactly the different effects are detrimental. I think this point relates directly to the initial framing about the difficulty of characterizing the optimality of value-based choice using behavior alone.

(5) Regarding the interpretation of OFC activity within the Rustichini/Wang modeling framework, I realize that the authors have good reason (and past data) to adopt this interpretation – I think it would be fair to at least acknowledge the explicit assumptions for the general reader.

(6) One interpretation issue that remains is whether the OFC neural effects are causal or correlational (particularly the accuracy and order effects, which can be seen in OFC activity). One issue that points towards the latter, at least in the order bias, is the sign of the neuronal effects (Figure 5C, Figure 6C). While the behavioral order bias is overwhelmingly positive, the neural effects (difference in AB vs BA rhos, circuit inhibition) do not on average differ from zero, even if they are correlated with the behavioral effect. If these neural effects drive the bias, shouldn't they also be biased in the correct direction? Can the authors provide an interpretation?

(7) Additional points:

– How exactly is the preferred item (A vs. B) defined – across both tasks combined?

– Given the initial framing, it seems to me that these data – especially the preference bias effect – provide insight into the goods versus action based choice discussion (and relate to some previous work like Cai and Padoa-Schioppa). Perhaps it would be relevant to discuss this briefly at the end of the paper.

– A conceptual question: given that all three effects are presented as suboptimal, why should this be the case given the initial framing that sequential decisions are the prevalent, natural form of choices?

Reviewer #3 (Recommendations for the authors):

The paper is written very clearly and the underlying logic for the different analysis is very well described. The results are for the most part convincing, albeit sometimes of a weak effect size.

https://doi.org/10.7554/eLife.75910.sa1

Author response

Essential revisions:

1. The main topic that arose in reviewer consultation was the degree to which the conclusions rely on the lab's previous work parsing neural responses into discrete categories. Whereas this work has shown the reliability of these categories within the lab's own data, other literature has found conflicting results. Given that the degree to which categorical boundaries can be set within a continuum of response profiles is debatable, the reviewers think it's important to include discussion of the issue and acknowledgement of the explicit assumptions of the authors' methods and framework. Importantly, it's not just the categorization that the interpretation relies on: it's the assumption that each plays specific decision roles (valuation vs. comparison). (R1/R2/R3)

The reviewers are correct. Our analyses were designed, and the results were interpreted, under a series of assumptions:

(a) Good-based decisions take place in OFC

(b) Different groups of neurons in OFC encode categorically distinct variables

(c) Choices under simultaneous and sequential offers engage the same cell groups

(d) These cell groups constitute the building blocks of a decision circuit. Specifically, offer value cells provide input to a circuit composed of chosen value cells and chosen juice cells, where values are compared and the decision is formed (Figure 1)

Assumption (a) builds on a broad literature and is generally accepted in the field (Cisek, 2012; Padoa-Schioppa, 2011; Rushworth et al., 2012). It is also supported by recent results showing that weak electrical stimulation of OFC selectively disrupts value comparison without inducing any choice bias (Ballesta et al., 2021). Assumptions (b) and (c) are directly supported by a series of studies showing categorical encoding in OFC (Hirokawa et al., 2019; Onken et al., 2019; Padoa-Schioppa, 2013) and by previous analysis of the same data examined here (Shi et al., 2022). In contrast, assumption (d) remains a working hypothesis, in the sense that the organization and mechanisms of the decision circuit are not well understood. Importantly, the scheme proposed here is very general in the sense that we don’t make any particular assumption about the connections between the cell groups, except that offer value cells are upstream of the other cell groups. This point is supported by analyses correlating choice variability with trial-by-trial variability in the activity of offer value cells (Conen and Padoa-Schioppa, 2015). However, we recognize that more work is needed to ascertain the anatomical organization of the decision circuit.

We revised the manuscript including in the Discussion the following paragraph that summarizes these points:

“In our experiments, monkeys chose between two juices offered simultaneously or sequentially. Choices under sequential offers were less accurate, biased in favor of the second offer (order bias), and biased in favor of the preferred juice (preference bias). It is generally understood that good-based economic decisions take place in OFC (Cisek, 2012; Padoa-Schioppa, 2011; Rushworth et al., 2012) and that the encoding of decision variables in this area is categorical in nature (Hirokawa et al., 2019; Onken et al., 2019; Padoa-Schioppa, 2013). Earlier studies had identified in OFC three groups of neurons encoding individual offer values, the chosen juice and the chosen value. Importantly, choices under simultaneous or sequential offers engage the same neurons (Shi et al., 2022). Notably, the variables encoded in OFC capture both the input and the output of the decision process. This observation and a series of experimental (Ballesta et al., 2021; Rich and Wallis, 2016) and theoretical results lead to the hypothesis that the cell groups identified in OFC constitute the building blocks of a decision circuit (Padoa-Schioppa and Conen, 2017). In this view, offer value cells provide the primary input to a circuit formed by chosen juice cells and chosen value cells, where decisions are formed. Different cell groups in OFC may thus be associated with different computational stages: offer value cells instantiate the valuation stage; chosen value cells reflect values possibly modified by the decision process; and chosen juice cells capture the evolving commitment to a particular choice outcome. In this framework, we examined the activity of each cell group in relation to each behavioral phenomenon.

[…] As a caveat, our results rely on the hypothesis that different cell groups identified in OFC play specific roles in the decision process. This working hypothesis awaits further confirmation.”

2. Conceptualizing the two tasks as computationally identical may not be accurate. The authors should discuss potential roles of working memory in task 2, and assess their model fits to behavior in each task independently. (R2/R3)

Actually, we don’t describe the two tasks as identical. In fact, we discuss how Task 2 involves a series of cognitive operations not required in Task 1, each of which could in principle introduce noise or biases (p.6). That said, we do argue that the two tasks engage the same groups of neurons in OFC. This statement is based on empirical evidence – that is, previous analyses of this same data set (Shi et al., 2022). We elaborate on this issue below in response to R1 (Major comments, point 1).

3. The idea of optimality, bias, and what constitutes reduced accuracy was also raised, and should be addressed by clarifications of the text: e.g. describe how exactly the different choice effects are 'detrimental', whether effects should be considered biases versus stochasticity, and how the findings here relate to classic effects in economic choice. (R2)

We added to the Discussion a new section ‘The cost of choice biases’ (p.13) that elaborates on this important issue. In a nutshell, if in two conditions (e.g., Task1 and Task2) subjective values are the same but choices are different, in one or both conditions the subject fails to choose the higher value. In that sense, the choice bias is detrimental. Our analyses of neuronal activity indicated that subjective offer values were the same in the two tasks, and were the same independent of the offer presentation in Task 2. Hence, both the preference bias and the order bias were detrimental to the animal.

4. Please describe the rationale behind the order bias calculation, as opposed to using the a4 coefficient. In addition, consider how the signs of the neural order biases predict a consistent behavioral order bias. (R1/R2)

We added to the Methods a new section ‘Behavioral analyses’ (p.16-17) where we discuss alternative logistic analyses (value difference instead of log value ratio; logit instead of probit). In the same section, we also explain the rationale for defining the order bias as we did, and we report that control analyses based on alternative definitions of the order bias provided essentially the same results.

5. Please address why output signals do not reflect the reduced value sensitivity of the input signals. (R3)

We agree that this result was puzzling and we gave this issue more thought. We realized that our definition of activity range for chosen value cells was arbitrary and – in some sense – not consistent across tasks. In Task 2, in the time windows of interest for this analysis (post-offer1 and post-offer2), these neurons encode the value of the offer currently on display. That value ranges between zero (forced choices for the other good) and the max offer value. In contrast, in Task 1, in the relevant time window (post-offer), these neurons encode the chosen value. That value ranges between the minimum chosen value (which is always >0) and the max offer value. In our previous definition of activity range, we had overlooked this difference. We have now corrected our definition, defining the activity range as that corresponding to the range of values [0, max offer value] in both tasks. According to this new and better definition, the activity range of chosen value cells is reduced in Task 2 compared to Task 1. Of note, this issue is not relevant to offer value cells, because for those neurons the minimum encoded value is always =0 in both tasks. The revised manuscript reports the new results (Figure 4—figure supplement 1).

Reviewer #1 (Recommendations for the authors):

1. The core premise of their interpretation/model is that the functional cell types they define (offer value, chosen value cells) are (mostly?) non-overlapping populations, if they contribute to distinct aspects of behavior. Is that strictly true, or are there subsets of cells that belong to both groups? In the methods section, it sounds like they force neurons to belong to only one group ("Tuned cells were assigned to the cell class that provided the maximum |sum(sR2 )final|"). This decision seems arbitrary but central to the interpretation that these really are different functional cell types.

This issue is partly addressed above, under Essential Revisions, point (1). Here we would like to emphasize a few points.

First, this study relies on previous work in monkeys (Onken et al., 2019; Padoa-Schioppa, 2013) and rodents (Hirokawa et al., 2019) showing that the encoding of decision variables in OFC is categorical in nature. In other words, in a statistical sense, different groups of neurons encode different variables. Of course, our ability to assess the class of a particular cell is limited by neuronal noise, a finite number of trials, and the fact that the encoded variables are correlated with each other. Hence, we certainly make some classification errors. In previous work (Xie and Padoa-Schioppa, 2016), we estimated our classification precision to be ~80%.

Second, previous analyses of this same data set indicated that the cell groups identified in the two tasks are one and the same (Shi et al., 2022).

Third, it is clear that classifying cells using trials from both tasks will (a) reduce our classification errors and (b) avoid biasing any further analysis in favor of one task or the other. Indeed, this is why we classified neurons using both tasks.

Finally, while it is true that the idea of distinct cell groups is central to our analyses and thus to the interpretation, classification errors are not a confounding factor for any of our results. Indeed, classification errors are like noise, and will generally make it harder – not easier – to demonstrate differences between cell groups and correlations between behavioral measures and neuronal measures derived for a particular cell group.

These considerations leave us confident that our results are valid and not artifactual.

2. On page 7, they define the order bias as ε = 2 ρTask2 a4/a3. Why don't they just evaluate the magnitude of the a4 coefficient? If it were 0, wouldn't that indicate no order bias? The sign would indicate the bias for first vs. second, and the magnitude the strength of the bias? Perhaps I am missing something, but that seems like a much more straightforward way to obtain an estimate of order bias from this model.

We added to the Methods a new section ‘Behavioral analyses’ (p.16-17). There we clarified that the definition of order bias used here (ε = 2 ρTask2 a4/a3) is particularly convenient for the present analyses because ε equals the difference ρBA – ρAB (Equation 3). Alternative and valid definitions include ε=a4 and ε=a4/a3. To address this question from R1, we conducted a series of control analyses using these definitions, and found that the results reported in the manuscript did not vary in any substantial way (not shown).
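The relation between ε and the difference ρBA – ρAB can be checked numerically. The sketch below assumes a Task 2 choice model of the form X = a2 + a3 log(qB/qA) + a4 (δAB – δBA), with ρTask2 = exp(–a2/a3), ρAB = exp(–(a2 + a4)/a3) and ρBA = exp(–(a2 – a4)/a3). This model form, the function names, and the coefficient values are assumptions made for illustration; they are not taken from the response above. Under these assumptions, ρBA – ρAB = 2 ρTask2 sinh(a4/a3), which agrees with ε to first order in a4/a3.

```python
import math

def relative_values(a2, a3, a4):
    """Relative values implied by the assumed Task 2 model (illustrative only)."""
    rho_task2 = math.exp(-a2 / a3)        # order-independent relative value
    rho_ab = math.exp(-(a2 + a4) / a3)    # relative value in AB trials (assumed form)
    rho_ba = math.exp(-(a2 - a4) / a3)    # relative value in BA trials (assumed form)
    return rho_task2, rho_ab, rho_ba

def order_bias(a2, a3, a4):
    """Order bias as defined in the text: epsilon = 2 * rho_Task2 * a4 / a3."""
    rho_task2, _, _ = relative_values(a2, a3, a4)
    return 2.0 * rho_task2 * a4 / a3

# Arbitrary example coefficients (hypothetical, not fitted to any data).
a2, a3, a4 = -3.0, 4.0, 0.2
rho_task2, rho_ab, rho_ba = relative_values(a2, a3, a4)
epsilon = order_bias(a2, a3, a4)

print("epsilon         =", epsilon)
print("rho_BA - rho_AB =", rho_ba - rho_ab)
# Alternative definitions mentioned in the text.
print("a4              =", a4)
print("a4 / a3         =", a4 / a3)
```

All three definitions vanish when a4 = 0 and share the sign of a4 when a3 > 0, which is consistent with the control analyses giving similar results.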

3. The authors show that offer value cells exhibit reduced responses for sequential compared to simultaneous offers. Is this true for both offer1 and offer2? They show an example neuron for both offer1 and offer2, but the population activity pools those responses together.

The short answer is that the result held for each time window, but it was statistically significant only for post-offer1. Author response image 1A-D illustrates this point. As a control, we tested whether the drop in activity range measured in Task 2 depended on the time window. Thus we computed the differences in activity range ΔARpost-offer1 = ARTask 2, post-offer1 – ARTask 1, post-offer and ΔARpost-offer2 = ARTask 2, post-offer2 – ARTask 1, post-offer and we examined the relationship between these measures. We did not find any significant difference (Author response image 1E).

Author response image 1
Weaker offer value signals in Task 2, population analysis in individual time windows.

(AB) Post-offer1 time window (N = 53 offer value cells). (CD) Post-offer2 time window (N = 56 offer value cells). Panels A-D are in the same format as Figure 4EF. (E) Comparing the effect across time windows. X- and y-axis represent ΔARpost-offer1 and ΔARpost-offer2, respectively. Across the population, the two measures were statistically indistinguishable.

4. Related to point #1, it sounds like they combine data from task 1 and task 2 when they determine whether a cell is an "offer cell" or a "chosen cell." Are most of the offer value cells categorized based on their task 1 responses (i.e., is the maximum |sum(sR2 )final| the task1 offer value)? I am wondering the following: because there is a pre-stimulus cue at the beginning of each trial that tells the monkey which task it is performing, it's possible that the monkey/OFC treats them as two different tasks, and that slightly different populations of neurons encode offer value in each case. If that were the case, then using the task 1 responses to define the cells as encoding offer value could trivially result in the first finding, that there are reduced offer value responses in task 2. What if the authors determined whether cells were offer value cells based on the task 2 responses only? Does the result hold that the offer value responses are reduced, relative to task 1?

This issue is fully addressed above (point 1). The key point is that we classify cells using information from both tasks – i.e., in an unbiased way. We think that this resolves the issue. That said, we repeated the analysis of Figure 4 having classified neurons only on the basis of Task 2. Although the correlation was understandably weaker, the result of Figure 4 held true (Author response image 2).

Author response image 2
Weaker offer value signals in Task 2, population analysis based on a neuronal classification relying only on Task 2 (N = 74 offer value cells).

Panels A and B are in the same format as Figure 4EF.

5. Figure 4F, the correlation is not very compelling. Is there something different about the sessions in which the sigmoid steepness and δ activity range go in opposite directions (i.e., the upper left and lower right quadrants?)

As discussed in the two previous points, the result shown in Figure 4F is actually quite robust. Of course, the neuronal measures examined here are noisy. We are not aware of anything that would differentiate sessions or cells populating the “wrong” quadrants (2nd and 4th) compared to the “right” ones (1st and 3rd).

6. Throughout the paper, it is hard to remember what the different Greek letters refer to. For instance, on page 13, when ε reappears, I couldn't remember what that referred to. For clarity, it might help to use interpretable terms when possible (e.g., order bias instead of ε).

In the revised manuscript, we added several reminders about what Greek letters stand for.

Reviewer #2 (Recommendations for the authors):

Much of my feedback can likely be gleaned from the public review, but I'll be specific (and probably repetitive) here:

(1) It would be helpful if the authors provided a more detailed framing of existing behavioral economic choice biases and how the current choice effects relate. For terminology, it is a bit confusing that these previous findings are labelled "biases" but that one of the effects discussed here is increased stochasticity (reduced accuracy) rather than a bias – can the authors find some way of being more general about choice effects?

See Public review, weaknesses, point 1.

(2) For the analysis of choice behavior and the use of probit regression, it would help if the authors could quantify the goodness of fit of their choice functions (probit model using log amount ratios), specifically against alternative models (logit, and/or models with amount difference rather than log ratio). The big issue is that the relevant measures (accuracy, order bias, preference bias) depend largely on parameters derived from these fits, so it is important to see that the fits are good and the model is appropriate. For the order effect in particular, have the authors examined the data while relaxing the assumption of a constant sigmoid?

Thanks for raising this question. To address it, we conducted a series of control analyses.

First, we kept the assumption of parallel sigmoids and we repeated the behavioral analysis using different regression functions (logit instead of probit) and different models (value difference instead of log value ratio). For each fit, we derived measures of the relative value (in each task), the sigmoid steepness (in each task) and the order bias (in Task 2), as well as a measure for the goodness of fit. Importantly, the measures obtained for different fits were all very similar. To compare the goodness of fit obtained with different functions and models, we used the Bayesian information criterion (BIC). Author response image 3 illustrates the 3 most relevant comparisons (pooling data for Task 1 and Task 2). In essence, the logit typically provided a better fit than the probit, and the log value ratio model typically provided a better fit than the value difference model. The latter result is consistent with theoretical considerations (Padoa-Schioppa, 2022).

Author response image 3
Comparing behavioral models (N = 241 sessions, pooled Task 1 and Task 2).

(A) BIC difference between probit regression with log ratio vs. logit regression with log ratio. (B) BIC difference between probit regression with log ratio vs. probit regression with linear difference. (C) BIC difference between logit regression with log ratio vs. logit regression with linear difference. A smaller BIC indicates a better fit; therefore, a negative BIC difference indicates that the former model was better, and vice versa. Altogether, logit regression with log quantity ratio shows the best goodness of fit. Data from Task 1 or Task 2 alone show similar results (not shown).
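The kind of comparison summarized in this image can be sketched as follows. This is a minimal illustration, not the analysis code used for the study: the toy choice data are invented, the function names are hypothetical, and a coarse grid search stands in for a proper maximum-likelihood fit. The model assumed here is P(choose B) = F(a0 + a1 log(qB/qA)), with F either the normal CDF (probit) or the logistic function (logit), and BIC = k ln(n) – 2 ln(L).

```python
import math

def probit_cdf(x):
    """Standard normal CDF, used as the probit link."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def logit_cdf(x):
    """Logistic function, used as the logit link."""
    return 1.0 / (1.0 + math.exp(-x))

def log_likelihood(cdf, a0, a1, trials):
    """Log-likelihood of binary choices; trials holds (q_a, q_b, chose_b) with q_a, q_b > 0."""
    tiny = 1e-12
    total = 0.0
    for q_a, q_b, chose_b in trials:
        p_b = cdf(a0 + a1 * math.log(q_b / q_a))
        p_b = min(max(p_b, tiny), 1.0 - tiny)  # guard against log(0)
        total += math.log(p_b if chose_b else 1.0 - p_b)
    return total

def fit_grid(cdf, trials, grid):
    """Coarse grid search; returns (best log-likelihood, a0, a1)."""
    return max((log_likelihood(cdf, a0, a1, trials), a0, a1)
               for a0 in grid for a1 in grid)

def bic(log_lik, n_params, n_trials):
    """Bayesian information criterion; smaller is better."""
    return n_params * math.log(n_trials) - 2.0 * log_lik

# Invented toy data: for each quantity ratio q_b/q_a, the number of B choices out of 10.
counts = {1: 0, 2: 2, 3: 5, 4: 8, 6: 10}
trials = []
for ratio, n_b in counts.items():
    trials += [(1.0, float(ratio), True)] * n_b
    trials += [(1.0, float(ratio), False)] * (10 - n_b)

grid = [i * 0.25 for i in range(-32, 33)]  # candidate values from -8 to 8
probit_fit = fit_grid(probit_cdf, trials, grid)
logit_fit = fit_grid(logit_cdf, trials, grid)

bic_probit = bic(probit_fit[0], 2, len(trials))
bic_logit = bic(logit_fit[0], 2, len(trials))
print("BIC probit - BIC logit =", bic_probit - bic_logit)
```

Because both models have the same number of parameters and are fitted to the same trials, the BIC difference here reduces to twice the difference in log-likelihood. A value-difference model could be compared in the same way by replacing the log ratio with q_b – q_a in `log_likelihood`.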

Second, we repeated all the analyses relating neuronal activity to behavioral measures using measures derived from the logit function using the log value ratio model. We found that all the results held true.

Third, we examined whether assuming parallel sigmoids for AB and BA trials in Task 2 affected the results. Thus we fitted the two groups of trials with independent sigmoids. As illustrated in Author response image 4, we did not find any systematic difference between the two measures of steepness. We went on and repeated the analyses of neuronal activity using behavioral measures obtained fitting independent sigmoids. None of the results presented in the manuscript were substantially affected.

Author response image 4
No steepness difference between AB and BA trials in Task 2 (N = 241 sessions, pooled two monkeys).

The revised manuscript includes a new section in the Methods (‘Behavioral analysis’) where we discuss these issues and describe the results of control analyses.

(3) Re: the documentation of the preference bias, I would like the authors to be clearer in the text about the statistical test of the basic preference effect (higher rho in Task 1 vs. Task 2) – the effect is rather strong, and just needs to be clearly conveyed in the text. I do think that the dependence of the effect on relative value (i.e. rotation of the ellipse) seems to be less well supported: beyond clearing up the presentation of the significance of the results, perhaps this could be placed in a clearer context vis a vis the basic preference bias – I don't think it is necessary to the general findings.

See Public review, weaknesses, point 3.

(4) Please be clearer about how exactly the different effects are detrimental. I think this point relates directly to the initial framing about the difficulty of characterizing the optimality of value-based choice using behavior alone.

See Public review, weaknesses, point 4.

(5) Regarding the interpretation of OFC activity within the Rustichini/Wang modeling framework, I realize that the authors have good reason (and past data) to adopt this interpretation – I think it would be fair to at least acknowledge the explicit assumptions for the general reader.

See Public review, weaknesses, point 5.

(6) One interpretation issue that remains is whether the OFC neural effects are causal or correlational (particularly the accuracy and order effects, which can be seen in OFC activity). One issue that points towards the latter, at least in the order bias, is the sign of the neuronal effects (Figure 5C, Figure 6C). While the behavioral order bias is overwhelmingly positive, the neural effects (difference in AB vs BA rhos, circuit inhibition) do not on average differ from zero, even if they are correlated with the behavioral effect. If these neural effects drive the bias, shouldn't they also be biased in the correct direction? Can the authors provide an interpretation?

We agree that this is an important issue. Aside from other considerations, the experiments and analyses conducted in this study can only show correlations – not causal relations – between neural activity and behavioral effects. Having recently conducted causal experiments, we are very primed to this distinction, and we think that our writing throughout this manuscript is unambiguous about this issue. That said, the signs in Figure 5C and Figure 6C brought up by R2 deserve some discussion.

With respect to Figure 6C, the location of the data cloud on the axis is actually consistent with our predictions and understanding. The reason is that we expect some amount of circuit inhibition (y-axis) even in the absence of order bias. In fact, we think that this amount of circuit inhibition is critical to the decision process under sequential offers (Ballesta and Padoa-Schioppa, 2019). In this view, reduced circuit inhibition would correlate with the order bias. Consistently, the regression line that summarizes the data cloud has a negative intercept (y<0 for x=0; presence of circuit inhibition when the order bias = 0) and positive slope (reduced circuit inhibition correlating with positive order bias).

In contrast, the cloud of data points in Figure 5C differs somewhat from our natural intuition. Specifically, the regression line has a negative intercept (y<0 for x=0). In other words, the entire data cloud is displaced downwards compared to where one might have thought. As a result, if we project the whole cloud on the y axis, the center of the distribution does not differ significantly from zero. As we conclude this study, we don’t have a good interpretation for this phenomenon, and future work will need to revisit this issue. We clarified this point in the figure legend.

(7) Additional points:

– How exactly is the preferred item (A vs. B) defined – across both tasks combined?

Juice A was defined such that (ρTask1 + ρTask2)/2 ≥ 1. (We typically knew ahead of time what juice was preferred; in rare cases where the inequality turned out in the opposite direction, we inverted the definition of A and B.)
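This labeling rule can be written as a short sketch. It assumes that ρ expresses the value of juice A in units of juice B, so that swapping the labels A and B turns each ρ into 1/ρ; that assumption and the function name are illustrative only.

```python
def label_juices(rho_task1, rho_task2):
    """Return (rho_task1, rho_task2, swapped) with A labeled as the preferred juice.

    Assumes rho > 0 and that swapping the labels A and B inverts rho.
    """
    if (rho_task1 + rho_task2) / 2.0 >= 1.0:
        return rho_task1, rho_task2, False
    # Labels were the wrong way round: swap A and B, which inverts both relative values.
    return 1.0 / rho_task1, 1.0 / rho_task2, True

print(label_juices(2.0, 1.5))
print(label_juices(0.5, 0.25))
```

Under this assumption, if the mean of the two relative values is below 1, the mean of their reciprocals is above 1, so a single swap is always enough to satisfy the inequality.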

– Given the initial framing, it seems to me that these data – especially the preference bias effect – provide insight into the goods versus action based choice discussion (and relate to some previous work like Cai and Padoa-Schioppa). Perhaps it would be relevant to discuss this briefly at the end of the paper.

The debate about action-based vs good-based decisions has traditionally been about whether value comparison takes place in the space of actions or in the space of goods. The present results do not revisit that issue – in fact, we start from the assumption that values are compared in goods space. That said, even if one embraces the good-based model, as we do, it is clear that the choice outcome has to be transformed into a suitable action. To this point, which we have made many times before, the present study adds a simple consideration: noise and biases affecting choices could also emerge late, after value comparison. Furthermore, we show that the preference bias indeed emerged late. This result might be perceived as resonating with the idea of a “distributed consensus” involving multiple representations (Cisek, 2012). However, in our understanding of the distributed consensus model, different representations would be engaged in parallel and not serially as we argue here.

– A conceptual question: given that all three effects are presented as suboptimal, why should this be the case given the initial framing that sequential decisions are the prevalent, natural form of choices?

Great question, and we don’t have a definite answer. But here are a few thoughts. In the behavioral economics literature, there has long been the idea that acquiring information or paying attention to all aspects of the offers is costly, and thus in some situations it may be rational to make choices that appear affected by systematic errors (i.e., biases) but that reduce the attentional cost (Simon, 1956; Sims, 2003). Along similar lines, the two biases documented here might reflect some ecological trade-off. For example, maintaining in working memory the value of offer1 until offer2 must have some metabolic cost. The animal would be better off remembering offer1 faultlessly and choosing the higher value, but there is some trade-off between the metabolic cost of working memory and the cost of choosing the lower value once in a while. And if the animal forgets the value of offer1, it makes sense to bias the choice in favor of offer2 (order bias). The important point is that this trade-off involving metabolic costs does not enter explicitly the encoding and the comparison of offer values. In other words, insofar as the metabolic cost of remembering offer1 affects choices, it does so in a meta-decision sense. Similar arguments can be made apropos the preference bias and the drop in choice accuracy. In the revised Discussion, we included a new section ‘The cost of choice biases’, where we discuss these issues.

https://doi.org/10.7554/eLife.75910.sa2

Article and author information

Author details

  1. Weikang Shi

    Department of Neuroscience, Washington University in St. Louis, St. Louis, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Software, Visualization, Writing – original draft, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4068-1168
  2. Sebastien Ballesta

    Department of Neuroscience, Washington University in St. Louis, St. Louis, United States
    Present address
    1. Laboratoire de Neurosciences Cognitives et Adaptatives, Strasbourg, France
    2. Centre de Primatologie de l'Université de Strasbourg, Niederhausbergen, France
    Contribution
    Conceptualization, Data curation, Writing – review and editing
    Competing interests
    No competing interests declared
  3. Camillo Padoa-Schioppa

    1. Department of Neuroscience, Washington University in St. Louis, St. Louis, United States
    2. Department of Economics, Washington University in St. Louis, St. Louis, United States
    3. Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, United States
    Contribution
    Conceptualization, Formal analysis, Funding acquisition, Project administration, Supervision, Writing – original draft, Writing – review and editing
    For correspondence
    camillo@wustl.edu
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-7519-8790

Funding

National Institute of Mental Health (R01-MH104494)

  • Camillo Padoa-Schioppa

McDonnell Center for Systems Neuroscience (CCSN Fellowship)

  • Weikang Shi

The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank H Schoknecht for help with animal training, L Snyder for helpful discussions, and Z Balewski, E Bromberg-Martin, K Conen, A Livi, P Natenzon, T Ott, J Tu, and M Zhang for comments on the manuscript. This research was supported by the National Institutes of Health (grant number R01-MH104494 to CPS) and by the McDonnell Center for Systems Neuroscience (predoctoral fellowship to WS).

Ethics

All the experimental procedures adhered to the NIH Guide for the Care and Use of Laboratory Animals and were approved by the Institutional Animal Care and Use Committee (IACUC) at Washington University (protocol number 190931).

Senior Editor

  1. Michael J Frank, Brown University, United States

Reviewing Editor

  1. Erin L Rich, Icahn School of Medicine at Mount Sinai, United States

Reviewer

  1. Veit Stuphorn, Johns Hopkins University, United States

Publication history

  1. Preprint posted: November 8, 2021
  2. Received: November 28, 2021
  3. Accepted: April 8, 2022
  4. Accepted Manuscript published: April 13, 2022 (version 1)
  5. Version of Record published: April 27, 2022 (version 2)

Copyright

© 2022, Shi et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Weikang Shi
  2. Sebastien Ballesta
  3. Camillo Padoa-Schioppa
(2022)
Neuronal origins of reduced accuracy and biases in economic choices under sequential offers
eLife 11:e75910.
https://doi.org/10.7554/eLife.75910
