
Frontal eye field and caudate neurons make different contributions to reward-biased perceptual decisions

  1. Yunshu Fan
  2. Joshua I Gold
  3. Long Ding (corresponding author)
  1. Department of Neuroscience and Neuroscience Graduate Group, University of Pennsylvania, United States
Research Advance
Cite this article as: eLife 2020;9:e60535 doi: 10.7554/eLife.60535

Abstract

Many decisions require trade-offs between sensory evidence and internal preferences. Potential neural substrates include the frontal eye field (FEF) and caudate nucleus, but their distinct roles are not understood. Previously we showed that monkeys’ decisions on a direction-discrimination task with asymmetric rewards reflected a biased accumulate-to-bound decision process (Fan et al., 2018) that was affected by caudate microstimulation (Doi et al., 2020). Here we compared single-neuron activity in FEF and caudate to each other and to accumulate-to-bound model predictions derived from behavior. Task-dependent neural modulations were similar in both regions. However, choice-selective neurons in FEF, but not caudate, encoded behaviorally derived biases in the accumulation process. Baseline activity in both regions was sensitive to reward context, but this sensitivity was not reliably associated with behavioral biases. These results imply distinct contributions of FEF and caudate neurons to reward-biased decision-making and put experimental constraints on the neural implementation of accumulation-to-bound-like computations.

Introduction

Complex decisions often require interpreting external sensory inputs in the context of outcome expectations and preferences. This kind of decision-making is pervasive in our daily lives, balancing what we observe with what we desire. Under controlled task conditions, both humans and non-human animals tend to achieve this balance in a roughly normative manner. Specifically, when the sensory evidence strongly supports a particular option, decision-makers tend to choose that option independent of alternative expectations and preferences. Conversely, when the sensory evidence is weak, decision-makers tend to make more and faster choices to options with preferred, expected outcomes (Maddox and Bohil, 1998; Voss et al., 2004; Diederich and Busemeyer, 2006; Liston and Stone, 2008; Whiteley and Sahani, 2008; Feng et al., 2009; Summerfield and Koechlin, 2010; Teichert and Ferrera, 2010; Gao et al., 2011; Leite, 2012; Mulder et al., 2012; Blank et al., 2013; Fan et al., 2018; Waiblinger et al., 2019). However, exactly how and where in the brain the computations needed for these flexible decision processes are implemented is not well understood.

The observed patterns of choices and reaction times (RTs) for these kinds of tasks are often consistent with an accumulate-to-bound (drift-diffusion) decision process (Ratcliff, 1978; Maddox and Bohil, 1998; Gold and Shadlen, 2002; Voss et al., 2004; Bogacz et al., 2006; Diederich and Busemeyer, 2006; Bogacz, 2007; Feng et al., 2009; Simen et al., 2009; Krajbich et al., 2010; Summerfield and Koechlin, 2010; Gao et al., 2011; Leite, 2012; Mulder et al., 2012; Blank et al., 2013; Fan et al., 2018). Within this framework, asymmetric reward-choice associations (reward contexts) induce biases in the evidence-accumulation process, the decision bounds for different choice options, or both. The relative contributions of these different forms of reward context-dependent bias likely reflect specific adaptive strategies and can vary by task design, subject, and testing days (Fan et al., 2018).

How does single-neuron activity relate to the computations required for incorporating reward and visual information to form decisions? Previously we trained monkeys to perform an asymmetric-reward direction-discrimination task (Figure 1A), in which the monkeys report their perception of the global motion direction of a noisy stimulus with eye movements under different reward contexts. We showed that activity of some caudate neurons was sensitive to both reward context and motion stimulus. In addition, caudate microstimulation affected the monkeys’ reward biases in a manner that reflected coordinated changes in drift rates and relative bound heights in a drift-diffusion decision framework (Doi et al., 2020).

Figure 1
Monkeys biased toward choices associated with large reward.

(A) Task design and timeline. Monkeys reported the perceived motion direction with a saccade to one of the two choice targets. The motion stimulus was turned off upon detection of the saccade. Correct trials were rewarded based on the reward context. Error trials were not rewarded. The color bars in the timeline indicate epoch definitions for the regression analysis of neural firing rates in Equation 1. (B) Average choice (top) and RT (bottom) behavior of three monkeys for sessions with FEF and caudate recordings. The FEF dataset (black) included 16,561 trials from 33 sessions for monkey F, 7924 trials from 23 sessions for monkey C, and 24,419 trials from 69 sessions for monkey A. The caudate dataset (red) included 26,614 trials from 69 sessions for monkey F, 21,076 trials from 44 sessions for monkey C, and 6309 trials from 17 sessions for monkey A. Filled and open circles: data from the two reward contexts. Similar results were reported previously for sessions with caudate recordings (Doi et al., 2020). (C) Histograms of reward bias for all sessions, estimated using logistic fits to choice data. Note that the bias magnitude varied across monkeys and sessions, depending on the large:small reward ratio, the motion-coherence levels used in a given session, and the monkeys’ inherent perceptual sensitivity (Fan et al., 2018).

To provide additional insights into the neural implementation of these decision computations in multiple brain regions, we examined both the caudate nucleus and one of its major cortical input sources, the frontal eye field (FEF) of the lateral prefrontal cortex. Neurons in these two regions contribute to perceptual and reward-based decision making along with reward-modulated motor performance (for a very limited sample, see Thompson et al., 1996; Thompson et al., 1997; Kawagoe et al., 1998; Kim and Shadlen, 1999; Freedman et al., 2001; Schall, 2001; Coe et al., 2002; Kobayashi et al., 2002; Lauwereyns et al., 2002b; Lauwereyns et al., 2002a; Roesch and Olson, 2003; Heekeren et al., 2004; Samejima et al., 2005; Ding and Hikosaka, 2006; Nakamura and Hikosaka, 2006a; Nakamura and Hikosaka, 2006b; Boettiger et al., 2007; Lau and Glimcher, 2007; Lau and Glimcher, 2008; Pan et al., 2008; Ferrera et al., 2009; Basten et al., 2010; Ding and Gold, 2010; Cai et al., 2011; Ding and Gold, 2012c; Ding and Gold, 2012a; Heitz and Schall, 2012; Seo et al., 2012; Kim and Hikosaka, 2013; Teichert et al., 2014; Yanike and Ferrera, 2014b; Ding, 2015; Hanks et al., 2015; Santacruz et al., 2017; Amemori et al., 2018; Schall, 2019). Functional imaging and modeling studies also suggest that the two regions are involved in complex decisions that balance visual evidence and reward expectation to guide appropriate movements (Rao, 2010; Summerfield and Koechlin, 2010; Chen et al., 2015). However, their specific computational roles in these decisions, represented at the single-neuron level, remain largely speculative.

In this study, we focused on three questions: (1) How do FEF and caudate neurons encode key task factors including choice, motion strength, reward context, and reaction times (RT)? (2) Do any of these task-related modulations in either brain area reflect the monkeys’ behaviorally derived biases in drift rates? (3) Do any of these task-related modulations in either brain area reflect the behaviorally derived biases in relative bound heights?

Results

We recorded from 149 FEF neurons from three monkeys (n = 85, 24 and 40 from monkeys A, C and F, respectively) and, in separate sessions, from 140 caudate neurons from the same monkeys (n = 18, 49, and 73 from monkeys A, C and F, respectively) performing the asymmetric-reward direction-discrimination task. As we reported previously (Fan et al., 2018), the monkeys’ choices and RTs tended to reflect the strength (coherence) and direction of the visual motion stimulus but with a bias toward the large-reward option (Figure 1B,C).

Diverse task-relevant sensory and reward encoding in both brain regions

Individual neurons in both the FEF and caudate showed a diversity of task-driven responses (several examples are illustrated in Figure 2; population summaries are shown in Figure 2—figure supplement 1 and Figure 3). The FEF neuron in Figure 2A responded to choice target presentation with phasic (transient) and tonic (sustained) activation, showed a dip in activity after motion onset, then had gradually increasing activity (more for trials resulting in a contralateral choice) during motion viewing until a saccade-related burst for the contraversive saccade and a return to baseline activity for the other saccade. The FEF neuron in Figure 2B was activated after target onset, with higher activation when the contralateral choice was paired with large reward (red curves > green curves). This modulation by reward context persisted during a gradual ramp in activity during motion viewing (more for trials with the contralateral choice and for blocks when the contralateral choice was paired with large reward; t-test for H0: regression coefficient for reward context = 0, p<0.05 for all epochs 1–8). This neuron also showed a saccade-related burst for the contraversive saccade. The FEF neuron in Figure 2C showed phasic activation by choice targets and motion onset, with activity that decreased during motion viewing, more gradually for the contralateral choice and higher coherences (compare curves with different shades), until reaching a saccade-related suppression.

Figure 2 with 1 supplement
Task-related activity in FEF and caudate neurons.

(A-C) Activity of three example FEF neurons. For display purposes, average spike count was measured for correct trials only and convolved with a Gaussian kernel (sd = 40 ms). Green colors: large reward was paired with the ipsilateral choice. Red colors: large reward was paired with the contralateral choice. Shades: coherence levels. For alignment to motion onset, activity was truncated at 100 ms before the median reaction time. For alignment to saccade onset, activity was truncated at 200 ms after the median time for motion onset. (D-E) Activity of two example caudate neurons. Same format as A.
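The display smoothing mentioned in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 1 ms bin width, the kernel truncation at three standard deviations, and the edge renormalization are all assumptions.

```python
import math

def gaussian_kernel(sd_ms, bin_ms=1.0, n_sd=3.0):
    """Discrete Gaussian kernel, truncated at n_sd standard deviations
    (an assumed cutoff) and normalized to sum to 1."""
    half = int(round(n_sd * sd_ms / bin_ms))
    weights = [math.exp(-0.5 * (i * bin_ms / sd_ms) ** 2)
               for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_rate(spike_counts, sd_ms=40.0, bin_ms=1.0):
    """Convert trial-averaged spike counts per bin into a smoothed
    firing rate in spikes/s. Near the edges the kernel is renormalized
    over the available bins (a design choice, not from the source)."""
    kernel = gaussian_kernel(sd_ms, bin_ms)
    half = len(kernel) // 2
    n = len(spike_counts)
    rates = []
    for i in range(n):
        acc = 0.0
        norm = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < n:
                acc += w * spike_counts[idx]
                norm += w
        rates.append(acc / norm * 1000.0 / bin_ms)
    return rates
```

Because the kernel is renormalized at the edges, a constant spike count maps to a constant rate across the whole trace, so smoothing does not create artificial ramps at the truncation points.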

Figure 3 with 7 supplements
Comparison of task-related modulation of FEF and caudate activity.

(A) Fractions of FEF (black) and caudate (red) neurons showing significant regression coefficients in the multiple linear regression in Equation 2. Criterion: t-test, p<0.05. Dashed lines: chance level, adjusted for the number of comparisons. Filled circles: the fraction was significantly greater than chance level (Chi-square test, p<0.05/72 (8 epochs x 9 comparisons)). ‘Coherence’ and ‘Coh x Rew’: neurons with significant coefficients for either choice. Vertical color bars indicate epochs defined in Figure 1A. Stars indicate epochs in which the fractions differed between FEF and caudate populations (Chi-square test, p<0.05/72). (B) Fraction of neurons with joint modulation by coherence and reward-related terms. Same format as A. (C, D) Fractions of neurons showing significant regression coefficients in the multiple linear regression in Equation 3. Same format as A and B. (E-I) Fractions of neurons showing significant non-zero regression coefficients for different regressors (Equation 3). Results from RT-reward interaction terms were omitted because both regions showed near chance-level fractions. Dashed horizontal lines: chance level. Only neurons tested with non-vertical motion stimuli were included (n = 126 and 136 for FEF and caudate, respectively).
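The corrected significance criterion in the caption (p<0.05/72, i.e., p below roughly 0.00069) can be illustrated with a one-degree-of-freedom chi-square goodness-of-fit test of an observed fraction against chance. This is a sketch under stated assumptions: the exact form of the chi-square test used in the study is not given here, the default chance level of 0.05 is assumed, and the function names are hypothetical.

```python
import math

def chi2_sf_1dof(x):
    """Survival function of the chi-square distribution with 1 degree
    of freedom: P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))

def fraction_above_chance(n_sig, n_total, chance=0.05,
                          alpha=0.05, n_comparisons=72):
    """Test whether n_sig out of n_total exceeds the chance fraction.

    Uses a goodness-of-fit chi-square statistic over the two cells
    (significant, not significant) and a Bonferroni-corrected
    threshold alpha / n_comparisons.
    Returns (is_above_chance, p_value, threshold).
    """
    expected_sig = chance * n_total
    expected_non = n_total - expected_sig
    observed_non = n_total - n_sig
    chi2 = ((n_sig - expected_sig) ** 2 / expected_sig
            + (observed_non - expected_non) ** 2 / expected_non)
    p_value = chi2_sf_1dof(chi2)
    threshold = alpha / n_comparisons
    is_above = (n_sig / n_total > chance) and (p_value < threshold)
    return is_above, p_value, threshold
```

The explicit check that the observed fraction exceeds chance keeps the test one-directional in practice, since the chi-square statistic alone does not distinguish fractions above chance from fractions below it.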

The caudate neuron in Figure 2D did not respond to target onset but was activated after motion onset, with higher activity for the contralateral choice, at higher coherence, and in blocks when the contralateral choice was paired with large reward. These coherence and reward-context modulations persisted through saccade onset, with no convergence before saccade onset. The caudate neuron in Figure 2E also did not respond to target onset but was activated after motion onset for both choices, with a preference for ipsilateral choices that were paired with the small reward. After an initial large activation, this neuron gradually reduced firing toward saccade onset, largely maintaining reward-context and coherence modulation until after saccade onset.

These diverse trends were apparent across the populations of recorded FEF and caudate neurons. Most neurons in our sample had responses that were modulated, on average, over multiple time points during each trial, albeit with differences across neurons and brain areas. FEF neurons typically had elevated responses that began just after target onset and then persisted through motion viewing until the saccadic response (Figure 2—figure supplement 1A; for most neurons, spike rate increased just after target onset). Caudate neurons typically did not respond strongly to target onset but then had elevated responses during motion viewing and through the saccadic response (Figure 2—figure supplement 1B). Both regions included neurons with activity that increased and/or decreased relative to baseline at various time points during each trial.

To assess how these responses were modulated by choice, coherence, reward context, expected reward size for a given choice, and reaction time (RT), we first used linear regression applied to neural data in pre-defined task epochs in Figure 1A (Figure 3A–D). Because neurons in both brain areas represent a coherence- and time-dependent decision process (Kim and Shadlen, 1999; Ding and Gold, 2010; Ding and Gold, 2012c) that can conflate the effects of those two factors on neural responses, and because RT was modulated strongly by reward context in the current task, we conducted two sets of analyses: (1) using all of the factors listed above including coherence but not RT (Figure 3A,B), and (2) using all of the factors listed above including normalized RT (normalized separately for each reward context x choice combination) but not coherence (Figure 3C,D).
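The RT normalization used in the second set of analyses can be sketched as below. The text states only that RT was normalized separately for each reward context x choice combination; z-scoring is an assumption made for this illustration, and the function name and dictionary keys are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

def normalize_rt_by_group(trials):
    """Z-score RT within each (reward_context, choice) combination.

    trials: list of dicts with keys 'rt', 'reward_context', 'choice'
            (hypothetical field names).
    Returns normalized RTs in the original trial order.
    """
    groups = defaultdict(list)
    for tr in trials:
        groups[(tr["reward_context"], tr["choice"])].append(tr["rt"])

    stats = {}
    for key, rts in groups.items():
        sd = pstdev(rts)
        # Guard against zero spread (e.g., a single-trial group).
        stats[key] = (mean(rts), sd if sd > 0 else 1.0)

    normalized = []
    for tr in trials:
        mu, sd = stats[(tr["reward_context"], tr["choice"])]
        normalized.append((tr["rt"] - mu) / sd)
    return normalized
```

Normalizing within each combination removes the mean RT differences that reward context and choice produce on their own, so the RT regressor captures trial-to-trial variation rather than duplicating the reward-context and choice regressors.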

These epoch-based analyses showed several differences between the FEF and caudate populations, including: (1) a larger fraction of FEF neurons showed choice selectivity around and after saccade onset (Figure 3A and C, first column); (2) although selectivity for reward context emerged before motion onset in both populations and persisted through a trial, a larger fraction of caudate neurons showed such selectivity during motion viewing (second column); (3) a larger fraction of caudate neurons showed selectivity for reward size (third column); and (4) both populations showed significant coherence selectivity after motion onset, but a larger fraction of caudate neurons remained coherence-selective after a saccade was made (Figure 3A, fourth column). The caudate, but not FEF, population showed above-chance fractions of neurons with joint modulation by both reward and motion coherence (Figure 3B). Activity in both regions was related to the RT in similar fashions (Figure 3C and D). The RT-based regression also captured a larger variance of activity than the coherence-based regression in both populations (t-test on the explained variance, p<0.0001 and p=0.007 for FEF and caudate, respectively).

To examine these task-related modulations at a finer time resolution, we applied the RT-based linear regression to neural data in sliding windows (Figure 3E–I). These analyses produced results that were consistent with the epoch-based analyses and showed further between-region differences in the timing and direction of task-related activity modulations. Selectivity for choice tended to increase during motion viewing until around the saccade, with stronger selectivity for contralateral/upward choices in both regions but particularly in the FEF (Figure 3E). Selectivity for reward context was evident before motion onset and continued toward saccade onset, with a mixture of preferences for the two reward contexts (Figure 3F). For FEF neurons, a higher fraction preferred the contralateral-Large Reward context before and during early motion viewing and similar fractions preferred either context before saccade onset. Although this was also true for the caudate population, the extent of the laterality was weaker. Selectivity for reward size, independent of the actual choice made, was most evident for data aligned to the saccade, with similar fractions of caudate neurons preferring large or small reward (Figure 3G). Very few FEF neurons showed reward-size selectivity. Selectivity for RT was evident in both regions, with a dominant preference for short RTs associated with contralateral choices (Figure 3H) and mixed preferences otherwise (Figure 3I). These general patterns were present in the three monkeys for both FEF and caudate data (Figure 3—figure supplements 1–3).

To further characterize the modulation patterns, we applied the demixed principal component analysis (dPCA) method to the two populations (Kobak et al., 2016). Although our sample size was relatively small and trials were inherently unbalanced for different reward-choice-coherence combinations for this method, the dPCA results corroborated several findings from the multiple linear regression analysis (Figure 3—figure supplements 4–7), including: (1) choice-related components tended to account for a larger portion of variance in FEF activity than caudate activity (panels B and C in each figure, purple); (2) reward context-related components tended to account for a larger portion of variance in caudate activity than FEF activity (orange), particularly around saccade onset; (3) coherence-related components tended to account for a larger portion of variance in FEF activity around motion onset than for activity around saccade onset, while the opposite was true for caudate activity (cyan); and (4) coherence and reward context or size interactions accounted for substantial variance for both regions (dark and light green). Collectively, these results indicated that neurons in both FEF and caudate represent a variety of task-relevant signals that could, in principle, support reward-biased perceptual decisions, but with different prevalences and preferences.

Predictions of the biased decision variable in a drift-diffusion framework

As we showed previously, these monkeys’ patterns of choices and RTs on this task were consistent with a drift-diffusion model (DDM; Fan et al., 2018). According to this model, a decision is formed when accumulated motion evidence reaches one of two pre-defined (collapsing) decision bounds (Figure 4A). The monkeys’ reward-driven biases arose from coordinated, reward context-dependent adjustments of the rate of accumulation (drift rate, which scales with motion coherence) and relative bound heights (Figure 4E; for more details see Fan et al., 2018). A bias in the drift rate (ΔDrift, corresponding to the me parameter in the DDM) can be implemented as a constant offset to the momentary motion evidence (Figure 4B). A bias in the relative bound heights (ΔBound, corresponding to the z parameter in the DDM) can be implemented as an offset in the starting value of the accumulation process (Figure 4C), an asymmetry in the absolute bound heights for the two choices (Figure 4D), or a combination of the two.

Figure 4
DDM illustration and fitted reward bias terms.

(A) Drift-diffusion model (DDM). Evidence is accumulated over time into a decision variable (DV). A decision is made when DV crosses either collapsing bound. Thin noisy lines represent simulated DVs for two coherence levels and two motion directions (three trials for each combination). The straight lines represent the average DVs. (B-D) Illustration of different implementations of a bias favoring the upper-bound choice. (B) Drift rates are biased by adding a constant positive value to the evidence, resulting in steeper slopes for motion to the upper-bound choice and shallower slopes for motion to the lower-bound choice. (C) The accumulation begins with a positive starting value. (D) The accumulation ends at a lower absolute value for upper-bound choices than for lower-bound choices. (E) Summary of reward biases in drift and bound terms from DDM fits for the three monkeys. Positive values indicate biases toward the large-reward choice. Black and red data points represent sessions with FEF and caudate recordings, respectively.

These different ways of implementing reward-driven biases correspond to different predictions of how and when the accumulating decision variable is modulated by reward context during a trial. Specifically, in the presence of a bias in the drift rate, the slope of the decision variable would be modulated by reward context during evidence viewing. In the presence of a bias in the starting value, the baseline value of the decision variable would be modulated by reward context before evidence onset. In the example in Figure 4C, the baseline value would be higher when the reward bias favors the upper-bound choice. In the presence of a bias in the bound heights, the decision variable would be modulated by reward context at the time of decision commitment. In the example in Figure 4D, the ending value would be closer to the starting point of the evidence accumulation when the reward bias favors the upper-bound choice.
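The three ways of implementing a bias described above can be made concrete with a small simulation. The following is a minimal sketch and not the model fitted in Fan et al. (2018): it uses fixed rather than collapsing bounds, and the function names and all parameter values (`k`, `me`, `start`, `upper`, `lower`, `dt`) are illustrative assumptions.

```python
import random

def simulate_ddm(coherence, k=10.0, me=0.0, start=0.0,
                 upper=1.0, lower=-1.0, dt=0.005, noise_sd=1.0,
                 max_t=5.0, rng=None):
    """Simulate one trial of a simplified drift-diffusion process.

    coherence: signed motion strength (positive favors the upper bound).
    k:         scaling from evidence to drift rate (illustrative value).
    me:        constant offset added to the momentary evidence
               (a drift-rate bias, as in Figure 4B).
    start:     starting value of the decision variable (Figure 4C).
    upper, lower: fixed bound heights; unequal magnitudes give the
               bound asymmetry of Figure 4D.
    Returns (choice, decision_time); choice is +1 (upper), -1 (lower),
    or 0 if no bound is reached before max_t.
    """
    if rng is None:
        rng = random.Random()
    drift = k * (coherence + me)
    step_sd = noise_sd * dt ** 0.5
    dv, t = start, 0.0
    while t < max_t:
        dv += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
        if dv >= upper:
            return 1, t
        if dv <= lower:
            return -1, t
    return 0, t

def fraction_upper(n_trials=1000, seed=0, **ddm_kwargs):
    """Fraction of simulated trials ending at the upper bound."""
    rng = random.Random(seed)
    n_upper = 0
    for _ in range(n_trials):
        choice, _t = simulate_ddm(rng=rng, **ddm_kwargs)
        n_upper += (choice == 1)
    return n_upper / n_trials
```

For example, comparing `fraction_upper(coherence=0.0)` with `fraction_upper(coherence=0.0, me=0.05)`, `fraction_upper(coherence=0.0, start=0.3)`, and `fraction_upper(coherence=0.0, upper=0.7)` shows that each manipulation raises the proportion of upper-bound choices at zero coherence. They differ in when the decision variable departs from the unbiased case: during accumulation, before evidence onset, or at decision commitment, respectively.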

We examined whether FEF and caudate activity conform to these predictions. Given the asymmetric effects of these predicted biases on the two choices, we focused on neurons with reliable choice selectivity (Figure 5) and present results from other neurons as supplements when appropriate. The ‘choice-selective’ neurons were identified as showing significant and consistent choice modulation through motion viewing (epoch #5 in Figure 1) and before saccade onset (epoch #6), based on RT-based regression analysis results shown in Figure 3. The numbers of neurons meeting these criteria for the three monkeys are shown in Table 1. The average activity of choice-selective and other neurons is shown in Figure 5. Note that because different coherence levels were used for the three monkeys, we grouped the trials by quintiles of RT for these plots.
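Grouping trials by RT quintile, as done for the average-activity plots, can be sketched as follows. This is a rank-based illustration with a hypothetical function name; whether quintiles were computed per session, per reward context, or per choice is not specified in this section, so the sketch simply operates on whatever list of RTs it is given.

```python
def rt_quintile_labels(rts):
    """Assign each trial an RT quintile label.

    Returns a list of integers in the original trial order, where
    0 marks the fastest fifth of trials and 4 the slowest fifth.
    Ties are broken by trial order because the sort is stable.
    """
    n = len(rts)
    order = sorted(range(n), key=lambda i: rts[i])
    labels = [0] * n
    for rank, trial_index in enumerate(order):
        labels[trial_index] = rank * 5 // n
    return labels
```

Grouping by RT rank rather than by coherence lets data from monkeys tested with different coherence levels be pooled on a common axis.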

Figure 5
Average activity for neuron categories.

(A) Average firing rates of neurons with significant and consistent choice selectivity. See Table 1 for number of neurons in each category. Trials were grouped by choice (left and right columns), reward context (magenta/green), and RT quintiles (shade). Activity was aligned to motion and saccade onsets for the top and bottom rows, respectively. Only correct trials were included. For motion onset alignment, firing rates were truncated at the median RT minus 100 ms for each group. For saccade onset alignment, firing rates were truncated before median motion onset plus 200 ms for each group. For display purposes, firing rates were convolved with a Gaussian kernel (sigma = 25 ms). (B) Average firing rates of other neurons. Same format as A.

Table 1
Summary of counts/percentages for neurons with task-modulated activity.
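| Category | FEF, Monkey A | FEF, Monkey C | FEF, Monkey F | Caudate, Monkey A | Caudate, Monkey C | Caudate, Monkey F |
|---|---|---|---|---|---|---|
| Total | 85 | 24 | 40 | 18 | 49 | 73 |
| Consistently choice-selective | 35 (41%) | 14 (58%) | 7 (18%) | 6 (33%) | 31 (63%) | 21 (29%) |
| Coherence-modulated slope of firing rate during motion viewing | 44 (52%) | 12 (50%) | 26 (65%) | 10 (56%) | 33 (67%) | 41 (56%) |
| Reward context-modulated slope of firing rate during motion viewing | 21 (25%) | 5 (21%) | 3 (8%) | 2 (11%) | 9 (18%) | 15 (21%) |
| Reward context-modulated activity before motion onset | 38 (45%) | 12 (50%) | 19 (48%) | 15 (83%) | 34 (69%) | 39 (53%) |
| Reward context-modulated activity just before saccade onset | 51 (60%) | 16 (67%) | 26 (65%) | 15 (83%) | 34 (69%) | 49 (67%) |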

Even using these common criteria for categorization, the average activity patterns of neurons in the same category differed for FEF and caudate. First, consider the choice-selective subpopulations. Whereas choice-selective FEF activity appeared to be roughly consistent with bound-crossing in the DDM (i.e., reaching a fixed level of activity at the end of the decision process and before saccade onset, regardless of the time it took to reach the decision), choice-selective caudate activity did not (Ding and Gold, 2010; Ding and Gold, 2012b). For trials in which the monkey made the preferred choice of the given neuron, the slope of FEF activity during motion viewing appeared to show more separations than the slope of caudate activity, between reward contexts and RT groups. The baseline FEF activity before motion onset appeared to differ more between reward contexts. The peri-saccade activity for the preferred choice appeared to show opposite selectivity for reward context in the two regions (the magenta curves tended to be above and below the green curves for FEF and caudate, respectively).

Second, in the other subpopulations that did not exhibit consistent choice selectivity, the average caudate activity appeared to maintain RT separation through saccade generation and onward, whereas the average FEF activity appeared to converge around saccade onset (Figure 5B). These apparent differences suggest that activity in the two regions may relate differently to the predictions of the DDM, which we examine in more detail below.

FEF activity reflected behaviorally derived reward-driven drift-rate biases

We first examined whether FEF and caudate activity reflected evidence accumulation with a reward-driven bias in the drift rate. As illustrated in Figure 4B, such a signal is expected to show two features in neural activity. First, the rate of accumulation depends on motion coherence. For individual neurons, this dependence translates to motion-coherence modulation of the slope of firing rates during motion viewing. Figure 6A and B illustrate our procedure for estimating the slope of change in firing rates and its modulation by coherence, reward context, and their interaction. Second, the reward-context modulation of the slope of change reflects the behavioral reward bias in drift rate. In the model, the reward bias in drift rates is independent of coherence and the drift-rate scaling and dependent only on reward context. The corresponding modulation of the (slope of) activity of individual neurons is thus by reward context alone and not by the reward context-coherence interaction. Because neurons showed substantial variations in their firing-rate ranges, we used each neuron’s modulation by coherence to normalize its modulation by reward context. The second expectation thus translates to a correlation between this normalized quantity and the behaviorally estimated bias in drift rate across neurons/sessions.
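The normalized quantity described above, i.e., the reward-context modulation of the firing-rate slope divided by its coherence modulation, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis code: reward context is coded as +1 or -1 (an assumed coding), the function names are hypothetical, and the interaction model is fitted by fitting one line per reward context, which yields the same least-squares estimates for a model with coherence, reward context, and their interaction.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def reward_to_coherence_ratio(coherence, context, fr_slope):
    """Ratio of reward-context to coherence modulation of the slope.

    coherence: motion coherence for each condition or trial.
    context:   +1 if the neuron's preferred choice is paired with the
               large reward, -1 otherwise (assumed coding).
    fr_slope:  slope of the firing rate over time in one window, e.g.
               from fit_line(times, rates)[1].
    Model: fr_slope = b0 + b_coh*coh + b_rew*ctx + b_int*coh*ctx.
    Returns (ratio, (b_coh, b_rew, b_int)) with ratio = b_rew / b_coh.
    """
    pos = [(c, y) for c, r, y in zip(coherence, context, fr_slope) if r > 0]
    neg = [(c, y) for c, r, y in zip(coherence, context, fr_slope) if r < 0]
    a_pos, s_pos = fit_line([c for c, _ in pos], [y for _, y in pos])
    a_neg, s_neg = fit_line([c for c, _ in neg], [y for _, y in neg])
    b_rew = (a_pos - a_neg) / 2.0   # half the offset at zero coherence
    b_coh = (s_pos + s_neg) / 2.0   # mean coherence dependence
    b_int = (s_pos - s_neg) / 2.0   # context-dependent change in scaling
    return b_rew / b_coh, (b_coh, b_rew, b_int)
```

Dividing by the coherence coefficient expresses each neuron's reward-context effect in units of its own coherence sensitivity, which makes the quantity comparable across neurons with very different firing-rate ranges.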

Figure 6 with 2 supplements
Reward-context modulation of the rate of change in FEF more closely reflected reward bias in drift rates.

(A) Illustration of measurements of different modulations of the rate of change for a single neuron. Left: average firing rates of the example neuron in Figure 2B for its preferred choice and aligned to motion onset. In 200 ms sliding windows, linear regressions were performed to estimate the slope of firing-rate changes as a function of time, coherence, reward context and their combination. Right: slope values for the sliding window in the left panel. A multiple linear regression was performed with coherence, reward context and their interaction as the regressors (lines). The offset between the two reward contexts at zero coherence (filled triangles) represents the magnitude of reward-context modulation in the regression. (B) The regression coefficients of the linear regression for different sliding windows for the example neuron. Filled circle: coefficient was significantly different from zero (t-test, p<0.05). For each neuron, the time with the largest absolute coherence modulation was identified (arrow). For the alignment to motion onset, a minimum 100 ms visual latency was imposed. (C) Coefficient values for FEF (top) and caudate (bottom) neurons with significant coherence-modulated slope values for trials with the preferred choices. (D) Scatter plots of the ratio of regression coefficients for reward context and coherence modulation (abscissa) and the behavioral bias in drift rates (from DDM fits, ordinate), for FEF (top) and caudate (bottom) neurons with significant coherence modulation. Preferred choice only. Slope values were measured from activity aligned to motion (left) and saccade (right) onset. Line and shaded area: linear regression with significant non-zero slope (t-test, p<0.05) and 95% confidence interval. Colors indicate neurons from the three monkeys.

Figure 6—source data 1

Source data for Figure 6D: FEF activity aligned to motion onset.

https://cdn.elifesciences.org/articles/60535/elife-60535-fig6-data1-v1.csv
Figure 6—source data 2

Source data for Figure 6D: FEF activity aligned to saccade onset.

https://cdn.elifesciences.org/articles/60535/elife-60535-fig6-data2-v1.csv
Figure 6—source data 3

Source data for Figure 6D: caudate activity aligned to motion onset.

https://cdn.elifesciences.org/articles/60535/elife-60535-fig6-data3-v1.csv
Figure 6—source data 4

Source data for Figure 6D: caudate activity aligned to saccade onset.

https://cdn.elifesciences.org/articles/60535/elife-60535-fig6-data4-v1.csv

We found that many neurons in both regions showed motion-coherence modulation of the slope of firing rates (Figure 6, Table 1), consistent with an involvement of both regions in evidence accumulation (Ding and Gold, 2010; Ding and Gold, 2012b). For choice-selective FEF neurons, the slope of change tended to be greater for higher coherence for trials with the neurons’ preferred choices (i.e., positive coefficients; Figure 6C). Many FEF neurons also showed opposite modulation for trials with the null choices, but these effects were inconsistent, reflecting the lower reliability in estimating slope values from low firing rates. For choice-selective caudate neurons, the slope of change did not show a consistent relationship with coherence for either the preferred or null choices. The overall magnitude of the coefficients tended to be smaller for caudate neurons, reflecting the lower firing rates of caudate neurons.

FEF activity also aligned closely with the monkeys’ reward-driven bias in drift rates across sessions and monkeys. All three monkeys tended to use positive reward biases in drift rates; that is, toward the large-reward choice (Figure 4E and Figure 6D). The ratios of regression coefficients for reward context and coherence also tended to be positive for choice-selective neurons in the FEF (Figure 6D, top row). Moreover, there was a significant correlation between the monkeys’ behavioral bias in drift rates and the ratio measured from neural data (Pearson’s correlation coefficient: 0.55 and 0.48, p=0.0084 and 0.0077, for activity aligned to motion and saccade onset, respectively). As expected given the smaller sample sizes, none of the per-monkey results was statistically significant (Figure 6—figure supplement 2). These results indicated a close relationship between FEF neurons and the neural implementation of reward biases in drift rates assessed across monkeys. In the caudate sample, the behavioral bias was mostly positive, but the neural ratio was more mixed, and the two measurements did not exhibit a significant positive correlation (Figure 6D, bottom row; note that there was a significant negative correlation for caudate activity aligned to motion onset, correlation coefficient: −0.38, p=0.032). These results appeared inconsistent with a direct involvement of caudate neurons in implementing the reward bias in drift rates (see Discussion for a potential sampling bias).
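The across-neuron comparison reported above rests on Pearson's correlation between the neural ratio and the behavioral drift-rate bias. A minimal sketch of that computation is shown below; the function names are hypothetical, and converting the t statistic to a p-value (via a t distribution with n - 2 degrees of freedom) is left out.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between paired samples."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t_stat(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of
    freedom. Undefined when |r| equals 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

Here `x` would hold one neural ratio per neuron and `y` the behavioral bias from the corresponding session, so each point in the scatter plot is a neuron/session pair.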

Although the DDM does not provide predictions for neurons without consistent choice selectivity, these neurons may participate in other aspects of decision-making, such as decision evaluation, that also use information about the reward biases. We performed the same analysis for these neurons. In the FEF, the reward-context modulation of the slope of the firing rates also covaried with the monkeys’ reward biases for activity aligned to motion onset, regardless of choices (Figure 6—figure supplement 1C; correlation coefficients: 0.55 and 0.38, p=0.003 and 0.018, for contralateral and ipsilateral choices, respectively). There was also a significant correlation for these not-choice-selective neurons in the caudate sample, for activity aligned to saccade onset in trials with ipsilateral choices (Figure 6—figure supplement 1D; correlation coefficient: 0.37, p=0.0095). Thus, FEF and caudate neurons might carry information about reward biases in drift rates for computations that are not directly related to decision formation.

In addition, a small number of neurons showed significant modulation of the slope of firing rates by the reward context-coherence interaction. In the DDM, such a modulation may relate to reward context-dependent changes in the scaling factor, k. However, the small sample size precluded the detection of any such relationship (data not shown).

Reward context-modulated baseline activity was inconsistent with reward biases in relative bound heights

As we showed above, reward-driven biases in the relative bound heights of the DDM, ‘bound bias’ in short, can, in principle, be implemented as an offset to the beginning of the accumulation process (Figure 4C), an offset to the end of the accumulation process (Figure 4D), or the combined effects of the two. Neural activity reflecting such biases is expected to show three features. First, the neural activity should be sensitive to reward context. Second, the sign of its reward-context modulation should be congruent with the reward bias. For example, if the monkey uses the bound bias to favor the large-reward choice, when its preferred choice is paired with the large reward, then the neuron should increase its baseline firing before motion onset (as an offset to the beginning of the accumulation) or decrease its firing before saccade onset (as an offset to the end of the accumulation). Third, in consideration of our lack of knowledge of whether a neuron plays an excitatory or inhibitory role in the decision network, we can relax our expectation for sign congruency. However, we may still expect that, on trials when the reward-context modulation of neural activity is strong, the monkey uses a larger bound bias. We tested these predictions on choice-selective neurons. Note that similar predictions cannot be specified for neurons without choice selectivity.

We found that many choice-selective neurons showed reward context-modulated activity before motion onset and/or before saccade onset in both regions (Figure 7A–D). We assessed the reward context modulation in running windows covering two time periods around motion onset and before saccade, respectively. Figure 7A–D shows heatmaps of regression coefficients for reward context using Equation 3 (same as Figure 3F, but only for choice-selective neurons with significant non-zero values in any time bins). A quick glance suggested that FEF neurons tended to show positive coefficients before motion onset (Figure 7A; warm colors: higher activity when the neuron’s preferred choice was paired with large reward). The coefficients were more mixed in signs for FEF activity before saccade onset (Figure 7C) and for caudate activity (Figure 7B and D).

Figure 7 (with 2 supplements)
Reward-context modulation of neural activity did not conform to predictions of reward bias in relative bound heights.

(A,B) Heatmaps of normalized regression coefficients for choice-selective FEF (A) and caudate (B) neural activity before/around motion onset (Equation 3). Only neurons with significant modulation in at least one time bin are shown (t-test, p<0.05). Neurons were sorted by bound bias values (color bar to the right), measured with DDM fits. Coefficients were normalized by the maximal absolute value for each neuron for better visualization. For the heatmaps, warm colors indicate stronger activity when the neuron’s preferred choice was paired with large reward, cool colors indicate stronger activity when the null choice was paired with large reward, and gray indicates bins without significant reward context modulation. For the color bars, warm colors indicate bound biases that favored the large-reward choice, cool colors indicate bound biases that favored the small-reward choice. (C,D) Heatmaps of normalized regression coefficients for activity before saccade onset. Same format as A and B. (E-H) Fractions of neurons showing reward context modulation that was congruent with the behaviorally measured bound bias for panels (A-D), respectively. Filled circles indicate fractions that were significantly different from chance level (0.5; chi-square test, p<0.05).

In contrast to the second expectation above, the signs of coefficients were not consistently congruent with the monkeys’ behavioral bound biases. As illustrated by the color bars at the right of each panel, the monkeys tended to use negative bound biases (favoring the small-reward choice). If the neural activity reflected such biases, the heatmap should be dominated by cool colors for activity before motion onset and by warm colors for activity before saccade onset. This appeared not to be the case. We quantified the fraction of congruent sessions (Figure 7E–H). The only time points with fractions that differed significantly from chance suggested incongruent neural modulation for FEF activity before motion onset (Chi-square test, p=0.05, uncorrected for multiple comparisons to reduce false negatives). We also performed running regression with coherence-based regressors (Equation 2) and observed a similar lack of congruent modulation (Figure 7—figure supplement 1).
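As an illustration of the congruence test described above, the following minimal sketch compares a fraction of congruent sessions against the chance level of 0.5 with a one-degree-of-freedom chi-square goodness-of-fit test. The function name `chi2_vs_chance` and the example counts are hypothetical, and the exact test variant used in the study (for example, with or without a continuity correction) is not stated in the text, so this is only one plausible form.

```python
import math

def chi2_vs_chance(n_congruent, n_total, p0=0.5):
    """Chi-square goodness-of-fit test of a congruent fraction against chance p0.

    Returns the chi-square statistic and its p-value (1 degree of freedom,
    no continuity correction; this choice is an assumption).
    """
    observed = [n_congruent, n_total - n_congruent]
    expected = [n_total * p0, n_total * (1.0 - p0)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts, for illustration only.
chi2_even, p_even = chi2_vs_chance(10, 20)
chi2_skewed, p_skewed = chi2_vs_chance(16, 20)
```

A fraction below 0.5 that differs significantly from chance would correspond to the incongruent modulation reported for FEF activity before motion onset.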

In addition to the discrepancy in signs, the magnitude of the reward-context modulation in neural activity also did not co-vary with the magnitude of the reward bias in bound heights in either region. For each neuron with significant reward-context modulation in their activity before motion onset (epoch #3 in Figure 1), we split the trials into two groups with larger and smaller differences in activity between reward contexts, respectively (Figure 8A). If the activity reflects the bound bias, we expected the former group to show a larger bound bias. We fitted the DDM to these two groups of trials and found no consistent difference in their bound biases in either brain region (Figure 8B and C). These results suggested that the reward context-modulated baseline activity in choice-selective FEF and caudate neurons did not directly reflect the eventual bound bias that the monkeys used.

The magnitude of reward bias in relative bound heights did not vary with reward-context modulation of neural activity.

(A) Trials for each reward context were split into two halves based on a neuron’s average activity before motion onset (epoch #3). Reward bias in relative bound heights was measured for trials with large/small reward-context modulation of activity (dark gray/light gray). If the neural activity reflects the behavioral bias, the trials with large modulation were expected to show a larger behavioral bias. (B,C) Scatter plots of the difference in reward bias in relative bound heights between large and small-modulation trials and the bias measured from all trials for FEF (B) and caudate (C) neurons with consistent choice selectivity and significant reward context modulation. P values are from t-test (H0: the mean difference of the x-axis values is zero).

Discussion

Using a task with manipulations of visual stimuli and reward-choice associations, we found that a substantial fraction of neurons in both FEF and caudate were sensitive to both stimulus properties and the reward-choice association. Despite these coarse similarities, we also identified inter-regional differences in the prevalence and distribution of lateralized modulation by choice and reward context, reminiscent of previous results using tasks with either stimulus or reward-context manipulations (Ding and Hikosaka, 2006; Kobayashi et al., 2007; Ding and Gold, 2010; Ding and Gold, 2012c). For choice-selective FEF neurons, their average activity profile followed an accumulation-to-bound-like pattern (Thompson et al., 1996; Thompson et al., 1997; Kim and Shadlen, 1999; Purcell et al., 2010; Ding and Gold, 2012c). Their reward-context modulations were consistent with predictions of a reward-driven bias in drift rates, but not a reward-driven bias in relative bound heights. For choice-selective caudate neurons, their average activity profile was consistent with evidence accumulation, but not bound crossing (Ding and Gold, 2010). Their reward-context modulations did not show a consistent link with either form of reward-driven biases. These differences suggest that the two regions have distinct roles in implementing the computations required for this task.

The closer link of FEF activity with biases in drift rate than with biases in bound heights may appear to be at odds with previous results implicating a more prominent role of the FEF and its rodent homolog in transforming the accumulated evidence into a categorical choice than in the evidence-accumulation process itself (Freedman et al., 2001; Ferrera et al., 2009; Hanks et al., 2015). However, these roles likely reflect the specific task and the subjects’ strategy for performing that task. For example, for tasks involving manipulations of category definitions, FEF activity showed strong correlates of decision rules (Freedman et al., 2001; Ferrera et al., 2009). For our task, the category definitions remained constant and monkeys tended to use consistent changes in drift rates to favor the large-reward choice and variable changes in relative bound heights that can favor the large- or small-reward choice, depending on the monkey and daily session (Fan et al., 2018). The propensity of FEF neurons to encode the changes in drift rates may thus reflect the relative importance of those particular biases to the decision process.

FEF shares many response properties with the lateral intraparietal area (LIP), particularly for decisions based on random-dot motion stimulus (e.g., Shadlen and Newsome, 1996; Kim and Shadlen, 1999; Roitman and Shadlen, 2002; Ding and Gold, 2012c; Meister et al., 2013). Interestingly, a previous study of monkey LIP activity for an asymmetric-reward motion discrimination task showed opposite relationships with behavioral reward biases than what we found for FEF (Rorie et al., 2010): LIP activity was consistent with an involvement in reward-biased bound heights but not drift rates. The contrasts between that study and ours suggest two possible interpretations. One possibility is that LIP and FEF perform complementary roles by implementing reward biases in relative bound heights and drift rate, respectively. Another possibility is that the two regions share similar roles, and the apparent differences from the two studies reflect differences in their task designs. Rorie and colleagues used a substantially different task design from ours, including experimenter- versus subject-controlled motion viewing and trial- versus block-wise manipulations of reward contexts. In principle, these differences could influence not only what strategy monkeys use, but also which brain regions are employed to implement the required computations through training. A direct comparison between LIP and FEF neurons in the same monkeys performing the same decision task would help disambiguate these possibilities.

The lack of a consistent link between caudate activity before motion onset and reward bias in relative bound heights was surprising. Previous studies using tasks with reward manipulations and salient visual stimuli showed shared time courses between caudate activity before target onset and reward biases in RT during reward-context transitions (Lauwereyns et al., 2002a) and with manipulations of the timing of target onset (Ding and Hikosaka, 2007), as well as trial-by-trial correlations between caudate activity and action values estimated from monkeys’ reward biases (Lau and Glimcher, 2008). We also showed previously that caudate microstimulation evoked changes in relative bound for a symmetric-reward motion discrimination task (i.e., without reward-driven biases; Ding and Gold, 2012a). These previous results naturally led to the hypothesis that the caudate activity preceding stimulus presentation helps to bias the starting value for evidence accumulation. Our negative results here imply that, for this task, caudate activity does not directly set the starting value. This finding is also consistent with our observation that, for the same asymmetric-reward motion discrimination task, caudate microstimulation did not cause consistent changes in bound biases (Doi et al., 2020). Taken together, we hypothesize that the caudate nucleus does not directly implement the bound bias, but rather coordinates the bound bias with the bias in drift rate. For simple tasks in which bound biases alone are sufficient, caudate activity appears to be directly correlated with bound bias. For complex tasks in which additional computations are involved, such a correlation could be substantially weakened.

A caveat applies to the lack of a relationship between caudate activity and the bias in drift rate. Detecting a correlation between neural activity and the reward bias in drift rates (Figure 6) requires a sample that is large enough and, equally important, has sufficient variation in behavioral biases. For practical reasons, our caudate samples were mostly from two monkeys (C and F) that tended to use smaller and less variable drift-rate biases across sessions than the other monkey (A). The more restricted ranges of drift-rate bias might have biased our results. Nevertheless, the negative correlation we observed in Figure 6D is consistent with our previous demonstration that caudate microstimulation tended to scale down reward biases in drift rates (Doi et al., 2020).

Besides decision formation, both FEF and the caudate nucleus have other hypothesized decision-related roles, including performance monitoring (Ding and Gold, 2010; Ding and Gold, 2012c; Ding and Gold, 2013; Teichert et al., 2014; Yanike and Ferrera, 2014a). As we showed, reward context-modulated neural activity was present in a substantial fraction of neurons that were not consistently selective for choice. These activity patterns were sensitive to reward biases in drift rates during motion viewing (Figure 6—figure supplement 1) or in the baseline firing before motion onset and saccade onset (Figure 7—figure supplement 2). In addition, some choice-selective neurons showed negative ratios of reward context and coherence coefficients (Figure 6D), which are not consistent with the decision variable predicted by the DDM but could reflect a choice confidence signal instead. It would be interesting to investigate further the exact functional roles of these activity patterns for solving decision-making tasks.

For our task, neither FEF nor caudate activity represented the full, latent decision variable as predicted in the DDM framework. For example, in addition to the disconnect between bound bias and reward context-modulated baseline activity, the example FEF neuron in Figure 6A showed a strong modulation by the coherence-reward context interaction, which was not predicted by the DDM. A striking observation for FEF was the relatively consistently opposite signs in the reward bias in bound heights and the reward-context modulation of pre-motion baseline activity in choice-selective neurons. This finding raises several possibilities, including: (1) the DDM framework does not accurately capture the monkeys’ decision-related computations; (2) the reward-context modulation of pre-motion baseline activity contributes to the reward bias in bound heights through an intermediary, sign-reversing mechanism; and/or (3) such activity does not contribute to the reward bias in bound heights. Relevant to the first possibility, we previously fitted the monkeys’ performance using two model variants (fixed-bound and collapsing-bound) and two fitting procedures (Hierarchical DDM using MCMC sampling and single-session DDM fits with multiple runs using maximum a posteriori) (Fan et al., 2018; Doi et al., 2020). These different ways of model fitting resulted in similar patterns of the signs of reward biases in bound heights and drift rates. These data argued against gross inaccuracy in DDM fits of reward bias in bound heights, but it remains to be tested whether a non-DDM framework could capture the monkey’s performance and predict modulations of decision variables more in line with those observed in FEF activity.

Many other brain regions undoubtedly contribute to decisions that combine sensory and reward information. These regions likely include the lateral intraparietal area (LIP), superior colliculus, and the premotor cortex in monkeys, each of which has been shown to represent the basic patterns of activity that are reminiscent of an accumulation-to-bound decision process (Roitman and Shadlen, 2002; Ratcliff et al., 2003; Thura and Cisek, 2016). Even more regions have shown reward-context modulated pre-stimulus activity and saccade-related activity in monkeys performing tasks with reward manipulations and salient visual stimuli, including those areas and many other nuclei in the basal ganglia (Platt and Glimcher, 1999; Coe et al., 2002; Lauwereyns et al., 2002a; Sato and Hikosaka, 2002; Ikeda and Hikosaka, 2003; Roesch and Olson, 2003; Isoda and Hikosaka, 2008). Reward manipulations also likely affect the sensory representation itself, leading to biased drift rates (Cicmil et al., 2015). More studies like ours that directly compare neural activity in different brain regions under the same task conditions are needed to better understand their overlapping and distinct roles in these kinds of decisions. A particularly intriguing target of such studies would be the superior colliculus, because of its convergent inputs from FEF, LIP, and the basal ganglia, as well as its well-documented roles in attentional control that are likely closely related to reward modulation (Krauzlis et al., 2018). The distribution of neural representations of biases in drift rates and relative bound heights would also help us understand the dissociated effects of Parkinson’s Disease on these two forms of bias (Perugini et al., 2016).

To summarize, FEF and caudate activity showed modulations by choice, reward context, and visual stimulus strength, in monkeys that combined reward context and visual input into categorical saccade choices. These two regions shared certain features in their activity, but also showed distinct patterns that implicated their different roles in complex decision making. It would be interesting to examine how the relative contributions of FEF and caudate neurons develop over training (Antzoulatos and Miller, 2011; Seo et al., 2012) and with induced changes in the subjects’ reward bias strategy.

Materials and methods

Key resources table
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
|---|---|---|---|---|
| Software, algorithm | MATLAB | Mathworks | RRID:SCR_001622 | https://www.mathworks.com |
| Software, algorithm | Python 3.5 | Python Software Foundation | RRID:SCR_008394 | https://www.python.org/ |
| Software, algorithm | Psychophysics Toolbox | Pelli, 1997; Kleiner, 2007 | RRID:SCR_002881 | http://psychtoolbox.org/ |
| Software, algorithm | Pandas v0.19.2 | Python Data Analysis Library | RRID:SCR_018214 | https://pandas.pydata.org/ |
| Software, algorithm | Scikit-learn v0.18.1 | scikit-learn.org | RRID:SCR_002577 | https://scikit-learn.org/stable/ |
| Software, algorithm | Statsmodels v0.8.0 | Statsmodels.org | RRID:SCR_016074 | https://www.statsmodels.org/stable/index.html |
| Software, algorithm | Scipy v0.18.1 | SciPy.org | RRID:SCR_008058 | https://docs.scipy.org/doc/scipy/reference/stats.html |
| Software, algorithm | PyMC 2.3.6 | http://github.com/pymc-devs/pymc | | http://github.com/pymc-devs/pymc |
| Software, algorithm | dPCA | Kobak et al., 2016 | | https://github.com/machenslab/dPCA/tree/master/matlab |

Experimental model and subject details

All training and experimental procedures were in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals and were approved by the University of Pennsylvania Institutional Animal Care and Use Committee (protocol #804726). Details about monkey training, behavioral tasks, and caudate recording were reported previously (Fan et al., 2018; Doi et al., 2020).

Neural recording

Each monkey was implanted with a head holder and recording cylinder that provided access to the FEF (right for monkeys C and F, left for monkey A). The FEF was identified as the anterior bank of the arcuate sulcus where saccades were evoked with microstimulation of <50 μA (70 ms trains of 300 Hz, 250-μs biphasic pulses) (Bruce and Goldberg, 1985; Ding and Gold, 2012b). Neural activity was recorded using a combination of glass-coated tungsten electrodes (Alpha-Omega), epoxylite-coated tungsten electrodes (FHC), and multi-contact electrodes (V-probe, Plexon, Inc; Multitrodes, Thomas Recording), driven by a NaN microdrive (NAN Instruments, LTD). A memory-guided delayed saccade task was used to estimate the response field of a neuron (Ding and Gold, 2012b). For the motion discrimination task, one choice target was placed in the response field and the other was placed symmetrically across the central fixation point. Motion directions were along the axis defined by the choice targets.

Single-unit recordings were obtained for neurons that showed activity modulation during trials by visual inspection and single-unit spikes were sorted offline (OfflineSorter, Plexon). Neurons with low firing rates (peak firing rate <5 Hz) and few trials (<5 finished trials per choice × coherence × reward context combination or <3 correct trials per combination) were excluded from analysis.

Behavioral analysis

To quantify reward context-induced biases, a logistic function was fitted to the choice data for all trials for each session:

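(1) $P_{\text{contra choice}} = \dfrac{1}{1 + e^{-\mathrm{Slope}\,(\mathrm{Coh} + \mathrm{Bias})}}$,

where Coh is the signed motion coherence,

$\mathrm{Slope} = \mathrm{slope}_{0} + \mathrm{slope}_{rew} \times \mathrm{RewCont}$,
$\mathrm{Bias} = \mathrm{bias}_{0} + \mathrm{bias}_{rew} \times \mathrm{RewCont}$,
$\mathrm{RewCont} = \begin{cases} 1 & \text{for contralateral-large reward blocks} \\ -1 & \text{for ipsilateral-large reward blocks} \end{cases}$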
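To make the role of the reward-context terms concrete, here is a minimal sketch of the logistic choice function in Equation 1. It assumes the standard logistic form with a negative exponent and a +1/−1 coding of RewCont; the function name `p_contra` and all parameter values are hypothetical and chosen only for illustration, not taken from the fitted data.

```python
import math

def p_contra(coh, rew_cont, slope0, slope_rew, bias0, bias_rew):
    """Probability of a contralateral choice under the logistic model.

    coh: signed motion coherence.
    rew_cont: +1 for contralateral-large-reward blocks, -1 for
              ipsilateral-large-reward blocks (assumed coding).
    """
    slope = slope0 + slope_rew * rew_cont
    bias = bias0 + bias_rew * rew_cont
    return 1.0 / (1.0 + math.exp(-slope * (coh + bias)))

# Hypothetical parameters: no baseline bias, reward-dependent bias of 0.1.
params = dict(slope0=10.0, slope_rew=0.0, bias0=0.0, bias_rew=0.1)
p_large_contra = p_contra(0.0, +1, **params)  # zero coherence, contra-large block
p_large_ipsi = p_contra(0.0, -1, **params)    # zero coherence, ipsi-large block
```

With a positive bias_rew, the zero-coherence choice probability shifts toward whichever side carries the large reward, which is the horizontal shift of the psychometric function that the Bias term captures.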

To infer the latent decision variable, we also fitted the choice and saccade reaction time (RT) data simultaneously to a drift-diffusion model (DDM; Figure 4A), following previously established procedures (Fan et al., 2018). We defined RT as the time from stimulus onset to saccade onset. Saccade onset was identified offline with respect to velocity (>40°/s) and acceleration (>8000°/s²). The DDM assumes that the latent decision variable (DV) is the time integral of evidence (E) and reward asymmetry-induced fictive evidence (me), scaled by a constant (k).

$E \sim N(\text{coherence}, 1)$ and $DV = \int k\,(E + me)\,dt$

At each time point, the DV was compared with two collapsing choice bounds (Zylberberg et al., 2016). The time course of the choice bounds was specified as $a/(1+e^{\beta_{alpha} t-\beta_{d}})$, where β_alpha and β_d controlled the rate and onset of decay, respectively, and a specified the maximal distance between the two choice bounds. A bias-related parameter (z) specified the relative bound heights of the two choice bounds, where z = 0.5 indicated equal bound heights for the two choices and z>0.5 indicated that the upper bound was closer to the starting point of evidence accumulation than the lower bound.
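The accumulation process described above can be sketched as a single-trial simulation. This is only an illustrative sketch under several assumptions that are not stated in the text: the integral is discretized with a standard Euler-Maruyama step (noise scaled by the square root of the time step), the DV starts at zero with the upper and lower bounds placed at (1 − z) and −z times the current bound distance, and the exponent of the bound function is read as β_alpha·t − β_d. The names `bound_distance` and `simulate_trial` and all parameter values are hypothetical.

```python
import math
import random

def bound_distance(t, a, beta_alpha, beta_d):
    """Total distance between the two choice bounds at time t (seconds)."""
    return a / (1.0 + math.exp(beta_alpha * t - beta_d))

def simulate_trial(coh, me, k, a, z, beta_alpha, beta_d,
                   dt=0.001, t_max=5.0, rng=random):
    """Simulate one trial; returns (choice, decision_time)."""
    dv, t = 0.0, 0.0
    while t < t_max:
        # Drift from evidence plus fictive evidence, and unit-variance
        # noise, both scaled by k (Euler-Maruyama step; an assumption).
        dv += k * (coh + me) * dt + k * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
        b = bound_distance(t, a, beta_alpha, beta_d)
        if dv >= (1.0 - z) * b:   # upper bound; closer to the start when z > 0.5
            return "upper", t
        if dv <= -z * b:          # lower bound
            return "lower", t
    return None, t_max            # no bound reached within t_max

# Hypothetical parameter values in arbitrary units, for illustration only.
rng = random.Random(0)
results = [simulate_trial(coh=3.0, me=0.0, k=1.0, a=2.0, z=0.5,
                          beta_alpha=1.0, beta_d=5.0, rng=rng)
           for _ in range(200)]
frac_upper = sum(1 for choice, _ in results if choice == "upper") / len(results)
```

In this sketch, a nonzero me shifts the drift on every trial regardless of coherence, whereas moving z away from 0.5 changes how far the DV must travel to reach each bound; these correspond to the two forms of reward bias compared in the Results.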

For sessions with neurons showing choice-selective activity during a pre-saccade period (see below for epoch definitions), the upper bound was associated with the preferred choice and the lower bound was associated with the null choice. In other words, if DV crossed the upper bound first, a saccade was made to the target inside the neuron’s response field; if DV crossed the lower bound first, a saccade was made to the other target.

DDM model fitting was performed, separately for each session, using the maximum a posteriori estimate method (python v3.5.1, pymc 2.3.6) and prior distributions suitable for human and monkey subjects (Wiecki et al., 2013). We performed at least five runs for each variant and used the run with the highest likelihood for further analyses. Biases in drift and bound (Figure 5I) were computed as the difference in the fitted me and z values between the two reward contexts, respectively. Positive values indicated biases toward the large reward choice.

Neural data analysis

We performed three regression analyses on the neural data. First, for each single unit, we computed the average firing rates in eight task epochs (Figure 1A): three epochs before motion stimulus onset (400 ms window beginning at target onset, variable window from target onset to dots onset, and 400 ms window ending at motion onset), two epochs during motion viewing (a fixed window from 100 ms after motion onset to 100 ms before median RT and a variable window from 100 ms after motion onset to 100 ms before saccade onset), a pre-saccade 100 ms window, a peri-saccade 300 ms window beginning at 100 ms before saccade onset, and a post-saccade 400 ms window beginning at saccade onset (before feedback and reward delivery). For each unit, a multiple linear regression was performed on the spike counts in correct trials, for each task epoch separately.

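(2) $\text{Spike count} = \beta_{0} + \beta_{Choice} \times I_{Choice} + \beta_{RewCont} \times I_{RewCont} + \beta_{RewSize} \times I_{RewSize} + \beta_{CohContra} \times I_{CohContra} + \beta_{CohIpsi} \times I_{CohIpsi} + \beta_{RewCohContra} \times I_{CohContra} \times I_{RewSize} + \beta_{RewCohIpsi} \times I_{CohIpsi} \times I_{RewSize}$,

where

$I_{Choice} = \begin{cases} 1 & \text{for contralateral/up choice} \\ -1 & \text{for ipsilateral/down choice} \end{cases}$,
$I_{RewCont} = \begin{cases} 1 & \text{for contralateral/up-large reward blocks} \\ -1 & \text{for ipsilateral/down-large reward blocks} \end{cases}$,
$I_{RewSize} = \begin{cases} 1 & \text{if a large reward is expected for the choice} \\ -1 & \text{if a small reward is expected} \end{cases}$,
$I_{CohContra} = \begin{cases} \text{coherence} & \text{for contralateral/up choice} \\ 0 & \text{for ipsilateral/down choice} \end{cases}$,

and

$I_{CohIpsi} = \begin{cases} 0 & \text{for contralateral/up choice} \\ \text{coherence} & \text{for ipsilateral/down choice} \end{cases}$.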
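As a concrete reading of the regressors in Equation 2, the sketch below builds one design-matrix row per correct trial. It assumes a +1/−1 coding for the three indicator variables and treats a large reward as expected whenever the chosen side matches the large-reward side of the block; the function name `design_row` and the example values are hypothetical.

```python
def design_row(choice_contra, contra_large_block, coherence):
    """Return the regressor row for one correct trial.

    choice_contra: True for a contralateral/up choice.
    contra_large_block: True when the contralateral/up choice carries the
                        large reward in the current block.
    coherence: unsigned motion coherence of the trial.
    Column order: intercept, Choice, RewCont, RewSize, CohContra, CohIpsi,
                  CohContra x RewSize, CohIpsi x RewSize.
    """
    i_choice = 1 if choice_contra else -1
    i_rew_cont = 1 if contra_large_block else -1
    # Large reward expected when the choice matches the block's large-reward side.
    i_rew_size = 1 if choice_contra == contra_large_block else -1
    i_coh_contra = coherence if choice_contra else 0.0
    i_coh_ipsi = 0.0 if choice_contra else coherence
    return [1.0, i_choice, i_rew_cont, i_rew_size,
            i_coh_contra, i_coh_ipsi,
            i_coh_contra * i_rew_size, i_coh_ipsi * i_rew_size]

# Hypothetical trials: same block and coherence, opposite choices.
row_contra = design_row(True, True, 0.32)
row_ipsi = design_row(False, True, 0.32)
```

Splitting coherence into separate contralateral and ipsilateral regressors lets the coherence dependence differ between the two choices, which a single signed-coherence term would not allow.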

Significance of non-zero coefficients was assessed using t-test (criterion: p=0.05).

Second, for each single unit, we also performed running regressions using Equation 2 on the spike counts within 150 ms windows every 10 ms. These running regressions were performed on activity aligned to target, motion, and saccade onsets separately. Only correct trials were included. Time windows with fewer than 10 correct trials were excluded.
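The running-window spike counts that feed these regressions can be sketched as follows. The sketch assumes each window is defined by its start time and is half-open; whether the original analysis indexed windows by their start, center, or end is not stated. The function name `running_counts` and the spike times are hypothetical.

```python
def running_counts(spike_times_ms, start_ms, stop_ms, width_ms=150, step_ms=10):
    """Spike counts in windows [t, t + width_ms) stepped every step_ms.

    Times are in ms relative to the alignment event (target, motion, or
    saccade onset). Returns a list of (window_start_ms, count) pairs.
    """
    counts = []
    t = start_ms
    while t + width_ms <= stop_ms:
        n = sum(1 for s in spike_times_ms if t <= s < t + width_ms)
        counts.append((t, n))
        t += step_ms
    return counts

# Hypothetical spike times for one trial, in ms from the alignment event.
counts = running_counts([5, 152, 165], start_ms=0, stop_ms=170)
```

Applying the regression to the counts from each window across trials yields one coefficient time course per regressor.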

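Third, for these neurons, the following multiple linear regression was performed in the epochs and running windows defined above:

(3) $\text{Spike count} = \beta_{0} + \beta_{Choice} \times I_{Choice} + \beta_{RewCont} \times I_{RewCont} + \beta_{RewSize} \times I_{RewSize} + \beta_{RTContra} \times RT_{Contra} + \beta_{RTIpsi} \times RT_{Ipsi} + \beta_{RewRTContra} \times RT_{Contra} \times I_{RewSize} + \beta_{RewRTIpsi} \times RT_{Ipsi} \times I_{RewSize}$

where

$RT_{Contra} = \begin{cases} RT & \text{for the contralateral/up choice} \\ 0 & \text{for the ipsilateral/down choice} \end{cases}$,

and

$RT_{Ipsi} = \begin{cases} 0 & \text{for the contralateral/up choice} \\ RT & \text{for the ipsilateral/down choice} \end{cases}$.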

To control for reward context or choice-dependent modulation of RT, the RT values used in the regressions were the mean-subtracted values, with the mean values measured for the corresponding reward context-choice combinations. Significance of non-zero coefficients was assessed using t-test (criterion: p=0.05).
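The mean subtraction described above can be sketched as follows: each trial's RT is expressed relative to the mean RT of its own reward context-choice combination. The function name `mean_subtract_rt`, the dictionary keys, and the example values are hypothetical.

```python
from collections import defaultdict

def mean_subtract_rt(trials):
    """Subtract the mean RT of each (reward context, choice) group.

    trials: list of dicts with keys 'rt', 'rew_cont', and 'choice'.
    Returns the mean-subtracted RTs in the original trial order.
    """
    groups = defaultdict(list)
    for trial in trials:
        groups[(trial["rew_cont"], trial["choice"])].append(trial["rt"])
    means = {key: sum(values) / len(values) for key, values in groups.items()}
    return [trial["rt"] - means[(trial["rew_cont"], trial["choice"])]
            for trial in trials]

# Hypothetical trials (RT in ms).
trials = [
    {"rt": 400.0, "rew_cont": 1, "choice": 1},
    {"rt": 600.0, "rew_cont": 1, "choice": 1},
    {"rt": 500.0, "rew_cont": -1, "choice": 1},
]
rt_centered = mean_subtract_rt(trials)
```

Because each group is centered on zero, the RT regressors cannot absorb differences in mean RT between reward contexts or choices, leaving those to the indicator terms.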

Because reward context was alternated in blocks in our task, we examined whether the significant coefficients for reward context in these regressions were simply due to serial correlation of the reward context values (Elber-Dorozko and Loewenstein, 2018). To assess the potential effect of serial correlation on our results, we focused on the epoch-based regressions. For each neuron × epoch combination with a significant coefficient for reward context, we estimated the null distribution of the coefficient by performing 100 regressions using random, unmatched reward-context values. To obtain these unmatched values, we concatenated the reward-context values from all neurons and randomly picked a segment for each regression. We performed one-tailed comparisons between the null distributions and the coefficients obtained using real data and updated the p-values for the reward-context coefficient accordingly.

To gain another perspective of the task-related modulation patterns in each population, we performed demixed principal component analysis on spike activity (Kobak et al., 2016), using the publicly available source code (https://github.com/machenslab/dPCA/tree/master/matlab). We focused on two epochs for activity aligned to motion and saccade onset, respectively (Figure 3—figure supplements 4–7) and used only correct trials for this analysis. To mitigate the imbalance inherent in our data set (e.g., there were fewer correct trials for low coherence or when the choice led to small reward; different coherence levels were used for the three monkeys), we used the four highest coherence levels for each session as equivalent conditions across monkeys/sessions.
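The two building blocks of the serial-correlation control (drawing an unmatched reward-context segment and comparing the real coefficient with the null distribution) can be sketched as below. The regression itself is omitted. The direction of the one-tailed comparison is an assumption (the tail is taken on the side of the real coefficient's sign), and the function names `random_segment` and `one_tailed_p` are hypothetical.

```python
import random

def random_segment(concatenated, length, rng):
    """Pick a random contiguous segment of reward-context values."""
    start = rng.randrange(len(concatenated) - length + 1)
    return concatenated[start:start + length]

def one_tailed_p(real_coef, null_coefs):
    """Fraction of null coefficients at least as extreme as the real one,
    in the direction of the real coefficient's sign (assumed convention)."""
    if real_coef >= 0:
        extreme = sum(1 for c in null_coefs if c >= real_coef)
    else:
        extreme = sum(1 for c in null_coefs if c <= real_coef)
    return extreme / len(null_coefs)

# Hypothetical block-wise reward-context sequence pooled across neurons.
pooled = [1] * 40 + [-1] * 40 + [1] * 40
segment = random_segment(pooled, 50, random.Random(1))
p_pos = one_tailed_p(2.0, [0.0, 1.0, 2.0, 3.0])
p_neg = one_tailed_p(-2.0, [-3.0, 0.0, 1.0, 2.0])
```

Drawing contiguous segments rather than shuffling individual trials preserves the block structure, so the null regressors carry the same kind of serial correlation as the real reward-context values.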

Measuring the slope of change in firing rates

Only correct trials were included for this analysis. Spike trains were aligned to motion onset and grouped by coherence × reward context combinations. The average firing rates were computed for each combination, truncated at median RT for the combination, and convolved with a Gaussian kernel (σ = 20 ms). The slope of change was measured from 200 ms running windows (in 20 ms steps) of the smoothed firing rates for each combination, using a linear regression with time as the independent variable. For each running window, a multiple linear regression was performed, using coherence, reward context, and their interaction as the independent variables and the slopes of change as the dependent variable. Significance for individual regressors was assessed using t-test (criterion: p=0.05).
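The slope measurement can be sketched as an ordinary least-squares fit of firing rate against time within each running window. The sketch below starts from an already smoothed rate trace (the Gaussian smoothing step is omitted for brevity), indexes windows by their start time, and includes both window edges; these details, the function names `ols_slope` and `running_slopes`, and the example trace are assumptions.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def running_slopes(times_ms, rates_hz, width_ms=200, step_ms=20):
    """Slope of the rate trace in running windows.

    Returns (window_start_ms, slope) pairs, with slope in Hz per ms.
    """
    slopes = []
    t0 = times_ms[0]
    while t0 + width_ms <= times_ms[-1]:
        window = [(t, r) for t, r in zip(times_ms, rates_hz)
                  if t0 <= t <= t0 + width_ms]
        xs = [t for t, _ in window]
        ys = [r for _, r in window]
        slopes.append((t0, ols_slope(xs, ys)))
        t0 += step_ms
    return slopes

# Hypothetical smoothed trace: a linear ramp of 0.05 Hz/ms sampled every 10 ms.
times = list(range(0, 401, 10))
rates = [5.0 + 0.05 * t for t in times]
slopes = running_slopes(times, rates)
```

The per-window slopes for all coherence × reward context combinations would then serve as the dependent variable in the second-stage regression on coherence, reward context, and their interaction.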

Splitting trials based on baseline activity before motion onset

This analysis was performed on neurons with significant reward-context modulation of average firing rates during epoch #3 (a 400 ms window before motion onset), as identified using the regression in Equation 3. For each neuron, trials were divided into two halves based on the average firing rate in epoch #3, separately for each reward context. This resulted in four combinations of trials: high/low firing rates and two reward contexts (Figure 8A). The ‘large modulation’ trials comprised high-firing-rate trials in the neuron’s preferred reward context and low-firing-rate trials in the other context. Conversely, the ‘small modulation’ trials comprised low-firing-rate trials in the neuron’s preferred reward context and high-firing-rate trials in the other context. If the neural activity is closely linked to the reward bias in relative bound heights, the trials with large modulation were expected to show a larger reward bias. These two types of trials were fitted by DDM separately and their estimated reward biases in relative bound heights were compared (Figure 8B and C).
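The trial split described above can be sketched as follows. Within each reward context, trials are ranked by epoch #3 firing rate and divided into a lower and an upper half, which are then recombined into 'large modulation' and 'small modulation' sets. How ties and odd trial counts were handled is not stated; here the extra trial goes to the high-rate half, which is an assumption, and the function name `split_by_modulation` and the example rates are hypothetical.

```python
def split_by_modulation(trials, preferred_context):
    """Split trials into large- and small-modulation sets.

    trials: list of (reward_context, firing_rate) pairs, one per trial.
    preferred_context: the reward context with higher baseline activity.
    Returns (large_idx, small_idx), two sorted lists of trial indices.
    """
    large, small = [], []
    for context in sorted({c for c, _ in trials}):
        idx = [i for i, (c, _) in enumerate(trials) if c == context]
        idx.sort(key=lambda i: trials[i][1])   # rank by firing rate
        half = len(idx) // 2
        low, high = idx[:half], idx[half:]
        if context == preferred_context:
            large += high   # high rates in the preferred context
            small += low
        else:
            large += low    # low rates in the other context
            small += high
    return sorted(large), sorted(small)

# Hypothetical firing rates (Hz) in two reward contexts (+1 preferred).
trials = [(1, 10), (1, 20), (1, 30), (1, 40),
          (-1, 5), (-1, 6), (-1, 7), (-1, 8)]
large_idx, small_idx = split_by_modulation(trials, preferred_context=1)
```

Each set contains trials from both reward contexts, so a reward bias in relative bound heights can be estimated from a separate DDM fit to each set.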

Data availability

Data used for this manuscript are included as supporting files.

References

  1. Ding L, Gold JI (2012b) Neural Correlates of Perceptual Decisions That Incorporate Asymmetric Reward Information. Society for Neuroscience. (Book)
  2. Kleiner M (2007) What’s new in Psychtoolbox-3? Perception 36:1–16.

Decision letter

  1. Daeyeol Lee
    Reviewing Editor; Johns Hopkins University, United States
  2. Michael J Frank
    Senior Editor; Brown University, United States
  3. Chandramouli Chandrasekaran
    Reviewer
  4. Jochen Ditterich
    Reviewer; University of California, Davis, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Thank you for submitting your article "Frontal eye field and caudate neurons encode complementary features of reward-biased perceptual decisions" for consideration by eLife. Your article has been reviewed by three peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Michael Frank as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Chandramouli Chandrasekaran (Reviewer #2); Jochen Ditterich (Reviewer #3).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

As the editors have judged that your manuscript is of interest, but as described below that additional experiments are required before it is published, we would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). First, because many researchers have temporarily lost access to the labs, we will give authors as much time as they need to submit revised manuscripts. We are also offering, if you choose, to post the manuscript to bioRxiv (if it is not already there) along with this decision letter and a formal designation that the manuscript is "in revision at eLife". Please let us know if you would like to pursue this option. (If your work is more suitable for medRxiv, you will need to post the preprint yourself, as the mechanisms for us to do so are still in development.)

Summary:

This study examined how the activity of neurons in two important nodes in the frontocortico-ganglia network, namely, the frontal eye field (FEF) and caudate nucleus (CD), changes according to the animal's choice, incoming sensory information, and expected outcome during an asymmetrically rewarded random-dot motion discrimination task. The main focus of this manuscript is to report significant differences in the properties of neurons in these two structures, and how they might covary with the systematic changes in the parameters of the drift-diffusion model (DDM). The main differences between the FEF and CD were (1) that choice signals were more robust in the FEF, (2) that reward context signals were more robust in the CD, and (3) that the coherence signals lasted longer after the choice in the CD. These results were obtained using conventional regression analyses and appear to be reliable. By contrast, the analyses that attempted to link the changes in the FEF and CD activity to the behavioral variability across different sessions were confusing and not convincing, and require significant additional work.

Essential revisions:

1) All the reviewers felt that the reward-dependent changes in the baseline firing rate before motion onset must be analyzed more thoroughly and discussed better, especially given the opposite conclusion in a previous report by Rorie and Newsome ("Overall, detailed analysis and computer simulation reveal that our data are consistent with a two-stage drift diffusion model proposed by Diederich and Bussmeyer, 2006 for the effect of payoffs in the context of sensory discrimination tasks. Initial processing of payoff information strongly influences the starting point for the accumulation of sensory evidence, while exerting little if any effect on the rate of accumulation of sensory evidence.").

The result in Figure 3F, middle column, suggests that, at the time of motion onset, a substantial fraction of FEF neurons are modulated by reward context, while the result in Figure 7B shows that this modulation was not particularly congruent with relative bound heights in the DDM. This might be possible when there is a sizeable number of sessions with a positive decision bound asymmetry and a sizeable number of sessions with a negative decision bound asymmetry (which seems consistent with what is plotted in Figure 4I), although the sign of the behavioral bias does not change. However, it is very surprising to see Figure 7D, which shows hardly any sessions with a positive bias in bound heights. This should be explained better.

Related to this, when assessing congruency for the statistical analysis shown in Figure 7B, did the authors use only the sessions with context modulations and biases that were significantly different from zero (i.e., the red dots in Figure 7—figure supplement 2)? Or all sessions (i.e., you also used the sign of the parameters to assess congruency, even when the parameters were not both significantly different from zero, meaning you used all of the dots in Figure 7—figure supplement 2, regardless of their color)? If the latter, do they get the same result when limiting your analysis to the red dots? It is possible that the noise contributed by parameters with very small absolute values could have differentially affected the results for the different brain areas, as the dissociation is based on the fraction of congruent modulations being significantly different from 0.5 in caudate neurons, but not so for FEF neurons.

2) The reviewers are concerned about the fact that the caudate data were collected from monkeys C and F, whereas the majority of FEF data were collected from a different animal, monkey A. It is therefore important to make sure that the reported dissociation is indeed one between different brain areas, and not one between different monkeys. Figure 1B, middle row and Figure 4I suggest that the different monkeys were using somewhat different strategies when biasing their decisions, and such differences could be reflected in the neural data. We hope that this concern can be addressed with the existing dataset, but collecting additional data (caudate data from monkey A or more FEF data from monkeys C and/or F) would also be an option. Can the same dissociation be demonstrated when restricting the analysis to data from monkeys C and F? If not, the authors should come up with alternative strategies for demonstrating that the results are not related to monkey identity.

3) The heterogeneity across different neurons might be better handled by a more modern method, such as Targeted Dimensionality Reduction or demixed PCA, which would provide more rigorous and easier interpretability. For example, Gaussian-smoothed firing rates can be used as input to dPCA, with one analysis covering the target + motion epochs and another one aligned to movement onset. This would allow readers who think more in terms of neural populations to appreciate this paper more. Interaction terms, etc., can be easily included in the analysis.

4) Alternative hypotheses, such as the two-stage model, pre-stimulus urgency signals, and the starting-point hypothesis, need to be tested and rejected more convincingly. Currently, much larger effects in baseline state across a large neural population in FEF are deemphasized compared to the smaller effects seen in the non-choice selective caudate neurons. This might be because the analysis of neural data is based on the conviction that the authors' behavioral model is also the best model for the neural data, but it might be necessary to explore the possibilities beyond their best behavioral model, which might still be wrong. This is also where population analysis (e.g., dPCA) might be helpful. On the one hand, it may be reasonable to only select choice-selective neurons, but in doing so we are tossing 70% of the dataset. Only 44 and 36 neurons are now going into the analysis. If the authors decode variables from dPCA with reasonable variance, they can directly look at them in relation to model predictions. It uses the whole dataset and that is the advantage of such an approach.

5) The regression analysis shown in Figure 6 was based on the rate of change in firing rate (not the firing rate itself), but how the slope was calculated was not explicitly explained. The details of this should be given in the Materials and methods. For example, to cleanly separate the effect related to drift rate, it might be necessary to remove the contribution from the changes in the baseline firing rate, and simply calculating the slope of the regression model applied to different bins of the neural activity during the 200 ms window might not be sufficient for this.

Also, isn't it possible that at least some of the effects illustrated in Figure 6C are mediated by the changes in neural activity related to RT? If so and if the contribution of RT is not controlled for, how does this affect the interpretation?

6) To distinguish between different scenarios depicted in Figure 4, the authors applied a regression model that relates the slope of firing rate to coherence, reward context, and their interaction. However, since this model includes the interaction term, the ratio between the regression coefficients for the two main terms is not the most appropriate quantity to test the scenario in Figure 4B. An alternative and simpler method might be to examine the coefficient for coherence and the difference in the average slope in the two reward contexts. For example, although the example neuron in Figure 6A might have a significant effect of reward context (in terms of intercept), the effect of reward context varies with coherence and reverses for high coherence, which is not consistent with the pattern expected from the biased drift model (Figure 4B). In addition, the ratio between the two regression coefficients might produce unreliable results, because it is disproportionately influenced by the denominator (log transformation might be appropriate). The negative ratios shown for some neurons are also difficult to interpret.

7) The dissociation between Figure 6C and D is based on finding a significant correlation in C, but not in D. The result would be strengthened by being able to show that there is a statistical difference between one particular parameter that can be estimated for both FEF and caudate neurons. For example, is it possible to estimate both linear regression slopes and to show that they are statistically different?

8) To test whether reward-related changes in neural activity were related to the variability in the bound height, the authors focused on the consistency in the signs of the regression coefficients for reward context and bias in the bound heights (Figure 7D). However, these coefficients are negative for the majority of the neurons, so they do not address the question of whether the variability in these two measures is correlated across sessions (for example, what was the correlation coefficient for the data shown in Figure 7D?).

9) The authors have used two regression models to examine the activity of FEF and CD neurons, one including coherence (sensory variable), and the other including RT (motor variable). Since the effects of these variables were modeled separately for each choice (e.g., ipsi vs. contra), they are correlated with each other, but in principle, this should not prevent the authors from including both of these regressors in the same model. This would be preferred, because it is likely that these two factors still influence the activity in either or both of these structures differently. Also, when the effects of coherence and RT were summarized in Figure 3, it would be better to correct them for multiple comparisons. In addition, the reward context was varied across blocks, which would introduce serial correlation in both independent and dependent variables of the regression model, making a standard t-test no longer appropriate (c.f., Elber-Dorozko and Loewenstein, 2018).

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Frontal eye field and caudate neurons encode complementary features of reward-biased perceptual decisions" for further consideration by eLife. Your revised article has been evaluated by Michael Frank (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below:

This is a significant study that advances our understanding of the functional specialization of the frontal eye field and caudate nucleus during perceptual decision making with asymmetric reward outcomes. The reviewers appreciated the extensive amount of work performed by the authors to address their previous concerns, but have one important remaining concern. In particular, the reviewers are concerned that the relationship between the behavioral bias in drift and the neural effects of reward context seen in the FEF (Figure 6) might be driven by the results from one animal (monkey A). In addition, it is possible that the results might be due to individual differences among monkeys rather than session-by-session (or neuron-by-neuron) changes in behavior (i.e., Simpson's paradox). Therefore, it is strongly suggested that the authors analyze and report the results separately for individual animals, or at least repeat this analysis after excluding the results from monkey A.
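
The Simpson's paradox concern can be made concrete with a toy example. The sketch below uses invented numbers, not data from the study: three hypothetical animals differ in their mean behavioral bias and mean neural modulation, so the pooled correlation is strongly positive even though the session-by-session correlation within each animal is negative. The variable names and values are assumptions made for illustration only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical per-session values: (behavioral bias, neural modulation).
# Animals differ in their means; within each animal the relation is negative.
sessions = {
    "A": ([0.30, 0.32, 0.34, 0.36], [0.36, 0.34, 0.32, 0.30]),
    "C": ([0.10, 0.12, 0.14, 0.16], [0.16, 0.14, 0.12, 0.10]),
    "F": ([0.00, 0.02, 0.04, 0.06], [0.06, 0.04, 0.02, 0.00]),
}

# Correlation computed separately for each animal.
within = {name: pearson_r(x, y) for name, (x, y) in sessions.items()}

# Correlation computed after pooling all sessions across animals.
pooled_x = [v for x, _ in sessions.values() for v in x]
pooled_y = [v for _, y in sessions.values() for v in y]
pooled = pearson_r(pooled_x, pooled_y)

print("within-animal r:", within)
print("pooled r:", pooled)
```

In this toy case the pooled correlation reflects only between-animal differences, which is why reporting the analysis separately for each animal (or with one animal excluded) addresses the concern.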

https://doi.org/10.7554/eLife.60535.sa1

Author response

Essential revisions:

1) All the reviewers felt that the reward-dependent changes in the baseline firing rate before motion onset must be analyzed more thoroughly and discussed better, especially given the opposite conclusion in a previous report by Rorie and Newsome ("Overall, detailed analysis and computer simulation reveal that our data are consistent with a two-stage drift diffusion model proposed by Diederich and Bussmeyer, 2006 for the effect of payoffs in the context of sensory discrimination tasks. Initial processing of payoff information strongly influences the starting point for the accumulation of sensory evidence, while exerting little if any effect on the rate of accumulation of sensory evidence.").

We thank the reviewers for this comment. We understand that this negative finding is contrary to what the field, including ourselves, has hypothesized for these two brain regions and thus requires more thorough documentation. We therefore completely revised the presentation of results regarding the baseline activity, including new figures and analysis results to illustrate the lack of expected relationships between reward-context modulation of baseline activity and behavioral bias in bound heights, in terms of either the sign or magnitude of reward modulation. We also emphasized the difference between our FEF data and Rorie and colleagues’ LIP data in the Discussion:

“FEF shares many response properties with the lateral parietal area (LIP), particularly for decisions based on random-dot motion stimulus (e.g., Shadlen and Newsome, 1996; Kim and Shadlen, 1999; Roitman and Shadlen, 2002; Ding and Gold, 2012c; Meister et al., 2013). Interestingly, a previous study of monkey LIP activity for an asymmetric-reward motion discrimination task showed opposite relationships with behavioral reward biases than what we found for FEF (Rorie et al., 2010): LIP activity was consistent with an involvement in reward-biased bound heights but not drift rates. The contrasts between that study and ours suggest two possible interpretations. One possibility is that LIP and FEF perform complementary roles by implementing reward biases in relative bound heights and drift rate, respectively. Another possibility is that the two regions share similar roles, and the apparent differences from the two studies reflect differences in their task designs. Rorie and colleagues used a substantially different task design from ours, including experimenter- versus subject-controlled motion viewing and trial- versus block-wise manipulations of reward contexts. In principle, these differences could influence not only what strategy monkeys use, but also which brain regions are employed to implement the required computations through training. A direct comparison between LIP and FEF neurons in the same monkeys performing the same decision task would help disambiguate these possibilities.”

The result in Figure 3F, middle column, suggests that, at the time of motion onset, a substantial fraction of FEF neurons are modulated by reward context, while the result in Figure 7B shows that this modulation was not particularly congruent with relative bound heights in the DDM. This might be possible when there is a sizeable number of sessions with a positive decision bound asymmetry and a sizeable number of sessions with a negative decision bound asymmetry (which seems consistent with what is plotted in Figure 4I), although the sign of the behavioral bias does not change. However, it is very surprising to see Figure 7D, which shows hardly any sessions with a positive bias in bound heights. This should be explained better.

We shared with the reviewers the same prior expectation that reward context-modulated baseline activity in FEF neurons should be directly linked to behavioral bound asymmetry, i.e., the signs should be congruent. We were surprised to find that this was not the case. We have shown and discussed extensively in Fan et al., 2018 that the monkeys tended to use negative bound biases (i.e., in the non-adaptive direction, to compensate for excessive drift-rate biases). We made new figures to show that, despite the dominance of behavioral bound bias favoring the small-reward choice, FEF neurons tended to show higher baseline activity when their preferred choice was paired with large reward (Figure 7A) and caudate neurons showed a roughly even mix of higher or lower activity for this reward context. We performed additional analysis using trials split into two groups, with larger and smaller reward-context modulation of baseline activity, respectively. Contrary to the prediction that the behavioral reward bias should be larger for the former group, we did not find any consistent differences between the groups (Figure 8).

We added our interpretation of these results in Discussion:

“A striking observation for FEF was the relatively consistently opposite signs in the reward bias in bound heights and the reward-context modulation of pre-motion baseline activity in choice-selective neurons. This finding raises several possibilities, including: 1) the DDM framework does not accurately capture the monkeys’ decision-related computations; 2) the reward-context modulation of pre-motion baseline activity contributes to the reward bias in bound heights through an intermediary, sign-reversing mechanism; and/or 3) such activity does not contribute to the reward bias in bound heights. Relevant to the first possibility, we previously fitted the monkeys’ performance using two model variants (fixed-bound and collapsing-bound) and two fitting procedures (Hierarchical DDM using MCMC sampling and single-session DDM fits with multiple runs using maximum a posteriori) (Fan et al., 2018; Doi et al., 2020). These different ways of model fitting resulted in similar patterns of the signs of reward biases in bound heights and drift rates. These data argued against gross inaccuracy in DDM fits of reward bias in bound heights, but it remains to be tested whether a non-DDM framework could capture the monkey’s performance and predict modulations of decision variables more in line with those observed in FEF activity.”

Related to this, when assessing congruency for the statistical analysis shown in Figure 7B, did the authors use only the sessions with context modulations and biases that were significantly different from zero (i.e., the red dots in Figure 7—figure supplement 2)? Or all sessions (i.e., you also used the sign of the parameters to assess congruency, even when the parameters were not both significantly different from zero, meaning you used all of the dots in Figure 7—figure supplement 2, regardless of their color)? If the latter, do they get the same result when limiting your analysis to the red dots? It is possible that the noise contributed by parameters with very small absolute values could have differentially affected the results for the different brain areas, as the dissociation is based on the fraction of congruent modulations being significantly different from 0.5 in caudate neurons, but not so for FEF neurons.

In the original manuscript, the results in Figure 7B used only sessions with significant reward context modulation (i.e., the red dots in Figure 7—figure supplement 2). The fraction of congruent modulation was thus not an artifact from very small absolute values. Note that we have made major revisions in Figure 7 and those original figures were removed.

2) The reviewers are concerned about the fact that the caudate data were collected from monkeys C and F, whereas the majority of FEF data were collected from a different animal, monkey A. It is therefore important to make sure that the reported dissociation is indeed one between different brain areas, and not one between different monkeys. Figure 1B, middle row and Figure 4I suggest that the different monkeys were using somewhat different strategies when biasing their decisions, and such differences could be reflected in the neural data. We hope that this concern can be addressed with the existing dataset, but collecting additional data (caudate data from monkey A or more FEF data from monkeys C and/or F) would also be an option. Can the same dissociation be demonstrated when restricting the analysis to data from monkeys C and F? If not, the authors should come up with alternative strategies for demonstrating that the results are not related to monkey identity.

The reviewers raised an important caveat for our interpretation. Given the noise in both behavioral and neural measurements, a sizeable range of drift-rate bias values is necessary to detect any relationship between the behavioral drift-rate bias and its neural representation. As we showed before (Fan et al., 2018) and here (Figure 4E), monkey A tended to use larger drift-rate biases than monkeys C and F, while the latter two monkeys tended to use smaller and, more importantly for this analysis, less variable drift-rate biases across sessions. The differences in the daily variations among monkeys in the FEF and caudate samples may contribute to the apparent inter-regional differences.

We have obtained caudate recordings from monkey A (before it was euthanized for clinical reasons) and applied the same inclusion criteria for all caudate and FEF recordings, resulting in 18, 49, and 73 caudate neurons and 85, 24, and 40 FEF neurons from monkeys A, C and F, respectively. There was a significant positive cross-neuron/session correlation (conforming to DDM predictions) between reward context-modulation of the slope of FEF, but not caudate, activity and reward bias in drift rate (Figure 6). The inclusion of monkey A’s data actually resulted in a significant negative correlation for the caudate sample. That is, the sampling of different monkeys changed the correlation, but the difference between FEF and caudate neurons holds.

As mentioned above, for analysis related to bound bias, we removed the results from not consistently choice-selective neurons in both regions. We added new figures and analyses, but the basic finding holds for choice-selective neurons: reward-context modulation of baseline activity in both FEF and caudate does not appear to be closely linked to reward biases in relative bound heights.

We completely revised the text in the last two sections of Results accordingly.

3) The heterogeneity across different neurons might be better handled by a more modern method, such as Targeted Dimensionality Reduction or demixed PCA, which would provide more rigorous and easier interpretability. For example, Gaussian-smoothed firing rates can be used as input to dPCA, with one analysis covering the target + motion epochs and another one aligned to movement onset. This would allow readers who think more in terms of neural populations to appreciate this paper more. Interaction terms, etc., can be easily included in the analysis.

We performed the dPCA as the reviewers suggested and present these results in new figures (Figure 3—figure supplement 4, Figure 3—figure supplement 5, Figure 3—figure supplement 6, Figure 3—figure supplement 7). We note, however, our dataset is relatively small for reliable estimates. As Kobak et al. pointed out in their 2016 paper, “…at least ~100 neurons were needed to achieve satisfying demixing” for three task parameters (stimulus, choice, and their interactions). “In cases when there are many task parameters of interest, dPCA is likely to be less useful than the more standard parametric single-unit approaches (such as linear regression)”. In our data set, we have ~140 neurons for each region and 7 task parameters (coherence, choice, reward context, and their interactions). Moreover, because we used a decision task with biased reward contexts, the datasets were inherently unbalanced across different combinations of trial types. Because of these constraints, although dPCA provides a helpful characterization of the modulation patterns at the population level, we believe that the multiple linear regression results are likely to be more robust and therefore chose to keep the latter in main figures and the former as supplements.

4) Alternative hypotheses, such as the two-stage model, pre-stimulus urgency signals, and the starting-point hypothesis, need to be tested and rejected more convincingly.

We apologize for the apparent lack of effort in relating models to the monkeys’ performance. Because we submitted this manuscript as a Research Advance, i.e., a follow-up to our previous eLife papers that included detailed model-comparison results, we followed the journal’s recommendation of limiting redundant presentation of previous results. We showed in Fan et al., 2018 that a pre-stimulus offset alone (i.e., the starting-point hypothesis) cannot capture the monkeys’ reward-biased performance. A choice-non-specific, pre-stimulus urgency signal cannot capture the asymmetric effects of reward context on the monkeys’ choices. The model we used here is a formulation of the two-stage processing that Diederich and Busemeyer proposed, but for an RT-version task.

Currently, much larger effects in baseline state across a large neural population in FEF are deemphasized compared to the smaller effects seen in the non-choice selective caudate neurons. This might be because the analysis of neural data is based on the conviction that the authors' behavioral model is also the best model for the neural data, but it might be necessary to explore the possibilities beyond their best behavioral model, which might still be wrong. This is also where population analysis (e.g., dPCA) might be helpful. On the one hand, it may be reasonable to only select choice-selective neurons, but in doing so we are tossing 70% of the dataset. Only 44 and 36 neurons are now going into the analysis. If the authors decode variables from dPCA with reasonable variance, they can directly look at them in relation to model predictions. It uses the whole dataset and that is the advantage of such an approach.

We apologize for the confusion. In the original manuscript, the analyses were applied to all neurons, regardless of their choice selectivity (e.g., original Figures 3, Figure 7, Figure 7—figure supplement 1, Figure 7—figure supplement 2). Because all accumulation-to-bound models assume that asymmetric bounds-induced choice biases are directional, the choice-selective neurons, if directly contributing to the bound bias, would share the same directionality. In contrast, not-consistently choice-selective neurons do not directly map onto the model assumption. Because these two groups of neurons participate differently in the decision process, we presented their results separately in the original Figure 7 and supplements.

The reviewers’ comments prompted us to more carefully consider how to interpret results from the not-consistently choice-selective neurons. Because accumulation-to-bound models (including the DDM) do not have predictions for this type of neurons, we have decided that our original congruency-based interpretation was not valid. We now present the results from these neurons as supplement to show that they can potentially participate in the decision process (Figure 6—figure supplement 1 and Figure 7—figure supplement 2), but do not assign them any functional roles.

Please see our response above regarding dPCA.

5) The regression analysis shown in Figure 6 was based on the rate of change in firing rate (not the firing rate itself), but how the slope was calculated was not explicitly explained. The details of this should be given in the Materials and methods. For example, to cleanly separate the effect related to drift rate, it might be necessary to remove the contribution from the changes in the baseline firing rate, and simply calculating the slope of the regression model applied to different bins of the neural activity during the 200 ms window might not be sufficient for this.

We apologize for not explaining the method for measuring the slope of change in firing rates. We have added the following text:

“Measuring the slope of change in firing rates

Only correct trials were included for this analysis. Spike trains were aligned to motion onset and grouped by coherence × reward context combinations. The average firing rates were computed for each combination, truncated at median RT for the combination, and convolved with a Gaussian kernel (σ = 20 ms). The slope of change was measured from 200 ms running windows (in 20 ms steps) of the smoothed firing rates for each combination, using a linear regression with time as the independent variable. For each running window, a multiple linear regression was performed, using coherence, reward context, and their interaction as the independent variables and the slopes of change as the dependent variable. Significance for individual regressors was assessed using a t-test (criterion: p=0.05).”

With this method, contributions of changes in baseline firing rates were excluded.
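
The procedure quoted above can be sketched in Python roughly as follows. This is an illustrative reconstruction, not the authors' analysis code: the 1 ms sampling step, the helper names (`gaussian_smooth`, `window_slope`, `regress_slopes`), the coherence levels, and the synthetic ramping traces are all assumptions, and the truncation at median RT and the t-tests on individual regressors are omitted.

```python
import numpy as np

DT_MS = 1.0  # assumed sampling step of the averaged firing-rate traces

def gaussian_smooth(rate, sigma_ms=20.0):
    """Convolve a trace with a unit-area Gaussian kernel (edge-padded)."""
    half = int(round(4 * sigma_ms / DT_MS))
    t = np.arange(-half, half + 1) * DT_MS
    kernel = np.exp(-0.5 * (t / sigma_ms) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(rate, half, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

def window_slope(rate, start_ms, width_ms=200.0):
    """Least-squares slope (rate units per ms) within one running window."""
    i0 = int(round(start_ms / DT_MS))
    n = int(round(width_ms / DT_MS))
    t = np.arange(n) * DT_MS
    return np.polyfit(t, rate[i0:i0 + n], 1)[0]

def regress_slopes(slopes, coh, rew):
    """Fit slope = b0 + b1*coh + b2*rew + b3*coh*rew by least squares."""
    X = np.column_stack([np.ones_like(coh), coh, rew, coh * rew])
    beta, *_ = np.linalg.lstsq(X, slopes, rcond=None)
    return beta

# Synthetic ramps (hypothetical): the slope depends on coherence and reward
# context, and the baseline differs between contexts by 5 spikes/s.
t_ms = np.arange(0, 600, DT_MS)
coh_levels = [0.032, 0.064, 0.128, 0.256, 0.512]
true_beta = np.array([0.02, 0.5, 0.05, 0.1])
conditions = [(c, r) for r in (0.0, 1.0) for c in coh_levels]
coh = np.array([c for c, _ in conditions])
rew = np.array([r for _, r in conditions])
traces = [
    gaussian_smooth(10.0 + 5.0 * r + (true_beta @ [1.0, c, r, c * r]) * t_ms)
    for c, r in conditions
]

# One multiple regression per 200 ms running window, in 20 ms steps.
betas = []
for start in np.arange(100.0, 301.0, 20.0):
    slopes = np.array([window_slope(tr, start) for tr in traces])
    betas.append(regress_slopes(slopes, coh, rew))
betas = np.array(betas)
print(betas.round(4))
```

Because each window's slope is fit together with its own intercept, the constant baseline offset between reward contexts in the synthetic traces does not enter the estimated slopes.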

Also, isn't it possible that at least some of the effects illustrated in Figure 6C are mediated by the changes in neural activity related to RT? If so and if the contribution of RT is not controlled for, how does this affect the interpretation?

Because the drift rate controls RT, a neural correlate of drift-rate bias is necessarily related to RT. Accordingly, removing the correlation between neural activity and modulations of RT would be equivalent to removing contributions of the drift-rate bias, making it impossible to identify neural correlates of these biases.

6) To distinguish between different scenarios depicted in Figure 4, the authors applied a regression model that relates the slope of firing rate to coherence, reward context, and their interaction. However, since this model includes the interaction term, the ratio between the regression coefficients for the two main terms is not the most appropriate quantity to test the scenario in Figure 4B. An alternative and simpler method might be to examine the coefficient for coherence and the difference in the average slope in the two reward contexts.

We apologize for not making our rationale clear. In the DDM, the slope of change in the decision variable is governed by a scaling factor ($k$), the actual evidence strength ($\mathit{Coh}$), and a bias in the momentary evidence (drift-rate bias, $\mathit{me}$). The drift rate for a given coherence level is:

$$\text{Drift}(\text{pref-small-reward blocks}) = k_0 \times (\mathit{Coh} + \mathit{me}_0) \qquad \text{(Eq. 1)}$$

$$\text{Drift}(\text{pref-large-reward blocks}) = (k_0 + k_{rew}) \times (\mathit{Coh} + \mathit{me}_0 + \mathit{me}_{rew}) \qquad \text{(Eq. 2)}$$

In this formulation, $k_0$ and $\mathit{me}_0$ represent reward context-independent baseline values of the scaling factor and drift-rate bias, and $k_{rew}$ and $\mathit{me}_{rew}$ represent corresponding reward context-dependent changes, respectively. $\mathit{me}_{rew}$ corresponds to the behaviorally estimated reward-driven drift-rate bias. If a neuron’s firing rate faithfully follows the decision variable in the DDM,

$$\text{Slope of firing rate} = k_0 \times \mathit{me}_0 + k_0 \times \mathit{Coh} + k_{rew} \times \mathit{Coh} \times I_{rew} + (k_0 \times \mathit{me}_{rew} + k_{rew} \times \mathit{me}_0 + k_{rew} \times \mathit{me}_{rew}) \times I_{rew} \qquad \text{(Eq. 3)}$$

where $I_{rew}$ = {0 for pref-small-reward blocks, 1 for pref-large-reward blocks}. The ratio between the two main regressors (i.e., the coefficient of the fourth term divided by that of the second term in Eq. 3) is thus

$$\text{Coefficient ratio}\left(\frac{\text{reward context}}{\text{coherence}}\right) = \mathit{me}_{rew} + (\mathit{me}_0 + \mathit{me}_{rew}) \times \frac{k_{rew}}{k_0} \qquad \text{(Eq. 4)}$$

Because $k_{rew}/k_0$ tended to be much less than 1 (mean absolute value across all sessions = 0.14), this coefficient ratio is close to the behavioral drift-rate bias ($\mathit{me}_{rew}$) across sessions. In comparison, the difference in the average slope,

$$k_{rew} \times \overline{\mathit{Coh}} + (k_0 + k_{rew}) \times \mathit{me}_{rew} + k_{rew} \times \mathit{me}_0,$$

has a more complex relationship with $\mathit{me}_{rew}$ that depends on the values of $k_{rew}$, $k_0$, $\mathit{me}_0$, and the average coherence values $\overline{\mathit{Coh}}$ that differed among the monkeys. This complex relationship can be seen in the example in Figure 6A and B. We thus consider the coefficient ratio we used (Eq. 4) to be appropriate for testing for a link between neural activity and the behaviorally estimated drift-rate biases.
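
The algebra behind Eq. 4 can be checked numerically with a short Python sketch. The parameter values below are hypothetical, chosen only so that krew/k0 = 0.14 (the mean absolute value quoted above); the function names and the two coherence sets are likewise illustrative assumptions, not values from the study.

```python
from statistics import mean

def drift(coh, pref_large, k0, me0, krew, merew):
    """Drift rate from Eq. 1 (pref-small blocks) or Eq. 2 (pref-large blocks)."""
    if pref_large:
        return (k0 + krew) * (coh + me0 + merew)
    return k0 * (coh + me0)

def eq3_slope(coh, i_rew, k0, me0, krew, merew):
    """Expanded regression form of the slope (Eq. 3)."""
    b_const = k0 * me0
    b_coh = k0
    b_inter = krew
    b_rew = k0 * merew + krew * me0 + krew * merew
    return b_const + b_coh * coh + b_inter * coh * i_rew + b_rew * i_rew

def coefficient_ratio(k0, me0, krew, merew):
    """Reward-context coefficient divided by coherence coefficient."""
    b_rew = k0 * merew + krew * me0 + krew * merew
    return b_rew / k0

def mean_slope_difference(cohs, k0, me0, krew, merew):
    """Average slope in pref-large blocks minus average in pref-small blocks."""
    large = mean(drift(c, True, k0, me0, krew, merew) for c in cohs)
    small = mean(drift(c, False, k0, me0, krew, merew) for c in cohs)
    return large - small

# Hypothetical parameters with krew / k0 = 0.14.
params = dict(k0=10.0, me0=0.02, krew=1.4, merew=0.10)
ratio = coefficient_ratio(**params)
eq4 = params["merew"] + (params["me0"] + params["merew"]) * params["krew"] / params["k0"]

# Same merew, two hypothetical coherence sets: the ratio is unchanged,
# whereas the mean-slope difference shifts with the average coherence.
cohs_low = [0.032, 0.064, 0.128]
cohs_high = [0.128, 0.256, 0.512]
diff_low = mean_slope_difference(cohs_low, **params)
diff_high = mean_slope_difference(cohs_high, **params)

print("coefficient ratio:", ratio, "Eq. 4:", eq4, "merew:", params["merew"])
print("mean-slope difference:", diff_low, diff_high)
```

With these assumed values the coefficient ratio stays close to merew regardless of the coherence set, while the difference in average slope changes with the mean coherence and with k0, which is the contrast the response draws.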

For example, although the example neuron in Figure 6A might have a significant effect of reward context (in terms of intercept), the effect of reward context varies with coherence effect and reverses for high coherence, which is not consistent with the pattern expected from the biased drift model (Figure 4B).

We agree with the reviewers that the significant modulation by reward context-coherence interaction in the example neuron was not expected from a DDM with only biased drift rates. This difference supports our interpretation that the FEF activity does not fully follow the trajectory of a decision variable in the DDM. We added a reference to this observation in the Discussion.

“For our task, neither FEF nor caudate activity represented the full, latent decision variable as predicted in the DDM framework. For example, in addition to the disconnect between bound bias and reward context-modulated baseline activity, the example FEF neuron in Figure 6A showed a strong modulation by the coherence-reward context interaction, which was not predicted by the DDM.”

In addition, the ratio between the two regression coefficients might produce unreliable results, because it is disproportionately influenced by the denominator (a log transformation might be appropriate).

We use the ratio of coefficients for reward context and coherence to control for variations in firing ranges during motion viewing among neurons. To guard against exactly the reviewers’ concern, we performed this analysis only for neurons with significant coherence modulation, i.e., neurons for which the denominator was not near zero.
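The selection step described here can be sketched as a simple guard on the denominator. This is an illustration under stated assumptions rather than the authors' code: the `NeuronFit` container, the `guarded_ratio` helper, and the 0.05 criterion for coherence modulation are all hypothetical.

```python
from typing import NamedTuple, Optional


class NeuronFit(NamedTuple):
    """Hypothetical container for one neuron's regression output."""
    beta_coh: float  # coefficient for coherence
    p_coh: float     # p-value for the coherence coefficient
    beta_rew: float  # coefficient for reward context


def guarded_ratio(fit: NeuronFit, alpha: float = 0.05) -> Optional[float]:
    """Return beta_rew / beta_coh only when coherence modulation is significant.

    Neurons without significant coherence modulation are skipped (None),
    so the ratio is never computed with a denominator that is statistically
    indistinguishable from zero.
    """
    if fit.p_coh >= alpha or fit.beta_coh == 0:
        return None
    return fit.beta_rew / fit.beta_coh


fits = [NeuronFit(8.0, 0.001, 2.0), NeuronFit(0.1, 0.60, 2.0)]
ratios = [guarded_ratio(f) for f in fits]
print(ratios)
```

Excluding the second hypothetical neuron avoids a ratio of 20 that would be driven almost entirely by an unreliable, near-zero coherence coefficient.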

The negative ratios shown for some neurons are also difficult to interpret.

We agree with the reviewers that the negative ratios are not compatible with neural correlates of the decision variable in the DDM. These neurons are clearly in the minority in our samples for both regions. In theory, a negative ratio means that the rise in activity is greater when the coherence is higher and when the monkey chose the small-reward target. In a separate, unpublished study, we found that a choice-confidence signal can predict such patterns for our task. Choice confidence is expected to be higher with stronger evidence. Less intuitively, a monkey would only choose the small-reward target when the confidence is high (i.e., confidence is higher on average for small-reward choice trials). A combination of these two effects could lead to negative ratios between coefficients of reward context and coherence for signals related to choice confidence. In the interest of keeping this manuscript focused on model-predicted decision variable-related signals, we added the following text to the Discussion to raise this possibility without going into more detail:

“Besides decision formation, both FEF and the caudate nucleus have other hypothesized decision-related roles, including performance monitoring (Ding and Gold, 2010, 2012c, 2013; Teichert et al., 2014; Yanike and Ferrera, 2014a). As we showed, a substantial fraction of reward context-modulated neural activity were present in neurons that were not consistently selective for choice. These activity patterns were sensitive to reward biases in drift rates during motion viewing (Figure 6—figure supplement 1) or in the baseline firing before motion onset and saccade onset (Figure 7—figure supplement 2). In addition, some choice-selective neurons showed negative ratios of reward context and coherence coefficients (Figure 6D), which are not consistent with the decision variable predicted by the DDM but could reflect a choice confidence signal instead. It would be important to investigate further the exact functional roles of these activity patterns for solving decision-making tasks.”

7) The dissociation between Figure 6C and D is based on finding a significant correlation in C, but not in D. The result would be strengthened by being able to show that there is a statistical difference between one particular parameter that can be estimated for both FEF and caudate neurons. For example, is it possible to estimate both linear regression slopes and to show that they are statistically different?

As mentioned above, after including monkey A’s data, there was a significant negative correlation for caudate neurons, in contrast to the significant positive correlation for FEF neurons. Given the opposite signs, a statistical test is no longer necessary.

8) To test whether reward-related changes in neural activity were related to the variability in the bound height, they focused on the consistency in the signs of the regression coefficients for reward context and bias in the bound heights (Figure 7D). However, these coefficients are negative for the majority of the neurons, so they do not address the question of whether the variability in these two measures is correlated across sessions (for example, what was the correlation coefficient for the data shown in Figure 7D?).

As mentioned above, this figure is no longer presented. But the reviewers’ comment still applies to analysis of potentially bound asymmetry-related neural signals. Because the value of the reward-context coefficient in our regression analysis depends on a neuron’s firing range, a (lack of) correlation between the coefficient and the behaviorally measured bias cannot be interpreted. Ideally, the bound bias-related neural activity should be normalized to the true value of the bound-related firing rate. However, the latter cannot be measured in practice, particularly because the final bound can change with reward context.

This is an admittedly difficult challenge. Even including trials with equal reward would not necessarily address this challenge, because monkeys can change their total bound heights between equal and asymmetric reward contexts. This is also different from analysis of activity related to drift-rate bias, where the coherence modulation in the same neuron can be used for normalization.

9) The authors have used two regression models to examine the activity of FEF and CD neurons, one including coherence (sensory variable), and the other including RT (motor variable). Since the effects of these variables were modeled separately for each choice (e.g., ipsi vs. contra), they are correlated with each other, but in principle, this should not prevent the authors from including both of these regressors in the same model. This would be preferred, because it is likely that these two factors still influence the activity in either or both of these structures differently.

Because RT depends on reward context and coherence in complicated, non-linear ways, we do not believe that regression results with all three parameters are easily interpretable.

Also, when the effects of coherence and RT were summarized in Figure 3, it would be better to correct them for multiple comparisons.

Our interpretation of the fraction of neurons is based on chance levels corrected for multiple comparisons (e.g., Figure 3A caption: “Dashed lines: chance level, adjusted for the number of comparisons. Filled circles: the fraction was significantly greater than chance level (Chi-square test, p<0.05/72 (8 epochs x 9 comparisons))”).
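The adjusted criterion quoted from the caption divides the nominal 0.05 level by the total number of tests, which is the standard Bonferroni-style correction. A short sketch of that arithmetic follows; the function name is hypothetical and the snippet is only an illustration.

```python
def adjusted_alpha(alpha: float, n_epochs: int, n_comparisons: int) -> float:
    """Divide the nominal criterion by the total number of tests."""
    return alpha / (n_epochs * n_comparisons)


# 8 epochs x 9 comparisons = 72 tests, as stated in the Figure 3A caption.
threshold = adjusted_alpha(0.05, n_epochs=8, n_comparisons=9)
print(f"per-test criterion: {threshold:.6f}")
```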

In addition, the reward context was varied across blocks, which would introduce serial correlation in both independent and dependent variables of the regression model, making a standard t-test no longer appropriate (c.f., Elber-Dorozko and Loewenstein, 2018).

We agree with the reviewers that serial correlation due to slow fluctuations in neural activity should be considered for blocked designs. The potential effects of serial correlation are expected to be similar for all time points within the short time scale of a trial. Because our analyses were performed in multiple epochs and time windows (Figure 3) and still showed strong dependence on time/epoch, we did not consider this to be a major caveat. Nevertheless, the comment prompted us to perform the permutation test suggested by Elber-Dorozko and Loewenstein. Now only neurons that passed the p=0.05 cutoff for both the standard t-test and the permutation test are identified as significant in Figure 3A-D and Figure 8. We note that applying this additional criterion did not qualitatively change our results.
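One generic way to build a null distribution that respects a blocked design is to permute the reward-context labels across whole blocks rather than across single trials, so that trials within a block stay together. The sketch below illustrates that idea only; it is not necessarily the procedure of Elber-Dorozko and Loewenstein or the one used for Figure 3 and Figure 8, and the function names, data layout, and number of permutations are assumptions.

```python
import random
from statistics import mean


def context_effect(block_rates, block_labels):
    """Mean rate over trials in label-1 blocks minus mean rate in label-0 blocks."""
    large = [r for rates, lab in zip(block_rates, block_labels) if lab == 1 for r in rates]
    small = [r for rates, lab in zip(block_rates, block_labels) if lab == 0 for r in rates]
    return mean(large) - mean(small)


def block_permutation_p(block_rates, block_labels, n_perm=2000, seed=0):
    """Two-sided p-value from shuffling context labels across whole blocks."""
    rng = random.Random(seed)
    observed = abs(context_effect(block_rates, block_labels))
    labels = list(block_labels)
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # keeps the number of blocks per context fixed
        if abs(context_effect(block_rates, labels)) >= observed - 1e-12:
            n_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (n_extreme + 1) / (n_perm + 1)


def passes_both(p_ttest, p_perm, alpha=0.05):
    """Combined criterion: significant only if both tests pass."""
    return p_ttest < alpha and p_perm < alpha


# Hypothetical firing rates (spikes/s) for 8 alternating blocks of 3 trials.
rates = [[20, 21, 19], [10, 11, 9]] * 4
labels = [1, 0] * 4
print("permutation p:", block_permutation_p(rates, labels))
```

Note that with only a handful of blocks the number of distinct label arrangements is small, which limits how small the permutation p-value can become.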

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

This is a significant study that advances our understanding of the functional specialization of the frontal eye field and caudate nucleus during perceptual decision making with asymmetric reward outcomes. The reviewers appreciated the extensive amount of work performed by the authors to address their previous concerns, but have one important remaining concern. In particular, the reviewers are concerned that the relationship between the behavioral bias in drift and the neural effects of reward context seen in the FEF (Figure 6) might be driven by the results from one animal (monkey A). In addition, it is possible that the results might be due to individual differences among the monkeys rather than session-by-session (or neuron-by-neuron) changes in behavior (i.e., Simpson's paradox). Therefore, it is strongly suggested that the authors analyze and report the results separately for individual animals, or at least repeat this analysis after excluding the results from monkey A.

We followed the reviewers’ suggestion and performed the correlation analysis separately for the three monkeys (Figure 6—figure supplement 2, also see below for your convenience). As expected, given the smaller sample sizes, none of the per-monkey results was statistically significant (monkey C had the highest correlation coefficient, then monkey A, with monkey F’s below zero). We also repeated the analysis with Spearman’s correlation, which tests for a monotonic (not necessarily linear) relationship, and found that monkey C showed a positive coefficient of 0.9 (p=0.037), monkey A showed a non-significant positive coefficient of 0.14 (p=0.66), and monkey F showed a coefficient of -0.1 (p=0.87). These results suggest that the effect we reported in the main figure was not driven by only monkey A or an artifact due to Simpson’s paradox. We have added the following text to clarify our interpretation of these across- (not within-) monkey findings: “As expected given the smaller sample sizes, none of the per-monkey results was statistically significant (Figure 6—figure supplement 2). These results indicated a close relationship between FEF neurons and the neural implementation of reward biases in drift rates assessed across monkeys.”
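To make the Simpson's paradox concern concrete, the sketch below computes Spearman's rank correlation within groups and on the pooled data. It is purely illustrative: the data are made up (they are not the recorded values), the group names are placeholders, and the helper functions are hypothetical stand-ins for a statistics library.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up data: within each group y falls as x rises,
# but the groups differ in overall level.
group_1 = ([1, 2, 3], [3, 2, 1])
group_2 = ([11, 12, 13], [13, 12, 11])
pooled_x = group_1[0] + group_2[0]
pooled_y = group_1[1] + group_2[1]

print("within group 1:", spearman(*group_1))
print("within group 2:", spearman(*group_2))
print("pooled:", spearman(pooled_x, pooled_y))
```

In this toy case the pooled correlation is positive even though each within-group correlation is negative, which is the pattern that a per-animal analysis is meant to rule out.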

We hope these new results and explanation address the reviewers’ concern. In addition, because of the somewhat complicated nature of the results that do not necessarily support the idea of “complementary” roles of FEF and caudate, we changed the title to “Frontal eye field and caudate neurons make different contributions to reward-biased perceptual decisions”.

https://doi.org/10.7554/eLife.60535.sa2

Article and author information

Author details

  1. Yunshu Fan

    Department of Neuroscience and Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, United States
    Contribution
    Data curation, Formal analysis, Investigation, Methodology, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-2597-5173
  2. Joshua I Gold

    Department of Neuroscience and Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, United States
    Contribution
    Conceptualization, Funding acquisition, Visualization, Writing - review and editing
    Competing interests
    Senior Editor, eLife
    ORCID: 0000-0002-6018-0483
  3. Long Ding

    Department of Neuroscience and Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, United States
    Contribution
    Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    lding@pennmedicine.upenn.edu
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-1716-3848

Funding

National Institutes of Health (R01-EY022411)

  • Long Ding
  • Joshua I Gold

University of Pennsylvania (University Research Foundation Pilot Award)

  • Long Ding

Hearst Foundations (Graduate student fellowship)

  • Yunshu Fan

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Jean Zweigle for animal care and Drs. Kae Nakamura and Takahiro Doi for helpful comments. This work was supported by NIH National Eye Institute (R01-EY022411; LD and JIG), University of Pennsylvania (University Research Foundation Pilot Award; LD), and Hearst Foundations Graduate student fellowship (YF).

Ethics

Animal experimentation: All training and experimental procedures were in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals and were approved by the University of Pennsylvania Institutional Animal Care and Use Committee (protocol #804726). Details about monkey training, behavioral tasks, and caudate recording were reported previously (Fan et al., 2018; Doi et al., 2020).

Senior Editor

  1. Michael J Frank, Brown University, United States

Reviewing Editor

  1. Daeyeol Lee, Johns Hopkins University, United States

Reviewers

  1. Chandramouli Chandrasekaran
  2. Jochen Ditterich, University of California, Davis, United States

Publication history

  1. Received: July 1, 2020
  2. Accepted: November 18, 2020
  3. Version of Record published: November 27, 2020 (version 1)

Copyright

© 2020, Fan et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


