Basal ganglia output reflects internally-specified movements

  1. Mario J Lintz
  2. Gidon Felsen (corresponding author)
  1. University of Colorado School of Medicine, United States
Research Article
Cite as: eLife 2016;5:e13833 doi: 10.7554/eLife.13833

Abstract

How movements are selected is a fundamental question in systems neuroscience. While many studies have elucidated the sensorimotor transformations underlying stimulus-guided movements, less is known about how internal goals – critical drivers of goal-directed behavior – guide movements. The basal ganglia are known to bias movement selection according to value, one form of internal goal. Here, we examine whether other internal goals, in addition to value, also influence movements via the basal ganglia. We designed a novel task for mice that dissociated equally rewarded internally-specified and stimulus-guided movements, allowing us to test how each engaged the basal ganglia. We found that activity in the substantia nigra pars reticulata, a basal ganglia output, predictably differed preceding internally-specified and stimulus-guided movements. Incorporating these results into a simple model suggests that internally-specified movements may be facilitated relative to stimulus-guided movements by basal ganglia processing.

https://doi.org/10.7554/eLife.13833.001

eLife digest

An important role of the nervous system is to allow us to move around in the world. These movements are typically influenced by the goal that we want to achieve (for example, finding food) as well as stimuli that we sense in our environment (for example, the smell of pizza). Yet we understand little about how the brain controls these sorts of goal-directed movements, even under normal conditions. This lack of basic understanding presents a big problem when it comes to treating movement disorders like Parkinson’s disease.

For a long time, a collection of brain regions called the basal ganglia have been known to be important for controlling movements, although the specific role that they play in this process is not well understood. Does the brain activity that controls movements differ depending on whether the movement is made in response to a stimulus or not?

Using mice, Lintz and Felsen have now recorded the activity of individual neurons in the basal ganglia that signal to other brain regions as the animals performed a behavioral task. Different trials in the task required the mouse to make two types of otherwise-identical movements: movements based on a stimulus, and movements based on recent experiences (and not triggered by a stimulus). The output activity of the basal ganglia differed under these two conditions, suggesting that the basal ganglia may play different roles in each type of movement.

From the results, Lintz and Felsen could make some predictions about how the basal ganglia influence the activity of downstream regions of the nervous system that control movement. Further studies are now required to test these predictions.

https://doi.org/10.7554/eLife.13833.002

Introduction

As we interact with the world, our movements are selected based on external sensory stimuli and internal variables representing action value, learned stimulus-response contingencies, and prior experiences (Gold and Shadlen, 2007). Selecting the movement associated with the most desirable outcome requires appropriately weighting each of these factors. While the neural substrates for movements based on external sensory stimuli have been the focus of much research (Hall and Moschovakis, 2003), where, how, and when internal goals influence movement selection is less well understood. The basal ganglia (BG) are known to be involved in motor control (Hikosaka and Wurtz, 1989; Mink, 1996), contributing to movement selection by modulating inhibition on competing downstream motor structures (Basso and Wurtz, 2002; Di Chiara et al., 1979; Hikosaka and Wurtz, 1985). In particular, the BG have been thought to bias the selection of movements towards those associated with the highest value (Hikosaka et al., 2006). This 'value-biasing' hypothesis is supported by much evidence showing that activity in several BG nuclei is modulated, prior to stimulus presentation, by reward expectation (Bryden et al., 2011; Handel and Glimcher, 2000; Hikosaka et al., 2006; Kawagoe et al., 1998; Sato and Hikosaka, 2002) such that movements toward high-value targets are disinhibited relative to movements toward low-value targets (Hikosaka et al., 2006; Lauwereyns et al., 2002). Anatomical evidence is consistent with a primary role for the BG in mediating the integration of value-based information into motor plans (Bolam et al., 2000; Gerfen and Surmeier, 2011). However, movement selection may also be guided by other internal representations, such as recent movements and their outcomes (Corrado et al., 2005; Fecteau and Munoz, 2003; Lau and Glimcher, 2005). We therefore asked whether BG activity mediates the influence of internal goals, in addition to value, on movement selection.

We reasoned that, if this were the case, BG output would differ when selecting equally valuable stimulus-guided and internally-specified movements. Specifically, we would expect that internally-specified movements would be promoted relative to otherwise-identical stimulus-guided movements, just as more valuable movements have been shown to be promoted relative to otherwise-identical less valuable movements (Hikosaka et al., 2006; Sato and Hikosaka, 2002). Notably, it has been proposed that Parkinsonian patients exhibit more pronounced bradykinesia when initiating internally-specified than stimulus-guided movements because the latter engage pathways outside of the BG (Glickstein and Stein, 1991). However, whether the BG themselves are differentially engaged by these two types of movements has not been tested.

We distinguished between these two possibilities by recording from neurons in the substantia nigra pars reticulata (SNr), an output nucleus of the BG critical for orienting movements (Basso and Sommer, 2011; Handel and Glimcher, 1999; Hikosaka and Wurtz, 1983a), in mice performing a behavioral task in which a sensory stimulus either was or was not informative of the rewarded direction of an orienting movement. Using a design akin to that of other recent studies (Pastor-Bernier and Cisek, 2011; Seo et al., 2012; Ito and Doya, 2015), in alternating blocks of trials, the rewarded direction was either determined by a sensory cue or by internal representations informed by recent trial history. Critically, we designed the task such that correct movements were equally valuable in both conditions. We found that SNr activity predictably differed between these two conditions, supporting the idea that the BG mediate the influence on movement selection of internal goals. We interpret these results, in the context of a simple model of BG output (Hikosaka et al., 2006), as suggesting that internally-specified movements may be promoted over stimulus-guided movements by BG activity.

Results

Behavioral assay dissociates stimulus-guided and internally-specified movements

We trained mice on a delayed-response spatial choice task comprised of interleaved blocks of 'stimulus-guided' (SG) trials, in which the direction of movement is selected based on a sensory stimulus (Uchida and Mainen, 2003), and 'internally-specified' (IS) trials, in which the direction of an otherwise-identical movement is selected based on internal representations informed by recent trial history (see Materials and methods; Figure 1A,B). In each trial of the task, the mouse is presented with a binary odor mixture at a central port, waits for an auditory go cue, and moves to the left or right reward port for a water reward. In SG trials, the dominant component of the odor mixture – which varies trial by trial – determines the side at which reward will be delivered, while in IS trials, a balanced mixture of the two odors is always presented but reward is delivered at only one side throughout the block (see Materials and methods; Figure 1B). Thus, while both trial types require the mouse to sample the stimulus, in SG trials the stimulus indicates that the rewarded side is determined by the odor mixture and in IS trials the stimulus indicates that the rewarded side is determined by the recent history of choices and outcomes. We found that mice were able to infer (unsignaled) transitions between the SG and IS blocks and switch their response mode accordingly: during SG blocks, mice were equally likely to choose the left and right port (Figure 1C, gray boxes) reflecting a dependence on the odor mixture (Figure 1D), while during IS blocks, mice reliably returned to the same (rewarded) port in each trial (Figure 1C, white boxes). We quantified performance in IS blocks by calculating, for each block, the percentage of correct trials and the number of error events, defined as a run of consecutive incorrect choices (Figure 1E). 
Finally, we reasoned that if a mouse were to recognize that a given trial belonged to an IS block, it could prepare its movement in advance and would therefore be able to reach the reward port faster (Niemi and Näätänen, 1981; Poulton, 1950; Seo et al., 2012). Indeed, across the population of sessions, we found that reaction time – defined as the time from the go cue to reward port entry – was shorter in IS trials than in the 'easy' SG trials in the same session [Figure 1F; we used only easy SG trials (mixture ratios of 95/5, 80/20, 20/80, and 5/95) to control for a potential dependence of reaction time on difficulty; population of sessions: p = 1.7 × 10⁻¹¹, paired t-test; individual sessions: IS shorter than SG in 46/108, SG shorter than IS in 3/108, p < 0.05, Wilcoxon rank-sum test; ipsiversive and contraversive trials compared separately]. Together, these data suggest that, as intended, the direction of movement in SG blocks is selected based on the stimulus while the direction of movements in IS blocks is selected based on recent trial history. We therefore used this behavioral assay to compare how stimulus-guided and internally-specified movements are mediated by the BG, as described below.

Behavioral task and performance.

(A) Timing of events in each trial. The mouse enters the odor port, receives an odor mixture, waits for the go signal, exits the odor port, moves to one of the reward ports, and receives a water reward for a correct choice. Gray box, delay epoch. (B) Organization of SG (gray) and IS (white) blocks within a session. All sessions start with an SG block and alternate between SG and IS blocks. In SG blocks, reward side corresponds to the dominant odor in the mixture [(-)-carvone, left; (+)-carvone, right]; when the odors are balanced ([(-)-carvone] = [(+)-carvone]), the probability of reward at both reward ports is 0.5. In IS blocks, odors are balanced in every trial and reward is available at the same side in each trial. Thickness of horizontal lines corresponds to probability of reward. SG, stimulus guided; IS, internally specified; L, left; R, right. (C) Fraction of left choices across block types throughout the session. Dashed line shows an example session (boxcar smoothed over 7 trials), solid black line shows mean over all sessions (54, from 4 mice), horizontal black lines show block means, horizontal gray lines show ideal block means (if all choices were correct). To account for different numbers of trials per block across sessions, trials that occur in < 60% of sessions are excluded. In SG blocks only difficult trials [(+)-carvone/(-)-carvone = 60/40, 50/50, or 40/60] are shown. (D) Mean performance in SG blocks over all sessions, separated by rewarded side of first IS block in the session. Lines show best fit to $p = \frac{1}{1 + e^{-a - bx}}$, where x is the proportion of the left odor [(-)-carvone] in the mixture, p is the fraction of right choices, and a and b are free parameters. While choices were slightly biased by the rewarded direction in the first IS block (center panels), they were much more strongly influenced by the stimulus. (E) Performance in IS blocks.
Histograms of percent correct choices (top) and number of error events (run of consecutive incorrect choices, bottom) across blocks over all sessions. (F) Mean reaction time in easy SG trials plotted against mean reaction time in IS trials in the corresponding session, separately for each direction of movement.
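
The psychometric fit in panel D can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes the fitted function is the standard two-parameter logistic p = 1/(1 + e^(-a - bx)), estimates a and b by maximum likelihood using Newton's method, and runs on made-up choice counts. The function names, the fitting procedure, and the data are all hypothetical.

```python
import math

def logistic(x, a, b):
    """Two-parameter logistic: p = 1 / (1 + exp(-a - b*x))."""
    return 1.0 / (1.0 + math.exp(-a - b * x))

def fit_logistic(xs, n_right, n_total, iters=50):
    """Maximum-likelihood fit of (a, b) to binomial counts by Newton's method."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        # g_*: gradient of the log-likelihood; h_*: entries of the negative Hessian.
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for x, k, n in zip(xs, n_right, n_total):
            p = logistic(x, a, b)
            w = n * p * (1.0 - p)
            g_a += k - n * p
            g_b += (k - n * p) * x
            h_aa += w
            h_ab += w * x
            h_bb += w * x * x
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve the 2x2 system (negative Hessian) * delta = gradient.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b

# Hypothetical data: proportion of the left odor in the mixture (x) and the
# number of right choices out of 100 trials at each mixture ratio.
xs = [0.05, 0.20, 0.40, 0.50, 0.60, 0.80, 0.95]
n_total = [100] * 7
n_right = [95, 88, 66, 50, 36, 12, 4]

a_hat, b_hat = fit_logistic(xs, n_right, n_total)
print(f"a = {a_hat:.2f}, b = {b_hat:.2f}")
```

With x defined as the proportion of the left odor and p as the fraction of right choices, the fitted slope b is negative: right choices become less frequent as the left odor dominates.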

https://doi.org/10.7554/eLife.13833.003

SNr activity differs for stimulus-guided and internally-specified movements

If the BG integrate not only value but also other internal representations, then stimulus-guided and internally-specified movements may differentially engage the BG despite being equally valuable. In this case, we would predict that BG output would depend on whether the movement was internally specified or stimulus guided, and specifically, in our task, on the degree to which recent trial history is informative of rewarded direction. To test this prediction we examined activity in the SNr, a BG output known to be involved in orienting movements (Basso and Sommer, 2011; Handel and Glimcher, 2000, 1999; Hikosaka and Wurtz, 1983a, 1983b). We recorded from 296 well-isolated left SNr neurons (see Materials and methods; Figure 2) in four mice performing the task (see Supplementary file 1). Data from one example neuron are shown in Figure 3A,B, segregated by reward port selected (ipsilateral vs. contralateral to the recording side) and by trial type (SG vs. IS). The activity of this neuron clearly depends on both movement direction and trial type. To examine these dependencies across the population of neurons we first examined firing rate during the delay epoch, defined as the time from odor valve open to the time of odor port exit (Figure 1A, gray box), which most directly captures, across trial types, activity underlying selection of the direction of movement (but since activity in IS trials may, by design, reflect direction selection even before stimulus delivery, we subsequently examine activity in other epochs). Based on the firing rate during this epoch in SG and IS trials, we then calculated direction preference (see Materials and methods). This value ranges from -1 (strongly 'prefers' ipsiversive) to 1 (strongly prefers contraversive), where 0 represents no preference.
We found that 216/296 neurons displayed a significant direction preference (p<0.05) during the delay epoch, with about as many preferring ipsiversive (94/216) as contraversive (122/216) movements (Figure 3C). Since SNr activity has been shown to exhibit both movement-related increases and decreases (Bryden et al., 2011; Gulley et al., 2002, 1999; Handel and Glimcher, 1999; Sato and Hikosaka, 2002), we next asked whether a relationship existed between direction preference and the sign of activity change during the delay epoch, relative to baseline (see Materials and methods). We found that 188/296 neurons exhibited an increase in activity during this epoch, 91/296 exhibited a decrease, and 17/296 exhibited no change (Table 1), consistent with previous studies (Bryden et al., 2011; Gulley et al., 2002, 1999; Handel and Glimcher, 2000, 1999; Sato and Hikosaka, 2002). Within these groups, neurons exhibited a preference for ipsiversive, contraversive, or neither direction in roughly equal numbers (Table 1).
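
The direction preference measure is defined in the Materials and methods, which are not reproduced here. As one plausible construction consistent with the stated range of -1 to 1, the sketch below assumes an ROC-based index, 2 × (AUC − 0.5), where AUC is the probability that the firing rate on a randomly chosen contraversive trial exceeds that on a randomly chosen ipsiversive trial. The index form, the function names, and the firing rates are assumptions for illustration only.

```python
def roc_auc(ipsi_rates, contra_rates):
    """Probability that a contraversive-trial rate exceeds an ipsiversive-trial
    rate, counting ties as one half (area under the ROC curve)."""
    wins = 0.0
    for c in contra_rates:
        for i in ipsi_rates:
            if c > i:
                wins += 1.0
            elif c == i:
                wins += 0.5
    return wins / (len(contra_rates) * len(ipsi_rates))

def direction_preference(ipsi_rates, contra_rates):
    """Map AUC from [0, 1] onto [-1, 1]: -1 ipsiversive, +1 contraversive."""
    return 2.0 * (roc_auc(ipsi_rates, contra_rates) - 0.5)

# Hypothetical delay-epoch firing rates (spikes/s) for one neuron.
ipsi = [12.0, 15.0, 11.0, 14.0]
contra = [22.0, 25.0, 19.0, 24.0]
pref = direction_preference(ipsi, contra)
print(pref)
```

Under this assumed index, a neuron whose contraversive rates always exceed its ipsiversive rates scores 1, and identical rate distributions score 0. Testing whether a preference is significant (p < 0.05) would need an additional step, such as a permutation test, which is not shown.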

Confirmation of recording sites and spike clustering.

(A) Schematic (left) shows targeted recording extent (bar) within SNr; coronal section (right, 3.3 mm caudal from bregma) shows representative tetrode track (arrow) in SNr. (B) Left, peaks of waveforms from lead 1 plotted against peaks of waveforms from lead 3 of one tetrode for a representative recording session. Note that clustering was performed using additional features to those shown here. Red and green points show waveform peaks recorded from neurons considered to be distinct. Right, waveforms (mean ± SD) corresponding to red and green points.

https://doi.org/10.7554/eLife.13833.004
SNr activity during the delay epoch depends on movement direction and trial type.

(A) Rasters for an example neuron grouped by movement direction (rows) and trial type (columns). For each raster, each row shows spikes (black ticks) in one trial, aligned to time of odor valve open (red line) and sorted by duration of delay epoch. Green ticks, times of go signal; blue ticks, times of odor port exit. Fifty pseudo-randomly selected trials are shown per group. (B) Peri-event histograms showing average activity, separately by direction, in stimulus-guided (left) and internally-specified (right) trials. Shading, ± SEM. Histograms are smoothed with a Gaussian filter (σ = 15 ms). Ipsi., ipsiversive; Contra., contraversive. (C) Histogram of direction preferences during delay epoch across population of neurons. Arrowhead corresponds to example neuron in A. (D) Difference in delay-epoch firing rate between ipsiversive and contraversive trials in SG vs. IS trials in the same session, separately for ipsiversive-preferring neurons (left subpanel, corresponding to left black bars in C) and contraversive-preferring neurons (right subpanel, corresponding to right black bars in C). Only correct trials are included; all choices on 50/50 SG trials were considered correct regardless of whether they were rewarded. Dashed lines show x = 0, y = 0, and x = y. Red marker corresponds to example neuron from A and B. FR, firing rate.
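
The peri-event histograms in panel B can be sketched as a trial-averaged spike count convolved with a normalized Gaussian kernel (σ = 15 ms, as stated in the caption). The 1 ms bin width, the truncation of the kernel at 4σ, the zero-padding at the edges, and all function names are assumptions made for this illustration.

```python
import math

def gaussian_kernel(sigma_ms, bin_ms=1.0, n_sigmas=4):
    """Normalized Gaussian kernel sampled at the bin width, truncated at n_sigmas."""
    half = int(round(n_sigmas * sigma_ms / bin_ms))
    weights = [math.exp(-0.5 * (k * bin_ms / sigma_ms) ** 2)
               for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def psth(trials, t_start_ms, t_stop_ms, bin_ms=1.0):
    """Trial-averaged firing rate (spikes/s) per bin.
    Each trial is a list of spike times in ms relative to the alignment event."""
    n_bins = int((t_stop_ms - t_start_ms) / bin_ms)
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            idx = int((t - t_start_ms) // bin_ms)
            if 0 <= idx < n_bins:
                counts[idx] += 1
    scale = 1000.0 / (bin_ms * len(trials))
    return [c * scale for c in counts]

def smooth(values, kernel):
    """Convolve with the kernel; samples beyond the edges are treated as zero,
    so the smoothed rate is attenuated near the start and end."""
    half = len(kernel) // 2
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(values):
                acc += w * values[j]
        out.append(acc)
    return out

# Hypothetical spike times (ms from odor valve open) for two trials.
trials = [[10.0, 50.0], [50.0]]
kernel = gaussian_kernel(15.0)
rates = psth(trials, 0.0, 100.0)
smoothed = smooth(rates, kernel)
```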

https://doi.org/10.7554/eLife.13833.005
Table 1

Direction preference and activity change during delay epoch. Neurons are grouped by direction preference and whether activity in the preferred direction increased or decreased relative to baseline (see Materials and methods), during the delay epoch. Numbers and percentages of grand total (279) are shown; note that 17 neurons exhibited no change in activity and are not included here.

https://doi.org/10.7554/eLife.13833.006
Preference       Increase      Decrease     Total
Contraversive    88 (32%)      27 (10%)     115 (41%)
Ipsiversive      52 (19%)      38 (14%)     90 (32%)
Nonselective     48 (17%)      26 (9%)      74 (27%)
Total            188 (67%)     91 (33%)     279 (100%)

We next examined whether activity during the delay epoch of direction-selective neurons (Figure 3C, black bars) differed between SG and IS trials, in two complementary ways. First, we examined whether the difference in activity between ipsiversive and contraversive trials depended on whether the movement was stimulus guided or internally specified. Across our population, neurons tended to show a larger difference in firing rate preceding ipsiversive and contraversive movements in IS than in SG trials [Figure 3D; ipsiversive-preferring neurons: p = 2.2 × 10⁻¹⁰, paired t-test; contraversive-preferring neurons: p = 4.0 × 10⁻⁴, paired t-test]. Second, we determined whether neurons were trial-type-dependent by comparing firing rates between SG and IS trials in which the selected movement was correct and in the preferred direction of the neuron; we then repeated this comparison for the antipreferred direction. For trials in the preferred direction, we found that the activity of approximately half of the direction-selective neurons was modulated by trial type (101/216; p < 0.05, unpaired t-test), with more neurons exhibiting higher activity in IS trials than SG trials (84/101 vs. 17/101; p = 2.6 × 10⁻¹¹, χ² test). Conversely, for trials in the antipreferred direction, we again found that the activity of approximately half of the direction-selective neurons was modulated by trial type (109/216; p < 0.05, unpaired t-test; 158/216 direction-selective neurons were modulated by trial type in at least one direction), but that more neurons exhibited higher activity in SG trials than IS trials (76/109 vs. 33/109; p = 3.8 × 10⁻⁵, χ² test). Therefore, while we found that neurons were about equally likely to prefer upcoming ipsiversive and contraversive movements (Figure 3C), their activity depended, in a predictable manner, on trial type (Figure 3D).
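
The two χ² comparisons of neuron counts above (84 vs. 17, and 76 vs. 33) can be illustrated with a one-degree-of-freedom Pearson test against an even split. The text does not state which form of the χ² test was used, so this form is an assumption; under it, the two comparisons give p ≈ 2.6 × 10⁻¹¹ and p ≈ 3.8 × 10⁻⁵. The function name is hypothetical.

```python
import math

def chi2_equal_split(count_a, count_b):
    """Pearson chi-square test (1 d.o.f.) of whether two counts
    differ from an even split of their total."""
    expected = (count_a + count_b) / 2.0
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Preferred direction: neurons with higher activity in IS (84) vs. SG (17) trials.
chi2_pref, p_pref = chi2_equal_split(84, 17)
# Antipreferred direction: higher activity in SG (76) vs. IS (33) trials.
chi2_anti, p_anti = chi2_equal_split(76, 33)
print(f"preferred: chi2 = {chi2_pref:.2f}, p = {p_pref:.2g}")
print(f"antipreferred: chi2 = {chi2_anti:.2f}, p = {p_anti:.2g}")
```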

While these findings suggest that the BG differentially mediate internally-specified and stimulus-guided movements, as we had predicted, a few differences between SG and IS trials may have contributed to this observation. We therefore sought to identify these differences and determine their influence, in several ways. First, we reasoned that, if neural activity indeed reflects trial type, firing rate would systematically change during the IS block as the mouse increasingly based its movement choice on internal representations instead of the stimulus (recall that the transitions from SG to IS blocks were unsignaled). To test this idea, we calculated the correlation between the trial-by-trial firing rate during the delay epoch and the extent to which the mouse had learned that its movement choice should be internally specified, estimated with a reinforcement learning algorithm (see Materials and methods). We performed this analysis on the 158 neurons with firing rates that depended on both direction and trial type, separately for choices in the preferred and antipreferred direction. Figure 4A shows data from an example neuron displaying a significant correlation for trials in the preferred direction of the neuron (r = 0.65, p = 7.0 × 10⁻⁹), and no correlation for trials in the antipreferred direction (r = 0.066, p = 0.60). Overall, 77/158 neurons exhibited a significant correlation (p < 0.05) between firing rate and the number of consecutive correct trials for either direction [Figure 4B,C; 35/77 for trials in the preferred direction (red circles), 29/77 for trials in the antipreferred direction (blue circles), and 13/77 for trials in both directions (purple circles)], with more positive correlations for trials in the preferred direction (p = 2.4 × 10⁻⁶, χ² test) and negative correlations for trials in the antipreferred direction (p = 5.9 × 10⁻⁶, χ² test), as we would expect given the pattern of results shown in Figure 3D.
These results support the idea that SNr activity reflects the degree to which movements are selected based on internal representations.
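
The logic of this analysis can be sketched in two steps: estimate, trial by trial, the learned value of each reward side, then correlate that estimate with delay-epoch firing rate. The reinforcement learning algorithm itself is specified in the Materials and methods and is not reproduced here, so the delta-rule update below, V ← V + α(r − V), together with the learning rate α = 0.2, the initial value of 0.5, and the firing rates, are assumptions chosen only to make the example concrete.

```python
import math

def update_values(choices, rewards, alpha=0.2, v0=0.5):
    """Delta-rule estimate of the value of each side before every trial.
    choices: 'L' or 'R' per trial; rewards: 1 (rewarded) or 0 per trial.
    Returns one (V_left, V_right) tuple per trial."""
    v = {'L': v0, 'R': v0}
    history = []
    for choice, reward in zip(choices, rewards):
        history.append((v['L'], v['R']))
        # Only the chosen side is updated, toward the obtained outcome.
        v[choice] += alpha * (reward - v[choice])
    return history

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical IS block: the left port is rewarded and always chosen.
choices = ['L'] * 8
rewards = [1] * 8
values = [v_left for v_left, _ in update_values(choices, rewards)]

# Hypothetical delay-epoch firing rates (spikes/s) on the same trials.
rates = [10.0, 11.5, 12.0, 13.5, 13.0, 14.5, 15.0, 15.5]
r = pearson_r(values, rates)
print(f"r = {r:.2f}")
```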

Activity depends on the extent to which movements are internally specified.

(A) Firing rate during delay epoch plotted as a function of the value of the rewarded side, estimated via reinforcement learning (V_dir,t), for both IS blocks in a session, for one example neuron. Each circle corresponds to one trial. (B) Correlations (as in panel A) for ipsiversive movement plotted against contraversive movement, for the population of ipsiversive-preferring neurons (left black bars in Figure 3C) with activity that depended on trial type (SG vs. IS). Each circle corresponds to one neuron. (C) Same as B, for contraversive-preferring neurons (right black bars in Figure 3C). Black box corresponds to example neuron from A.

https://doi.org/10.7554/eLife.13833.007

Modulation of SNr activity by task-relevant variables

We next examined the potential influence of other factors on the observed difference in neural activity during SG and IS trials. One difference between these trial types, by design, is that in IS trials the decision (to move left or right) is relatively easy, while in some SG trials this decision is more difficult (Figure 1D). The difficulty of this decision – or an associated variable, such as uncertainty, or the estimated value of each movement direction – could, in principle, affect SNr activity. Were this the case, we would expect to observe a difference in activity between those SG trials requiring an easy discrimination (mixture ratios of 95/5, 80/20, 20/80, and 5/95) and those SG trials requiring a 'difficult' discrimination (mixture ratios of 60/40, 50/50, and 40/60), since easy trials resulted in a larger fraction of correct choices (p = 3.4 × 10⁻²⁷, paired t-test; see Figure 1D), corresponding to a higher likelihood of reward. We therefore compared firing rate during the delay epoch between easy and difficult SG trials, separately for trials in the ipsiversive (Figure 5A) and contraversive (Figure 5B) direction, for the 216 direction-selective neurons (Figure 3C, black bars). We found that the activity of some individual neurons depended on difficulty (or an associated variable) (ipsiversive direction: 39/216 neurons; contraversive direction: 31/216 neurons, p < 0.05, 1-way ANOVA across mixture ratios, Figure 5A,B), as would be predicted by the value-biasing view of BG function.
However, there was little overlap (purple circles) between this small population of difficulty-dependent neurons (blue circles) and those neurons that we classified as trial-type-dependent [ipsiversive direction: 21/109 trial-type-dependent, and 18/107 non-trial-type-dependent, neurons exhibited difficulty dependence (these ratios did not differ: p = 0.32, χ² test); contraversive direction: 16/101 trial-type-dependent, and 15/115 non-trial-type-dependent, neurons exhibited difficulty dependence (these ratios did not differ: p = 0.56, χ² test); p < 0.05, 1-way ANOVA across mixture ratios]. These results suggest that differences in decision difficulty, uncertainty, and the value associated with the direction of movement do not account for trial-type dependence or the differences in activity between SG and IS trials shown in Figure 3D.
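
Comparing the proportion of difficulty-dependent neurons between two groups amounts to a test on a 2 × 2 contingency table. The sketch below applies a Pearson χ² test without continuity correction to the contraversive counts (16/101 vs. 15/115); whether this matches the exact procedure used in the study is an assumption, and the function name is hypothetical.

```python
import math

def chi2_two_proportions(hits_a, n_a, hits_b, n_b):
    """Pearson chi-square test (1 d.o.f., no continuity correction)
    comparing hits_a/n_a with hits_b/n_b via a 2x2 table."""
    observed = [[hits_a, n_a - hits_a],
                [hits_b, n_b - hits_b]]
    total = n_a + n_b
    row_totals = [n_a, n_b]
    col_totals = [hits_a + hits_b, total - hits_a - hits_b]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the two groups shared one underlying proportion.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Contraversive direction: difficulty-dependent neurons among trial-type-dependent
# (16 of 101) and non-trial-type-dependent (15 of 115) neurons.
chi2, p = chi2_two_proportions(16, 101, 15, 115)
print(f"chi2 = {chi2:.3f}, p = {p:.2f}")
```

A large p-value here indicates only that the two proportions are statistically indistinguishable, which is the sense in which the text describes the ratios as not differing.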

Dependence of firing rate on trial type cannot be explained by discrimination difficulty or an associated variable.

(A) Mean normalized change from baseline (Fc, see Materials and methods) during delay epoch of easy vs. difficult ipsiversive SG trials of direction-selective neurons (black bars in Figure 3C). Each circle corresponds to one neuron. Red circles indicate that activity differs between SG and IS trials, and does not depend on mixture ratio (or an associated variable such as discrimination difficulty). (B) Same as A, for contraversive trials.

https://doi.org/10.7554/eLife.13833.008

We then examined whether reaction time (which differs between trial types; Figure 1F) and the choice on the previous trial (which, by design, is more likely to correlate with the current choice in IS blocks than SG blocks; Figure 1C) could explain the difference in activity between SG and IS trials. Preliminary analyses of each of these factors in isolation indicated that, as opposed to discrimination difficulty or an associated variable such as value (Figure 5), they often correlated with firing rate during the delay epoch. In order to determine the relative influence of these factors, as well as other factors that correlate with firing rate – current choice (Figure 3C) and trial type (Figure 3D) – on neural activity during the delay epoch, we performed a linear regression analysis with previous choice, current choice, trial type and reaction time as predictor variables (see Materials and methods). By considering all of these factors simultaneously, this analysis provides an unbiased method for determining their influence on neural activity.
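
A multiple linear regression of this kind can be sketched with ordinary least squares, solving the normal equations for one β coefficient per predictor plus an intercept. The exact model is given in the Materials and methods; the predictor coding used here (previous and current choice as −1 for ipsiversive and +1 for contraversive, trial type as 0 for SG and 1 for IS, reaction time in seconds), the synthetic firing rates, and the function names are assumptions for illustration. Significance testing of each coefficient is not shown.

```python
from itertools import product

def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(predictors, firing_rates):
    """Least-squares beta coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, firing_rates)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical trials: (previous choice, current choice, trial type, reaction time).
trials = list(product((-1, 1), (-1, 1), (0, 1), (0.3, 0.5)))

# Synthetic, noise-free firing rates generated from known coefficients, so the
# fit should recover them: intercept, previous, current, trial type, reaction time.
true_beta = [20.0, 2.0, 5.0, 3.0, -4.0]
rates = [true_beta[0] + sum(b * x for b, x in zip(true_beta[1:], t)) for t in trials]

betas = fit_ols(trials, rates)
print([round(b, 3) for b in betas])
```

Fitting all predictors in one model matters because previous choice, current choice, and trial type are correlated by design in IS blocks; a joint fit attributes to each factor only the variance it explains beyond the others.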

Across our population of neurons, the vast majority were influenced by at least one of these factors (281/296, p<0.05), and we found neurons with firing rates influenced by all possible combinations of factors (Figure 6A). Consistent with the results shown in Figure 3C and D, respectively, this analysis confirms that, as the mouse is selecting its direction of movement, the activity of many SNr neurons was modulated by current choice (167/296) and trial type (142/296). We also found that the activity of many neurons depended on reaction time (111/296). Surprisingly, the largest fraction of neurons exhibited activity modulated by previous choice (188/296). This is particularly interesting because this variable is critical for determining, in an IS block, which direction is associated with reward.

SNr activity is influenced by several task-relevant factors throughout the trial.

(A) Venn diagram showing the number of neurons whose firing rate during the delay epoch was significantly influenced (p < 0.05) by previous choice, current choice, trial type, reaction time, and all combinations of these factors, or by no factor. (B) β coefficients estimated based on firing rate in 100 ms bins aligned to three different trial events for one example neuron (reaction time coefficient not shown, for clarity). Shading, ± 95% confidence interval. (C) Fraction of neurons with a significant β coefficient corresponding to each predictor variable in each 100 ms bin, aligned as in panel B. All 296 neurons were included in this analysis.

https://doi.org/10.7554/eLife.13833.009

Given that movements can initially be selected earlier in IS than SG trials (Figure 3A,B), we wondered how firing rate at other times during the trial depended on previous choice, current choice, trial type, and reaction time. We therefore extended our regression analysis to examine how firing rate is modulated by these factors during overlapping 100 ms windows throughout the trial (see Materials and methods). In the example shown in Figure 6B, the activity of the neuron is modulated by previous choice (cyan line) – i.e., the confidence interval (shading) for this coefficient does not include 0 – even before the odor is delivered (odor valve open), and this influence persists until the movement is initiated (odor port exit). The current choice (black line) does not influence neural activity until ramping up just prior to movement initiation, but then continues to exert an influence for the remainder of the trial. The trial type (magenta line), meanwhile, exerts a moderate influence on the firing rate – specifically, activity is higher for IS trials – until just before movement initiation, after which this influence is diminished. Reaction time was a relatively poor predictor of firing rate (not shown, for clarity). To examine the dynamics of the weights of these factors across our population of neurons, we calculated the fraction of neurons with significant weights in each time window (Figure 6C). The pattern of results was similar to that shown in the example neuron (Figure 6B). Before the odor is delivered, the firing rates of about half of the neurons are influenced by the previous choice. However, as the odor is sampled, the influence of the previous choice decreases and the influence of the current choice increases, with about two thirds of neurons exhibiting a significant weight for the current choice by the time the movement is initiated. Interestingly, trial type modulates the activity of about one third of neurons throughout the trial. 
The influence of reaction time is strongest during movement but has relatively little influence on the population (not shown). These results indicate that SNr activity dynamically reflects trial type and other task-relevant variables throughout the trial, as would be expected if the BG are differentially involved in mediating stimulus-guided and internally-specified movements.
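
The time-resolved version of the analysis starts from firing rates computed in overlapping 100 ms windows aligned to a trial event. A minimal sketch of that windowing step follows; the 100 ms window width comes from the text, while the 50 ms step between windows, the function name, and the spike times are assumptions. Each window's rate would then serve as the dependent variable in a separate regression.

```python
def sliding_window_rates(spike_times_ms, t_start_ms, t_stop_ms,
                         window_ms=100.0, step_ms=50.0):
    """Firing rate (spikes/s) in overlapping windows.
    Returns a list of (window_center_ms, rate) pairs."""
    out = []
    left = t_start_ms
    while left + window_ms <= t_stop_ms:
        # Count spikes in the half-open interval [left, left + window_ms).
        n = sum(1 for t in spike_times_ms if left <= t < left + window_ms)
        out.append((left + window_ms / 2.0, n * 1000.0 / window_ms))
        left += step_ms
    return out

# Hypothetical spike times (ms) relative to an alignment event such as odor valve open.
spikes = [10.0, 60.0, 120.0, 130.0, 260.0]
windows = sliding_window_rates(spikes, 0.0, 300.0)
for center, rate in windows:
    print(center, rate)
```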

Discussion

We have shown that SNr activity preceding orienting movements depends on whether the direction of movement was indicated by a stimulus or was specified by internal variables (Figures 3, 4). While we designed the task such that correct movements were equally valuable across these conditions (Figure 1), given imperfect (and stochastic) choice behavior, the experienced value was not necessarily identical. However, the dependence on trial type could not be accounted for by differences in the estimated value of each movement direction – or an associated variable, such as difficulty or uncertainty in selecting the movement – between the trial types (Figure 5). In some neurons this dependence could be explained, in part, by the choice on the previous trial (Figure 6A), which is informative of the rewarded direction in IS blocks. Over the course of the trial, while the influence on SNr activity of the previous choice decreased and that of the current choice increased, as might be expected given the demands of the task, the influence of trial type remained relatively constant (Figure 6C). These results suggest that the SNr is differentially engaged by stimulus-guided and internally-specified movements.

Previous studies in primates have shown that movement-related SNr activity was higher for memory-guided than visually-guided saccades (Hikosaka and Wurtz, 1983a), and that SNr stimulation had a larger effect on memory- than visually-guided saccades (Basso and Liu, 2007). Movements selected based on a remembered stimulus can be thought of as internally specified, and in this sense our results (Figure 3D) are consistent with these findings and demonstrate that they generalize across species and movement types (saccades and full-body orienting). However, the rewarded direction in both memory- and visually-guided trials was indicated by the stimulus, which was not the case in our IS trials, in which we sometimes observed that direction preference emerged even before stimulus delivery (see example in Figure 3A,B). Further, the difference between direction preference in IS and SG trials during the delay epoch was correlated with the difference between preference in IS and SG trials during the epoch from odor port entry to odor valve open (r = 0.47, p = 4.8 × 10⁻¹⁸). These results demonstrate that, in IS trials, the direction of movement was initially selected independent of the stimulus, which contributes to the difference in activity during the delay epoch that we observe between SG and IS trials. In addition, while Hikosaka and Wurtz (1983a) examined only neurons that exhibited a decrease in activity around the time of contraversive saccades, we examined increasing and decreasing neurons that prefer both ipsiversive and contraversive movement (Gulley et al., 2002, 1999; Handel and Glimcher, 2000; Sato and Hikosaka, 2002) and found that all of these groups exhibited a difference in activity between stimulus-guided and internally-specified movements. Therefore, the differences we observed between internally-specified and stimulus-guided movements extend our understanding of SNr function.

Interestingly, patients with Parkinson’s disease and other BG pathologies have been reported to exhibit greater deficits in the initiation of internally-specified than visually-guided movements (Forssberg et al., 1984; Laplane et al., 1984; Azulay et al., 1999). While the neural basis for this phenomenon is not well understood and remains an active area of study (Distler et al., 2016), it has been suggested that visual cues engage (intact) sensorimotor pathways outside of the BG, such as the cerebellum (Glickstein and Stein, 1991). Our results suggest that differential processing of internally-specified and visually-guided movements within the BG themselves may also contribute to this clinical observation.

As noted above, other studies have found that movement-related SNr activity is modulated by the relative value associated with a movement (Bryden et al., 2011; Sato and Hikosaka, 2002), including whether the movement will be rewarded at all (Handel and Glimcher, 2000). This value dependence likely arises from dopaminergic input to the BG that is thought to convey reward-related information (Schultz et al., 1997), and has been accounted for by a model in which, prior to stimulus presentation, reward expectation modulates striatal inputs to the SNr in order to bias downstream superior colliculus (SC) activity such that the most valuable movement is facilitated (Hikosaka et al., 2006; Wolf et al., 2015). We propose that a similar model can also explain how internally-specified movements, more generally, are facilitated (Figure 7).

Figure 7. Model proposing how the observed activity of ipsiversive-preferring SNr neurons could facilitate internally-specified movements relative to stimulus-guided movements.

(A) Line thickness corresponds to level of activity. Activity preceding stimulus-guided rightward movement. A left SNr neuron is moderately weakly active, providing moderately weak inhibition to the left SC (superior colliculus). A right SNr neuron is moderately strongly active, providing moderately strong inhibition to the right SC. This pattern of activity in the SC moderately promotes rightward movement. (B) Activity preceding internally-specified rightward movement. Compared to A, a left SNr neuron is very weakly active, providing very weak inhibition to the left SC; and a right SNr neuron is very strongly active, providing very strong inhibition to the right SC. This pattern of activity in the SC strongly promotes rightward movement.

https://doi.org/10.7554/eLife.13833.010

To illustrate this idea, consider how, given the data presented here, the relative activity between ipsiversive-preferring left and right SNr neurons would relate to an upcoming rightward movement (we consider relative, rather than absolute, activity since this is most directly relevant to the decision – move left vs. move right – required by our task). Left and right SNr neurons would exhibit a larger difference in activity in IS trials than in SG trials (Figure 3D, left). If ipsiversive-preferring SNr neurons primarily project to the ipsilateral SC (Hikosaka and Wurtz, 1983b), then a downstream left SC neuron, the activity of which promotes rightward movement (Felsen and Mainen, 2012; Horwitz and Newsome, 2001; Stubblefield et al., 2013) will receive less inhibition from the left SNr when the movement is internally specified, thereby facilitating rightward movements that are internally specified (Figure 7). Preceding the same movement, contraversive-preferring left and right SNr neurons would also exhibit a larger difference in activity in IS trials than in SG trials (Figure 3D, right). If contraversive-preferring SNr neurons comprise the 'crossed' projection to the contralateral SC (Jiang et al., 2003), then a downstream right SC neuron would receive more inhibition from the left SNr when the movement is internally specified, again facilitating the rightward movement. However, we observed that slightly more SNr neurons prefer contraversive than ipsiversive movement (Figure 3C) but many fewer SNr neurons project to the contralateral than ipsilateral SC, particularly in rodents (Beckstead et al., 1981; Deniau et al., 1977; Gerfen et al., 1982; Jayaraman et al., 1977), and contraversive-preferring SNr neurons may preferentially project to non-tectal targets. Thus, SNr activity may be consistent with the facilitation of internally-specified contraversive movements. 
Our results therefore extend the model underlying the value-biasing view of BG function (Hikosaka et al., 2006) by suggesting that the influence of the SNr on downstream motor regions is modulated by internal representations in addition to value.

In summary, we have shown that SNr activity depends on whether otherwise-identical movements are specified by internal representations of task variables or guided by an external stimulus. We suggest that this dependence may reflect a facilitation for internally-specified movements, consistent with the view that, although movements are often made in response to sensory stimuli, internal representations of priors play a critical role in guiding motor output (Wolpert and Landy, 2012). Our results are sufficiently consistent with results in primate SNr (Handel and Glimcher, 2000, 1999; Hikosaka and Wurtz, 1983a; Liu and Basso, 2008; Sato and Hikosaka, 2002) that they can inform the interpretation of previous studies (e.g., our proposed extensions of the model explaining the value-biasing role of the BG described above), while also offering novel insight into BG function. Future studies can utilize the task established here, in the experimentally-advantageous awake-behaving mouse model (Carandini and Churchland, 2013), to examine whether the difference in SNr activity preceding internally-specified and stimulus-guided movements is established by local processing or via striatal inputs (Hikosaka et al., 2006; Lauwereyns et al., 2002) and to further elucidate how the BG control goal-directed movements.

Materials and methods

Animal subjects

All experiments were performed according to protocols approved by the University of Colorado School of Medicine Institutional Animal Care and Use Committee. We used male adult C57BL/6J mice (n = 4, determined by estimating the number of neurons required for our analyses and by the number of neurons recorded per mouse in initial experiments; aged 7–14 months at the start of experiments; Jackson Labs) housed in a vivarium with a 12-hr light/dark cycle with lights on at 5:00 am. Food (Teklad Global Rodent Diet No. 2918; Harlan) was available ad libitum. Access to water was restricted to the behavioral session to motivate performance; however, if mice did not obtain ~1 ml of water during the behavioral session, additional water was provided for ~2–5 min following the behavioral session (Smear et al., 2011; Thompson and Felsen, 2013). All mice were weighed daily and received sufficient water during behavioral sessions to maintain >85% of pre-water restriction weight.

Behavioral task

In general, mice were first trained to perform an odor-guided spatial choice task – which was comprised of 'stimulus-guided' (SG) trials – as described in Stubblefield et al. (2013), and were then trained to perform 'internally-specified' (IS) trials. Briefly, each mouse was water-restricted and trained to interact with three ports (center: odor port; sides: reward ports) along one wall of a behavioral chamber (Island Motion). In each trial, the mouse entered the odor port, triggering the delivery of an odor; waited 488 ± 104 ms (mean ± SD) for a go signal (auditory tone); exited the odor port; and entered one of the reward ports (Figure 1A). Premature exit from the odor port resulted in the unavailability of reward on that trial. Odors were comprised of binary mixtures of (+)-carvone and (-)-carvone, commonly perceived as caraway and spearmint, respectively; an enantiomeric odor pair was selected to control for differences in molecular structure of odorant stimuli. In each SG trial, one of seven odor mixtures was presented via an olfactometer (Island Motion): volume (+)-carvone/(-)-carvone = 95/5, 80/20, 60/40, 50/50, 40/60, 20/80, or 5/95. Mixtures in which (+)-carvone > (-)-carvone indicated reward availability only at the right port and mixtures in which (-)-carvone > (+)-carvone indicated reward availability only at the left port [we therefore refer to (-)-carvone as the 'left odor' (e.g., Figure 1D) for simplicity]. In trials in which (+)-carvone = (-)-carvone, the probability of reward at the left and right ports, independently, was 0.5. Reward, consisting of 4 μl of water, was delivered by transiently opening a calibrated water valve 10–100 ms after reward port entry. Odor and water delivery were controlled, and port entries and exits were recorded, using custom software (available at https://github.com/felsenlab; adapted from C. D. Brody) written in MATLAB (MathWorks).

Mice learned to perform SG trials within ~48 sessions (1 session/day); detailed training stages are described in Stubblefield et al. (2013). Mice required an additional ~5 sessions to learn to perform interleaved blocks of SG and IS trials. In every IS trial the 50/50 mixture was presented, and reward was available only at one side throughout the block. Mice were first introduced to interleaved blocks, each of which required 25 correct trials to advance to the next block. Once they performed ~70% of trials in the session correctly, the number of correct trials required per block was increased to 50. Mice performed 5 blocks (SG, IS, SG, IS, SG) per session (Figure 1B); the side associated with reward switched between each IS block. Upon completing training, mice were implanted with microdrives for neural recording (see below). During each of the 54 recording sessions, mice performed 321.81 ± 89.49 (mean ± SD) trials.

Surgery

Details of the surgical procedure are provided in Thompson and Felsen (2013). Briefly, once the mouse was fully trained on the task, it was anesthetized with isoflurane and secured in a stereotaxic device, the scalp was incised and retracted, 2 small screws were attached to the skull, and a craniotomy targeting the left SNr was performed, centered at 3.27 mm posterior from bregma and 1.4 mm lateral from the midline (Paxinos and Franklin, 2004). A VersaDrive 4 microdrive (Neuralynx), containing 4 independently adjustable tetrodes, was affixed to the skull via the screws, luting (3M), and dental acrylic (A-M Systems). A second small craniotomy was performed in order to place the ground wire in direct contact with the brain. After the acrylic hardened, a topical triple antibiotic ointment (Major) mixed with 2% lidocaine hydrochloride jelly (Akorn) was applied to the scalp, the mouse was removed from the stereotaxic device, the isoflurane was turned off, and oxygen alone was delivered to the animal to gradually alleviate anesthetic state. Mice were administered sterile isotonic saline (0.9%) for rehydration and an analgesic (Ketofen; 5 mg/kg) for pain management. Analgesic and topical antibiotic administration was repeated daily for up to 5 days, and animals were closely monitored for any signs of distress.

Electrophysiology

Neural recordings were collected using four tetrodes, wherein each tetrode consisted of four polyimide-coated nichrome wires (Sandvik; single-wire diameter 12.5 μm) gold plated to 0.2–0.4 MΩ impedance. Electrical signals were amplified and recorded using the Digital Lynx S multichannel acquisition system (Neuralynx) in conjunction with Cheetah data acquisition software (Neuralynx).

Tetrode depths were adjusted approximately 23 hr before each recording session in order to sample an independent population of neurons across sessions. To estimate tetrode depths during each session we calculated distance traveled with respect to rotation fraction of the screw that was affixed to the shuttle holding the tetrode. One full rotation moved the tetrode ~250 μm and tetrodes were moved ~62.5 μm between sessions. The final tetrode location was confirmed through histological assessment using electrolytic lesions and tetrode tracks (see below).

Offline spike sorting and cluster quality analysis were performed using MClust software (MClust-3.5, A.D. Redish, et al.) in MATLAB. Briefly, for each tetrode, single units were isolated by manual cluster identification based on spike features derived from sampled waveforms (Figure 2B). Identification of single units through examination of spikes in high-dimensional feature space allowed us to refine the delimitation of identified clusters by examining all possible two-dimensional combinations of selected spike features. We used standard spike features for single unit extraction: peak amplitude, energy (square root of the sum of squares of each point in the waveform, divided by the number of samples in the waveform), and the first principal component normalized by energy. Spike features were derived separately for individual leads. To assess the quality of identified clusters we calculated two standard quantitative metrics: L-ratio and isolation distance (Schmitzer-Torbert et al., 2005). Clusters with an L-ratio of less than 0.70 and isolation distance greater than 6.5 were deemed single units, which resulted in the exclusion of 12% of the identified clusters. Although units were clustered without knowledge of interspike interval, only clusters with few interspike intervals less than 1 ms were considered for further examination. Furthermore, we excluded the possibility of including data from the same neuron twice by ensuring that both the waveforms and response properties sufficiently changed across sessions. If they did not, we conservatively assumed that we were recording from the same neuron, and only included data from one session.

Lesioning and histology

To verify final tetrode location we performed electrolytic lesions (100 μA, ~1.5 min per lead) after the last recording session. One day following lesion, mice were overdosed with an intraperitoneal injection of sodium pentobarbital (100 mg/kg) and transcardially perfused with saline followed by ice-cold 4% paraformaldehyde (PFA) in 0.1 M phosphate buffer (PB). After perfusion, brains were submerged in 4% PFA in 0.1 M PB for 24 hr for post-fixation and then cryoprotected for 24 hr by immersion in 30% sucrose in 0.1 M PB. The brain was encased in the same sucrose solution, and frozen rapidly on dry ice. Serial coronal sections (60 μm) were cut on a sliding microtome for reconstruction of the lesion site and tetrode tracks. Fluorescent Nissl (NeuroTrace, Invitrogen) was used to identify cytoarchitectural features of the SNr and verify tetrode tracks and lesion damage within or below the SNr. Images of SNr (see Figure 2A) were captured with a 10x objective lens, using an LSM 5 Pascal series Axioskop 2 FS MOT confocal microscope (Zeiss).

Analyses and statistics

All analyses were performed in MATLAB.

Direction preference

To quantify the selectivity of single neurons for movement direction, we used an ROC-based analysis (Green and Swets, 1966). This analysis calculates the ability of an ideal observer to classify whether a given spike rate was recorded in one of two conditions (here, preceding leftward or rightward movement). We defined 'preference' as 2(ROCarea − 0.5), a measure ranging from −1 to 1, where −1 signifies the strongest possible preference for left, 1 signifies the strongest possible preference for right, and 0 signifies no preference (Feierstein et al., 2006). Statistical significance was determined with a permutation test: we recalculated the preference after randomly reassigning all firing rates to either of the two groups arbitrarily, repeating this procedure 500 times to obtain a distribution of values, and calculated the fraction of random values exceeding the actual value. We tested for significance at α = 0.05. Trials in which the movement time (between odor port exit and reward port entry) was > 1.5 s were excluded from all analyses. Neurons with fewer than 100 trials of each type (SG and IS) or with a firing rate below 2.5 spikes/s for either trial type or across the entire session (Fc, described below), were excluded from all analyses.
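The preference measure and its permutation test can be sketched as follows. This is a minimal illustration rather than the authors' MATLAB code: the function names, the tie handling, and the use of a two-sided comparison on the magnitude of the shuffled values are assumptions made here, and `n_perm=500` simply mirrors the number of repetitions stated above.

```python
import numpy as np

def roc_area(left_rates, right_rates):
    """Area under the ROC curve: probability that a rate from a rightward
    trial exceeds a rate from a leftward trial (ties count as 0.5)."""
    left = np.asarray(left_rates, dtype=float)
    right = np.asarray(right_rates, dtype=float)
    diff = right[:, None] - left[None, :]  # all right-vs-left pairs
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def preference(left_rates, right_rates):
    """2 * (ROC area - 0.5): -1 = strongest left, 1 = strongest right."""
    return 2.0 * (roc_area(left_rates, right_rates) - 0.5)

def permutation_p_value(left_rates, right_rates, n_perm=500, seed=0):
    """Shuffle firing rates between the two groups and return the observed
    preference with the fraction of shuffles at least as extreme
    (two-sided on magnitude; this choice is an assumption)."""
    rng = np.random.default_rng(seed)
    observed = preference(left_rates, right_rates)
    pooled = np.concatenate([left_rates, right_rates]).astype(float)
    n_left = len(left_rates)
    null = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rng.permutation(pooled)
        null[i] = preference(shuffled[:n_left], shuffled[n_left:])
    return observed, float(np.mean(np.abs(null) >= abs(observed)))
```

A neuron would then be called direction-selective when the returned fraction falls below α = 0.05.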

Sign of activity change during delay epoch

We calculated the normalized response (NR) for each neuron as NR = Ft/Fc where Ft is the mean firing rate in the 'test' window (delay epoch) and Fc is the mean firing rate in the 'control' window (Sato and Hikosaka, 2002) across all trials in the preferred direction of the neuron (or, for neurons with no direction preference, across all trials). Since the structure of our task does not include a natural 'control' epoch – i.e., in which the animal is in a motionless state unaffected by task demands – our control window was defined as the time of odor port entry to reward port exit (i.e., the duration of the trial). Neurons with NR < 1 were defined as decreasing and neurons with NR > 1 were defined as increasing (Table 1). Note that, by convention, a decreasing neuron that decreases more for contraversive than ipsiversive movement would be considered to have an ipsiversive direction preference (as calculated above), because firing rate is higher for ipsiversive movement (cf. Sato and Hikosaka, 2002).

Reinforcement learning model

In order to estimate the value associated with each direction of movement, we iteratively updated the value of each direction in each trial as $V_{\mathrm{dir},t} = V_{\mathrm{dir},t-1} + \alpha\,(R_{\mathrm{dir},t-1} - V_{\mathrm{dir},t-1})$, where $R_{\mathrm{dir},t-1}$ is the reward for the given direction in the previous trial in which that direction was chosen [0 for unrewarded (which includes trials in which the correct choice was made but the odor port was exited before the go signal) and 1 for rewarded] and $\alpha$ is the learning rate (we set $\alpha = 0.1$; values of 0.03 and 0.3 did not affect the results). The value of each direction was updated independently. This estimate is based on the Q-learning algorithm (Sutton and Barto, 1998; note that we excluded a term for maximum future value because this was independent of the choice on the current trial). $V_{\mathrm{dir},t}$ therefore ranged from 0 to 1. Since we calculated $V_{\mathrm{dir},t}$ in each trial of the session (including SG and IS trials), it tended to start near 0.5 (but was not exactly 0.5) at the beginning of each IS block. In well-behaved IS blocks, $V_{\mathrm{dir},t}$ tended to asymptotically approach 1 as the mouse consistently returned to the rewarded port. $V_{\mathrm{dir},t}$ was calculated separately for the ipsiversive and contraversive directions and was only updated in trials in which that direction was selected.
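As a rough sketch of this update rule, the value of the chosen direction moves a fraction α of the way toward the reward received, while the unchosen direction is left unchanged. The function name, the direction labels, and the starting value `v0=0.5` are assumptions made for illustration; the text does not state how the values were initialized.

```python
def update_values(choices, rewards, alpha=0.1, v0=0.5):
    """Track a separate value for each movement direction.

    choices: per-trial "ipsi", "contra", or None (no choice)
    rewards: per-trial 1 (rewarded) or 0 (unrewarded)
    v0 is a hypothetical starting value, not taken from the paper.
    """
    values = {"ipsi": v0, "contra": v0}
    history = []
    for choice, reward in zip(choices, rewards):
        # Values in effect before this trial's outcome is known
        history.append(dict(values))
        if choice in values:
            # Only the chosen direction is updated
            values[choice] += alpha * (reward - values[choice])
    return history, values
```

For example, two rewarded ipsiversive choices followed by one unrewarded contraversive choice move the ipsiversive value from 0.5 to 0.595 and the contraversive value from 0.5 to 0.45.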

Regression model

To assess the influence of several factors on SNr activity (Figure 6), we fit the electrophysiological data with a multi-variable linear regression model of the form

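$$FR = \beta_0 + \beta_{\text{Previous choice}}\,\chi_{\text{Previous choice}} + \beta_{\text{Current choice}}\,\chi_{\text{Current choice}} + \beta_{\text{Trial type}}\,\chi_{\text{Trial type}} + \beta_{\text{Reaction time}}\,\chi_{\text{Reaction time}},$$

where FR is the firing rate during the delay epoch,

$$\chi_{\text{Previous choice}} = \begin{cases} -1 & \text{for an ipsiversive choice} \\ 0 & \text{for no choice} \\ 1 & \text{for a contraversive choice} \end{cases} \quad \text{in the previous trial,}$$

$$\chi_{\text{Current choice}} = \begin{cases} -1 & \text{for an ipsiversive choice} \\ 0 & \text{for no choice} \\ 1 & \text{for a contraversive choice} \end{cases} \quad \text{in the current trial,}$$

$$\chi_{\text{Trial type}} = \begin{cases} -1 & \text{for an IS trial} \\ 1 & \text{for an SG trial,} \end{cases}$$

$$\chi_{\text{Reaction time}} = \text{time from go cue to reward port entry (normalized between 0 and 1),}$$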

β0 represents the mean firing rate across trials during the delay epoch, and βPrevious choice, βCurrent choice, βTrial type, and βReaction time represent the influence on firing rate of the previous choice, the current choice, the trial type, and the reaction time corresponding to that trial. Positive values for βPrevious choice and βCurrent choice indicate that firing rate is increased by contraversive choices and negative values indicate that firing rate is increased by ipsiversive choices. Positive values for βTrial type indicate that firing rate is increased by stimulus-guided trials and negative values indicate that firing rate is increased by internally-specified trials. The sign of βReaction time indicates the sign of the correlation between reaction time and firing rate. We used the MATLAB function fitlm to estimate the βs and calculate their significance and confidence intervals. We also performed this same regression analysis with two additional terms, for value [estimated in IS trials with the reinforcement learning model (see above) and in SG trials as the average performance by mixture ratio within the block (Figure 1D)], and the interaction between value and trial type. We found that, while the firing rate of some neurons was influenced by these additional factors, as expected given the value-biasing view of BG function (Hikosaka et al., 2006), including them – or including the log of the value – did not affect the overall results shown in Figure 6A. To examine how firing rate throughout the trial depended on these factors (Figure 6B,C), we repeated the above analysis with respect to firing rate in overlapping 100 ms bins, shifted by 10 ms, aligned to three behavioral events: odor valve open, odor port exit and reward port entry.
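The structure of this regression can be sketched with an ordinary least-squares fit. The analysis described above used the MATLAB function fitlm, which also returns significance and confidence intervals; the NumPy version below is a substitute that recovers only the coefficients. The function and key names are hypothetical, and the min-max scaling of reaction time is an assumption about how the normalization was done.

```python
import numpy as np

def fit_firing_rate(fr, prev_choice, cur_choice, trial_type, reaction_time):
    """Least-squares estimate of the regression coefficients.

    prev_choice, cur_choice: -1 ipsiversive, 0 no choice, 1 contraversive
    trial_type: -1 for IS trials, 1 for SG trials
    reaction_time: raw times, rescaled here to the range 0..1
    """
    rt = np.asarray(reaction_time, dtype=float)
    span = rt.max() - rt.min()
    rt_norm = (rt - rt.min()) / span if span > 0 else np.zeros_like(rt)
    # Design matrix: intercept plus one column per factor
    X = np.column_stack(
        [np.ones(len(rt)), prev_choice, cur_choice, trial_type, rt_norm]
    )
    betas, *_ = np.linalg.lstsq(X, np.asarray(fr, dtype=float), rcond=None)
    names = ["intercept", "previous_choice", "current_choice",
             "trial_type", "reaction_time"]
    return dict(zip(names, betas))
```

For the time-resolved analysis, the same fit would be repeated with the firing rate taken from each overlapping 100 ms bin in turn.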

References

  4. Neuronal activity in substantia nigra pars reticulata during target selection
    1. MA Basso
    2. RH Wurtz
    (2002)
    Journal of Neuroscience 22:1883–1894.
  5. A comparison of the intranigral distribution of nigrotectal neurons labeled with horseradish peroxidase in the monkey, cat, and rat
    1. RM Beckstead
    2. SB Edwards
    3. A Frankfurter
    (1981)
    Journal of Neuroscience 1:121–125.
  12. Paradoxical kinesia in Parkinson's disease revisited: anticipation of temporal constraints is critical
    1. M Distler
    2. JC Schlachetzki
    3. Z Kohl
    4. J Winkler
    5. T Schenk
    (2016)
    Neuropsychologia, 86, 10.1016/j.neuropsychologia.2016.04.012.
  16. Is parkinsonian gait caused by a regression to an immature walking pattern?
    1. H Forssberg
    2. B Johnels
    3. G Steg
    (1984)
    Advances in Neurology 40:375–379.
  21. Signal Detection Theory and Psychophysics
    1. DM Green
    2. JA Swets
    (1966)
    New York, USA: John Wiley & Sons, Inc.
  25. Quantitative analysis of substantia nigra pars reticulata activity during a visually guided saccade task
    1. A Handel
    2. PW Glimcher
    (1999)
    Journal of Neurophysiology 82:3458–3475.
  26. Contextual modulation of substantia nigra pars reticulata neurons
    1. A Handel
    2. PW Glimcher
    (2000)
    Journal of Neurophysiology 83:3042–3048.
  28. Visual and oculomotor functions of monkey substantia nigra pars reticulata. III. Memory-contingent visual and saccade responses
    1. O Hikosaka
    2. RH Wurtz
    (1983a)
    Journal of Neurophysiology 49:1268–1284.
  29. Visual and oculomotor functions of monkey substantia nigra pars reticulata. IV. Relation of substantia nigra to superior colliculus
    1. O Hikosaka
    2. RH Wurtz
    (1983b)
    Journal of Neurophysiology 49:1285–1301.
  30. Modification of saccadic eye movements by GABA-related substances. II. Effects of muscimol in monkey substantia nigra pars reticulata
    1. O Hikosaka
    2. RH Wurtz
    (1985)
    Journal of Neurophysiology 53:292–308.
  31. The basal ganglia
    1. O Hikosaka
    2. RH Wurtz
    (1989)
    Reviews of Oculomotor Research 3:257–281.
  32. Target selection for saccadic eye movements: prelude activity in the superior colliculus during a direction-discrimination task
    1. GD Horwitz
    2. WT Newsome
    (2001)
    Journal of Neurophysiology 86:2543–2558.
  44. The Mouse Brain in Stereotaxic Coordinates (2nd edn)
    1. G Paxinos
    2. KB Franklin
    (2004)
    Amsterdam: Elsevier Academic.
  46. Role of primate substantia nigra pars reticulata in reward-oriented saccadic eye movement
    1. M Sato
    2. O Hikosaka
    (2002)
    Journal of Neuroscience 22:2363–2373.
  52. Reinforcement Learning: An Introduction
    1. RS Sutton
    2. AG Barto
    (1998)
    MIT Press.

Decision letter

  1. Rui M Costa
    Reviewing Editor; Fundação Champalimaud, Portugal

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your work entitled "Basal ganglia output reflects internally-specified movements" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Timothy Behrens as the Senior Editor.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

The reviewers and reviewing editor agree that the study clearly presents an interesting and well done study, where the authors show that activity in SNr, the main basal ganglia output nucleus in rodents, signals preferentially internally-generated movements rather than stimulus-driven movements. The results are consistent with the clinical findings in basal ganglia disorders (self-paced movements are more affected than stimulus-driven ones), but are still novel in the sense that they show that SNr activity precedes many times internally generated movements.

However, there are several issues to be addressed before the manuscript can be considered further. Please find attached below the original comments and also a summary of the most important issues to take into account.

1) The reviewers suggest new analyses to quantify the activity of single SNr neurons in the different conditions. They specifically suggest that the authors analyse the activity of individual neurons during ipsi versus contralateral movements.

2) The reviewers ask for new analyses to demonstrate that activity preferentially encodes internally-generated movements versus stimulus-guided throughout different epochs of the movement. In the current version of the study, the activity is measured at constant times in the trial which, given the fact that reaction times are different, may mean different epochs in relation to movement are being compared.

3) The reviewers also comment that SNr-SC connections are largely ipsilateral in different species. The data presented almost suggests that the contralateral connection is equally common. How do the authors explain this?

4) The reviewers propose other ways of quantifying and analysing/fitting the data; namely using a simple RL model.

5) The flow of the manuscript tries to set up an opposition between value-based action selection or movement biasing, versus internally generated action selection or movement biasing. However, the study does not address that, it rather addresses the comparison between similarly valued stimulus-driven versus internally-guided movements (in other formulations feedforward versus feedback driven, sensorimotor versus motor sensory). Therefore, the study does not address that expected value does not influence the activity of SNr during the execution of similar internally-guided movements. It does compare the activity of similarly valued movements, internally-guided are more represented than stimulus-driven. The study is not contrary to the value-biasing view of basal ganglia function, it just shows that basal ganglia cares more about self-paced than stimulus-driven movements (even for the same value). Therefore, the Abstract, Introduction, Discussion, and flow should focus on the difference between stimulus-driven versus internally-guided movements.

Specific Comments:

Reviewer #1:

6) This study provides data suggesting that the basal ganglia contribute to the internally triggered motor behavior. It has been known that patients with basal ganglia dysfunction may have difficulty in initiating movements internally. For example, patients with Parkinson's disease often have difficulty in initiating movements (e.g., walking) if there is no sensory guidance (Glickstein & Stein, Trends Neurosci 1991) and patients with bilateral pallidal lesions may not initiate goal-directed behavior spontaneously (Laplane et al., J Neurol Neurosurg Psychiatry 1984). However, the underlying mechanism is still unclear. In this sense, this study is important.

A difficulty of this theme is how to guide the subject to initiate movements internally. Without any sensory guidance, motor behavior is likely uncontrollable and, if so, behavioral as well as neuronal data may not be feasible for statistical analysis. In such a difficult context, the authors devised a simple but controllable procedure for mice. Thus, this manuscript is valuable.

With my high expectation, I now have many suggestions and comments, as shown below.

It is important to discuss the results of this study in relation to the well-known clinical symptoms described above.

7) The authors analyzed the preferred and antipreferred directions separately throughout the description of the results, which makes the interpretation very complex and difficult. In the end, to interpret the data as a whole the authors focused on the difference in neuronal activity between the contralateral and ipsilateral choices (Figure 7). Aiming at this goal, it would be much better to use this measure (i.e., Ipsi-Contra) from the beginning. Below are my suggestions.

8) Figure 3A: Rearrange graphs in a 2x2 format: SG (left) vs. IS (right), Ipsi (top) vs. Contra (bottom). The Ipsi-Contra difference would become more evident in IS, but not SG.

9) Figure 3D: In the current version, it is unclear how individual neurons behaved. My suggestion is to plot the Ipsi-Contra difference for each neuron. The distribution of data points would be similar to the current graph. I would then analyze data statistically, including populational trend (e.g., Ipsi-Contra difference is larger in IS than SG) and classification of individual neurons (e.g., IS-preferring, SG-preferring).

10) Figure 4: In the current version, 4A has the same issue as Figure 1D, and 4B is almost meaningless. I would use the Ipsi-Contra difference for 4A for each neuron. Then, show the across-trial changes (perhaps, using a regression line) for a number of neurons; in this case, it may be useful to use the absolute value of the Ipsi-Contra difference, because it would increase across trials. My other suggestion is to plot the Ipsi-Contra difference across time within a trial; the value would increase more for IS than SG (as in Figure 3B).

11) Figure 5: The current version includes multiple comparisons, although the graphs do not visually show the results of the comparisons. My suggestion is to present simple histograms showing the absolute values of the Ipsi-Contra difference, separately for IS, SG (easy), and SG (difficult).

12) Figure 6: This figure is too complicated and does not add any support for the conclusion. Moreover, the data include both SG and IS, in which I find no benefit. The previous choice should have a larger effect in IS than SG. The current choice should start earlier for IS than SG. My suggestion is to do the same analysis separately for SG and IS, although it may provoke more issues.

13) I have to add one issue to my own suggestion – focus on the Ipsi-Contra difference. The feature that is orthogonal to this measure is the average magnitude of the Ipsi & Contra activity. If this is high, SC on both sides would be inhibited and the action (e.g., orienting) would be suppressed in general. If low, the action would be enhanced in general. The authors need to consider if this feature contributes to any difference in animals' choice behavior in general.

Below, I add my comments (rather than suggestions), which can be critical.

14) Activity of SNr neurons became differential (ipsi vs. contra) earlier in IS trials (even before odor delivery) than in SG trials (basically after odor delivery), as exemplified in Figure 3B. Now, the magnitude of neuronal activity was measured during the delay epoch that started with odor delivery. This raises the possibility that the differential activity was higher in IS trials than SG trials simply because the differential activity started earlier in IS trials. Please answer this question.

15) The different connections shown in Figure 7 may explain the functions of ipsiversive and contraversive SNr neurons, but this is a hopeful scheme with no evidence or hints. Virtually all anatomical studies have shown that the majority of the SNr-SC connections are ipsilateral in various mammals. Contralateral SNr-SC connections are very few in rats (Beckstead et al., J Neurosci 1981), which may also be true in mice. The authors found that contraversive SNr neurons (thus, projecting to the contralateral SC) are slightly more common than ipsiversive SNr neurons (thus, projecting to the ipsilateral SC). This raises a concern about the scheme in Figure 7.

Other comments and questions:

16) Results: "if value is only one of a number of internal representations contributing to movement selection, then these movements would differentially engage the BG."

What does this mean?

17) Results: "We recorded from 298 well isolated left SNr neurons"

Why did you record SNr neurons only on the left side in 4 mice?

18) Results: "while we found that neurons were equally likely to prefer upcoming ipsiversive and contraversive movements (Figure 3C), they were more likely to exhibit a higher firing rate preceding IS movements in their preferred direction and preceding SG movements in their antipreferred direction (Figure 3D)."

It is unclear what this sentence means.

19) Results: "We observed no difference in activity between SG trials in which different mixtures were presented but movement direction was held constant"

Please show statistical evidence.

20) Results: "The difficulty of this decision could, in principle, affect SNr activity." Another possibility is 'uncertainty', rather than difficulty.

21) Results: "there was little overlap between this small population of difficulty-dependent neurons and those neurons that exhibited a dependence on trial type (Figure 5A, B)". This notion is unclear simply by looking at these graphs.

Reviewer #2:

The authors present results from a study in which they recorded neural activity in the SNr while mice carried out a task in which they chose left or right reward ports either on the basis of an odor mixture (stimulus guided) or on the basis of previously rewarded outcomes (internally-guided). They examined neural activity, mostly during the choice period and found that SNr neurons coded ipsiversive and contraversive movements with about equal frequency. Neurons responded more strongly for internally-guided movements in their preferred direction, and more strongly for stimulus guided movements in their anti-preferred direction. They also correlated with learning in the IS blocks.

This study addresses an interesting question about the role of the basal ganglia, assessed from the perspective of an output nucleus, in internally generated vs. externally cued movements. Overall the study was well carried out. The dataset is reasonable and the results are clearly presented. The authors make a strong claim about matching the values of the movements. However, the analytical approach to matching them could be more detailed. Currently it is a bit qualitative. Specific suggestions follow.

22) The analysis of the effects of learning on firing rates shows a reasonable correlation. But more sophisticated methods exist for examining such relationships. Specifically, it would be better to fit a simple reinforcement learning model to the data, and correlate value with neural activity. The analysis that correlates number of consecutive correct trials is a bit harder to interpret, although it may be getting at the same point.

23) The RL model would also allow you to more rigorously examine similarities between the reward value representations in the SG and IS blocks. At this point they are only being qualitatively compared.

24) For the analysis of the effects of difficulty in the SG blocks on neural activity, why not just run an ANOVA with difficulty level as a factor? Comparing easy vs. difficult levels is approximately doing this, but since the task design used fixed levels, why not just use them all in the analysis? This could be done in an ANOVA with odor mixture as a factor as well.

25) Also, if an RL model is fit to the learning data, this can be used to estimate accuracy as a function of trials in IS blocks. Accuracy can also be extracted from the psychometric curves in the SG blocks. Accuracy could then be used as a factor in an ANOVA analysis of neural activity. How many neurons show a main effect of accuracy? How many show an accuracy by task interaction? Log accuracy should also be analyzed, as this is often more correlated with neural activity in choice tasks than accuracy.

26) A similar study has been carried out in macaques, comparing lateral prefrontal and the caudate nucleus, Seo et al., Neuron, 2012. Which areas do the authors think would be more strongly involved in the SG component of the task?

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Basal ganglia output reflects internally-specified movements" for further consideration at eLife. Your revised article has been favorably evaluated by Timothy Behrens as the Senior editor, a Reviewing editor, and two reviewers.

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below:

1) One of the reviewers still feels strongly that the main point of analyzing ipsi/contra versus preferred/non-preferred direction was not understood. So the reviewer now provides a more detailed explanation of what the point was. We ask the authors to please submit new analyses and details regarding this point (where possible).

2) In addition, it would be good to fix some of the small points raised.

3) Finally, the authors still insist that the data presented are not consistent with a value-biasing view of basal ganglia function. The editors and reviewers felt that the data and results presented focus on the difference between stimulus-driven versus internally-guided movements, which is orthogonal to value biasing (and does not formally exclude value biasing). Therefore, we urge the authors to further revise the manuscript regarding this point.

Reviewer #1:

4) As I mentioned before, I am not arguing that the authors' data are wrong. I have been trying to find a more straightforward way to present the authors' data. My main concern is that the current version may not transmit the basic and essential part of the authors' message to the readers. For this reason, I still have a strong suggestion to use the neuronal activity difference between ipsilateral and contralateral choices, for the following reasons.

Let's start from Figure 3B. Does this neuron contribute to the choice? To answer this question, I would set this neuron in SNr on both sides. If the animal decides to choose Rightward movement, the neuron in Right SNr (i.e., Ipsi) is more active than the neuron in Left SNr (i.e., Contra). This would lead to the bias toward Rightward movement (Figure 7). Importantly, the choice bias is stronger in IS than SG (Figure 3B). Therefore, the neuron does contribute to the choice using the difference in its activity between Ipsi and Contra choices, and does so more strongly in IS than SG condition.

Checking the neuron's activity for only one choice (i.e., preferred or anti-preferred) would not indicate whether the neuron can contribute to the choice. That's why the data points shown in Figures 3 and 4 provide no clear message. Two data points are associated with one neuron. By looking at these two points in Figure 3D, we can guess if the neuron can contribute to the choice; the slope between the two data points indicates the ratio of the directional bias (IS-bias/SG-bias). But, I bet most readers would not reach such detailed understanding. Moreover, the pair of data points is shown for only one example neuron. How about the others?

I understand the authors' logic. The average slope of all data points (regression line) in Figure 3D is higher than 1. Therefore, IS-bias is larger than SG-bias overall. But, the basic information – choice signal carried by individual neurons – is missing. Let me ask a basic question. How many SNr neurons were capable of contributing to the choice? Would the data in Figure 4 answer the question? I doubt it. For example, there are neurons that showed significant correlations with both preferred and anti-preferred directions. Some of them had the same polarities; they would increase (or decrease) their activity during both Ipsi and Contra IS blocks. Such neurons are probably not capable of contributing to the correct choice.

How can we quantify the choice capability of each neuron, then? My only suggestion is to use the difference between Ipsi and Contra directional choices. This exactly matches the model shown in Figure 7.

Other than my critical comments, I do find some of the authors' data interesting. For example, I find two dominant groups of neurons in Figure 4B: 1) neurons showing positive correlations with their preferred directions, and 2) neurons showing negative correlations with their anti-preferred directions. This indicates that both groups of neurons enhanced their choice signals, but in different ways: 1) enhancing the dominant signal, and 2) suppressing the non-dominant signal. This reminds me of a basic mechanism of motor control: To change the orientation of a joint, the agonist muscle contracts or the antagonist muscle relaxes, or both contract (or relax) in different amounts. These mechanisms may be useful in different contexts. In this sense, the authors' data may suggest that different groups of SNr neurons contribute to such different mechanisms of joint orientation, in this case, head or eye orientation.

However, this question is one step beyond the main question: Which neurons can contribute to the choice of orientation? To address this main question, the authors should focus on the difference in each neuron's activity between the Ipsi- and Contra-directions, which is similar to the difference in contraction force between agonist and antagonist muscles.

I have some specific questions (below).

5) Data in Figure 3D are different from the ones in the original manuscript. In particular, the two data points for the example neuron are very different. Please give me an explanation.

6) How was the preferred direction defined? I understand that the authors used ROC analysis. Were both SG and IS trials included? Weren't there any neurons that showed a bit different preference between SG and IS trials? For example, the dark blue data point at the far right in Figure 3D. This means that the neuron responded strongly in the anti-preferred direction in SG trials. In other words, it responded more strongly in the preferred direction. Which of the dark red data points belongs to this neuron? In any case, this may raise another issue of preferred vs. anti-preferred directions.

Reviewer #2:

7) I have only one brief comment. Otherwise the authors have carefully addressed all of my concerns in detail.

Figure 4A doesn't show value estimates evolving over trials as stated in reply to reviewers and Methods. "In well-behaved IS blocks, Vdir,t tended to asymptotically approach 1 as the mouse consistently returned to the rewarded port (Figure 4A)." Was this erroneously not included or with this statement did you just mean you were using the RL algorithm to generate the value estimates and these value estimates were used in Figure 4? It's not clear.

https://doi.org/10.7554/eLife.13833.016

Author response

There are several issues to be addressed before the manuscript can be considered further. Please find attached below the original comments and also a summary of the most important issues to take into account.

1) The reviewers suggest new analyses to quantify the activity of single SNr neurons in the different conditions. They specifically suggest that the authors analyse the activity of individual neurons during ipsi versus contralateral movements.

We have performed several new analyses to quantify the activity of single SNr neurons:

A) As described in our response to Comment #7, we have reanalyzed neural activity with respect to ipsiversive vs. contraversive movements wherever possible (Revised Figure 5; subsection “Modulation of SNr activity by task-relevant variables”, first paragraph). However, we believe that it remains necessary to segregate trials with respect to the preferred vs. antipreferred direction of the neuron in some cases (Revised Figures 3D and 4; subsections “SNr activity differs for stimulus-guided and internally-specified movements”, second and last paragraphs), as described in our response to Comment #7.

B) As described in our responses to Comments #22 and 25, we have used a reinforcement learning model to provide an estimate of movement value, and calculated how firing rate depends on this estimate over the course of a block of internally-specified (IS) trials (Revised Figure 4, subsection “SNr activity differs for stimulus-guided and internally-specified movements”, last paragraph and subsection “Sign of activity change during delay epoch”).

C) As described in our responses to Comments #11 and 24, we have calculated how neural activity depends on the difficulty of the odor discrimination – or an associated variable, such as uncertainty, or the estimated value of each movement direction – with respect to each mixture ratio separately (Revised Figure 5, subsection “Modulation of SNr activity by task-relevant variables”, first paragraph).

2) The reviewers ask for new analyses to demonstrate that activity preferentially encodes internally-generated versus stimulus-guided movements throughout different epochs of the movement. In the current version of the study, the activity is measured at constant times in the trial which, given the fact that reaction times are different, may mean different epochs in relation to movement are being compared.

We believe that this comment is related to Comment #14; please also see our response to it below. However, this comment also raises distinct points, which we address here. We first clarify two points: First, our primary focus is on the delay epoch (from odor valve open to odor port exit; Figure 1A), which precedes the movement, rather than epochs during the movement itself. Second, we defined reaction time as the time from go signal to reward port entry. (While it would also be sensible to define reaction time as the time from go signal to odor port exit, we find that exiting the odor port and moving to the reward port are more naturally considered to be part of the same process, so we consider them together.) Since reaction time was generally longer in stimulus-guided (SG) than IS trials (Figure 1F), the end of the movement (defined as reward port entry) generally occurred later in SG than IS trials. However, given that our delay epoch ended at the beginning of the movement (odor port exit), and not at the end of the movement, it has the same relationship with movement in both SG and IS trials: it is the period of time immediately preceding movement. We note that this would not be true of other, otherwise reasonable, definitions of the “delay epoch,” e.g., from odor valve open to the go signal; in fact, we have analyzed this epoch as well and found while preparing this resubmission that we had originally reported some results from it. We now report only results from the epoch as defined in Figure 1A, resulting in some slightly different numbers in the resubmission, none of which change any of our overall results.

However, we believe that the primary concern raised here, as well as in Comment #14, is that the direction of movement can be selected earlier in IS than SG trials, and that examining activity in the same fixed epoch (the delay epoch: odor valve open to odor port exit) therefore does not provide an appropriate comparison. Instead, perhaps we should consider defining our epoch as having a fixed length but starting relative to when the direction of movement can first be selected; for example, the epoch could begin with odor port entry for IS trials and odor valve open for SG trials (note that, if we simply prepend the period between odor port entry and odor valve open to the delay epoch for IS trials only, we would see an even larger difference between IS and SG trials than we show in Figure 3D). We have considered this possibility (among several others), but it would be problematic because many neurons exhibit activity late in our current delay epoch in IS trials, which is likely related to movement selection but would be lost if the epoch ended earlier. Ultimately, we do not believe that there exists a better definition of the relevant epoch to our study than our delay epoch: We are most interested in the activity immediately preceding movement that it captures, because it is at this time that, via modulation of activity in downstream structures, specific movements are facilitated or inhibited. We have clarified this reasoning in the Results section when we first introduce the delay epoch (subsection “SNr activity differs for stimulus-guided and internally-specified movements”, first paragraph), and discuss the relationship between activity preceding stimulus delivery and activity during the delay epoch in the Discussion (second paragraph; also see response to Comment #14).

Finally, we suggest that, rather than being problematic, the fact that activity may start earlier in IS trials captures an important difference between internally-specified and stimulus-guided movements in the real world: the former may be selected earlier than the latter (Introduction, first paragraph). Further, the idea that direction-related activity in IS trials precedes stimulus delivery is consistent with the value-biasing view of basal ganglia (BG) function (Hikosaka et al., 2006), in which reward modulates activity prior to stimulus delivery, resulting in more valuable movements being facilitated relative to less valuable movements (Introduction, first paragraph, Discussion, third paragraph).

3) The reviewers also comment that SNr-SC connections are largely ipsilateral in different species. The data presented almost suggests that the contralateral connection is equally common. How do the authors explain this?

As described in our response to Comment #15, we agree that our idea that contraversive-preferring SNr neurons comprise the crossed nigrotectal projection is speculative and remains to be tested – particularly given the fact that about as many SNr neurons exhibit contraversive as ipsiversive preference – and therefore distracts from our main point about how SNr output may promote internally-specified movements. In the revised manuscript, we have therefore removed nearly all of our discussion about the crossed projection, and we have simplified Revised Figure 7 to show only the (more numerous, and more commonly studied) uncrossed nigrotectal neurons in relation to stimulus-guided and internally-specified movements. In addition, we explicitly state that, while contraversive-preferring SNr neurons may project to the contralateral SC, they may also project to non-tectal targets (e.g., thalamus) (Discussion, fifth paragraph).

4) The reviewers propose other ways of quantifying and analysing/fitting the data; namely using a simple RL model.

As described in our responses to Comments #22 and 25, in the revised manuscript we use a reinforcement learning model to provide an estimate of movement value (subsection “Sign of activity change during delay epoch”). We then use this estimate to calculate how firing rate evolves over the course of a block of IS trials (Revised Figure 4, subsection “SNr activity differs for stimulus-guided and internally-specified movements”, last paragraph) as well as how it influences firing rate among other factors (subsection “Regression model”).
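To make the idea of a value estimate concrete, here is a minimal sketch of a delta-rule (Rescorla-Wagner style) update for the two movement directions. It is an illustration only and is not taken from the manuscript: the update rule, the learning rate `alpha`, the starting value `v0`, the direction labels, and the function names `update_values` and `pearson_r` are all assumptions. The toy block at the end is invented solely to exercise the code.

```python
import math

def update_values(choices, rewards, alpha=0.2, v0=0.5):
    """Track a value estimate per direction with a delta rule (assumed form).

    choices: sequence of "L" or "R"; rewards: sequence of 0 or 1.
    Returns one dict of values per trial, recorded after that trial's update.
    """
    values = {"L": v0, "R": v0}
    history = []
    for choice, reward in zip(choices, rewards):
        # Move the chosen direction's value toward the obtained reward.
        values[choice] += alpha * (reward - values[choice])
        history.append(dict(values))
    return history

def pearson_r(xs, ys):
    """Pearson correlation, e.g. between value estimates and firing rates."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy block (made up): the animal returns to a rewarded right port 30 times.
history = update_values(["R"] * 30, [1] * 30)
v_right = [h["R"] for h in history]
```

With consistent rewarded choices of one port, the value for that direction rises monotonically toward 1, which is the qualitative behavior a value estimate of this kind would need in order to be correlated with firing rate across an IS block.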

5) The flow of the manuscript tries to set up an opposition between value-based action selection or movement biasing, versus internally generated action selection or movement biasing. However, the study does not address that; it rather addresses the comparison between similarly valued stimulus-driven versus internally-guided movements (in other formulations feedforward versus feedback driven, sensorimotor versus motor sensory). Therefore, the study does not address that expected value does not influence the activity of SNr during the execution of similar internally-guided movements. It does compare the activity of similarly valued movements: internally-guided movements are more represented than stimulus-driven ones. The study is not contrary to the value-biasing view of basal ganglia function; it just shows that the basal ganglia care more about self-paced than stimulus-driven movements (even for the same value). Therefore, the Abstract, Introduction, Discussion, and flow should focus on the difference between stimulus-driven versus internally-guided movements.

We agree that our findings do not contradict the value-biasing view of BG function, and that in order to do so we would have had to vary value across otherwise-identical IS movements and show that SNr activity did not change. Our intent, rather, was to examine whether movements specified by internal representations in general (of which value is one) modulate BG activity. In this sense, our results “extend the model underlying the value-biasing view of BG function (Hikosaka et al., 2006) by suggesting that the influence exerted by the SNr on downstream motor regions is modulated by internal representations beyond value alone” (Discussion). We recognize that this point was unclear in the original submission and have therefore rewritten several passages throughout the revised manuscript:

Revised Abstract: We changed “We found that, contrary to the value-biasing view of basal ganglia function, activity in the substantia nigra pars reticulata, a basal ganglia output, predictably differed preceding internally-specified and stimulus-guided movements” to “We found that activity in the substantia nigra pars reticulata, a basal ganglia output, predictably differed preceding internally-specified and stimulus-guided movements, which is not accounted for by the value-biasing view of basal ganglia function”.

Revised Introduction: We now state, “We therefore asked whether BG activity mediates the influence of value specifically or, more generally, of internal goals on movement selection”; and “However, if the BG are not limited to mediating the influence of value but instead mediate the influence of internal goals in general, their output would differ under these two conditions”.

Revised Results: We changed “However, if value is only one of a number of internal representations contributing to movement selection, then these movements would differentially engage the BG” to “However, if other internal representations beyond value are also integrated by the BG, then stimulus-guided and internally-specified movements may differentially engage the BG despite being equally valuable”.

Revised Discussion: We now state, “We propose that a similar model can also explain how internally-specified movements, more generally, are facilitated”.

Specific Comments:

Reviewer #1:

6) This study provides data suggesting that the basal ganglia contribute to the internally triggered motor behavior. It has been known that patients with basal ganglia dysfunction may have difficulty in initiating movements internally. For example, patients with Parkinson's disease often have difficulty in initiating movements (e.g., walking) if there is no sensory guidance (Glickstein & Stein, Trends Neurosci 1991) and patients with bilateral pallidal lesions may not initiate goal-directed behavior spontaneously (Laplane et al., J Neurol Neurosurg Psychiatry 1984). However, the underlying mechanism is still unclear. In this sense, this study is important.

A difficulty of this theme is how to guide the subject to initiate movements internally. Without any sensory guidance, motor behavior is likely uncontrollable and, if so, behavioral as well as neuronal data may not be feasible for statistical analysis. In such a difficult context, the authors devised a simple but controllable procedure for mice. Thus, this manuscript is valuable.

With my high expectation, I now have many suggestions and comments, as shown below.

It is important to discuss the results of this study in relation to the well-known clinical symptoms described above.

We thank the reviewer for suggesting that we discuss our results in the context of paradoxical kinesia. Indeed, we have done so when discussing this research with colleagues, and we agree that it strengthens the manuscript to do so as well. We have therefore added the following text to the Introduction and Discussion of the revised manuscript:

Revised Introduction: “Notably, it has been proposed that Parkinsonian patients exhibit more bradykinesia when initiating internally-specified than stimulus-guided movements because the latter engage pathways outside of the BG (Glickstein and Stein, 1991). However, whether the BG themselves are differentially engaged by these two types of movements has not been tested”.

Revised Discussion: “Interestingly, patients with Parkinson’s disease and other BG pathologies have been reported to exhibit greater deficits in the initiation of internally-specified than visually-guided movements (Forssberg et al., 1984; Laplane et al., 1984; Azulay et al., 1999). […] Our results suggest that differential processing of internally-specified and visually-guided movements within the BG themselves may also contribute to this clinical observation”.

7) The authors analyzed the preferred and antipreferred directions separately throughout the description of the results, which makes the interpretation very complex and difficult. In the end, to interpret the data as a whole the authors focused on the difference in neuronal activity between the contralateral and ipsilateral choices (Figure 7). Aiming at this goal, it would be much better to use this measure (i.e., Ipsi-Contra) from the beginning. Below are my suggestions.

In the revised manuscript we have changed many of the figures according to the reviewers’ suggestions (Revised Figures 3–7), and we address specific comments about figures in many of our responses below. Here, we address the general idea of replacing, in Figures 3–5, our separate analyses of activity in trials in the preferred and antipreferred directions with a measure of the difference in activity between ipsiversive and contraversive trials. As described in detail below (see responses to Comments #9 and 10), we suggest that, in some cases, it makes sense to analyze the data separately for preferred and antipreferred trials.

For example, the main finding in Figure 3D is that the relationship between delay epoch activity in SG and IS trials depends on whether the movement was in the preferred or antipreferred direction of the neuron. For trials in the preferred direction, activity in IS trials was higher than activity in SG trials (Figure 3D, red circles). For trials in the antipreferred direction, activity in IS trials was lower than activity in SG trials (Figure 3D, blue circles). Author response image 1 shows the same data displayed as suggested by the reviewer (difference in activity between ipsiversive and contraversive trials plotted for SG vs. IS trials). This display does not convey the main finding as well as in Figure 3D (see response to Comment #9).

Author response image 1
Alternative display for Revised Figure 3D.

Difference in delay-epoch firing rate between ipsiversive and contraversive trials in stimulus-guided vs. internally-specified trials in the corresponding session. Black circles show neurons with a significant direction preference (corresponding to black bars in Figure 3C); gray circles show neurons without a significant direction preference (corresponding to gray bars in Figure 3C). Only correct trials are included; all choices on 50/50 SG trials were considered correct regardless of whether they were rewarded (as in Figure 3D). We believe that this display is not as informative as Revised Figure 3D (see response to Comment #9).

https://doi.org/10.7554/eLife.13833.012

Similarly, for Revised Figure 4, it is important that we display correlations separately for trials in the preferred and antipreferred directions, rather than in the ipsiversive and contraversive directions: As we clarify in the revised manuscript, given the results in Figure 3D, we would expect to find positive correlations for trials in the preferred direction and negative correlations for trials in the antipreferred direction (subsection “SNr activity differs for stimulus-guided and internally-specified movements”, last paragraph), whereas this pattern would not hold for trials in the ipsiversive and contraversive direction (see response to Comment #10).

For the analysis shown in Revised Figure 5, we could segregate trials either with respect to the direction preference of the neuron or with respect to the direction relative to the recording side. We have therefore revised this figure to show data on ipsiversive and contraversive trials separately, as suggested by the reviewer (see response to Comment #11).

Finally, the functional model shown in Revised Figure 7 (which is now simplified; see response to Comment #15) incorporates both neuronal direction preference and the direction relative to the recording side. As described in the Revised Discussion (fourth paragraph) and the legend, the relative activity of the SNr neurons shown in the model (“relative” with respect to a left vs. right SNr neuron and with respect to an IS vs. an SG movement) is based on whether the direction of movement under consideration (in this case, a rightward movement) is in the preferred or antipreferred direction of the neuron. We therefore believe that the current set of Revised Figures 3–5, which incorporate many of the reviewers’ suggestions, most clearly convey our main findings.

8) Figure 3A: Rearrange graphs in a 2x2 format: SG (left) vs. IS (right), Ipsi (top) vs. Contra (bottom). The Ipsi-Contra difference would become more evident in IS, but not SG.

We thank the reviewer for this suggestion; we agree that the suggested format makes it easier to compare across both trial type and direction. We have made this change in the revised manuscript (Revised Figure 3A, B).

9) Figure 3D: In the current version, it is unclear how individual neurons behaved. My suggestion is to plot the Ipsi-Contra difference for each neuron. The distribution of data points would be similar to the current graph. I would then analyze data statistically, including populational trend (e.g., Ipsi-Contra difference is larger in IS than SG) and classification of individual neurons (e.g., IS-preferring, SG-preferring).

Our goal in Figure 3D was to show how the activity of each direction-selective neuron (corresponding to the black bars in Figure 3C) compared between IS and SG trials. We plotted activity separately for trials in the preferred (in red) and antipreferred (in blue) direction of the neuron in order to highlight the finding that more neurons: a) exhibited higher activity in IS trials than SG trials in their preferred direction (84/101 vs. 17/101; p = 2.6 × 10⁻¹¹, χ² test), and b) exhibited higher activity in SG trials than IS trials in their antipreferred direction (76/109 vs. 33/109; p = 3.8 × 10⁻⁵, χ² test). We have clarified this finding in the summarizing sentence of the relevant paragraph in the revised manuscript (subsection “SNr activity differs for stimulus-guided and internally-specified movements”, second paragraph).
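As a rough check on the counts quoted above, the following sketch runs a one-degree-of-freedom χ² goodness-of-fit test against an even split. That this is the form of χ² test used is an assumption (the response does not specify it), and the function name `chi2_even_split` is hypothetical. Under that assumption the p-values come out on the same order of magnitude as those reported.

```python
import math


def chi2_even_split(n_a, n_b):
    """Chi-square goodness-of-fit test of two counts against a 50/50 split.

    Assumption: no continuity correction, 1 degree of freedom.
    For 1 degree of freedom the chi-square survival function equals
    erfc(sqrt(chi2 / 2)), so only the standard library is needed.
    Returns (chi2, p).
    """
    expected = (n_a + n_b) / 2.0
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Counts taken from the text.
# Preferred direction: neurons with IS > SG vs. neurons with SG > IS.
chi2_pref, p_pref = chi2_even_split(84, 17)
# Antipreferred direction: neurons with SG > IS vs. neurons with IS > SG.
chi2_anti, p_anti = chi2_even_split(76, 33)

print(f"preferred:     chi2 = {chi2_pref:.2f}, p = {p_pref:.2g}")
print(f"antipreferred: chi2 = {chi2_anti:.2f}, p = {p_anti:.2g}")
```

The closed-form survival function is used only to keep the example free of external dependencies; a statistics package would give the same quantity.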

In addition, we have plotted the data as suggested by the reviewer (Author response image 1). As the reviewer predicted, the difference between firing rates on ipsiversive and contraversive trials tends to be larger on IS than SG trials (i.e., the cloud of points is skewed vertically relative to the x = y line). However, without showing the IS vs. SG data separately for trials in the preferred and antipreferred direction (Author response image 1), the main finding of this figure – that more neurons exhibited higher activity in IS trials than SG trials in their preferred direction, and more neurons exhibited higher activity in SG trials than IS trials in their antipreferred direction (as described above) – is not clearly conveyed. We therefore believe that Figure 3D most clearly achieves our intended goal.

10) Figure 4: In the current version, 4A has the same issue as for Figure 1D, and 4B is almost meaningless. I would use the Ipsi-Contra difference for 4A for each neuron. Then, show the across-trial changes (perhaps, using a regression line) for a number of neurons; in this case, it may be useful to use the absolute value of the Ipsi-Contra difference, because it would increase across trials. My other suggestion is to plot the Ipsi-Contra difference across time within a trial; the value would increase more for IS than SG (as in Figure 3B).

In Revised Figure 4, we have now calculated the correlation between the firing rate and the value of each port [calculated using a reinforcement learning model, as suggested by Reviewer 2 (see Comment #22)], trial by trial (subsection “SNr activity differs for stimulus-guided and internally-specified movements”, last paragraph). We agree with Reviewer 2 that this model provides a more sophisticated method for estimating the extent to which movements are selected based on internal representations, which we expect to increase during the course of an IS block. We believe that this new analysis also addresses the current comment by strengthening Figure 4 in the manner that Reviewer 1 intended. In addition, we have clarified that the results shown in Figure 4B are consistent with the pattern of results shown, separately for trials in the preferred and antipreferred direction, in Figure 3D (in the aforementioned paragraph).

11) Figure 5: The current version includes multiple comparisons, although the graphs do not visually show the results of the comparisons. My suggestion is to present simple histograms showing the absolute values of the Ipsi-Contra difference, separately for IS, SG (easy), and SG (difficult).

Our goal in Figure 5 was to display whether the apparent difference in activity preceding IS and SG movements could be accounted for by a dependence on difficulty, or an associated variable such as uncertainty or the estimated value of each movement direction. In order to do so, we must show a within-neuron comparison between easy and difficult discrimination trials, and also show which neurons exhibited a difference between SG and IS trials. If the same neurons that exhibited a difference between SG and IS trials also showed a dependence on difficulty, it would indicate that the apparent dependence on trial type (SG vs. IS, shown in Figure 3D) was actually due to a dependence on difficulty (easy vs. difficult).

We have displayed the data as suggested by the reviewer (Author response image 2). As expected given the results shown in Revised Figure 3D, the difference in firing rate between ipsiversive and contraversive trials appears to be larger in IS (Author response image 2A) than in SG trials (Author response image 2B, C). However, in this display, the within-neuron comparison is lost, as is the relationship between trial-type- and difficulty-dependence. We therefore believe that this display is not maximally informative.

Author response image 2
Alternative display for Revised Figure 5.

Difference in delay-epoch firing rate between ipsiversive and contraversive trials in easy SG trials (A), difficult SG trials (B), and IS trials (C). Neurons with a significant difference in ipsiversive vs. contraversive firing rate are shown in black. We believe that this display is not as informative as Revised Figure 5 (see response to Comment #11).

https://doi.org/10.7554/eLife.13833.013

However, in Revised Figure 5, we have now separated the panels by absolute direction (ipsiversive and contraversive), as suggested by the reviewer (see Comment #7), instead of by the preferred or antipreferred direction of the neuron. In addition, we have made it much easier to see that there was little overlap (purple) between those neurons that exhibit dependence on trial type (red) and those that exhibit a dependence on difficulty (blue). For further clarity, those direction-selective neurons that do not exhibit a dependence on either trial type or difficulty (ipsiversive: n = 94; contraversive: n = 108) are not shown, and we have narrowed the range shown on the x- and y-axes.

12) Figure 6: This figure is too complicated and does not add any support for the conclusion. Moreover, the data include both SG and IS, which I find no benefit. The previous choice should have a larger effect in IS than SG. The current choice should start earlier for IS than SG. My suggestion is to do the same analysis separately for SG and IS, although it may provoke more issues.

As clarified in the revised manuscript, we believe that the regression analysis shown in Figure 6 is valuable because it provides an unbiased method for determining which of several inter-related factors influence neural activity (subsection “Modulation of SNr activity by task-relevant variables”, second paragraph). In addition, we have added text to the Results section to clarify how the results of this analysis contribute to our conclusions (subsection “Modulation of SNr activity by task-relevant variables”, second paragraph). Specifically, we explain that the observed dependence on previous choice is particularly interesting because this variable is critical for determining, in an IS block, which direction is associated with reward (subsection “Modulation of SNr activity by task-relevant variables”, third paragraph). Finally, we have simplified Revised Figure 6B by deleting the line and shading corresponding to a dependence of firing rate on reaction time, which we agree was a distraction from our main conclusion.

Further, we attempted to perform the same analysis as in Revised Figure 6B separately for SG and IS trials, as suggested by the reviewer. However, as the reviewer noted might be the case, there are problems interpreting the results: In IS trials, previous choice and current choice are highly correlated (by design). Therefore, we cannot disambiguate which of these factors activity depends upon.

13) I have to add one issue to my own suggestion – focus on the Ipsi-Contra difference. The feature that is orthogonal to this measure is the average magnitude of the Ipsi & Contra activity. If this is high, SC on both sides would be inhibited and the action (e.g., orienting) would be suppressed in general. If low, the action would be enhanced in general. The authors need to consider if this feature contributes to any difference in animals' choice behavior in general.

This is a very good point. While we show how several factors modulate activity from its baseline level (Figures 3D, 5, and 6B, and Table 1), the absolute level of activity is important for determining the level of inhibition on downstream motor structures. However, given the context of the decision required by our task (to move left vs. to move right), we believe that the critical factor is the relative difference between activity promoting leftward and rightward movements, as illustrated in Revised Figure 7. We have clarified this point in the revised manuscript (Discussion, fourth paragraph).

Below, I add my comments (rather than suggestions) which can be critical.

14) Activity of SNr neurons started differential (ipsi vs. contra) earlier in IS trials (even before odor delivery) than SG trials (basically after odor delivery), as exemplified in Figure 3B. Now, the magnitude of neuronal activity was measured during the delay epoch which started with odor delivery. This raised the possibility that the differential activity was higher in IS trials than SG trials simply because the differential activity started earlier in IS trials. Please answer this question.

We agree with the reviewer that direction preference during the delay epoch could be stronger in IS trials than SG trials in part because mice could select their movement direction prior to stimulus delivery on IS, but not SG, trials, allowing direction preference to develop earlier in IS trials. To quantify this, we calculated the difference in the strength of preference between SG and IS trials during the delay epoch and during the epoch between odor port entry and odor valve opening (which we refer to here as the “prestimulus” epoch). We indeed found that the larger the difference between SG and IS trials during the prestimulus epoch, the larger the difference during the delay epoch (r = 0.47, p = 4.8 × 10⁻¹⁸; Author response image 3). These results are consistent with a) the idea that, in the real world, internally-specified movements can be selected earlier than stimulus-guided movements; and b) the value-biasing view of BG function (Hikosaka et al., 2006), in which reward modulates activity prior to stimulus delivery, resulting in more valuable movements being facilitated relative to less valuable movements. In the revised manuscript, we note this correlation analysis and provide the results (Discussion, second paragraph), and we clarify that the value-biasing view also hypothesizes that SNr activity is modulated by reward prior to stimulus presentation (Introduction, first paragraph and Discussion, third paragraph).
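The across-neuron correlation described above can be sketched as a plain Pearson coefficient between two per-neuron quantities: the SG-minus-IS difference in preference strength in the prestimulus epoch and the same difference in the delay epoch. The function name `pearson_r` and the numbers below are hypothetical placeholders for illustration, not data from the study, and how preference strength itself is computed is not covered here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-neuron values (one entry per neuron): difference in
# preference strength between SG and IS trials in each epoch.
prestim_diff = [0.02, -0.10, 0.15, 0.30, -0.05]
delay_diff = [0.05, -0.20, 0.10, 0.45, 0.00]

r = pearson_r(prestim_diff, delay_diff)
print(f"r = {r:.2f}")
```

A positive `r` here corresponds to the pattern described in the response: neurons with a larger SG vs. IS difference before the stimulus tend to show a larger difference during the delay.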

Author response image 3
Relationship between activity during prestimulus and delay epochs.

Abscissa shows the absolute value of the difference in strength of direction preference between SG and IS trials during the delay epoch, where the sign preserves the relationship between preference in SG and IS trials across epochs (e.g., if prefSG > prefIS in the prestimulus epoch, ordinate shows prefSG – prefIS). Black circles show neurons with a significant direction preference (corresponding to black bars in Figure 3C); gray circles show neurons without significant direction preference (corresponding to gray bars in Figure 3C). See response to Comment #14.

https://doi.org/10.7554/eLife.13833.014

We considered including Author response image 3 in the Results section of the revised manuscript itself, but ultimately decided against doing so: Our analyses in Revised Figures 3–5 and Table 1 currently focus on the delay epoch, and introducing the prestimulus epoch in Revised Figure 3 (where this panel would most naturally fit) would therefore disrupt the flow of the manuscript. Please also see our response to related Comment #2.

15) The different connections shown in Figure 7 may explain the functions of ipsiversive and contraversive SNr neurons, but this is a hopeful scheme with no evidence or hints. Virtually all anatomical studies have shown that the majority of the SNr-SC connections are ipsilateral in various mammals. The contralateral SNr-SC connection is very few in rats (Beckstead et al., J Neurosci 1981), which may be true in mice. The authors found that contraversive SNr neurons (thus, projecting to the contralateral SC) are slightly more common than ipsiversive SNr neurons (thus, projecting to the ipsilateral SC). This raises a concern about the scheme in Figure 7.

We agree that the idea that contraversive-preferring SNr neurons comprise the crossed nigrotectal projection is speculative and remains to be tested, and therefore distracts from our main point about how SNr output may promote internally-specified movements. In the revised manuscript, we have therefore removed nearly all of our discussion about the crossed projection, we have simplified Revised Figure 7 to show only the (more numerous, and more commonly studied) uncrossed nigrotectal neurons in relation to stimulus-guided and internally-specified movements, and we have clarified the description of Figure 7 in the legend. In addition, we explicitly state that, while contraversive-preferring SNr neurons may project to the contralateral SC, they may also project to non-tectal targets (e.g., thalamus) (Discussion, fifth paragraph). Finally, at the end of the Introduction, we clarify that “internally-specified movements may be promoted over stimulus-guided movements by BG activity”.

Other comments and questions:

16) Results: "if value is only one of a number of internal representations contributing to movement selection, then these movements would differentially engage the BG." What does this mean?

In the Introduction of the revised manuscript, we clarify our main objective as examining whether “the BG primarily mediate the influence of value, [in which case] their output when selecting equally valuable stimulus-guided and internally-specified movements would be similar. However, if the BG are not limited to mediating the influence of value but instead mediate the influence of internal goals in general, their output would differ under these two conditions”. Likewise, we have rewritten the sentence in question as: “However, if other internal representations beyond value are also integrated by the BG, then stimulus-guided and internally-specified movements may differentially engage the BG despite being equally valuable”.

17) Results: "We recorded from 298 well isolated left SNr neurons"

Why did you record SNr neurons only on the left side in 4 mice?

We assume – fairly, we believe – that the relationship between the activity of neurons in each SNr and movement direction is symmetric, such that relative direction (ipsiversive vs. contraversive) is important to activity while absolute direction (left vs. right) is not. Therefore, it is only necessary to record from one SNr. Since it is also more experimentally convenient to do so, this is standard practice in our and other awake behaving recording experiments (e.g., Thompson and Felsen, 2013; Kiani et al., 2014). In our bilateral model (Figure 7), we treat right SNr neurons as “antineurons” to the left SNr neurons that we recorded, in the sense that we assume the same relationship between neural activity and relative direction (ipsiversive or contraversive) (Britten et al., 1992).

18) Results: "while we found that neurons were equally likely to prefer upcoming ipsiversive and contraversive movements (Figure 3C), they were more likely to exhibit a higher firing rate preceding IS movements in their preferred direction and preceding SG movements in their antipreferred direction (Figure 3D)."

It is unclear what this sentence means.

In the revised manuscript, we have clarified that SNr neurons “were more likely to exhibit a higher firing rate preceding IS – as compared to SG – movements in their preferred direction, and preceding SG – as compared to IS – movements in their antipreferred direction (Figure 3D)” (Results).

19) Results: "We observed no difference in activity between SG trials in which different mixtures were presented but movement direction was held constant"

Please show statistical evidence.

Although we have deleted this particular sentence, we address the underlying idea in the revised manuscript. We performed an ANOVA to examine whether activity depended on mixture ratio separately for each direction, and found that the activity of some individual neurons depended on mixture ratio (or difficulty or an associated variable) (ipsiversive direction: 39/216 neurons; contraversive direction: 31/216 neurons, p < 0.05, 1-way ANOVA across mixture ratios, Revised Figure 5A, B; subsection “Modulation of SNr activity by task-relevant variables”, first paragraph). Please also see our response to Comment #24.

20) Results: "The difficulty of this decision could, in principle, affect SNr activity." Another possibility is 'uncertainty', rather than difficulty.

In the revised manuscript, we have edited this sentence by replacing “choice confidence” with “uncertainty;” it now reads, “The difficulty of this decision – or an associated variable, such as uncertainty, or the estimated value of each movement direction – could, in principle, affect SNr activity […]”.

21) Results: "there was little overlap between this small population of difficulty-dependent neurons and those neurons that exhibited a dependence on trial type (Figure 5A, B)". This notion is unclear simply by looking at these graphs.

See response to comment #11.

Reviewer #2:

The authors present results from a study in which they recorded neural activity in the SNr while mice carried out a task in which they chose left or right reward ports either on the basis of an odor mixture (stimulus guided) or on the basis of previously rewarded outcomes (internally-guided). They examined neural activity, mostly during the choice period and found that SNr neurons coded ipsiversive and contraversive movements with about equal frequency. Neurons responded more strongly for internally-guided movements in their preferred direction, and more strongly for stimulus guided movements in their anti-preferred direction. They also correlated with learning in the IS blocks.

This study addresses an interesting question about the role of the basal ganglia, assessed from the perspective of an output nucleus, in internally generated vs. externally cued movements. Overall the study was well carried out. The dataset is reasonable and the results are clearly presented. The authors make a strong claim about matching the values of the movements. However, the analytical approach to matching them could be more detailed. Currently it is a bit qualitative. Specific suggestions follow.

22) The analysis of the effects of learning on firing rates shows a reasonable correlation. But more sophisticated methods exist for examining such relationships. Specifically, it would be better to fit a simple reinforcement learning model to the data, and correlate value with neural activity. The analysis that correlates number of consecutive correct trials is a bit harder to interpret, although it may be getting at the same point.

This is an excellent suggestion. We have now used a reinforcement learning model to estimate the value associated with movement in each direction on each trial of the task (subsection “SNr activity differs for stimulus-guided and internally-specified movements”, last paragraph and subsection “Sign of activity change during delay epoch”). Briefly, we iteratively updated the value associated with each direction in each trial as $V_{dir,t} = V_{dir,t-1} + \alpha\,(R_{dir,t-1} - V_{dir,t-1})$, where $R_{dir,t-1}$ is the reward for the given direction in the previous trial in which that direction was chosen (0 for unrewarded and 1 for rewarded) and $\alpha$ is the learning rate (set to 0.1; similar results were obtained with a range of values) (Sutton and Barto, 1998). Since we calculated $V_{dir,t}$ in each trial of the session (including SG and IS trials), it tended to start near 0.5 (but was not exactly 0.5) at the beginning of each IS block. In well-behaved IS blocks, $V_{dir,t}$ tended to asymptotically approach 1 as the mouse consistently returned to the rewarded port (Figure 4A). $V_{dir,t}$ was calculated separately for the ipsiversive and contraversive directions and was only updated in trials in which that direction was selected.
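The update rule described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name `estimate_values`, the direction labels, and the starting value `v0 = 0.5` at the first trial of the session are assumptions (the response states only that values tended to be near 0.5 at the start of each IS block).

```python
def estimate_values(choices, rewards, alpha=0.1, v0=0.5):
    """Trial-by-trial value estimate for each movement direction.

    choices: sequence of "ipsi" / "contra" (direction chosen on each trial)
    rewards: sequence of 0 / 1 (unrewarded / rewarded)
    alpha:   learning rate (0.1 in the text)
    v0:      assumed starting value for both directions (hypothetical)

    Returns a list with one dict per trial holding the value of each
    direction *before* that trial's outcome is applied.
    """
    values = {"ipsi": v0, "contra": v0}
    history = []
    for choice, reward in zip(choices, rewards):
        history.append(dict(values))
        # Only the chosen direction is updated, as described in the text:
        # V <- V + alpha * (R - V)
        values[choice] += alpha * (reward - values[choice])
    return history


# Hypothetical short session: two rewarded ipsiversive choices,
# then one unrewarded contraversive choice.
example = estimate_values(["ipsi", "ipsi", "contra"], [1, 1, 0])
for trial, v in enumerate(example):
    print(trial, round(v["ipsi"], 3), round(v["contra"], 3))
```

With repeated rewarded choices of the same direction, the value of that direction climbs toward 1 while the unchosen direction stays where it was, which is the asymptotic behavior the response describes for well-behaved IS blocks.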

In Revised Figure 4, we have replaced “consecutive correct trials” with this estimate of value in each IS trial, which provides a more rigorous analysis and also strengthens our results: 77/157 neurons exhibited a significant correlation (p < 0.05) between firing rate and the number of consecutive correct trials for either direction [Revised Figure 4B; 35/77 for trials in the preferred direction (red circles), 29/77 for trials in the antipreferred direction (blue circles), and 13/77 for trials in both directions (purple circles)], with more positive correlations for trials in the preferred direction (p = 2.4 × 10⁻⁶, χ² test) and negative correlations for trials in the antipreferred direction (p = 5.9 × 10⁻⁶, χ² test), as we would expect given the pattern of results shown in Figure 3D (subsection “SNr activity differs for stimulus-guided and internally-specified movements”, last paragraph).

23) The RL model would also allow you to more rigorously examine similarities between the reward value representations in the SG and IS blocks. At this point they are only being qualitatively compared.

This is a fair point, and important to address on its own. While we designed the task such that correctly performed trials were equally rewarded, and therefore valuable (Abstract, Introduction, last paragraph), we could not entirely predict and/or control for the animals’ choice behavior, and therefore the value of different trial types that was actually experienced is not necessarily identical. We now clarify this point in the revised manuscript (Discussion, first paragraph). Indeed, when we compared the average value of IS and SG trials (estimated in each IS trial with the reinforcement learning model and in each SG trial using the block-by-block psychometric functions, as suggested in Comment #25) within each mouse, we found that, for 3 mice, IS trials were slightly more likely to be rewarded and, for 1 mouse, SG trials were slightly more likely to be rewarded (although the timescale over which value for each mixture ratio should best be estimated, whether by trial, block, session, or lifetime, is an open and interesting question). However, in Revised Figure 5 we directly examined whether the difference we observed between IS and SG trials in Revised Figure 3D could be due to a dependence on the value of the movement. Specifically, as suggested by the reviewer (see response to Comment #24), we performed an ANOVA to examine whether activity depended on mixture ratio. Given that mice performed at different levels of accuracy on different mixture ratios (Figure 3C), and we used accuracy as a proxy for movement value in the RL model, this analysis addresses whether activity is influenced by value. We found that, while the activity of some neurons was influenced by value, this dependence did not account for the observed difference between IS and SG trials (Revised Figure 5; subsection “Modulation of SNr activity by task-relevant variables”, first paragraph).

24) For the analysis of the effects of difficulty in the SG blocks on neural activity, why not just run an ANOVA with difficulty level as a factor? Comparing hard vs. difficult levels is approximately doing this, but since the task design used fixed levels, why not just use them all in the analysis? This could be done in an ANOVA with odor mixture as a factor as well.

As suggested, we performed an ANOVA to examine whether activity depended on the mixture ratio presented in SG trials, separately for each direction. Mixture ratio is a useful variable that can serve as a proxy for several other quantities, including discrimination difficulty, uncertainty, accuracy, and the value associated with each side. For example, compared to a 60/40 mixture ratio, a 95/5 mixture ratio is associated with decreased discrimination difficulty, decreased uncertainty, increased accuracy, and increased value at the right port (corresponding to the increased accuracy and therefore higher likelihood of reward, as noted in Comment #25). Therefore, this analysis allows us to examine the influence of several potential factors on firing rate.
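For readers unfamiliar with the test, a one-way ANOVA across mixture ratios for a single neuron and a single movement direction can be sketched as below. The function name `one_way_anova_f`, the grouping, and the firing rates are hypothetical and shown only for illustration; the sketch returns the F statistic and degrees of freedom, and converting F to a p-value would require an F-distribution survival function from a statistics library.

```python
def one_way_anova_f(groups):
    """One-way ANOVA F statistic.

    groups: list of lists, one list of firing rates per mixture ratio.
    Returns (F, df_between, df_within).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more samples than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    if ss_within == 0:
        raise ValueError("F is undefined when within-group variance is zero")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical delay-epoch firing rates (spikes/s) for one neuron on
# ipsiversive SG trials, grouped by three illustrative mixture ratios.
rates_by_ratio = [
    [10.0, 12.0, 11.0],  # e.g. an easy mixture
    [11.0, 13.0, 12.0],  # e.g. an intermediate mixture
    [12.0, 14.0, 13.0],  # e.g. a difficult mixture
]
f_stat, df_b, df_w = one_way_anova_f(rates_by_ratio)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Running the test once per neuron and per direction, as the response describes, yields the counts of neurons whose activity depends on mixture ratio.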

For ipsiversive movements, we found that the activity of 39 of the 216 direction-selective neurons depended on difficulty (p < 0.05). However, consistent with the results shown in Figure 5A, difficulty dependence was unlikely to explain the dependence on trial type (SG vs. IS): The activity of 21/109 trial-type-selective neurons was difficulty-dependent; this ratio did not differ from the 18/107 non-trial-type-selective neurons that exhibited difficulty dependence (p = 0.32, χ2 test). Similarly, for contraversive movements, we found that the activity of 16/101 trial-type-selective neurons was difficulty-dependent; this ratio did not differ from the 15/115 non-trial-type-selective neurons that exhibited difficulty dependence (p = 0.56, χ2 test). Given that the value associated with a given side is determined by mixture ratio, these results further support the idea that the dependence of activity on trial type is not due to a dependence on value (see response to Comment #23). We have added the results of the analysis to the revised manuscript (subsection “Modulation of SNr activity by task-relevant variables”, first paragraph).
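The comparison of proportions above can be illustrated with a Pearson χ² test on a 2×2 table (trial-type-selective or not, by difficulty-dependent or not). The sketch assumes a two-sided test with no continuity correction, and the function name `chi2_2x2` is hypothetical. The response does not state which variant of the χ² test was used, so the output need not match the reported p-values.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square test of independence for the 2x2 table [[a, b], [c, d]].

    Assumption: no continuity correction. With 1 degree of freedom the
    survival function is erfc(sqrt(chi2 / 2)). Returns (chi2, p).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table sums to zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Ipsiversive counts from the text: 21 of 109 trial-type-selective neurons
# and 18 of 107 non-trial-type-selective neurons were difficulty-dependent.
chi2_ipsi, p_ipsi = chi2_2x2(21, 109 - 21, 18, 107 - 18)
# Contraversive counts from the text: 16 of 101 and 15 of 115.
chi2_contra, p_contra = chi2_2x2(16, 101 - 16, 15, 115 - 15)

print(f"ipsiversive:   chi2 = {chi2_ipsi:.3f}, p = {p_ipsi:.2f}")
print(f"contraversive: chi2 = {chi2_contra:.3f}, p = {p_contra:.2f}")
```

In both directions the statistic is small, consistent with the conclusion that difficulty dependence is about equally common among trial-type-selective and non-selective neurons.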

25) Also, if an RL model is fit to the learning data, this can be used to estimate accuracy as a function of trials in IS blocks. Accuracy can also be extracted from the psychometric curves in the SG blocks. Accuracy could then be used as a factor in an ANOVA analysis of neural activity. How many neurons show a main effect of accuracy? How many show an accuracy by task interaction? Log accuracy should also be analyzed, as this is often more correlated with neural activity in choice tasks than accuracy.

As the reviewer suggests, we calculated value (i.e., accuracy; see response to Comment #24) in each IS trial using the reinforcement learning model (see response to Comment #23) and in each SG trial using the psychometric functions. In order to determine the influence of value on firing rate, rather than using an ANOVA – which we believe is not well suited to this analysis because value is a continuous variable, ranging from 0 to 1 for both IS and SG trials – we included a term for value (or log value) and the interaction between value (or log value) and trial type in our regression analysis (subsection “Regression model”). Of the 296 neurons in our population, we found that the firing rate of 108 neurons was influenced by log value and the firing rate of 49 neurons was influenced by the interaction between log value and trial type (some were influenced by both); the firing rate of 105 neurons remained dependent on trial type. Results were similar when we used value itself as a factor instead. These results are as we might expect, given the known dependence of movement-related SNr activity on the value of the movement (Bryden et al., 2011; Sato and Hikosaka, 2002). We have included this information in the revised manuscript (in the aforementioned subsection).

We believe, however, that incorporating these results into Revised Figure 6A itself would detract from the effectiveness of this panel. One of our goals in this panel is to present the extent to which particular variables that, by design, correlate with trial type (previous choice and reaction time) can account for the apparent dependence of firing rate on trial type (Figure 3D). It could have been the case, for example, that the activity of very few neurons depends on trial type when these other variables are controlled for—but this is not what we observed (Revised Figure 6A). Since we have already shown directly that the dependence of firing rate on trial type is not due to a dependence on value (Revised Figure 5), including value in Revised Figure 6A would be somewhat redundant. Further, incorporating additional factors into this panel would be nontrivial, requiring a 5-set Venn diagram (since we cannot eliminate any of the original factors) and further complicating an already complex panel (see Comment #12).

26) A similar study has been carried out in macaques, comparing lateral prefrontal and the caudate nucleus, Seo et al., Neuron, 2012. Which areas do the authors think would be more strongly involved in the SG component of the task?

We thank the reviewer for calling attention to the study by Seo et al. (2012), in which a similar behavioral paradigm (with “fixed” and “random” blocks of trials) was used to examine representations of action selection and action value in the lateral prefrontal cortex and dorsal striatum. In the revised manuscript, we now contextualize our behavioral paradigm with reference to this paper, as well as other recent studies that employed variants of “fixed-choice” and “free-choice” trials (Pastor-Bernier and Cisek, 2011; Seo et al., 2012; Ito and Doya, 2015; Introduction, last paragraph).

While Seo et al. (2012) found that lateral prefrontal cortex and dorsal striatum were each engaged by both the fixed and random trials, they noted some interesting differences between the regions: the selected action was more robustly represented in lateral prefrontal cortex, and action value was represented in dorsal striatum. It would be interesting to record from these regions (among several others!) in our task. We speculate that dorsal striatum would be more strongly engaged by our IS trials, in which action values need to be re-learned based on choice and reward history, than our SG trials (although it would continue to represent action values during SG trials). Perhaps lateral prefrontal cortex would be more strongly engaged by our SG than IS trials, since selected movements in random trials (which are somewhat analogous to our SG trials) were represented earlier in lateral prefrontal cortex than dorsal striatum.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below:

1) One of the reviewers still feels strongly that the main point of analyzing ipsi /contra versus preferred/non preferred direction was not understood. So the reviewer now provides a more detailed explanation of what was the point. We ask the authors to please submit new analyses and details regarding this point (where possible).

We thank Reviewer #1 for the further explanation on this point. We have performed new analyses and revised Figures 3 and 4 in order to address the comments; please see our response to Comment #4.

2) In addition, it would be good to fix some of the small points raised.

We are happy to address all of the concerns raised, as described in our responses to Comments #5-7, and we have revised the manuscript accordingly.

3) Finally, the authors still insist that the data presented are not consistent with a value-biasing view of basal ganglia function. The editors and reviewers felt that the data and results presented focus on the difference between stimulus-driven versus internally-guided movements, which is orthogonal to value biasing (and does not formally exclude value biasing). Therefore, we urge the authors to further revise the manuscript regarding this point.

We agree that our results focus on how substantia nigra pars reticulata (SNr) activity differs between stimulus-guided and internally-specified movements, and are in no way inconsistent with the value-biasing view of basal ganglia function. As listed below, we have edited several portions of the manuscript to further clarify this point; we bold the main changes here for clarity. We believe these changes, in addition to the changes in the previous submission, clarify that our results do not argue against – but rather extend – the value-biasing hypothesis. Please note that in the Introduction of both the previous and current submission, we have retained our description of the value-biasing view because it provides appropriate background on how the BG have primarily been studied with respect to modulatory influences on movement selection, which we build on here.

Revised Abstract: We changed “Here, we examine whether value specifically, or internal goals more generally, influence movements via the basal ganglia” to “Here, we examine whether other internal goals, in addition to value, also influence movements via the basal ganglia”. In addition, we deleted the clause “which is not accounted for by the value-biasing view of basal ganglia function” from the end of the following sentence: “We found that activity in the substantia nigra pars reticulata, a basal ganglia output, predictably differed preceding internally-specified and stimulus-guided movements”.

Revised Introduction: Original text: “We therefore asked whether BG activity mediates the influence of value specifically or, more generally, of internal goals on movement selection. […] However, if the BG are not limited to mediating the influence of value but instead mediate the influence of internal goals in general, their output would differ under these two conditions.” Revised text: “We therefore asked whether BG activity mediates the influence of internal goals, in addition to value, on movement selection. We reasoned that, if this were the case, BG output would differ when selecting equally valuable stimulus-guided and internally-specified movements.” In addition, we deleted the clause “beyond value” from the end of the sentence “We found that SNr activity predictably differed between these two conditions, supporting the idea that the BG mediate the influence on movement selection of internal goals”.

Revised Results: We changed “According to the value-biasing view of BG function, equally valuable stimulus-guided and internally-specified movements would similarly engage the BG. However, if other internal representations beyond value are also integrated by the BG, then stimulus-guided and internally-specified movements may differentially engage the BG despite being equally valuable. In the latter case…” to “If the BG integrates not only value but also other internal representations, then stimulus-guided and internally-specified movements may differentially engage the BG despite being equally valuable. In this case […]”. We recognize that the previous version made it seem as if the predicted results would refute the value-biasing view, which was not our intent.

Revised Discussion: We changed “Our results therefore extend the model underlying the value-biasing view of BG function (Hikosaka et al., 2006) by suggesting that the influence of the SNr on downstream motor regions is modulated by internal representations beyond value alone” to “Our results therefore extend the model underlying the value-biasing view of BG function (Hikosaka et al., 2006) by suggesting that the influence of the SNr on downstream motor regions is modulated by internal representations in addition to value”.

Reviewer #1:

4) As I mentioned before, I am not arguing that the authors' data are wrong. I have been trying to find a more straightforward way to present the authors' data. My main concern is that the current version may not transmit the basic and essential part of the authors' message to the readers. For this reason, I still have a strong suggestion to use the neuronal activity difference between ipsilateral and contralateral choices, for the following reasons.

Let's start from Figure 3B. Does this neuron contribute to the choice? To answer this question, I would set this neuron in SNr on both sides. If the animal decides to choose Rightward movement, the neuron in Right SNr (i.e., Ipsi) is more active than the neuron in Left SNr (i.e., Contra). This would lead to the bias toward Rightward movement (Figure 7). Importantly, the choice bias is stronger in IS than SG (Figure 3B). Therefore, the neuron does contribute to the choice using the difference in its activity between Ipsi and Contra choices, and does so more strongly in IS than SG condition.

Checking the neuron's activity for only one choice (i.e., preferred or anti-preferred) would not indicate whether the neuron can contribute to the choice. That's why the data points shown in Figures 3 and 4 provide no clear message. Two data points are associated with one neuron. By looking at these two points in Figure 3D, we can guess if the neuron can contribute to the choice; the slope between the two data points indicates the ratio of the directional bias (IS-bias/SG-bias). But, I bet most readers would not reach such detailed understanding. Moreover, the pair of data points is shown for only one example neuron. How about the others?

I understand the authors' logic. The average slope of all data points (regression line) in Figure 3D is higher than 1. Therefore, IS-bias is larger than SG-bias overall. But, the basic information – choice signal carried by individual neurons – is missing. Let me ask a basic question. How many SNr neurons were capable of contributing to the choice? Would the data in Figure 4 answer the question? I doubt it. For example, there are neurons that showed significant correlations with both preferred and anti-preferred directions. Some of them had the same polarities; they would increase (or decrease) their activity during both Ipsi and Contra IS blocks. Such neurons are probably not capable of contributing to the correct choice.

How can we quantify the choice capability of each neuron, then? My only suggestion is to use the difference between Ipsi and Contra directional choices. This exactly matches the model shown in Figure 7.

We thank the reviewer for further clarifying this concern, which we believe we now address. We entirely agree that the difference in firing rate between ipsiversive and contraversive trials is critical. In the revised manuscript we now display the data as recommended by the reviewer, in two separate panels depending on the direction preference of the neuron (calculated during SG and IS trials combined; see response to Comment #6; Revised Figure 3D; subsection “SNr activity differs for stimulus-guided and internally-specified movements”, second paragraph). This way, it is clear that for ipsiversive-preferring neurons, (activity in ipsiversive trials – activity in contraversive trials) is greater for internally-specified (IS) than stimulus-guided (SG) trials, and for contraversive-preferring neurons, (activity in contraversive trials – activity in ipsiversive trials) is also greater for IS than SG trials. In other words, the difference in firing rate between trials in the preferred and antipreferred directions tends to be higher for IS trials than SG trials. We believe that this format most clearly conveys the difference in firing rate between ipsiversive and contraversive trials across our population of neurons. (When ipsiversive- and contraversive-preferring neurons are combined in one panel, as was the case in Author response image 1 of the previous submission, the main point is less clear.) For consistency with Revised Figure 3D, we have also separated ipsiversive- and contraversive-preferring neurons in Revised Figure 4.
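The comparison described above can be illustrated with a minimal sketch. It is not the analysis code used for Revised Figure 3D: the function name, the data layout, and the firing rates are hypothetical, chosen only to show how the preferred-minus-antipreferred difference would be computed separately for SG and IS trials.

```python
from statistics import mean

def preferred_minus_antipreferred(rates, preferred):
    """Return mean(preferred) - mean(antipreferred) firing rate per trial type.

    rates: dict mapping (trial_type, direction) -> list of delay-epoch
           firing rates in spikes/s (hypothetical layout).
    preferred: "ipsi" or "contra", the neuron's preferred direction.
    """
    anti = "contra" if preferred == "ipsi" else "ipsi"
    return {
        trial_type: mean(rates[(trial_type, preferred)])
        - mean(rates[(trial_type, anti)])
        for trial_type in ("SG", "IS")
    }

# Made-up rates for one hypothetical ipsiversive-preferring neuron.
example_rates = {
    ("SG", "ipsi"): [22.0, 24.0, 26.0],
    ("SG", "contra"): [18.0, 20.0, 22.0],
    ("IS", "ipsi"): [28.0, 30.0, 32.0],
    ("IS", "contra"): [16.0, 18.0, 20.0],
}

diff = preferred_minus_antipreferred(example_rates, "ipsi")
print(diff)  # {'SG': 4.0, 'IS': 12.0}
```

In this invented example the difference is larger in IS than in SG trials, which is the pattern the response describes for the population.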

We note that, although we no longer include the previous version of Figure 3D, we retain a similar analysis in order to classify neurons as trial-type dependent (i.e., how activity differs between IS and SG trials): We compare firing rates between SG and IS trials separately for the preferred and antipreferred direction of the neuron (in the aforementioned paragraph). We use this classification for the analyses shown in Revised Figure 4 and Figure 5.

Finally, we agree that the connection between Revised Figure 3D and our model (Figure 7) is now more straightforward, and we have significantly simplified the relevant text accordingly (Discussion, fourth and fifth paragraphs).

Other than my critical comments, I do find some of the authors' data interesting. For example, I find two dominant groups of neurons in Figure 4B: 1) neurons showing positive correlations with their preferred directions, and 2) neurons showing negative correlations with their anti-preferred directions. This indicates that both groups of neurons enhanced their choice signals, but in different ways: 1) enhancing the dominant signal, and 2) suppressing the non-dominant signal. This reminds me of a basic mechanism of motor control: To change the orientation of a joint, the agonist muscle contracts or the antagonist muscle relaxes, or both contract (or relax) in different amounts. These mechanisms may be useful in different contexts. In this sense, the authors' data may suggest that different groups of SNr neurons contribute to such different mechanisms of joint orientation, in this case, head or eye orientation.

However, this question is one step beyond the main question: Which neurons can contribute to the choice of orientation? To address this main question, the authors should focus on the difference in each neuron's activity between the Ipsi- and Contra-directions, which is similar to the difference in contraction force between agonist and antagonist muscles.

I have some specific questions (below).

5) Data in Figure 3D are different from the ones in the original manuscript. In particular, the two data points for the example neuron are very different. Please give me an explanation.

As noted in our response to Comment #2 on our original submission, in the original Figure 3D we had reported results calculated with a slightly different definition of the delay epoch – from odor valve open to the go signal – than the epoch shown in Figure 1A (odor valve open to odor port exit). While this alternative definition is not unreasonable, we believe that it is important for the delay epoch to end at the beginning of movement initiation (i.e., odor port exit), so that it has the same temporal relationship with movement across SG and IS trials (see response to Comment #2 on our original submission for further details). In the first revision, we made sure to report only results from the epoch as defined in Figure 1A, resulting in some slightly different values than in the original submission, none of which changed any of our overall results. In the current revision we have significantly modified Revised Figure 3 according to Reviewer #1’s suggestion (see response to Comment #4).

6) How was the preferred direction defined? I understand that the authors used ROC analysis. Were both SG and IS trials included? Weren't there any neurons that showed a bit different preference between SG and IS trials? For example, the dark blue data point at the far right in Figure 3D. This means that the neuron responded strongly in the anti-preferred direction in SG trials. In other words, it responded more strongly in the preferred direction. Which of the dark red data points belongs to this neuron? In any case, this may raise another issue of preferred vs. anti-preferred directions.

We now clarify in the revised manuscript that we calculated direction preference (shown in Figure 3C) based on activity during SG and IS trials combined (subsection “SNr activity differs for stimulus-guided and internally-specified movements”, first paragraph). Indeed, consistent with the results shown in Revised Figure 3D, direction preference was generally stronger in IS than SG trials, although preferences calculated in SG trials were highly correlated with those calculated in IS trials (see Author response image 4).
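As a rough illustration of an ROC-based direction preference, the sketch below computes the area under the ROC curve from pairwise comparisons and rescales it to the range -1 to 1. The rescaling 2 × (area − 0.5), the sign convention (positive means higher firing in contraversive trials), the function names, and the firing rates are all assumptions made for this example; this response does not specify the exact formula.

```python
def roc_area(a, b):
    """Probability that a random value from a exceeds one from b.

    Ties count as 0.5. This equals the area under the ROC curve.
    """
    wins = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in a
        for y in b
    )
    return wins / (len(a) * len(b))

def direction_preference(contra_rates, ipsi_rates):
    """Rescale the ROC area to [-1, 1] (assumed convention).

    Positive: higher firing in contraversive trials.
    Negative: higher firing in ipsiversive trials.
    """
    return 2.0 * (roc_area(contra_rates, ipsi_rates) - 0.5)

# Made-up delay-epoch rates (spikes/s); SG and IS trials are pooled,
# mirroring the "SG and IS trials combined" calculation described above.
contra_sg, contra_is = [15, 18], [21, 19]
ipsi_sg, ipsi_is = [20, 22], [25, 30]

pref = direction_preference(contra_sg + contra_is, ipsi_sg + ipsi_is)
print(pref)  # -0.875
```

The same function could be applied to SG and IS trials separately to compare preferences across trial types, as in Author response image 4.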

Author response image 4
Direction preference calculated in SG vs. IS trials in the same session. Each circle corresponds to one neuron.

https://doi.org/10.7554/eLife.13833.015

Reviewer #2:

7) I have only one brief comment. Otherwise the authors have carefully addressed all of my concerns in detail.

Figure 4A doesn't show value estimates evolving over trials as stated in reply to reviewers and Methods. "In well-behaved IS blocks, Vdir,t tended to asymptotically approach 1 as the mouse consistently returned to the rewarded port (Figure 4A)." Was this erroneously not included or with this statement did you just mean you were using the RL algorithm to generate the value estimates and these value estimates were used in Figure 4? It's not clear.

We intended the latter: We used the RL algorithm to generate the value estimates and these value estimates are what are shown on the x-axis of Figure 4A. It is true that Vdir,t tended to asymptotically approach 1 as the mouse consistently returned to the rewarded port, as would be expected, but in Figure 4A we do not show Vdir,t as a function of trial number; rather, we show firing rate as a function of Vdir,t. In the quoted sentence of the Materials and methods section of the revised manuscript, we have deleted the reference to Figure 4A (subsection “Reinforcement learning model”). In addition, to clarify what we are showing in Figure 4A, its legend in the revised manuscript now states that Vdir,t is plotted on the x-axis. We retain the current x axis label within Figure 4A itself (“Extent that choice is internally specified”), as we believe it is most descriptive of the point this figure is trying to convey.
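To illustrate why a value estimate such as Vdir,t would approach 1 asymptotically under consistent reward, here is a minimal sketch using a standard delta-rule update. The update rule, the learning rate `alpha`, the starting value of 0, and the number of trials are assumptions for illustration only; the model actually used is described in the “Reinforcement learning model” subsection of the manuscript.

```python
def update_value(v, reward, alpha=0.2):
    """One delta-rule step: move v toward the received reward.

    alpha is a hypothetical learning rate, not a fitted parameter.
    """
    return v + alpha * (reward - v)

# Simulate 30 consecutive rewarded choices of the same port (reward = 1),
# starting from an assumed initial value of 0.
values = []
v = 0.0
for _ in range(30):
    v = update_value(v, 1.0)
    values.append(v)

# The estimate rises on every trial but never reaches 1 exactly.
print(round(values[0], 3), round(values[-1], 3))
```

Under this assumed rule the distance from 1 shrinks by a factor of (1 − alpha) on every rewarded trial, so the estimate saturates quickly in a well-behaved block.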

https://doi.org/10.7554/eLife.13833.017

Article and author information

Author details

  1. Mario J Lintz

    1. Department of Physiology and Biophysics, University of Colorado School of Medicine, Aurora, United States
    2. Neuroscience Program, University of Colorado School of Medicine, Aurora, United States
    3. Medical Scientist Training Program, University of Colorado School of Medicine, Aurora, United States
    Contribution
    MJL, Designed the experiments, Collected the data, Analysis and interpretation of data, Wrote the manuscript
    Competing interests
    The authors declare that no competing interests exist.
  2. Gidon Felsen

    1. Department of Physiology and Biophysics, University of Colorado School of Medicine, Aurora, United States
    2. Neuroscience Program, University of Colorado School of Medicine, Aurora, United States
    3. Medical Scientist Training Program, University of Colorado School of Medicine, Aurora, United States
    Contribution
    GF, Designed the experiments, Analysis and interpretation of data, Wrote the manuscript
    For correspondence
    gidon.felsen@gmail.com
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0003-0745-8279

Funding

National Institute of Neurological Disorders and Stroke (R01NS079518)

  • Gidon Felsen

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Michele Basso, Abigail Person, Jaclyn Essig, Andrew Wolf, and Elizabeth Stubblefield for comments on the manuscript; Angie Ribera for use of the confocal microscope; and John Thompson, Jamie Costabile, and Quang Dang for experimental and technical assistance. This work was supported by the National Institutes of Health (R01 NS079518).

Ethics

Animal experimentation: This study was performed in accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health, 8th edition. All experiments were performed according to protocol #90209(12)1D, approved by the University of Colorado School of Medicine Institutional Animal Care and Use Committee. All surgeries were performed under isoflurane anesthesia and all perfusions were performed following an overdose of sodium pentobarbital. Quality of life was improved with enriched living environments and dietary treats while every effort was made to minimize suffering.

Reviewing Editor

  1. Rui M Costa, Reviewing Editor, Fundação Champalimaud, Portugal

Publication history

  1. Received: December 17, 2015
  2. Accepted: July 4, 2016
  3. Accepted Manuscript published: July 5, 2016 (version 1)
  4. Version of Record published: August 2, 2016 (version 2)

Copyright

© 2016, Lintz et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • Page views: 1,190
  • Downloads: 399
  • Citations: 2

Article citation count generated by polling the highest count across the following sources: Scopus, Crossref, PubMed Central.
