Cerebellar climbing fibers encode expected reward size
Abstract
Climbing fiber inputs to the cerebellum encode error signals that instruct learning. Recently, evidence has accumulated to suggest that the cerebellum is also involved in the processing of reward. To study how rewarding events are encoded, we recorded the activity of climbing fibers when monkeys were engaged in an eye movement task. At the beginning of each trial, the monkeys were cued to the size of the reward that would be delivered upon successful completion of the trial. Climbing fiber activity increased when the monkeys were presented with a cue indicating a large reward, but not a small reward. Reward size did not modulate activity at reward delivery or during eye movements. Comparison between climbing fiber and simple spike activity indicated different interactions for coding of movement and reward. These results indicate that climbing fibers encode the expected reward size and suggest a general role of the cerebellum in associative learning beyond error correction.
https://doi.org/10.7554/eLife.46870.001
Introduction
Computational, anatomical, and functional evidence support the theory that the cerebellar cortex performs error correcting supervised motor learning (Albus, 1971; Gilbert and Thach, 1977; Marr, 1969; Nguyen-Vu et al., 2013; Stone and Lisberger, 1990; Suvrathan et al., 2016). In this framework, motor learning occurs through changes in the computation of Purkinje cells, the sole output cells of the cerebellar cortex. Purkinje cells receive two distinct types of inputs: parallel fiber inputs and climbing fiber inputs. Each type of input leads to a different type of action potential. Parallel fiber inputs modulate the rate of simple spikes (Sspks), events similar to action potentials in other cell types. Climbing fiber inputs result in complex spikes (Cspks), which are unique prolonged events. Cspks are thought to represent instructive error signals triggered by movement errors. These error signals adjust the Sspk response of the Purkinje cell to parallel fiber input, resulting in improvement in subsequent movements. This hypothesized role of the Cspks in learning was broadened when it was shown that the Cspk rate increases in response to cues that are predictive of undesired successive stimuli (Ohmae and Medina, 2015). Thus, the Cspk signal is well-suited for driving associative learning based on motor errors that drive avoidance of aversive stimuli.
Recent research has shown that Cspk rate increases when behavior leads to a desired rewarded outcome or when reward related stimuli are presented (Heffley et al., 2018; Kostadinov et al., 2019; Heffley and Hull, 2019), a marked departure from their established role in error signaling. We aimed to further investigate what is coded by the reward related Cspk increase and whether the reward driven Cspk modulations are linked to simple spike modulations.
We considered three possibilities for the coding of reward by Cspks. The first was that the Cspk reward signal could be directly linked to the physical delivery of reward, for example to reward consumption behavior (such as licking; Welsh et al., 1995) or to the signal at reward delivery that behavior was successful (Heffley et al., 2018). If so, we would expect reward related modulations of the Cspk rate to be locked to the time of reward delivery. The second possibility was that the Cspks could encode the predicted reward consequences of arbitrary stimuli, similar to the way in which Cspks encode the prediction of an undesired air-puff (Ohmae and Medina, 2015). If this were the case, we would expect a Cspk increase when reward predictive stimuli are presented. Finally, reward could modulate Cspks through the coding of motor errors. In the eye movement system, for instance, Cspks are modulated when the eye velocity does not match the target velocity (i.e. retinal slip; Stone and Lisberger, 1990). Reward could influence the representation of the error signal such that similar retinal slips would result in a higher Cspk rate when a greater reward is expected. Thus, if reward acts on error signaling directly, we would expect reward to modulate the Cspk rate at the time of the retinal slip.
To dissociate these alternatives, we designed a task that temporally separated reward information, motor behavior and reward delivery (Joshua and Lisberger, 2012). We found that climbing fiber activity encoded the expected reward size seconds before the reward delivery. Reward size did not modulate activity at reward delivery. Furthermore, reward expectation did not modulate the Cspk tuning of eye movement parameters. These results suggest the Cspk reward signal encodes changes in the prediction of future reward. During the cue, the modulations in the Cspk and Sspk rates of cells were uncorrelated, in contrast to the negative correlation reported in the context of error correction learning (Gilbert and Thach, 1977) or the coding of movement parameters (Ojakangas and Ebner, 1994; Stone and Lisberger, 1990). This suggests that Cspk modulation of the Sspk rate could be restricted to certain network states. Overall, our findings imply that the cerebellum receives signals that could allow it to perform both error and reward-based associative learning, thus going beyond the accepted role of the cerebellum in error correction to suggest a general role in associative learning.
Results
Complex spikes encode the size of the expected reward
Monkeys performed a smooth pursuit eye movement task in which we manipulated the expected reward size (Joshua and Lisberger, 2012; Figure 1A). At the start of each trial, the monkey fixated on a white spot. The spot then changed to one of two colors, indicating whether a large or small reward would be given upon successful completion of the trial. After a variable delay, the colored target began to move in one of eight directions and the monkey had to accurately track it. At the end of a successful trial, the monkey received either a large or a small reward, as indicated by the color of the cue. To suppress catch-up saccades in the time immediately after the onset of the target movement, the movement of the target was preceded by an instantaneous step in the opposite direction (step-ramp). Thus, when the monkey began tracking, the target was close to the eye position and there was no need for fast corrective eye movements (Rashbass and Westheimer, 1961).
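The step-ramp target trajectory described above can be illustrated with a minimal sketch. The 20 °/s speed matches the target velocity used in the task; the step size (`step_deg`) is a hypothetical value chosen for illustration and is not reported in the text, and both function names are ours.

```python
def step_ramp_position(t_ms, speed_deg_s=20.0, step_deg=3.0):
    """Target position (deg) along the motion axis, t_ms after motion onset.

    step_deg is a hypothetical step size, not a value taken from the paper.
    """
    if t_ms < 0:
        return 0.0  # target is still at the fixation point
    # Instantaneous step opposite to the motion direction, then a constant-velocity ramp.
    return -step_deg + speed_deg_s * t_ms / 1000.0


def crossing_time_ms(speed_deg_s=20.0, step_deg=3.0):
    """Time (ms) at which the ramp brings the target back to the fixation point."""
    return 1000.0 * step_deg / speed_deg_s
```

With these illustrative values the target re-crosses the fixation point at about the time pursuit begins, so the eye and target positions are close and no catch-up saccade is needed.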

Smooth pursuit eye-movement task.
(A) Eye movement task temporally separates reward expectation, pursuit behavior and reward delivery. (B) Traces of average eye speed, in the first 300 ms after target motion onset. Target velocity was 20 °/s. (C) Each dot represents the average speed for an individual session 250 ms after target movement onset for the large (horizontal) and small (vertical) reward cue (Signed-rank, p = 8.6 × 10^−24, n = 208).
The average eye velocity during tracking of the large reward target was faster and more similar to the target velocity than during tracking of the small reward target (Figure 1B). This difference was clearly apparent even at the single session level. In most sessions, the average eye velocity 250 ms after motion onset was larger when the expected reward was large (Figure 1C). This behavioral difference and the selection of the larger reward target in an additional choice task (Figure 1—figure supplement 1) indicate that the monkeys associated the reward size with the color of the target. During the task, we recorded neural activity from the flocculus complex and neighboring areas (Figure 1—figure supplement 2). Our recordings included neurons that responded to eye movements and neurons that did not. Our task design allowed us to separately analyze the Cspk rate following cue presentation, during pursuit, and following reward delivery.
Following cue presentation, we found that many Purkinje cells (40 out of 220) had different Cspk rates in the different reward conditions. Of these, the vast majority (34 cells) transiently increased their Cspk rate when the expected reward was large but not when the expected reward was small (example in Figure 2A–C). This difference was apparent when examining the population average Cspk peri-stimulus time histogram (PSTH). After the color cue appeared, the population average Cspk rate was higher when the expected reward was large, as can be seen by the difference in the PSTHs of the two reward conditions (Figure 2D). At the single cell level, most cells had a higher Cspk rate on large reward trials than on small reward trials (Figure 2E, most dots lie beneath the identity line). Thus, the Cspk rate was modulated by changes in reward expectation, at times temporally distinct from the behavioral effect on pursuit eye movements and reward delivery. This change of rate reflects mostly an increase in the number of trials with a single Cspk following the cue, and a minor increase in the number of trials with multiple Cspks (Figure 2C,F and Figure 2—figure supplement 1).

Cspk rate differentiates reward conditions during cue presentation.
(A) Raster plot of an example cell in the two reward conditions, aligned to cue presentation. (B) PSTH of the cell in A. (C) Histogram of the number of Cspks that occurred in the 100–300 ms time window following cue presentation, in the same example cell. (D) Population PSTH. In all figures the error bars represent SEM. (E) Each dot represents the average Cspk rate of an individual cell 100–300 ms after the display of the large (horizontal) and small (vertical) reward cue (Signed-rank, Monkey B: p = 0.01, n = 148; Monkey C: p = 3.35 × 10^−4, n = 72). (F) Histogram of the number of Cspks that occurred in the 100–300 ms time window following cue presentation, in the entire population (fraction of trials with one Cspk: Signed-rank, p = 5.1 × 10^−4, n = 40; fraction of trials with two Cspks: Signed-rank, p = 0.03, n = 40).
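The quantities used in this section (a PSTH aligned to the cue, the per-trial Cspk count in the 100–300 ms window, and the fraction of trials with zero, one or multiple Cspks) can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function names, the bin width and the toy spike times are all assumptions, and spike times are taken to be in milliseconds relative to cue onset.

```python
def psth(trials, t_start_ms, t_stop_ms, bin_ms):
    """Average rate (spikes/s) per time bin across trials.

    trials: list of lists of spike times (ms), aligned so that 0 is the cue.
    """
    n_bins = int((t_stop_ms - t_start_ms) // bin_ms)
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            if t_start_ms <= t < t_start_ms + n_bins * bin_ms:
                counts[int((t - t_start_ms) // bin_ms)] += 1
    scale = 1000.0 / (bin_ms * len(trials))  # counts per trial -> spikes/s
    return [c * scale for c in counts]


def window_counts(trials, lo_ms=100, hi_ms=300):
    """Number of spikes in each trial inside the window [lo_ms, hi_ms)."""
    return [sum(lo_ms <= t < hi_ms for t in spikes) for spikes in trials]


def mean_rate(trials, lo_ms=100, hi_ms=300):
    """Average rate (spikes/s) inside the window, pooled over trials."""
    total = sum(window_counts(trials, lo_ms, hi_ms))
    return total * 1000.0 / (len(trials) * (hi_ms - lo_ms))


def count_fractions(counts, max_count=2):
    """Fraction of trials with 0, 1, ..., max_count spikes (last bin pools larger counts)."""
    tallies = [0] * (max_count + 1)
    for c in counts:
        tallies[min(c, max_count)] += 1
    return [n / len(counts) for n in tallies]


# Toy data for one hypothetical cell: four trials per reward condition.
large_trials = [[150.0], [120.0, 260.0], [], [210.0]]
small_trials = [[], [400.0], [], [180.0]]
large_rate = mean_rate(large_trials)
small_rate = mean_rate(small_trials)
```

Computing `mean_rate` per cell for each reward condition gives the paired values that a signed-rank test would compare across the population.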
Complex spikes do not encode reward size at reward delivery
The population Cspk rate was only affected by reward size when information regarding future reward was given, but not during the reward itself. During reward delivery, the PSTHs of the two conditions overlapped (Figure 3A), indicating a similar population response for the large and small rewards. When examining the responses of single cells, the Cspk rate was similar in the two reward conditions (Figure 3B, most cells fell close to the identity line). To compare the temporal pattern of the reward size encoding at cue and reward delivery, we calculated the difference in PSTHs between the large and small reward conditions (Figure 3C). The difference between large and small rewards rose steeply shortly after the color cue appeared. In sharp contrast, following reward delivery, there was only a small rate fluctuation that resembled the fluctuation prior to reward delivery. At the single cell level, there was no correlation between cell encoding of reward size during the cue and during reward delivery. For both the full population and for the subpopulation of neurons significantly coding the reward size at cue, the correlation between cue and reward delivery epochs was not significant (Figure 3D). This indicates that Purkinje cells that differentiated reward conditions during the cue did not differentiate between them during delivery.

Cspk is not modulated by reward size during reward delivery.
(A) Population PSTHs for different reward conditions aligned to reward delivery. (B) Each dot represents the average Cspk rate of an individual cell 100–300 ms after large (horizontal) and small (vertical) reward delivery (Signed-rank, Monkey B: p = 0.339, n = 148; Monkey C: p = 0.719, n = 72). (C) The differences between the PSTH for large and small rewards aligned to cue or to reward delivery. (D) Each dot represents the average Cspk rate of an individual cell 100–300 ms after the cue (horizontal) and reward delivery (vertical; Spearman correlation of all cells: r = −0.069, p = 0.304, n = 220; Spearman correlation of cells that responded to reward size during cue: r = −0.056, p = 0.727, n = 40). (E) and (F) Fraction of trials with licks, during cue and reward delivery.
We ruled out the possibility that differences in licking behavior were responsible for the Cspk rate modulations. The patterns of licking (Figure 3E,F) and Cspk rate were completely different. Licking, but not spiking, increased at reward delivery. Further, after cue onset, licking in both reward conditions decreased whereas the temporal pattern of Cspks was different between reward conditions (Figure 2D). In approximately half of the recording sessions, we recorded licking behavior along with our electrophysiological recordings. For the cells that discriminated between reward conditions in these sessions (n = 21), the population PSTH showed a difference between reward conditions both in trials that included a lick immediately following the cue and trials that did not (Figure 3—figure supplement 1A,B). We also approximated the contribution of licking to the Cspk rate (Figure 3—figure supplement 1C,D). This contribution was negligible and was not different for large and small rewards.
We conducted a similar analysis for saccades and microsaccades. The pattern of saccades and microsaccades also differed from the Cspk pattern (Figure 3—figure supplement 2A,B). Saccades but not spiking increased following reward delivery. After cue presentation, fixational saccades were modulated by reward (Joshua et al., 2015), but this modulation did not affect the Cspk response to the cue (Figure 3—figure supplement 2C,D). The cells that discriminated between the large and small rewards after cue presentation responded similarly in trials with and without saccades. Similar to licking, the approximated contribution of saccades to the Cspk rate was small and did not differ between reward conditions (Figure 3—figure supplement 2E,F). We also ruled out the possibility that differences in saccade velocity or direction could explain our results (not shown).
Complex spike coding of target motion does not depend on reward size
Overall, these results indicate that the Cspk rate differentiates between reward sizes when reward information is first made available, but not during delivery. However, Cspks are also tuned to the direction of target motion (Kobayashi et al., 1998; Stone and Lisberger, 1990). According to the error signal model, this tuning is a result of image motion on the retina that is caused by the mismatch between target and eye motion (retinal slip). Our sample contained cells that were directionally tuned and not cue responsive (21 cells, example in Figure 4—figure supplement 1A–C), cells that were cue responsive and not directionally tuned (28 cells, example in Figure 4—figure supplement 1D–F) and cells that were both (12 cells, example in Figure 4—figure supplement 1G–I).
To determine how Cspk coding of target direction is affected by reward expectation, we focused on directionally tuned cells (33 cells, Figure 4A,B). When we examined the Cspk rate in the preferred direction (PD) of the cell and the direction 180° to it (the null direction), we did not find significant differences in the Cspk rate between reward conditions (Figure 4C). We aligned the cells to their PD and calculated a population tuning curve for each reward condition. The tuning curves overlapped and were not significantly different (Figure 4D).

Reward did not modulate Cspk direction tuning.
(A) Raster plot of an example cell in its preferred (black) and null (gray) directions, aligned to target movement onset. (B) PSTH of the cell in A. (C) Population PSTH for different reward conditions, in the preferred (solid) and null (dashed) directions. (D) Population direction tuning curve (Permutation test: p = 0.2156, n = 33).
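The comparison of PD-aligned tuning curves between reward conditions can be sketched as below. This is a hedged illustration only: the text does not specify how the PD was estimated or which permutation statistic was used, so taking the PD as the direction with the highest rate and using a paired label-swap test on the summed absolute difference between population curves are both assumptions, as are all function names.

```python
import random


def align_to_pd(tuning):
    """Rotate a tuning curve so that the direction with the highest rate comes first.

    Using the argmax as the preferred direction is an assumption of this sketch.
    """
    pd_index = max(range(len(tuning)), key=lambda i: tuning[i])
    return tuning[pd_index:] + tuning[:pd_index]


def paired_permutation_test(large, small, n_perm=10000, seed=0):
    """p-value for a difference between two population tuning curves.

    large, small: per-cell PD-aligned tuning curves (lists of equal-length lists).
    Swapping the reward labels within a cell flips the sign of its difference.
    """
    rng = random.Random(seed)
    n_dirs = len(large[0])
    diffs = [[a - b for a, b in zip(cl, cs)] for cl, cs in zip(large, small)]

    def statistic(signs):
        # Summed absolute difference between the two population-average curves.
        return sum(
            abs(sum(sg * d[k] for sg, d in zip(signs, diffs))) / len(diffs)
            for k in range(n_dirs)
        )

    observed = statistic([1] * len(diffs))
    hits = 0
    for _ in range(n_perm):
        signs = [rng.choice((1, -1)) for _ in diffs]
        if statistic(signs) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0
```

A large p-value from such a test, as reported for Figure 4D, means that the observed difference between the curves is within the range expected when reward labels are exchanged at random.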
We also examined the modulation of reward on Cspk rate at different eye velocities. We performed an additional speed task in which we manipulated the target speed (5, 10 or 20 °/s). Eye velocity corresponded to the speed of the target (Figure 5A). The effect of expected reward size on eye velocity was evident for all speeds at the average and the single session level (Figure 5A,B). Whereas cells responded to the target movement onset (Figure 5C), reward expectation did not modulate their response (Figure 5D). Together with the directional tuning results, this shows that encoding of reward is limited to the time point at which the reward size is first signaled and not the time when reward drives changes in behavior. Note that the rate of the Cspks did not increase monotonically with target speed (Figure 5C and D); we return to this point in the discussion.

Cspk rate was not modulated by reward size at target motion onset in the speed tuning task.
(A) Average eye velocity traces for experiments in which the color cue signaled a large (blue) or small (red) reward and the target speed was 5 °/s, 10 °/s and 20 °/s. Slower traces correspond to slower target speeds. Dotted lines represent target velocity. (B) Individual session average eye velocity 250 ms after target movement onset for large (horizontal) and small (vertical) reward, in the different target velocity conditions (Signed-rank: p = 6 × 10^−16, n = 56). (C) Population PSTHs of cells in their PD for the different speed conditions. (D) Population speed tuning curve in the PD (solid) and null (dashed) directions (Permutation test: p = 0.4541, n = 16).
Small drift eye movements during fixation can result in retinal slip (Figure 4—figure supplement 2A,B). Since Cspks respond to slow visual motion (Guo et al., 2014; Hoffmann and Distler, 1989), it is possible that the reward size modulation during the cue arose from a retinal slip driven by the appearance of the cue. However, since many of the cells that responded to the cue did not respond during target motion (24 cells), this does not seem to have been the case. Furthermore, we could not find a relationship between drift size or direction and the occurrence of Cspks. Specifically, the drift was similar between trials with and without a Cspk (Figure 4—figure supplement 2C,D). Aligning the drift following the cue to the occurrence of a Cspk resulted in a flat line around zero (Figure 4—figure supplement 2E), indicating that Cspks were not preceded by increased retinal slip. Finally, when calculating the Cspk responses to the cue separately for trials in which the drift was in the preferred or the null direction of the cell we observed no differences (Figure 4—figure supplement 2F).
The relationship between simple and complex spikes is different for reward and direction tuning
Given that Cspks were modulated by reward size following cue presentation, we went on to examine the Sspk modulations that occur concurrently. Preparatory activity following cues that predict reward or movement had been found in the cerebellum both at the level of the inputs that modulate Sspk rate (Wagner et al., 2017) and at the level of their output (Chabrol et al., 2019; Gao et al., 2018). Recently it was shown that Sspk rate decreases when behavior leads to a reward (Chabrol et al., 2019). Within the cells we recorded, Sspk responses to cue presentation were heterogeneous (Figure 6, examples in A-C). We found some cells that elevated their Sspk rate in the large versus small reward conditions (Figure 6A), others where activity was lower in the large reward condition (Figure 6B) and cells in which responses were similar in the large and small reward conditions (Figure 6C). Overall, we found more cells in which the Sspk rate was larger for the large reward condition (Figure 6D, blue line). However, in a substantial number of cells the Sspk rate was larger for the small reward (Figure 6D, red line). As a result of the opposite modulation, at the population level, the difference in Sspk between large and small reward mostly averaged out (Figure 6E,F).

Sspk modulations following cue presentation.
(A-C) Examples of cells' Sspk responses to cue presentation in each reward condition. (D) Fraction of cells with a higher Sspk rate in the large reward condition (blue) or small reward condition (red) as a function of time. The dashed line represents the 0.05 false positive chance level. (E) Population PSTH; the average Sspk rate of each cell was subtracted. (F) Each dot represents the average Sspk rate of an individual cell 100–300 ms following large (horizontal) and small (vertical) reward delivery (Signed-rank, Monkey B: p = 0.142, n = 155; Monkey C: p = 0.09, n = 75).
The directionally tuned Cspk signal has been linked to the coding of visual errors that instruct motor learning (Medina and Lisberger, 2008; Nguyen-Vu et al., 2013) by changing the Sspk response to parallel fiber inputs. Cspks generate plasticity in parallel fiber synapses leading to a decrease in the Sspk rate (Ekerot and Kano, 1985). This plasticity is thought to underlie the opposite modulations of simple and complex spike rates on different tasks (Badura et al., 2013; Gilbert and Thach, 1977; Stone and Lisberger, 1990). The contrast between the consistently larger Cspk response to the larger reward (Figure 2) and the heterogeneous Sspk response (Figure 6) suggests that the opposite modulation of Cspks and Sspks found in relation to movement does not hold for reward related signals.
To test the relationship between Cspks and Sspks directly we compared the rate modulation in the same cell. In our sample of cells, we found the expected opposite modulations during movement. When we aligned the Cspk tuning curve to the preferred direction of the Sspks of the same cell, we found that the Cspk rate decreased in directions for which the Sspk rate increased (Figure 7A). To examine whether this effect existed at the single cell level, we calculated the signal correlation for the complex and simple spikes which we defined as the correlation between simple and complex direction tuning curves. We found that most signal correlations were negative; in other words, the Cspks and Sspks were oppositely modulated during movement in most cells (Figure 7B). This effect disappeared when we shuffled the phase of the Cspk tuning curve or assigned direction labels randomly (see Materials and methods).
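The signal correlation and its two shuffle controls can be sketched as follows. The text defines the signal correlation only as the correlation between the simple and complex spike direction tuning curves; using a rank (Spearman) correlation here is an assumption, as are the function names, the exact form of the shuffles and the toy tuning curves.

```python
import random


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def signal_correlation(cspk_tuning, sspk_tuning):
    """Rank correlation between the Cspk and Sspk direction tuning curves of one cell."""
    return pearson(ranks(cspk_tuning), ranks(sspk_tuning))


def phase_shuffled(tuning, rng):
    """Control 1: rotate the tuning curve by a random non-zero number of directions."""
    shift = rng.randrange(1, len(tuning))
    return tuning[shift:] + tuning[:shift]


def direction_shuffled(tuning, rng):
    """Control 2: assign direction labels at random."""
    out = list(tuning)
    rng.shuffle(out)
    return out


# Toy tuning curves over eight directions for one hypothetical cell.
sspk = [10.0, 14.0, 22.0, 35.0, 50.0, 41.0, 27.0, 16.0]
cspk = [1.6, 1.4, 1.1, 0.8, 0.5, 0.7, 1.0, 1.3]
r_observed = signal_correlation(cspk, sspk)
r_control = signal_correlation(phase_shuffled(cspk, random.Random(0)), sspk)
```

In this toy cell the Cspk rate is lowest where the Sspk rate is highest, so the signal correlation is strongly negative; repeating the shuffles many times per cell would give the control distributions against which the observed correlations are compared.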

Cspk rate negatively correlated with Sspk rate during movement but not during cue presentation.
(A) Population tuning curve of Cspks (top) and Sspks (bottom), both aligned to the preferred direction of Sspks (Spearman r = −0.3087, p = 7 × 10^−7, n = 31). (B) Histogram of signal correlations of simple and complex spikes in the population. Solid and dashed lines show the correlations for phase-shuffled and direction-shuffled data (Signed-rank: p = 0.002, n = 31). (C) Each dot shows individual cell differences in average rate between reward conditions 100–300 ms after cue, in Cspks (horizontal) and Sspks (vertical; Spearman correlation of all cells: r = −0.07, p = 0.32, n = 172; Spearman correlation of cells that responded to reward size during cue: r = −0.003, p = 0.98, n = 30). (D) Similar to C; the horizontal position of each dot shows individual cell differences in average Cspk rate between reward conditions 100–300 ms after cue. The vertical axis shows the difference in Sspk firing rate between the time windows 100–300 ms after the cue and 100–300 ms before the cue (Spearman correlation of all cells: r = −0.03, p = 0.63, n = 172; Spearman correlation of cells that responded to reward size during cue: r = −0.19, p = 0.31, n = 30).
Unlike movement related modulation, the complex and simple spikes were not oppositely modulated following cue presentation (Figure 7C,D). If reward-related modulations in Cspks drive Sspk attenuation, we would expect that the higher Cspk rate in the large reward condition would result in a stronger attenuation of Sspks. This would lead to a negative correlation between the complex and simple spike reward modulations during the cue. However, we found that simple and complex spike modulations following cue presentation were uncorrelated (Figure 7C). As we observed cells that changed their Sspk rate after the cue without differentiating between reward conditions, we also calculated the correlation between Cspk reward condition modulations and the change in Sspk rate following the cue. In this case as well, we did not find any correlation (Figure 7D). Further, the correlations were not significantly different from zero whether we analyzed the full population or only those cells whose Cspks were significantly tuned to reward size during the cue. Thus, the way the difference in Cspk rate during cue affects Sspk encoding and behavior may differ from the one suggested by the error signal model.
Discussion
The difference in Cspk rate during cue presentation and the lack of difference during reward delivery and pursuit behavior implies that Cspks can act as a reward prediction signal. This finding diverges from the accepted error signal model. The coding of predictive stimuli has been reported in Cspks in the context of error-based learning (Ohmae and Medina, 2015). Together with the current results, this suggests a more general role for the cerebellum in associative learning, when learning is both error and reward based (Heffley and Hull, 2019; Kostadinov et al., 2019; Thoma et al., 2008; Wagner et al., 2017). The similar Cspk response to the different reward sizes during reward delivery implies that the Cspk coding of reward is not related to reward consumption behavior and does not represent the successful completion of the trial. As we did not observe a reward effect on Cspk rate during pursuit eye movements, when the retinal slip was the largest, our results do not support reward modulation of the Cspk error signal.
Plasticity and learning from rewards in the cerebellum
Error-based models of the cerebellum link the cerebellar representation of movement, plasticity mechanisms and learning. In this framework, the behavioral command of the cerebellar cortex in response to a stimulus is represented by the Sspk rate of Purkinje cells. Cspks lead to a reduction in the synaptic weight in recently active parallel fibers and thereby change the Sspk rate in response to similar parallel fiber input (Ekerot and Kano, 1985). This change in the Sspk rate is hypothesized to alter the behavioral response to the same stimulus. Thus, when errors occur, the behavior that led to them is eliminated. The same logic cannot apply to learning from rewards since reward strengthens rather than eliminates the behavior that led to the reward (Thorndike, 1898).
Consistent with this reasoning, we found that reward-related modulation of Cspks did not exhibit the classical decrease in Sspk activity associated with Cspk activity (Figure 7). This result suggests that on our task, other plasticity rules might mask or override the depression. Research on the cerebellum has identified many other sites in which plasticity might drive changes in neuronal activity (Gao et al., 2012; Jörntell and Ekerot, 2002). Furthermore, the Cspk dependent plasticity in the parallel fibers might also change sign as a result of the network state (Rowan et al., 2018). Thus, our results suggest that such mechanisms are engaged when Cspks are modulated by reward.
The Cspk reward signal does not seem to affect cerebellar computation through the same relatively well-understood mechanisms as the Cspk error signal. We also did not find an effect of reward on the Cspk signal during behavior. Thus, the influence of the Cspk reward signal on behavior remains unclear. Moving beyond the level of representation to a mechanistic understanding of the effect of the Cspk reward signal on cerebellar computation and behavior is a crucial next step.
Relationship to previous studies of the smooth pursuit system
A further demonstration of the existence of independent mechanisms for learning from reward and sensory errors emerges when combining the current results with our recent behavioral study (Joshua and Lisberger, 2012). In that study, monkeys learned to predict a change in the direction of target motion by generating predictive pursuit movements. The size of the reward did not modulate the learning process itself but only the execution of the movement (Joshua and Lisberger, 2012). The critical signal for direction change learning has been shown to be the directionally tuned Cspk signal (Medina and Lisberger, 2008). Our finding that the target direction signal is not modulated by reward provides a plausible explanation at the implementation level for this behavioral finding. The directionally tuned Cspks that drive learning are not modulated by reward; therefore, learning itself is reward independent.
In the current study, the Cspk rate did not increase with target speed (Figure 5). At least one study has reported a monotonic increase between Cspk rate and motion speed (Kobayashi et al., 1998). The specific experimental protocol we used might have led to the lack of speed coding. The vast majority of trials in which the monkeys were engaged were at 20 °/s, and we only measured responses at different speeds in a minority of the sessions (see Materials and methods). Therefore, it is possible that the monkey developed a speed prior (Darlington et al., 2018) and hence was expecting the target to move at 20 °/s. Violation of this prior in the slower motion trials might have potentiated the response and masked the speed tuning. Behavioral support for such a prior comes from the eye speed response to low speed targets (5 °/s) in which the eye speed overshot the target speed (Figure 5A,B). Other possibilities such as the recorded population or the properties of the visual stimuli might also have contributed to the lack of speed tuning.
Future directions
The reward signal we found is similar to reward expectation signals in dopaminergic neurons of the ventral tegmental area (VTA) and substantia nigra pars compacta (Schultz et al., 1997). The VTA projects to the inferior olive (Fallon et al., 1984) and recently, direct projections from the cerebellum to dopaminergic neurons in the VTA have been found (Carta et al., 2019). Reward signals have also been found in cerebellar granule cells, which modulate the Sspk rate in Purkinje cells (Wagner et al., 2017), and in the deep cerebellar nuclei (Chabrol et al., 2019). Researching the differences and interactions of these reward signals is an important next step in understanding how reward is processed. In particular, future research will need to investigate the source of the reward information in the inferior olive.
Another interesting question is whether the Cspk representation of reward depends on the range of possible rewards. Our results demonstrate that the Cspk rate is informative of future reward size. Expected reward size might be represented in the cerebellum in an absolute manner, based on its physical size, or in relative order, based on its motivational value in comparison to other available rewards (Cromwell et al., 2005; Tremblay and Schultz, 1999). Our results show that when a small reward cue is presented, there is no increase in the Cspk rate (Figure 2D). Although this cue predicts a future reward, it does not elicit a Cspk response. This hints that the Cspk representation may be relative and not absolute. To further verify this, we need to construct a task in which we examine the same reward size in different contexts.
Conclusion
To sum up, the current study demonstrates that a population of Purkinje cells receive a reward predictive signal from the climbing fibers. Our results show that the reward signal is not limited to the direct rewarding consequences of the behavior. These results thus suggest that the cerebellum receives information about future reward size. Our results go beyond previous findings of cerebellar involvement in the elimination of undesired behavior, to suggest that the cerebellum receives the relevant information that could allow it to adjust behavior to maximize reward.
Materials and methods
We collected neural and behavioral data from two male Macaca fascicularis monkeys (4–5 kg). All procedures were approved in advance by the Institutional Animal Care and Use Committees of the Hebrew University of Jerusalem and were in strict compliance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals. We first implanted head holders to restrain the monkeys' heads in the experiments. After the monkeys had recovered from surgery, they were trained to sit calmly in a primate chair (Crist Instruments) and consume liquid food rewards (baby food mixed with water and infant formula) from a tube set in front of them. We trained the monkeys to track spots of light that moved across a video monitor placed in front of them.
Visual stimuli were displayed on a monitor 45 cm from the monkeys’ eyes. The stimuli appeared on a dark background in a dimly lit room. A computer performed all real-time operations and controlled the sequences of target motions. The position of the eye was measured with a high temporal resolution camera (1 kHz, EyeLink, SR Research) and collected for further analysis. Monkeys received a reward when tracking the target successfully.
In subsequent surgery, we placed a recording cylinder stereotaxically over the floccular complex. The center of the cylinder was placed above the skull targeted at 0 mm anterior and 11 mm lateral to the stereotaxic zero. We placed the cylinder with a backward angle of 20° and 26° for monkeys B and C, respectively. Quartz-insulated tungsten electrodes (impedance of 1–2 MΩ) were lowered into the floccular complex and neighboring areas to record simple and complex spikes using a Mini-Matrix System (Thomas Recording GmbH). When lowering the electrodes, we searched for neurons that responded during pursuit eye movements (see direction task) but often collected data from neurons that did not respond to eye movements. Overall, we recorded complex spikes from 148 and 72 neurons from monkeys B and C, respectively. Of these, the Sspks of 28 and 19 neurons from monkeys B and C were directionally tuned during the direction task (Kruskal-Wallis test, α = 0.05).
Signals were digitized at a sampling rate of 40 kHz (OmniPlex, Plexon). For the detailed data analysis, we sorted spikes offline (Plexon). For sorting, we used principal component analysis and corrected manually for errors. In some of the cells the Cspks had distinct low frequency components (Warnaar et al., 2015; Zur and Joshua, 2019; for example Figure 1—figure supplement 2B, left column and Figure 2—figure supplement 1). In these cells, we used low frequency features to identify and sort the complex spikes. We paid special attention to the isolation of spikes from single neurons. We visually inspected the waveforms in the principal component space and only included neurons for further analysis when they formed distinct clusters. Sorted spikes were converted into timestamps with a time resolution of 1 ms and were inspected again visually to check for instability and obvious sorting errors.
We used eye velocity and acceleration thresholds to detect saccades automatically and then verified the automatic detection by visual inspection of the traces. The velocity and acceleration signals were obtained by digitally differentiating the position signal after we smoothed it with a Gaussian filter with a standard deviation of 5 ms. Saccades were defined as an eye acceleration exceeding 1000 °/s², an eye velocity crossing 15 °/s during fixation, or an eye velocity crossing 50 °/s while the target moved. To calculate the average of the smooth pursuit initiation we first removed the saccades and treated them as missing data. We then averaged the traces with respect to the target movement direction. Finally, we smoothed the traces using a Gaussian filter with a standard deviation of 5 ms. We also recorded licking behavior to control for behavioral differences between reward conditions that might confound our results. Licks were recorded using an infra-red beam. Monkey B tended not to extend its tongue; we therefore recorded its lip movements.
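The threshold rule above can be illustrated with a minimal Python sketch. This is not the authors' Matlab code: the function names, the 4-SD kernel truncation, the edge renormalization and the central-difference derivative are all assumptions made for illustration. Only the 5 ms smoothing and the three thresholds come from the text.

```python
import math

def gaussian_kernel(sd_ms, dt_ms=1.0, half_width_sd=4):
    """Normalized Gaussian kernel sampled every dt_ms (truncation at 4 SD is an assumption)."""
    n = int(round(half_width_sd * sd_ms / dt_ms))
    w = [math.exp(-0.5 * (i * dt_ms / sd_ms) ** 2) for i in range(-n, n + 1)]
    total = sum(w)
    return [x / total for x in w]

def smooth(signal, kernel):
    """Convolve with the kernel, renormalizing at the edges so the length is preserved."""
    half = len(kernel) // 2
    out = []
    for t in range(len(signal)):
        acc = norm = 0.0
        for k, w in enumerate(kernel):
            j = t + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
                norm += w
        out.append(acc / norm)
    return out

def differentiate(signal, dt_s):
    """Central difference in the interior, one-sided difference at the two ends."""
    n = len(signal)
    d = [0.0] * n
    for t in range(n):
        lo, hi = max(t - 1, 0), min(t + 1, n - 1)
        d[t] = (signal[hi] - signal[lo]) / ((hi - lo) * dt_s)
    return d

def detect_saccade_samples(position_deg, target_moving, fs_hz=1000.0,
                           acc_thresh=1000.0, vel_fix=15.0, vel_move=50.0):
    """Return one boolean per sample: True where the saccade criteria are met."""
    dt_s = 1.0 / fs_hz
    pos = smooth(position_deg, gaussian_kernel(5.0, 1000.0 / fs_hz))  # 5 ms SD
    vel = differentiate(pos, dt_s)   # deg/s
    acc = differentiate(vel, dt_s)   # deg/s^2
    vel_thresh = vel_move if target_moving else vel_fix
    return [abs(a) > acc_thresh or abs(v) > vel_thresh for v, a in zip(vel, acc)]
```

Flagged samples would then be treated as missing data before averaging, as described above. The sketch handles one eye-position component and one task epoch at a time; samples within about 20 ms of the ends of a trace are affected by the truncated kernel and should be interpreted with care.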
Experimental design
Direction task
Each trial started with a bright white target that appeared in the center of the screen (Figure 1A). After 500 ms of presentation, in which the monkey was required to acquire fixation, a colored target replaced the fixation target. The color of the target signaled the size of the reward the monkey would receive if it tracked the target. For monkey B we used blue to signal a large reward (~0.2 ml) and red to signal a small reward (~0.05 ml); for monkey C we used yellow to signal a large reward and green to signal a small reward. After a variable delay of 800–1200 ms, the target stepped in one of eight directions (0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°) and then moved in the direction 180° from it (step-ramp, Rashbass and Westheimer, 1961). For both monkeys, we used a target motion of 20 °/s and a step to a position 4° from the center of the screen. The target moved for 750 ms and then stopped and stayed still for an additional 500–700 ms. When the eye was within a 3 × 3 degree window around the target the monkey received a juice reward.
Speed task
During the direction task, we fitted a Sspk tuning curve for each cell online and approximated the cell's preferred direction (PD). If a cell seemed directionally tuned, we ran an additional speed task. The temporal structure of the speed task was the same as that of the direction task. The targets moved at 5, 10 or 20 °/s, either in the approximate PD of the cell or in the direction 180° from it, which we termed the null direction. The step size was set to minimize saccades and was 1°, 2° and 4° for target speeds of 5, 10 and 20 °/s, respectively.
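One way to read the step sizes of the speed task: because the target steps away from the center and then moves back through it (step-ramp), the time at which it re-crosses the center is the step size divided by the target speed. The short Python check below shows that the three step and speed pairs share the same re-crossing time. The function name is hypothetical and the interpretation is ours, not a statement from the protocol.

```python
def recross_time_ms(step_deg, speed_deg_per_s):
    """Time for a step-ramp target to return to its starting position, in ms."""
    return 1000.0 * step_deg / speed_deg_per_s

# (step in deg, speed in deg/s) pairs used in the speed task
speed_task = [(1.0, 5.0), (2.0, 10.0), (4.0, 20.0)]
times = [recross_time_ms(step, speed) for step, speed in speed_task]
print(times)  # [200.0, 200.0, 200.0]
```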
Choice task
Monkeys were required to choose one of two targets (large or small reward) presented on the screen (Figure 1—figure supplement 1A). We used this task to determine whether the monkeys correctly associated the color of the target and the reward size (Figure 1—figure supplement 1B). Their choice determined the amount of reward they received. Each trial began with a 500 ms fixation period, similar to the tasks described previously. Then two additional colored spots appeared at a location eccentric to the fixation target. One of the colored targets appeared 4° below or above the fixation target (vertical axis) and the other appeared 4° to the right or left of the fixation target (horizontal axis). The monkey was required to continue fixating on the fixation target in the middle of the screen. After a variable delay of 800–1200 ms, the white target disappeared, and the colored targets started to move towards the center of the screen (vertically or horizontally) at a constant velocity of 20 °/s. The monkey typically initiated pursuit eye movement that was often biased towards one of the targets (Figure 1—figure supplement 1C). After a variable delay, the monkeys typically made saccades towards one of the targets. We defined these saccades as an eye velocity that exceeded 80 °/s. The target that was closer to the endpoint of the saccade remained in motion for up to 750 ms and the more distant target disappeared. The monkey was required to track the target until the end of the trial and then received a liquid food reward as a function of the color of the target.
Data analysis
All analyses were performed using Matlab (Mathworks). When comparing reward conditions, we only included cells that were recorded for a minimum of 20 trials (approximately 10 for each condition). When performing analyses that included additional variables such as target direction or velocity, we set a minimum of 50 trials (approximately 3–4 for each condition).
To study the time varying properties of the response, we calculated the PSTH at a 1 ms resolution. We then smoothed the PSTH with a 10 ms standard deviation Gaussian window, removing at least 100 ms before and after the displayed time interval to avoid edge effects. Note that this procedure is practically the same as measuring the spike count per trial in larger time bins. We defined cells that responded significantly differently to reward conditions during the cue using the rank-sum test on the mean number of spikes 100–300 ms after cue onset.
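The PSTH procedure can be sketched in Python as follows. This is an illustrative sketch rather than the original Matlab analysis: the function names, the representation of each trial as a list of spike times in ms relative to the alignment event, and the 4-SD kernel truncation are assumptions. The 1 ms bins, the 10 ms Gaussian, the 100 ms padding and the 100–300 ms counting window follow the text.

```python
import math

def psth_hz(trials_spike_times_ms, t_start_ms, t_stop_ms):
    """Trial-averaged firing rate (spikes/s) in 1 ms bins over [t_start_ms, t_stop_ms)."""
    n_bins = t_stop_ms - t_start_ms
    counts = [0] * n_bins
    for spikes in trials_spike_times_ms:
        for t in spikes:
            b = int(math.floor(t)) - t_start_ms
            if 0 <= b < n_bins:
                counts[b] += 1
    n_trials = len(trials_spike_times_ms)
    return [1000.0 * c / n_trials for c in counts]

def gaussian_smooth(rate, sd_ms):
    """Zero-padded convolution with a normalized Gaussian; the edges are biased."""
    half = int(4 * sd_ms)
    kernel = [math.exp(-0.5 * (k / sd_ms) ** 2) for k in range(-half, half + 1)]
    total = sum(kernel)
    kernel = [w / total for w in kernel]
    out = []
    for t in range(len(rate)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = t + k - half
            if 0 <= j < len(rate):
                acc += w * rate[j]
        out.append(acc)
    return out

def smoothed_psth(trials, display_start_ms, display_stop_ms, sd_ms=10, pad_ms=100):
    """Smooth on a padded interval, then drop pad_ms on each side to avoid edge effects."""
    raw = psth_hz(trials, display_start_ms - pad_ms, display_stop_ms + pad_ms)
    sm = gaussian_smooth(raw, sd_ms)
    return sm[pad_ms:len(sm) - pad_ms]

def window_counts(trials, lo_ms=100, hi_ms=300):
    """Spike count per trial in the window [lo_ms, hi_ms) after the alignment event."""
    return [sum(1 for t in spikes if lo_ms <= t < hi_ms) for spikes in trials]
```

The per-trial counts returned by `window_counts` for the two reward conditions are the kind of input that a rank-sum test would then compare.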
To calculate the tuning curves, we averaged the responses in the first 100–300 ms of the movement. We calculated the preferred direction of the neuron as the direction that was closest to the vector average of the responses across directions (direction of the center of mass). We used the preferred direction to calculate the population tuning curve by aligning all the responses to the preferred direction. We defined a cell as directionally tuned if a one-way Kruskal-Wallis test (the case of 8 directions, direction task), or a rank-sum test (the case of two directions, speed task), revealed a significant effect for direction. We present reward modulation on movement parameters only for directionally tuned cells and also confirmed that if we took the full population there was no reward modulation at motion onset (Signed-rank: Monkey B, p = 0.8904, n = 148; Monkey C, p = 0.4487, n = 72).
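The preferred-direction calculation can be written compactly. The Python sketch below assumes a tuning curve given as eight mean responses ordered by the task directions; the function names are hypothetical and the code illustrates the vector-average rule rather than reproducing the original implementation.

```python
import math

DIRECTIONS_DEG = [0, 45, 90, 135, 180, 225, 270, 315]

def preferred_direction(responses, directions_deg=DIRECTIONS_DEG):
    """Tested direction closest to the vector average (center of mass) of the responses."""
    x = sum(r * math.cos(math.radians(d)) for r, d in zip(responses, directions_deg))
    y = sum(r * math.sin(math.radians(d)) for r, d in zip(responses, directions_deg))
    angle = math.degrees(math.atan2(y, x)) % 360.0

    def circular_distance(a, b):
        diff = abs(a - b) % 360.0
        return min(diff, 360.0 - diff)

    return min(directions_deg, key=lambda d: circular_distance(d, angle))

def align_to_pd(responses, directions_deg=DIRECTIONS_DEG):
    """Rotate the tuning curve so that index 0 is the preferred direction."""
    shift = directions_deg.index(preferred_direction(responses, directions_deg))
    return responses[shift:] + responses[:shift]
```

Averaging the aligned curves across cells gives a population tuning curve in which index 0 corresponds to each cell's own preferred direction.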
To statistically test the significance of the effect of reward on direction tuning we used a permutation test. We first calculated separate tuning curves for each cell in the two reward conditions. We then chose a random subset of combinations of cells and directions and reversed the small and large reward labels of this subset. We then calculated the population PSTHs for the shuffled ‘small’ and ‘large’ reward conditions. Our statistic was the mean square distance of the two tuning curves. We used the percentile of the statistic of the unshuffled data to calculate the p-value. We used a similar test for the speed task in which the subset we chose was a random combination of cell, direction and speed.
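A label-swap permutation test of this kind might look like the Python sketch below. Several details are assumptions that the text does not specify: each cell and direction combination is swapped independently with probability 0.5, the population curve is the plain mean across cells of already aligned tuning curves, and the p-value uses a +1 correction. Function names are hypothetical.

```python
import random

def population_curve(curves):
    """Mean tuning curve across cells; curves[c][d] is cell c, direction d."""
    n_cells = len(curves)
    return [sum(c[d] for c in curves) / n_cells for d in range(len(curves[0]))]

def mean_square_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def permutation_p_value(large, small, n_perm=1000, seed=0):
    """Fraction of label-swapped data sets whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = mean_square_distance(population_curve(large), population_curve(small))
    n_at_least = 0
    for _ in range(n_perm):
        sh_large = [row[:] for row in large]
        sh_small = [row[:] for row in small]
        for c in range(len(large)):
            for d in range(len(large[c])):
                if rng.random() < 0.5:  # assumed: independent coin flip per combination
                    sh_large[c][d], sh_small[c][d] = sh_small[c][d], sh_large[c][d]
        stat = mean_square_distance(population_curve(sh_large), population_curve(sh_small))
        if stat >= observed:
            n_at_least += 1
    return (n_at_least + 1) / (n_perm + 1)  # assumed: +1 correction avoids p = 0
```

For the speed task, the same scheme would extend the swap to cell, direction and speed combinations.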
We calculated the fraction of cells whose Sspk rate was different between reward conditions as a function of time (Figure 6D) by using left- and right-tailed rank-sum tests on a moving time window. For each cell, we looked for time points in which there were significantly more Sspks in the large reward trials in comparison to the small (RL > RS) and time points in which there were significantly more Sspks in the small reward trials in comparison to the large (RS > RL). We tested each time point by calculating the number of Sspks in each trial in time bins of 200 ms surrounding it. We then tested whether the number of Sspks in large reward trials was significantly different from that in small reward trials using both the left- and right-tailed tests. We classified that time point as RL > RS, RS > RL or neither according to the result of the tests. We then calculated the fraction of cells in each category for every time point.
We calculated the signal correlation of each cell's Cspks and Sspks by calculating a tuning curve of each spike type and computing the Pearson correlation of the tuning curves (Figure 7B). As a control, we performed the same analysis on shuffled data. In the phase shuffled control, we shuffled the Cspk tuning curves by different phases while preserving their relative order. For example, shuffling by a phase of 45° meant moving the response at 0° to 45°, 45° to 90°, 315° to 0° and so on. In the direction shuffle, we assigned random direction labels to the Cspk responses.
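The signal correlation and its two shuffled controls can be sketched in Python as follows. Tuning curves are assumed to be lists of eight responses ordered by direction in 45° steps; the function names are hypothetical and the code is an illustration, not the original analysis.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length curves (undefined for a flat curve)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def phase_shift(curve, n_steps):
    """Circular shift: the response at index i moves to index i + n_steps (one step = 45 deg)."""
    k = n_steps % len(curve)
    return curve[-k:] + curve[:-k] if k else curve[:]

def direction_shuffle(curve, rng):
    """Assign the responses to random direction labels."""
    shuffled = curve[:]
    rng.shuffle(shuffled)
    return shuffled
```

For a single cell, `pearson(cspk_curve, sspk_curve)` would give the signal correlation, while `pearson(phase_shift(cspk_curve, 1), sspk_curve)` and `pearson(direction_shuffle(cspk_curve, random.Random(0)), sspk_curve)` would give one sample of each control. Here `cspk_curve` and `sspk_curve` are placeholder names for that cell's two tuning curves.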
We calculated the cross-correlation of complex and simple spikes (Figure 1—figure supplement 2D) by calculating the PSTH of Sspks aligned to a Cspk event. We removed Cspks that occurred less than 100 ms after the trial began or less than 100 ms before a trial ended since we did not have sufficient information to calculate the PSTH. We manually removed spikes that were detected 1 ms before a Cspk or 2 ms after, because occasionally they could not be distinguished from Cspk spikelets.
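A Cspk-triggered Sspk histogram with these exclusions could be computed as in the Python sketch below. Spike times are assumed to be integer millisecond timestamps within one trial, the ±100 ms lag window is an assumption chosen to match the 100 ms exclusion margin, and the function name is hypothetical.

```python
def cspk_triggered_sspk_rate(sspk_ms, cspk_ms, trial_start_ms, trial_end_ms,
                             window_ms=100, exclude_before_ms=1, exclude_after_ms=2):
    """Sspk rate (spikes/s) in 1 ms bins at lags -window_ms..+window_ms around each Cspk.

    Returns (rates, n_triggers). Lags from -exclude_before_ms to +exclude_after_ms
    are always zero because those spikes may be Cspk spikelets.
    """
    n_lags = 2 * window_ms + 1
    counts = [0] * n_lags
    # Keep only Cspks with a full window on both sides inside the trial.
    triggers = [c for c in cspk_ms
                if c - trial_start_ms >= window_ms and trial_end_ms - c >= window_ms]
    for c in triggers:
        for s in sspk_ms:
            lag = s - c
            if -exclude_before_ms <= lag <= exclude_after_ms:
                continue  # possible spikelet, removed
            if -window_ms <= lag <= window_ms:
                counts[lag + window_ms] += 1
    if not triggers:
        return [0.0] * n_lags, 0
    return [1000.0 * k / len(triggers) for k in counts], len(triggers)
```

Pooling counts and trigger numbers across trials before dividing would give the per-cell histogram.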
To control for the direct responses to licking we approximated the contribution of the Cspk response to licking (Figure 3—figure supplement 1D) to the Cspk response to cue. We first calculated the peri-event time histogram (PETH) of each cell aligned to lick onset without separating the reward conditions (Figure 3—figure supplement 1C). Then, for every trial, we created synthetic data in which the firing rate around each lick onset was set to the average lick triggered PETH. Firing rates during times that were outside the range of the PETH (300 ms) were treated as missing data. We then averaged these single trial estimations of the firing rate to calculate the predicted PSTH for each reward condition, aligned to cue presentation. We performed a similar analysis for lick offset (Figure 3—figure supplement 1D, dashed line) and saccades (Figure 3—figure supplement 2F).
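The synthetic-data step can be sketched in Python as follows. The sketch assumes that lick onsets are integer millisecond times relative to cue presentation, that the PETH is a list of rates starting at a given lag before lick onset, and that when two licks overlap the later lick overwrites the earlier one. The overlap rule and the function name are assumptions, since the text does not specify them.

```python
def predicted_psth(trials_lick_onsets_ms, peth_hz, peth_start_lag_ms, n_bins):
    """Cue-aligned rate predicted from licking alone; None where no trial has data."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for licks in trials_lick_onsets_ms:
        trial = [None] * n_bins  # None marks missing data outside the PETH range
        for lick in licks:
            for k, rate in enumerate(peth_hz):
                t = lick + peth_start_lag_ms + k
                if 0 <= t < n_bins:
                    trial[t] = rate  # assumed: later licks overwrite earlier ones
        for t, r in enumerate(trial):
            if r is not None:
                sums[t] += r
                counts[t] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

Running this separately on the large and small reward trials would give one predicted PSTH per reward condition, which can then be compared with the measured cue response.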
To control for the response to retinal slip following cue presentation, we calculated the vertical and horizontal drift. Drift velocity during fixation is small and thus eye tracker measurements of drift movements are prone to measurement noise. We noted that a large fraction of the variance in drift position is explained by changes in pupil size (R² vertical median = 0.94, R² horizontal median = 0.17, n = 208; Kimmel et al., 2012). To examine differences in the drift between reward conditions independently of pupil size, we fitted a linear model between pupil size and eye position for the averages of each recording session. We subtracted the position predicted by pupil size from the measured position for each trial and performed further analyses on these corrected traces.
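The pupil-size correction amounts to an ordinary least-squares line fitted on the session averages and then subtracted from each trial. The Python sketch below uses hypothetical function names and assumes traces are given as equal-length lists of samples.

```python
def fit_line(x, y):
    """Least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def r_squared(x, y, slope, intercept):
    """Fraction of the variance in y explained by the fitted line."""
    my = sum(y) / len(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot

def remove_pupil_component(pupil_trial, position_trial, slope, intercept):
    """Subtract the position predicted by pupil size from the measured position."""
    return [p - (slope * s + intercept) for s, p in zip(pupil_trial, position_trial)]
```

The slope and intercept would be fitted once per session and per eye-position axis on the average pupil and position traces, then applied to every trial of that session.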
We did not correct for multiple comparisons in our analysis. We either used a small number of tests over the entire population or a large number of tests on individual cells that were only used as a criterion (for example, whether a cell differentiated between reward conditions during the cue). When using a test as a criterion we did not infer the existence of responsive cells but rather used it to classify cells into subpopulations.
Data availability
The data used in this paper are available at: https://github.com/MatiJlab.
References
- A theory of cerebellar function. Mathematical Biosciences 10:25–61. https://doi.org/10.1016/0025-5564(71)90051-4
- Relative reward processing in primate striatum. Experimental Brain Research 162:520–525. https://doi.org/10.1007/s00221-005-2223-z
- Neural implementation of Bayesian inference in a sensorimotor behavior. Nature Neuroscience 21:1442–1451. https://doi.org/10.1038/s41593-018-0233-y
- Distributed synergistic plasticity and cerebellar learning. Nature Reviews Neuroscience 13:619–635. https://doi.org/10.1038/nrn3312
- Purkinje cell activity during motor learning. Brain Research 128:309–328. https://doi.org/10.1016/0006-8993(77)90997-0
- Cerebellar encoding of multiple candidate error cues in the service of motor learning. Journal of Neuroscience 34:9880–9890. https://doi.org/10.1523/JNEUROSCI.5114-13.2014
- Coordinated cerebellar climbing fiber activity signals learned sensorimotor predictions. Nature Neuroscience 21:1431–1441. https://doi.org/10.1038/s41593-018-0228-8
- Interactions between target location and reward size modulate the rate of microsaccades in monkeys. Journal of Neurophysiology 114:2616–2624. https://doi.org/10.1152/jn.00401.2015
- Reward action in the initiation of smooth pursuit eye movements. Journal of Neuroscience 32:2856–2867. https://doi.org/10.1523/JNEUROSCI.4676-11.2012
- A theory of cerebellar cortex. The Journal of Physiology 202:437–470. https://doi.org/10.1113/jphysiol.1969.sp008820
- Cerebellar Purkinje cell activity drives motor learning. Nature Neuroscience 16:1734–1736. https://doi.org/10.1038/nn.3576
- Climbing fibers encode a temporal-difference prediction error during cerebellar learning in mice. Nature Neuroscience 18:1798–1803. https://doi.org/10.1038/nn.4167
- Purkinje cell complex spike activity during voluntary motor learning: relationship to kinematics. Journal of Neurophysiology 72:2617–2630. https://doi.org/10.1152/jn.1994.72.6.2617
- Independence of conjugate and disjunctive eye movements. The Journal of Physiology 159:361–364. https://doi.org/10.1113/jphysiol.1961.sp006813
- Purkinje cells in awake behaving animals operate at the upstate membrane potential. Nature Neuroscience 9:459–461. https://doi.org/10.1038/nn0406-459
- Animal intelligence: an experimental study of the associative processes in animals. Psychological Review 5:551–553. https://doi.org/10.1037/h0067373
- Duration of Purkinje cell complex spikes increases with their firing frequency. Frontiers in Cellular Neuroscience 9:122. https://doi.org/10.3389/fncel.2015.00122
- Using extracellular low frequency signals to improve the spike sorting of cerebellar complex spikes. Journal of Neuroscience Methods 328:108423. https://doi.org/10.1016/j.jneumeth.2019.108423
Decision letter
- Jennifer L Raymond, Reviewing Editor; Stanford University School of Medicine, United States
- Ronald L Calabrese, Senior Editor; Emory University, United States
- Steve A Edgley, Reviewer; University of Cambridge, United Kingdom
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "Cerebellar climbing fibers encode expected reward size" for consideration by eLife. Your article has been reviewed by Ronald Calabrese as the Senior Editor, a Reviewing Editor, and three reviewers. The following individuals involved in review of your submission have agreed to reveal their identity: Steve A Edgley (Reviewer #2).
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Summary:
This paper presents the dramatic finding that cerebellar Purkinje cell complex spikes (CS), which are generally thought to carry motor error signals, also show activity related to a cue indicating the reward value of successful performance of a future motor task. The consensus view of CS function suggests that they represent precisely timed instructive signals which act to modify cerebellar cortical circuits in order to optimise cerebellar outputs and thus motor performance. In this paper, complex spikes were recorded from Purkinje cells of monkeys performing a smooth-pursuit task. Before the pursuit target is presented, a visual cue predicts the value of a reward contingent upon good motor performance following a "go" signal after a variable delay. The key finding is that CSs are powerfully modulated with precise timing shortly after the cue signalling the size of the future reward. This modulation is dramatically different for large reward predicting cues than for small reward predicting cues. The results are unexpected and the authors show convincingly that the CS are related to the reward predicting cue rather than to the subsequent reward. High reward predicting cues are followed by faster movements, so the cue related CS signal correlates with that. Similar reward predicting cue-related modulation is seen in both cells with target/eye-movement tuned CSs and non-target/eye movement related CSs. A further remarkable property is that responses to large reward predicting cues increase complex spike frequency whereas small reward predicting cues decrease it, suggestive of ideas from temporal difference (TD) learning.
This is a big finding for the field and would fit nicely with other recent work on reward signals in the cerebellum and connections between the cerebellum and dopamine system. Similar cue-related activity has been described previously for limb motor tasks (prefrontal cortex (Tsujimoto et al., 2012), anterior striatum/SMA (Romo and Schultz, 1992)) and for saccade-related frontal lobe areas (Roesch and Olson, 2003), as well as dopaminergic substantia nigra/VTA neurons, but not previously in CSs. There was considerable enthusiasm about this work, and also some issues that should be addressed.
Essential revisions:
1) Additional effort should be made to address the possibility of "lurking" variables, such as movements, that could explain the reward-related CS activity rather than reward per se. It is likely that there are many motor responses related to orienting that may be correlated with "reward prediction". The analyses presented in Figure 3E and F or Supplementary figure 3 are not sufficient to rule out these potential confounds. While it would be unreasonable to expect the authors to control for all of them, additional exposition of some of the most likely should be possible through additional analyses of data already on hand and additional discussion of this issue. This is especially relevant since the authors did not do the experiments typically required to show a TD-like signal (e.g. reward omission) as has been reported in dopamine neurons (Schultz) and aversive CS signals (Ohmae and Medina, 2015). In particular, the following would improve the claim that the CS response can be attributed to differences in predicted reward size.
a) More extensive analysis of licking would be helpful.
Figure 3E,F plot the fraction of trials with lick? What about the number or timing of licks? Also, while it is true, as the authors state, that the monkey's licking rates are very high before the cue and decrease around the time of the cue, there does appear to be a substantial difference in licking behavior for large and small reward trials following the cue--around the time of the "reward prediction" CS. As IO neurons tend to depress during prolonged activation, it's perhaps not surprising that they would not be strongly driven during the pre-cue licking bouts, especially if the licks are not time locked to a particular event (which would cause jitter that could "blur" out consistent spiking). Further, the authors never show what the CSs are doing more than 200 ms preceding the cue (when licking is maximal).
b) The analysis of saccades was not especially compelling. An advantage of doing this study in the floccular complex is that we know that complex spikes encode retinal slip, so for the neurons with directionally tuned CSs there is one key potential lurking variable, which is readily addressed, and that is eye movements that could produce retinal slip in the preferred direction of the CSs, i.e., eye movements in the opposite direction. More extensive analysis of the direction and velocity of eye movements during the cue period is needed.
c) A key question is whether there is any cue-related SS modulation – might throw light on the function of the Purkinje cells that were not tuned for visual target direction and on events around the time of the cue related to the complex spikes.
2) Fuller presentation of the results. In general, the presentation of the results was clear, but to fully understand the details, readers must flit between the Results section, figures and Materials and methods section to try to follow some of the numbers and the specifics of the task.
a) Shortly after the cue presentation the data from Figure 2A suggest that the instantaneous discharge rate of complex spikes reaches around 4 Hz. Does this include multiple complex spikes with short interspike intervals on a single trial? I strongly feel that an illustration of raw data would be very helpful here to allow complex spike and simple spike waveforms to be compared, and that this should have a broader timescale than in Figure 2. For example, this would make a good supplementary data figure.
b) The description of the task in subsection “Complex spikes encode the size of the expected reward” does not fully describe what happens: the coloured target steps in one direction before moving linearly in the direction orthogonal to that direction. This is buried at the end of the paper in the Materials and methods section, and I suspect a very small proportion of the readers will look at that. It really needs to be before or at the time that the task is described.
c) An issue is that the recordings were sampled from a heterogeneous population of Purkinje cells from the flocculus. The Materials and methods section says there were 149 cells, Figure 2 148.
d) Scatter plots in Figure 2D shows some complex spikes with very low rates in the 100-300 ms interval for both small and large rewards: this could arise if only very few trials were averaged. I cannot see data on minimum numbers of trials in the paper – it should be made available.
e) Differentiation is made in Figure 3D between individual Purkinje cells with significant complex spike modulation to the cue and those in which there was not a significant difference. These numbers need to be available in the manuscript – ideally in subsection “Complex spikes encode the size of the expected reward”. At present, it is buried in the Materials and methods section at the end.
f) Following from this, an example cell is shown in Figure 4 which had both cue-related reward modulation and directional modulation. Again, some numbers on these would be valuable; how many of the sample were significantly modulated by both?
g) Figure 6 compares the complex and simple spike modulations, but does not say anything about simple spike modulation related to the reward-predicting cue. At least a brief mention is needed – were any significantly modulated? If so, were they different for large and small reward predicting cues?
h) The final paragraph of subsection “Representation of reward and target motion in the population” states that the motion tuned cells add an early response which was not selective to reward, which is clear in Figure 7B. Is this a qualitative description or were these significant differences? Since these are population curves, does this reflect heterogeneity in the response pattern, or is it a genuine representation of the behaviour of individual cells?
i) How were the cross correlations between CS and SS calculated? (subsection “Data analysis”). One might expect that the authors would calculate the correlations of the two types of spikes triggered on the onset of the cue or pursuit or some other trigger related to the task. It seems that instead they have calculated it based on CS triggered SS PSTHs. This is strange since this is typically the way that people verify that the CS is coming from the same Purkinje cell as the SS. Could it be that in CS with low negative correlations with SS the CS and SS are not actually coming from the same Purkinje cell?
j) The MRI images in Figure 1—figure supplement 2 don't do much to help convince readers that the authors are recording in or near the VPFL. At the very least, some labels would help. Even better would be if the imaging can be done with an electrode or similar placed in a recording track (which would hopefully reside in the VPFL).
k) In the Conclusion, the statement that the paper demonstrates "how climbing fibres encode predicting the reward size" is not really appropriate – it shows how a subpopulation of CSs innervating Purkinje cells in the flocculus have activity that encodes predicted reward. I also disagree with the statement that the authors have "found signals that can be used by the cerebellum to drive behaviour that maximises upcoming reward" – that is entirely possible, but is not shown in this paper – "may be used" would be more appropriate.
3) Additional discussion of the potential function of the CS activity related to the reward cue.
It is not clear what causal role the authors believe the "reward prediction" signal is playing in the task the monkeys are performing. The cue-related CSs come at a time when the Purkinje cells are not engaged in movement and relate differently to ongoing SS compared to CS occurring during movement – in this case what is their function? On the one hand, the authors show that the pursuit velocity is higher on the large reward trials; but, on the other hand, they show that the Purkinje cells with the highest "reward prediction" signals are the least tuned for the direction of pursuit. How do the authors think these untuned Purkinje cells fit within the pursuit control system? Do they even have access to the motor neurons controlling the eye? If not, why does the cerebellum need such a "reward prediction" signal during this task? Could the increased velocities on the large reward trials not just as easily be explained by increased attention/motivation that is not necessarily dependent on the cerebellum?
This is especially relevant since the majority of the neurons recorded by the authors did not appear to be within the VPFL, at least as it has been defined by Lisberger and others. That is, the percentage of Purkinje cells with directionally tuned simple spikes is very low compared to what is typically reported for VPFL Purkinje cells and few of the CSs seem to encode retinal slip signals. If the Purkinje cells are not in the VPFL how do the authors know they are even causally involved in the task? More information should be provided about the relative locations of the recorded cells that are tuned and untuned with reference to what is known in the literature about non-visual CS receptive fields in the areas around the flocculus.
A weakness of the task design is that it is impossible to distinguish between quantitative versus relative reward size encoding. Although the authors do not claim to study this directly, it is implicit in the learning models cited that there be quantitative reward size encoding (see point #2).
[Editors' note: further revisions were requested prior to acceptance, as described below.]
Thank you for submitting your article "Cerebellar climbing fibers encode expected reward size" for consideration by eLife. Your article has been reviewed by Jennifer Raymond as the Reviewing Editor, and Ronald Calabrese as the Senior Editor.
The Reviewing Editor has drafted this decision to help you prepare a revised submission.
Summary:
The authors have addressed many of the points raised in the last review. However, this central issue still has not been convincingly addressed, namely, whether the complex spikes are responding to the cue for reward size per se, rather than something else correlated with the cue, such as movements the animals may well make when anticipating reward and the associated sensory feedback.
Essential revisions:
To be fair, this issue is also a potential confound for many or all of the reports that have been coming out about reward coding in the cerebellum. The very exciting thing about the present experimental design is that it has the potential to nail this issue in a way that other studies have not been able to. In contrast to much of the work on reward coding in the cerebellum, which has been done in regions of the cerebellum where very little is known about coding, in this study, the recordings were made from the floccular complex of the cerebellum, where the signals carried by the simple spikes and complex spikes are probably better understood than anywhere in the cerebellum. And what is known about coding by the complex spikes is that they encode image motion on the retina, or retinal slip, in a particular, preferred direction (almost always contraversive or up). Therefore, the most important potential confound for the claim that the complex spikes encode upcoming reward size is the possibility that the expectation of a large reward causes the animal to make smooth eye movements (slow drift) that result in retinal slip in the preferred direction of the climbing fiber. The authors analyzed licking and saccades, but the most important potential "lurking variable" to worry about is the known responsiveness of complex spikes in this region of cerebellum to retinal slip at speeds of a few degrees/s, and that was not convincingly analyzed. The good news is that this is quite doable, for example, by doing a complex spike-triggered average of retinal slip, and comparing the results for the period after cue presentation with those during target motion in the "on" direction for the complex spikes. Unfortunately, the number of cells that were both responsive to the cue and directionally tuned (12 cells) may not be sufficient, hence more cells may need to be recorded.
[Editors' note: further revisions were requested prior to acceptance, as described below.]
Thank you for submitting your article "Cerebellar climbing fibers encode expected reward size" for consideration by eLife. Your article has been reviewed by a Reviewing Editor and Ronald Calabrese as the Senior Editor.
The Reviewing Editor has drafted this decision to help you prepare a revised submission.
The authors have responded in good faith to the concern about the possibility that retinal slip, which is known to drive climbing fibers in the region of the cerebellum recorded, might contribute to the reward-correlated climbing fiber responses. I appreciate that the measurement noise for the camera-based eye tracking limits what can be done to some extent, and the authors have gone to considerable effort to deal with the methodological limits.
The analysis provided in Figure 4—figure supplement 2B is helpful.
Figure 4—figure supplement 2A,C would be more helpful if the variance was provided as well as the mean. If the mean was zero and the variance was 10 deg/s, there would be a lot more image motion in the preferred direction than if the mean was zero and variance was zero. This matters since there is an asymmetry in the ability of preferred and anti-preferred direction image motion to increase and suppress the climbing fiber firing rate because of the low basal firing rate and floor at zero.
The analysis provided in Figure 4—figure supplement 2D is based on an assumption that is not supported by the literature: that the climbing fiber response to retinal slip increases linearly with the velocity of image motion. On the contrary, in rabbit, cat, and rat, climbing fibers in the flocculus are tuned for low retinal slip speeds, and their responses fall off with speeds >1-2°/s (Simpson and Alley, 1974; Blanks and Precht, 1983; Kusunoki et al., 1990; Fushiki et al., 1994). In monkeys, some climbing fibers respond to higher speeds, yet the assumption of a linear increase with image velocity is not supported (Noda et al., 1987; Hoffmann and Distler, 1989; Guo et al., 2014).
It is still not clear to me why the authors would not do the obvious thing of a complex spike triggered average of eye velocity in the preceding ~200 ms (for complex spikes in the period associated with the reward cue). After going through the trouble of correcting for pupil size artifacts in the camera data, why not give the climbing fiber spike-triggered eye velocity average a try? Those measurements might be dominated by noise, yet it would still be nice to know that nothing could be detected. Because if something is detected despite the measurement limitations, that would require a reinterpretation of the main result.
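The point about variance can be checked numerically. The sketch below is only an illustration under stated assumptions: image velocity along the preferred axis is taken to be Gaussian with zero mean, the quoted 10 deg/s is treated as a standard deviation, and only the preferred-direction (positive) component is counted. The function names are hypothetical. Under these assumptions the mean preferred-direction motion is sd/sqrt(2*pi), so it grows with the spread even when the mean velocity is zero.

```python
import numpy as np

def mean_preferred_motion(sd, n=200_000, seed=0):
    """Monte Carlo estimate of the mean rectified velocity E[max(v, 0)]
    for v ~ Normal(0, sd). Units follow `sd` (e.g. deg/s)."""
    rng = np.random.default_rng(seed)
    v = rng.normal(0.0, sd, size=n)
    # Keep only motion in the (assumed) preferred direction.
    return np.maximum(v, 0.0).mean()

def expected_preferred_motion(sd):
    """Closed form for the same quantity: sd / sqrt(2 * pi)."""
    return sd / np.sqrt(2 * np.pi)

if __name__ == "__main__":
    for sd in (0.0, 10.0):
        print(sd, mean_preferred_motion(sd), expected_preferred_motion(sd))
```

With zero spread the rectified mean is zero, whereas a nonzero spread gives a strictly positive value, which is why reporting the mean alone cannot exclude preferred-direction image motion.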
Essential revisions:
Remove Figure 4—figure supplement 2, panel D, unless the authors can provide some justification for the assumption of linear increase in climbing fiber response with image velocity.
https://doi.org/10.7554/eLife.46870.020
Author response
Summary:
This paper presents the dramatic finding that cerebellar Purkinje cell complex spikes (CS), which are generally thought to carry motor error signals, also show activity related to a cue indicating the reward value of successful performance of a future motor task. The consensus view of CS function suggests that they represent precisely timed instructive signals which act to modify cerebellar cortical circuits in order to optimise cerebellar outputs and thus motor performance. In this paper, complex spikes were recorded from Purkinje cells of monkeys performing a smooth-pursuit task. Before the pursuit target is presented, a visual cue predicts the value of a reward contingent upon good motor performance following a "go" signal after a variable delay. The key finding is that CSs are powerfully modulated with precise timing shortly after the cue signaling the size of the future reward. This modulation is dramatically different for large reward predicting cues than for small reward predicting cues. The results are unexpected and the authors show convincingly that the CS are related to the reward predicting cue rather than to the subsequent reward. High reward predicting cues are followed by faster movements, so the cue related CS signal correlates with that. Similar reward predicting cue-related modulation is seen in both cells with target/eye-movement tuned CSs and non-target/eye movement related CSs. A further remarkable property is that responses to large reward predicting cues increase complex spike frequency whereas small reward predicting cues decrease it, suggestive of ideas from temporal difference (TD) learning.
This is a big finding for the field and would fit nicely with other recent work on reward signals in the cerebellum and connections between the cerebellum and dopamine system. Similar cue-related activity has been described previously for limb motor tasks (prefrontal cortex: Tsujimoto et al., 2012; anterior striatum/SMA: Romo and Schultz, 1992) and for saccade-related frontal lobe areas (Roesch and Olson, 2003), as well as dopaminergic substantia nigra/VTA neurons, but not previously in CSs. There was considerable enthusiasm about this work, and also some issues that should be addressed.
Essential revisions:
1) Additional effort should be made to address the possibility of "lurking" variables, such as movements, that could explain the reward-related CS activity rather than reward per se. It is likely that there are many motor responses related to orienting that may be correlated with "reward prediction". The analyses presented in Figure 3E and F or Supplementary figure 3 are not sufficient to rule out these potential confounds. While it would be unreasonable to expect the authors to control for all of them, additional exposition of some of the most likely should be possible through additional analyses of data already on hand and additional discussion of this issue. This is especially relevant since the authors did not do the experiments typically required to show a TD-like signal (e.g. reward omission) as has been reported in dopamine neurons (Schultz) and aversive CS signals (Ohmae and Medina, 2015). In particular, the following would improve the claim that the CS response can be attributed to differences in predicted reward size.
In the revised manuscript we thoroughly controlled for licking and eye movements. We added Figure 3—figure supplement 1 and Figure 3—figure supplement 2, which show that licking and eye movements cannot explain the reward modulation at cue presentation.
a) More extensive analysis of licking would be helpful.
Figure 3E,F plot the fraction of trials with lick? What about the number or timing of licks? Also, while it is true, as the authors state, that the monkey's licking rates are very high before the cue and decrease around the time of the cue, there does appear to be a substantial difference in licking behavior for large and small reward trials following the cue--around the time of the "reward prediction" CS. As IO neurons tend to depress during prolonged activation, it's perhaps not surprising that they would not be strongly driven during the pre-cue licking bouts, especially if the licks are not time locked to a particular event (which would cause jitter that could "blur" out consistent spiking). Further, the authors never show what the CSs are doing more than 200 ms preceding the cue (when licking is maximal).
Licking cannot explain the reward modulation at the cue, for the following reasons. First, we separately calculated the PSTH in each reward condition in trials with and without the initiation of a lick (Figure 3—figure supplement 1A,B). Even when the monkey did not initiate a lick, the difference between reward conditions was significant. Second, we calculated the PETH of the cells that responded to the cue aligned to the onset and offset of licks (Figure 3—figure supplement 1C). We found that the lick response was small. Third, we used this small response together with the minor differences in the behavior between large and small rewards to show that the contribution of licking to the Cspk response was negligible (Figure 3—figure supplement 1D). Finally, to control for the temporal aspects of the licking we performed the analysis on both onset and offset of the licking.
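The first control above (separate PSTHs per reward condition for trials with and without a lick) can be sketched as follows. This is a minimal illustration rather than the analysis code used for the paper: the function names, the 50 ms bin width, the analysis window, and the data layout (one array of cue-aligned spike times per trial) are all assumptions.

```python
import numpy as np

def psth(spike_times_per_trial, t_start=-0.2, t_stop=0.8, bin_width=0.05):
    """Trial-averaged firing rate (spikes/s) in fixed bins.
    Spike times are in seconds relative to cue onset."""
    n_bins = int(round((t_stop - t_start) / bin_width))
    edges = np.linspace(t_start, t_stop, n_bins + 1)
    counts = np.zeros(n_bins)
    for spikes in spike_times_per_trial:
        counts += np.histogram(spikes, bins=edges)[0]
    n_trials = max(len(spike_times_per_trial), 1)  # avoid division by zero
    return edges[:-1], counts / (n_trials * bin_width)

def split_psth(trials, reward, licked):
    """PSTH for every (reward condition, lick status) combination.
    `reward` holds "large"/"small"; `licked` holds booleans, one per trial."""
    out = {}
    for r in ("large", "small"):
        for lick in (True, False):
            sel = [t for t, rr, ll in zip(trials, reward, licked)
                   if rr == r and ll == lick]
            out[(r, lick)] = psth(sel)[1]
    return out
```

Comparing `out[("large", False)]` with `out[("small", False)]` then isolates the reward effect in trials in which no lick was initiated.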
As requested, we also display more time before cue presentation in Figure 2A to show that the Cspk rate did not increase when licking was maximal before the cue.
b) The analysis of saccades was not especially compelling. An advantage of doing this study in the floccular complex is that we know that complex spikes encode retinal slip, so for the neurons with directionally tuned CSs there is one key potential lurking variable, which is readily addressed, and that is eye movements that could produce retinal slip in the preferred direction of the CSs-i.e., eye movements in the opposite direction. More extensive analysis of the direction and velocity of eye movements during the cue period is needed.
We now further address this point by analyzing saccades during the cue period. We calculated the fraction of trials with saccades at different time points along the trial (Figure 3—figure supplement 2A,B). Only a minority of trials had saccades during the cue period. We repeated the analysis comparing the Cspk response to the different reward conditions for trials with and without saccades (Figure 3—figure supplement 2C,D). We found that Cspks differentiated between reward conditions in both cases.
Similar to the licking analysis (see our response to comment 1a), we calculated the PETH of the cells that responded to the cue aligned to the occurrence of a saccade (Figure 3—figure supplement 2E). Here as well the response was very small. We approximated the contribution of saccades to the Cspk response and found that it was negligible and did not differentiate between reward conditions (Figure 3—figure supplement 2E).
We additionally checked if, within the minority of trials with saccades, differences in saccade velocity or direction could underlie the effect on Cspk rate. At the behavioral level, the distributions of saccade velocities and directions were mostly similar between reward conditions (Author response image 1A,B; Joshua et al., 2015). The median of the distribution of saccade velocities was smaller in the large reward condition than the small (saccades tended to be slower in the large reward condition; Signed-rank, p<0.001, n = 9833 saccades). However, this difference cannot solely underlie the coding of reward since reward size is coded even without saccades. Furthermore, this seems to be the opposite of what we would expect if saccade velocity were the underlying cause of the difference in firing rate during the cue. We also analyzed trials with slow and fast saccades separately. There was an increase in Cspk rate following the large reward cue in both cases, although we only used a small fraction of trials for each condition (only 32% of trials overall).
Lastly, we tested whether reward condition interacted with saccade direction, aligned to the preferred direction of the simple spikes of the same cell. We calculated the Cspk rate after the cue in trials with different saccade directions and shifted them so that the PD of simple spikes was to the right (Author response image 1D). There was an interaction between reward condition and saccade direction and in particular a large difference between reward conditions at the simple spike null direction (to the left). However, the difference was half a spike, which occurred in a minority of trials and does not explain the effect we observed. We briefly address the velocity and direction analysis in the paper.

Analysis of saccade velocity and direction during the cue.
(A) The distribution of saccade velocity in the 700 ms following the cue and preceding target motion in both reward conditions. (B) The distribution of saccade directions in the same time period as A. (C) PSTHs aligned to cue for trials with slow and fast saccades (below the 40th and above the 60th percentile respectively). (D) The Cspk response following the cue in trials with different saccade directions. The right direction represents the PD of simple spikes for the same cell.
c) A key question is whether there is any cue-related SS modulation – might throw light on the function of the Purkinje cells that were not tuned for visual target direction and on events around the time of the cue related to the complex spikes.
We agree with the reviewer that understanding the relationship between complex and simple spikes is critical. We therefore expanded the analysis and increased the focus on this in the revised manuscript (Figure 6 and Figure 7). We found that the Sspk population exhibited heterogeneous responses. We now note the difference between this heterogeneity and the consistent coding of reward by Cspks. Furthermore, we could not find a link between a cell's simple and complex spike responses to the cue (Figure 7C-D). We address the lack of Cspk-Sspk correlation following the cue in the discussion.
2) Fuller presentation of the results. In general, the presentation of the results was clear, but to fully understand the details, readers must flit between Results section figures and Materials and methods section to try to follow some of the numbers and the specifics of the task.
a) Shortly after the cue presentation the data from Figure 2A suggests that the instantaneous discharge rate of complex spikes reaches around 4 Hz. Does this include multiple complex spikes with short Interspike intervals on a single trial? I strongly feel that an illustration of raw data would be very helpful here to allow complex spike and simple spike waveforms to be compared, and that this should have a broader timescale than in Figure 2. For example, this would make a good supplementary data figure.
We added a supplementary figure (Figure 2—figure supplement 1) showing example traces for the response of the cell in Figure 2. We also included histograms of the number of spikes in the 100-300 ms time bin following the cue for cells that responded to the cue presentation (Figure 2C,F).
b) The description of the task in subsection “Complex spikes encode the size of the expected reward” does not fully describe what happens: the coloured target steps in one direction before moving linearly in the direction orthogonal to that direction. This is buried at the end of the paper in the Materials and methods section, and I suspect a very small proportion of the readers will look at that. It really needs to be before or at the time that the task is described.
We added an explanation of this point when first describing our task in the Introduction section.
c) An issue is that the recordings were sampled from a heterogeneous population of Purkinje cells from the flocculus. The Materials and methods section says there were 149 cells, Figure 2 148.
We corrected this mistake; the correct number in both cases is 148.
d) Scatter plots in Figure 2D shows some complex spikes with very low rates in the 100-300 ms interval for both small and large rewards: this could arise if only very few trials were averaged. I cannot see data on minimum numbers of trials in the paper – it should be made available.
We thank the reviewers for noticing that this information was missing. We added it to the Materials and methods section. We also repeated the analysis using more stringent criteria (a minimum of 10 trials per condition) and this did not alter our results.
e) Differentiation is made in Figure 3D between individual Purkinje cells with significant complex spike modulation to the cue and those in which there was not a significant difference. These numbers need to be available in the manuscript – ideally in the first section of the results. At present, it is buried in the Materials and methods section at the end.
We added these numbers to the Results section.
f) Following from this, an example cell is shown in Figure 4 which had both cue-related reward modulation and directional modulation. Again, some numbers on these would be valuable; how many of the sample were significantly modulated by both?
In response to this comment, we performed additional analyses on the subpopulations of cells that responded to cue presentation and cells that were directionally tuned. Different approaches to analysis resulted in opposite results and effect directions. We therefore chose to remove Figure 7 from the paper, as our data do not clearly show that these are two non-overlapping subpopulations. Instead, we now provide a more nuanced presentation in which we present examples for the encoding of either or both variables (Figure 4—figure supplement 1 in the revised manuscript) and added the requested numbers to the paper.
g) Figure 6 compares the complex and simple spike modulations, but does not say anything about simple spike modulation related to the reward-predicting cue. At least a brief mention is needed – were any significantly modulated? If so, were they different for large and small reward predicting cues?
We added a figure (Figure 6 in the revised manuscript) to address this point.
h) The final paragraph of subsection “Representation of reward and target motion in the population” states that the motion tuned cells add an early response which was not selective to reward, which is clear in Figure 7B. Is this a qualitative description or were these significant differences? Since these are population curves, does this reflect heterogeneity in the response pattern, or is it a genuine representation of the behaviour of individual cells?
We do not have enough data to make this claim quantitatively (see our response to comment 2F).
i) How were the cross correlations between CS and SS calculated? (subsection “Data analysis”). One might expect that the authors would calculate the correlations of the two types of spikes triggered on the onset of the cue or pursuit or some other trigger related to the task. It seems that instead they have calculated it based on CS triggered SS PSTHs. This is strange since this is typically the way that people verify that the CS is coming from the same Purkinje cell as the SS. Could it be that in CS with low negative correlations with SS the CS and SS are not actually coming from the same Purkinje cell?
The description of the analysis was not clear enough. We have rephrased it in the revised manuscript. This paragraph referred to Figure 1—figure supplement 2D. The purpose was indeed to verify that there was a pause in Sspk rate following a Cspk, as has previously been shown. We now reference the figure in the Materials and methods section to make this clearer. The correlation now shown in Figure 7C and D was the average responses and not the trial-by-trial activity.
j) The MRI images in Figure 1—figure supplement 2 don't do much to help convince readers that the authors are recording in or near the VPFL. At the very least, some labels would help. Even better would be if the imaging can be done with an electrode or similar placed in a recording track (which would hopefully reside in the VPFL).
We added labels to Figure 1—figure supplement 2A. We cannot perform the MRI scan again, since the chambers were removed from both monkeys. Our main criterion for defining the location was the response of the cells to pursuit eye movements. All other cells were recorded on days when we searched for the eye movement neurons. Our ability to reconstruct the recording sites is very limited, probably due to errors in the exact angle of the guide, an error that is especially enhanced in deep brain recordings. We attempted to calculate the cell's location on the MRI images, but this was not accurate and resulted in some cells being placed outside the brain, or cells with clearly identified complex spikes and simple spikes falling outside the cerebellum.
k) In the Conclusion, the statement that the paper demonstrates "how climbing fibres encode predicting the reward size" is not really appropriate – it shows how a subpopulation of CSs innervating Purkinje cells in the flocculus have activity that encodes predicted reward.
We replaced this statement with a more accurate one.
l) I also disagree with the statement that the authors have "found signals that can be used by the cerebellum to drive behaviour that maximises upcoming reward" – that is entirely possible, but is not shown in this paper – "may be used" would be more appropriate.
We replaced this statement with a more accurate one.
3) Additional discussion of the potential function of the CS activity related to the reward cue.
It is not clear what causal role the authors believe the "reward prediction" signal is playing in the task the monkeys are performing. The cue-related CSs come at a time when the Purkinje cells are not engaged in movement and relate differently to ongoing SS compared to CS occurring during movement – in this case what is their function? On the one hand, the authors show that the pursuit velocity is higher on the large reward trials; but, on the other hand, they show that the Purkinje cells with the highest "reward prediction" signals are the least tuned for the direction of pursuit. How do the authors think these untuned Purkinje cells fit within the pursuit control system? Do they even have access to the motor neurons controlling the eye? If not, why does the cerebellum need such a "reward prediction" signal during this task? Could the increased velocities on the large reward trials not just as easily be explained by increased attention/motivation that is not necessarily dependent on the cerebellum?
This is especially relevant since the majority of the neurons recorded by the authors did not appear to be within the VPFL, at least as it has been defined by Lisberger and others. That is, the percentage of Purkinje cells with directionally tuned simple spikes is very low compared to what is typically reported for VPFL Purkinje cells and few of the CSs seem to encode retinal slip signals. If the Purkinje cells are not in the VPFL how do the authors know they are even causally involved in the task? More information should be provided about the relative locations of the recorded cells that are tuned and untuned with reference to what is known in the literature about non-visual CS receptive fields in the areas around the flocculus.
We agree with the reviewer that it is extremely important to move beyond representation to a mechanistic understanding of how reward affects cerebellar computation and behavior. We think this is one of the main upcoming challenges of the field. In fact, when we designed the experiment this was exactly what was on our mind. Reward coding at motion onset might have led to linking the behavioral effect, complex spikes and simple spikes, like what was found in the cerebellum during learning. However, this is not what we found. It is possible that the difference in behavior (Figure 1B,C) was not caused by the Cspk signal, so we were careful not to claim that it did and have made this clearer in the discussion in the revised paper. This signal could be involved in the learning of the associations, when the difference in behavior is acquired, but our experiment does not allow us to test this.
We did not manage to accurately calculate the locations of individual cells (see our response to comment 2J), so it is possible that some of the neurons we recorded were outside the eye movement areas. We therefore focused on the existence of this signal during the cue and refrained from making claims as to their involvement in movement. When discussing movement-related activity (Figure 4 and Figure 5) we only included cells that were directionally tuned. We reviewed the literature and discussed this with other researchers to find references to non-eye movement areas that surround the flocculus but could not find information that we could relate to our results.
A weakness of the task design is that it is impossible to distinguish between quantitative versus relative reward size encoding. Although the authors do not claim to study this directly, it is implicit in the learning models cited that there be quantitative reward size encoding (see point #2).
We agree with the reviewer that the coding might be relative rather than absolute. We mention this point in the discussion to prevent over-interpretation of our data.
[Editors' note: further revisions were requested prior to acceptance, as described below.]
Essential revisions:
To be fair, this issue is also a potential confound for many or all of the reports that have been coming out about reward coding in the cerebellum. The very exciting thing about the present experimental design is that it has the potential to nail this issue in a way that other studies have not been able to. In contrast to much of the work on reward coding in the cerebellum, which has been done in regions of the cerebellum where very little is known about coding, in this study, the recordings were made from the floccular complex of the cerebellum, where the signals carried by the simple spikes and complex spikes are probably better understood than anywhere in the cerebellum. And what is known about coding by the complex spikes is that they encode image motion on the retina, or retinal slip, in a particular, preferred direction (almost always contraversive or up). Therefore, the most important potential confound for the claim that the complex spikes encode upcoming reward size is the possibility that the expectation of a large reward causes the animal to make smooth eye movements (slow drift) that result in retinal slip in the preferred direction of the climbing fiber. The authors analyzed licking and saccades, but the most important potential "lurking variable" to worry about is the known responsiveness of complex spikes in this region of cerebellum to retinal slip at speeds of a few degrees/s, and that was not convincingly analyzed. The good news is that this is quite doable, for example, by doing a complex spike-triggered average of retinal slip, and comparing the results for the period after cue presentation with those during target motion in the "on" direction for the complex spikes. Unfortunately, the number of cells that were both responsive to the cue and directionally tuned (12 cells) may not be sufficient, hence more cells may need to be recorded.
Several lines of evidence suggest that the response to the reward cue is not a result of the retinal slip caused by drifts of the eye. We used both our data and behavioral data from four additional monkeys recorded using coils on similar tasks to control for the possible confound of drift movements. We present these controls in Figure 4—figure supplement 2 of the revised manuscript.
We recorded eye movements using an eye tracker camera. We follow the lead of some groups that use eye tracker data to analyze drift (Engbert and Kliegl, 2004; Herrmann et al., 2017). However, we note that others show that camera measurement noise might dominate the drift signal (Kimmel et al., 2012; Ko et al., 2016). We therefore first reduced this measurement noise to the best of our ability. The eye tracker measurement of vertical eye velocity was highly correlated to the pupil area measurement we collected simultaneously (Author response image 1). We therefore fitted a linear model between pupil size and eye position for the averages of each recording session. Based on the model, we predicted the measured drift from the pupil size for each trial and subtracted it from the measured drift to obtain an assessment of the drift that is independent of the pupil size artifact. We used these data for the analysis in Figure 4—figure supplement 2A-C and Figure 4—figure supplement 2D second column.
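The pupil-size correction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: the function name, the array layout (trials by time samples for one session), and the use of `numpy.polyfit` for the linear fit are all hypothetical choices. The model is fitted on the session-average traces and its prediction is then subtracted from every trial.

```python
import numpy as np

def remove_pupil_artifact(eye_pos, pupil_area):
    """Subtract the pupil-predicted component from eye position.

    eye_pos, pupil_area: arrays of shape (n_trials, n_samples) from one
    recording session, aligned to cue presentation.
    Returns the corrected eye position plus the fitted slope and intercept.
    """
    # Fit eye = slope * pupil + intercept on the session averages.
    mean_eye = eye_pos.mean(axis=0)
    mean_pupil = pupil_area.mean(axis=0)
    slope, intercept = np.polyfit(mean_pupil, mean_eye, deg=1)
    # Predict the artifact on each trial from that trial's pupil trace.
    predicted = slope * pupil_area + intercept
    return eye_pos - predicted, slope, intercept
```

Whatever remains after the subtraction is the part of the measured position that the pupil trace does not explain linearly. It can still contain camera noise, which is why the coil data provide an independent check.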
Even after correcting for the pupil artifacts, the signal might still have been dominated by measurement noise. We therefore also used behavioral data recorded on similar tasks using a coil to assess the size of the drift (Author response image 2). In some monkeys the drift differed between reward conditions. However, the direction of the difference was not consistent, and in some monkeys, there was no difference at all. Even in the monkey with the largest difference between reward conditions (Monkey R in Author response image 2) the difference between large and small rewards was very small (~0.2 deg/s). We used these traces to show that the contribution of the drift to the firing rate is expected to be very small (Figure 4—figure supplement 2D).
Lastly and importantly, many of the cells that responded to the cue did not respond during target motion. Therefore, unless these cells responded specifically to small drifts or small retinal slips (which has never been documented in the cerebellum) the reward related response in these cells was not related to the drift.
Overall, the controls above suggest that it is unlikely that the tiny drift at motion onset would drive the robust response after cue onset.

Correcting the influence of pupil size on eye position measurements.
(A) Vertical eye position aligned to cue presentation, for each reward condition. (B) Horizontal eye position aligned to cue presentation, for each reward condition. (C) Pupil area in arbitrary units aligned to cue presentation, for each reward condition. (D) Distribution of R² values for models fitting pupil area to vertical (up) or horizontal (down) eye position for each recording session (vertical median = 0.94, horizontal median = 0.17, n = 208).

Drift following reward size cue presentation, measured using a coil.
(A-D) Vertical drift velocity measured using coil following reward cue presentation. (E-H) Horizontal drift velocity measured using coil following reward cue presentation.
[Editors' note: further revisions were requested prior to acceptance, as described below.]
The analysis provided in Figure 4—figure supplement 2B is helpful.
Figure 4—figure supplement 2A,C would be more helpful if the variance was provided as well as the mean. If the mean was zero and the variance was 10 deg/s, there would be a lot more image motion in the preferred direction than if the mean was zero and variance was zero. This matters since there is an asymmetry in the ability of preferred and anti-preferred direction image motion to increase and suppress the climbing fiber firing rate because of the low basal firing rate and floor at zero.
The analysis provided in Figure 4—figure supplement 2D is based on an assumption that is not supported by the literature: that the climbing fiber response to retinal slip increases linearly with the velocity of image motion. On the contrary, in rabbit, cat, and rat, climbing fibers in the flocculus are tuned for low retinal slip speeds, and their responses fall off with speeds >1-2°/s (Simpson and Alley, 1974; Blanks and Precht, 1983; Kusunoki et al., 1990; Fushiki et al., 1994). In monkeys, some climbing fibers respond to higher speeds, yet the assumption of a linear increase with image velocity is not supported (Noda et al., 1987; Hoffmann and Distler, 1989; Guo et al., 2014).
It is still not clear to me why the authors would not do the obvious thing of a complex spike triggered average of eye velocity in the preceding ~200 ms (for complex spikes in the period associated with the reward cue). After going through the trouble of correcting for pupil size artifacts in the camera data, why not give the climbing fiber spike-triggered eye velocity average a try? Those measurements might be dominated by noise, yet it would still be nice to know that nothing could be detected. Because if something is detected despite the measurement limitations, that would require a reinterpretation of the main result.
Essential revisions:
Remove Figure 4—figure supplement 2, panel D, unless the authors can provide some justification for the assumption of linear increase in climbing fiber response with image velocity.
In Figure 4—figure supplement 2, the SEMs were included but were small. We replaced them with STDs to make the comparison between variances possible (Figure 4—figure supplement 2A-D in the revised paper). We removed Figure 4—figure supplement 2 as requested. We added the Cspk-triggered drift (Figure 4—figure supplement 2E in the revised paper).
References
Engbert R, Kliegl R. 2004. Microsaccades Keep the Eyes’ Balance During Fixation. Psychol Sci 15:431–436.
Herrmann CJJ, Metzler R, Engbert R. 2017. A self-avoiding walk with neural delays as a model of fixational eye movements. Sci Rep 7:12958.
Ko H, Snodderly DM, Poletti M. 2016. Eye movements between saccades: Measuring ocular drift and tremor. Vision Res 122:93–104.
https://doi.org/10.7554/eLife.46870.021
Article and author information
Funding
H2020 European Research Council (imove 755745)
- Mati Joshua
Human Frontier Science Program (CDA 00056)
- Mati Joshua
Israel Science Foundation (38017)
- Mati Joshua
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank Y Botschko for technical assistance. This study was supported by an HFSP career development award, the Israel Science Foundation and the European Research Council.
Ethics
Animal experimentation: All the procedures described in this paper were approved in advance by the Institutional Animal Care and Use Committees of the Hebrew University of Jerusalem (ethics approval number MD15145854) and were in strict compliance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals.
Senior Editor
- Ronald L Calabrese, Emory University, United States
Reviewing Editor
- Jennifer L Raymond, Stanford University School of Medicine, United States
Reviewer
- Steve A Edgley, University of Cambridge, United Kingdom
Version history
- Received: March 14, 2019
- Accepted: October 24, 2019
- Accepted Manuscript published: October 29, 2019 (version 1)
- Version of Record published: November 11, 2019 (version 2)
Copyright
© 2019, Larry et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.