Cerebellar involvement in an evidence-accumulation decision-making task

  1. Ben Deverett (corresponding author)
  2. Sue Ann Koay
  3. Marlies Oostland
  4. Samuel S-H Wang (corresponding author)
  1. Princeton University, United States
  2. Rutgers Robert Wood Johnson Medical School, United States
Research Article
Cite this article as: eLife 2018;7:e36781 doi: 10.7554/eLife.36781

Abstract

To make successful evidence-based decisions, the brain must rapidly and accurately transform sensory inputs into specific goal-directed behaviors. Most experimental work on this subject has focused on forebrain mechanisms. Using a novel evidence-accumulation task for mice, we performed recording and perturbation studies of crus I of the lateral posterior cerebellum, which communicates bidirectionally with numerous forebrain regions. Cerebellar inactivation led to a reduction in the fraction of correct trials. Using two-photon fluorescence imaging of calcium, we found that Purkinje cell somatic activity contained choice/evidence-related information. Decision errors were represented by dendritic calcium spikes, which in other contexts are known to drive cerebellar plasticity. We propose that cerebellar circuitry may contribute to computations that support accurate performance in this perceptual decision-making task.

https://doi.org/10.7554/eLife.36781.001

Introduction

Although the cerebellum is best known for its role in controlling movement, clinical and experimental evidence has long indicated that the posterior cerebellum regulates a wide range of cognitive functions (Konarski et al., 2005; Schmahmann and Sherman, 1998; Stoodley et al., 2012), including decision-making and working memory (Blackwood et al., 2004; Desmond et al., 1997; Ernst et al., 2002; Kansal et al., 2017). For example, focal cerebellar lesions in humans lead to impairment in working memory performance (Gottwald, 2004), and cerebellar fMRI activation increases with working memory demands (Küper et al., 2016). However, very little is known about the circuit mechanisms supporting these roles.

In the domain of movement control, the cerebellum is thought to use sensory and internal information as a means of adjusting action on a subsecond scale (Krakauer and Shadmehr, 2006). The cerebellar cortex consists of highly characteristic circuitry occurring in repeating modules which are likely to perform similar manipulations on information irrespective of whether the information is sensory, motor, or neither (Popa et al., 2014; Reeber et al., 2013). Thus, well-established models of cerebellar motor learning may be expanded to support the control of cognitive processing (Ito, 2008).

Neuronal correlates of the perceptual decision-making process have been studied using behavioral tasks in animal models (Carandini and Churchland, 2013) including evidence accumulation paradigms in which animals must continuously update the contents of working memory to guide a decision (Brunton et al., 2013; Gold and Shadlen, 2007; Morcos and Harvey, 2016; Pinto et al., 2018a). Behavioral performance in these tasks develops over time and can be marked by decision side biases, history effects, and error rates that diminish with training (Pinto et al., 2018a). The detailed mechanisms by which accurate decisions are formed and errors are reduced remain unsolved.

Neurons in multiple brain structures across species have been found to represent various stages in the transformation from sensory information to decision signals. These regions include prefrontal, premotor, parietal, and primary and secondary sensory cortices, striatum, midbrain structures, and possibly others (Akrami et al., 2018; Brody and Hanks, 2016; Gold and Shadlen, 2007; Scott et al., 2017; Yartsev et al., 2018). Many of these structures are reciprocally connected with the cerebellum, notably with posterior cerebellar regions such as crus I (Kelly and Strick, 2003; Prevosto et al., 2010; Strick et al., 2009). Communication between forebrain structures and the posterior cerebellum (Buckner et al., 2011; Stoodley et al., 2017) raises the possibility that the cerebellum might participate in the formation or updating of decision-related signals.

We investigated cerebellar neural activity during decision-making in a head-fixed rodent model. Like previously developed decision-making tasks (Brunton et al., 2013; Morcos and Harvey, 2016; Shadlen and Newsome, 2001), our task demands dynamic manipulation of working memory and decision-making under uncertainty, which recruit cerebellar activation in humans (Blackwood et al., 2004; Kansal et al., 2017), as well as the correction of errors, a cerebellar role that may extend beyond the motor domain (Ito, 2008).

Results

A decision-making task for cerebellar investigations

To study decision-making in the cerebellum, we developed a task with five key properties: (1) integration of evidence over seconds (Scott et al., 2015), (2) minimal movement until presentation of a readout cue (Shadlen and Newsome, 2001), (3) task structure to match established decision-making frameworks (Brunton et al., 2013), (4) sensorimotor engagement of the lateral posterior cerebellum (Manni and Petrosini, 2004), and (5) amenability to head-fixed conditions to facilitate two-photon imaging (Dombeck et al., 2007). In our evidence accumulation task (Figure 1A, Video 1), each trial contains a 3.8 s cue period in which a series of air puffs (pieces of evidence) is delivered to the left and right whiskers. Then, following a short delay period with no stimuli, lick ports are brought into the animal’s reach and mice lick leftward or rightward to indicate which side received the greater number of stimuli, with a correct response leading to a water reward to end the trial. Puffs are generated randomly with differing rates on each side, demanding that mice continually attend to the stimuli to achieve optimal performance (Brunton et al., 2013).
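As a rough illustration of the trial structure described above, the following sketch draws two independent Poisson puff streams over the 3.8 s cue period and labels the side with more puffs as correct. The puff rates, the function names, and the handling of ties are assumptions made for illustration; the text does not specify them.

```python
import random

CUE_PERIOD_S = 3.8  # cue period duration stated in the text


def poisson_times(rate_hz, duration_s, rng):
    """Event times of a homogeneous Poisson process on [0, duration_s)."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_hz)  # exponential inter-puff interval
        if t >= duration_s:
            return times
        times.append(t)


def make_trial(rate_left_hz, rate_right_hz, rng):
    """Generate one trial; rates are hypothetical, not taken from the paper."""
    left = poisson_times(rate_left_hz, CUE_PERIOD_S, rng)
    right = poisson_times(rate_right_hz, CUE_PERIOD_S, rng)
    if len(left) == len(right):
        correct = None  # tie: no side has more evidence (handling assumed)
    else:
        correct = "L" if len(left) > len(right) else "R"
    return {"left": left, "right": right, "correct_side": correct}


rng = random.Random(0)
trial = make_trial(rate_left_hz=2.5, rate_right_hz=0.5, rng=rng)
print(len(trial["left"]), len(trial["right"]), trial["correct_side"])
```

Because the puff counts on each side are random, the correct side cannot be known from the early puffs alone, which is why attending to the whole cue period is advantageous.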

Figure 1 (with 3 supplements)
A somatosensory decision-making task that depends on the cerebellum.

(A) In each trial, two streams of random, temporally Poisson-distributed air puffs were delivered to the left and right whiskers. After a delay, mice licked one of two lick ports indicating the side with more cumulative puffs to receive a water reward. Gray-shaded regions from left to right: cue period, delay, intertrial interval. Decision lick: first detected lick after the delay. (B) Psychometric performance data on the evidence accumulation task. Gray lines, individual mice; black points, average across all trials from all animals (n = 38,615 trials, 12 mice). (C) Logistic regression analysis correlating animal choice with cues delivered at different time bins of evidence presentation, demonstrating that the entire cue period was used to guide decisions. Each point indicates the magnitude of that time bin’s influence on decisions (all points significantly greater than zero, Wald test, p<0.0001). For comparison, bins (gray points) or choices (shaded 1 s.d. gray zone) were shuffled. Error bars: 95% confidence interval. (D) Behavioral effect of bilateral injections of muscimol or saline into crus I, compared to baseline performance with no injections. Each set of joined points represents one mouse. Error bars: 95% confidence interval. *p<0.05, n.s.: not significant (two-tailed paired t-test). (E) Movie-based licking measurements from mice over the duration of trials. Bar heights show mean ±s.e.m. across animals of trial-averaged licking signals. (F) Example cranial window over the left posterior hemispheric cerebellum, indicating the site of imaging and inactivation.

https://doi.org/10.7554/eLife.36781.002
Video 1
Example trials of a mouse performing somatosensory evidence accumulation.

Flashes along the sides indicate air puffs delivered to the whiskers. Flashes along the bottom indicate detected licks.

https://doi.org/10.7554/eLife.36781.006

Mice learned to perform this task with high accuracy (Figure 1B, Figure 1—figure supplement 1). Behavioral regression analysis demonstrates that mice used evidence throughout the entire cue period to guide decisions, with a bias for evidence toward the end of the cue period (Figure 1C), similar to some recency strategies that have been documented in human evidence accumulation (de Lange et al., 2010). Like other tasks in which movement is minimal until a go cue is presented (Scott et al., 2017; Shadlen and Newsome, 2001), mice learned not to lick during evidence presentation (Figure 1E). This task can therefore be used to study working memory, evidence accumulation, and decision-making under head-fixed conditions.
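The time-bin regression mentioned above needs one predictor per bin of the cue period. A minimal sketch of how such predictors could be built is given here, assuming five equal-width bins and a signed count (right minus left puffs) per bin; the bin count, the sign convention, and the function name are illustrative assumptions, not details from the paper.

```python
def binned_evidence(left_times, right_times, n_bins=5, cue_period_s=3.8):
    """Signed puff count (#R - #L) in each equal-width bin of the cue period."""
    width = cue_period_s / n_bins
    counts = [0] * n_bins
    for t in right_times:
        counts[min(int(t / width), n_bins - 1)] += 1
    for t in left_times:
        counts[min(int(t / width), n_bins - 1)] -= 1
    return counts


# One left puff early, three right puffs spread across the cue period.
features = binned_evidence(left_times=[0.2], right_times=[0.5, 1.0, 3.7])
print(features)  # [0, 1, 0, 0, 1]
```

Regressing choice on these per-bin predictors would give one weight per bin; similar weights across bins would indicate that evidence from the whole cue period contributes to the decision.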

We focused our study on the ansiform area (crus I) (Luo et al., 2017) of the posterior hemispheric cerebellum (Figure 1F), a region that evolutionarily expanded in tandem with prefrontal cortex (Balsters et al., 2010) and communicates bidirectionally with forebrain regions including prefrontal, parietal, and somatosensory cortex (Kelly and Strick, 2003; Prevosto et al., 2010; Proville et al., 2014). This cerebellar region represents orofacial features under anesthesia (Manni and Petrosini, 2004; Shambes et al., 1978), suggesting that it might aid in processing complex task-related information. First, to determine whether activity in this cerebellar region participates in the decision-making behavior, we injected the GABA-A receptor agonist muscimol bilaterally into crus I. Inactivations reduced choice accuracy while leaving intact the ability to lick and perform trials (Figure 1D, Figure 1—figure supplement 2). To quantify the behavioral effects of the perturbation, we fit the inactivation data to a logistic regression model that considers the animal’s choice on a trial-by-trial basis as a function of current evidence, the previous trial choice and outcome, and a bias (Busse et al., 2011; Licata et al., 2017). Fits to this model suggest that inactivations altered multiple behavioral parameters, notably including a reduction in animals’ sensitivity to evidence and an increased tendency to make the same choice as in the previous trial (Figure 1—figure supplement 2C). Therefore, activity in this region is necessary for successful performance of the task, suggesting it may play a role in decision-making computations.
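To make the structure of such a choice model concrete, here is a small sketch that simulates choices from a logistic model with a bias, an evidence term, and previous-choice and previous-outcome terms, then recovers the weights by gradient ascent. The predictor coding (evidence scaled to [-1, 1], previous choice and outcome coded as ±1), the simulated weights, and the fitting routine are all assumptions for illustration; the exact parameterization used in the study may differ.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def fit_logistic(X, y, lr=1.0, n_iter=300):
    """Batch gradient ascent on the mean log-likelihood."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            err = yi - sigmoid(dot(w, xi))
            for j in range(d):
                grad[j] += err * xi[j]
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def simulate(true_w, n_trials, rng):
    """Rows are [bias, evidence, prev_choice, prev_outcome]; y = 1 for a right choice."""
    X, y = [], []
    prev_choice, prev_outcome = 1.0, 1.0
    nonzero = [e for e in range(-10, 11) if e != 0]
    for _ in range(n_trials):
        evidence = rng.choice(nonzero) / 10.0  # scaled (#R - #L), hypothetical range
        x = [1.0, evidence, prev_choice, prev_outcome]
        went_right = rng.random() < sigmoid(dot(true_w, x))
        X.append(x)
        y.append(1 if went_right else 0)
        prev_choice = 1.0 if went_right else -1.0
        prev_outcome = 1.0 if went_right == (evidence > 0) else -1.0
    return X, y


rng = random.Random(0)
X, y = simulate([0.2, 3.0, 0.8, 0.0], 1000, rng)
bias, w_evidence, w_prev_choice, w_prev_outcome = fit_logistic(X, y)
print(round(w_evidence, 2), round(w_prev_choice, 2))
```

In this framing, the reported inactivation effect would correspond to a smaller evidence weight and a larger previous-choice weight when the model is fit to muscimol sessions rather than control sessions.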

Purkinje cell somatic calcium encodes task-relevant information

In previously investigated brain regions, neurons exhibit choice- and evidence-specific modulations of activity over the duration of evidence accumulation and decision-making (Ding and Gold, 2012; Hanks et al., 2015; Latimer et al., 2015; Scott et al., 2017; Shadlen and Newsome, 2001). To test for choice- and evidence-related activity in Purkinje cells, we imaged somatic calcium, which follows modulations in simple-spike rate (Lev-Ram et al., 1992; Ramirez and Stell, 2016), using the genetically encodable calcium indicator GCaMP6f in mice performing the decision-making task (Figure 2A,B). We imaged a total of 843 Purkinje cell somata in four mice. We found a population of cells in which calcium activity was modulated during the cue period, exhibiting increases or decreases in fluorescence spanning the duration of evidence accumulation and decision formation (Figure 2C–E). In 70% of cells, cue-period fluorescence was better correlated with time than pre-cue-period fluorescence was (95% CI: 67–72%, bootstrap). This was significant compared to when cue and pre-cue period identity was shuffled (49% of cells; 95% CI: 46–52%). These modulations were sometimes evident at the level of individual trials (Figure 2—figure supplement 1). At the end of each trial, activity returned to baseline (p=0.91, two-tailed paired t-test).
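The modulation index used here is defined in the Figure 2C caption as the Pearson correlation between the trial-averaged signal and time in the cue period. A minimal sketch of that computation follows; the function names and the synthetic ramping traces are assumptions for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def modulation_index(trials, times):
    """Correlate the trial-averaged trace with time.

    trials: list of fluorescence traces, one per trial, sampled at `times`.
    """
    n_trials = len(trials)
    mean_trace = [sum(tr[i] for tr in trials) / n_trials for i in range(len(times))]
    return pearson_r(mean_trace, times)


times = [0.1 * i for i in range(38)]  # 0 to 3.7 s, inside the 3.8 s cue period
ramp_up = [[0.05 * t + offset for t in times] for offset in (0.0, 0.1, 0.2)]
ramp_down = [[-0.05 * t + offset for t in times] for offset in (0.0, 0.1, 0.2)]
print(modulation_index(ramp_up, times), modulation_index(ramp_down, times))
```

An index near +1 or -1 indicates a steadily rising or falling average trace, and comparing it with the same index computed on a pre-cue window is what distinguishes cue-period modulation from drift.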

Figure 2 (with 4 supplements)
Task-dependent modulation of Purkinje cell somatic calcium signals.

(A) Example two-photon field of view of Purkinje cell somata. (B) Traces of extracted calcium signals from somata indicated in (A). Shaded regions and ramps at top indicate cue periods. (C) Trial-averaged activity during evidence presentation from two example cells. Modulation index r was defined as the Pearson correlation between the averaged signal and time in the cue period. Confidence interval on traces indicates s.e.m. (D) Cue-period fluorescence modulation in all imaged somata (n = 4 mice, 843 cells). Modulation index r was computed preceding the cue period (‘pre-cue’) and during the cue period. (E) Trial-averaged activity during the cue period of neurons with the highest absolute modulation index (top 5%) in each session. ∆F/F signals are mean-subtracted. (F) Output of a linear decoder predicting the animal’s upcoming choice and the side with more evidence on a trial-by-trial basis using somatic data from the cue period of each trial. Each trace represents the mean ±s.e.m. (n = 6 sessions in four mice). Choice: side of the animal’s decision. Evidence: side with more evidence. Gray-shaded regions: cue period. Shuffle: relevant variable (choice or evidence, respectively) was shuffled across trials. Ind: relevant variable (choice or evidence, respectively) was shuffled while holding the other variable constant, to compute the independence of encoding of the relevant feature. *: p<0.01 (paired t-test using cue-period-only data).

https://doi.org/10.7554/eLife.36781.007

Cytoplasmic calcium acts as a temporally filtered readout of firing rate, and calcium extrusion in Purkinje cells occurs on a slower time scale (see Figure 3B; Konnerth et al., 1992; Lev-Ram et al., 1992; Fierro and Llano, 1996; Rokni and Yarom, 2009; Ramirez and Stell, 2016) than in neocortical neurons (Chen et al., 2013). Therefore, our observed increasing and decreasing time courses of calcium could reflect various firing rate profiles, such as impulse responses, ramps, or steps. We did find that electrically recorded Purkinje cells exhibited gradually increasing rates of firing throughout the cue period (Figure 2—figure supplement 2).
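The ambiguity described above can be illustrated with a simple leaky-integrator sketch, in which a firing-rate profile is passed through a single-exponential filter. The time constant is borrowed, as an assumption, from the 406 ms half-decay quoted in the Figure 3B caption; the single-exponential form, the rate values, and the time step are likewise illustrative and are not a model taken from the paper.

```python
import math

T_HALF_S = 0.406                 # half-decay quoted in the Figure 3B caption
TAU_S = T_HALF_S / math.log(2)   # equivalent exponential time constant (assumed model)
DT_S = 0.01                      # simulation step, assumed


def filter_rate(rate_hz, dt=DT_S, tau=TAU_S):
    """Exponentially filtered version of a firing-rate time series."""
    decay = math.exp(-dt / tau)
    level, trace = 0.0, []
    for r in rate_hz:
        level = level * decay + r * dt
        trace.append(level)
    return trace


n = int(round(3.8 / DT_S))                # samples in the cue period
step = [10.0] * n                         # rate steps up at cue onset
ramp = [20.0 * i / n for i in range(n)]   # rate ramps up linearly
impulse = [10.0] + [0.0] * (n - 1)        # brief burst at cue onset

step_trace = filter_rate(step)
ramp_trace = filter_rate(ramp)
impulse_trace = filter_rate(impulse)
print(round(step_trace[-1], 2), round(ramp_trace[-1], 2))
```

Under these assumptions both the step and the ramp produce a rising filtered trace over the cue period, so a rising fluorescence signal alone does not identify the underlying firing-rate profile.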

Figure 3 (with 1 supplement)
Purkinje cell representations of choice and evidence.

(A) Left: mean activity of four example somata during the cue period, split according to the choice made in each trial. Traces represent mean ± s.e.m. over all trials of a particular choice. Right: summary of the relationship between modulation index r and animal choice for all imaged cells. Red x’s: cells shown on left. (B) Top: mean cue-period activity in correct trials from one example cell, split according to the strength of evidence presented (strong: #L puffs > 9; weak: #L puffs < 2). Bottom: mean puff-triggered response of one example cell to left (L)- and right (R)-sided puffs. Mean t1/2 decay: 406 ms. Shading: s.e.m. (C) A linear model was used to determine the influence of left- and right-sided puffs on pre-decision fluorescence activity for each cell over all trials. Left: each dot represents one cell. Modulation: normalized coefficient of the linear fit between puff number and fluorescence. Colored data points indicate cells with significant coefficients. Right: Proportion of cells in each category on left. Shuffle: puff counts were shuffled across trials of the same choice before regression. Percent of modulated cells is significantly above the shuffle for the +L, +R and ±(L,R) conditions (p<0.0001, two-tailed z-test). (D) Mean cue-period activity in correct trials across all evidence-modulated cells, split according to the level of evidence presented in the trial (strong: #pref side puffs - #nonpref side puffs > 8; weak: #pref side puffs - #nonpref side puffs < -8).

https://doi.org/10.7554/eLife.36781.012

Elsewhere in the brain, neuronal activity during evidence accumulation and decision-making can encode behavioral variables of choice and evidence (Gold and Shadlen, 2007; Latimer et al., 2015; Scott et al., 2017). To determine whether predictive behavioral information was represented in the population activity of these Purkinje cells, we constructed linear classifiers based on all somatic signals in each animal. These classifiers accurately decoded the upcoming choice and the side with greater evidence (Figure 2F), indicating that as a population, the imaged neurons encode behaviorally relevant features of the decision-making process.

Because choice and evidence are correlated when mice successfully perform the task, we asked whether choice- or evidence-related information existed independently at the population level in the neuronal signals. To separate the two, we determined how decoding accuracy changed after removing information about one of the variables, by shuffling its identity across trials while holding the other variable constant. For example, when the choice on each trial was randomly assigned to another trial with the same sensory evidence, choice decoding accuracy dropped significantly (Figure 2F, top panel, top two traces). The difference in decoding accuracy between the original and shuffled data indicates the magnitude of independent choice-related information in the population-level neural activity. We performed the converse test as well, shuffling evidence while holding choice constant, and found that evidence-related information is also represented independently in population-level neuronal activity (Figure 2F, bottom panel). Therefore, somatic signals encode both choice- and evidence-related information.
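The conditional shuffle described above can be sketched as a permutation restricted to groups of trials that share the held-constant variable. In the sketch below the grouping key stands in for "same sensory evidence"; how evidence is grouped (by side, by exact puff counts, or otherwise) is an assumption, as are the function and variable names.

```python
import random
from collections import defaultdict


def shuffle_within_groups(labels, groups, rng):
    """Permute labels only among trials that share the same group value."""
    idx_by_group = defaultdict(list)
    for i, g in enumerate(groups):
        idx_by_group[g].append(i)
    shuffled = list(labels)
    for idx in idx_by_group.values():
        perm = idx[:]
        rng.shuffle(perm)
        for src, dst in zip(idx, perm):
            shuffled[dst] = labels[src]
    return shuffled


choices = ["L", "R", "R", "L", "R", "L"]
evidence = ["left", "left", "right", "right", "right", "left"]
shuffled_choices = shuffle_within_groups(choices, evidence, random.Random(0))
print(shuffled_choices)
```

After this shuffle the relationship between choice and evidence is preserved, but the link between choice and the neural activity on each individual trial is broken, so any drop in decoding accuracy can be attributed to choice information that is independent of evidence.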

The encoding of choice and evidence variables suggests that these neurons might play a role in decision-making computations. However, in an alternative hypothesis, the somatic signals we observed might represent motor behaviors that occur as an independent consequence of the decision-making process, for example by encoding motor commands for licking or other movements. The imaged region is known to encode primarily orofacial features in rodents (Bosman et al., 2010; Manni and Petrosini, 2004). To test for pre-decision movements, we used camera recordings to measure licking as well as five other motor behaviors during evidence accumulation for trials with differing evidence and choices (Figure 2—figure supplements 3 and 4, Videos 2 and 3). Licking, nose, whisker, and forepaw movements did not differ across trial types and were unable to predict choice and evidence variables. Therefore, Purkinje cells encode choice- and evidence-related variables with minimal information about predictive anticipatory movements.

Video 2
Measurement of orofacial movements from behavioral movies.

Traces represent the extracted movement metric (see Materials and methods) from the corresponding regions outlined in the movie.

https://doi.org/10.7554/eLife.36781.014
Video 3
Measurement of forepaw movements from behavioral movies.

Traces represent the extracted movement metric (see Materials and methods) from the denoted paws.

https://doi.org/10.7554/eLife.36781.015

Dynamics of choice- and evidence-related information in Purkinje cells

To determine how individual Purkinje cells represented choice, we examined their coding properties in left- and right-choice trials. In 80% (678/843) of cells, cue-period calcium was modulated in the same direction (i.e. upward or downward) without regard to whether the upcoming decision was left or right, while in the remaining 20% (165/843) of cells, activity for left choices and right choices was modulated in opposite directions (Figure 3A). In 30% (256/843) of cells, pre-decision fluorescence (measured in the 500 ms preceding the end of the delay) differed significantly between L-choice and R-choice trials (criterion p<0.05, two-tailed t-test). Of these choice-selective cells, 63% (162/256) exhibited greater activity in left-choice trials, compared with 37% (94/256) in right-choice trials. While recordings from a single (left) hemisphere might have been expected to produce strongly lateralized representations, these mixed representations of left and right choices are consistent with neocortical recordings in decision-making, particularly in frontal regions (Erlich et al., 2011).

We next asked how Purkinje cells represented evidence. We observed cells in which cue-period activity was modulated by the strength of evidence presented, and responses to individual sensory events were apparent in some cells as puff-triggered averages that rose and fell in approximately 1 s (Figure 3B). Therefore, to quantify the extent to which the strength of evidence affected the activity of each neuron, we used linear regression to fit trial-by-trial dependence of pre-decision fluorescence on evidence quantity (Figure 3C), where evidence was defined as the total number of right puffs (#R) or left puffs (#L) in a trial. Based on the significance and coefficients of these fits, each cell was categorized as having either a positive (+) or negative (-) relationship between fluorescence and evidence on the left (#L), right (#R), or both (#L,#R) sides. We found significant relationships in 26% (216/843) of neurons, with most cells exhibiting a correlation with single-sided evidence (+L, -L, +R, and -R; 178 cells), and a smaller number showing a correlation with a linear sum or difference of evidence (±(L,R)/±(L, -R), 38 cells). Therefore, individual cells were predominantly but not exclusively sensitive to evidence presented on one side, consistent with properties of some neocortical neurons in evidence accumulation (Scott et al., 2017). As a population, these evidence-modulated cells encoded the strength of evidence presented for decision-making (Figure 3D). In animals not performing the decision-making task, cue-period fluorescence modulation, evidence side decoding, and evidence modulation were not observed (Figure 3—figure supplement 1), indicating that the signals we observed are task-specific and are not consequences of baseline Purkinje cell response properties. 
The evidence-related representations we observed do not demonstrate precise moment-to-moment integration of evidence that is thought to occur in some forebrain neurons (Gold and Shadlen, 2007; Hanks et al., 2015), but they do suggest an engagement of the cerebellum in the processing of important task variables.
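The per-cell regression of pre-decision fluorescence on puff counts can be sketched as an ordinary least-squares fit of the form F = b0 + bL·#L + bR·#R. The code below solves the normal equations directly; it omits the significance testing used to categorize cells, and the function names and the synthetic noise-free example cell are assumptions for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_evidence_model(n_left, n_right, fluor):
    """Least-squares coefficients [b0, bL, bR] for F = b0 + bL*#L + bR*#R."""
    X = [[1.0, float(l), float(r)] for l, r in zip(n_left, n_right)]
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
    Xty = [sum(row[i] * f for row, f in zip(X, fluor)) for i in range(3)]
    return solve(XtX, Xty)


# Hypothetical "+L" cell: fluorescence grows with left puffs only.
n_left = [0, 2, 5, 9, 3, 7, 1, 4]
n_right = [8, 1, 3, 0, 6, 2, 9, 5]
fluor = [0.1 + 0.05 * l for l in n_left]
b0, b_left, b_right = fit_evidence_model(n_left, n_right, fluor)
print(round(b0, 3), round(b_left, 3), round(b_right, 3))
```

A cell with a significant positive bL and no significant bR would fall in the +L category, while significant coefficients on both sides would place it in one of the combined (L,R) categories.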

Error-associated signaling in Purkinje cell dendrites

In theories of cerebellar learning, the transformation of mossy-fiber input to Purkinje cell output is refined by climbing fiber-driven error signals which drive plasticity (Marr, 1969). These instructive error signals evoke calcium transients in Purkinje cell dendrites (Ozden et al., 2009; Tank et al., 1988) and an accompanying complex electrophysiological spike (Llinás and Sugimori, 1980). To test for task-related activity in this pathway, we imaged calcium using GCaMP6f in Purkinje cell dendrites (Figure 4A,B). We observed that in many cells, dendritic events occurred directly following the animal’s decision, specifically when that decision was an error (Figure 4C–E). In 82% of cells, the mean activity following errors exceeded that following rewards (p<0.0001, Wilcoxon signed-rank test). This increase in activity occurred in both left- and right-choice trials, in which sensory events differed, suggesting that the signal was reporting a task feature that was independent of pre-decision evidence.
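The paired comparison of post-error and post-reward activity across cells can be sketched as follows: compute the fraction of cells whose error response is larger, then apply a Wilcoxon signed-rank test. The implementation below uses the normal approximation without continuity or tie-variance corrections, which is a simplification; the per-cell values are synthetic and the function name is hypothetical.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Zero differences are dropped; tied |differences| receive average ranks.
    Returns (W_plus, z, p). Requires at least one non-zero difference.
    """
    d = [a - b for a, b in zip(x, y) if a != b]
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, z, p


# Synthetic per-cell mean activity after correct and error outcomes.
correct = [0.10 + 0.01 * i for i in range(20)]
error = [c + 0.02 + 0.005 * i for i, c in enumerate(correct)]

frac_error_higher = sum(e > c for e, c in zip(error, correct)) / len(correct)
w_plus, z, p = wilcoxon_signed_rank(error, correct)
print(frac_error_higher, w_plus, p < 0.001)
```

In practice a library routine with exact or corrected p-values would be preferable; the sketch is meant only to show what the paired, rank-based comparison measures.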

Figure 4 (with 1 supplement)
Purkinje cell dendrites encode decision errors.

(A) Example two-photon field of view of Purkinje cell dendrites. (B) Signals extracted from cells indicated in (A). Red ticks: dendritic calcium transients extracted from the bottom trace. (C) Activity of one cell in six trials, aligned to the moment of the decision lick. (D) Mean activity of one example cell aligned to the moment of the decision lick. Left: activity is divided into correct and error trials. Right: activity is further divided into left-choice and right-choice trials. Error shading indicates s.e.m. (E) Summary of mean activity in the 800 ms following reward delivery (correct trials) or lack thereof (error trials) (n = 6 mice, 599 cells). (F) Left: mean response of an example dendritic signal aligned to moments when licking ceased, split according to the outcome of the trial in which the lick cessation occurred. Right: histograms indicating the magnitude of dendritic activity measured at moments when animals ceased (top) or initiated (bottom) licking, presented as a ratio of activity in error vs correct trials; cells with values greater than one exhibited increased activity when lick-cessation/initiation events occurred with errors, in comparison to the same motor event in correct trials. Error activity is elevated in a significant fraction of cells for all four histograms shown (p<0.0001, Wilcoxon signed-rank test). (G) Outcome (correct/error) decoding on a trial-by-trial basis using neuronal population activity in the period following reward delivery or lack thereof (post-choice), or the period preceding the decision (pre-choice). One line per behavioral session (n = 7 sessions, six mice). Thick lines: mean across sessions.

https://doi.org/10.7554/eLife.36781.016

When mice make errors, their behaviors, especially their licking patterns, differ from correct trials. We therefore tested whether the elevation in dendritic signaling could reflect a purely motor event such as a lick-cessation signal, since mice cease licking at moments of error (Figure 4—figure supplement 1). Such a lick-cessation signal should occur not just at moments of error, but whenever licking ceases, including in correct trials. We therefore measured dendritic signals at every instance of lick cessation in both error and correct trials, and compared the magnitude of the signals across the two contexts. We found that lick-cessation-aligned dendritic signaling was significantly elevated in error trials relative to correct trials (Figure 4F), indicating that our results are not explained by lick-cessation signals. We tested a number of similar hypotheses, including lick initiation events, orofacial movements, varying licking magnitudes, and trials in the absence of auditory cues, and found that dendritic signals were also significantly error-modulated in all cases (Figure 4F, Figure 4—figure supplement 1). Thus dendritic events encode an error-associated signal that is not specific to measured parameters of movement.

Dendritic signaling was consistently elevated in error trials relative to correct trials across varying trial difficulties, with a modest but non-significant reduction in magnitude in trials with stronger evidence (Figure 4—figure supplement 1). These error-associated events may represent a training signal that could guide learning (Schultz et al., 1997). Indeed, we found that the population of Purkinje cells could decode trial outcome (correct or error) on a trial-by-trial basis (Figure 4G, post-choice correct/error decoding greater than shuffle and pre-choice conditions, p<0.01, two-tailed paired t-test).

Discussion

The present work reports the necessity of the cerebellum in an evidence-accumulation-based decision-making task. We have identified two Purkinje-cell signals that may contribute to this process: somatic activity that reflects evidence and choice, and dendritic signals that report errors. This convergence of task-relevant information onto Purkinje cells suggests that cerebellar activity may play important roles in decision-making, consistent with established hypotheses of cerebellar function in complex domains (Ito, 2008).

Cerebellar crus I communicates with numerous forebrain structures including somatosensory, frontal, and parietal regions (Prevosto et al., 2010; Proville et al., 2014; Strick et al., 2009) via the ventral dentate nucleus (Bernard et al., 2014; Parker et al., 2017) and thalamic intermediates (Asanuma et al., 1983; Dum and Strick, 2003). Posterior hemispheric cerebellar cortex and its principal target, the dentate nucleus, show sensorimotor activity relating to whisker sensation (Bosman et al., 2010), licking (Gaffield et al., 2016), and reward (Wagner et al., 2017), as well as preparatory activity (Middleton and Strick, 1998; Popa et al., 2017) and firing rate ramps (Ashmore and Sommer, 2013) that can influence thalamocortical circuits (Parker et al., 2017). The cerebellum is thought to use information from elsewhere in the brain to form internal models that predict and modulate brain activity (Ito, 2008; Marr, 1969; Wolpert et al., 1998). In the context of our decision-making task, the lateral posterior cerebellum may receive evidence/decision-related efference copy from forebrain regions, where evidence and decision-related variables have been observed and proposed to support decision-making (Ding and Gold, 2012; Hanks et al., 2015; Licata et al., 2017; Morcos and Harvey, 2016; Shadlen and Newsome, 2001). Thus, the cerebellum is positioned to be part of a closed feedback-loop circuit in which it both receives and sends task-related information.

Previous studies have established a sophisticated conceptual framework for understanding the computational basis for evidence accumulation and decision-making (Brunton et al., 2013; Gold and Shadlen, 2007; Juavinett et al., 2018; Morcos and Harvey, 2016; Scott et al., 2017). Our results are a first stage of discovery suggesting that the cerebellum may constitute an additional node in the distributed network of regions that support this process (Pinto et al., 2018b). Muscimol disrupted the proportion of correct choices without disrupting the ability to make a choice. This suggests that the lateral posterior cerebellum modulates not the mechanics of action, but rather processes that precede the brain’s commitment to act. Our fits to a behavioral choice model indicate that reduced performance was accompanied by a decreased weighting of evidence and increased weighting of choice history parameters. The increased dependence on trial history is interesting in light of recently reported sensory history effects in parietal cortex (Akrami et al., 2018). Complementary to association areas in the neocortex, signals emerging from the cerebellum are known to reach thalamic targets which send widespread projections throughout the brain (Strick et al., 2009), situating the cerebellum in a position to modulate one or many components of forebrain processing.

Our imaging of Purkinje cell somata revealed ramps of fluorescence. Cytoplasmic calcium acts as a temporally filtered readout of firing rate, limited by calcium removal times that are slower in Purkinje cells (see Figure 3B; Konnerth et al., 1992; Lev-Ram et al., 1992; Fierro and Llano, 1996; Rokni and Yarom, 2009; Ramirez and Stell, 2016) than in neocortical neurons (Chen et al., 2013). Preliminary electrical recordings also showed ramps, consistent with the idea that temporally filtered firing rate ramps may account for the observed fluorescence signals.

These somatic signals represented task-relevant information related to choice and evidence variables, although it remains an open question whether these signals precisely track accumulated evidence over time. They could exhibit firing ramps (Shadlen and Newsome, 2001), steps (Latimer et al., 2015), or more complex response profiles that form a temporal basis for evidence accumulation (Scott et al., 2017). The evidence representations we observed were primarily of single-sided evidence, consistent with neural recordings in PPC and FOF of rats performing a similar task (Scott et al., 2017). This could indicate that cerebellar involvement is upstream of the calculation of the decision variable (#R-#L), and we consider this a likely possibility. It is also notable that some studies (Scott et al., 2017; Scott et al., 2015) have suggested that the decision-making process may be supported by two weakly coupled single-sided accumulators, which may modify the interpretation of our results. In either case, the neural activity we observed contains task-relevant information that may be used during evidence accumulation and decision-making.

Cerebellar theories propose that the mossy fiber-granule cell pathway encodes contextual or efference-copy signals which are used to generate short-term predictions (Shadmehr et al., 2010). Climbing fiber activity may shape the processing of granule cell inputs by inducing plasticity at multiple cerebellar sites (Albus, 1971; Marr, 1969; Medina and Lisberger, 2008). For example, climbing fiber-derived error signals may modify synaptic weights at parallel fiber-Purkinje cell synapses, providing a mechanism for weighting the contextual signals entering the cerebellum. This motivated us to ask whether in this task, signals may be observed in this pathway that report errors, for example of the outcome of the animal’s choice. We observed an excess of dendritic calcium events coincident with decision errors, which has not been previously reported in the cerebellum. If the error-associated response were analogous to dopamine reward prediction errors, one might have expected strong modulation of the error response magnitude by trial difficulty, with the easiest trials producing the largest error response. However, no such trend was apparent.

We suppose that these dendritic signals might not represent graded information but rather a more categorical signal for updating the cerebellar representation. It might alternatively be the case that the slight but non-significant trend we did observe, which appears inverted relative to traditional reward error signals, could be an example of inverted reward signalling seen elsewhere in the brain (Cohen, 2007; Matsumoto and Hikosaka, 2007). In all cases, the consequences of this error signalling could be reflected in the behavioral learning of the animal, as found via trial-by-trial analyses in some motor tasks (Brooks et al., 2015; Medina and Lisberger, 2008; Ten Brinke et al., 2017), but such effects are difficult to resolve in decision-making tasks like ours where learning is slow, spanning a period of many days or weeks.

The involvement of the cerebellum, with its clearly delineated cell types and connectivity (Dean et al., 2010; Ito, 2012), opens many attractive avenues for future studies in decision-making. The data presented in this study suggest multiple possible roles for cerebellar involvement in evidence-accumulation-based decision-making. For example, output signals from the cerebellum may be combined with signals in sensory circuits to control the input gain of sensory information into accumulators elsewhere in the brain. This model would be consistent with observations of cerebellar involvement in gating sensory information (Apps et al., 1997; Ozden et al., 2012) and inputs to working memory (Baier et al., 2014; Sobczak-Edmans et al., 2016). In a second possibility, the cerebellum may modulate dynamics of the accumulation process. Finally, cerebellar signals may modulate activity that converts the accumulator value into a decision. Such a post-categorization influence has been observed in prefrontal regions during evidence accumulation (Erlich et al., 2015). Detailed inactivation studies with high spatial and temporal precision can resolve these alternatives. In all cases, activity from the cerebellum may be combined with activity in forebrain structures to produce a refined signal that is more likely to yield a reward.

Materials and methods

Mice

Experimental procedures were approved by the Princeton University Institutional Animal Care and Use Committee and performed in accordance with the animal welfare guidelines of the National Institutes of Health. Data for the behavioral task came from 12 mice (six female, six male, 8–9 weeks of age at the start of experiments) of genotypes Pcp2-Cre (five mice, Pcp2-Cre line derived from The Jackson Laboratory, Stock #010536, RRID:IMSR_JAX:010536) and Pcp2-Cre-Ai148 or Ai148 (seven mice, Ai148 line acquired from Hongkui Zeng, Allen Brain Institute); for Purkinje cell dendritic imaging from six mice (four male, two female; 5 Pcp2-Cre, 1 Pcp2-Cre-Ai148), for Purkinje cell somatic imaging from six mice (two female, four male; 3 Pcp2-Cre, 3 Pcp2-Cre-Ai148), for inactivation experiments from six separate mice (three female Pcp2-Cre-Ai148, two male Pcp2-Cre-Ai148, one female Ai148; one was used in behavioral data but was never subjected to inactivation), and for electrophysiology experiments from another three mice (three male Pcp2-Cre-Ai148). Mice were housed in a 12 hr:12 hr reverse light:dark cycle facility, and experiments were performed during the dark cycle. During the experimental day, mice were housed in darkness in an enrichment box containing bedding, houses, wheels (Bio-Serv Fast-Trac K3250/K3251), climbing chains, and play tubes. At other times, mice were housed in cages in the animal facility, in groups of 2–4 mice per cage. Mice received 1.0–1.4 mL of filtered water per day. Body weight and condition was monitored daily.

Surgical procedures

Mice were anesthetized with isoflurane (5% for induction, 1.0–2.5% for maintenance) and underwent surgical procedures lasting 3–4 hr. For mice in imaging experiments, a 3-mm-diameter craniotomy was drilled over the left posterior hemispheric cerebellum. In Pcp2-Cre imaging mice, AAV1.CAG.Flex.GCaMP6f.WPRE.SV40 (Penn Vector Core) virus was injected into crus I and surrounding regions, 220–280 µm below the brain surface (two injections, 200 nL per injection at 20 nL/min), using borosilicate glass pipettes (World Precision Instruments, 1B100F-4, 1/0.58 mm OD/ID) beveled to 30 degrees with a ~ 10 μm tip opening, and an automated injector system (World Precision Instruments Micro4). In all imaged mice, a window composed of a cannula (Ziggy’s Tubes and Wires, 316 S/S Hypo Tube 9R GA. 0.1470/0.1490’ OD x 0.1150/0.1200’ ID x 0.0197’ long) glued (Norland Optical Adhesive 71) to a glass coverslip (Warner Instruments 64–0720) was cemented atop the craniotomy, then a custom-machined titanium headplate (Dombeck et al., 2007) was cemented to the skull using dental cement (C and B Metabond, Parkell Inc.). In Pcp2-Cre mice for inactivations, small (~375 µm radius) craniotomies were drilled over left and right crus I, and guide pedestals (Plastics One, C315GS-5/0-.4 Guide 26GA 5 mm pedestal, cut to 0 mm) were implanted over each. After surgery, dummy cannulas were kept in guide pedestals at all times when injections were not being performed, and were changed approximately every 3 days. For electrophysiology experiments, headplates as described above were implanted, and a 2 mm craniotomy was drilled over crus I and the dura removed. Per animal, one 64-channel silicon probe (Neuronexus, Buzsaki64-H64LP_30 mm in two mice and A4 × 16-Poly2-5mm-23s-200–177 H64_30 mm in one mouse) was placed on a custom-printed probe holder (designed in Blender and printed on a Formlabs Form2 3D printer), and lowered into crus I. 
During lowering of the probe, recordings were performed as described below to determine the final location of the probe. Ground and reference wires were attached to two stainless steel screws (000–120 × 1/16 SL bind machine screws, Antrin miniature specialties, Inc) above the forebrain. Absolute Dentin (Parkell) was used to secure the probe to the skull, and to create a well which was filled with silicone gel (Dow Corning 3–4680) and ophthalmic ointment (Puralube) to cover the craniotomy before being further secured using dental cement (C and B Metabond, Parkell Inc.) and dental acrylic (Jet Denture Repair, Lang Dental Manufacturing Co.). All animals were given buprenorphine (0.1 mg/kg body weight) and rimadyl (5 mg/kg body weight) after surgery and were given at least 5 days of recovery in their home cages before the start of experiments.

Equipment

Behavioral training, inactivation, and imaging experiments were performed in a light-proof and sound-dampened chamber, with white noise playing at all times. Mice were seated in a custom aluminium tube (9 cm long, 3.5 cm inner diameter) with their head protruding out the front and an opening on the top of the tube for access of the imaging objective. For two mice in one session each, the mice stood on a cylindrical treadmill instead of the tube. Mice were head-fixed using custom head bars screwed into the head plate, angled with the anterior end downward at 18.5 degrees from the horizontal. A set of custom-machined polyoxymethylene alignment tools was used to align the headplate in the lateral axis and in yaw for precise repositioning across days.

Air puffs were produced by activation of solenoids (NResearch, standard two-way normally closed isolation valve, 161T011) with input from an air source regulated to 10 psi (ControlAir Type 850 Miniature Air Pressure Regulator). Air was delivered via two tubes (Ziggy’s Tubes and Wires, 316 S/S Hypo Tube 16T GA. 0.0645/0.0655’ OD x 0.0525/0.0545’ ID) custom-machined with uniform openings, and positioned parallel to one another, parallel to the anteroposterior axis of the animal, 10 mm apart mediolaterally and ~1 mm anterior to the nose of the animal. Water rewards were produced via activation of similar solenoids, and delivered via similar tubes positioned 4.5 mm apart from one another mediolaterally, and 0.5–1.0 mm anterior to the opening of the animal’s mouth. Licks were detected via completion of one of two parallel electrical circuits for the left and right ports. Puff ports and lick ports were positioned in individual custom-machined polyoxymethylene brackets. The lick port bracket was mounted to a linear actuator (Actuonix, L16/P16 Mini linear actuator with feedback) which enabled retraction of the ports to and from the reach of the animal’s tongue (within approximately 300 ms). The ports, brackets, and actuator were mounted via a custom-machined bracket to a micromanipulator (Sutter Instrument Company, MP-285 motorized micromanipulator) for precise positioning of the apparatus in a unique position for each animal. The entire experimental apparatus was mounted on a translatable stage (Danaher Precision Systems) atop a rotating optical breadboard (Thorlabs, RBB12A) to allow custom positioning in the x, y, and rotational axes for imaging. Mice were positioned according to a set of unique coordinates for each mouse on the stage, micromanipulator, and alignment pieces, which were maintained across behavioral sessions.

Behavioral movies were acquired using two USB cameras (Playstation Eye), modified by removal of infrared filters and encasings. One camera was positioned directly below the animal’s mouth and the other at the side of the animal’s face. Images were acquired at 30 Hz with 320 × 240 pixel resolution. Illumination was provided by an infrared LED array (Yr.seasons 48-LED Illuminator Light CCTV 850 nm IR Infrared Night Vision). Sounds were delivered to the apparatus by a speaker (Sony Tweeter XS-H20S) mounted below the apparatus.

All behavioral equipment and data collection were controlled by custom multi-process software written in Python (https://github.com/wanglabprinceton/accumulating_puffs; Deverett, 2018a; copy archived at https://github.com/elifesciences-publications/accumulating_puffs). A DAQ board (National Instruments, NI PCI-MIO-16E-4) was used to deliver and read electrical signals from the experimental apparatus. Solenoids were controlled using digital outputs to custom transistor-based switch circuits. Electrical signals corresponding to licks, solenoid signals, and microscope galvanometer position were acquired using analog inputs at 500 Hz. Camera signals were acquired using a custom Python wrapper to the CLEye API (Code Laboratories, https://codelaboratories.com/downloads). The linear actuator was controlled using an LAC board (Actuonix) and USB control via a custom Python wrapper to the C API (available at https://www.actuonix.com/LAC-Board-p/lac.htm). The micromanipulator was controlled using a Python wrapper to the serial control interface. Experiments were monitored live using a custom interface to display trial information, performance, video input, and electrical signals.

A second computer was used to control the two-photon microscope (Sutter Instrument Company, movable objective microscope with resonant scanning) using the MATLAB software ScanImage 2015 (Pologruto et al., 2003) (Vidrio Technologies, RRID: SCR_014307). Excitation light was provided by a Mai Tai Sapphire laser (Spectra-Physics) at 920 nm. A 16x objective lens (Thorlabs, 16X Nikon CFI LWD Plan fluorite objective, 0.80 NA, 3.0 mm WD) was used with ultrasound gel (Sonigel, Mettler Electronics) as the immersion medium. Excitation power measured at the output of the objective lens ranged from 10 to 50 mW. Images were acquired at 28 Hz or 56 Hz, 512 × 512 or 256 × 512 pixel resolution.

Two forms of synchronization signal were sent from the behavior computer to the imaging computer during imaging. The first was a TCP/IP signal indicating animal and session identity. The second was an I2C-based signal routed through a National Instruments card (NI USB-8451). Signals were sent at multiple timepoints throughout each trial, delivering information corresponding to individual defined moments in the trial, which were then embedded in microscope image frames via ScanImage I2C functionality.

Extracellular recordings were performed using 64-channel Neuronexus silicon probes, which were connected to two amplifier boards (RHD2132, Intan Technologies) using a dual headstage adapter (RHD2000, Intan Technologies). Recordings were made using an Open Ephys acquisition board at a sampling rate of 30 kHz. Similar to imaging, synchronization pulses containing information about timing of the puffs, licks, and rewards were routed through a National Instruments card (NI USB-8451) and connected to the Open Ephys acquisition board using an I/O board (Open Ephys).

Behavior

Task

The evidence accumulation task was based on a task used in rats (Brunton et al., 2013). Mice were trained and imaged in 40- to 70 min behavioral sessions, corresponding to 200–300 trials. A session consisted of trials each ranging from 10 to 20 s in duration. In the first phase of the trial, a start tone was presented, followed by a 1 s delay. Next, in the ‘cue period,’ puffs of air (40 ms duration) were delivered to the whiskers over a period ranging from 1 to 5 s (for all but two sessions, this duration was 1.5 or 3.8 s with 0.15 and 0.85 probability respectively), randomly selected in each trial. The total number of puffs presented was determined by a Poisson process with a rate of 2.5 Hz, a ratio of 1:4 puffs on the two sides, and constrained to a minimum inter-puff interval of 200 ms on each side. In addition to these puffs, bilateral puffs were simultaneously delivered at the start and end of the cue period of every trial. The correct side for a given trial was defined as the side with more puffs, regardless of the generative rates for the two sides. This was followed by a delay period of 200 ms. Then, in the decision phase, the lick ports were brought into the range of the mouse’s mouth, and the mouse made a decision by licking the left or right port at any point over a period of 3 s. The decision was defined as the side of the port licked first, regardless of subsequent licks. If the animal licked the correct port, a water reward (4 µL) was immediately dispensed from that port. If the animal licked the incorrect port, an error sound played and no water was delivered. Following the decision was a 3 s phase for the consumption of the reward, if the animal received one. In most experiments, ports were maintained in the lick position for the same duration even in the absence of a reward, but in some sessions, ports were immediately retracted after errors. 
Following the reward phase was an intertrial interval of 3.5 or 6 s for correct and error trials, respectively. If the mouse made contact with either lick port at any point before the decision phase of the trial, an auditory tone was played, and the trial was cancelled and excluded from analyses. If the mouse made no decision lick, the trial was excluded from analyses.
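As an illustration of the cue-period statistics described above, the following Python sketch draws puff times for a single trial. It is not the task code itself (which is in the repository cited under Equipment). The function names, the way the 200 ms minimum interval is enforced (dropping puffs that fall too close to the previously kept one), the random assignment of the high-rate side, and the treatment of ties are all assumptions made for illustration. The bilateral start and end puffs are omitted.

```python
import numpy as np

def draw_puff_times(rate_hz, duration_s, min_interval_s=0.2, rng=None):
    """Draw Poisson event times, then drop events closer than
    min_interval_s to the previously kept event (assumed scheme)."""
    rng = np.random.default_rng() if rng is None else rng
    n = rng.poisson(rate_hz * duration_s)
    times = np.sort(rng.uniform(0.0, duration_s, size=n))
    kept = []
    for t in times:
        if not kept or t - kept[-1] >= min_interval_s:
            kept.append(t)
    return np.array(kept, dtype=float)

def make_trial(duration_s=3.8, total_rate_hz=2.5, ratio=(1, 4), rng=None):
    """Return (left_times, right_times, correct_side) for one trial."""
    rng = np.random.default_rng() if rng is None else rng
    low, high = ratio
    # Split the total rate 1:4, i.e. 0.5 Hz and 2.0 Hz for a 2.5 Hz total.
    rates = np.array([low, high], dtype=float) * total_rate_hz / (low + high)
    if rng.random() < 0.5:  # assumption: high-rate side chosen at random
        rates = rates[::-1]
    left = draw_puff_times(rates[0], duration_s, min_interval_s=0.2, rng=rng)
    right = draw_puff_times(rates[1], duration_s, min_interval_s=0.2, rng=rng)
    # The correct side is the side with more puffs, whatever the rates were.
    if len(left) == len(right):
        correct = None  # tie handling is not specified in the text
    else:
        correct = "L" if len(left) > len(right) else "R"
    return left, right, correct
```

Because the correct side is defined by the realized puff counts, a trial generated with a higher right-side rate can still be a "left" trial.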

Training

Mice were trained to perform the task through a behavioral shaping procedure which typically spanned a period of 10–20 days. In the first stage of training, trials consisted of a 1 s cue period with the same puff rates as the final task, except all puffs following an initial bilateral puff were presented on the correct side only. A water reward was delivered on the correct side in each trial regardless of the actions of the mouse. When a mouse consumed 14 consecutive rewards, it was advanced to the next stage. In the next stage, trial structure remained similar to the previous stage, except rewards were delivered only if the mouse licked the port on the correct side for a given trial at any point in the 3 s response window. In addition, the delay period was extended to 1.3 s during which periodic (2.5 Hz) guide puffs were delivered on the correct side from the end of the cue period until the animal made a decision. To advance, mice completed at least 100 trials at this stage, with 55% correct in a window of 40 consecutive trials, where a correct trial was defined as a match between the side licked first and the correct side for that trial. In the next stage, trial structure remained similar, except bilateral puffs were included at the end of the cue period. In this stage, rewards were only delivered if the first detected lick was on the correct side. As in the final version, error trials included a buzzer sound and a prolonged inter-trial interval. To advance, mice completed at least 200 trials at this stage, with at least 80% correct trials in a consecutive window of 50 trials. In the next stage, the cue period duration was selected randomly as either 2, 2.8, or 3.8 s, with 0.4, 0.3, and 0.3 probability respectively, with a delay period of 200 ms, and guide puffs were no longer delivered. After 100 trials at this stage, and at least 75% correct in a window of 40 trials, mice were advanced. In the next stage, the cue period was its final 3.8 s duration. 
After a minimum of 125 trials at this stage, and at least 80% correct trials in a window of 25 consecutive trials, mice were advanced to the next stage. In this stage, mice were required to accumulate evidence, with a generative evidence rate ratio of 1:9 for the two sides. When mice completed at least 300 trials and at least 80% correct in 40 consecutive trials, they were advanced to the final task (described above).

Four anti-biasing procedures were implemented during the behavior. First, the probability of drawing a trial on a given side was weighted to produce more trials on the side with worse performance. This mechanism was implemented if in a window of 6 consecutive trials on each side, performance (fraction correct) on one side was 1.5x greater than on the other. In that case, the draw probability of a trial on the worse side was set equal to the ratio of performances on the better to the worse side within the six-trial anti-biasing window. Second, when the same bias metric exceeded 2x, a larger water reward volume (either 1.2 or 1.4 µl, chosen randomly) was given with 60% probability. Third, if bias persisted at the 2x threshold for five consecutive trials, the experimental apparatus was shifted 50 µm toward the side of better performance to bring the stimuli and lick ports of the worse side closer to the animal. At the start of each session, a set of 15–30 warm-up trials were delivered drawing from one or two stages preceding the current one in the shaping procedure, and in warm-up trials of the first four shaping stages, the correct side, instead of being randomized, was alternated. Mice were trained for a maximum of 70 min and a minimum of 40 min unless the animal stopped licking for extended periods of time sooner, in which case they were dismounted early. Occasionally, manual rewards were delivered to encourage licking, and stage advancements or decrements were performed manually to suit animal performance.

Muscimol inactivation experiments

Mice were trained until they achieved at least 70% correct on the final version of the task in two consecutive sessions. Mice were then subjected to injection sessions, corresponding to saline or muscimol in a randomized order across mice. On injection days, mice were anaesthetized using isoflurane (2–3% for induction and maintenance) for approximately 20 min, during which 100–120 nL of either muscimol (Sigma M1523, 2 mg/ml in sterile saline) or saline were delivered through the injection cannulas (Plastics One, Internal 33GA, 0.9 mm proj) at a rate of 50 nL/min, using an automated injection system (World Precision Instruments Micro4). The injection cannula was left in place for 3 min following each injection to allow diffusion of the solution into the brain. Approximately 1 hr post-injection, mice were mounted on the rig and performed a behavioral session. On at least 2 days prior to the first injection, mice were anesthetized with the same protocol prior to their behavioral sessions in order to acclimatize them to anesthesia. When mice performed below 65% on a session, one or two recovery sessions with no injections were subsequently given such that the animal returned to at least 70% performance, before any additional injection sessions. Confidence intervals for performance in each session were computed as binomial proportion confidence intervals using the Jeffreys method.
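For reference, a Jeffreys interval for a binomial proportion can be computed from the quantiles of a Beta(k + 0.5, n − k + 0.5) distribution, where k is the number of correct trials and n the number of trials. The sketch below uses SciPy; the function name and the 95% level (alpha = 0.05) are assumptions, since the confidence level is not stated in the text.

```python
from scipy.stats import beta

def jeffreys_interval(n_correct, n_trials, alpha=0.05):
    """Jeffreys interval for a binomial proportion.

    alpha=0.05 (a 95% interval) is an assumed default.
    """
    a = n_correct + 0.5
    b = n_trials - n_correct + 0.5
    lower = float(beta.ppf(alpha / 2, a, b))
    upper = float(beta.ppf(1 - alpha / 2, a, b))
    # Conventional boundary adjustment when all or no trials are correct.
    if n_correct == 0:
        lower = 0.0
    if n_correct == n_trials:
        upper = 1.0
    return lower, upper
```

For example, `jeffreys_interval(150, 200)` gives an interval around a session performance of 75% correct.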

Following the completion of the study (by approximately 3 weeks), mice were injected under the identical protocol with a fluorescent solution (1.8 mg/mL fluorescein (Fisher Scientific S25328) and 0.5 mg/mL CTB-555 (Invitrogen C22843) in sterile saline) to recover the location and spread of the muscimol injection site. After approximately 1 hr, mice were perfused and brains were extracted. Brains were cleared using iDISCO (Renier et al., 2014) and imaged on a lightsheet microscope (LaVision BioTec Ultramicroscope II, 488 nm excitation laser, 514/30 nm emission filter (Semrock FF01-514/30-25)).

Data analysis

Software

Data analysis was performed using custom analysis packages written in Python (RRID:SCR_008394) 3.5 and 3.6 (https://github.com/bensondaled/pyfluo; Deverett, 2018b; copy archived at https://github.com/elifesciences-publications/pyfluo), which make use of Numpy 1.12.1 and Scipy 0.19.1 (van der Walt et al., 2011), Pandas 0.20.3 (McKinney, 2010), Matplotlib 2.0.2 (Hunter, 2007), IPython 6.1.0 (Perez and Granger, 2007), Scikit-learn 0.18.1 (Pedregosa et al., 2011), Scikit-image 0.13.0 (scikit-image contributors et al., 2014), OpenCV 3.3.0 (OpenCV team, 2017), Statsmodels 0.8.0 (Perktold et al., 2017), and OASIS (Friedrich et al., 2017).

Psychometrics

Figure 1B shows psychometric performance data for individual mice and for the ‘meta-mouse,’ consisting of pooled trials from all trained mice. Data for psychometrics were obtained only from trials in the accumulation stages of the task, and not from the preceding stages during the shaping procedure. All analyses contain only trials in which mice made decision licks, such that incorrect trials correspond to licks in the wrong direction, and never the absence of licks. Psychometric curves in Figure 1—figure supplement 2D were fit to a four-parameter logistic function of the form:

$$y(x) = y_0 + \frac{A}{1 + e^{-(x - x_0)/b}}$$
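A four-parameter logistic of this kind can be written as a small Python function. The sketch below assumes that b acts as a width parameter dividing (x − x0) in the exponent; the function name is hypothetical.

```python
import numpy as np

def logistic4(x, y0, A, x0, b):
    """Four-parameter logistic: offset y0, amplitude A, midpoint x0, width b.

    Assumption: b divides (x - x0) in the exponent.
    """
    x = np.asarray(x, dtype=float)
    return y0 + A / (1.0 + np.exp(-(x - x0) / b))
```

A function with this signature could be passed to a standard least-squares routine such as `scipy.optimize.curve_fit`, with x as the evidence (#R-#L) and y as the fraction of rightward choices; whether the original fits were obtained this way is not stated.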

Behavior regression analysis

To determine the dependence of animal choice on stimuli in different temporal bins of the cue period, we performed a regression-based analysis. A logistic regression was computed, with animal decision on a trial-by-trial basis as the predicted variable. The input for each trial was a vector of 5 values, corresponding to the difference in right vs left puffs in five temporally uniform bins of the cue period. Data for regression analysis consisted of trials with the primary cue period duration of 3.8 s. In the ‘shuffle bins’ control, the vector of R-L values was shuffled across bins for each trial. In the ‘shuffle choices’ control, the choices of the animal were shuffled across trials.
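The regressors for this analysis, and the two shuffle controls, can be sketched in Python as follows. The function names are hypothetical and puff times are assumed to be given in seconds relative to cue-period onset.

```python
import numpy as np

def binned_evidence(left_times, right_times, cue_duration_s=3.8, n_bins=5):
    """Return #R - #L puffs in n_bins temporally uniform bins of the cue period."""
    edges = np.linspace(0.0, cue_duration_s, n_bins + 1)
    n_left, _ = np.histogram(left_times, bins=edges)
    n_right, _ = np.histogram(right_times, bins=edges)
    return n_right - n_left

def shuffle_bins(X, rng):
    """'Shuffle bins' control: permute the R-L values across bins within each trial."""
    return np.array([rng.permutation(row) for row in X])

def shuffle_choices(choices, rng):
    """'Shuffle choices' control: permute the animal's choices across trials."""
    return rng.permutation(np.asarray(choices))
```

Stacking one such vector per trial gives a trials × 5 matrix that can be supplied, together with the choices, to a logistic regression (for example `sklearn.linear_model.LogisticRegression`); the specific estimator and its settings are not given in the text.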

Behavioral regression model

Effects of the muscimol inactivation were assessed by fitting behavioral data to a regression model (Busse et al., 2011; Licata et al., 2017) that considers four factors which contribute to the animals’ decisions: stimulus strength, s; bias; success (correct decision) on the previous trial, $h_{\text{success}}$; and failure (incorrect decision) on the previous trial, $h_{\text{failure}}$. Stimulus strength s was defined as the #R-#L puffs scaled from -1 to 1. The history term $h_{\text{success}}$ was 0 if the previous trial was incorrect and -1 or 1 if the previous trial was a correct left or right choice respectively. $h_{\text{failure}}$ was 0 if the previous trial was correct and -1 or 1 if the previous trial was an erroneous left or right choice respectively. The decision was modeled as a random variable such that the probability p of a rightward choice in the current trial was modeled as

$$\ln\frac{p}{1-p} = b_{\text{evidence sensitivity}}\, s + b_{\text{success history}}\, h_{\text{success}} + b_{\text{failure history}}\, h_{\text{failure}} + b_{\text{bias}}$$

The four coefficients $b_{\text{evidence sensitivity}}$, $b_{\text{success history}}$, $b_{\text{failure history}}$, and $b_{\text{bias}}$ were fit separately for each muscimol session and control session directly preceding it, by a binomial generalized linear model with a logit link function. To compute the significance of the behavioral effects caused by muscimol inactivation, each of the four best-fit parameters from the muscimol trials was compared to the best fits for that parameter in 10,000 bootstrapped datasets from the baseline trials, and the p-value was computed as the fraction of instances in which the muscimol fit exceeded the baseline fit.
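The four regressors of this model can be assembled from per-trial data as in the Python sketch below. The function name is hypothetical, and two details are assumptions: the stimulus strength is scaled by the largest |#R-#L| in the dataset, and both history terms are set to 0 on the first trial of a session.

```python
import numpy as np

def design_matrix(n_right, n_left, choice_right, was_correct):
    """Columns: s, h_success, h_failure, constant (bias)."""
    diff = np.asarray(n_right, dtype=float) - np.asarray(n_left, dtype=float)
    s = diff / np.max(np.abs(diff))  # assumed scaling to the range [-1, 1]
    choice_right = np.asarray(choice_right, dtype=bool)
    was_correct = np.asarray(was_correct, dtype=bool)
    signed_choice = np.where(choice_right, 1.0, -1.0)  # left = -1, right = +1
    # Shift by one trial; the first trial has no history (assumed 0).
    prev_choice = np.concatenate([[0.0], signed_choice[:-1]])
    prev_correct = np.concatenate([[False], was_correct[:-1]])
    h_success = np.where(prev_correct, prev_choice, 0.0)
    h_failure = np.where(~prev_correct, prev_choice, 0.0)
    return np.column_stack([s, h_success, h_failure, np.ones_like(s)])
```

The resulting matrix, together with the vector of rightward choices, could then be passed to a binomial GLM with a logit link, for example `statsmodels.api.GLM(y, X, family=statsmodels.api.families.Binomial())`.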

Imaging data preprocessing

Imaging data were first motion corrected using a custom template matching-based procedure, then regions of interest (ROI) were selected manually based upon cell morphology and calcium activity. The manually selected dendritic ROI were subsequently refined using the following procedure: for each ROI, the mean activity of all pixels inside the ROI was extracted for each frame in the behavioral session, yielding a single time series. The Pearson correlation coefficient was computed between this time series and every pixel in the dataset. These correlation coefficients were assembled into a correlation coefficient image with the same dimensions as an imaged frame, and regions of interest were automatically extracted from this image by applying a median filter, followed by thresholding the image at two standard deviations above the mean, then detecting and selecting the connected component that most overlapped with the manual ROI. The resulting ROI was excluded if it contained less than 0.5 or greater than 3.5 times the number of pixels in the manually selected ROI. After all ROI were refined in this way, a second processing step merged overlapping and highly correlated ROI to ensure that the resulting ROI corresponded to unique physiological sources. First, mean activity time series traces were extracted for each refined ROI. Next, the pairwise correlation coefficients of traces from each ROI were computed, and an undirected graph was constructed where each ROI composed a node, and edges were placed between nodes with correlation values exceeding a threshold of 0.5. Next, connected component clusters in the graphs with greater than one node were subjected to the following procedure: for each pair of ROI within the cluster, the two ROI were merged if >50% of either ROI’s pixels was contained within the other’s boundaries. 
A similar graph-based approach was used to detect neighboring ROI with high correlations corresponding to a single cellular origin; if the correlation values of two ROI exceeded 0.8 and they fell within 5 micrometers of each other, they were merged. Finally, pixels within the motion boundaries of the motion-corrected images were removed from all ROI.

For each ROI, the mean activity of all pixels in the ROI was computed for each frame, yielding raw time series data. ∆F/F0 was then computed from these raw traces, with baseline F0 being computed as the minimum of a median-filtered (1 s kernel) 12 s sliding window preceding each time point. Somatic data analysis was restricted to trials with matching 3.8 s cue periods, which comprised 85% of trials. Dendritic data analysis included trials of all cue period durations.
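The baseline computation can be sketched in Python as follows. This is an illustrative version, not the pyfluo implementation: the function name, the edge padding used for the median filter, and the inclusion of the current sample in the trailing window (so that the first sample has a defined baseline) are assumptions.

```python
import numpy as np

def delta_f_over_f(raw, fs_hz, median_s=1.0, window_s=12.0):
    """Compute dF/F0 with F0 = minimum of a median-filtered trailing window."""
    raw = np.asarray(raw, dtype=float)
    half = max(1, int(round(median_s * fs_hz))) // 2
    w = max(1, int(round(window_s * fs_hz)))
    # Median filter (kernel of about median_s seconds), edge-padded (assumption).
    padded = np.pad(raw, half, mode="edge")
    smooth = np.array([np.median(padded[i:i + 2 * half + 1])
                       for i in range(len(raw))])
    # Baseline: minimum of the smoothed trace over the preceding window_s seconds.
    f0 = np.array([smooth[max(0, i - w):i + 1].min() for i in range(len(raw))])
    return (raw - f0) / f0
```

The explicit loops keep the sketch short and readable; for full-session traces a vectorized or filter-based implementation would be faster.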

Modulation index r

Modulation index r was defined as Pearson’s correlation between the somatic calcium signal of a given cell and a uniform time vector (‘cue-period time’). For trial-by-trial analyses (choice and evidence decoding, linear model to evaluate puff-fluorescence relationship), r was computed on a trial-by-trial basis. For other analyses, r was computed over the mean somatic signal in the specified time period. For comparison of r in the pre-cue period and cue period, r was computed over a 2 s period immediately preceding, or in the middle of the cue period, respectively. For all other analyses, r was computed over the duration of the cue period. Significance of this modulation was computed by comparing the fraction of cells for which $|r_{\text{cue period}}|$ exceeded $|r_{\text{pre-cue period}}|$. The 95% confidence interval for this fraction was computed using bootstrapping: cells were randomly sampled with replacement to create resampled datasets (n = 10,000), and the same fraction was computed for each of these datasets. Boundaries of the confidence interval were then computed as the 5th and 95th percentile of these bootstrapped fractions. Return to baseline was computed by comparing the mean activity of each cell before the cue period and after the inter-trial interval; in total, 198,542 such comparisons were made for all cells in the study.
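A minimal Python sketch of the modulation index and the bootstrap over cells is given below. The function names are hypothetical; the percentile bounds (5th and 95th) follow the description above.

```python
import numpy as np

def modulation_index(signal):
    """Pearson correlation between a signal and a uniform time vector."""
    signal = np.asarray(signal, dtype=float)
    t = np.arange(len(signal), dtype=float)
    return np.corrcoef(t, signal)[0, 1]

def bootstrap_fraction(r_cue, r_pre, n_boot=10000, percentiles=(5, 95), rng=None):
    """Fraction of cells with |r_cue| > |r_pre|, with bootstrapped bounds."""
    rng = np.random.default_rng() if rng is None else rng
    exceeds = np.abs(r_cue) > np.abs(r_pre)
    # Resample cells with replacement, n_boot times.
    idx = rng.integers(0, len(exceeds), size=(n_boot, len(exceeds)))
    fractions = exceeds[idx].mean(axis=1)
    lower, upper = np.percentile(fractions, percentiles)
    return exceeds.mean(), lower, upper
```

A steadily rising signal gives r close to 1, a steadily falling one gives r close to -1, and a flat but noisy signal gives r near 0.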

Analysis of electrophysiological recordings

High-pass filtering of the raw data at 300 Hz, common median referencing, and automatic spike sorting were performed using Kilosort (Pachitariu et al., 2016; cortex-lab, 2018; https://github.com/cortex-lab/Kilosort). Spikes were further sorted manually using Phy (https://github.com/kwikteam/phy; kwikteam, 2018). Instantaneous firing frequency was used to determine the firing rate. Purkinje cells were identified based on their firing rate and the occurrence of complex spikes on either the same channel or on neighboring channels.

Puff-triggered responses

Puff-triggered responses were computed by averaging the ∆F/F signal of a given region of interest triggered to the onset of individual puffs, drawn from trials with sparse (<3) puffs on the given side, and excluding bilateral start and end puffs. Mean $t_{1/2}$ decay of the puff-triggered response was computed as the time elapsed from the peak value of the mean response to L and R puffs until it fell 50% of the way to baseline, where baseline was defined as the mean value of the response in the 400 ms preceding the puff.
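The half-decay measurement can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the mean puff-triggered response is positive-going and is supplied with its 400 ms pre-puff segment at the start of the array.

```python
import numpy as np

def half_decay_time(mean_response, fs_hz, pre_s=0.4):
    """Time from the response peak until it has fallen halfway to baseline.

    Returns NaN if the response never falls to the half level.
    """
    resp = np.asarray(mean_response, dtype=float)
    n_pre = int(round(pre_s * fs_hz))
    baseline = resp[:n_pre].mean()  # mean of the 400 ms preceding the puff
    post = resp[n_pre:]
    i_peak = int(np.argmax(post))
    half_level = baseline + 0.5 * (post[i_peak] - baseline)
    below = np.nonzero(post[i_peak:] <= half_level)[0]
    if below.size == 0:
        return np.nan
    return below[0] / fs_hz
```

The temporal resolution of this estimate is one imaging frame; interpolating between frames would refine it but is not described in the text.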

Choice and evidence decoding

Somatic activity or behavioral movie measurements were used to decode the choice or correct side (evidence) on a trial-by-trial basis. Each trial was segmented into 400 ms time bins, in which the mean activity of each cell (somatic decoding) or the mean movement index for each movement feature (behavioral decoding) was computed. For each time point, the vector of these mean values was used to predict the behavioral variable (upcoming choice or side with more total evidence at end of trial) by logistic regression. Each regression was scored by k-fold cross validation with k = 5. Because choice and evidence are correlated in the behavioral task, we employed a procedure to investigate their effects separately. First, to determine whether evidence-related information was contained in the neural activity independently of choice-related information, we compared evidence-decoding accuracy with accuracy on a shuffled dataset in which the side of evidence on each trial was randomly assigned to another trial in which the same choice was made as the original trial. In this shuffled dataset, all evidence-decoding accuracy arises from correlated choice-related information, since evidence-related information was removed; therefore, any excess decoding performance relative to this baseline exhibited by the unshuffled dataset arises from independent evidence information. We also did the converse to test for independent choice-related information: that is, we shuffled choice across trials within the same evidence category (where evidence was binned into four groups based on the magnitude [#R-#L] of evidence, to account for the relationship between evidence magnitude and choice probability), and evaluated choice decoding accuracy. These ‘independent’ shuffles constitute the ‘Ind’ condition for choice and evidence decoding. In addition, complete shuffles were performed in which choice/evidence was shuffled across trials with no regard for the other variable. These shuffles constitute the ‘Shuffle’ condition.
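The decoding and shuffle procedure described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the helper names (`decode_accuracy`, `shuffle_within_groups`), the simulated trial counts, and the effect sizes are all assumptions, and only a single time bin is shown.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def decode_accuracy(features, labels, k=5):
    """Mean k-fold cross-validated accuracy of a logistic-regression decoder."""
    model = LogisticRegression(max_iter=1000)
    return cross_val_score(model, features, labels, cv=k).mean()


def shuffle_within_groups(labels, groups, rng):
    """Permute labels only among trials that share the same group value."""
    shuffled = labels.copy()
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        shuffled[idx] = labels[rng.permutation(idx)]
    return shuffled


rng = np.random.default_rng(0)
n_trials, n_cells = 400, 20

# Synthetic (hypothetical) trials: evidence side and a correlated choice (~80% agreement).
evidence = rng.integers(0, 2, n_trials)
choice = np.where(rng.random(n_trials) < 0.8, evidence, 1 - evidence)

# Hypothetical mean activity in one 400 ms bin: some cells carry evidence, others choice.
activity = rng.normal(size=(n_trials, n_cells))
activity[:, :5] += 1.5 * evidence[:, None]
activity[:, 5:10] += 1.5 * choice[:, None]

acc_real = decode_accuracy(activity, evidence)
# 'Ind' baseline: evidence shuffled only among trials with the same choice.
acc_ind = decode_accuracy(activity, shuffle_within_groups(evidence, choice, rng))
# 'Shuffle' baseline: evidence shuffled with no regard for choice.
acc_shuffle = decode_accuracy(activity, rng.permutation(evidence))
```

In this sketch, decoding accuracy on the unshuffled data in excess of `acc_ind` plays the role of the independent evidence information described in the text. For the converse (choice) analysis, `groups` would be the binned evidence magnitude rather than the choice.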

Linear model for evaluating puff-fluorescence relationship

To determine the relationship between somatic signals and puff number on a trial-by-trial basis, a two-factor linear model was constructed for each cell. The factors corresponded to the number of left-sided and right-sided puffs in each trial, and the dependent variable to the mean pre-decision fluorescence (500 ms preceding the decision phase) of the given cell in each trial. The best-fit coefficients were interpreted as the relationship between fluorescence and puff number for that cell. Significance of these coefficients was scored by the p-value associated with each coefficient in the model (two-tailed unpaired t-test of regression weight values). Cells with p<0.05 for a given coefficient were considered to have ramps significantly correlated with puffs on the corresponding side. Significant cells were grouped into categories according to the sign of the coefficient (+ or –), and the side of the puffs with significant coefficients (L, R, or LR for both). In the shuffle condition, the puff numbers were shuffled across trials while holding the animal’s choice constant, and the same analysis was run.
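A per-cell model of this kind can be sketched as an ordinary least-squares fit with a t-test on each coefficient. The function names, the inclusion of an intercept term, and the synthetic data below are assumptions made for illustration; the standard OLS coefficient test used here is one plausible reading of the test described above, not a confirmed reproduction of it.

```python
import numpy as np
from scipy import stats


def fit_puff_model(n_left, n_right, fluorescence):
    """OLS fit of fluorescence ~ b0 + bL * n_left + bR * n_right.

    Returns the coefficients (intercept, left, right) and the two-tailed
    p-value of a t-test on each coefficient.
    """
    X = np.column_stack([np.ones(len(n_left)), n_left, n_right]).astype(float)
    y = np.asarray(fluorescence, dtype=float)
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof                 # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)        # coefficient covariance
    t_vals = coef / np.sqrt(np.diag(cov))
    p_vals = 2 * stats.t.sf(np.abs(t_vals), dof)
    return coef, p_vals


def categorize(coef, p_vals, alpha=0.05):
    """Label a cell by significant side(s) and sign, e.g. 'L+', 'R-', 'L+R-'."""
    label = ""
    for side, b, p in (("L", coef[1], p_vals[1]), ("R", coef[2], p_vals[2])):
        if p < alpha:
            label += side + ("+" if b > 0 else "-")
    return label or "n.s."


# Synthetic (hypothetical) cell whose pre-decision fluorescence grows with left puffs.
rng = np.random.default_rng(0)
n_left = rng.integers(0, 11, 200)
n_right = rng.integers(0, 11, 200)
fluorescence = 0.05 * n_left + rng.normal(0, 0.1, 200)

coef, p_vals = fit_puff_model(n_left, n_right, fluorescence)
label = categorize(coef, p_vals)
```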

Movie-based behavior measurements

Measurements of mouth, whisker, nose, and paw movements were made using the behavioral movies acquired during all behavioral sessions in which calcium imaging was performed. For each movie from each animal, regions of interest (ROI) were manually selected corresponding to the aforementioned regions in the field of view. Full-session traces were extracted as the mean pixel value within the ROI in each frame. Mouth movements were measured using the laterally positioned camera, where deviation of the mouth and chin was best detected. Full-session traces corresponding to mouth movement were mean-subtracted and normalized from 0 to 1. For those measurements, the normalized licking measure was computed as the absolute value of the mouth-movement trace, with a median-filtered (0.5 s kernel) drifting baseline signal subtracted off, normalized from 0 to 1. Deviations in overall image brightness were normalized by a fiducial measurement in the field of view not involving the animal that reliably tracked field-of-view brightness. The remaining movement measurements (nose, left whiskers, right whiskers, left paw, and right paw) were extracted from the camera positioned below the animals’ faces. For these measurements, traces were extracted in the same manner as above, and the normalized movement index was computed as the absolute value of the trace derivative, normalized from 0 to 1.
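The two normalized measures described above can be sketched as follows. This is an illustrative reading of the text rather than the authors' implementation: the function names, the frame rate (`fps`), and the order of operations for the licking measure (baseline subtracted before taking the absolute value) are assumptions.

```python
import numpy as np
from scipy.ndimage import median_filter


def normalize01(x):
    """Rescale a trace to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)


def movement_index(roi_trace):
    """Absolute frame-to-frame derivative of an ROI trace, normalized to [0, 1]."""
    trace = np.asarray(roi_trace, dtype=float)
    return normalize01(np.abs(np.diff(trace, prepend=trace[0])))


def licking_measure(mouth_trace, fps, kernel_s=0.5):
    """Baseline-subtracted absolute mouth-movement trace, normalized to [0, 1]."""
    trace = np.asarray(mouth_trace, dtype=float)
    trace = normalize01(trace - trace.mean())
    size = max(1, int(round(kernel_s * fps)))          # kernel length in frames
    baseline = median_filter(trace, size=size, mode="nearest")
    return normalize01(np.abs(trace - baseline))


# Hypothetical mouth ROI trace: slow brightness drift plus one brief deflection.
mouth = np.linspace(0.0, 1.0, 300)
mouth[100] += 2.0
lick = licking_measure(mouth, fps=30)                  # fps=30 is an assumed frame rate
index = movement_index([0, 0, 1, 1, 3])
```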

Dendritic events

Dendritic events were detected using the OASIS autoregressive deconvolution algorithm (Friedrich et al., 2017), with AR(1) and an L0 sparsity penalty, and binarized with a threshold of 0.1.

Analysis of dendritic responses

To test the hypothesis that dendritic signalling encoded motor events, we computed the mean dendritic response to a number of specific motor actions, and compared this response across error and correct contexts on a cell-by-cell basis. In these analyses, lick initiation (‘start licking’) and cessation (‘stop licking’) were defined as licks that were preceded or followed by, respectively, at least 250 milliseconds without licking on that side. For the analysis in Figure 4F, activity was measured in a 1-second window centered at this event. In addition, supplementary analyses measure the signal instead in the 200 ms preceding or following the event to account for the possibility that signals encode specifically preparatory or response signals to the events, respectively (Figure 4—figure supplement 1). For licking magnitude analyses, trials were categorized according to the quantity of licks in the same time window in which the dendritic response was measured (Figure 4E), and the dendritic response was measured in the same manner as the data in Figure 4E. For analyses of orofacial movements, the onset times of distinct movements in each orofacial region were determined, according to frames at which the time series data for that feature exceeded 2.5 standard deviations above the mean, and activity was measured in the one-second window centered at these events.
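The event definitions in this paragraph (lick initiation and cessation with a 250 ms gap criterion, and movement onsets at 2.5 standard deviations above the mean) can be sketched as below. The function names and example lick times are hypothetical, and treating only the first supra-threshold frame of each excursion as an onset is an assumption.

```python
import numpy as np


def lick_bout_edges(lick_times, min_gap=0.25):
    """Return (start_times, stop_times) of licking on one side, in seconds.

    A lick counts as a 'start' if no lick occurred in the preceding min_gap
    seconds, and as a 'stop' if no lick occurs in the following min_gap seconds.
    """
    t = np.sort(np.asarray(lick_times, dtype=float))
    if t.size == 0:
        return t, t
    gaps = np.diff(t)
    is_start = np.concatenate(([True], gaps >= min_gap))
    is_stop = np.concatenate((gaps >= min_gap, [True]))
    return t[is_start], t[is_stop]


def threshold_onsets(trace, n_sd=2.5):
    """Frame indices where the trace first rises above mean + n_sd * SD."""
    trace = np.asarray(trace, dtype=float)
    above = trace > trace.mean() + n_sd * trace.std()
    previous = np.concatenate(([False], above[:-1]))
    return np.flatnonzero(above & ~previous)


# Hypothetical lick times (s) forming three bouts, the last a single lick.
starts, stops = lick_bout_edges([1.0, 1.1, 1.2, 2.0, 2.1, 3.0])

# Hypothetical movement trace with one excursion spanning frames 50-52.
trace = np.zeros(100)
trace[50:53] = 10.0
onsets = threshold_onsets(trace)
```

Dendritic activity would then be averaged in a window around each returned event time, as described in the text.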

For all of the above movement measurements, movement events were split into those which occurred in error contexts (i.e. the time window for which the dendritic response was measured for Figure 4E, in error trials), and those which occurred outside error contexts (in correct trials). Mean dendritic activity events were computed for both contexts for every movement measurement, and the error:correct activity ratio was then computed for each individual cell. To compare dendritic responses at varying trial difficulties, the difference was computed on a session-by-session basis between the mean dendritic response in the strongest and weakest evidence (|#R-#L|) trials, and the significance of this difference was assessed using a two-tailed paired t-test.

Outcome decoding

Dendritic activity was used to decode the outcome (correct or error) on a trial-by-trial basis. For each cell in each trial, the response was computed as the mean ∆F/F value in the 800 ms following (‘Post-choice’) or preceding (‘Pre-choice’) the decision. The outcome of each trial was predicted by logistic regression using these values as predictors. For each session analyzed, 1000 batches of trials were randomly sampled in which the numbers of correct and error trials were matched, and a regression was performed on each, scored by k-fold cross validation with k = 3. In the shuffle conditions, the outcomes of the trials were shuffled and the decoding accuracy on the shuffled dataset was reported.
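The class-balanced resampling scheme can be sketched as follows. This is an illustration on synthetic data under stated assumptions: the function name, the simulated response sizes, and sampling without replacement within each batch are choices made here, and the example runs 20 batches for speed rather than the 1000 described above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def balanced_outcome_decoding(responses, outcomes, n_batches=1000, k=3, seed=0):
    """Mean cross-validated accuracy over class-balanced random batches of trials.

    responses: (n_trials, n_cells) mean dF/F in the analysis window.
    outcomes:  (n_trials,) array with 1 for correct and 0 for error trials.
    """
    rng = np.random.default_rng(seed)
    correct = np.flatnonzero(outcomes == 1)
    error = np.flatnonzero(outcomes == 0)
    n = min(len(correct), len(error))      # match the number of trials per class
    scores = []
    for _ in range(n_batches):
        idx = np.concatenate([rng.choice(correct, n, replace=False),
                              rng.choice(error, n, replace=False)])
        model = LogisticRegression(max_iter=1000)
        scores.append(cross_val_score(model, responses[idx], outcomes[idx], cv=k).mean())
    return float(np.mean(scores))


# Synthetic (hypothetical) session: ~75% correct trials, larger responses on errors.
rng = np.random.default_rng(1)
n_trials, n_cells = 300, 10
outcomes = (rng.random(n_trials) < 0.75).astype(int)
responses = rng.normal(size=(n_trials, n_cells))
responses[:, :5] += 1.5 * (1 - outcomes)[:, None]

acc_real = balanced_outcome_decoding(responses, outcomes, n_batches=20)
# Shuffle condition: outcomes permuted across trials before decoding.
acc_shuffled = balanced_outcome_decoding(responses, rng.permutation(outcomes), n_batches=20)
```

Matching the class counts in each batch keeps chance accuracy near 0.5 even though error trials are the minority.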

References

  39. The Cerebellum: Brain for an Implicit Self
    1. M Ito
    (2012)
    FT Press.
  43. Is the cerebellum relevant in the circuitry of neuropsychiatric disorders?
    1. JZ Konarski
    2. RS McIntyre
    3. LA Grupp
    4. SH Kennedy
    (2005)
    Journal of Psychiatry & Neuroscience : JPN 30:178–186.
  56. Data structures for statistical computing in Python
    1. W McKinney
    (2010)
    Proceedings of the 9th Python in Science Conference. pp. 51–56.
  60. OpenCV Library
    1. OpenCV team
    (2017)
    OpenCV Library, 3.3.0, http://opencv.org/.
  63. Fast and accurate spike sorting of high-channel count probes with KiloSort
    1. M Pachitariu
    2. NA Steinmetz
    3. SN Kadir
    4. M Carandini
    5. KD Harris
    (2016)
    In: D. D Lee, M Sugiyama, U. V Luxburg, I Guyon, R Garnett, editors. Advances in Neural Information Processing Systems, Vol 29. Curran Associates, Inc. pp. 4448–4456.
  65. Scikit-learn: machine learning in Python
    1. F Pedregosa
    2. G Varoquaux
    3. A Gramfort
    4. V Michel
    5. B Thirion
    6. O Grisel
    7. É Duchesnay
    (2011)
    Journal of Machine Learning Research : JMLR 12:2825–2830.
  67. StatsModels: Statistics in Python
    1. J Perktold
    2. S Seabold
    3. J Taylor
    4. statsmodels-developers
    (2017)
    StatsModels: Statistics in Python, 0.8.0, http://www.statsmodels.org/stable/index.html.
  69. Widespread cortical involvement in evidence-based navigation
    1. L Pinto
    2. D Tank
    3. C Brody
    4. S Thiberge
    (2018)
    Cosyne Abstracts, USA, Denver CO.

Decision letter

  1. Megan R Carey
    Reviewing Editor; Champalimaud Foundation, Portugal
  2. Richard B Ivry
    Senior Editor; University of California, Berkeley, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "A cerebellar role in evidence-guided decision-making" for consideration by eLife. Your article has been reviewed by Richard Ivry as the Senior Editor, a Reviewing Editor, and three reviewers. The reviewers have opted to remain anonymous.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

This is an interesting study describing a potential role for the lateral posterior cerebellum in a novel decision-making task. The cerebellum has received much less focus than cortex in decision-making research despite clear evidence from, among other sources, human imaging and lesion studies that the cerebellum is involved in a wide range of cognitive processes. There is also a rich tradition of theoretical modeling inspired by the cerebellum's unique circuit organization and role in sensorimotor coordination that appears potentially relevant to many cognitive operations, raising interesting questions about what the cerebellum might contribute to decision-making. Therefore, the approach and data reported in the current manuscript are promising and could provide meaningful novel insight into the neural basis of decision-making. Despite this potential, and that it does report some interesting observations, the main limitation of this work is that it is hard to say what the cerebellum is doing during the task. In particular, it is not clear that it is possible to conclude that the cerebellum is truly involved in evidence accumulation. The concern about the support for evidence accumulation holds for the inactivation experiments as well as for the imaging.

Essential revisions:

1) Inactivation results: Inactivation results are suggestive of an important role for the cerebellum in this task. However, from the data presented, it is not clear that it is possible to conclude that the cerebellar inactivations are interfering with evidence accumulation itself, and not just somatosensation or motor output. At a minimum, the authors should show psychometric curves for the 5 individual muscimol-treated animals. Assessing the effects on the slopes and offsets of the individual psychometric curves might be more revealing about which aspects of task performance were affected by inactivation of crus I. Further, there is the possibility that with this experimental design, muscimol inactivation may not be adequate to determine whether the cerebellum is accumulating evidence, because changes in the psychometric function can also be explained by a decrease in sensory sensitivity alone. Careful modeling of the data may help.

2) Evidence for evidence accumulation signals from imaging of somatic calcium in Purkinje cells: The authors report signals that appear similar to evidence-dependent and choice-selective ramping responses previously reported in many cortical and subcortical areas of primates and rodents. The manuscript reports temporal dynamics of Purkinje cell somatic calcium signals that, in many cells, manifest as a gradual ramping response during the cue period that is correlated with both the number of pulses on one or the other side (evidence) and/or the animal's eventual response (choice). These responses are interpreted with analogy to the ramping responses in trial-averaged spike rates recorded from parietal cortex (along with other forebrain regions) that are thought to encode the time-integral of sensory evidence. However, the manuscript does not convincingly demonstrate that cerebellar neurons are actually encoding accumulated evidence. In general, given the uncertainty about the origin of these signals, more caution is warranted in interpretation of the apparent ramping signals. Specifically:

a) While there appear to be lots of time-modulated signals, the link to evidence accumulation and choice is not clear. It is apparent that the most salient feature of the recorded cerebellar population is a constant modulation, either upwards or downwards, with the passage of time in the trial. The presence of this signal complicates the interpretation of any ramping activity as being attributable to evidence accumulation. However, the statistical analyses (i.e. the linear model reported in subsection “Neuronal signatures of choice and evidence in Purkinje cells”) ignore this component of the responses. This might be valid if the representation of time were truly independent from the representation of evidence. But no such independence is established. For example, cells that are more strongly driven by evidence might also be more strongly driven by time, or more likely to be driven by time in one direction or another. The authors should more carefully consider the relationship between the different components of the cerebellar responses and/or formally control for potentially confounding effects of the time-related responses in the statistical models.

b) Given the temporal resolution of the calcium indicator, it is difficult to interpret any evidence-related signal dynamics as reflecting an underlying ramp in neuronal firing rate. The original description of the use of somatic calcium imaging to track changes in Purkinje cell simple spike rate (Ramirez and Stell, 2016) found that these signals were so slow that step changes in simple spike rate could result in ramps of fluorescence like those seen here. This technical concern would best be addressed by at least some recordings comparing the calcium signals to simple spike rate with electrophysiology. Although the authors have tried to be careful in their writing, the text currently leaves plenty of room for misinterpretation by less technical readers. They should clarify the text further to be much more clear about the limitations of interpreting the temporal dynamics of the calcium signals.

c) The relevant decision variable in the pulse accumulation task is the *difference* in pulses between the two sides. The authors seem to know this point well based on how they plot the psychometric functions. Yet at best a small minority of Purkinje cells encode this value. Subsection “Neuronal signatures of choice and evidence in Purkinje cells” reports that 39/843 cells (less than 5%) encode either a sum or difference of the number of pulses on each side. The specific figure for the number of cells encoding a difference is not reported, but Figure 3 suggests that it is about 5 cells (so ~2% of evidence-modulated cells or ~0.5% of all cells). These numbers call into question strong conclusions about the representation of the decision variable and accumulated evidence in cerebellum. Further, this result could be taken to show that responses in rodent cerebellum are different from those in primate cortex that are often interpreted as a representation of the decision variable for perceptual discrimination. Notably, it also indicates a dissociation between rodent cerebellum and neocortex, as Scott et al. estimate that as many as 1/3 of evidence-modulated cortical cells are better explained by the difference between sides. Therefore, the data strongly suggest that the critical decision-making operations are actually implemented downstream of the recorded cerebellar population. This strikes me as highly relevant to how the overall results should be interpreted, and it should be emphasized more strongly in the summary and discussion.

d) The analysis of the representation of accumulated evidence (subsection “Neuronal signatures of choice and evidence in Purkinje cells” and Figure 3) ignores the time course of representation, focusing on a short window immediately before the decision. Even if we disregard technical problems explained above in 2a-c, it is unclear whether the cerebellar neurons represent integration of evidence over time or merely its final outcome. Note that the population analysis for the representation of evidence (Figure 2F) does not answer this question (and is generally not quite informative) because evidence is defined as the "correct" choice in that analysis rather than the magnitude of evidence (#R-#L).

3) "Error-related" dendritic responses

In addition to somatic calcium signals, the authors analyzed dendritic calcium signals and found that these were higher on error vs. correct trials. The implications of this observation are emphasized heavily; for example, the Discussion section concludes that "the cerebellum, which learns from error to guide action, may help in the learning and tuning of accurate responses". This is an interesting proposal, but one might have had the same belief prior to seeing the results reported here (given existing human data and theories about the cerebellum), and it's not clear how the data should update it.

a) There is speculation about how the observed error responses could be used as error signals to guide learning, in line with existing models of cerebellar learning. However, it was not clear to the reviewers how that would work, especially given that the error responses observed were not directional. Typically, Purkinje cell complex spikes are thought to provide a directional signal for learning, not just a correct/incorrect signal. There are no analyses to support the proposed functional role of these error related responses as being involved in "tuning" responses or correcting errors. Yet, it should be possible to provide a more thorough analysis of the error signals:

b) What do error signals predict about future behavior? An obvious analysis would be to ask whether the magnitude of the error signal on trial n−1 influences choice accuracy on trial n. This seems to be a clear prediction from the proposal that the cerebellum "tunes" behavior or "corrects errors".

c) How do the error signals relate to the strength of evidence on each trial? This is briefly mentioned at the end of the Results section and in Figure 4—figure supplement 1, but without any statistical tests or interpretation. Whether and how the error signals are modulated by the strength of evidence seems key to determining their functional role in updating behavior, if any. In particular, we would have expected that true "error" responses would be higher for easier trials, but the opposite appears to be true.

4) Ruling out sensory and motor confounds as potential sources of somatic and dendritic calcium signals

a) Figure 3B indicates that isolated puffs produce a calcium response that gradually rises and falls over ~1 s. On trials with strong evidence, these signals will overlap. Assuming that they sum reasonably linearly, overlapping calcium responses from neurons that encode only the transient presentation of evidence would nevertheless give the impression of a gradual ramp with a slope that depends on the quantity of evidence. Note that this is a different issue from the point raised in the Discussion section about distinguishing single-trial ramps from steps. Rather, it makes it unclear whether the evidence-related cerebellar responses correspond to a representation of the momentary sensory evidence or to the magnitude of accumulated evidence that drives choice. This distinction is critical to interpreting proposed neural implementations of evidence accumulation models. One possible way to address this would be to record responses during stimulus presentation from animals that are not engaged in making a decision.

b) The authors attempt to rule out that the somatic signals are related directly to movement of the animal and less related to the "cognitive" variables. The analysis in the supporting figures is in line with the level of detail often applied in the field currently. But, is it possible that the activity is related to other movements of the animal rather than in the orofacial region? The authors should either cite strong evidence showing that these are the only relevant regions for the area of cerebellum examined or provide additional videography results from other parts of the body (e.g. paws). This is of course a significant point to make as strong as possible given the cerebellum's long-established role in motor behaviors.

c) The analyses in Figure 4—figure supplement 1 do not convincingly rule out the possibility that the difference in dendritic calcium signals on error trials resulted from differences in licking on error trials. That figure clearly shows that cessation of licking proceeds at different rates on error trials compared to other times. How does trial type (correct vs error) affect the relationship between licking (or changes of rate in licking) and dendritic calcium signals?

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "A cerebellar role in evidence-guided decision-making" for further consideration at eLife. Your revised article has been favorably evaluated by Richard Ivry (Senior Editor), Megan Carey (Guest Reviewing Editor), and two reviewers.

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below:

The authors have made a concerted effort to clarify the text of their paper and have provided reasonable responses to many of the points raised. As was noted in the first round of reviews, rebuttal, and in the text, it remains unclear what the cerebellum's specific role or computation is during the task, and the reviewers were somewhat disappointed that most of the key questions are deferred to future studies. However, the reviewers also felt that defining cerebellum's exact role in the task may be asking too much for a first study, particularly given the limitations of the dataset and the conceptual and analytical complexities that prevent the authors from specifying the nature of neural representations and the source of the behavioral deficit following cerebellar inactivation.

Overall, there are several interesting leads about the role(s) of cerebellum in perceptual decisions in this paper. While none of these are firmly established by the current study, the reviewers agree that the writing and presentation of the results is generally fair and not greatly over-stated, and favor publication so that the results can be evaluated by the field and followed up in future studies from other groups.

The reviewers note that in the process of clarifying what can and cannot be claimed based on the existing data, the scope of the paper is limited to three points: successful training of mice to do the task, reduced accuracy following cerebellar inactivation, and representation of task-relevant variables in cerebellar neural population without specifying the exact nature of the represented variables. Before publication, the reviewers agree that it is important to ensure that after toning down its claims, the manuscript does not leave behind any statements that could mislead readers as to what is actually shown. The reviewers have identified the following statements from the Title, Abstract, and Discussion section that are potentially misleading and should be restated with more specific statements that more accurately match the conclusions that can be supported by the data.

Title

A cerebellar role in evidence-guided decision-making.

Impact Statement

The lateral posterior cerebellum participates in evidence-accumulation-based decision-making, and Purkinje neurons in this region encode choice-, evidence-, and error-related variables. [Suggest replacing this stronger statement with language more like that used in the rebuttal, such as "choice- and evidence-related information is present in lateral posterior cerebellum and could participate in decision-making computations during a decision-making task involving evidence accumulation."].

Abstract

- Here we show that during perceptual decision-making over a period of seconds, decision-, sensory-, and error-related information converge on the lateral posterior cerebellum in crus I, [The presence of task-related signals is shown. Convergence is not, and decision and sensory signals are not clearly dissociated].

- Demonstrated that cerebellar inactivation reduces behavioral accuracy without impairing motor parameters of action [Not all motor parameters were controlled for].

- We found that Purkinje cell somatic activity encoded choice- and evidence-related variables [Please avoid the suggestion that the specific variables that are encoded have been determined].

- Decision errors were represented by dendritic calcium spikes, which are known to drive plasticity [This could misleadingly suggest that they are known to drive plasticity in this context].

- We propose that cerebellar circuitry may contribute to the set of distributed computations in the brain that support accurate perceptual decision-making. [Should be more focused on task performance].

Discussion section

- Cerebellar inactivation reduces animals' use of evidence and increases their use of choice history. [given the limitations of the interpretation of the inactivation experiments, this statement should be more conservative].

- Given the temporal resolution of calcium measurements, our somatic signals may correspond to firing rate ramps (Shadlen and Newsome, 2001), steps (Latimer et al., 2015), or more complex response profiles that form a temporal basis for evidence accumulation (Scott et al., 2017). [This statement, as well as the corresponding section of the Results section, should include an explicit reference to the time course of somatic calcium signals from Purkinje cells, which is at least an order of magnitude slower than typical calcium imaging (Ramirez and Stell, 2016).].

- The task-modulated activity we observe encodes both choice-related and evidence-related variables that may be used during the decision-making process. [Please avoid the suggestion that the specific variables that are encoded have been determined].

- We observed an excess of dendritic calcium events coincident with decision errors, demonstrating for the first time observations compatible with error-associated signalling in a decision-making reward context. [Given the limitations of the interpretation of these signals, this statement should be more conservative].

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Cerebellar involvement in an evidence-accumulation decision-making task" for further consideration at eLife. Your revised article has been favorably evaluated by Richard Ivry (Senior Editor), and Megan Carey (Guest Reviewing Editor).

The manuscript has been improved but there are some final issues that need to be addressed before acceptance, as outlined below:

In response to the request to re-evaluate the evidence for ramping signals that could be obtained with the somatic calcium imaging, the authors now state (Subsection “Purkinje cell somatic calcium encodes task-relevant information”), "Therefore our observed increasing and decreasing time courses of calcium could reflect various firing rate profiles, such as impulse responses, ramps, or steps."

In light of this revision, as well as the fact that the electrophysiological evidence provided in Figure 2—figure supplement 2 is from only a few cells, all of which show positive ramps of activity, the following statements should also be revised:

- (Subsection “Purkinje cell somatic calcium encodes task-relevant information”), "We did find that electrically recorded Purkinje cells exhibited gradually increasing rates of firing throughout the cue period (Figure 2—figure supplement 2), suggesting that on average across trials, the fluorescence signals we observed correspond to firing rate ramps."

- (Discussion section), "Our electrical recordings also showed ramps, suggesting that temporally filtered firing rate ramps are sufficient to account for our observed fluorescence signals."

In both instances, we suggest removing the second clause, starting with "suggesting that…"

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Cerebellar involvement in an evidence-accumulation decision-making task" for further consideration at eLife. Your revised article has been favorably evaluated by our editors again, but there remain some issues that need to be addressed before acceptance, as outlined below. Given that this is the third request for revisions, we will be unable to follow with any more. Please attend to this final issue one way or the other so that the next letter will be the final one.

We appreciate the authors' desire to speculate here. However, in our view, the "suggests/consistent with" was not the only problem with this sentence. There is also a problem with "sufficient". The electrophysiological evidence in Figure 2—figure supplement 2 is anecdotal and non-quantitative. For this statement to be left in, it would need to be adequately supported. In our view, this would require a quantitative comparison between imaging and electrophysiology results. In particular, we would want to know:

- How many Purkinje cells in total were recorded from electrophysiologically? How many of these showed ramping? (all of the cells they showed us show positive ramps, but it is not clear if those were selected from a larger data set)

- Did any Purkinje cells show ramping calcium signals without a transient increase in firing rate?

- Did any Purkinje cells show ramping calcium signals without ramps in firing rate (for instance, in cases where only a transient increase in firing may have been observed electrophysiologically)?

- Why are no decreasing activity ramps found with electrophysiology, but they are found with imaging?

- What accounts for the decreasing ramps that were observed with calcium imaging?

- What would the predicted calcium signals be for the examples shown if the spike rates recorded electrophysiologically (with and without the transient increase/ramping components) were convolved according to Ramirez and Stell (2016)? And/or with the authors' own convolution/deconvolution methods, from the simultaneous calcium imaging/electrophysiological recordings that they performed?

We give the authors the choice of either fully addressing these points, or using compromise language, such as "Preliminary electrical recordings also showed ramps, consistent with the idea that temporally filtered firing rate ramps may account for the observed fluorescence signals."

https://doi.org/10.7554/eLife.36781.021

Author response

Summary:

This is an interesting study describing a potential role for the lateral posterior cerebellum in a novel decision-making task. The cerebellum has received much less focus than cortex in decision-making research despite clear evidence from, among other sources, human imaging and lesion studies that the cerebellum is involved in a wide range of cognitive processes. There is also a rich tradition of theoretical modeling inspired by the cerebellum's unique circuit organization and role in sensorimotor coordination that appears potentially relevant to many cognitive operations, raising interesting questions about what the cerebellum might contribute to decision-making. Therefore, the approach and data reported in the current manuscript are promising and could provide meaningful novel insight into the neural basis of decision-making. Despite this potential, and that it does report some interesting observations, the main limitation of this work is that it is hard to say what the cerebellum is doing during the task. In particular, it is not clear that it is possible to conclude that the cerebellum is truly involved in evidence accumulation. The concern about the support for evidence accumulation holds for the inactivation experiments as well as for the imaging.

We thank the reviewers and editor for their thoughtful comments on the manuscript. We have added new data, analyses, and substantial clarifications that we think improve the quality of the study. In some cases, we think the objections raised by reviewers were consequences of our failure to properly convey the scope of our study.

We fully agree that a tremendous amount stands to be resolved with respect to the cerebellar role and computations in our decision-making behavior. Our findings are at an early stage relative to the decision-making field. It is not our intent at this time to dissect precisely the evidence integration process as many studies have, but rather to study the whole perceptual decision-making process with a more agnostic view about cerebellar roles. We cannot yet make strong claims about the particular computational aspect of decision-making to which the cerebellum contributes. In light of this and of the reviewer suggestions, we have emphasized more strongly the extent to which many questions remain unanswered.

Of specific note with respect to a number of reviewer comments: we do not interpret our findings as evidence that the cerebellum performs the integration computation in evidence accumulation. Furthermore, we are aware that the particular roles of brain regions in the evidence-accumulation process are intricate, deeply studied, and yet still unresolved (Brody and Hanks, 2016).

In that light, our goal was to probe for an involvement of cerebellar activity in the perceptual decision-making process, which to the most basic extent has not been previously shown. We asked three fundamental questions: (1) Can we train mice on a new evidence-accumulation decision-making task? (2) Does cerebellar inactivation affect behavioral performance? (3) Are task-relevant variables represented in any way in cerebellar neuronal populations? We reasoned that these three demonstrations would constitute novel and interesting findings and would be a basis for substantially larger-effort endeavors (which we are now pursuing).

To make this clearer in the manuscript, we have adjusted the text to emphasize the result: i.e. demonstration of a cerebellar role in perceptual decision-making, using evidence accumulation as a paradigm, but where the cerebellar role may or may not be directly in the computations of updating an evidence integrator per se. We were careful to present it this way in the manuscript and have attempted to clarify it more. We consider follow-up studies with detailed high-resolution measurements and inactivations an important and interesting next step, and we are pursuing that.

Essential revisions:

1) Inactivation results: Inactivation results are suggestive of an important role for the cerebellum in this task. However, from the data presented, it is not clear that it is possible to conclude that the cerebellar inactivations are interfering with evidence accumulation itself, and not just somatosensation or motor output.

We view the inactivation experiment as an initial demonstration of cerebellar necessity, and not a comprehensive modelling effort. Rather, we add a new region of interest to the emerging brainwide view of perceptual decision-making, noting that the inactivations demonstrate a causal role for the cerebellum in effectively converting sensory evidence into a decision at some stage of that complex process. This could correspond to sensory input gain biases, post-categorization biases, or other non-accumulation-specific biases. We have now noted this important distinction more clearly in the Discussion section. Below we discuss possible specific roles for the cerebellum, but we do not believe this study has resolved them.

First, to address the possibility of interference with motor output: we were particularly mindful of the possibility of a strictly motor deficit, and we believe the licking measurements and control trials we included (Figure 1—figure supplement 2B) address this.

Regarding the possibility of sensory deficits and other non-accumulation impairments: as we explain in the initial section of our response, we completely agree: the evidence-accumulation process, as opposed to related sensory gain and premotor processes, is difficult to pin down and we are aware of recent findings in rodent decision-making which indicate non-accumulation roles for regions that have been long studied in decision-making (Erlich et al., 2015; Katz et al., 2016). We do not intend to claim that the cerebellum performs specifically evidence integration.

At a minimum, the authors should show psychometric curves for the 5 individual muscimol-treated animals. Assessing the effects on the slopes and offsets of the individual psychometric curves might be more revealing about which aspects of task performance were affected by inactivation of crus I. Further, there is the possibility that with this experimental design, muscimol inactivation may not be adequate to determine whether the cerebellum is accumulating evidence, because changes in the psychometric function can also be explained by a decrease in sensory sensitivity alone. Careful modeling of the data may help.

In the submitted version of the manuscript, the evidence sensitivities, biases, and trial history effects of individual animals were represented by the session-by-session fits in the logistic regression model shown in Figure 1—figure supplement 2C (and subsection “A decision-making task for cerebellar investigations”). This behavioral model has been used in multiple published decision-making studies, it considers more variables than a psychometric fit does, and it provides a highly useful view of the effects introduced by the inactivation.

Psychometric curves

Nevertheless, we now include psychometric curves that similarly demonstrate the reduced sensitivity presented quantitatively in the aforementioned modelling (Figure 1—figure supplement 2D). We also more strongly highlight the findings of the behavioral modelling, which indicate significant changes in animals’ usage of evidence and trial history in guiding decisions (subsection “A decision-making task for cerebellar investigations”).
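As an illustration only (this is not the analysis code used in the study), the kind of logistic behavioral model described above can be sketched as follows. The regressors (signed evidence #R-#L, a constant bias, and the previous choice), the simulated sessions, and all parameter values are assumptions chosen for the example. The fitted evidence weight plays the role of the psychometric slope, so reduced sensitivity appears as a smaller weight.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def simulate_session(n_trials, sensitivity, bias, history_weight, rng):
    """Simulate choices (1 = right, 0 = left) from a logistic choice rule.
    All parameter values passed in are hypothetical."""
    trials = []
    prev_choice = 0.0  # coded -1/+1; 0 on the first trial
    for _ in range(n_trials):
        evidence = rng.randint(0, 8) - rng.randint(0, 8)  # signed evidence, #R - #L
        p_right = sigmoid(sensitivity * evidence + bias + history_weight * prev_choice)
        choice = 1 if rng.random() < p_right else 0
        trials.append((evidence, prev_choice, choice))
        prev_choice = 1.0 if choice == 1 else -1.0
    return trials

def fit_logistic(trials, n_iter=500, lr=0.1):
    """Fit weights [evidence, bias, previous choice] by gradient ascent
    on the mean log-likelihood."""
    w = [0.0, 0.0, 0.0]
    n = len(trials)
    for _ in range(n_iter):
        grad = [0.0, 0.0, 0.0]
        for evidence, prev, choice in trials:
            x = (evidence, 1.0, prev)
            err = choice - sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for k in range(3):
                grad[k] += err * x[k]
        w = [wi + lr * g / n for wi, g in zip(w, grad)]
    return w

rng = random.Random(1)
# Hypothetical "baseline" and "inactivated" sessions differing in sensitivity and bias.
baseline = simulate_session(1500, sensitivity=0.6, bias=0.0, history_weight=0.2, rng=rng)
inactivated = simulate_session(1500, sensitivity=0.25, bias=0.3, history_weight=0.2, rng=rng)
w_baseline = fit_logistic(baseline)
w_inactivated = fit_logistic(inactivated)
print("baseline weights [evidence, bias, history]:", w_baseline)
print("inactivated weights [evidence, bias, history]:", w_inactivated)
```

Comparing the fitted weights across conditions separates a change in evidence sensitivity from a change in side bias or trial-history dependence, which a single accuracy number cannot do.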

From our data and modelling at this stage, we consider deficits at all stages of the decision-making process (e.g. input sensory gain biases, integration impairments, post-categorization/premotor biases) possible and interesting to report. Thus, we do not claim that the cerebellum controls evidence integration per se. We have made this point clearer in the Results section and Discussion section.

Some of the shifts in psychometric curves may arise from asymmetry of injections (and we include specific evidence suggesting that such an asymmetry can account for some of the biases we see). We are currently doing experiments with newer technologies that we believe will resolve these questions in a more detailed manner.

However, we do note that the trend in our psychometrics is broadly consistent with effects seen in (Erlich et al., 2015), which shows that curves with this general phenotype (compression and bias) could conceivably be explained by a number of hypotheses ranging from sensory and premotor deficits to accumulation deficits. While it is beyond our current goal in this manuscript to perform this level of modelling, we use it just to demonstrate that our phenotype does not rule out interesting possibilities.

Drift diffusion models

Fits to drift diffusion models are an interesting direction for these data, yet also not necessary for our aim to demonstrate general causality for the region in the task. For completeness, we include below preliminary fits to a drift diffusion model showing what appears to be an increase in accumulation noise as a result of muscimol inactivation.

Given the poor statistical power of this dataset for this analysis, we don’t feel it substantially changes our main novel point that the cerebellum indeed plays a causal role in some component of perceptual decision-making. We are currently performing a temporally specific inactivation study that will be better powered to perform this level of analysis. We intend to pursue more detailed modelling there.

Author response image 1
Fits to a 5-parameter drift diffusion model (excluding the adaptation, initial noise, and sticky bounds found in the (Brunton, Botvinick and Brody, 2013) model).

Best-fit parameters in the muscimol and baseline conditions are plotted on the likelihood landscape of the muscimol fit. The top panel indicates a tradeoff between sensory and accumulation noise in the fit to muscimol inactivation trials. However, the statistical confidence intervals for the muscimol fits span the displayed range for the lapse and accumulation noise parameters.

https://doi.org/10.7554/eLife.36781.020
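For readers unfamiliar with this model class, the following is a minimal simulation sketch of a simplified pulse accumulator. It is not the fitting code behind the image above. The choice of five parameters (accumulation noise, sensory noise, leak, bias, lapse) is one plausible reading of the caption, and the parameter values, cue duration, and pulse statistics are all hypothetical.

```python
import math
import random

def simulate_choice(right_times, left_times, p, rng, duration=3.0, dt=0.02):
    """Return 1 (right) or 0 (left) for one trial of a simplified accumulator."""
    if rng.random() < p["lapse"]:
        return rng.randint(0, 1)  # lapse: guess regardless of evidence
    n_steps = int(round(duration / dt))
    pulses = {}
    for side, times in ((1.0, right_times), (-1.0, left_times)):
        for t in times:
            step = min(int(t / dt), n_steps - 1)
            pulses.setdefault(step, []).append(side)
    a = 0.0  # accumulator value
    for step in range(n_steps):
        # Leak (or instability) plus diffusion noise added at every time step.
        a += p["leak"] * a * dt + p["sigma_a"] * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Each pulse adds a unit of evidence corrupted by sensory noise.
        for side in pulses.get(step, ()):
            a += side * (1.0 + p["sigma_s"] * rng.gauss(0.0, 1.0))
    return 1 if a > p["bias"] else 0

def accuracy(p, n_trials=2000, seed=0, duration=3.0):
    """Fraction of correct choices over simulated trials with unequal pulse counts."""
    rng = random.Random(seed)
    n_correct = n_scored = 0
    for _ in range(n_trials):
        n_r, n_l = rng.randint(0, 8), rng.randint(0, 8)
        if n_r == n_l:
            continue  # no correct side is defined for ties
        right = [rng.uniform(0.0, duration) for _ in range(n_r)]
        left = [rng.uniform(0.0, duration) for _ in range(n_l)]
        choice = simulate_choice(right, left, p, rng, duration)
        n_correct += int(choice == (1 if n_r > n_l else 0))
        n_scored += 1
    return n_correct / n_scored

# Hypothetical parameter sets that differ only in accumulation noise.
low_noise = {"sigma_a": 0.5, "sigma_s": 0.3, "leak": 0.0, "bias": 0.0, "lapse": 0.05}
high_noise = dict(low_noise, sigma_a=3.0)
acc_low = accuracy(low_noise)
acc_high = accuracy(high_noise)
print("accuracy, low accumulation noise:", acc_low)
print("accuracy, high accumulation noise:", acc_high)
```

In this sketch, raising the accumulation noise alone lowers accuracy. Because sensory noise also inflates the variance of the final accumulator value, the two parameters can trade off against each other when fitted to a limited number of trials.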

2) Evidence for evidence accumulation signals from imaging of somatic calcium in Purkinje cells: The authors report signals that appear similar to evidence-dependent and choice-selective ramping responses previously reported in many cortical and subcortical areas of primates and rodents. The manuscript reports temporal dynamics of Purkinje cell somatic calcium signals that, in many cells, manifest as a gradual ramping response during the cue period that is correlated with both the number of pulses on one or the other side (evidence) and/or the animal's eventual response (choice). These responses are interpreted with analogy to the ramping responses in trial-averaged spike rates recorded from parietal cortex (along with other forebrain regions) that are thought to encode the time-integral of sensory evidence. However, the manuscript does not convincingly demonstrate that cerebellar neurons are actually encoding accumulated evidence. In general, given the uncertainty about the origin of these signals, more caution is warranted in interpretation of the apparent ramping signals. Specifically:

We thank the reviewers for pointing out that our claims on this topic came across stronger than intended. As suggested, we have edited our interpretations to be more cautious. Our adjustments clarify what we now intend to claim and eliminate vagueness. We agree, as we note in the responses below, that (1) we have not demonstrated that cerebellar neurons precisely encode the stepwise accumulation of evidence, (2) the coding strategy within the cerebellum may be different than that of neocortex, and (3) our measurements are limited in the ability to answer (1) and (2). The analogy to more finely resolved neocortical signals is intriguing but not crucial to our claims, being more of a discussion point. We have therefore re-worded our conclusions to emphasize these distinctions.

On that note, some of the reviews in this section are concerned with how evidence is specifically represented, and how that may relate to specific evidence-accumulation models. As we note at the start of our response, an accumulation role per se for the cerebellum is possible but beyond our data to support. In this first contribution to the study of the cerebellum in this task, we focus on findings that precede such a claim. We consider our main finding to be that choice- and evidence-related information is present with sufficient fidelity to play a role in decision-making computations.

(Note that point 2a and our accompanying response are found below alongside point 2d).

b) Given the temporal resolution of the calcium indicator, it is difficult to interpret any evidence-related signal dynamics as reflecting an underlying ramp in neuronal firing rate. The original description of the use of somatic calcium imaging to track changes in Purkinje cell simple spike rate (Ramirez and Stell, 2016) found that these signals were so slow that step changes in simple spike rate could result in ramps of fluorescence like those seen here. This technical concern would best be addressed by at least some recordings comparing the calcium signals to simple spike rate with electrophysiology. Although the authors have tried to be careful in their writing, the text currently leaves plenty of room for misinterpretation by less technical readers. They should clarify the text further to be much more clear about the limitations of interpreting the temporal dynamics of the calcium signals.

We have clarified the text to emphasize the limitations in interpreting our calcium signals (subsection “Purkinje cell somatic calcium encodes task-relevant information”, Discussion section). We have more strongly noted that step changes in firing rate are indeed possibly the underlying representation.

While simultaneous calcium and electrical recordings are beyond our current technical abilities, we have obtained electrical recordings of Purkinje cells during performance of the task in trained animals. These recordings substantiate our primary findings of task-modulated signalling and suggest that the trial-averaged calcium signals we observe indeed correspond to ramps in firing rate. These recordings are now included in Figure 2—figure supplement 2. We include these important data, but nevertheless remain conservative in our interpretations of calcium data (Discussion section), especially since the nature of these signals remains unresolved at the single-trial level.
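The interpretive limitation raised in point 2b can be illustrated with a minimal sketch in which a step change in firing rate is passed through a slow first-order filter. The filter form, its time constant, and the firing rates are assumptions for illustration; this is not the kernel of Ramirez and Stell (2016) or our analysis code.

```python
import math

def fluorescence_from_rate(rate, dt, tau):
    """First-order low-pass filter, tau * dF/dt = rate - F, with exact discretization."""
    alpha = 1.0 - math.exp(-dt / tau)
    f = rate[0]  # start at steady state for the initial rate
    out = []
    for r in rate:
        f += alpha * (r - f)
        out.append(f)
    return out

dt = 0.01   # seconds per sample
tau = 1.0   # hypothetical indicator time constant, in seconds
# Hypothetical simple spike rate: an abrupt step from 50 to 80 Hz at t = 1 s.
step_rate = [50.0 if i < 100 else 80.0 for i in range(400)]
filtered = fluorescence_from_rate(step_rate, dt, tau)

for i in (99, 150, 250, 399):
    print(f"t = {i * dt:.2f} s  rate = {step_rate[i]:.0f}  filtered = {filtered[i]:.1f}")
```

Although the underlying rate changes instantaneously, the filtered trace rises gradually over a few time constants, so a ramp in fluorescence does not by itself imply a ramp in firing rate.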

c) The relevant decision variable in the pulse accumulation task is the *difference* in pulses between the two sides. The authors seem to know this point well based on how they plot the psychometric functions. Yet at best a small minority of Purkinje cells encode this value. Subsection “Neuronal signatures of choice and evidence in Purkinje cells” reports that 39/843 cells (less than 5%) encode either a sum or difference of the number of pulses on each side. The specific figure for the number of cells encoding a difference is not reported, but Figure 3 suggests that it is about 5 cells (so ~2% of evidence-modulated cells or ~0.5% of all cells). These numbers question strong conclusions about the representation of the decision variable and accumulated evidence in cerebellum. Further, this result could be taken to show that responses in rodent cerebellum are different from those in primate cortex that are often interpreted as a representation of the decision variable for perceptual discrimination. Notably, it also indicates a dissociation between rodent cerebellum and neocortex, as Scott et al., estimate that as many as 1/3 of evidence-modulated cortical cells are better explained by the difference between sides. Therefore, the data strongly suggest that the critical decision-making operations are actually implemented downstream of the recorded cerebellar population. This strikes me as highly relevant to how the overall results should be interpreted, and it should be emphasized more strongly in the summary and discussion.

While it is true that a subset of cells in (Scott et al., 2017) represented the #R-#L quantity, a key conclusion of that study was that left and right evidence may be accumulated mostly independently. See their first discussion paragraph: “Behavioral Analysis and Neural Recordings Support the Existence of Two Weakly Coupled Accumulators.” In this light, we think that representations of single-sided evidence could possibly be important contributors to a decision-making circuit.

Nevertheless, we agree with the reviewer that cerebellar computations might not be the site of accumulation per se but possibly a filtering step preceding or following accumulation, as a node in a complex decision-making circuit. It may be the case, for example, that the cerebellar contribution is related to the individual accumulators proposed by (Scott et al., 2017), and that the weak coupling occurs in neocortex or elsewhere. We have now emphasized this point (Discussion section).

(We respond to the following two reviews, 2a and 2d, together below):

a) While there appear to be lots of time-modulated signals, the link to evidence accumulation and choice is not clear. It is apparent that the most salient feature of the recorded cerebellar population is a constant modulation, either upwards or downwards, with the passage of time in the trial. The presence of this signal complicates the interpretation of any ramping activity as being attributable to evidence accumulation. However, the statistical analyses (i.e. the linear model reported in subsection “Neuronal signatures of choice and evidence in Purkinje cells”) ignore this component of the responses. This might be valid if the representation of time were truly independent from the representation of evidence. But no such independence is established. For example, cells that are more strongly driven by evidence might also be more strongly driven by time, or more likely to be driven by time in one direction or another. The authors should more carefully consider the relationship between the different components of the cerebellar responses and/or formally control for potentially confounding effects of the time-related responses in the statistical models.

d) The analysis of the representation of accumulated evidence (subsection “Neuronal signatures of choice and evidence in Purkinje cells” and Figure 3) ignores the time course of representation, focusing on a short window immediately before the decision. Even if we disregard technical problems explained above in 2a-c, it is unclear whether the cerebellar neurons represent integration of evidence over time or merely its final outcome. Note that the population analysis for the representation of evidence (Figure 2F) does not answer this question (and is generally not quite informative) because evidence is defined as the "correct" choice in that analysis rather than the magnitude of evidence (#R-#L).

Points 2a and 2d are each distinct and important points, which concern the temporal representations observed in Purkinje cells in our study. We address the specifics of these concerns below, but we note generally that their relevance is contingent on an assumption that we are claiming cerebellar neurons represent integration of evidence per se. While we are aware of studies that have made strong claims about the precise representations of accumulated evidence in single-neuron signals, we are not making claims as strong. We have modified the text in the manuscript to clarify this (subsection “Dynamics of choice- and evidence-related information in Purkinje cells”, Discussion section).

Our intention is to demonstrate that we observe evidence-related information in neuronal activity. Specifically, we make 2 points about evidence representation in our data:

(1) Stimulus information is present throughout the cue period: evidence-side decoding in Figure 2F demonstrates population-level encoding of evidence-related information throughout most of the cue period (which, as noted by the reviewer, is not necessarily of the #R-#L value, and we refer to review 2c where we have responded to this point).

(2) The magnitude of single-sided evidence is present in some neurons (via an unresolved temporal encoding scheme): shown by linear model for evidence strength effects in Figure 3C.

These analyses are comparable to the type shown in Figure 5 of (Scott et al., 2017), which demonstrates the presence of task-relevant information but does not resolve its time course. Further, based on new data given in response to review 4a, we know that these representations are task-specific.

Notably, we do not claim that:

(1) cerebellar modulation is entirely driven by evidence. Some level of time modulation, as suggested in review 2a is likely in our view, though we note that our signals cannot be exclusively a time representation because of the analyses mentioned above.

(2) cerebellar signals represent the integration of evidence on a moment-to-moment basis. Our resolution and signal-to-noise do not allow us to answer this yet. Even if our neuronal population encodes evidence by, for example, a distributed temporal basis, the presence of the necessary information at all is an important finding.

Additionally, we know from our upcoming study of neural correlates in the primary and secondary visual cortices that even apparently simple sensory responses (time-locked to individual stimuli) are modulated by time and numerous other factors (Koay et al., 2018). The slow calcium dynamics of Purkinje cell somata make disentangling time and evidence effects very difficult, and this is better addressed by specific experiments, e.g. with carefully designed sampling of stimulus times, which is beyond the scope of our current study.

In summary, we think that the concern in 2a (evidence and time modulations may be intertwined) and the concern in 2d (neuronal signals may not directly track evidence over time) are interesting questions we have not resolved, but that our core conclusions do not depend on the answers. Finally, we note that we are currently performing electrical-recording experiments that address this important issue as part of a follow-up study.

3) "Error-related" dendritic responses

In addition to somatic calcium signals, the authors analyzed dendritic calcium signals and found that these were higher on error vs. correct trials. The implications of this observation are emphasized heavily; for example, the Discussion section concludes that "the cerebellum, which learns from error to guide action, may help in the learning and tuning of accurate responses". This is an interesting proposal, but one might have had the same belief prior to seeing the results reported here (given existing human data and theories about the cerebellum), and it's not clear how the data should update it.

In this work we focused on reporting what is, to our knowledge, the first observation of error-related signalling in the Purkinje cell dendritic pathway during a cognitive, decision-making context. We agree with the reviewers that we did not resolve many interesting details of this signalling, in some part due to limitations of the experiment as explained in the specific replies below. Nevertheless, we feel that our result builds a bridge between known roles of error signals in motor contexts and hypothesized but never-observed roles in the domain of decision-making.

Beyond describing the finding, at this time we feel limited in our ability to propose specific roles of the observed dendritic signalling. We have therefore removed the heavy emphasis (like that referenced by the reviewer above). We still consider these findings an important result to include as they motivate future studies that we are currently pursuing, and we hope that they will also be of use to others given the current interest in this topic (Sendhilnathan et al., 2018; Wagner et al., 2017).

a) There is speculation about how the observed error responses could be used as error signals to guide learning, in line with existing models of cerebellar learning. However, it was not clear to the reviewers how that would work, especially given that the error responses observed were not directional. Typically, Purkinje cell complex spikes are thought to provide a directional signal for learning, not just a correct/incorrect signal. There are no analyses to support the proposed functional role of these error related responses as being involved in "tuning" responses or correcting errors. Yet, it should be possible to provide a more thorough analysis of the error signals:

What do error signals predict about future behavior? An obvious analysis would be to ask whether the magnitude of the error signal on trial n−1 influences choice accuracy on trial n. This seems to be a clear prediction from the proposal that the cerebellum "tunes" behavior or "corrects errors".

We appreciate the trial-by-trial analysis suggestion and we are very interested in knowing the answer to this question as well, especially given the success of similar analyses in recent cerebellar work on head movements (Brooks, Carriot and Cullen, 2015), eye movements (Medina and Lisberger, 2008), and eyeblink conditioning (Ten Brinke et al., 2017). We attempted this analysis, and encountered an issue of statistical power that is related to slow learning in this behavior and therefore prohibitively low trial counts.

To illustrate the point, we have performed a power analysis considering the number of correct and error trials in our dataset. To detect a 10% difference from mean performance (at α=0.05) in fraction correct choices, one needs approximately 16x the number of trials we collected in our imaging study. We also note that this is under a very generous assumption that there would be a 10% difference in performance as a result of dendritic signalling within a session, but we know that in our task, small improvements in performance take place over weeks (learning occurs on the order of thousands of trials with gradually diminishing error rates). This is in contrast to the behaviors used in studies that have successfully performed such analyses, where learning is substantially faster and can thus be expected to be accompanied by robust neural correlates in just tens of trials.
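A calculation of this general kind can be sketched with the standard normal-approximation formula for comparing two proportions. This is a generic textbook calculation, not necessarily the one we performed; the baseline accuracy, the effect size, and the 80% power target are assumptions chosen for the example.

```python
import math
from statistics import NormalDist

def trials_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate number of trials per group needed to distinguish fraction
    correct p1 from p2 with a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = 0.5 * (p1 + p2)
    term_null = z_alpha * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
    term_alt = z_beta * math.sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2))
    return math.ceil((term_null + term_alt) ** 2 / (p1 - p2) ** 2)

# Hypothetical example: 70% correct versus 77% correct (a 10% relative change).
n_needed = trials_per_group(0.70, 0.77)
print("trials needed per group:", n_needed)
```

Because the required trial count scales with the inverse square of the difference in proportions, halving the expected effect roughly quadruples the number of trials needed.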

We are encouraged to have found this elevated signalling and now that we know it exists, it may be possible to design a study to test this, but it would best involve a new or modified experimental design. We have added text to the Discussion on this topic.

c) How do the error signals relate to the strength of evidence on each trial? This is briefly mentioned at the end of the Results section and in Figure 4—figure supplement 1, but without any statistical tests or interpretation. Whether and how the error signals are modulated by the strength of evidence seems key to determining their functional role in updating behavior, if any. In particular, we would have expected that true "error" responses would be higher for easier trials, but the opposite appears to be true.

We have added statistics to the reporting of these results, which proved to be not significant despite a possibly interesting trend. If the trend had been significant, it may have been compatible with inverted reward signalling which is known to occur in some other brain regions (Cohen, 2007; Matsumoto and Hikosaka, 2007). We now comment on this in the manuscript (Discussion section).

4) Ruling out sensory and motor confounds as potential sources of somatic and dendritic calcium signals

a) Figure 3B indicates that isolated puffs produce a calcium response that gradually rises and falls over ~1 s. On trials with strong evidence, these signals will overlap. Assuming that they sum reasonably linearly, overlapping calcium responses from neurons that encode only the transient presentation of evidence would nevertheless give the impression of a gradual ramp with a slope that depends on the quantity of evidence. Note that this is a different issue from the point raised in the Discussion section about distinguishing single-trial ramps from steps. Rather, it makes it unclear whether the evidence-related cerebellar responses correspond to a representation of the momentary sensory evidence or to the magnitude of accumulated evidence that drives choice. This distinction is critical to interpreting proposed neural implementations of evidence accumulation models. One possible way to address this would be to record responses during stimulus presentation from animals that are not engaged in making a decision.

We see the importance of the suggested control, and we have included new data from mice subjected to trials under identical conditions but not trained in the decision-making behavior. These data can be seen in Figure 3—figure supplement 1 (and subsection “Dynamics of choice- and evidence-related information in Purkinje cells”), and they indicate an absence of ramping, evidence-side decoding, or puff count representation in these signals. We cite this, and the electrical recordings given in response to review 2b as evidence that the calcium modulation we observe is not a result of simple sensory responses.
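The linear-summation scenario raised in point 4a can be made concrete with a short sketch in which each puff evokes an identical transient and the transients simply add. The alpha-function shape, its time constant, the 3 s cue period, and the evenly spaced puff times are hypothetical choices for illustration.

```python
import math

def puff_kernel(t, tau=0.3):
    """Transient response to a single puff (alpha function peaking at t = tau).
    The shape and time constant are hypothetical."""
    return (t / tau) * math.exp(1.0 - t / tau) if t >= 0.0 else 0.0

def summed_response(puff_times, t_grid):
    """Linear sum of single-puff transients evaluated on a time grid."""
    return [sum(puff_kernel(t - p) for p in puff_times) for t in t_grid]

cue_duration = 3.0  # hypothetical cue period, in seconds
t_grid = [i * 0.01 for i in range(301)]

end_values = {}
for n_puffs in (2, 4, 8):
    # Evenly spaced puffs across the cue period.
    times = [(k + 0.5) * cue_duration / n_puffs for k in range(n_puffs)]
    trace = summed_response(times, t_grid)
    end_values[n_puffs] = trace[-1]
    print(f"{n_puffs} puffs: response at end of cue period = {trace[-1]:.2f}")
```

In this toy case the summed signal at the end of the cue period grows with the number of puffs even though no unit integrates evidence, which is why recordings from animals not engaged in the decision are an informative control.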

b) The authors attempt to rule out that the somatic signals are related directly to movement of the animal and less related to the "cognitive" variables. The analysis in the supporting figures is in line with the level of detail often applied in the field currently. But, is it possible that the activity is related to other movements of the animal rather than in the orofacial region? The authors should either cite strong evidence showing that these are the only relevant regions for the area of cerebellum examined or provide additional videography results from other parts of the body (e.g. paws). This is of course a significant point to make as strong as possible given the cerebellum's long-established role in motor behaviors.

We now cite evidence of primarily orofacial representations in crus I (subsection “Purkinje cell somatic calcium encodes task-relevant information”). In particular, we note here that “in rats, the largest whisker projection area in the cerebellar cortex is located in crus 1, occupying the largest part of that folium” (Bosman et al., 2010) (and crus 1 features are well conserved in rodents (Sugihara, 2018)), and that a comprehensive review of cerebellar representations concludes “the physiological role of crus I and crus II in controlling forelimb and hindlimb muscles could no longer be maintained” (Manni and Petrosini, 2004).

Nevertheless, we now include more movie-based measurements of forepaws, which also support the conclusion that movements do not account for the somatic signals (subsection “Purkinje cell somatic calcium encodes task-relevant information”, Figure 2—figure supplement 4, Video 3).

We appreciate the reviewers’ concerns and do not believe we can extensively rule out all movements in the body (including other body parts not visible, and other more subtle movements). That said, it is a less likely explanation than the one we propose, given the known roles of this region and our inactivation results. We have demonstrated that the strongest expected movement-related signals, orofacial movements, are not responsible for the somatic activity that we observe. By adding forepaw analyses, we have furthermore shown that less-likely movement-related signals are also not an explanation. This standard of proof is comparable to that of the rest of the evidence-accumulation field.

c) The analyses in Figure 4—figure supplement 1 do not convincingly rule out the possibility that the difference in dendritic calcium signals on error trials resulted from differences in licking on error trials. That figure clearly shows that cessation of licking proceeds at different rates on error trials compared to other times. How does trial type (correct vs error) affect the relationship between licking (or changes of rate in licking) and dendritic calcium signals?

We take this concern very seriously and we have thought hard about what analysis properly addresses it. We believe the reviewer’s question is addressed by our analyses in Figure 4F and Figure 4—figure supplement 1. We now explain our rationale more thoroughly in the text (subsection “Error-associated signalling in Purkinje cell dendrites”), as well as include an additional panel in Figure 4F to visually demonstrate the rationale of the analysis.

To our understanding, the specific null hypothesis we need to reject is that the elevation in dendritic signalling with errors is reflective not of an error event but rather of a specific motor event that tends to co-occur with errors. The most likely such motor event would be the cessation of licking, since this is the one that reliably occurs at the moment of errors (as the reviewer notes that we show in Figure 4—figure supplement 1A), but others, such as initiation of licking (perhaps toward the other direction) or movements of other sorts are possible too.

The reasoning of our analysis is as follows: consider first the hypothesis that dendritic activity increases not specifically with errors but rather when animals stop licking, an event that nearly always occurs when animals make errors. If this were true, then the cessation of licking, whether or not it occurs at the moment of error, would elicit this elevated signalling. We take advantage of the fact that we have many additional instances where lick cessation occurred not at the moment of errors. These “non-error” lick cessations occur in all correct trials, as noted by the reviewer. We computed the dendritic responses at all such moments of lick cessation, in both error contexts and non-error contexts (i.e. correct trials). We then compared the responses on a cell-by-cell basis across these two contexts, yielding an error:correct activity ratio. If the best explanation for the elevated signalling were a lick cessation-related signal, then these lick cessation events should elicit responses of similar magnitude (i.e. error:correct activity ratio of 1), but we find a statistically elevated level of signalling specifically when lick cessation occurs with errors, as compared to when it occurs elsewhere. This implies that signalling is specifically elevated at moments of error beyond what is expected just when animals stop licking in non-error contexts. In other words, lick cessation in error trials elicits a larger dendritic response than lick cessation in correct trials, which directly answers the question “How does trial type (correct vs error) affect the relationship between licking and dendritic calcium signals?”
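The logic of this comparison can be sketched in a few lines. The per-cell response values below are invented for illustration, and the two-sided sign test is a simple stand-in rather than the statistical test used in the manuscript.

```python
import math

def activity_ratios(error_resp, correct_resp):
    """Per-cell error:correct ratio of mean dendritic response at lick cessation.
    Responses are assumed to be positive event rates or amplitudes."""
    return [e / c for e, c in zip(error_resp, correct_resp)]

def sign_test_p(ratios, null=1.0):
    """Two-sided sign test of the hypothesis that the median ratio equals `null`."""
    above = sum(r > null for r in ratios)
    below = sum(r < null for r in ratios)
    n = above + below
    k = max(above, below)
    tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

# Hypothetical per-cell mean responses at lick cessation in each context.
error_resp = [1.30, 1.10, 1.45, 1.20, 1.05, 1.60, 1.25, 1.15, 1.40, 1.35]
correct_resp = [1.00, 1.00, 1.10, 0.90, 1.00, 1.20, 1.00, 1.05, 1.10, 1.00]

ratios = activity_ratios(error_resp, correct_resp)
p_value = sign_test_p(ratios)
print("ratios:", [round(r, 2) for r in ratios])
print("sign test p-value:", p_value)
```

Under the pure lick-cessation hypothesis the ratios would scatter around 1. Ratios consistently above 1 indicate an additional contribution tied to the error context.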

We apply the same logic to various definable motor events, such as moments of low, medium, and high rates of licking, as well as lick initiation and orofacial movements. We also apply the lick initiation and cessation analyses over differing time windows, including the dendritic activity preceding and following the lick initiation/cessation, to account for the possibility that a lick-related signal in the cerebellum temporally precedes or follows the licking measurement.

We consider this to be the most thorough approach we can take to answer whether there exists elevated error-related dendritic signalling beyond what is expected from specific motor hypotheses.

To clarify this analysis for readers, we have added an additional panel (in Figure 4F) illustrating this type of measurement, and relabelled the descriptions to align with the “correct vs error” phrasing given by the reviewer (Figure 4F, Figure 4—figure supplement 1B).

[Editors' note: further revisions were requested prior to acceptance, as described below.]

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below:

The authors have made a concerted effort to clarify the text of their paper and have provided reasonable responses to many of the points raised. As was noted in the first round of reviews, rebuttal, and in the text, it remains unclear what the cerebellum's specific role or computation is during the task, and the reviewers were somewhat disappointed that most of the key questions are deferred to future studies. However, the reviewers also felt that defining cerebellum's exact role in the task may be asking too much for a first study, particularly given the limitations of the dataset and the conceptual and analytical complexities that prevent the authors from specifying the nature of neural representations and the source of the behavioral deficit following cerebellar inactivation.

Overall, there are several interesting leads about the role(s) of cerebellum in perceptual decisions in this paper. While none of these are firmly established by the current study, the reviewers agree that the writing and presentation of the results is generally fair and not greatly over-stated, and favor publication so that the results can be evaluated by the field and followed up in future studies from other groups.

The reviewers note that in the process of clarifying what can and cannot be claimed based on the existing data, the scope of the paper is limited to three points: successful training of mice to do the task, reduced accuracy following cerebellar inactivation, and representation of task-relevant variables in cerebellar neural population without specifying the exact nature of the represented variables. Before publication, the reviewers agree that it is important to ensure that after toning down its claims, the manuscript does not leave behind any statements that could mislead readers as to what is actually shown. The reviewers have identified the following statements from the Title, Abstract, and Discussion section that are potentially misleading and should be restated with more specific statements that more accurately match the conclusions that can be supported by the data.

Title

A cerebellar role in evidence-guided decision-making.

Cerebellar involvement in an evidence-accumulation decision-making task.

Impact Statement

The lateral posterior cerebellum participates in evidence-accumulation-based decision-making, and Purkinje neurons in this region encode choice-, evidence-, and error-related variables. [Suggest replacing this stronger statement with language more like that used in the rebuttal, such as "choice- and evidence-related information is present in lateral posterior cerebellum and could participate in decision-making computations during a decision-making task involving evidence-accumulation."].

In a new evidence-accumulation decision-making task, activity of the lateral posterior cerebellum is necessary for accurate performance, and Purkinje cell somatic and dendritic activity contain choice/evidence and error-related information.

Abstract

- Here we show that during perceptual decision-making over a period of seconds, decision-, sensory-, and error-related information converge on the lateral posterior cerebellum in crus I, [The presence of task-related signals is shown. Convergence is not, and decision and sensory signals are not clearly dissociated].

Sentence removed from abstract.

- Demonstrated that cerebellar inactivation reduces behavioral accuracy without impairing motor parameters of action [Not all motor parameters were controlled for].

(Abstract) Cerebellar inactivation led to a reduction in the fraction of correct trials.

- We found that Purkinje cell somatic activity encoded choice- and evidence-related variables [Please avoid the suggestion that the specific variables that are encoded have been determined].

(Abstract) “…we found that Purkinje cell somatic activity contained choice/evidence-related information.”

- Decision errors were represented by dendritic calcium spikes, which are known to drive plasticity [This could misleadingly suggest that they are known to drive plasticity in this context].

(Abstract) “Decision errors were represented by dendritic calcium spikes, which in other contexts are known to drive cerebellar plasticity.”

- We propose that cerebellar circuitry may contribute to the set of distributed computations in the brain that support accurate perceptual decision-making. [Should be more focused on task performance].

(Abstract) “We propose that cerebellar circuitry may contribute to computations that support accurate performance in this perceptual decision-making task.”

Discussion section

- Cerebellar inactivation reduces animals' use of evidence and increases their use of choice history. [given the limitations of the interpretation of the inactivation experiments, this statement should be more conservative].

(Discussion section) Our fits to a behavioral choice model indicate that reduced performance was accompanied by a decreased weighting of evidence and increased weighting of choice history parameters.

- Given the temporal resolution of calcium measurements, our somatic signals may correspond to firing rate ramps (Shadlen and Newsome, 2001), steps (Latimer et al., 2015), or more complex response profiles that form a temporal basis for evidence accumulation (Scott et al., 2017). [This statement, as well as the corresponding section of the Results section, should include an explicit reference to the time course of somatic calcium signals from Purkinje cells, which is at least an order of magnitude slower than typical calcium imaging (Ramirez and Stell, 2016).].

(Discussion section) Cytoplasmic calcium acts as a temporally filtered readout of firing rate, limited by calcium removal times that are slower in Purkinje cells (Konnerth et al., 1992; Lev-Ram et al., 1992; Fierro and Llano, 1996; Rokni and Yarom, 2009; Ramirez and Stell, 2016) than in neocortical neurons (Chen et al., 2013).

(Subsection “Purkinje cell somatic calcium encodes task-relevant information”) Cytoplasmic calcium acts as a temporally filtered readout of firing rate, and calcium extrusion in Purkinje cells occurs on a slower time scale (Konnerth et al., 1992; Lev-Ram et al., 1992; Fierro and Llano, 1996; Rokni and Yarom, 2009; Ramirez and Stell, 2016) than in neocortical neurons (Chen et al., 2013). Therefore, our observed increasing and decreasing time courses of calcium could reflect various firing rate profiles, such as impulse responses, ramps, or steps.
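
As a rough illustration of why a slowly filtered calcium signal leaves the underlying firing-rate profile ambiguous, the sketch below passes three hypothetical rate profiles (impulse, step, ramp) through a single-exponential leaky integrator. The function name `filter_rate`, the time constant `tau`, the sampling step `dt`, and the rate profiles are assumptions chosen for illustration; they are not measured values or the methods of the paper.

```python
import math


def filter_rate(rate, dt, tau):
    """Leaky integration of a firing-rate trace with decay constant tau.

    Discrete update: c[t] = c[t-1] * exp(-dt / tau) + rate[t] * dt
    """
    decay = math.exp(-dt / tau)
    trace, c = [], 0.0
    for r in rate:
        c = c * decay + r * dt
        trace.append(c)
    return trace


def is_rising(trace):
    """True if the trace never decreases from one sample to the next."""
    return all(b >= a for a, b in zip(trace, trace[1:]))


dt, tau, n, onset = 0.01, 2.0, 400, 100  # seconds, seconds, samples, sample

# Three hypothetical firing-rate profiles over a 4 s window.
impulse = [1.0 / dt if i == onset else 0.0 for i in range(n)]
step = [1.0 if i >= onset else 0.0 for i in range(n)]
ramp = [i / (n - 1) for i in range(n)]

impulse_ca = filter_rate(impulse, dt, tau)
step_ca = filter_rate(step, dt, tau)
ramp_ca = filter_rate(ramp, dt, tau)

# With a slow tau, both the step and the ramp give a rising trace over the
# whole window, while the impulse gives a jump followed by a slow decay.
print(is_rising(step_ca), is_rising(ramp_ca), is_rising(impulse_ca))
```

With a removal time comparable to the window length, the step and the ramp both produce gradually increasing traces, which is why fluorescence alone cannot readily distinguish between them.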

- The task-modulated activity we observe encodes both choice-related and evidence-related variables that may be used during the decision-making process. [Please avoid the suggestion that the specific variables that are encoded have been determined].

(Discussion section) …the neural activity we observed contains task-relevant information that may be used during evidence accumulation and decision-making.

- We observed an excess of dendritic calcium events coincident with decision errors, demonstrating for the first time observations compatible with error-associated signalling in a decision-making reward context. [given the limitations of the interpretation of these signals, this statement should be more conservative].

(Discussion section) We observed an excess of dendritic calcium events coincident with decision errors, which has not been previously reported in the cerebellum.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

The manuscript has been improved but there are some final issues that need to be addressed before acceptance, as outlined below:

In response to the request to re-evaluate the evidence for ramping signals that could be obtained with the somatic calcium imaging, the authors now state (Subsection “Purkinje cell somatic calcium encodes task-relevant information”), "Therefore our observed increasing and decreasing time courses of calcium could reflect various firing rate profiles, such as impulse responses, ramps, or steps."

In light of this revision, as well as the fact that the electrophysiological evidence provided in Figure 2—figure supplement 2 is from only a few cells, all of which show positive ramps of activity, the following statements should also be revised:

- (Subsection “Purkinje cell somatic calcium encodes task-relevant information”), "We did find that electrically recorded Purkinje cells exhibited gradually increasing rates of firing throughout the cue period (Figure 2—figure supplement 2), suggesting that on average across trials, the fluorescence signals we observed correspond to firing rate ramps."

(Subsection “Purkinje cell somatic calcium encodes task-relevant information”) “We did find that electrically recorded Purkinje cells exhibited gradually increasing rates of firing throughout the cue period (Figure 2—figure supplement 2).”

- (Discussion section), "Our electrical recordings also showed ramps, suggesting that temporally filtered firing rate ramps are sufficient to account for our observed fluorescence signals."

(Discussion section) “Our electrical recordings also showed ramps, consistent with the idea that temporally filtered firing rate ramps are sufficient to account for our observed fluorescence signals.”

The reason we went to the effort to obtain and include electrical recordings, per suggestions of many of our colleagues, was that it could help substantiate the possibility of fluorescence ramps corresponding to firing rate ramps. We think that the Discussion is an appropriate place to at least note the relationship between these two findings and our interpretation of them.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Your revised article has been favorably evaluated by our editors again, but there remain some issues that need to be addressed before acceptance, as outlined below. Given that this is the third request for revisions, we will be unable to follow with any more. Please attend to this final issue one way or the other so that the next letter will be the final one.

We appreciate the authors' desire to speculate here. However, in our view, the "suggests/consistent with" was not the only problem with this sentence. There is also a problem with "sufficient". The electrophysiological evidence in Figure 2—figure supplement 2 is anecdotal and non-quantitative. For this statement to be left in, it would need to be adequately supported. In our view, this would require a quantitative comparison between imaging and electrophysiology results. In particular, we would want to know:

- How many Purkinje cells in total were recorded from electrophysiologically? How many of these showed ramping? (all of the cells they showed us show positive ramps, but it is not clear if those were selected from a larger data set)

- Did any Purkinje cells show ramping calcium signals without a transient increase in firing rate?

- Did any Purkinje cells show ramping calcium signals without ramps in firing rate (for instance, in cases where only a transient increase in firing may have been observed electrophysiologically)?

- Why are no decreasing activity ramps found with electrophysiology, but they are found with imaging?

- What accounts for the decreasing ramps that were observed with calcium imaging?

- What would the predicted calcium signals be for the examples shown if the spike rates recorded electrophysiologically (with and without the transient increase/ramping components) were convolved according to Ramirez and Stell (2016)? And/or with the authors' own convolution/deconvolution methods, from the simultaneous calcium imaging/electrophysiological recordings that they performed?

We give the authors the choice of either fully addressing these points, or using compromise language, such as "Preliminary electrical recordings also showed ramps, consistent with the idea that temporally filtered firing rate ramps may account for the observed fluorescence signals."

We thank the reviewers and editors for their suggestions. We have made the requested adjustment to the manuscript text, opting to use the exact compromise phrasing beginning with “Preliminary …” proposed by the editors. Below we include the original sentence for reference, followed by the updated sentence:

(Discussion section), “Our electrical recordings also showed ramps, consistent with the idea that temporally filtered firing rate ramps are sufficient to account for our observed fluorescence signals.”

(Discussion section) “Preliminary electrical recordings also showed ramps, consistent with the idea that temporally filtered firing rate ramps may account for the observed fluorescence signals.”

https://doi.org/10.7554/eLife.36781.022

Article and author information

Author details

  1. Ben Deverett

    1. Department of Molecular Biology, Princeton University, Princeton, United States
    2. Princeton Neuroscience Institute, Princeton University, Princeton, United States
    3. Rutgers Robert Wood Johnson Medical School, Piscataway, United States
    Contribution
    Conceptualization, Resources, Data curation, Software, Formal analysis, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing
    For correspondence
    deverett@princeton.edu
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-3119-7649
  2. Sue Ann Koay

    Princeton Neuroscience Institute, Princeton University, Princeton, United States
    Contribution
    Formal analysis, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  3. Marlies Oostland

    Princeton Neuroscience Institute, Princeton University, Princeton, United States
    Contribution
    Formal analysis, Investigation, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-9474-4040
  4. Samuel S-H Wang

    1. Department of Molecular Biology, Princeton University, Princeton, United States
    2. Princeton Neuroscience Institute, Princeton University, Princeton, United States
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Writing—review and editing
    For correspondence
    sswang@princeton.edu
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-0490-9786

Funding

National Institute of Mental Health (MH115577)

  • Ben Deverett

National Institute of Neurological Disorders and Stroke (NS090541)

  • Ben Deverett
  • Sue Ann Koay
  • Samuel S-H Wang

National Institute of Neurological Disorders and Stroke (NS104648)

  • Ben Deverett
  • Sue Ann Koay
  • Samuel S-H Wang
  • Marlies Oostland

National Institute of Mental Health (MH115750)

  • Samuel S-H Wang
  • Marlies Oostland

National Institute of Neurological Disorders and Stroke (NS045193)

  • Samuel S-H Wang

Nancy Lurie Marks Family Foundation

  • Samuel S-H Wang

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank the members of the laboratories of SW, David Tank, Ilana Witten, and Carlos Brody for discussion and technical assistance, as well as David Tank for advice and support, Esteban Engel for viruses, Ben Scott, Ilana Witten, Carlos Brody, Sandra Aamodt, and Alex Riordan for comments on the manuscript, Stephan Thiberge for microscopy, Steve Lowe for machine shop assistance, Jess Verpeut and Tom Pisano for technical help, and Sarah Welsh for data analysis. Funded by National Institutes of Health grants R01 NS045193, U01 NS090541, U19 NS104648, R01 MH115750, and F30 MH115577, and the Nancy Lurie Marks Family Foundation.

Ethics

Animal experimentation: Experimental procedures were approved by the Princeton University Institutional Animal Care and Use Committee (protocol #1943-16) and performed in accordance with the animal welfare guidelines of the National Institutes of Health. All surgery was performed under isoflurane anesthesia and suffering was minimized in all ways possible.

Senior Editor

  1. Richard B Ivry, University of California, Berkeley, United States

Reviewing Editor

  1. Megan R Carey, Champalimaud Foundation, Portugal

Publication history

  1. Received: March 21, 2018
  2. Accepted: August 11, 2018
  3. Accepted Manuscript published: August 13, 2018 (version 1)
  4. Version of Record published: August 22, 2018 (version 2)
  5. Version of Record updated: August 30, 2018 (version 3)

Copyright

© 2018, Deverett et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • Page views: 1,957
  • Downloads: 328
  • Citations: 0

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
