Motor thalamus supports striatum-driven reinforcement

Abstract
Introduction
Results
Discussion
Materials and methods
Data availability
References
Article and author information
Metrics

Abstract

Reinforcement has long been thought to require striatal synaptic plasticity. Indeed, direct striatal manipulations such as self-stimulation of direct-pathway projection neurons (dMSNs) are sufficient to induce reinforcement within minutes. However, it’s unclear what role, if any, is played by downstream circuitry. Here, we used dMSN self-stimulation in mice as a model for striatum-driven reinforcement and mapped the underlying circuitry across multiple basal ganglia nuclei and output targets. We found that mimicking the effects of dMSN activation on downstream circuitry, through optogenetic suppression of basal ganglia output nucleus substantia nigra reticulata (SNr) or activation of SNr targets in the brainstem or thalamus, was also sufficient to drive rapid reinforcement. Remarkably, silencing motor thalamus—but not other selected targets of SNr—was the only manipulation that reduced dMSN-driven reinforcement. Together, these results point to an unexpected role for basal ganglia output to motor thalamus in striatum-driven reinforcement.

https://doi.org/10.7554/eLife.34032.001

Introduction

Reinforcement refers to a process by which the frequency or intensity of a specific behavior increases over time. Standard models of reinforcement invoke dopamine-dependent corticostriatal plasticity as a mechanism underlying this form of associative learning (Schultz and Dickinson, 2000; Reynolds et al., 2001; Doya, 2007; Maia and Frank, 2011; Yagishita et al., 2014; Xiong et al., 2015). In these models, reinforcement occurs through plasticity of excitatory inputs to striatal action-related ensembles, which results in enhanced future recruitment of these motor programs in similar contexts. Yet other studies have found that activating targets of basal ganglia output can also be reinforcing (classic electrical studies reviewed in Wise, 1996); for recent optogenetic studies, see below) and reinforcement learning in songbirds and primates is proposed to involve basal ganglia regulation of cortical plasticity (Turner and Desmurget, 2010; Fee and Goldberg, 2011). Thus, it remains unclear to what extent striatum-driven reinforcement requires engagement of basal ganglia output projections to downstream targets in brainstem or thalamus.

The striatum contains two distinct classes of projection neurons—direct (dMSNs) and indirect pathway medium spiny neurons (iMSNs)—that regulate motivated behavior by increasing or suppressing movement, respectively (Albin et al., 1989; Kravitz et al., 2010). Numerous lines of evidence support a role for the striatum in reinforcement. Early intracranial self-stimulation experiments showed that striatal stimulation is sufficient to drive operant responding in rodents, in the absence of food reward (Phillips et al., 1976; White and Hiroi, 1998). In addition, striatal microstimulation biases choice in cue-guided decision tasks in primates (Nakamura and Hikosaka, 2006; Williams and Eskandar, 2006). Although some of these effects are due to dopamine release from axon stimulation, pathway-specific optogenetic stimulation later revealed that direct pathway stimulation is reinforcing, whereas indirect pathway stimulation drives avoidance (Hikida et al., 2010; Kravitz et al., 2012). Indeed, mice readily learned an operant task to self-stimulate dMSNs and formed a memory of the stimulation-paired apparatus. This suggests that dMSN stimulation is not just acutely rewarding but can drive long-lasting changes in the brain supporting learning and memory (Kravitz et al., 2012). Consistent with this, stimulation of direct and indirect pathways oppositely biases both choice and vigor in water-reinforced tasks (Tai et al., 2012; Yttri and Dudman, 2016). Similarly, in vivo recordings from primates and rodents show that activity in discrete populations of MSNs encodes reward (Hikosaka et al., 1989; Apicella et al., 1991; Oyama et al., 2010), actions and/or their values (Lauwereyns et al., 2002; Barnes et al., 2005; Pasupathy and Miller, 2005; Samejima et al., 2005; Lau and Glimcher, 2008; Kim et al., 2009; Kimchi and Laubach, 2009). Together, these results demonstrate that MSNs both integrate and support motor and reinforcement functions.

The involvement of regions downstream of the basal ganglia in motor control and reinforcement has also been extensively studied. Different brainstem targets of basal ganglia are associated with regulation of distinct aspects motor behavior: the superior colliculus is implicated in orienting (Hikosaka et al., 2000), the periaqueductal gray is involved in freezing (Gross and Canteras, 2012), and the mesencephalic locomotor region (MLR) is engaged for locomotion (Esposito and Arber, 2016), whereas Dorsal Raphe Nucleus (DRN) neurons are involved in mood regulation and state-dependent control of motivation (Cohen et al., 2015). Basal-ganglia-recipient motor thalamus, which consists of ventromedial (VM), ventral anterior and lateral (VAL) nuclei, is involved in motor performance and skill learning (Hikosaka et al., 2000; Graybiel, 2005; Turner and Desmurget, 2010; Bosch-Bouju et al., 2013; Goldberg et al., 2013; Kawai et al., 2015). Extensive intracranial self-stimulation studies have revealed that a multitude of sites across the brain, including different brainstem targets of basal ganglia, are sufficient to drive reinforcement (Wise, 1996). Recent optogenetic studies also showed that activation of specific cell populations in the MLR (Dautan et al., 2016; Xiao et al., 2016; Yoo et al., 2017) and DRN (Liu et al., 2014; McDevitt et al., 2014) is reinforcing. Additionally, electrical stimulation of motor thalamus (Clavier and Gerfen, 1982) supports operant behavior, and lesions impair striatum-dependent learning (Zis et al., 1984). However, it is poorly understood whether these basal ganglia output targets are necessary for striatum-driven reinforcement. Indeed, after almost 70 years of research on the anatomical substrates of reinforcement, it is still debated whether nuclei supporting self-stimulation operate in series within a single circuit, or belong to multiple, parallel systems (Routtenberg, 1971; Phillips, 1984; Wise and Bozarth, 1984).

Here, we took advantage of optogenetic dMSN self-stimulation in mice to probe both striatal plasticity and basal ganglia circuit mechanisms in striatum-driven reinforcement. In theory, repeated pairing of operant behavior with dMSN depolarization could locally induce Hebbian synaptic plasticity onto stimulated dMSNs and/or engage downstream circuits to drive reinforcement. We found that dMSN self-stimulation was not accompanied by measurable changes in synaptic strength of excitatory afferents, nor was it impaired by disrupting NMDA receptors in stimulated neurons. Furthermore, reinforcement could be elicited by direct suppression of SNr. In order to identify the contribution of basal ganglia target regions to striatal reinforcement, we combined dMSN self-stimulation with silencing of selected basal ganglia output targets. Although brainstem and thalamic targets were sufficient to drive reinforcement, only motor thalamus was necessary for dMSN self-stimulation. Thus, our results highlight an underappreciated circuit mechanism for striatum-driven reinforcement through recruitment of motor thalamus.

Results

dMSN self-stimulation supports reinforcement independently of NMDA receptors

We bilaterally injected Cre-dependent ChR2-eYFP virus and implanted optical fibers in the dorsomedial striatum (DMS) of D1-Cre mice (D1-ChR2) for selective self-stimulation of dMSNs, which represented the primary reinforcer motivating task performance (Figure 1a, Figure 1—figure supplement 1a). Mice learned to self-stimulate dMSNs by poking in an active (laser-paired) nosepoke (Figure 1b), whereas a second, inactive nosepoke had no effect. We characterized the self-stimulation behavior by varying different task parameters. Poke rate was sensitive to laser duration, stimulation pattern, effort, and contingency degradation, all of which are hallmarks of goal-directed behavior (Figure 1—figure supplement 2). For subsequent experiments, we chose one second of constant laser stimulation per nosepoke to optimally balance poke rate and total laser stimulation. Within a single session, D1-ChR2 mice began to make more nosepokes on the active side and rapidly neglected the inactive nosepoke (Figure 1c,d). In contrast, control mice (expressing only eYFP) produced fewer and equal numbers of nosepokes on both sides. Overall, no difference in total locomotion was observed during the session (Figure 1—figure supplement 2f). The total number of active nosepokes per session was stable over days (Figure 1e). Similarly, poking patterns did not change over time, showing stable inter-poke interval distributions and averages between day 1 and 4. Poking behavior could be divided in epochs of poking bursts, defined here by at least two nosepokes performed in <2 s, or an alternative criterion of <6 s. In either case, the number of bursts and pokes/bursts did not change over days (Figure 1—figure supplement 3). Additionally, when mice were subjected to an extinction session 24 hr after the last training session, D1-ChR2 mice made significantly more nosepokes on the previously-active side, despite the absence of dMSN stimulation (Figure 1g). In contrast, D1-eYFP mice showed no preference for either side. Together these results demonstrate that behavior was learned and stabilized within a single session, and that mice form a stable association (lasting at least 24 hr) between the nosepoke and the reinforcer (dMSN stimulation). Therefore, we focused our analysis on the first behavior session (see materials and methods).

Figure 1 with 3 supplements see all

Download asset Open asset

dMSN self-stimulation elicits rapid and persistent NMDAR-independent reinforcement.

(a) Schematic of a sagittal brain section showing optogenetic stimulation of striatal dMSNs (left), and slice showing fluorescent dMSNs after striatal infusion of a Cre-dependent virus encoding eYFP in a D1-Cre mouse (right). (b) Schematic of the behavioral apparatus showing active (laser-paired) and inactive nosepokes. (c) Example cumulative plot of active nosepokes from a D1-ChR2 (blue) and D1-eYFP (black) mouse. (d) Average nosepokes during a single self-stimulation session for D1-ChR2 and D1-eYFP mice (n = 11 and 8, two-way ANOVA, interaction poke x group, F(1,17)=31.03, p<0.001, posthoc Sidak’s multiple comparisons test, ChR2 active vs eYFP active p<0.001, ChR2 active vs inactive p<0.001, eYFP active vs inactive p=0.921). (e) Average nosepokes during four consecutive daily sessions (n = 8, one-way ANOVA, F(1.908,13.36) = 1.237, p=0.319). (f) Distribution of (left) and average (right) inter-poke intervals for days 1 and 4 of dMSN self-stimulation (n = 8, paired t-test, p=0.0575). (g) Average nosepokes during an extinction session for D1-ChR2 and D1-eYFP mice (n = 8 and 8, two-way ANOVA, interaction poke x group, F(1,14)=47.86, p<0.0001, posthoc Sidak’s multiple comparisons test, ChR2 active vs eYFP active p<0.0001, ChR2 active vs inactive p<0.0001, eYFP active vs inactive p=0.5703). (h) Example cumulative plot of active nosepokes from D1-ChR2 mice expressing (D1-Cre, blue) or lacking (D1-Cre x NR1f/f, red) NMDA receptors in dMSNs. (i) Average nosepokes during a single self-stimulation session for D1-ChR2 mice with (‘D1’) or without (‘D1-NR1^KO’) NMDA receptors in dMSNs (n = 9 and 7, two-way ANOVA, interaction poke x group, F(1,14)=0.638, p=0.438, posthoc Sidak’s multiple comparisons test, D1 active vs D1-NR1 active p=0.418). (j) Average distance travelled by D1-ChR2 mice with (‘D1’) or without (‘KO’) NMDA receptors in dMSNs (Mann Whitney U test, p=0.023).

https://doi.org/10.7554/eLife.34032.002

Figure 1—source data 1 Source data for Figure 1.: https://doi.org/10.7554/eLife.34032.008
Download elife-34032-fig1-data1-v2.xlsx

Current models of basal ganglia reinforcement posit that synaptic plasticity of excitatory cortical inputs onto MSNs supports learning (Schultz and Dickinson, 2000; Doya, 2007; Maia and Frank, 2011). NMDA receptors have been implicated in the induction of various forms of synaptic plasticity in the striatum (Calabresi et al., 1992; Thomas et al., 2000; Shen et al., 2008), and interfering with striatal NMDA receptors disrupts action-outcome associations (Yin et al., 2005a). To assess a role for synaptic plasticity in dMSN reinforcement, we generated mice lacking NMDA receptors in dMSNs (D1-Cre x NR1^f/f), in which various forms of plasticity are altered (Dang et al., 2006). However, these mice showed normal dMSN self-stimulation, despite a mild decrease in locomotion (Figure 1h–j, Figure 1—figure supplement 1a).

We next tested whether any trace of synaptic plasticity could be observed in stimulated (ChR2+) dMSNs after training. We trained D1-Cre x D1-tmt mice injected with Cre-dependent ChR2 or eYFP and then prepared acute slices for whole-cell recordings. We targeted neurons directly underneath the optic fiber tip (Figure 2a). We recorded: ChR2 +and ChR2- dMSNs (identified by tmt expression and presence or absence of ChR2-dependent photocurrent, respectively) in D1-ChR2 mice; eYFP +neurons in D1-eYFP mice, and tmt +dMSNs in naive D1-tmt mice. We examined mEPSCs, AMPA/NMDA ratios, paired-pulse ratio, and intrinsic excitability of dMSNs, but found no difference in any of these parameters after training (Figure 2). Similarly, we probed synaptic transmission and excitability in iMSNs (D1-tmt negative) after dMSN self-stimulation and did not observe any plastic changes (Figure 2—figure supplement 1). Taken together, these results prompted us to explore the involvement of circuitry downstream of striatum in reinforcement.

Figure 2 with 1 supplement see all

Download asset Open asset

No measurable changes in excitatory synaptic transmission or excitability in dMSNs after self-stimulation.

(a) Coronal section of the striatum showing ChR2-eYFP expression in DMS, optic fiber track and recording area (dotted box), corresponding approximately to the portion of striatum illuminated during behavior. (b) Whole-cell recordings of dMSNs (tmt+) from trained D1-ChR2 mice (including ChR2 +and ChR2- neurons, identified by the presence or absence of a blue light-evoked photocurrent, inset) and naive mice (tmt+), showing average (left) and example traces (right) for mEPSC frequency and amplitude (frequency: Kruskal-Wallis test, p=0.505; amplitude, one-way ANOVA, F(2,32)=0.6141, p=0.547). (**c,d**) Whole-cell recordings of dMSNs (tmt+) from trained D1-ChR2 mice (including ChR2 +and ChR2- neurons), trained D1-eYFP mice (eYFP +neurons), or naive mice (tmt+), showing averages (left) and example traces (right) for AMPA/NMDA ratio and paired-pulse ratio (c) and excitability (d) [AMPA/NMDA ratio: n = 8,7,6,7; one-way ANOVA, F(3,24)=0.062, p=0.9793; paired-pulse ratio: n = 8,11,11,7; one-way ANOVA, F(3,32)=1.358, p=0.273; excitability: n = 6,8,14; two-way ANOVA, interaction current x group F(10,125)=0.3801, p=0.9533].

https://doi.org/10.7554/eLife.34032.009

Figure 2—source data 1 Source data for Figure 2.: https://doi.org/10.7554/eLife.34032.012
Download elife-34032-fig2-data1-v2.xlsx

Reinforcement through modulation of circuitry downstream of striatum

Given the widespread influence of the basal ganglia over multiple brain structures (Freeze et al., 2013; Oldenburg and Sabatini, 2015; Lee et al., 2016), we asked which specific downstream circuits could be involved in dMSN-driven reinforcement. Activation of inhibitory dMSNs leads to a net suppression of SNr neurons and disinhibition of downstream targets. First, we tested whether bypassing the striatum and directly inhibiting the SNr would be sufficient to recapitulate direct pathway-driven reinforcement. To do so, we bilaterally expressed Cre-dependent Arch3.0 in the SNr of vGAT-Cre mice. Surprisingly, these animals also learned to self-inhibit the SNr (Figure 3a–c, Figure 1—figure supplement 1b).

Figure 3

Download asset Open asset

Optogenetic control of basal ganglia output neurons or their projection targets is reinforcing.

(a) Schematic of a sagittal brain section showing optogenetic inhibition of SNr (left), and coronal slice showing Arch3-eYFP and TH expression after SNr infusion of a Cre-dependent construct in a vGAT-Cre mouse and fiber track (right). (b) Example cumulative plot of active nosepokes from a SNr-Arch3 (green) and SNr-eYFP (black) mouse. (c) Average nosepokes during a single self-inhibition session for SNr-Arch3 and SNr-eYFP mice (n = 7 and 10, two-way ANOVA, interaction poke x group, F(1,15)=13.01, p=0.003, posthoc Sidak’s multiple comparisons test, Arch3 active vs eYFP active p=0.007). (d) Schematic of a sagittal brain section showing optogenetic excitation of VM (left), and coronal slice showing ChR2-eYFP expression after VM infusion of CaMKIIα-ChR2 in a WT mouse (right). (e) Example cumulative plot of active nosepokes from a VM-ChR2 (blue) and VM-eYFP (black) mouse. (f) Average nosepokes during a single self-stimulation session for VM-ChR2 and VM-eYFP mice (n = 8 and 8, two-way ANOVA, interaction poke x group, F(1,14)=22.17, p=0.003, posthoc Sidak’s multiple comparisons test, ChR2 active vs eYFP active p<0.001). (g) Schematic of a sagittal brain section showing optogenetic excitation of DRN (left), and coronal slice showing ChR2-eYFP overlapping with 5HT expression after infusion of DIO-ChR2 in a SERT-Cre mouse and fiber track (right). (h) Example cumulative plot of active nosepokes from a DRN-ChR2 (blue) and DRN-eYFP (black) mouse. (i) Average nosepokes during a single self-stimulation session for DRN-ChR2 and DRN-eYFP mice (n = 4 and 6, two-way ANOVA, interaction poke x group, F(1,8)=38.09, p<0.001, posthoc Sidak’s multiple comparisons test, ChR2 active vs eYFP active p<0.001). (j) Schematic of a sagittal brain section showing optogenetic excitation of the MLR (left), and coronal slice showing ChR2-eYFP expression after MLR infusion (centered around the PPTg) of DIO-ChR2 in a vGLUT2-Cre mouse and fiber tracks (right). (k) Example cumulative plot of active nosepokes from an MLR-ChR2 responsive (resp, blue) and MLR-ChR2 non-responsive (non-resp, black) mouse. (l) Average nosepokes during a single self-stimulation session for MLR-ChR2 responsive and non-responsive mice (n = 2 and 3). Abbreviations: 5HT, 5-hydroxytryptamine; DRN, dorsal raphe nucleus; MLR, mesencephalic locomotor region, PPTg, pedunculopontine tegmentum; SNc, substantia nigra compacta; SNr, substantia nigra reticulata; TH, tyrosine hydroxylase; VM, ventromedial thalamus; VTA, ventral tegmental area.

https://doi.org/10.7554/eLife.34032.013

Figure 3—source data 1 Source data for Figure 3.: https://doi.org/10.7554/eLife.34032.014
Download elife-34032-fig3-data1-v2.xlsx

The SNr distributes basal ganglia output across several brain regions, including a canonical projection to ventromedial thalamus, traditionally recognized as part of the motor thalamus (Starr and Summerhayes, 1983; Klockgether et al., 1985; Deniau and Chevalier, 1992). It was recently shown that VM neurons send projections to and interact with prefrontal cortices, which are involved in decision-making and learning (Kuramoto et al., 2015; Guo et al., 2017). Therefore, we asked whether activation of VM would be sufficient to support operant behavior. We targeted VM neurons with CaMKIIα-ChR2-containing virus in WT mice. These mice showed robust self-stimulation of VM (Figure 3d–f, Figure 1—figure supplement 1c).

Brainstem targets of SNr, including the DRN (Liu et al., 2014; McDevitt et al., 2014; Cohen et al., 2015) and the pedunculopontine tegmentum, a part of the MLR (Xiao et al., 2016; Yoo et al., 2017), have also been associated with reinforcement. However, there is some uncertainty about the neuronal identity of cell populations involved. When targeting serotonergic neurons of the DRN, we observed self-stimulation in SERT-Cre mice injected with Cre-dependent ChR2 (Figure 3g–i, Figure 1—figure supplement 1d), but not in ePET-Cre mice (data not shown). In the MLR, glutamatergic neurons have been shown to receive innervation from SNr and drive both locomotion (Roseberry et al., 2016) and reinforcement (Yoo et al., 2017). We targeted these neurons with Cre-dependent ChR2 infusion into the MLR of vGLUT2-Cre mice and functionally validated expression by confirming increased locomotion upon stimulation (data not shown). When placed into the operant chamber, 2 of 5 mice showed clear self-stimulation behavior, whereas the other three showed no preference (Figure 3j–l, see materials and methods for responder criterion). Taken together, these results show that self-stimulation behavior can be elicited in both the DRN and MLR, raising the possibility that they are also engaged during dMSN self-stimulation to mediate reinforcement.

DRN serotonergic and MLR glutamatergic neurons are not required for dMSN-driven reinforcement

We next asked whether activation of DRN serotonergic neurons or MLR glutamatergic neurons is necessary for dMSN-driven reinforcement. To do so we combined dMSN self-stimulation with various approaches to silence these downstream nuclei. To test the role of DRN serotonin neurons, we virally expressed Cre-dependent Caspase three in SERT-Cre mice, leading to loss of SERT + cells (Figure 1—figure supplement 1e). Selective dMSN excitation was achieved by injecting hSyn-ChR2-eYFP virus into the striatum and placing optic fibers above the anterior tip of SNr, where only fibers from dMSNs are present (Figure 4a). This approach yielded reliable dMSN self-stimulation, similar to striatal cell body self-stimulation (Figure 1—figure supplement 2e). However, we observed no difference in the number of nosepokes performed by mice with lesioned or intact DRN (Figure 4b,c).

Figure 4 with 1 supplement see all

Download asset Open asset

Silencing DRN serotonergic neurons or MLR glutamatergic neurons does not disrupt dMSN self-stimulation.

(a) Sagittal schematic showing injection of a hSyn-ChR2 construct in the striatum and DIO-Casp3 in the DRN of a SERT-Cre mouse, and optic fiber placement above dMSN axons. (b) Example cumulative plot of active nosepokes for dMSN axon stimulation in mice infused with saline (sal, blue) or DIO-Casp3 (casp, orange). (c) Average nosepokes during a single self-stimulation session for dMSN axon self-stimulation in mice with intact (sal) or lesioned (casp) serotonergic neurons in DRN (sal versus casp, two-way ANOVA, interaction pokes x stim F(1,8)=0.45, p=0.521, posthoc Sidak’s multiple comparisons test, sal active vs casp active p=0.76). (d) Sagittal schematic showing optogenetic stimulation of dMSNs and inhibition of MLR glutamatergic neurons (CaMKIIα-eNpHR3.0). (e) Example cumulative plot of active nosepokes from D1-ChR2 mice for dMSN stimulation alone (blue) or paired with MLR inhibition (green). (f) Average nosepokes during a single self-stimulation session for dMSN stimulation alone or paired with MLR inhibition, or MLR inhibition alone (D1 +versus D1+ and MLR-, two-way ANOVA, interaction pokes x stim F(1,10)=1.618, p=0.232, posthoc Sidak’s multiple comparisons test, D1 +active vs D1+ and MLR- active p=0.142; MLR-, Wilcoxon signed rank test, active vs inactive p=0.625).

https://doi.org/10.7554/eLife.34032.015

Figure 4—source data 1 Source data forFigure 4.: https://doi.org/10.7554/eLife.34032.018
Download elife-34032-fig4-data1-v2.xlsx

We then asked whether glutamatergic neurons in the MLR could be involved in dMSN-driven reinforcement. We expressed Cre-dependent ChR2 in the striatum of D1-Cre mice and eNpHR3.0 in the MLR under the CaMKIIα promoter to target glutamatergic neurons (Figure 4d). We confirmed that silencing MLR glutamatergic neurons reduced dMSN-evoked locomotion (Figure 4—figure supplement 1), as previously reported (Roseberry et al., 2016). However, we observed no difference in the total number of nosepokes in mice for dMSN stimulation alone or paired with inhibition of MLR glutamatergic neurons (Figure 4e,f).

VM supports dMSN-driven reinforcement

Given that the brainstem targets of basal ganglia output that we tested do not appear to be required for dMSN-mediated reinforcement, we next examined the role of SNr-recipient VM thalamus. In order to test whether VM activity contributes to dMSN-mediated reinforcement, we inhibited VM with muscimol, a GABA_A receptor agonist, in D1-ChR2 mice prior to optogenetic self-stimulation (Figure 5a, Figure 1—figure supplement 1f). Overall, VM-muscimol mice self-stimulated less than saline-infused control mice (Figure 5b,c). In contrast, VM silencing had no effect on either spontaneous or dMSN-evoked locomotion (Figure 5d,e). To examine the requirement for VM with higher temporal resolution, we injected Cre-dependent ChR2-eYFP in the striatum and DIO-ChR2-eYFP or DIO-eYFP control virus in the SNr of vGAT-Cre mice (Figure 5f, Figure 1—figure supplement 1g). For dMSN-selective stimulation, we bilaterally implanted optic fibers above the cerebral peduncle, which contains axons of dMSNs on their way to the SNr. We also implanted a second set of fibers above VM, to specifically manipulate axon terminals of VM-projecting SNr neurons. This design enabled us to prevent VM disinhibition only during dMSN stimulation. Similar to what we observed with muscimol infusions, mice made fewer nosepokes when dMSN stimulation was paired with SNr terminal stimulation in VM, compared to control mice expressing only eYFP in the SNr (Figure 5g,h). SNr terminals were stimulated at 20 Hz, a frequency similar to basal SNr firing rates in mice (Freeze et al., 2013), and which alone was not reinforcing. Together, these results indicate that VM disinhibition is a critical component of the reinforcing circuit initiated by dMSN self-stimulation.

Figure 5

Download asset Open asset

VM thalamus inhibition disrupts dMSN self-stimulation.

(a) Schematic of a sagittal brain section showing optogenetic excitation of striatal dMSNs combined with muscimol infusion in VM, and coronal slice from a D1-Cre mouse showing ChR2-eYFP-expressing fibers from dMSNs en route to SNr (green) and infusion site in VM (DiI, orange). (b) Example cumulative plot of active nosepokes from D1-ChR2 mice infused with saline (sal, blue) or muscimol (mus, red). (c) Average nosepokes during a single self-stimulation session for D1-ChR2 mice infused with saline (sal) or muscimol (mus, n = 8 and 7, two-way ANOVA, interaction poke x group, F(1,13)=6.33, p=0.026, posthoc Sidak’s multiple comparisons test, sal active vs mus active p=0.0009, sal active vs inactive p<0.001, musactive vs inactive p=0.018). (d) Average spontaneous locomotion in D1-ChR2 mice infused in VM with muscimol or saline (left), and example tracks (right) (n = 3, Wilcoxon matched-pairs signed rank test, p=0.5). (e) Average distance travelled before (pre), during (laser) and after (post) dMSN stimulation in the same mice as in g (left), and example tracks (right) (two-way ANOVA, interaction laser x drug F(2,4)=0.873, p=0.485, main effect of laser F(2,4)=14.16, p=0.015, posthoc Tukey’s multiple comparisons test, sal pre vs laser p=0.002, mus pre vs laser p=0.001, sal laser vs mus laser p=0.994). (f) Schematic of a sagittal brain section showing injection of Cre-dependent ChR2 in the striatum and SNr of a vGAT-Cre mouse (left) and fiber placement above dMSN axons and in VM for optogenetic excitation of dMSN combined with excitation of SNr terminals in VM, respectively (right). (g) Example cumulative plot of active nosepokes for dMSN axon stimulation paired with SNr-eYFP (D1+, blue) or SNr-ChR2 terminal stimulation in VM (D1+ and SNr+, red). (h) Average nosepokes during a single self-stimulation session from the same mice as in e (n = 6 and 6, two-way ANOVA, interaction poke x group, F(1,10)=12.95, p=0.0049, posthoc Sidak’s multiple comparisons test, D1 +active vs D1+ and SNr +active p<0.001, D1 +active vs inactive p<0.001, D1+ and SNr +active vs inactive p=0.018; SNr+, Wilcoxon signed rank test, active vs inactive p=0.625).

https://doi.org/10.7554/eLife.34032.019

Figure 5—source data 1 Source data for Figure 5.: https://doi.org/10.7554/eLife.34032.020
Download elife-34032-fig5-data1-v2.xlsx

Finally, to better understand how basal ganglia output modulates VM, we performed ex vivo and in vivo recordings in VM during stimulation of basal ganglia circuitry. We first expressed ChR2 in the SNr of vGAT-Cre mice. We prepared acute slices of VM from the same mice and performed whole-cell recordings in VM that confirmed the inhibitory nature of this projection (Figure 6—figure supplement 1a). Next, we performed extracellular recordings from 163 VM neurons in awake, head-fixed mice while stimulating dMSNs for 1 s, mimicking task conditions (Figure 6a, Figure 6—figure supplement 1b). VM cells exhibited a variety of responses, from excitation to inhibition (Figure 6b), with the average Z-score of all units showing a net increase in firing during dMSN stimulation (Figure 6c). Within 20 ms of stimulation, most responsive VM neurons were excited (Figure 6d); these excited neurons displayed a shorter latency to change firing rate than inhibited cells. Within a 100 ms of stimulation, more cells were excited, and a significant fraction of neurons with decreased activity appeared (Figure 6e). Over 1 s of stimulation, a majority of neurons were excited, while a small fraction of cells did not show any significant change in firing (Figure 6f). There was no obvious spatial segregation of response type within VM (Figure 6—figure supplement 1c). These results demonstrate that dMSN activation increases activity in a majority of VM thalamus neurons and, taken together with the VM self-stimulation data (Figure 3) and inactivation data (Figure 5), further support a role for VM disinhibition in dMSN-driven reinforcement.

Figure 6 with 1 supplement see all

Download asset Open asset

dMSN stimulation increases firing in VM thalamus.

(a) Sagittal schematic depicting dMSN optogenetic stimulation with simultaneous in vivo single-unit recording from VM in awake, head-fixed mouse. (b) Peri-event time histogram of all 163 cells recorded in VM, aligned to dMSN stimulation (blue bar, 1 s duration). Units are sorted by response type (increase [excit], no significant change [ns], decrease [inhib]) and modulation index (see below). (c) Average z-scored response of all VM neurons to dMSN stimulation (blue shading, 1 s duration, n = 163, baseline vs stim, Wilcoxon signed rank test, p=0.0006). (**d–f**) Detailed analysis of responses for the first 20 ms (d), first 100 ms (e) or full 1 s (f) of dMSN stimulation (blue shading). Average firing rate (top) and modulation index (bottom) for excited (red), inhibited (blue) or unmodulated units (grey). Latency to significant change in firing rate is shown in d, top inset. Bottom insets, pie charts showing fraction of excited (red), inhibited (blue) and unmodulated (grey) VM neurons during dMSN stimulation for each time window. See materials and methods for modulation index calculation.

https://doi.org/10.7554/eLife.34032.021

Figure 6—source data 1 Source data for Figure 6.: https://doi.org/10.7554/eLife.34032.023
Download elife-34032-fig6-data1-v2.xlsx

Discussion

We used striatal dMSN self-stimulation as a model to dissect the circuit mechanisms underlying a simple form of striatum-driven reinforcement. Our results highlight several important aspects of basal ganglia function. First, we demonstrate that activation of striatal direct pathway projection neurons can engage a form of reinforcement that is independent of NMDAR-dependent plasticity in dMSNs. Second, we find that multiple downstream targets of basal ganglia output can mediate reinforcement, not all of which are required for dMSN self-stimulation. Finally, we identify an unexpected role for motor thalamus in supporting dMSN-driven reinforcement.

An alternative to classic mechanisms of reinforcement

Models of the basal ganglia typically propose that corticostriatal synaptic plasticity underlies reinforcement (Schultz and Dickinson, 2000; Doya, 2007; Maia and Frank, 2011). We observed here that dMSN self-stimulation plateaued during the first day of training and that mice formed a memory of the dMSN stimulation-paired nosepoke, consistent with our previous report (Kravitz et al., 2012). These results suggest that behavior was learned and acquired within a single session, presumably by engaging long-term plasticity. We initially hypothesized that the repeated pairing of a specific action (entry into the active nosepoke) with depolarization of ChR2-expressing dMSNs would drive NMDAR-dependent Hebbian plasticity at active (task-related) synapses onto stimulated dMSNs. This mechanism would thereby provide a synaptic substrate for reinforcement learning, through the task-dependent recruitment of action-promoting dMSNs. However, we observed that dMSN self-stimulation was unaltered by NMDAR knockout in dMSNs (Figure 1). Additionally, despite our ability to identify and record from task-relevant ChR2-expressing dMSNs, we did not observe changes in pre or postsynaptic excitatory transmission, or in neuronal excitability, across both dMSNs and iMSNs (Figure 2). In fact, previous reports suggest that simple types of reinforcement can occur without NMDAR-dependent striatal plasticity. For example, mice with dorsal striatum lesions can still learn and perform an action to obtain food rewards (Yin et al., 2005b). Similarly, altering MSN NMDAR composition by knocking out GluN2B, an important subunit for plasticity induction (Paoletti et al., 2013), does not prevent mice from learning a basic operant task (Brigman et al., 2013). Likewise, genetic deletion of NMDA receptors in MSNs does not prevent mice from learning simple action-outcome associations (Jin and Costa, 2010); although see Beutler et al., 2011). Altogether, these results are consistent with the idea that simple forms of reinforcement may not require corticostriatal synaptic plasticity, but rather take advantage of a different mechanism.

Multiple basal ganglia output targets for reinforcement

Several basal ganglia output targets in the brainstem have been associated with reinforcement. Recently, a series of optogenetic studies reported a role for both DRN serotonergic and non-serotonergic populations in reinforcement. Consistent with these reports, we observed that mice self-stimulated serotonergic neurons of the DRN (Figure 3). We observed reinforcement in SERT-Cre mice, which failed to sustain operant behavior in a previous report (Fonseca et al., 2015). In ePET-Cre mice, we did not observe reinforcement, consistent with one previous report (McDevitt et al., 2014), but inconsistent with another recent study (Liu et al., 2014). These discrepancies could be explained by variable targeting of DRN subregions and stimulation parameters. Despite overall mixed observations about which exact neuronal population is sufficient for reinforcement, a common result is that DRN activation is reinforcing through direct excitation of midbrain dopamine neurons (Liu et al., 2014; McDevitt et al., 2014). In our hands, ablation of DRN SERT-positive neurons did not affect dMSN stimulation (Figure 4).

Similarly, the MLR receives projections from the SNr and has been associated with both locomotor control and reinforcement (Inglis et al., 2000; Ryczko and Dubuc, 2013). Both glutamatergic and cholinergic populations of the MLR have been associated with reinforcement (Dautan et al., 2016; Xiao et al., 2016; Yoo et al., 2017), and we have previously shown that striatal dMSNs drive locomotion by disinhibiting MLR glutamatergic neurons (Roseberry et al., 2016). We observed variable levels of MLR glutamatergic neuron self-stimulation at sites that reliably drove locomotion (Figure 3). However, dMSN-driven reinforcement did not require MLR glutamatergic neurons (Figure 4). Similar to DRN, MLR self-stimulation of both glutamatergic and cholinergic populations seems to rely on direct excitation of midbrain dopamine neurons (Xiao et al., 2016; Yoo et al., 2017). Altogether, these results show that dMSN self-stimulation does not require DRN serotonergic or MLR glutamatergic neurons, suggesting that these nuclei belong to separate reinforcement circuits.

A role for motor thalamus in reinforcement

VM thalamus is traditionally classified as a motor nucleus relaying basal ganglia signals toward motor cortex (Starr and Summerhayes, 1983; Klockgether et al., 1985; Turner and Desmurget, 2010). VM and VA/VL nuclei of the thalamus receive basal ganglia outputs (Carter and Fibiger, 1978; Di Chiara et al., 1979) and project to motor cortical regions (Herkenham, 1979). However, VM differs with unique, widespread axonal projections to Layer I across many cortical areas including prefrontal and parietal cortices that are associated with more cognitive functions (Kuramoto et al., 2015).

Consistent with the disinhibitory architecture of the basal ganglia (Deniau and Chevalier, 1985; Albin et al., 1989), we observed an overall increase in VM activity upon dMSN stimulation (Figure 6). Latency to excitation was short (~10–20 ms), in line with di-synaptic disinhibition via the SNr. However, a significant fraction of cells showed a decrease in firing, highlighting a degree of heterogeneity among VM neurons. These differences in response type could be based on connectivity or biophysical properties of VM neurons. Alternatively, inhibition could arise from polysynaptic network effects like recruitment of inhibition (via the reticular thalamic nucleus) by VM excited neurons or through lateral disinhibition within the SNr among VM-projecting cells. Both scenarios would be consistent with the longer latencies to inhibition observed here. Nevertheless, our results demonstrate that dMSN stimulation significantly increases activity in the majority of VM neurons (Figure 6), which is critical for dMSN-driven reinforcement (see below).

Mimicking dMSN-driven VM disinhibition with optogenetic VM activation supported operant behavior, similar to dMSN self-stimulation and SNr self-inhibition (Figure 3), and consistent with VM electrical self-stimulation in rats (Clavier and Gerfen, 1982). In contrast to results obtained with silencing of brainstem SNr targets, both extended silencing of VM with muscimol infusion, or temporally-precise SNr terminal excitation time-locked to dMSN stimulation, decreased the total number of nosepokes mice performed to obtain dMSN stimulation (Figure 5). The latter approach argues for an acute role of VM disinhibition in reinforcement, rather than a global alteration of brain state. Therefore, it is possible that the magnitude of reinforcement (number of nosepokes) scales with that of VM disinhibition, similar to graded dMSN self-stimulation observed with increasing stimulation frequencies (Figure 1—figure supplement 2). Importantly, VM silencing did not affect either spontaneous locomotion in the open field or dMSN locomotor drive, consistent with a primary role of MLR glutamatergic output in locomotor control. Altogether, these results identify VM as a key recipient of basal ganglia reinforcement signals.

Silencing VM did not fully block dMSN self-stimulation. This may be due to incomplete VM silencing, or the participation of additional mechanisms. For example, mediodorsal thalamus, a basal ganglia-recipient, non-motor nucleus, has been implicated in action-outcome learning (Chudasama et al., 2001; Corbit et al., 2002; Corbit et al., 2003) and working memory (Parnaudeau et al., 2013; Bolkan et al., 2017), and could support striatal reinforcement. The SNr also targets the superior colliculus, which displays reward-related activity (Jeljeli et al., 2003; Doya, 2007), and MLR cholinergic population, which has been shown to drive reinforcement (Xiao et al., 2016). Additionally, dMSN activation, through inhibition of SNc-projecting SNr neurons (Mailly et al., 2001; Brazhnik et al., 2008), may disinhibit SNc dopamine neurons and thus increase DA release, which is sufficient for reinforcement (Phillips et al., 1976; Gratton et al., 1988; Ilango et al., 2014). Interestingly, rodent VM is embedded in circuits separate from dopamine (Groenewegen, 1988; Papadopoulos and Parnavelas, 1990; Watabe-Uchida et al., 2012), suggesting that the basal ganglia could engage both dopamine-dependent and dopamine-independent reinforcement mechanisms. In fact, previous reports have shown that dopamine-deficient mice can still learn to locate food in a T-maze (Robinson et al., 2005) or develop place preference for morphine or cocaine (Hnasko et al., 2005; Hnasko et al., 2007), providing evidence that alternate, dopamine-independent reinforcement mechanisms can occur.

VM lesions have been shown to impair both the acquisition and consolidation of active avoidance (Zis et al., 1984), a behavior that is believed to depend on the basal ganglia (Delacour et al., 1977; Hormigo et al., 2016). In songbirds, vocal learning requires an intact basal ganglia-thalamocortical loop (Scharff and Nottebohm, 1991; Fee and Goldberg, 2011). Other lesion studies in rodents have suggested a role for VM in spatial navigation (Jeljeli et al., 2003) and working memory (Bailey and Mair, 2005), and more specifically in sustaining cortical motor preparatory activity in a working memory task (Guo et al., 2017). Indeed, VM/VAL activation and inhibition increases and decreases cortical activity, respectively. Thus, it appears that basal-ganglia-recipient motor thalamus may be important for engaging cortical plasticity and learning.

Distinct circuits for locomotion and reinforcement

Gross manipulations of striatum, including cocaine administration and bulk dMSN stimulation drive both locomotion and reinforcement (Di Chiara et al., 1979; Inglis et al., 2000; Kravitz et al., 2010; Kravitz et al., 2012). However, it has not been clear whether those two distinct processes can be dissociated in downstream circuitry. Interestingly, direct stimulation of both VM thalamus and MLR glutamatergic neurons, two main targets of basal ganglia outputs, increases locomotion (Klockgether et al., 1986; Roseberry et al., 2016) and sustains operant responding (Clavier and Gerfen, 1982; Yoo et al., 2017). However, we demonstrate a dissociation of these two basal-ganglia-driven functions at the level of output targets: reinforcement relies on VM but not MLR glutamatergic neurons (Figures 4 and 5), whereas basal ganglia-driven locomotion depends on MLR glutamatergic neurons, but not VM (Figure 5, Figure 4—figure supplement 1). Future studies will be required to understand the exact nature of basal ganglia signals transferred to motor thalamus and how this information is integrated in thalamocortical circuits to support cognitive behavior.

Materials and methods

Subjects

All procedures were in accordance with protocols approved by the UCSF Institutional Animal Care and Use Committee. Mice were maintained on a 12 hr/12 hr light/dark cycle and fed ad libitum. Experiments were carried out during the light cycle. 166 wild-type (C₅₇BL/6) or transgenic (gene name in italic) mice were used for experiments as follows: 44 D1-Cre mice (Drd1, 27 females, 17 males, GENSAT #030778-UCD) were used for dMSN self-stimulation experiments. four male D1-Cre mice were used for in vivo recordings in VM. three female D1-Cre mice were used for dMSN-driven locomotion. 30 D1-Cre mice crossed to D1-tmt mice (Jackson #016204) and 9 D1-tmt mice were used for slice recordings in the striatum. 7 D1-Cre x NR1f/f mice (Grin1^tm2Stl, six females, one male, Jackson #005246) were used for dMSN self-stimulation. 17 vGAT-Cre mice (Slc32a1, nine females, eight males, Jackson #016962) were used for SNr self-inhibition. 12 vGAT-Cre (six females, six males) were used for dMSN axon self-stimulation paired with SNr terminal stimulation in VM. one male vGAT-Cre mouse was used for slice recordings. 16 wild-type mice (15 females, one male) were used VM self-stimulation. 5 vGLUT2-Cre mice (Slc17a6, two females, three males, Jackon 016963) were used for MLR self-stimulation. 18 Sert-Cre (Slc6a4, seven females, 11 males, Jackson 014554) mice were used for DRN self-stimulation and dMSN self-stimulation combined with DRN caspase lesions.

Surgical procedures

Request a detailed protocol

All surgeries were carried out in aseptic conditions while mice were anaesthetized with isoflurane (5% for induction, 0.5–1.5% afterward) and then placed in a stereotactic frame (Kopf). The scalp was opened and bilateral holes were drilled in the skull. In DMS, viral constructs were injected through a 33-gauge steel cannula (Plastics1) using a syringe pump (World Precision Instruments) at a rate of 100–200 nl/minute. In the SNr and MLR, viral constructs were injected through a 33-gauge needle on a 5 μL NanoFil syringe (WPI), mounted on a microsyringe pump (UMP3; WPI) and controller (Micro4; WPI), at a rate of 50–100 nl/minute. Needles were removed 5 min after the injection finished. The following volumes were injected: 500–1000 nl (DMS), 250 nl (SNr), 100 nl (VM), 500 nl (MLR, DRN, SNc). The following coordinates were used for viral injections and optic fiber implantation (AP, lateral, DV virus, DV fiber, in mm, from bregma and brain surface): DMS +0.6, 1.5, 2.5, 2; SNr −3.5, 1.4, 4; VM −1.3, 0.8, 3.7, 3.2; MLR −0.8 (from lambda), 1.2, 3.6, 3.2; fiber for dMSN axon stimulation (cerebral peduncle) −1.7, 2.75, 4.0, with 14-degree lateral angle; DRN −4.1, 0.3 (right hemisphere only), 3.3, 2.8; SNc −3.7, 1.5, 4.2, 3.4. For SNr terminal stimulation in VM, VM coordinates were used for fiber implantation. After viral injection, 200-μm-diameter optical fibers (Thorlabs #FT200UMT) glued into 1.25 mm ferrules (Thorlabs CFLC128-10) were lowered through the same holes, unless noted otherwise. Ferrules were then secured to the skull with a layer of dental adhesive (C and B Metabond, Parkell), and covered in UV-cure dental cement (flowable composite, Henry Schein) solidified with brief (20 s) UV light exposure (dental LED light lamp, MUW), while the mouse’s eyes were protected with cardboard glasses. The scalp was then sutured shut around the dental cement. Buprenorphine HCl (0.1 mg/kg, intraperitoneal injection) and Ketoprofen (5 mg/kg, subcutaneous injection) were used for postoperative analgesia. Mice were given 2–4 weeks for recovery and viral expression after surgery before behavioral training started.

Viral constructs

Request a detailed protocol

For Cre-dependent neuronal excitation and inhibition, we used an adeno-associated virus serotype 5 (AAV5) carrying Channelrhodopsin-2 (hChR2(H124R)) or eArch3.0, respectively, fused to enhanced yellow fluorescent protein (eYFP) in a double-floxed inverted open reading frame (DIO) under the control of the EF1α promoter (AAV5-EF1α-DIO-hChR2(H124R)-eYFP, AAV5-EF1α-DIO-eArch3.0-eYFP and AAV5-EF1α-DIO-eYFP for controls, University of Pennsylvania Vector Core, diluted 1:3 in saline (ChR2 and eYFP)). For MLR glutamatergic neurons inhibition, we expressed halorhodopsin 3.0 (eNpHR3.0) under the control of CaMKIIα (AAV5-CaMKIIα-eNpHR3.0-eYFP, University of North Carolina Vector Core). For VM excitation, we expressed Channelrhodopsin-2 under the control of CaMKIIα (AAV5-CaMKIIα-hChR2(H124R)-eYFP, University of North Carolina Vector Core). For DRN serotonergic neuron ablation, we expressed Caspase three in a flip-excision switch (AAV5-FLEx-EF1α-taCasp3-TEVp, University of North Carolina Vector Core). DIO constructs were allowed at least 2 weeks before experiments, and CamKIIα constructs at least 4 weeks.

Awake infusions in VM

Request a detailed protocol

For awake drug infusions in VM, mice were implanted with a custom stainless steel headbar for head fixation. The scalp was removed and skull scraped clean and dry using a scalpel. VM sites (AP, lateral, in mm, from bregma: −1.3, 0.8) were marked with sharpie pen on the skull and covered with a drop of silicone elastomer (Kwik-Cast; WPI). Cyanoacrylate glue (Vetbond, 3M) was lightly dabbed on the skull. For combined VM infusions and dMSN self-stimulation, holes above DMS were then drilled, the viral construct was injected and optic fibers implanted in DMS, and secured as described above. Then, the headbar was levelled flat and lowered to touch lambda, covered with dental adhesive (C and B Metabond, Parkell) and secured with UV-cure dental cement. Only a thin layer of was applied above the marked VM sites. After at least 2 weeks of recovery, mice were habituated to head fixation in a 3-cm-wide acrylic cylinder for 10 min twice a day, 3 days prior to the beginning of the experiment. On the last day of habituation, mice were anesthetized, dental cement and silicone above VM were removed, and holes were drilled on the marked VM locations. The craniotomy was then covered in silicone elastomere and mice were given 24–48 hr for recovery. For drug infusions, mice were headfixed, levelled flat, and a glass pipette mounted on a microinjector (Nanoject) was lowered 3.7 mm from brain surface through the craniotomy to deliver 100 nl of muscimol (0.1 mM in saline, Tocris) over 2 min in VM. The pipette was removed 2 min later and the same procedure was repeated on the other hemisphere. The craniotomy was covered in silicone elastomer and mice were immediately transferred to the behavior boxes for training. After the last behavior session, mice were infused with 20 nl of DiI (ThermoFischer) on each side and then perfused.

Behavioral experiments

Self-Stimulation

Request a detailed protocol

Apparatus

The self-stimulation apparatus consisted of a 10 × 20 cm box made of white acrylic, equipped with two nosepokes with infrared beams (MIX-engineering) on the same wall and a houselight (MedAssociates). For optogenetic excitation, we used 473 nm wavelength lasers (Shanghai Lasers) adjusted to 1 mW at the output of the fiber optic implant for continuous exposure or 10 mW for pulsed excitation (Master-8, A.M.P.I). For optogenetic inhibition, we used continuous 532 nm wavelength adjusted to 3–10 mW at the output of the fiber optic implant. Laser-emitted light was fed in 200 μm, 0.39NA optic fiber cable (Thorlabs), split through a 1-to-2 commutator (Doric) to two optic fiber cables connected to the mouse’s head for bilateral neuronal manipulation. Behavior was performed in the dark. Mouse position and nosepokes were tracked with Noldus software (Ethovision).

Self-stimulation

Sessions lasted 30 min. The night before the first training session, mice were food- and water-restricted. For the first training session, both nosepokes were baited with 200 nl of 10% sucrose-containing water, to ensure apparatus exploration. Mice were bilaterally connected to the laser with fiber optic cables and placed in the box. Poking in the laser-paired nosepoke (active) triggered the laser and the houselight, and poking in the other nosepoke (inactive) had no effect. The nosepoke-laser contingency was counterbalanced across mice and fixed throughout training. Mice could reinitiate a nosepoke as soon as the laser turned off, or after a similar lapse for the inactive side, by exiting and reentering the poke. After the first session, mice went back to unlimited water and food access for the rest of the training. Across all conditions, total number of nosepokes was stable from the first session onward. Mice were excluded if by day four the number of active nosepokes had declined by more than 50% or if mice performed <1 poke/min on average over four sessions. Laser activation patterns and durations were set as follows: dMSN cell bodies, 1 s continuous; dMSN axons, 1 s 40 Hz, SNr, 3 s continuous; VM, 3 s 20 Hz; MLR excitation 3 s 20 Hz; MLR inhibition, 1.1 s continuous, with dMSN stimulation starting 0.1 s later, for 1 s; SNr terminals in VM, 1 s 20 Hz, simultaneous to dMSN axon stimulation; DRN 3 s 40 Hz; SNc 1 s 40 Hz. Pulse duration, 5 ms.

Throughout the manuscript, self-stimulation data come from the first behavior session, except for Figure 1c,d,h,i, where data come from the first session with a 1 s laser/nosepoke contingency (second overall session).

The extinction session happened 24 hr after the 4^th training session. Mice were placed in the apparatus for 15 min, during which the previously laser-paired nosepoke became inactive. Mice intended for ex vivo recordings were trained for 4 days and slices were prepared 24 hr later.

Locomotion

Spontaneous and dMSN-driven locomotion was measured in a 20 cm diameter circular open field arena made of transparent acrylic placed on a square of white yoga mat. Mice were habituated for 30 min to the open field 1 day prior to the experiment. After VM infusions, mice were placed in the open field for 5 min to measure baseline locomotion. For dMSN-driven locomotion, we repeated five cycles consisting of 10 s pre-stimulation, 10 s of bilateral optogenetic stimulation and 10 s post-stimulation epochs, followed by 30 s of recovery. Each mouse received both saline and muscimol infusions in VM in a random order, 24 hr apart. For dMSN stimulation paired with MLR glutamate neurons inhibition, the same protocol was applied, except that dMSN stimulation alone cycles were alternated with dMSN stimulation paired with bilateral MLR inhibition cycles, 5 cycles of each. Total distance travelled was summed for each epoch type, and averaged across mice.

Electrophysiology in acute slices

Request a detailed protocol

Mice were euthanized with a lethal dose of ketamine and xylazine followed by transcardial perfusion with 8 ml of ice cold artificial corticospinal fluid (aCSF) containing (in mM): glycerol (250), KCl (2.5), MgCl₂ (2), CaCl2 (2), NaH2PO4 (1.2), HEPES (10), NaHCO₃ (21) and glucose (5). Coronal slices (250 µM) containing the DMS or the SNr were then prepared with a vibratome (Leica) in the same solution, before incubation in 33°C recording aCSF containing (in mM): NaCl (125), NaHCO₃ (26), NaH2PO₄ (1.25), KCl (2.5), MgCl₂ (1), CaCl₂ (2), glucose (12.5), continuously bubbled with 95/5% O₂/CO₂. After 30 min of recovery, slices were either kept at room temperature or transferred to a recording chamber superfused with recording aCSF (2.5 ml/min) at 33°C. Whole‐cell current‐clamp recordings were obtained using an internal solution containing (in mM): KGluconate (130), NaCl (10), MgCl₂ (2), CaCl₂ (0.16), EGTA (0.5), HEPES (10). Voltage-clamp recordings were obtained using an internal solution containing (in mM): CsMeSO₃ (120), CsCl (15), NaCl (8), EGTA (0.5), HEPES (10), Mg-ATP (2), Na-GTP (0.3), TEA-Cl (10), QX-314 (5). EPSCs were evoked with a monopolar stimulation electrode (glass pipet) placed in the striatum, between the recorded cell and the cortex, and isolated in presence of picrotoxin (100 μM). TTX (1 μM) was added to isolate miniEPSCs. Cells were held at −70 mV except for NMDAR currents, which were recorded at +40 mV (measured 50 ms after stimulus onset). dMSNs were identified by tomato fluorescence. ChR2-positive neurons were identified by the presence of a photocurrent evoked by 500 ms blue light. eYFP-positive neurons were identified by eYFP fluorescence. SNr terminal excitation was achieved by flashing 470 nm filtered LED (Thorlabs LED4C driven by a Prizmatix BLCC-2) light through a 40x immersion objective (1–5 ms pulse, 0.1–1 mW/cm²). Holding current was varied from −70 to +40 mV. In VM, currents evoked from SNr terminal stimulation were only observed at positive potential and were blocked by picrotoxin (100 μM). Stimulation was applied every 10 s. Holding currents were not corrected for junction potential. Recordings were obtained with 3–4 MW resistance pipettes pulled from glass capillaries (Harvard Apparatus GC150TF-10) on a puller (Zeitz). Picrotoxin (Tocris) was prepared at stock concentration in H₂O, then diluted in aCSF while kynurenic acid (Sigma) was directly added to aCSF for bath application. Data was acquired with custom written code in Igor software, which can be found in the source code files linked to this article.

In vivo electrophysiology

D1-Cre mice were injected with DIO-ChR2 virus, implanted with optical fibers over DMS, and implanted with headbars as described above. Recordings were performed in head-fixed mice sitting in an acrylic tube. Mice were acclimated to head-fixation in several habituation sessions during the week before recording. On the day of recording, craniotomies (500 µm diameter) were performed over VM (0.7 mm posterior of Bregma, 0.8 lateral of midline) in the left and right hemispheres. The dura was left intact. Craniotomies were covered with a drop of surgilube followed by silicon elastomer and the animal was allowed to recover for at least 4 hr. Acute recordings were performed using a 64 channel silicon probe (H2 Cambridge Neurotech) mounted on a micromanipulator. The probe was inserted into the craniotomy and driven vertically to a depth of ~4 mm. To reduce movement, the craniotomy was covered with 2% agarose in saline and a layer of mineral oil. The probe was allowed to settle for at least 15 min before acquiring data. Continuous broadband data were acquired at 30 kHz using a SpikeGadgets acquisition system and Trodes software. Blue light was delivered to the DMS ipsilateral to the recording site using a fiber-coupled LED (M470F3, Thorlabs). 1 s of stimulation was delivered every 10 s for 100 trials. The power at the fiber tip was 1 mW. Probe shanks were coated with CM-DiI.

Analysis

Spike sorting was performed using MountainSort software (Chung et al., 2017). After a manual merging step, clusters with a clear refractory period in their autocorrelogram, noise overlap metric <0.03 (Chung et al., 2017), and amplitude distribution without a sharp cutoff at low amplitudes were considered single units. Only units whose peak signal was recorded on an electrode site within VM were included in the analysis. We report data from 163 units recorded across 8 hemispheres of 4 mice. Peri-event time histograms (PETH) aligned to the onset of stimulation were computed for each unit’s firing rate using 10 ms bins. Z-scored PETHs were computed using the average and standard deviation of the PETH in a 1000 ms baseline period preceding stimulus onset.

To determine whether a unit’s firing was significantly modulated by the stimulation, a paired t-test was performed to compare the firing rate in a 20 ms, 100 ms, or 1000 ms response window following stimulus onset with the firing rate in a 1000 ms baseline window preceding stimulus onset in each trial. Units with p<0.05 were considered significantly modulated for a given response window. Significantly modulated units were defined as excited if their average firing rate during the response window was greater than the baseline and inhibited if it was less. The modulation index was calculated as:

(AvgFiringRespWindow-AvgFiringBaselineWindow) / (AvgFiringRespWindow + AvgFiringBaselineWindow)

The latency to significant modulation was defined as the second time bin in which two consecutive bins of the PETH were outside of the 97% confidence level of the poisson distribution calculated from the average firing rate 1000 ms prior to stimulus onset (Abeles, 1982). Latency was not computed for units lacking two consecutive bins outside of the 97% confidence level within the first 100 ms after stimulus onset.

Histology

Request a detailed protocol

After recordings were completed, animals were prepared for histology as described below. To identify recording sites, 100 μm coronal sections were cut on a freezing microtome and DiI signal from the electrode was identified using fluorescence microscopy. Location of electrode sites relative to VM were determined by matching the histological images to the Paxinos mouse atlas.

Histology

Request a detailed protocol

Animals were euthanized with a lethal dose of ketamine and xylazine (400 mg ketamine plus 20 mg xylazine per kilogram of body weight, i.p.) and transcardially perfused with PBS, followed by 4% paraformaldehyde (PFA). Following perfusion, brains were transferred into 4% PFA for 16–24 hr and then moved to a 30% sucrose solution in PBS for 2–3 d (all at 4 deg C). Brains were then frozen and cut into 30 μm coronal sections with a sliding microtome (Leica Microsystems, model SM2000R) equipped with a freezing stage (Physitemp). Free-floating slides were blocked for 1 hr in 10% Normal Donkey Serum (NDS) in 0.5% phosphate-buffer saline in Tween 20 (PBST) then incubated overnight in primary antibody (1:500; Aves chicken anti-YFP, #GFP-1020; Pel-Freez rabbit anti-TH, P40101; ImmunoStar goat anti-5HT, 20079), 3% NDS in 0.5% PBST. The following day, they were washed 3 times for 10 min each in 0.5% PBST and incubated for 1 hr in secondary antibody (1:1000, Jackson ImmunoResearch Donkey anti-Chicken 488, #703-545-155; Invitrogen Alexa-fluor donkey anti-rabbit 647, A-31573; Invitrogen Alexa-fluor donkey anti goat 647, A21447), 3% NDS in 0.5% PBST. Slices were then incubated for 5 min with 1:2000 DAPI. After this, slides were washed for 10 min in 0.5% PBST and 2 more 10 min periods with 1:1 PBS. Slides were then washed with 0.05% lithium carbonate and alcohol, rinsed with diH2O, mounted and coverslipped with Cytoseal 60.

Slides were scanned on a VS120 semi-automated fluorescent slide scanner (Olympus Scientific Solutions Americas Corp, USA). Some figure images were acquired using a 6D high throughput microscope (Nikon, USA). All images shown were made brighter for better print quality using Photoshop function ‘Vibrance’, ‘Brightness’ and ‘Contrast’ to change LUT. No detail was lost during this manipulation. Manual registration of slices was performed by scaling whole slice images in Adobe Illustrator, using the Paxinos mouse atlas (Academic Press, Orlando, FL) panels as a background reference.

Statistics

All statistics were calculated with Prism 7 (Graphpad) or Igor Pro (WaveMetrics). Distribution normality was tested for each data set to determine whether parametric or non-parametric tests would be adequate. Based on the number of groups and independent variables, we used t-test, Mann-Whitney U-test, 1-way and 2-way ANOVAs, followed by posthoc tests correcting for multiple comparisons. Only two-tailed tests were used. Tests and p-values are mentioned in figure legends. p-value<0.05 was considered significant (p<0.05*, p<0.01**, p<0.001***). Results are reported as mean +- SEM. Error bars represent SEM. When possible, sample size was calculated based on pilot cohort (powerandsamplesize.com) with a power of 0.8 and alpha of 0.05). All behavioral experiments were independently repeated at least twice. All data supporting the findings can be found in the source data file linked to this article.

Data availability

All data generated or analysed during this study are included in the manuscript and supporting files.

References

1. Abeles M
(1982) Quantification, smoothing, and confidence limits for single-units' histograms
Journal of Neuroscience Methods 5:317–325.

https://doi.org/10.1016/0165-0270(82)90002-4
- PubMed
- Google Scholar
(1989) The functional anatomy of basal ganglia disorders
Trends in Neurosciences 12:366–375.

https://doi.org/10.1016/0166-2236(89)90074-X
- PubMed
- Google Scholar
(1991) Responses to reward in monkey dorsal and ventral striatum
Experimental Brain Research 85:491–500.

https://doi.org/10.1007/BF00231732
- PubMed
- Google Scholar
1. Bailey KR
2. Mair RG
(2005) Lesions of specific and nonspecific thalamic nuclei affect prefrontal cortex-dependent aspects of spatial working memory
Behavioral Neuroscience 119:410–419.

https://doi.org/10.1037/0735-7044.119.2.410
- PubMed
- Google Scholar
1. Barnes TD
2. Kubota Y
3. Hu D
4. Jin DZ
5. Graybiel AM
(2005) Activity of striatal neurons reflects dynamic encoding and recoding of procedural memories
Nature 437:1158–1161.

https://doi.org/10.1038/nature04053
- PubMed
- Google Scholar
1. Beutler LR
2. Eldred KC
3. Quintana A
4. Keene CD
5. Rose SE
6. Postupna N
7. Montine TJ
8. Palmiter RD
(2011) Severely impaired learning and altered neuronal morphology in mice lacking NMDA receptors in medium spiny neurons
PLoS One 6:e28168.

https://doi.org/10.1371/journal.pone.0028168
- PubMed
- Google Scholar
(2017) Thalamic projections sustain prefrontal activity during working memory maintenance
Nature Neuroscience 20:987–996.

https://doi.org/10.1038/nn.4568
- PubMed
- Google Scholar
(2013) Motor thalamus integration of cortical, cerebellar and basal ganglia information: implications for normal and parkinsonian conditions
Frontiers in Computational Neuroscience 7:163.

https://doi.org/10.3389/fncom.2013.00163
- PubMed
- Google Scholar
(2008) GABAergic afferents activate both GABAA and GABAB receptors in mouse substantia nigra dopaminergic neurons in vivo
Journal of Neuroscience 28:10386–10398.

https://doi.org/10.1523/JNEUROSCI.2387-08.2008
- PubMed
- Google Scholar
1. Brigman JL
2. Daut RA
3. Wright T
4. Gunduz-Cinar O
5. Graybeal C
6. Davis MI
7. Jiang Z
8. Saksida LM
9. Jinde S
10. Pease M
11. Bussey TJ
12. Lovinger DM
13. Nakazawa K
14. Holmes A
(2013) GluN2B in corticostriatal circuits governs choice learning and choice shifting
Nature Neuroscience 16:1101–1110.

https://doi.org/10.1038/nn.3457
- PubMed
- Google Scholar
(1992) Long-term potentiation in the striatum is unmasked by removing the Voltage-dependent magnesium block of NMDA receptor channels
European Journal of Neuroscience 4:929–935.

https://doi.org/10.1111/j.1460-9568.1992.tb00119.x
- PubMed
- Google Scholar
1. Carter DA
2. Fibiger HC
(1978) The projections of the entopeduncular nucleus and globus pallidus in rat as demonstrated by autoradiography and horseradish peroxidase histochemistry
The Journal of Comparative Neurology 177:113–123.

https://doi.org/10.1002/cne.901770108
- PubMed
- Google Scholar
(2001) Effects of selective thalamic and prelimbic cortex lesions on two types of visual discrimination and reversal learning
European Journal of Neuroscience 14:1009–1020.

https://doi.org/10.1046/j.0953-816x.2001.01607.x
- PubMed
- Google Scholar
1. Chung JE
2. Magland JF
3. Barnett AH
4. Tolosa VM
5. Tooker AC
6. Lee KY
7. Shah KG
8. Felix SH
9. Frank LM
10. Greengard LF
(2017) A fully automated approach to spike sorting
Neuron 95:1381–1394.

https://doi.org/10.1016/j.neuron.2017.08.030
- PubMed
- Google Scholar
1. Clavier RM
2. Gerfen CR
(1982) Intracranial self-stimulation in the thalamus of the rat
Brain Research Bulletin 8:353–358.

https://doi.org/10.1016/0361-9230(82)90072-7
- PubMed
- Google Scholar
(2015) Serotonergic neurons signal reward and punishment on multiple timescales
eLife 4:e06346.

https://doi.org/10.7554/eLife.06346
- PubMed
- Google Scholar
(2002) Sensitivity to instrumental contingency degradation is mediated by the entorhinal cortex and its efferents via the dorsal hippocampus
The Journal of Neuroscience 22:10976–10984.

https://doi.org/10.1523/JNEUROSCI.22-24-10976.2002
- PubMed
- Google Scholar
(2003) Lesions of mediodorsal thalamus and anterior thalamic nuclei produce dissociable effects on instrumental conditioning in rats
European Journal of Neuroscience 18:1286–1294.

https://doi.org/10.1046/j.1460-9568.2003.02833.x
- PubMed
- Google Scholar
1. Dang MT
2. Yokoi F
3. Yin HH
4. Lovinger DM
5. Wang Y
6. Li Y
(2006) Disrupted motor learning and long-term synaptic plasticity in mice lacking NMDAR1 in the striatum
PNAS 103:15254–15259.

https://doi.org/10.1073/pnas.0601758103
- PubMed
- Google Scholar
1. Dautan D
2. Souza AS
3. Huerta-Ocampo I
4. Valencia M
5. Assous M
6. Witten IB
7. Deisseroth K
8. Tepper JM
9. Bolam JP
10. Gerdjikov TV
11. Mena-Segovia J
(2016) Segregated cholinergic transmission modulates dopamine neurons integrated in distinct functional circuits
Nature Neuroscience 19:1025–1033.

https://doi.org/10.1038/nn.4335
- PubMed
- Google Scholar
(1977) Specificity of avoidance deficits produced by 6-hydroxydopamine lesions of the nigrostriatal system of the rat
Journal of Comparative and Physiological Psychology 91:875–885.

https://doi.org/10.1037/h0077367
- PubMed
- Google Scholar
1. Deniau JM
2. Chevalier G
(1985) Disinhibition as a basic process in the expression of striatal functions. II. The striato-nigral influence on thalamocortical cells of the ventromedial thalamic nucleus
Brain Research 334:227–233.

https://doi.org/10.1016/0006-8993(85)90214-8
- PubMed
- Google Scholar
1. Deniau JM
2. Chevalier G
(1992) The lamellar organization of the rat substantia nigra pars reticulata: distribution of projection neurons
Neuroscience 46:361–377.

https://doi.org/10.1016/0306-4522(92)90058-A
- PubMed
- Google Scholar
(1979) Evidence for a GABAergic projection from the substantia nigra to the ventromedial thalamus and to the superior colliculus of the rat
Brain Research 176:273–284.

https://doi.org/10.1016/0006-8993(79)90983-1
- PubMed
- Google Scholar
1. Doya K
(2007) Reinforcement learning: Computational theory and biological mechanisms
HFSP Journal 1:30–40.

https://doi.org/10.2976/1.2732246/10.2976/1
- PubMed
- Google Scholar
1. Esposito MS
2. Arber S
(2016) Motor control: illuminating an enigmatic midbrain locomotor center
Current Biology 26:R291–R293.

https://doi.org/10.1016/j.cub.2016.02.043
- PubMed
- Google Scholar
1. Fee MS
2. Goldberg JH
(2011) A hypothesis for basal ganglia-dependent reinforcement learning in the songbird
Neuroscience 198:152–170.

https://doi.org/10.1016/j.neuroscience.2011.09.069
- PubMed
- Google Scholar
(2015) Activation of dorsal raphe serotonergic neurons promotes waiting but is not reinforcing
Current Biology 25:306–315.

https://doi.org/10.1016/j.cub.2014.12.002
- PubMed
- Google Scholar
(2013) Control of basal ganglia output by direct and indirect pathway projection neurons
Journal of Neuroscience 33:18531–18539.

https://doi.org/10.1523/JNEUROSCI.1278-13.2013
- PubMed
- Google Scholar
(2013) Basal ganglia output to the thalamus: still a paradox
Trends in Neurosciences 36:695–705.

https://doi.org/10.1016/j.tins.2013.09.001
- PubMed
- Google Scholar
(1988) Effects of electrical stimulation of brain reward sites on release of dopamine in rat: an in vivo electrochemical study
Brain Research Bulletin 21:319–324.

https://doi.org/10.1016/0361-9230(88)90247-X
- PubMed
- Google Scholar
1. Graybiel AM
(2005) The basal ganglia: learning new tricks and loving it
Current Opinion in Neurobiology 15:638–644.

https://doi.org/10.1016/j.conb.2005.10.006
- PubMed
- Google Scholar
1. Groenewegen HJ
(1988) Organization of the afferent connections of the mediodorsal thalamic nucleus in the rat, related to the mediodorsal-prefrontal topography
Neuroscience 24:379–431.

https://doi.org/10.1016/0306-4522(88)90339-9
- PubMed
- Google Scholar
1. Gross CT
2. Canteras NS
(2012) The many paths to fear
Nature Reviews Neuroscience 13:651–658.

https://doi.org/10.1038/nrn3301
- PubMed
- Google Scholar
1. Guo ZV
2. Inagaki HK
3. Daie K
4. Druckmann S
5. Gerfen CR
6. Svoboda K
(2017) Maintenance of persistent activity in a frontal thalamocortical loop
Nature 545:181–186.

https://doi.org/10.1038/nature22324
- Google Scholar
1. Herkenham M
(1979) The afferent and efferent connections of the ventromedial thalamic nucleus in the rat
The Journal of Comparative Neurology 183:487–517.

https://doi.org/10.1002/cne.901830304
- PubMed
- Google Scholar
1. Hikida T
2. Kimura K
3. Wada N
4. Funabiki K
5. Nakanishi S
(2010) Distinct roles of synaptic transmission in direct and indirect striatal pathways to reward and aversive behavior
Neuron 66:896–907.

https://doi.org/10.1016/j.neuron.2010.05.011
- PubMed
- Google Scholar
(1989) Functional properties of monkey caudate neurons. III. Activities related to expectation of target and reward
Journal of Neurophysiology 61:814–832.

https://doi.org/10.1152/jn.1989.61.4.814
- PubMed
- Google Scholar
(2000) Role of the basal ganglia in the control of purposive saccadic eye movements
Physiological Reviews 80:953–978.

https://doi.org/10.1152/physrev.2000.80.3.953
- PubMed
- Google Scholar
(2005) Morphine reward in dopamine-deficient mice
Nature 438:854–857.

https://doi.org/10.1038/nature04172
- PubMed
- Google Scholar
(2007) Cocaine-conditioned place preference by dopamine-deficient mice is mediated by serotonin
Journal of Neuroscience 27:12484–12488.

https://doi.org/10.1523/JNEUROSCI.3133-07.2007
- PubMed
- Google Scholar
(2016) Basal ganglia output controls active avoidance behavior
The Journal of Neuroscience 36:10274–10284.

https://doi.org/10.1523/JNEUROSCI.1842-16.2016
- PubMed
- Google Scholar
1. Ilango A
2. Kesner AJ
3. Keller KL
4. Stuber GD
5. Bonci A
6. Ikemoto S
(2014) Similar roles of substantia nigra and ventral tegmental dopamine neurons in reward and aversion
The Journal of Neuroscience 34:817–822.

https://doi.org/10.1523/JNEUROSCI.1703-13.2014
- PubMed
- Google Scholar
(2000) Pedunculopontine tegmental nucleus lesions impair stimulus--reward learning in autoshaping and conditioned reinforcement paradigms
Behavioral Neuroscience 114:285–294.

https://doi.org/10.1037/0735-7044.114.2.285
- PubMed
- Google Scholar
(2003) Effects of ventrolateral-ventromedial thalamic lesions on motor coordination and spatial orientation in rats
Neuroscience Research 47:309–316.

https://doi.org/10.1016/S0168-0102(03)00224-4
- PubMed
- Google Scholar
1. Jin X
2. Costa RM
(2010) Start/stop signals emerge in nigrostriatal circuits during sequence learning
Nature 466:457–462.

https://doi.org/10.1038/nature09263
- PubMed
- Google Scholar
1. Kawai R
2. Markman T
3. Poddar R
4. Ko R
5. Fantana AL
6. Dhawale AK
7. Kampff AR
8. Ölveczky BP
(2015) Motor cortex is required for learning but not for executing a motor skill
Neuron 86:800–812.

https://doi.org/10.1016/j.neuron.2015.03.024
- PubMed
- Google Scholar
1. Kim H
2. Sul JH
3. Huh N
4. Lee D
5. Jung MW
(2009) Role of striatum in updating values of chosen actions
Journal of Neuroscience 29:14701–14712.

https://doi.org/10.1523/JNEUROSCI.2728-09.2009
- PubMed
- Google Scholar
1. Kimchi EY
2. Laubach M
(2009) Dynamic encoding of action selection by the medial striatum
Journal of Neuroscience 29:3148–3159.

https://doi.org/10.1523/JNEUROSCI.5206-08.2009
- PubMed
- Google Scholar
(1985) Rigidity and catalepsy after injections of muscimol into the ventromedial thalamic nucleus: an electromyographic study in the rat
Experimental Brain Research 58:559–569.

https://doi.org/10.1007/BF00235872
- PubMed
- Google Scholar
(1986) The rat ventromedial thalamic nucleus and motor control: role of N-methyl-D-aspartate-mediated excitation, GABAergic inhibition, and muscarinic transmission
The Journal of Neuroscience 6:1702–1711.

https://doi.org/10.1523/JNEUROSCI.06-06-01702.1986
- PubMed
- Google Scholar
1. Kravitz AV
2. Freeze BS
3. Parker PR
4. Kay K
5. Thwin MT
6. Deisseroth K
7. Kreitzer AC
(2010) Regulation of parkinsonian motor behaviours by optogenetic control of basal ganglia circuitry
Nature 466:622–626.

https://doi.org/10.1038/nature09159
- PubMed
- Google Scholar
(2012) Distinct roles for direct and indirect pathway striatal neurons in reinforcement
Nature Neuroscience 15:816–818.

https://doi.org/10.1038/nn.3100
- PubMed
- Google Scholar
1. Kuramoto E
2. Ohno S
3. Furuta T
4. Unzai T
5. Tanaka YR
6. Hioki H
7. Kaneko T
(2015) Ventral medial nucleus neurons send thalamocortical afferents more widely and more preferentially to layer 1 than neurons of the ventral anterior-ventral lateral nuclear complex in the rat
Cerebral Cortex 25:221–235.

https://doi.org/10.1093/cercor/bht216
- PubMed
- Google Scholar
1. Lau B
2. Glimcher PW
(2008) Value representations in the primate striatum during matching behavior
Neuron 58:451–463.

https://doi.org/10.1016/j.neuron.2008.02.021
- PubMed
- Google Scholar
(2002) A neural correlate of response bias in monkey caudate nucleus
Nature 418:413–417.

https://doi.org/10.1038/nature00892
- PubMed
- Google Scholar
1. Lee HJ
2. Weitz AJ
3. Bernal-Casas D
4. Duffy BA
5. Choy M
6. Kravitz AV
7. Kreitzer AC
8. Lee JH
(2016) Activation of direct and indirect pathway medium spiny neurons drives distinct Brain-wide responses
Neuron 91:412–424.

https://doi.org/10.1016/j.neuron.2016.06.010
- PubMed
- Google Scholar
1. Liu Z
2. Zhou J
3. Li Y
4. Hu F
5. Lu Y
6. Ma M
7. Feng Q
8. Zhang JE
9. Wang D
10. Zeng J
11. Bao J
12. Kim JY
13. Chen ZF
14. El Mestikawy S
15. Luo M
(2014) Dorsal raphe neurons signal reward through 5-HT and glutamate
Neuron 81:1360–1374.

https://doi.org/10.1016/j.neuron.2014.02.010
- PubMed
- Google Scholar
1. Maia TV
2. Frank MJ
(2011) From reinforcement learning models to psychiatric and neurological disorders
Nature Neuroscience 14:154–162.

https://doi.org/10.1038/nn.2723
- PubMed
- Google Scholar
1. Mailly P
2. Charpier S
3. Mahon S
4. Menetrey A
5. Thierry AM
6. Glowinski J
7. Deniau JM
(2001) Dendritic arborizations of the rat substantia nigra pars reticulata neurons: spatial organization and relation to the lamellar compartmentation of striato-nigral projections
The Journal of Neuroscience 21:6874–6888.

https://doi.org/10.1523/JNEUROSCI.21-17-06874.2001
- PubMed
- Google Scholar
1. McDevitt RA
2. Tiran-Cappello A
3. Shen H
4. Balderas I
5. Britt JP
6. Marino RAM
7. Chung SL
8. Richie CT
9. Harvey BK
10. Bonci A
(2014) Serotonergic versus nonserotonergic dorsal raphe projection neurons: differential participation in reward circuitry
Cell Reports 8:1857–1869.

https://doi.org/10.1016/j.celrep.2014.08.037
- PubMed
- Google Scholar
1. Nakamura K
2. Hikosaka O
(2006) Facilitation of saccadic eye movements by postsaccadic electrical stimulation in the primate caudate
Journal of Neuroscience 26:12885–12895.

https://doi.org/10.1523/JNEUROSCI.3688-06.2006
- PubMed
- Google Scholar
1. Oldenburg IA
2. Sabatini BL
(2015) Antagonistic but not symmetric regulation of primary motor cortex by basal ganglia direct and indirect pathways
Neuron 86:1174–1181.

https://doi.org/10.1016/j.neuron.2015.05.008
- PubMed
- Google Scholar
1. Oyama K
2. Hernádi I
3. Iijima T
4. Tsutsui K
(2010) Reward prediction error coding in dorsal striatal neurons
Journal of Neuroscience 30:11447–11457.

https://doi.org/10.1523/JNEUROSCI.1719-10.2010
- PubMed
- Google Scholar
(2013) NMDA receptor subunit diversity: impact on receptor properties, synaptic plasticity and disease
Nature Reviews Neuroscience 14:383–400.

https://doi.org/10.1038/nrn3504
- PubMed
- Google Scholar
1. Papadopoulos GC
2. Parnavelas JG
(1990) Distribution and synaptic organization of dopaminergic axons in the lateral geniculate nucleus of the rat
The Journal of Comparative Neurology 294:356–361.

https://doi.org/10.1002/cne.902940305
- PubMed
- Google Scholar
1. Parnaudeau S
2. O'Neill PK
3. Bolkan SS
4. Ward RD
5. Abbas AI
6. Roth BL
7. Balsam PD
8. Gordon JA
9. Kellendonk C
(2013) Inhibition of mediodorsal thalamus disrupts thalamofrontal connectivity and cognition
Neuron 77:1151–1162.

https://doi.org/10.1016/j.neuron.2013.01.038
- PubMed
- Google Scholar
1. Pasupathy A
2. Miller EK
(2005) Different time courses of learning-related activity in the prefrontal cortex and striatum
Nature 433:873–876.

https://doi.org/10.1038/nature03287
- PubMed
- Google Scholar
(1976) Dopaminergic substrates of intracranial self-stimulation in the caudate-putamen
Brain Research 104:221–232.

https://doi.org/10.1016/0006-8993(76)90615-6
- PubMed
- Google Scholar
1. Phillips AG
(1984) Brain reward circuitry: a case for separate systems
Brain Research Bulletin 12:195–201.

https://doi.org/10.1016/0361-9230(84)90189-8
- PubMed
- Google Scholar
(2001) A cellular mechanism of reward-related learning
Nature 413:67–70.

https://doi.org/10.1038/35092560
- PubMed
- Google Scholar
(2005) Distinguishing whether dopamine regulates liking, wanting, and/or learning about rewards
Behavioral Neuroscience 119:5–15.

https://doi.org/10.1037/0735-7044.119.1.5
- PubMed
- Google Scholar
1. Roseberry TK
2. Lee AM
3. Lalive AL
4. Wilbrecht L
5. Bonci A
6. Kreitzer AC
(2016) Cell-Type-Specific control of brainstem locomotor circuits by basal ganglia
Cell 164:526–537.

https://doi.org/10.1016/j.cell.2015.12.037
- PubMed
- Google Scholar
1. Routtenberg A
(1971) Forebrain pathways of reward in Rattus norvegicus
Journal of Comparative and Physiological Psychology 75:269–276.

https://doi.org/10.1037/h0030927
- PubMed
- Google Scholar
1. Ryczko D
2. Dubuc R
(2013) The multifunctional mesencephalic locomotor region
Current pharmaceutical design 19:4448–4470.

https://doi.org/10.2174/1381612811319240011
- PubMed
- Google Scholar
1. Samejima K
2. Ueda Y
3. Doya K
4. Kimura M
(2005) Representation of action-specific reward values in the striatum
Science 310:1337–1340.

https://doi.org/10.1126/science.1115270
- PubMed
- Google Scholar
1. Scharff C
2. Nottebohm F
(1991) A comparative study of the behavioral deficits following lesions of various parts of the zebra finch song system: implications for vocal learning
The Journal of Neuroscience 11:2896–2913.

https://doi.org/10.1523/JNEUROSCI.11-09-02896.1991
- PubMed
- Google Scholar
1. Schultz W
2. Dickinson A
(2000) Neuronal coding of prediction errors
Annual Review of Neuroscience 23:473–500.

https://doi.org/10.1146/annurev.neuro.23.1.473
- PubMed
- Google Scholar
(2008) Dichotomous dopaminergic control of striatal synaptic plasticity
Science 321:848–851.

https://doi.org/10.1126/science.1160575
- PubMed
- Google Scholar
1. Starr MS
2. Summerhayes M
(1983) Role of the ventromedial nucleus of the thalamus in motor behaviour--I. Effects of focal injections of drugs
Neuroscience 10:1157–1169.

https://doi.org/10.1016/0306-4522(83)90106-9
- PubMed
- Google Scholar
1. Tai LH
2. Lee AM
3. Benavidez N
4. Bonci A
5. Wilbrecht L
(2012) Transient stimulation of distinct subpopulations of striatal neurons mimics changes in action value
Nature Neuroscience 15:1281–1289.

https://doi.org/10.1038/nn.3188
- PubMed
- Google Scholar
(2000) Modulation of long-term depression by dopamine in the mesolimbic system
The Journal of Neuroscience 20:5581–5586.

https://doi.org/10.1523/JNEUROSCI.20-15-05581.2000
- PubMed
- Google Scholar
1. Turner RS
2. Desmurget M
(2010) Basal ganglia contributions to motor control: a vigorous tutor
Current Opinion in Neurobiology 20:704–716.

https://doi.org/10.1016/j.conb.2010.08.022
- PubMed
- Google Scholar
(2012) Whole-brain mapping of direct inputs to midbrain dopamine neurons
Neuron 74:858–873.

https://doi.org/10.1016/j.neuron.2012.03.017
- PubMed
- Google Scholar
1. White NM
2. Hiroi N
(1998) Preferential localization of self-stimulation sites in striosomes/patches in the rat striatum
PNAS 95:6486–6491.

https://doi.org/10.1073/pnas.95.11.6486
- PubMed
- Google Scholar
1. Williams ZM
2. Eskandar EN
(2006) Selective enhancement of associative learning by microstimulation of the anterior caudate
Nature Neuroscience 9:562–568.

https://doi.org/10.1038/nn1662
- PubMed
- Google Scholar
1. Wise RA
2. Bozarth MA
(1984) Brain reward circuitry: four circuit elements "wired" in apparent series
Brain Research Bulletin 12:203–208.

https://doi.org/10.1016/0361-9230(84)90190-4
- PubMed
- Google Scholar
1. Wise RA
(1996) Addictive drugs and brain stimulation reward
Annual Review of Neuroscience 19:319–340.

https://doi.org/10.1146/annurev.ne.19.030196.001535
- PubMed
- Google Scholar
1. Xiao C
2. Cho JR
3. Zhou C
4. Treweek JB
5. Chan K
6. McKinney SL
7. Yang B
8. Gradinaru V
(2016) Cholinergic mesopontine signals govern locomotion and reward through dissociable midbrain pathways
Neuron 90:333–347.

https://doi.org/10.1016/j.neuron.2016.03.028
- PubMed
- Google Scholar
(2015) Selective corticostriatal plasticity during acquisition of an auditory discrimination task
Nature 521:348–351.

https://doi.org/10.1038/nature14225
- PubMed
- Google Scholar
(2014) A critical time window for dopamine actions on the structural plasticity of dendritic spines
Science 345:1616–1620.

https://doi.org/10.1126/science.1255514
- PubMed
- Google Scholar
(2005a) Blockade of NMDA receptors in the dorsomedial striatum prevents action-outcome learning in instrumental conditioning
European Journal of Neuroscience 22:505–512.

https://doi.org/10.1111/j.1460-9568.2005.04219.x
- PubMed
- Google Scholar
(2005b) The role of the dorsomedial striatum in instrumental conditioning
European Journal of Neuroscience 22:513–523.

https://doi.org/10.1111/j.1460-9568.2005.04218.x
- PubMed
- Google Scholar
1. Yoo JH
2. Zell V
3. Wu J
4. Punta C
5. Ramajayam N
6. Shen X
7. Faget L
8. Lilascharoen V
9. Lim BK
10. Hnasko TS
(2017) Activation of pedunculopontine glutamate neurons is reinforcing
The Journal of Neuroscience 37:38–46.

https://doi.org/10.1523/JNEUROSCI.3082-16.2016
- PubMed
- Google Scholar
1. Yttri EA
2. Dudman JT
(2016) Opponent and bidirectional control of movement velocity in the basal ganglia
Nature 533:402–406.

https://doi.org/10.1038/nature17639
- PubMed
- Google Scholar
(1984) Acquisition and extinction of L-maze and conditioned avoidance behaviours following kainic acid-induced lesions of the ventromedial thalamic nuclei in rats
Brain Research Bulletin 12:617–623.

https://doi.org/10.1016/0361-9230(84)90141-2
- PubMed
- Google Scholar

Article and author information

Author details

Arnaud L Lalive

The Gladstone Institutes, San Francisco, United States

Contribution
Conceptualization, Resources, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Writing—original draft, Project administration, Writing—review and editing

Competing interests
No competing interests declared
Anthony D Lien

The Gladstone Institutes, San Francisco, United States

Contribution
Investigation, Methodology

Competing interests
No competing interests declared
Thomas K Roseberry
1. The Gladstone Institutes, San Francisco, United States
2. Neuroscience Graduate Program, University of California, San Francisco, United States
Present address
Neuralink Corp, San Francisco, United States

Contribution
Conceptualization, Formal analysis, Funding acquisition, Validation, Investigation, Visualization, Writing—original draft, Project administration, Writing—review and editing

Competing interests
No competing interests declared
Christopher H Donahue

The Gladstone Institutes, San Francisco, United States

Contribution
Investigation, Methodology

Competing interests
No competing interests declared
Anatol C Kreitzer
1. The Gladstone Institutes, San Francisco, United States
2. Neuroscience Graduate Program, University of California, San Francisco, United States
3. Departments of Physiology and Neurology, University of California, San Francisco, United States
Contribution
Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Methodology, Writing—original draft, Writing—review and editing

For correspondence
akreitzer@gladstone.ucsf.edu

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0001-7423-2398

Funding

Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung

Arnaud L Lalive

National Institutes of Health (U01 NS094342)

Anatol C Kreitzer

National Institutes of Health (P01 DA010154)

Anatol C Kreitzer

National Institutes of Health (R01 NS064984)

Anatol C Kreitzer

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank D Schulte and B Margolin for technical assistance. We thank J Berke and members of the Kreitzer lab for useful discussions. This research was supported by the Swiss National Science Foundation (ALL) and the NIH (ACK).

Ethics

Animal experimentation: This study was performed in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. All of the animals were handled according to approved institutional animal care and use committee (IACUC) protocols (AN144957) of the University of California, San Francisco. All surgery was performed under isoflurane anesthesia, and every effort was made to minimize suffering.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.