Slowly evolving dopaminergic activity modulates the moment-to-moment probability of reward-related self-timed movements

Abstract

Clues from human movement disorders have long suggested that the neurotransmitter dopamine plays a role in motor control, but how the endogenous dopaminergic system influences movement is unknown. Here, we examined the relationship between dopaminergic signaling and the timing of reward-related movements in mice. Animals were trained to initiate licking after a self-timed interval following a start-timing cue; reward was delivered in response to movements initiated after a criterion time. The movement time was variable from trial-to-trial, as expected from previous studies. Surprisingly, dopaminergic signals ramped-up over seconds between the start-timing cue and the self-timed movement, with variable dynamics that predicted the movement/reward time on single trials. Steeply rising signals preceded early lick-initiation, whereas slowly rising signals preceded later initiation. Higher baseline signals also predicted earlier self-timed movements. Optogenetic activation of dopamine neurons during self-timing did not trigger immediate movements, but rather caused systematic early-shifting of movement initiation, whereas inhibition caused late-shifting, as if modulating the probability of movement. Consistent with this view, the dynamics of the endogenous dopaminergic signals quantitatively predicted the moment-by-moment probability of movement initiation on single trials. We propose that ramping dopaminergic signals, likely encoding dynamic reward expectation, can modulate the decision of when to move.

Editor's evaluation

Dopamine loss in Parkinson's disease results in impaired movement initiation and execution, but the precise relationship between dopamine activity and the decision to move is poorly understood. Here, the authors imaged mesostriatal dopamine signals as head-fixed mice decided when, after a cue, to retrieve water from a spout. Surprisingly, ramps in dopamine activity predicted, even on single trials, the timing of licks. Fast ramps preceded early retrievals; slow ones preceded late ones. Optogenetic activation or suppression of dopamine activity accelerated or delayed lick initiation, respectively. Together, these findings reveal strong links between ramps in dopamine activity and the timing of self-initiated movement.

https://doi.org/10.7554/eLife.62583.sa0

Introduction

What makes us move? Empirically, a few hundred milliseconds before movement, thousands of neurons in the motor system suddenly become active in concert, and this neural activity is relayed via spinal and brainstem neurons to recruit muscle fibers that power movement (Shenoy et al., 2013). Yet just before this period of intense neuronal activity, the motor system is largely quiescent. How does the brain suddenly and profoundly rouse motor neurons into the coordinated action needed to trigger movement?

In the case of movements made in reaction to external stimuli, activity evoked first in sensory brain areas is presumably passed along to appropriate motor centers to trigger this coordinated neural activity, thereby leading to movement. But humans and animals can also self-initiate movement without overt, external input (Deecke, 1996; Hallett, 2007; Lee and Assad, 2003; Romo et al., 1992). For example, while reading this page, you may decide without prompting to reach for your coffee. In that case, the movement cannot be clearly related to an abrupt, conspicuous sensory cue. What ‘went off’ in your brain that made you reach for your coffee at this particular moment, as opposed to a moment earlier or later?

Human movement disorders may provide clues to this mystery. Patients with Parkinson’s disease, and animal models of the disease, have difficulty self-initiating movements, exemplified by perseveration (Hughes et al., 2013), trouble initiating steps when walking (Bloxham et al., 1984), and problems timing movements (Malapani et al., 1998; Meck, 1986; Meck, 2006; Mikhael and Gershman, 2019). In contrast to these self-generated actions, externally cued reactions are often less severely affected in Parkinson’s, a phenomenon sometimes referred to as ‘paradoxical kinesia’ (Barthel et al., 2018; Bloxham et al., 1984). For example, patients’ gait can be normalized by walking aids that prompt steps in reaction to visual cues displayed on the ground (Barthel et al., 2018).

Because the underlying neuropathophysiology of Parkinson’s includes the loss of midbrain dopaminergic neurons (DANs), the symptomatology of Parkinson’s suggests DAN activity plays an important role in deciding when to self-initiate movement. Indeed, pharmacological manipulations of the neurotransmitter dopamine causally and bidirectionally influence movement timing (Dews and Morse, 1958; Lustig and Meck, 2005; Meck, 1986; Mikhael and Gershman, 2019; Schuster and Zimmerman, 1961). This can be demonstrated in the context of self-timed movement tasks, in which subjects reproduce a target timing interval by making a movement following a self-timed delay that is referenced to a start-timing cue (Malapani et al., 1998). Species across the animal kingdom, from rodents and birds to primates, can learn these tasks and produce self-timed movements that occur, on average, at about the target time, although the exact timing exhibits considerable variability from trial-to-trial (Gallistel and Gibbon, 2000; Meck, 2006; Mello et al., 2015; Merchant et al., 2013; Rakitin et al., 1998; Remington et al., 2018; Schuster and Zimmerman, 1961; Sohn et al., 2019; Wang et al., 2018). In such self-timed movement tasks, decreased dopamine availability/efficacy (e.g., Parkinson’s, neuroleptic drugs) generally produces late-shifted movements (Malapani et al., 1998; Meck, 1986; Meck, 2006; Merchant et al., 2013), whereas high dopamine conditions (e.g., amphetamines) produce early-shifting (Dews and Morse, 1958; Schuster and Zimmerman, 1961).

Although exogenous dopamine manipulations can influence timing behavior, it remains unclear whether endogenous DAN activity is involved in determining when to move. DANs densely innervate the striatum, where they modulate the activity of spiny projection neurons of the direct and indirect pathways, which are thought to exert a push-pull influence on movement centers (Albin et al., 1989; DeLong, 1990; Freeze et al., 2013; Grillner and Robertson, 2016). Most studies on endogenous DAN activity have focused on reward-related signals, but there are also reports of movement-related DAN signals. For example, phasic bursts of dopaminergic activity have been observed just prior to movement onset (within ~500ms; Coddington and Dudman, 2018; Coddington and Dudman, 2019; da Silva et al., 2018; Dodson et al., 2016; Howe and Dombeck, 2016; Wang and Tsien, 2011), and dopaminergic signals have been reported to reflect more general encoding of movement kinematics (Barter et al., 2015; Engelhard et al., 2019; Parker et al., 2016). However, optogenetic activation of dopamine neurons—within physiological range—does not elicit immediate movements (Coddington and Dudman, 2018; Coddington and Dudman, 2019). We hypothesized that rather than overtly triggering movements, the ongoing activity of nigrostriatal DANs could influence movement initiation over longer timescales by controlling or modulating the moment-by-moment decision of when to execute a planned movement.

To test this hypothesis, we trained mice to make a movement (lick) after a self-timed interval following a start-timing cue. The mice learned the timed interval, but, as observed in other species, the exact timing of movement was highly variable from trial-to-trial, spanning seconds. We exploited this inherent variability by examining how moment-to-moment nigrostriatal DAN signals differed when animals decided to move relatively early versus late. We found that dopaminergic signals ‘ramped up’ during the timing interval, with variable dynamics that were highly predictive of trial-by-trial movement timing, even seconds before the movement occurred. Because reward was delivered at the time of movement, the ramping dopaminergic signals likely related to the animal’s expectation of when reward would be available in response to movement. Furthermore, optogenetic DAN manipulation during the timing interval produced bidirectional changes in the probability of movement timing, with activation causing a bias toward earlier self-timed movements and suppression causing a bias toward later self-timed movements. These combined observations suggest a novel role for the dopaminergic system in the timing of movement initiation, wherein slowly evolving dopaminergic signals, likely driven by reward expectation, can modulate the moment-to-moment probability of whether a reward-related movement will occur.

Results

We trained head-fixed mice to make self-timed movements to receive juice rewards (Figure 1A). Animals received an audio/visual start-timing cue and then had to decide when to first-lick in the absence of further cues. Animals only received juice if they waited a prescribed interval following the cue before making their first-lick (>3.3 s in most experiments). As expected from previous studies, first-lick times were broadly distributed over several seconds and exhibited the canonical scalar property of timing, as described by Weber’s Law (Figure 1B and Figure 1—figure supplement 1A-B; Gallistel and Gibbon, 2000). We note this variability in timing was not imposed on the animal by training it to reproduce a variety of target intervals (e.g., 2 vs. 5 s), but is rather a natural consequence of timing behavior, even for a single target interval.

Figure 1. Self-timed movement task.

(A) Task schematic (3.3 s version shown). (B) First-lick timing distributions generated by the same mouse exhibit the scalar property of timing (Weber’s Law). Red: 3.3 s target time (four sessions); Blue: 5 s target time (four sessions). For all mice, see Figure 1—figure supplement 1B. (C–E) Hazard-function analysis. Time = 0 is the start-timing cue; dashed vertical lines are target times. (C) Uniform instantaneous probability of movement over time is equivalent to a flat hazard rate (bottom) and produces an exponential first-lick timing distribution (top). (D) Before Training: First day of exposure to the self-timed movement task. Top: average first-lick timing distribution across mice; bottom: corresponding hazard functions. Gray traces: single session data. Red traces: average among all sessions, with shading indicating 95% confidence interval produced by 10,000x bootstrap procedure. (E) Trained Behavior: Hazard functions (bottom) computed from the first-lick timing distributions for the 3.3 s- and 5 s tasks (top) reveal peaks at the target times. Right: average first-lick timing distribution and hazard functions for all 12 GCaMP6f photometry animals. Source data: Figure 1—source data 1.

Figure 1—source data 1. Self-timed movement task behavioral data: https://cdn.elifesciences.org/articles/62583/elife-62583-fig1-data1-v3.zip

Our main objective was to exploit the inherent variability in self-timed behavior to examine how differences in neural activity might relate to variability in movement timing. Nonetheless, the trained animals understood the timing contingencies of the task well. In self-timed movement tasks in which a single movement is used to assess timing, the distributions of movement times (in both rodents and monkeys) tend to anticipate the target interval, even at the expense of reward on many trials (Eckard and Kyonka, 2018; Kirshenbaum et al., 2008; Lee and Assad, 2003). In these paradigms, however, once a movement occurs, it removes future opportunities to move, which creates premature ‘bias’ in the raw timing distributions (Anger, 1956). To correct this bias, movement times must be normalized by the (ever-diminishing) number of opportunities to move at each timepoint (Jaldow et al., 1990). This yields the hazard function (the conditional probability of movement given that movement has not already occurred, as a function of time), which is equivalent to the instantaneous probability of movement. For example, on the first day of training, our animals displayed fairly flat hazard functions, indicating a uniform instantaneous probability of movement over time; that is, the animals did not yet understand the timing contingency (Figure 1C–D). However, after training, the hazard function for our animals peaked near the target time (either 3.3 or 5 s), suggesting an accurate latent timing process reflected in the instantaneous movement probability (Figure 1E). Mice trained on a variant of the self-timed movement task without lamp-off/on events showed no systematic differences in their timing distributions (Figure 1—figure supplement 1C), suggesting that the mice referenced their timing to the start-timing cue rather than the lamp-off event.
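The normalization described above can be illustrated with a minimal sketch. It assumes a discrete-time hazard computed on fixed time bins; the function name `hazard_function`, the bin edges, and the toy first-lick times are hypothetical and are not taken from the study's analysis code.

```python
def hazard_function(first_lick_times, bin_edges):
    """Discrete-time hazard: P(first-lick in bin | no first-lick before that bin).

    Trials with a first-lick at or after the last bin edge still count as
    'at risk' in every bin, but never contribute an event.
    """
    n_bins = len(bin_edges) - 1
    counts = [0] * n_bins
    for t in first_lick_times:
        for i in range(n_bins):
            if bin_edges[i] <= t < bin_edges[i + 1]:
                counts[i] += 1
                break
    # Number of trials that still have the opportunity to move.
    at_risk = sum(1 for t in first_lick_times if t >= bin_edges[0])
    hazard = []
    for c in counts:
        hazard.append(c / at_risk if at_risk > 0 else float("nan"))
        at_risk -= c  # trials that moved are removed from the at-risk pool
    return hazard


# Toy data (hypothetical): the raw histogram decays (8, 4, 2, 1 events per bin),
# yet each bin removes half of the remaining trials, so the hazard is flat.
toy_times = [0.5] * 8 + [1.5] * 4 + [2.5] * 2 + [3.5] + [4.5]
toy_hazard = hazard_function(toy_times, [0, 1, 2, 3, 4])
print(toy_hazard)
```

In this toy case the raw counts fall over time even though the instantaneous probability of moving is constant, which is the 'bias' in raw timing distributions that the hazard function removes.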

When mice were fully trained, we employed fiber photometry to record the activity of genetically-defined DANs expressing the calcium-sensitive fluorophore GCaMP6f (12 mice, substantia nigra pars compacta [SNc]; Figure 1—figure supplement 2). We controlled for mechanical/optical artifacts by simultaneously recording fluorescence modulation of a co-expressed, calcium-insensitive fluorophore, tdTomato. We also recorded bodily movements with neck-muscle EMG, high-speed video, and a back-mounted accelerometer.

DAN signals ramp up slowly between the start-timing cue and self-timed movement

DAN GCaMP6f fluorescence typically exhibited brief transients following cue onset and immediately before first-lick onset (Figure 2A), as observed in previous studies (Coddington and Dudman, 2018; da Silva et al., 2018; Dodson et al., 2016; Howe and Dombeck, 2016; Schultz et al., 1997). However, during the timed interval, we observed slow ‘ramping up’ of fluorescence over seconds, with a minimum after the cue-aligned transient and maximum just before the lick-related transient. The relatively fast intrinsic decay kinetics of GCaMP6f (t_1/2 <100 ms at 37 °C; Helassa et al., 2016) should not produce appreciable signal integration over the seconds-long timescales of the ramps we observed.
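As a rough back-of-the-envelope check of the kinetics argument, the sketch below assumes a single-exponential decay (an assumption made here for illustration, not a claim about the indicator's full kinetics) and uses the 100 ms upper bound on the half-life quoted above. The function name `fraction_remaining` is hypothetical.

```python
def fraction_remaining(elapsed_s, half_life_s):
    """Fraction of a transient left after elapsed_s, assuming exponential decay."""
    return 0.5 ** (elapsed_s / half_life_s)


# With a 100 ms half-life, one second corresponds to ten half-lives.
after_one_second = fraction_remaining(1.0, 0.1)
print(after_one_second)
```

Under these assumptions, less than 0.1% of a transient would remain after one second, so residual indicator signal from an earlier event would be negligible on the timescale of the ramps.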

Figure 2. SNc DAN signals preceding self-timed movement.

(A) Left: surgical strategy for GCaMP6f/tdTomato fiber photometry. Right: average SNc DAN GCaMP6f response for first-licks between 3 and 3.25 s (12 mice). Data aligned separately to both cue-onset (left) and first-lick (right), with the break in the time axis indicating the change in plot alignment. (B) Average SNc DAN GCaMP6f responses for different first-lick times (indicated by dashed vertical lines). (C) Comparison of average DAN GCaMP6f and tdTomato responses on expanded vertical scale. Traces plotted up to 150 ms before first-lick. See also Figure 2—figure supplements 1–3. Source data: Figure 2—source data 1.

Figure 2—source data 1. SNc DAN signals during self-timing: https://cdn.elifesciences.org/articles/62583/elife-62583-fig2-data1-v3.zip
Figure 2—source data 2. Baseline SNc DAN signals: https://cdn.elifesciences.org/articles/62583/elife-62583-fig2-data2-v3.zip
Figure 2—source data 3. dF/F methods: https://cdn.elifesciences.org/articles/62583/elife-62583-fig2-data3-v3.zip
Figure 2—source data 4. DAN signals recorded at SNc, DLS and VTA: https://cdn.elifesciences.org/articles/62583/elife-62583-fig2-data4-v3.zip

We asked whether this ramping differed between trials in which the animal moved relatively early or late. Strikingly, when we averaged signals pooled by movement time, we observed systematic differences in the steepness of ramping that were highly predictive of movement timing (Figure 2B–C). Trials with early first-licks exhibited steep ramping, whereas trials with later first-licks started from lower fluorescence levels and rose more slowly toward the time of movement. The fluorescence ramps terminated at nearly the same amplitude, regardless of the movement time. Ramping dynamics were not evident in control tdTomato signals (Figure 2C), indicating that the ramping in the GCaMP6f signals was not an optical artifact. The quantitative relationship between GCaMP6f dynamics and movement time will be addressed in a subsequent section of this paper.

Higher pre-cue DAN signals are correlated with earlier self-timed movements

In addition to ramping dynamics, average DAN GCaMP6f signals were correlated with first-lick timing even before cue-onset, with higher baseline fluorescence predicting earlier first-licks (Figure 2B–C). This correlation began before the lamp-off event (the 2 s ‘Baseline’ period before lamp-off; Pearson’s r = −0.63 (95% CI = [−0.92, −0.14]), n = 12 mice) and grew stronger during the ‘Lamp-Off Interval’ between lamp-off and the cue (Pearson’s r = −0.89 (95% CI = [−0.98, −0.68]), n = 12 mice; Figure 2—figure supplement 1A-B). This correlation was independent of the duration of the lamp-off interval (Figure 2—figure supplement 1C). Because dF/F correction methods can potentially distort baseline measurements, we rigorously tested and validated three different dF/F methods, and we also repeated analyses with raw fluorescence values compared between pairs of sequential trials with different movement times (Figure 2—figure supplement 2; see Materials and methods). All reported results, including the systematic baseline differences, were robust to dF/F correction.

In principle, the amplitude of the baseline signal on a given trial n could be related to the animal’s behavior during the baseline interval or the outcome of the previous trial. To test this, we performed four-way ANOVA to compare the main effects of the following factors on the pre-cue signal (averaged for each trial between lamp-off and the start-timing cue, the ‘lamp-off interval’ (LOI), n = 12 mice): (1) presence or absence of spontaneous licking during the LOI; (2) outcome of the previous trial (rewarded or unrewarded); (3) upcoming movement time on trial n (categorized as <3.3 s or >3.3 s to provide a simple binary proxy for movement time); and (4) session number (to account for signal variability across animals and daily sessions). Although the effects of LOI-licking and previous trial outcome were statistically significant (F(1,18282) = 10.7, p = 0.008, η_p²=5.9·10^–4 and F(1,18282) = 281.2, p = 7.5·10^–47, η_p²=0.015, respectively), the upcoming movement time had an independent, statistically significant effect (F(1,18282) = 63.4, p = 5.9·10^–6, η_p²=0.0035). This raises the possibility of an additional source of variance in baseline dopaminergic activity that is independent from previous trial events, but potentially influences the upcoming movement time on that trial.
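For readers who want to relate the reported F statistics to the reported effect sizes, the sketch below uses the standard identity between partial eta squared and the F ratio. Whether the effect sizes above were computed this way is an assumption; the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared from an F ratio: F*df1 / (F*df1 + df2)."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)


# F values and degrees of freedom as quoted in the text above.
eta_loi_licking = partial_eta_squared(10.7, 1, 18282)
eta_prev_outcome = partial_eta_squared(281.2, 1, 18282)
eta_movement_time = partial_eta_squared(63.4, 1, 18282)
print(eta_loi_licking, eta_prev_outcome, eta_movement_time)
```

The three values come out close to the effect sizes quoted above, which illustrates that all three factors, although statistically significant, each account for a small share of the variance in the pre-cue signal.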

Ramping dynamics in other dopaminergic areas and striatal dopamine release

We found similar ramping dynamics in SNc DAN axon terminals in the dorsolateral striatum (DLS; Figure 2—figure supplement 3A-B) at a location involved in goal-directed licking behavior (Sippy et al., 2015). Ramping was also present in GCaMP6f-expressing DAN cell bodies in the ventral tegmental area (VTA, Figure 2—figure supplement 3C), reminiscent of mesolimbic ramping signals described in goal-oriented navigation tasks (Howe et al., 2013; Kim et al., 2019).

To determine if these movement timing-related signals are available to downstream targets that may be involved in movement initiation, we monitored dopamine release in the DLS with two complementary fluorescent dopamine sensors (dLight1.1 and DA_2m) expressed broadly in striatal cells (Figure 3 and Figure 2—figure supplement 3D-E). The decay kinetics of the two extracellular dopamine sensors differ somewhat (Patriarchi et al., 2018; Sun et al., 2020), which we confirmed (dLight1.1 t_1/2~75 ms, DA_2m t_1/2~125 ms; Figure 3—figure supplement 1), yet both revealed similar timing-related ramping dynamics on average (Figure 3 inset). These combined data argue that the seconds-long dopaminergic ramping signals were not artifacts of sluggish temporal responses of the various fluorescent sensors and were ultimately expressed as ramp-like increases in dopamine release in the striatum.

Figure 3. Striatal dopamine release during the self-timed movement task.

Photometry signals averaged together from DA_2m signals (n = 4 mice) and dLight1.1 signals (n = 5 mice) recorded in DLS. Axis break and plot alignment as in Figure 2. Dashed lines: first-lick times. Inset, left: surgical strategy. Inset, right: Comparison of dLight1.1 and DA_2m dynamics. Expanded vertical scale to show ramping in the average signals for DA_2m (solid trace) and dLight1.1 (dashed trace) up until the time of the first-lick (first-lick occurred between 2 and 3 s after the cue for this subset of the data). See also Figure 3—figure supplement 1. Source data: Figure 3—source data 1.

Figure 3—source data 1. Striatal dopamine indicator signals: https://cdn.elifesciences.org/articles/62583/elife-62583-fig3-data1-v3.zip

First-lick timing-predictive DAN signals are not explained by ongoing body movements

The systematic ramping dynamics and baseline differences were not observed in the tdTomato optical control channel nor in any of the other movement-control channels, at least on average (Figure 4), making it unlikely that ramping dynamics resulted from optical artifacts. Nevertheless, because DANs show transient responses to salient cues and movements (Coddington and Dudman, 2018; da Silva et al., 2018; Dodson et al., 2016; Howe and Dombeck, 2016; Schultz et al., 1997), it is possible that fluorescence signals could reflect the superposition of dopaminergic responses to multiple task events, including the cue, lick, ongoing spurious body movements, and hidden cognitive processes like timing. For example, accelerating spurious movements could, in principle, produce motor-related neural activity that ramps up during the timed interval, perhaps even at different rates on different trials.

Figure 4. Movement controls reliably detected movements, but there were no systematic differences in movement during the timing interval.

(A) Schematic of movement-control measurements. (B) First-lick-aligned average movement signals on rewarded (red) and unrewarded (blue) trials. Pre-lick traces begin at the nearest cue-time (dashed red, dashed blue). Left: one session; Right: all sessions. Dashed grey line: time of earliest-detected movement on most sessions (150 ms before first-lick). Average first-lick-aligned tdTomato optical artifacts showed inconsistent excursion directions (up/down), even within the same session; signals for each artifact direction shown in Figure 4—figure supplement 1. Source data: Figure 4—source data 1.

Figure 4—source data 1. Movement control signals: https://cdn.elifesciences.org/articles/62583/elife-62583-fig4-data1-v3.zip

We thus derived a nested generalized linear encoding model of single-trial GCaMP6f signals (Engelhard et al., 2019; Park et al., 2014; Runyan et al., 2017), a data-driven, statistical approach designed to isolate and quantify the contributions of task events (timing-independent predictors) from processes predictive of movement timing (timing-dependent predictors; Figure 5A–B and Figure 5—figure supplement 1A-D). The model robustly detected task-event GCaMP6f kernels locked to cue, lick and EMG/accelerometer events, but these timing-independent predictors alone were insufficient to capture the rich variability of GCaMP6f signals for trials with different first-lick times, especially the timing-dependent ramp-slope and baseline offset (n = 12 mice, Figure 5C and Figure 5—figure supplement 1E-G). In contrast, two timing-dependent predictors robustly improved the model: (1) a baseline offset with amplitude linearly proportional to first-lick time; and (2) a ‘stretch’ feature representing percentages of the timed interval (Figure 5B–C and Figure 5—figure supplement 1E). The baseline offset term fit a baseline level inversely proportional to movement time, and the temporal stretch feature predicted a ramping dynamic from the time of the cue up to the first-lick, whose slope was inversely proportional to first-lick time. Similar results were obtained for SNc DAN axon terminals in the DLS, VTA DAN cell bodies, and extracellular striatal dopamine release (Figure 5—figure supplement 1H).

Figure 5. Contribution of optical artifacts, task variables and nuisance bodily movements to SNc GCaMP6f signals.

(A) Nested encoding model comparing the contribution of timing-independent predictors (TI) to the contribution of timing-dependent predictors (TD). (B) Predicted dF/F signal for one session plotted up to time of first-lick. Model error simulated 300x (shading). (C) Nested encoding model for one session showing the actual recorded signal (1st panel), the timing-independent model (2nd panel), and the full, timing-dependent model with all predictors (3rd panel). Top: GCaMP6f; Bottom: tdTomato (tdt). Right: relative loss improvement by timing-dependent predictors (grey dots: single sessions, line: median, box: lower/upper quartiles, whiskers: 1.5x IQR). See also Figure 5—figure supplement 1. Source data: Figure 5—source data 1.

Figure 5—source data 1. DAN signal encoding model: https://cdn.elifesciences.org/articles/62583/elife-62583-fig5-data1-v3.zip

We note that the stretch feature of this GLM makes no assumptions about the underlying shape of the dopaminergic signal; it only encodes percentages of timing intervals to allow for temporal ‘expansion’ or ‘contraction’ to fit whatever shape(s) were present in the data. In particular, the stretch feature cannot produce ramping unless ramping is present in the signal and temporally scales with the length of the interval. Because this feature empirically found a ramp (although not constrained to do so), the stretch aspect indicated that the underlying ramping process took place at different rates for trials with different movement times, at least on average.
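One way to picture why such a feature is shape-agnostic is the minimal sketch below. It assumes the stretch feature can be represented as a one-hot indicator over bins of elapsed fraction of the timed interval, with one free weight per bin. This parametrization, the bin count, and the names `stretch_design_row` and `predict` are illustrative assumptions, not the model specification used in the study.

```python
def stretch_design_row(t, first_lick_time, n_bins=10):
    """One-hot row marking which fraction-of-interval bin time t falls in.

    The same row is produced at matching fractions of short and long trials,
    so a weight attached to a bin is shared across trial durations.
    """
    if not 0 <= t <= first_lick_time:
        raise ValueError("t must lie between the cue (0) and the first-lick")
    frac = t / first_lick_time
    idx = min(int(frac * n_bins), n_bins - 1)
    row = [0] * n_bins
    row[idx] = 1
    return row


def predict(weights, row):
    """Model contribution of the stretch feature: dot product of weights and row."""
    return sum(w * x for w, x in zip(weights, row))


# Halfway through a 2 s trial and halfway through a 5 s trial share a bin.
row_short = stretch_design_row(1.0, 2.0)
row_long = stretch_design_row(2.5, 5.0)
print(row_short == row_long)
```

Because each bin has its own free weight, the fitted weights can trace any shape over the normalized interval. A ramp appears only if the fitted weights increase across bins, and in that case the same rise is compressed into short trials and stretched over long ones.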

In contrast to the GCaMP6f model, when the same GLM was applied to the tdTomato control signal, the timing-independent predictors (which could potentially cause optical/mechanical artifacts—the cue onset, first-lick, and EMG/accelerometer signal) improved the model, but timing-dependent predictors did not (Figure 5C and Figure 5—figure supplement 1F-H). In addition, separate principal component (PC) analysis revealed ramp-like and baseline-offset-like components that explained as much as 93% of the variance in DAN signals during the timing interval (mean: 66%, range: 16–93%), but similar PCs were not present when tdTomato control signals were analyzed with PCA (mean variance explained: 4%, range: 1.6–15%, Figure 5—figure supplement 2).

Single-trial DAN ramping and baseline signals predict movement timing

Given that ramping and baseline-offset signals were not explained by nuisance movements or optical artifacts, we asked whether DAN GCaMP6f fluorescence could predict first-lick timing on single trials. Using a simple threshold-crossing decoding model (Maimon and Assad, 2006), we found that single-trial GCaMP6f signals were predictive of first-lick time even for low thresholds intersecting the ‘base’ of the ramp, with the predictive value of the model progressively improving for higher thresholds (n = 12 mice: mean R² low = 0.54, mid = 0.71, high = 0.82 (95% CI: low=[0.44,0.64], mid=[0.68,0.75], high=[0.76,0.87]); analysis for one mouse shown in Figure 6A). We will return to this observation in more detail in the upcoming section on single-trial dynamics.

Figure 6. Single-trial DAN signals predict first-lick timing.

(A) Schematic of nested decoding model. Categories for (n−1)th trial predictors: (2) reaction, (3) early, (4) reward, (5) ITI first-lick (see Materials and methods). Bottom: single-trial cue-aligned SNc DAN GCaMP6f signals from one session (six trials shown for clarity). Traces plotted up to first-lick. Right: threshold-crossing model. Low/Mid/High label indicates threshold amplitude. Dots: single trials. (B) Model weights. Error bars: 95% CI, *: p<0.05, two-sided t-test. Numbers indicate nesting-order. (C) Variance explained by each model nest. Gray lines: single sessions; thick black line: average. For model selection, see Figure 6—figure supplement 1. (D) Predicted vs. actual first-lick time, same session as Figure 6A. See also Figure 6—figure supplements 1–4. Source data: Figure 6—source data 1.

Figure 6—source data 1. Movement time decoding model: https://cdn.elifesciences.org/articles/62583/elife-62583-fig6-data1-v3.zip

To more thoroughly determine the independent, additional predictive power of DAN baseline and ramping signals over other task variables (e.g., previous trial first-lick time and reward outcome, etc.), we derived a nested decoding model for first-lick time (Figure 6A). In this model, the pre-cue ‘baseline’ was divided into two components: the pre-lamp-off intertrial interval signal (‘ITI’) and the lamp-off to cue interval signal (‘LOI’). All predictors contributed to the predictive power of the model. However, even when we accounted for the contributions of prior trial history, tdTomato artifacts and baseline GCaMP6f signals, GCaMP6f threshold-crossing time robustly dominated the model and absorbed much of the variance explained by baseline dopaminergic signals, alone explaining 10% of the variance in first-lick time on average (range: 1–27%, Figure 6B–D). Alternate formulations of the decoding model produced similar results (Figure 6—figure supplement 1).

Characterizing single-trial dopaminergic dynamics

Although the threshold-crossing analysis made no assumptions about the underlying dynamics of the GCaMP6f signals on single-trials, in principle, ramping dynamics in averaged neural signals could be produced from individual trials with a single, discrete ‘step’ occurring at different times on different trials. Ramping has long been observed in averaged neural signals recorded during perceptual decision tasks in monkeys, and there has been considerable debate over whether single-trial responses in these experiments are better classified as ‘ramps’ or a single ‘step’ (Latimer et al., 2015; Latimer et al., 2016; Shadlen et al., 2016; Zoltowski et al., 2019; Zylberberg and Shadlen, 2016). It has even been suggested that different sampling distributions can produce opposite model classifications in ground-truth synthetic datasets (Chandrasekaran et al., 2018).

We attempted to classify single-trial dynamics as a discrete stepping or ramping process with hierarchical Bayesian models implemented in probabilistic programs (Figure 6—figure supplement 2A-B). However, like the perceptual decision-making studies, we also found ambiguous results, with about half of single trials best classified by a linear ramp and half best classified by a discrete step dynamic (Figure 6—figure supplement 2C). Nonetheless, three separate lines of evidence suggest that single trials are better characterized by slowly evolving ramps:

First, the relationship of threshold-crossing time to first-lick time is different for the step vs. ramp models when different threshold levels are sampled (Maimon and Assad, 2006), as schematized in Figure 6—figure supplement 3A: Increasing slope of this relationship is consistent with ramps on single trials, but not with a discrete step, which would be expected to have the same threshold-crossing time regardless of threshold level (Figure 6—figure supplement 3B). We found that the slope of this relationship increased markedly as the threshold level was increased, consistent with the ramp model (n = 12 mice: mean slope low = 0.46, mid = 0.7, high = 0.82 (95% CI: low=[0.37,0.54], mid=[0.66,0.73], high=[0.74,0.88]), Figure 6—figure supplement 3C).

Second, if single trials involve a step change occurring at different times from trial-to-trial, then aligning trials on that step should produce a clear step on average, rather than a ramp (Latimer et al., 2015). We thus aligned single-trial GCaMP6f signals according to the optimal step position determined from a Bayesian step model fit for each trial and then averaged the step-aligned signals across trials. The averaged signals did not resemble a step function, but rather yielded a sharp transient superimposed on a ‘background’ ramping signal (Figure 6—figure supplement 4A). Step-aligned tdTomato and EMG averages showed a small inflection at the time of the step, but neither signal showed background ramping. This suggests that the detected ‘steps’ in the GCaMP6f signals were likely transient movement artifacts superimposed on the slower ramping dynamic rather than bona fide steps.

Third, the ideal step model holds that the step occurs at different times from trial-to-trial, producing a ramping signal when trials are averaged together. In this view, the trial-by-trial variance of the signal should be maximal at the time at which 50% of the steps have occurred among all trials, and the variance should be minimal at the beginning and end of the interval (when no steps or all steps have occurred, respectively). We thus derived the optimal step time for each trial using the Bayesian step model, and we then calculated variance as a function of time within pools of trials with similar movement times. The signal variance showed a monotonic downward trend during the timed interval, with a minimum variance at the time of movement rather than at the point at which 50% of steps had occurred among trials, inconsistent with the discrete step model (Figure 6—figure supplement 4B).
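The variance prediction of the ideal step model follows from a short calculation, sketched here under simplifying assumptions that are ours rather than the paper's: a fixed baseline a, a fixed step height Δ, no additional noise, and a step time τ on each trial drawn from a distribution with cumulative distribution function F(t).

```latex
% Assumed notation (illustrative only): baseline a, step height \Delta,
% step time \tau_i on trial i, and F(t) = P(\tau \le t).
\begin{align}
  x_i(t) &= a + \Delta \, \mathbf{1}[t \ge \tau_i] \\
  \mathrm{E}[x(t)] &= a + \Delta \, F(t) \\
  \mathrm{Var}[x(t)] &= \Delta^2 \, F(t)\,\bigl(1 - F(t)\bigr)
\end{align}
% The variance is that of a scaled Bernoulli variable: it is zero when
% F(t) = 0 or F(t) = 1, and maximal (\Delta^2 / 4) when F(t) = 1/2.
```

Independent additive noise would add a constant offset to the variance without moving its peak, so the across-trial variance should still peak when half of the steps have occurred.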

Taken together, we did not find evidence for a discrete step dynamic on single trials; on the contrary, our observations concord with slow ramping dynamics on single trials. Regardless, our GLM movement-time decoding approaches in Figure 6 did not make any assumptions about underlying single-trial dynamics.

Moment-to-moment DAN activity causally controls movement timing on single trials

Because dopaminergic ramping signals robustly predicted first-lick timing and were apparently transmitted via dopamine release to downstream striatal neurons, ramping DAN activity may causally determine movement timing. However, because the animals could expect reward within a few hundred milliseconds of the first-lick, it is also possible that the dopaminergic ramps could instead serve as a ‘passive’ monitor of reward expectation without influencing movement initiation. To distinguish these possibilities, we optogenetically activated or inhibited DANs (in separate experiments) on 30% of randomly interleaved trials (Figure 7A and Figure 7—figure supplement 1). For activation experiments, we chose light levels that elevated DAN activity within the physiological range observed in our self-timed movement task, as assayed by simultaneous photometry in the DLS with a fluorescent sensor of released dopamine (dLight1.1, Figure 7—figure supplement 2). DAN activation significantly early-shifted the distribution of self-timed movements on stimulated trials compared to unstimulated trials (12 mice; two-sample Kolmogorov-Smirnov (KS) test, D = 0.078 (95% CI: [0.067,0.093]), p = 2.8 × 10^–26), whereas inhibition produced significant late-shifting compared to unstimulated trials (4 mice; two-sample KS test, D = 0.051 (95% CI: [0.034,0.077]), p = 3.1 × 10^–4; Figure 7B and Figure 7—figure supplement 3A). Stimulation of mice expressing no opsin produced no consistent effect on timing (5 mice; two-sample KS test, D = 0.017 (95% CI: [0.015,0.040]), p = 0.62). The direction of these effects was consistent across all animals tested in each category (Figure 7C). Complementary analysis methods revealed consistent effects (bootstrapped difference in median first-lick times between categories: Δ(activation – no-opsin) = –0.22 s (95% CI=[–0.32 s,–0.12 s]), Δ(inhibition – no-opsin) = +0.19 s (95% CI=[+0.09 s,+0.30 s]), Figure 7C–D; bootstrapped comparison of difference in area under the cdf curves: Δ(activation – no-opsin) = –0.31 dAUC (95% CI=[–0.47 dAUC,–0.15 dAUC]), Δ(inhibition – no-opsin) = +0.23 dAUC (95% CI=[+0.08 dAUC,+0.37 dAUC]), Figure 7—figure supplement 3B; bootstrapped difference in mean first-lick times between categories: Δ(activation – no-opsin) = –0.34 s (95% CI=[–0.49 s,–0.19 s]), Δ(inhibition – no-opsin) = +0.24 s (95% CI=[+0.09 s,+0.39 s]), Figure 7—figure supplement 3C). Similar effects were obtained with activation of SNc DAN axon terminals in the DLS (2 mice, Figure 7—figure supplement 3A-B). Because these exogenous manipulations of DAN activity modulated movement timing on the same trial as the stimulation/inhibition, this suggests that the endogenous dopaminergic ramping we observed during the self-timed movement task likewise affected movement initiation in real time, rather than serving solely as a passive monitor of reward expectation.
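For readers unfamiliar with these statistics, the following is a minimal sketch of a two-sample KS statistic and a bootstrapped difference in medians, written with the Python standard library only. It is not the authors' pipeline (the Key resources table lists Matlab for most analyses). The function names, the percentile-style confidence interval, and the default number of resamples are assumptions chosen for illustration.

```python
import random
import statistics
from bisect import bisect_right

def ks_statistic(a, b):
    """Two-sample KS statistic D: the largest vertical gap between the
    empirical cumulative distribution functions of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The maximum gap between two step functions occurs at a data point.
    for x in a_sorted + b_sorted:
        fa = bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d

def bootstrap_median_diff(stim, unstim, n_boot=2000, seed=0):
    """Bootstrap the difference in median first-lick time (stim minus unstim).

    Returns the median of the bootstrap differences and an approximate
    95% percentile interval. Resampling is with replacement within each group.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        s = rng.choices(stim, k=len(stim))
        u = rng.choices(unstim, k=len(unstim))
        diffs.append(statistics.median(s) - statistics.median(u))
    diffs.sort()
    lo = diffs[int(0.025 * n_boot)]
    hi = diffs[int(0.975 * n_boot) - 1]
    return statistics.median(diffs), (lo, hi)

if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic first-lick times in seconds; stimulated trials shifted earlier.
    unstim = [rng.gauss(3.5, 1.0) for _ in range(500)]
    stim = [rng.gauss(3.2, 1.0) for _ in range(300)]
    print("D =", ks_statistic(stim, unstim))
    print("median difference, 95% CI:", bootstrap_median_diff(stim, unstim))
```

A negative median difference with an interval excluding zero corresponds to early-shifting, and a positive one to late-shifting. The p-value of the KS test is omitted here because it requires the KS distribution, which the standard library does not provide.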

Figure 7

Optogenetic DAN manipulation systematically and bidirectionally shifts the timing of self-timed movements.

(A) Strategy for optogenetic DAN activation or inhibition. Mice were stimulated from cue-onset until first-lick or 7 s. (B) Empirical cumulative distribution functions (cdf) of first-lick times for stimulated (blue line) versus unstimulated (grey line) trials. Arrow and shading show direction of effect. p-values calculated by Kolmogorov-Smirnov test (for other metrics, see Figure 7—figure supplements 1 and 3). (C) Median 1,000,000x bootstrapped difference in first-lick time, stimulated-minus-unstimulated trials. Box: upper/lower quartile; line: median; whiskers: 1.5x IQR; dots: single mouse. (D) Comparison of median first-lick time difference across all sessions. Error bars: 95% confidence interval (*: p<0.05, 1,000,000x bootstrapped median difference in first-lick time between sessions of different stimulation categories). See also Figure 7—figure supplements 1–4. Source data: Figure 7—source data 1.

Figure 7—source data 1. Optogenetic manipulation of SNc DANs: https://cdn.elifesciences.org/articles/62583/elife-62583-fig7-data1-v3.zip

Recent studies have shown that physiological ranges of optogenetic DAN activation (as assayed by simultaneous recordings from DANs) fail to elicit overt movements (Coddington and Dudman, 2018). We likewise found that optogenetic DAN activation did not elicit immediate licking outside the context of the task (Figure 7—figure supplement 4A). Additionally, optogenetic DAN inhibition did not reduce the rate of spontaneous licking outside the context of the task (Figure 7—figure supplement 4B). In both cases, we used the same light levels that had elicited the robust shifts in timing behavior during the self-timed movement task. In other control experiments, we purposefully drove neurons into non-physiological activity regimes during the task by applying higher activation light levels. Over-stimulation caused large, immediate, sustained increases in DLS dopamine (Figure 7—figure supplement 2), comparable in amplitude to the typical reward-related dopamine transients on interleaved, unstimulated trials. These non-physiological manipulations resulted in rapid, nonpurposive body movements and disrupted performance of the task. Together, these results suggest that the optogenetic effects on timing in Figure 7 did not result from direct, immediate triggering or suppression of movement, nor from non-physiological dopamine release due to over-stimulation.

Linking endogenous DAN signals to the moment-to-moment probability of movement initiation

Optogenetic manipulations of DAN activity in the physiological range appeared to modulate the probability of initiating the pre-potent, self-timed movement. Given that endogenous DAN signals increased during the timing interval of the self-timed movement task, we reasoned that the probability of movement should likewise increase over the course of the timed interval. We thus derived a nested probabilistic movement-state decoding model to explore the link between DAN signals and movement propensity (Figure 8A). We applied a GLM based on logistic regression, in which we classified each moment of time as either a non-movement (0) or movement (1) state (Figure 8A–B), and we examined how well various parameters could predict the probability of transitioning from the non-movement state to the movement state. Unlike the decoding model in Figure 6, which considers a single threshold-crossing time, the probabilistic approach takes into account continuous DAN signals. Initial model selection included previous trial history (movement time and reward outcome history) in addition to the DAN GCaMP6f signal, but Bayesian Information Criterion (BIC) analysis indicated that the instantaneous GCaMP6f signal alone was a robustly significant predictor of movement state, whereas previous trial outcomes did not contribute significantly and did not further improve the model (Figure 8—figure supplement 1). We thus considered only the DAN GCaMP6f signal as a predictor in subsequent analyses.
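The core idea of a logistic movement-state model can be sketched on synthetic data. This is a deliberately reduced illustration, not the nested model used in the paper: it uses a single predictor (the signal in the current time bin) rather than multiple lagged predictors, the data-generating parameters are invented, and the fit uses a plain Newton (iteratively reweighted least squares) update. All names are hypothetical.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def simulate_bins(n_trials=100, dt=0.25, max_t=7.0, b0=-5.0, b1=4.0, seed=1):
    """Synthetic (signal, state) pairs, one per time bin.

    On each trial the signal ramps at a random rate; every bin is labeled 0
    (non-movement) until the first bin labeled 1 (movement), which ends the trial.
    The generating coefficients b0 and b1 are arbitrary illustrative values.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_trials):
        rate = rng.uniform(0.1, 0.5)
        for k in range(int(max_t / dt)):
            x = rate * k * dt + rng.gauss(0.0, 0.05)
            moved = rng.random() < sigmoid(b0 + b1 * x)
            rows.append((x, 1 if moved else 0))
            if moved:
                break
    return rows

def fit_logistic(rows, n_iter=15):
    """Fit P(state = 1) = sigmoid(w0 + w1 * x) by Newton's method."""
    w0 = w1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in rows:
            p = sigmoid(w0 + w1 * x)
            g0 += y - p               # gradient of the log-likelihood
            g1 += (y - p) * x
            v = p * (1.0 - p)         # entries of the negative Hessian
            h00 += v
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        w0 += (h11 * g0 - h01 * g1) / det
        w1 += (h00 * g1 - h01 * g0) / det
    return w0, w1

if __name__ == "__main__":
    w0, w1 = fit_logistic(simulate_bins())
    print("intercept:", w0, "signal coefficient:", w1)
```

A positive signal coefficient means that higher signal levels raise the per-bin probability of transitioning to the movement state. Extending the sketch with lagged signal values or trial-history terms, and comparing fits by BIC, would be the natural next step toward the kind of model selection described above.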

Figure 8

Single-trial dynamic dopaminergic signals predict the moment-to-moment probability of movement initiation.

(A) Probabilistic movement-state model schematic. (B) Single-trial DAN GCaMP6f signals at SNc from one session. First-lick time truncated 150 ms before movement detection to exclude peri-movement signals. Bottom: Movement states for the trials shown as a function of time. Diagram on the right schematizes the model predictors relative to an example time = t on a single trial. (C) Nested model fitted coefficients. (D) Decoded hazard functions from full model (with all 10 predictors). Thick line: mean. n = 12 mice. (E) Hazard function fitting with shuffled datasets abolished the predictive power of the model (same 12 mice). See also Figure 8—figure supplements 1–2. Source data: Figure 8—source data 1.

Figure 8—source data 1. Movement state decoding model: https://cdn.elifesciences.org/articles/62583/elife-62583-fig8-data1-v3.zip

The continuous DAN GCaMP6f signal was indeed predictive of current movement state at any time t, and it served as a significant predictor of movement state up to at least 2 s in the past (Figure 8C). However, the signals became progressively more predictive of the current movement state as time approached t. That is, the dopaminergic signal level closer to time t tended to absorb the behavioral variance explained by more distant, previous signal levels (Figure 8C), reminiscent of how threshold crossing time absorbed the variance explained by the baseline dopaminergic signal in the movement-timing decoding model (Figure 6B–C). This observation is consistent with a diffusion-like ramping process on single trials, in which the most recent measurement gives the best estimate of whether there will be a transition to the movement state (but is difficult to reconcile with a discrete step process on single trials, consistent with the results in Figure 6—figure supplements 3–4).

We applied the fitted instantaneous probabilities of transitioning to the movement state to derive a fitted hazard function for each behavioral session (Figure 8D). The DAN GCaMP6f signals were remarkably predictive of the hazard function, both for individual sessions and on average, explaining 65% of the variance on average (n = 12 mice). Conversely, when the model was fit on the same data in which the timepoint identifiers were shuffled, this predictive power was essentially abolished, explaining only 5% of the variance on average (Figure 8E).
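As a rough illustration of the quantities involved, the sketch below computes a discrete-time empirical hazard from first-lick times and a variance-explained score for comparing it with a model-derived hazard. The binning scheme, function names, and the use of the coefficient of determination are assumptions made for illustration; the exact procedure used for Figure 8D–E is not reproduced here.

```python
def empirical_hazard(first_lick_times, bin_width=0.5, t_max=7.0):
    """Discrete-time hazard: for each time bin, the fraction of trials still
    in the non-movement state at the start of the bin that move within it."""
    n_bins = int(round(t_max / bin_width))
    counts = [0] * n_bins
    for t in first_lick_times:
        k = int(t // bin_width)
        if 0 <= k < n_bins:
            counts[k] += 1
    at_risk = len(first_lick_times)
    hazard = []
    for c in counts:
        hazard.append(c / at_risk if at_risk > 0 else float("nan"))
        at_risk -= c
    return hazard

def variance_explained(observed, predicted):
    """Coefficient of determination between observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

if __name__ == "__main__":
    licks = [0.2, 0.7, 0.8, 1.2]
    print(empirical_hazard(licks, bin_width=0.5, t_max=1.5))
```

In the small example, one of four trials moves in the first bin, two of the remaining three in the second, and the last remaining trial in the third, so the hazard rises across bins even though the raw counts do not.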

Together, these results demonstrate that slowly evolving dopaminergic signals are predictive of the moment-to-moment probability of movement initiation. When combined with the optogenetics results, they argue that dopaminergic signals causally modulate the moment-to-moment probability of the pre-potent movement. In this view, trial-by-trial variability in the DAN signal gives rise to trial-by-trial differences in movement timing in the self-timed movement task.

Discussion

We made two main findings. First, both baseline and slowly ramping DAN signals were predictive of the timing of the first-lick. Second, optogenetic modulation of DANs affected the timing of movements on the concurrent trial, suggesting that DANs can play a ‘real time’ role in behavior. These observations raise two (presumably separable) questions of interpretation: (1) what is the mechanistic origin of ramping DAN signals in the self-timed movement task, and (2) how do DAN signals affect self-timed movements in real time?

The origin of ramping DAN signals

A number of studies have reported short-latency (<500 ms) modulations in DAN activity following reward-predicting sensory cues and immediately preceding movements (Coddington and Dudman, 2018; da Silva et al., 2018; Dodson et al., 2016; Howe and Dombeck, 2016; Schultz et al., 1997), similar to the sensory- and motor-related transients we observed within ~500 ms of the cue and first-lick. However, the ramping DAN signals we observed during self-timing were markedly slower, unfolding over seconds and preceding the first-lick by as long as 10 s. Previous studies have reported similarly slow ramping dopaminergic signals in other behavioral contexts, including goal-directed navigation toward rewarded targets (Howe et al., 2013), multi-step tasks leading to reward (Hamid et al., 2016; Howard et al., 2017; Mohebi et al., 2019), and passive observation of dynamic visual cues indicating proximity to reward (Kim et al., 2019). A common feature in these experiments and our self-timed movement task is that trials culminated in the animal’s receiving reward. Thus, parsimony suggests that dopaminergic ramping could reflect reward expectation. However, dopaminergic ramping is generally absent in Pavlovian paradigms, in which animals learn to expect passive reward delivery at a fixed time following a conditioned stimulus (Menegas et al., 2015; Tian et al., 2016; Schultz et al., 1997; Starkweather et al., 2017). One exception is a report of ramping activity in monkey DANs during a Pavlovian paradigm with reward uncertainty (Fiorillo et al., 2003); however, ramping was not subsequently reproduced under similar conditions, either in monkeys (Fiorillo, 2011; Matsumoto and Hikosaka, 2009; Tobler et al., 2005) or rodents (Hart et al., 2015; Tian and Uchida, 2015). Thus, while dopaminergic ramping is likely related to reward expectation, the preponderance of evidence suggests that reward expectation alone is insufficient to cause DAN ramping.

To reconcile these disparate findings, Gershman and colleagues proposed a formal model in which dopaminergic ramping encodes reward expectation in the form of an ‘ongoing’ reward-prediction error (RPE) that arises from resolving uncertainty of one’s position in the value landscape (i.e., one’s spatial-temporal distance to reward delivery/omission). For example, uncertainty is resolved if animals are provided visuospatial cues indicating proximity to reward (Howe et al., 2013; Kim et al., 2019). In contrast, because animals can only imprecisely estimate the passage of time, the animal is uncertain of when reward will be delivered/omitted in Pavlovian tasks. The RPE model holds that this temporal uncertainty flattens the Pavlovian value landscape, thereby flattening dopaminergic ramping to the degree that it is obscured (Gershman, 2014; Kim et al., 2019; Mikhael and Gershman, 2019; Mikhael et al., 2019). Although both our task and Pavlovian tasks involve timing, the key difference may be that the animal actively determines when reward will be delivered/omitted in the self-timed movement task—just after it moves. Certainty in the timing of reward relative to its own movement would resolve the animal’s uncertainty of its position in the value landscape, and may thus explain why dopaminergic ramping occurs prominently in the self-timed movement task, but not in Pavlovian tasks (Hamilos and Assad, 2020). Although the RPE model provides a plausible explanation for our findings, dopaminergic ramping signals are also consistent with broader views of ‘reward expectation’, such as value tracking as animals approach reward (Hamid et al., 2016; Mohebi et al., 2019). In a companion theoretical paper (Hamilos and Assad, 2020), we explore the reward expectation-based computational framework in more detail, including a reconciliation of apparently contradictory DAN signals reported in the context of a perceptual timing task (Soares et al., 2016).

How do DAN signals affect movement in real time?

We found that trial-by-trial variability in ramping dynamics explained the precise timing of self-timed licks. However, because the animals could expect reward shortly after the first-lick, the ramping dopaminergic signal might serve as a passive monitor of reward expectation rather than causally influencing the timing of movement initiation. To distinguish these possibilities, we optogenetically manipulated SNc DAN activity. We found that exciting or inhibiting DANs altered the timing of the first-lick on the concurrent trial, in a manner suggesting an increase/decrease in the probability of movement, respectively. This suggests that endogenous DAN signaling could play a causal role in the initiation of reward-related movements in real time—but by what mechanism?

One possibility is that endogenous or exogenous DAN signals could increase the animal’s motivation or heighten its expectation of reward, which then secondarily influences reward-related movement. There is some evidence that might support this view. Phillips et al. found that electrical stimulation of the VTA in rats elicited approach behavior for self-delivery of intravenous cocaine; however, the electrical stimulation could have activated non-DAN fibers/pathways via the VTA (Phillips et al., 2003). Hamid et al. found that selective optogenetic stimulation of DANs could shorten the latency for rats to engage in a port-choice task—but only if the rat was disengaged from the task; if the rat was already engaged in task performance, the latency became slightly longer (Hamid et al., 2016).

In contrast to these equivocal findings, a large body of evidence suggests that selective optogenetic stimulation or inhibition of DANs generally does not affect reward-related movements on the same trial. First, we ourselves could not evoke licking (nor inhibit spontaneous licking) outside the context of our self-timed movement task (Figure 7—figure supplement 4). Our mice were thirsty and perched near their usual juice tube, but offline DAN stimulation/inhibition did not alter licking behavior, even though we applied the same optical power that altered movement probability during the self-timed movement task. Numerous studies have also examined the effects of optogenetic modulation of DANs in Pavlovian conditioning paradigms, with the general finding that DAN modulation affects conditioned movements on subsequent trials or sessions—a learning effect—but not on the same trial (Coddington and Dudman, 2018; Coddington and Dudman, 2021; Lee et al., 2020; Maes et al., 2020; Morrens et al., 2020; Pan et al., 2021; Saunders et al., 2018). For example, Lee et al. found that optogenetic inhibition of mouse DANs at the same time as an olfactory conditioned stimulus had no effect on anticipatory licking on the concurrent trial, even though inhibition at the time of reward delivery reduced the probability and rate of anticipatory licking on subsequent trials (Lee et al., 2020). Thus, the preponderance of evidence argues against a simple scheme whereby modulating DAN activity leads to a change in motivation that automatically evokes or suppresses reward-related movements in real time. The fact that we observed robust, concurrent optogenetic modulation of movement timing in our experiment suggests that additional factors were at play for self-timed movements.

One possibility is that during self-timing, exogenous (optogenetic) stimulation of DANs summed with the endogenous ramping DAN signal, leading to supra-heightened motivation to obtain reward. However, when we deliberately over-stimulated DANs—eliciting even higher dopamine signals in the DLS (Figure 7—figure supplement 2)—we observed ‘dyskinetic’ body movements rather than purposive licking. An alternative possibility is that the explicit timing requirement of the self-timed movement task made it particularly responsive to dopaminergic modulation. A long history of pharmacological and lesion experiments suggests that the dopaminergic system modulates timing behavior (Meck, 2006; Merchant et al., 2013). Broadly speaking, conditions that increase/decrease dopamine availability or efficacy speed/slow the ‘internal clock’, respectively (Dews and Morse, 1958; Mikhael and Gershman, 2019; Schuster and Zimmerman, 1961; Malapani et al., 1998; Meck, 1986; Meck, 2006; Merchant et al., 2013). The dopaminergic ramping signals we observed also bear resemblance to Pacemaker-Accumulator models of neural timing, in which a hypothetical accumulator signals that an interval has elapsed when it reaches a threshold level (Gallistel and Gibbon, 2000; Lustig and Meck, 2005; Meck, 2006). To ‘self-time’ a movement also implies that the movement is prepared and pre-potent during the timing period, potentially making the relevant neural motor circuits more sensitive to dopaminergic modulation.

Regardless of the detailed mechanism, our results provide a link between dopaminergic signaling and the initiation of self-timed movements. Although endogenous dopaminergic ramping likely reflects reward expectation, we propose that these reward-related ramping signals can influence the timing of movement initiation, at least in certain behavioral contexts. This framework also provides a link between two seemingly disparate roles that have been proposed for the dopaminergic system—reward/reinforcement-learning on one hand, and movement modulation on the other.

Importantly, we are not suggesting that DANs directly drive movement (like corticospinal or corticobulbar neurons). To the contrary, outside of the context of the self-timed movement task, we could not evoke reward-related movements by activating DANs. Even during the self-timed movement task, DAN stimulation did not elicit immediate movements: first-lick times still spanned a broad distribution from trial-to-trial. Moreover, dopaminergic ramping does not invariably lead to movement. For example, Kim et al. found dopaminergic ramping in the presence of visual cues that signaled proximity to reward, independent of reward-related movements (Kim et al., 2019). Consequently, we propose that when a movement is pre-potent (as in our self-timed movement task), dopaminergic signaling can modulate the probability of initiating that movement. Consistent with this view, we found that the endogenous ramping dynamics were highly predictive of the moment-by-moment probability of movement (as captured by the hazard function), with DAN signals becoming progressively better predictors as the time of movement onset approached.

This view of dopaminergic modulation of movement probability could be related to classic findings from extrapyramidal movement disorders, in which dysfunction of the nigrostriatal pathway produces aberrations in movement initiation rather than paralysis or paresis (Bloxham et al., 1984; Fahn, 2011; Hallett and Khoshbin, 1980). That is, movements do occur in extrapyramidal disorders, but at inappropriate times, either too little/late (e.g., Parkinson’s), or too often (e.g., dyskinesias). Based on the deficits observed in Parkinsonian states (e.g., perseveration), this role may extend to behavioral transitions more generally, for example, starting new movements or stopping ongoing movements (Guru et al., 2020).

Is DAN ramping also present before ‘spontaneous’ movements?

We have suggested that the ramping DAN signals in the self-timed movement task could be related to reward expectation coupled with the explicit timing requirement of the task. However, when we averaged DAN signals aligned to ‘spontaneous’ licks during the ITI, we also observed noisy, slow ramping signals building over seconds up to the time of the next lick, with a time course related to the duration of the inter-lick interval (Figure 8—figure supplement 2). This observation raises the possibility that slowly evolving DAN signals may be related to the generation of self-initiated movements more generally—although our highly trained animals may have also been ‘rehearsing’ timed movements between trials and/or expecting reward even for spontaneous licks.

Relationship to setpoint and stretching dynamics in other neural circuits

We found that DAN signals predict movement timing via two low-dimensional signals: a baseline offset and a ramping dynamic that ‘stretches’ depending on trial-by-trial movement timing. Intriguingly, similar stretching of neural responses has been observed before self-timed movement in other brain areas in rats and primates, including the dorsal striatum (Emmons et al., 2017; Mello et al., 2015; Wang et al., 2018), lateral interparietal cortex (Maimon and Assad, 2006), presupplementary and supplementary motor areas (Mita et al., 2009), and dorsomedial frontal cortex (DMFC; Remington et al., 2018; Sohn et al., 2019; Wang et al., 2018; Xu et al., 2014). In the case of DMFC, applying dimensionality reduction to the population responses revealed two lower-dimensional characteristics that resembled our findings in DANs: (1) the speed at which the population dynamics unfolded was scaled (‘stretched’) to the length of the produced timing interval (Wang et al., 2018), and (2) the population state at the beginning of the self-timed movement interval (‘setpoint’) was correlated with the timed interval (Remington et al., 2018; Sohn et al., 2019). Recurrent neural network models suggested that variation in stretching and setpoint states could be controlled by (unknown) tonic or monotonically ramping inputs to the cortico-striatal system (Remington et al., 2018; Sohn et al., 2019; Wang et al., 2018). We found that DANs exhibit both baseline (i.e., ‘setpoint’) signals related to timing and monotonically ramping signals during the timing interval. Thus, through their role as diffusely projecting modulators, DANs could potentially orchestrate variations in cortico-striatal dynamics observed during timing behavior. Ramping DAN signals could also be related to the slow ramping signals that have been observed in the human motor system in anticipation of self-initiated movements, for example, readiness potentials in EEG recordings (Deecke, 1996; Libet et al., 1983).

Possible relationship to motivational/movement vigor

In operant tasks in which difficulty is systematically varied over blocks of trials, increased intertrial dopamine in the nucleus accumbens has been associated with higher average reward rate and decreased latency to engage in a new trial, suggesting a link between dopamine and ‘motivational vigor’, the propensity to invest effort in work (Hamid et al., 2016; Mohebi et al., 2019). Intriguingly, we observed the opposite relationship in the self-timed movement task: periods with higher average reward rates had lower average baseline dopaminergic signals and later first-lick times. Moreover, for a given first-lick time (e.g., 3.5–3.75 s), we did not detect differences in baseline (or ramping) signals during periods with different average reward rates, such as near the beginning or end of a session. This difference between the two tasks may be due to their opposing strategic constraints: in the aforementioned experiments, faster trial initiation increased the number of opportunities to obtain reward, whereas earlier first-licks tended to decrease reward acquisition in our self-timed movement task.

The basal ganglia have also been implicated in controlling ‘movement vigor,’ generally referring to the speed, force or frequency of movements (Bartholomew et al., 2016; Dudman and Krakauer, 2016; Panigrahi et al., 2015; Turner and Desmurget, 2010; Yttri and Dudman, 2016). The activity of nigrostriatal DANs has been shown to correlate with these parameters during movement bouts and could promote more vigorous movement via push-pull interactions with the direct and indirect pathways (Barter et al., 2015; da Silva et al., 2018; Mazzoni et al., 2007; Panigrahi et al., 2015). Movement vigor might also entail earlier self-timed movements, mediated by moment-to-moment increases in dopaminergic activity.

If moving earlier is a signature of greater movement vigor, then earlier self-timed movements might also be executed with greater force/speed. We looked for movement-related vigor signals, examining both the amplitude of lick-related EMG signals and the latency between lick initiation and lick-tube contact. We detected no consistent differences in these force- or speed-related parameters as a function of movement time; on the contrary, the EMG signals were highly stereotyped irrespective of the first-lick time (data not shown). It is possible that vigor might affect movement timing without affecting movement kinematics/dynamics; if so, however, the distinction between ‘timing’ and ‘vigor’ would seem largely semantic.

Overall view

We have posited that dopaminergic ramping reflects reward expectation, a common element of behavioral paradigms that reveal slow dopaminergic ramping. Furthermore, our optogenetic manipulations indicate that dopaminergic signals do not directly trigger movements, but rather act as if modulating the probability of the pre-potent self-timed movement. Taken together, these observations suggest that as DAN activity ramps up, the probability of movement likewise increases. In this view, different rates of increase in DAN activity lead to shorter or longer elapsed intervals before movement, on average. This framework leaves open the question of what makes movement timing ‘probabilistic.’ One possibility is that recurrent cortical-basal ganglia–thalamic circuits could act to generate movements ‘on their own,’ without direct external triggers (e.g., a ‘go!’ cue). By providing crucial modulation of these circuits, DANs could tune the propensity to make self-timed movements—and pathological loss of DANs could reduce the production of such movements. Future experiments should address how dynamic dopaminergic input influences downstream motor circuits involved in self-timed movements.

Materials and methods

Key resources table

Reagent type (species) or resource	Designation	Source or reference	Identifiers	Additional information
Strain, strain background (M. musculus)	DAT-Cre	The Jackson Laboratory, Bar Harbor, ME	B6.SJL-Slc6a3tm1.1(cre)Bkmm/J; RRID:IMSR_JAX:020080	Cre expression in dopaminergic neurons
Strain, strain background (M. musculus)	Wild-type	The Jackson Laboratory, Bar Harbor, ME	C57BL/6; RRID:IMSR_JAX:000664
Other	tdTomato (“tdt”)	UNC Vector Core, Chapel Hill, NC	AAV1-CAG-FLEX-tdT	Virus, for control photometry expression
Other	GCaMP6f	Penn Vector Core, Philadelphia, PA	AAV1.Syn.Flex.GCaMP6f.WPRE.SV40	Virus, for photometry expression
Other	DA2m	Vigene, Rockville, MD	AAV9-hSyn-DA4.4(DA2m)	Virus, for photometry expression
Other	dLight1.1	Lin Tian Lab; Children’s Hospital Boston Viral Core, Boston, MA	AAV9.hSyn.dLight1.1.wPRE	Virus, for photometry expression
Other	turboRFP	Penn Vector Core	AAV1.CB7.CI.TurboRFP.WPRE.rBG	Virus, for control photometry expression
Other	ChR2	UNC Vector Core, Chapel Hill, NC	AAV5-EF1a-DIO-hChR2(H134R)-EYFP-WPRE-pA	Virus, for opsin expression
Other	ChrimsonR	UNC Vector Core, Chapel Hill, NC	AAV1-hSyn-FLEX-ChrimsonR-tdT	Virus, for opsin expression
Other	stGtACR2	Addgene/Janelia Viral Core, Ashburn, VA	AAV2/8-hSyn1-SIO-stGtACR2-FusionRed	Virus, for opsin expression
Software, algorithm	Matlab	Mathworks	Matlab2018B	For most analyses
Software, algorithm	Julia Programming Language	The Julia Project	Julia 1.5.3	For probabilistic models
Software, algorithm	Gen.jl	The Gen Team	Gen.jl	For probabilistic models

Animals

Adult male and female hemizygous DAT-Cre mice (Bäckman et al., 2006; B6.SJL-Slc6a3^tm1.1(cre)Bkmm/J, RRID:IMSR_JAX:020080; The Jackson Laboratory, Bar Harbor, ME) or wild-type C57BL/6 mice were used in all experiments (>2 months old at the time of surgery; median body weight 23.8 g, range 17.3–31.9 g). Mice were housed in standard cages in a temperature- and humidity-controlled colony facility on a reversed night/day cycle (12 hr dark/12 hr light), and behavioral sessions occurred during the dark cycle. Animals were housed with enrichment objects provided by the Harvard Center for Comparative Medicine (IACUC-approved plastic toys/shelters, e.g., Bio-Huts, Mouse Tunnels, Nest Sheets) and were housed socially whenever possible (1–5 mice per cage). All experiments and protocols were approved by the Harvard Institutional Animal Care and Use Committee (IACUC protocol #05098, Animal Welfare Assurance Number #A3431-01) and were conducted in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals.

Figures and source data

Figure 1. Self-timed movement task. (Source data 1)
Figure 2. SNc DAN signals preceding self-timed movement. (Source data 1–4)
Figure 3. Striatal dopamine release during the self-timed movement task. (Source data 1)
Figure 4. Movement controls reliably detected movements, but there were no systematic differences in movement during the timing interval. (Source data 1)
Figure 5. Contribution of optical artifacts, task variables and nuisance bodily movements to SNc GCaMP6f signals. (Source data 1)
Figure 6. Single-trial DAN signals predict first-lick timing. (Source data 1)
Figure 7. Optogenetic DAN manipulation systematically and bidirectionally shifts the timing of self-timed movements. (Source data 1)
Figure 8. Single-trial dynamic dopaminergic signals predict the moment-to-moment probability of movement initiation. (Source data 1)

Author details

Allison E Hamilos (for correspondence)
Giulia Spedicato
Ye Hong
Fangmiao Sun
Yulong Li
John A Assad (for correspondence)
