Mesolimbic dopamine ramps reflect environmental timescales

  1. Neuroscience Graduate Program, University of California, San Francisco, CA, USA
  2. Department of Neurology, University of California, San Francisco, CA, USA
  3. Weill Institute for Neurosciences, Kavli Institute for Fundamental Neuroscience, Center for Integrative Neuroscience, University of California, San Francisco, CA, USA

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a response from the authors (if available).


Editors

  • Reviewing Editor
    Mihaela Iordanova
    Concordia University, Montreal, Canada
  • Senior Editor
    Michael Frank
    Brown University, Providence, United States of America

Reviewer #1 (Public Review):

Summary:

In this study, Floeder et al. report that dopamine ramps in both Pavlovian and instrumental conditions are shaped by reward interval statistics. Dopamine ramps are an interesting phenomenon because, at first glance, they do not represent the classical reward prediction errors associated with dopamine signaling. Instead, they seem to bridge the gap between tonic and phasic dopamine, and their actual behavioral role is still intensely debated in the field. Here, in head-fixed mice, with dopamine recorded in the nucleus accumbens using a genetically encoded fluorescent sensor, the authors find that dopamine ramps were present only when intertrial intervals were relatively short and the structure of the task (Pavlovian cue or progression in a VR corridor) contained elements that indicated progression towards the reward (e.g., a dynamic cue). The authors show that these findings are well explained by their previously published model, Adjusted Net Contingency for Causal Relations (ANCCR).

Strengths:

This descriptive study delineates some fundamental parameters that define dopamine ramps in the studied conditions. The short, objective, and to-the-point format of the manuscript is great and really does a service to potential readers. The authors are very careful with the scope of their conclusions, which is appreciated by this reviewer.

Weaknesses:

The discussion of the results is largely confined to the conceptual framework of the authors' preferred model (the authors do recognize this, but it is still a limitation). The correlation analysis presented in panel I of Figure 3 seems unnecessary at best and could be misleading, as it is driven by the categorical difference between the two conditions that were pooled for this analysis. Some key aspects of the data, their relationship with each other and with the previous literature, and the methods used to collect them could have been better discussed and explored.
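The concern about the pooled correlation can be illustrated with a minimal sketch on synthetic data. Everything here is an assumption made for illustration: the trial counts, ITI means, slope values, and noise levels are invented and are not taken from the study. Within each simulated condition, ITI and ramp slope are drawn independently, yet pooling the two conditions produces a strong correlation that reflects only the difference between group means.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_condition(rng, n, iti_mean, slope_mean):
    """Synthetic trials: ITI and ramp slope are independent within a condition."""
    itis = [rng.gauss(iti_mean, 1.0) for _ in range(n)]
    slopes = [rng.gauss(slope_mean, 0.1) for _ in range(n)]
    return itis, slopes


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical condition means (invented for illustration only).
short_iti, short_slope = simulate_condition(rng, 200, iti_mean=8.0, slope_mean=0.5)
long_iti, long_slope = simulate_condition(rng, 200, iti_mean=55.0, slope_mean=0.0)

r_pooled = pearson(short_iti + long_iti, short_slope + long_slope)
r_short = pearson(short_iti, short_slope)
r_long = pearson(long_iti, long_slope)

print(f"pooled r = {r_pooled:.2f}")
print(f"short-ITI condition r = {r_short:.2f}")
print(f"long-ITI condition r = {r_long:.2f}")
```

Reporting the within-condition correlations alongside the pooled one would show whether a trial-by-trial relationship exists beyond the categorical difference.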

Reviewer #2 (Public Review):

In this manuscript by Floeder et al., the authors report a correlation between ITI duration and the strength of a dopamine ramp occurring in the time between a predictive conditioned stimulus and a subsequent reward. They found this relationship in two different tasks with mice: a Pavlovian task and an instrumental virtual visual navigation task. Additionally, they observed this relationship only in conditions using a dynamic predictive stimulus. The authors relate this finding to their previously published model, ANCCR, in which the time constant of the eligibility trace is proportionate to the reward rate within the task.

The relationship between ITI duration and the extent of a dopamine ramp that the authors have reported is very intriguing and certainly provides an important constraint for models of dopamine function. As such, these findings are potentially highly impactful to the field. I do have a few questions for the authors, which are listed below.

(1) I was surprised to see a lack of counterbalance within the Pavlovian design for the order of the long vs short ITI. Ramping of the lick rate does increase from the long-duration ITIs to the short-duration ITI sessions. Although of course, this increase in ramping of the licking across the two conditions is not necessarily a function of learning, it doesn't lend support to the opposite possibility that the timing of the dynamic CS hasn't reached asymptotic learning by the end of the long-duration ITI. The authors do reference papers in which overtraining tends to result in a reduction of ramping, which would argue against this possibility, yet differential learning of the dynamic CS would presumably be required to observe this effect. Do the authors have any evidence that the effect is not due to heightened learning of the timing of the dynamic CS across the experiment?

(2) The dopamine response, as measured by dLight, seems to drop after the reward is delivered. This reduction in responding also tends to be observed with electrophysiological recordings of dopamine neurons. It seems possible that during the short ITI sessions, particularly on the trials with the shortest ITIs, dopamine levels may still be reduced from the previous trial at the onset of the CS on the subsequent trial. Perhaps the authors can examine the recovery dynamics of the dopamine response following reward delivery on longer-duration ITIs in order to determine how quickly dopamine recovers. Are the trials with very short ITIs occurring within the period in which dopamine is recovering from the previous trial? If so, how much of the observed effect might this recovery account for? It should be noted that the absence of a ramp in the short-duration ITI condition with fixed CSs provides a potential control for this effect, yet the extent to which a natural ramp might occur following sucrose deliveries should be investigated.

(3) The authors primarily relate the finding of the correlation between the ITI and the slope of the ramp to their ANCCR model by suggesting that shorter time constants of the eligibility trace will result in more precisely timed predictors of reward across discrete periods of the dynamic cue. Based on this prediction, would the change in slope be more gradual, and perhaps be more correlated with a broader cumulative estimate of reward rate than just a single trial?
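The contrast raised in point (3) between a single-trial ITI and a broader cumulative estimate can be sketched as follows. This is only an illustrative construction: the exponentially weighted running average, the weighting parameter `alpha`, and the ITI values are assumptions made here, not the authors' analysis or part of the ANCCR model.

```python
def ewma(values, alpha):
    """Exponentially weighted running average of a sequence.

    alpha in (0, 1] is the weight given to the newest value; alpha = 1
    reproduces the input, while smaller alpha averages over more history.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not values:
        return []
    estimate = values[0]  # start the running estimate at the first value
    smoothed = []
    for value in values:
        estimate = alpha * value + (1.0 - alpha) * estimate
        smoothed.append(estimate)
    return smoothed


# Hypothetical ITI durations in seconds, one per trial (not from the study).
itis = [6.0, 9.0, 7.0, 12.0, 8.0, 55.0, 7.0, 10.0]

# Single-trial predictor: the ITI immediately preceding each trial.
single_trial = itis

# Cumulative predictor: running average over the ITI history.
cumulative = ewma(itis, alpha=0.2)

for single, running in zip(single_trial, cumulative):
    print(f"previous ITI = {single:5.1f} s   running estimate = {running:5.1f} s")
```

Each predictor could then be correlated with the per-trial ramp slope to ask which one accounts for the data better.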

Reviewer #3 (Public Review):

Summary:

Floeder and colleagues measure dopamine signaling in the nucleus accumbens core using fiber photometry of the dLight sensor, in Pavlovian and instrumental tasks in mice. They test some predictions from a recently proposed model (ANCCR) regarding the existence of "ramps" in dopamine that have been seen in some previous research, the characteristics of which remain poorly understood.

They find that cues signaling a progression toward rewards (akin to a countdown) specifically promote ramping dopamine signaling in the nucleus accumbens core, but only when the intertrial interval just experienced was short. This work is discussed in the context of ongoing theoretical conceptions of dopamine's role in learning.

Strengths:

This work is the clearest demonstration to date of concrete training factors that seem to directly impact whether or not dopamine ramps occur. The existence of ramping signals has long been a feature of debates in the dopamine literature and this work adds important context to that. Further, as a practical assessment of the impact of a relatively simple trial structure manipulation on dopamine patterns, this work will be important for guiding future studies. These studies are well done and thoughtfully presented.

Weaknesses:

It remains somewhat unclear what limits are in place on the extent to which an eligibility trace is reflected in dopamine signals. In the current study, a specific set of ITIs was used, and one wonders if the relative comparison of ITI/history variables ("shorter" or "longer") is a factor in how the dopamine signal emerges, in addition to the explicit length ("short" or "long") of the ITI. Another experimental condition, where variable ITIs were intermingled, could perhaps help clarify some remaining questions.

In both tasks, cue onset responses are larger and longer on long ITI trials. One concern is that this larger signal makes a ramp during the cue-reward interval harder to see, especially with a fluorescence method like photometry. Consider the traces in Figure 1i: in the long, dynamic cue condition the dopamine trace has not returned to baseline at the time of the "ramp" window onset, but the short dynamic trace has. So one wonders whether the overall return-to-baseline trend in the long dynamic condition might wash out a ramp.

Not a weakness of this study, but the current results certainly make one ponder the potential function of cue-reward interval ramps in dopamine (assuming there is a determinable function). In the current data, licking behavior was similar on different trial types, and that is described as specifically not explaining ramp activity.
