Midbrain dopamine neurons compute inferred and cached value prediction errors in a common framework
Abstract
Midbrain dopamine neurons have been proposed to signal reward prediction errors as defined in temporal difference (TD) learning algorithms. While these models have been extremely powerful in interpreting dopamine activity, they typically do not use value derived through inference when computing errors. This is important because much real-world behavior, and thus many opportunities for error-driven learning, is based on such inferred predictions. Here, we show that error-signaling rat dopamine neurons respond to the inferred, model-based value of cues that have not been paired with reward, and that they do so in the same framework in which they track the putative cached value of cues previously paired with reward. This suggests that dopamine neurons access a wider variety of information than contemplated by standard TD models and that, while their firing conforms to the predictions of TD models in some cases, they may not be restricted to signaling errors from TD predictions.
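To make the distinction between cached and inferred value concrete, the following is a minimal sketch and not the authors' model or analysis. It assumes a standard TD(0) error, delta = r + gamma * V(s') - V(s), and a hypothetical two-cue setup in which cue B is paired with reward while cue A is only ever paired with B, without reward. The discount factor, learning rate, cue names, and all function names are illustrative assumptions.

```python
GAMMA = 0.9  # discount factor (assumed for illustration)
ALPHA = 0.1  # learning rate (assumed for illustration)


def td_error(reward, v_next, v_current, gamma=GAMMA):
    """Standard TD(0) prediction error: r + gamma * V(s') - V(s)."""
    return reward + gamma * v_next - v_current


def train_cached(steps, n_trials=200, alpha=ALPHA, gamma=GAMMA):
    """Learn cached values by TD(0) from (state, reward, next_state) steps."""
    values = {}
    for _ in range(n_trials):
        for state, reward, next_state in steps:
            # A next_state of None marks the end of the trial (value 0).
            v_next = 0.0 if next_state is None else values.get(next_state, 0.0)
            v_cur = values.get(state, 0.0)
            values[state] = v_cur + alpha * td_error(reward, v_next, v_cur, gamma)
    return values


def inferred_value(state, transitions, cached, gamma=GAMMA):
    """Model-based value: look one step ahead through a learned cue-to-cue
    link and discount the cached value of the successor."""
    successor = transitions.get(state)
    if successor is None:
        return cached.get(state, 0.0)
    return gamma * cached.get(successor, 0.0)


# Hypothetical training history: B was followed by reward; A was only
# followed by B in a separate, unrewarded phase, so no value was cached for A.
cached = train_cached([("B", 1.0, None)])
transitions = {"A": "B"}  # assumed learned association A -> B

cached_A = cached.get("A", 0.0)
inferred_A = inferred_value("A", transitions, cached)

# Prediction error at cue onset, measured from a zero-value baseline.
delta_B = td_error(0.0, cached["B"], 0.0)
delta_A_cached = td_error(0.0, cached_A, 0.0)
delta_A_inferred = td_error(0.0, inferred_A, 0.0)

print(f"cue B (cached value):   delta = {delta_B:.2f}")
print(f"cue A (cached only):    delta = {delta_A_cached:.2f}")
print(f"cue A (inferred value): delta = {delta_A_inferred:.2f}")
```

Under these assumptions, a purely cached account predicts no error at the onset of cue A, whereas substituting the inferred value into the same error equation predicts a positive error. That is the sense in which both kinds of value can feed a common error computation.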
Article and author information
Ethics
Animal experimentation: Experiments were performed at the National Institute on Drug Abuse Intramural Research Program in accordance with NIH guidelines and an approved institutional animal care and use committee protocol (15-CNRB-108). The protocol was approved by the ACUC at NIDA-IRP (Assurance Number: A4149-01).
Copyright
This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.