Dopamine neurons learn relative chosen value from probabilistic rewards

  1. Armin Lak (corresponding author)
  2. William R Stauffer
  3. Wolfram Schultz
  1. University College London, United Kingdom
  2. University of Cambridge, United Kingdom

Abstract

Economic theories posit reward probability as one of the factors defining reward value. Individuals learn the value of cues that predict probabilistic rewards from experienced reward frequencies. Building on the notion that responses of dopamine neurons increase with reward probability and expected value, we asked how dopamine neurons in monkeys acquire this value signal that may represent an economic decision variable. We found in a Pavlovian learning task that reward probability-dependent value signals arose from experienced reward frequencies. We then assessed neuronal response acquisition during choices among probabilistic rewards. Here, dopamine responses became sensitive to the value of both chosen and unchosen options. Both experiments also revealed novelty responses of dopamine neurons that decreased as learning advanced. These results show that dopamine neurons acquire predictive value signals from the frequency of experienced rewards. This flexible and fast signal reflects a specific decision variable and could update neuronal decision mechanisms.
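
As an illustration of how a value estimate can arise from experienced reward frequencies, the following minimal sketch uses a simple delta-rule (prediction-error) learner. The learning rule, the function name `learn_cue_value`, and the parameters `alpha`, `magnitude` and `n_trials` are assumptions made for this example; they are not taken from the text above and do not represent the analysis used in the study.

```python
import random


def learn_cue_value(reward_prob, magnitude=1.0, alpha=0.1, n_trials=500, seed=0):
    """Track a cue's value over trials with probabilistic reward.

    Hypothetical illustration: a delta rule with a fixed learning rate
    `alpha` (an assumed parameter, not a value from the article).
    """
    rng = random.Random(seed)
    value = 0.0  # a novel cue starts with no learned value
    history = []
    for _ in range(n_trials):
        # Reward is delivered with probability `reward_prob`.
        reward = magnitude if rng.random() < reward_prob else 0.0
        # Prediction error: received reward minus current prediction.
        prediction_error = reward - value
        # Move the estimate a fraction `alpha` towards the outcome.
        value += alpha * prediction_error
        history.append(value)
    return history


if __name__ == "__main__":
    for p in (0.25, 0.5, 0.75):
        values = learn_cue_value(p)
        late_mean = sum(values[-200:]) / 200
        print(f"p={p:.2f}  mean value estimate over last 200 trials: {late_mean:.2f}")
```

Under these assumptions the estimate drifts towards reward probability times reward magnitude, so cues paired with more frequent reward end up with higher learned values.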

Article and author information

Author details

  1. Armin Lak

    Institute of Ophthalmology, University College London, London, United Kingdom
    For correspondence
    arminlak@gmail.com
    Competing interests
    No competing interests declared.
    ORCID: 0000-0003-1926-5458
  2. William R Stauffer

    Department of Physiology, Development, and Neuroscience, University of Cambridge, Cambridge, United Kingdom
    Competing interests
    No competing interests declared.
  3. Wolfram Schultz

    Department of Physiology, Development, and Neuroscience, University of Cambridge, Cambridge, United Kingdom
    Competing interests
    Wolfram Schultz, Reviewing editor, eLife.
    ORCID: 0000-0002-8530-4518

Funding

Wellcome (WT106101)

  • Armin Lak

Wellcome

  • Wolfram Schultz

European Research Council

  • Wolfram Schultz

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. Michael J Frank, Brown University, United States

Ethics

Animal experimentation: All experimental protocols and procedures were approved by the Home Office of the United Kingdom (project licence number: 80/2416).

Version history

  1. Received: May 21, 2016
  2. Accepted: October 25, 2016
  3. Accepted Manuscript published: October 27, 2016 (version 1)
  4. Version of Record published: November 15, 2016 (version 2)

Copyright

© 2016, Lak et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 3,840 views
  • 845 downloads
  • 70 citations

Views, downloads and citations are aggregated across all versions of this paper published by eLife.

Cite this article

Armin Lak, William R Stauffer, Wolfram Schultz (2016)
Dopamine neurons learn relative chosen value from probabilistic rewards
eLife 5:e18044.
https://doi.org/10.7554/eLife.18044
