Dopamine neurons learn relative chosen value from probabilistic rewards

  1. Armin Lak (corresponding author)
  2. William R Stauffer
  3. Wolfram Schultz
  1. University College London, United Kingdom
  2. University of Cambridge, United Kingdom

Abstract

Economic theories posit reward probability as one of the factors defining reward value. Individuals learn the value of cues that predict probabilistic rewards from experienced reward frequencies. Building on the notion that responses of dopamine neurons increase with reward probability and expected value, we asked how dopamine neurons in monkeys acquire this value signal that may represent an economic decision variable. We found in a Pavlovian learning task that reward probability-dependent value signals arose from experienced reward frequencies. We then assessed neuronal response acquisition during choices among probabilistic rewards. Here, dopamine responses became sensitive to the value of both chosen and unchosen options. Both experiments also showed novelty responses of dopamine neurons that decreased as learning advanced. These results show that dopamine neurons acquire predictive value signals from the frequency of experienced rewards. This flexible and fast signal reflects a specific decision variable and could update neuronal decision mechanisms.
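
The idea that a cue's value can be learned from experienced reward frequencies can be illustrated with a simple delta-rule (Rescorla-Wagner style) update. The sketch below is only an illustration under stated assumptions, not the model or analysis used in the article: the function name `learn_cue_value`, the learning rate, the trial count, and the binary reward coding are all hypothetical choices made for this example.

```python
import random


def learn_cue_value(reward_probability, n_trials=500, learning_rate=0.1, seed=0):
    """Estimate a cue's value from experienced rewards with a delta rule.

    Hypothetical illustration: parameter names and defaults are assumptions,
    not values taken from the article.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    value = 0.0  # a novel cue starts with no learned value
    for _ in range(n_trials):
        # Reward is delivered with the cue's probability (1.0) or omitted (0.0).
        reward = 1.0 if rng.random() < reward_probability else 0.0
        # Prediction error: experienced reward minus current value estimate.
        prediction_error = reward - value
        # Move the estimate a fraction of the way toward the outcome.
        value += learning_rate * prediction_error
    return value


if __name__ == "__main__":
    for p in (0.25, 0.5, 0.75):
        print(p, round(learn_cue_value(p), 2))
```

Under these assumptions the learned value is a recency-weighted average of past outcomes, so it tracks the experienced reward frequency and is larger for cues that are rewarded more often.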

Article and author information

Author details

  1. Armin Lak

    Institute of Ophthalmology, University College London, London, United Kingdom
    For correspondence
    arminlak@gmail.com
    Competing interests
    No competing interests declared.
    ORCID: 0000-0003-1926-5458
  2. William R Stauffer

    Department of Physiology, Development, and Neuroscience, University of Cambridge, Cambridge, United Kingdom
    Competing interests
    No competing interests declared.
  3. Wolfram Schultz

    Department of Physiology, Development, and Neuroscience, University of Cambridge, Cambridge, United Kingdom
    Competing interests
    Wolfram Schultz, Reviewing editor, eLife.
    ORCID: 0000-0002-8530-4518

Funding

Wellcome (WT106101)

  • Armin Lak

Wellcome

  • Wolfram Schultz

European Research Council

  • Wolfram Schultz

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Ethics

Animal experimentation: All experimental protocols and procedures were approved by the Home Office of the United Kingdom (project licence number: 80/2416).

Reviewing Editor

  1. Michael J Frank, Brown University, United States

Publication history

  1. Received: May 21, 2016
  2. Accepted: October 25, 2016
  3. Accepted Manuscript published: October 27, 2016 (version 1)
  4. Version of Record published: November 15, 2016 (version 2)

Copyright

© 2016, Lak et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • Page views: 3,533
  • Downloads: 818
  • Citations: 52

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Cite this article

Armin Lak, William R Stauffer, Wolfram Schultz (2016) Dopamine neurons learn relative chosen value from probabilistic rewards. eLife 5:e18044. https://doi.org/10.7554/eLife.18044
