Dopamine neurons learn relative chosen value from probabilistic rewards

  1. Armin Lak (corresponding author)
  2. William R Stauffer
  3. Wolfram Schultz
  1. University College London, United Kingdom
  2. University of Cambridge, United Kingdom

Abstract

Economic theories posit reward probability as one of the factors defining reward value. Individuals learn the value of cues that predict probabilistic rewards from experienced reward frequencies. Building on the notion that responses of dopamine neurons increase with reward probability and expected value, we asked how dopamine neurons in monkeys acquire this value signal that may represent an economic decision variable. We found in a Pavlovian learning task that reward probability-dependent value signals arose from experienced reward frequencies. We then assessed neuronal response acquisition during choices among probabilistic rewards. Here, dopamine responses became sensitive to the value of both chosen and unchosen options. Both experiments also showed novelty responses of dopamine neurons that decreased as learning advanced. These results show that dopamine neurons acquire predictive value signals from the frequency of experienced rewards. This flexible and fast signal reflects a specific decision variable and could update neuronal decision mechanisms.
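As a rough illustration of how a cue's value could be estimated from experienced reward frequencies, the sketch below tracks the fraction of rewarded trials for a cue and scales it by reward magnitude. It is a minimal sketch under stated assumptions, not the analysis or model used in the study: the function names `running_value` and `simulate_cue`, the reward probabilities, the trial count and the unit reward magnitude are all hypothetical choices made for illustration.

```python
import random


def running_value(outcomes, magnitude=1.0):
    """Return the value estimate after each trial.

    Assumption (illustrative only): value = experienced reward
    frequency so far, multiplied by a fixed reward magnitude.
    """
    estimates = []
    rewarded = 0
    for trial, outcome in enumerate(outcomes, start=1):
        rewarded += 1 if outcome else 0
        estimates.append(magnitude * rewarded / trial)
    return estimates


def simulate_cue(p_reward, n_trials, seed=0):
    """Simulate rewarded/unrewarded outcomes for one cue (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.random() < p_reward for _ in range(n_trials)]


if __name__ == "__main__":
    # Illustrative probabilities; not taken from the text above.
    for p in (0.25, 0.50, 0.75):
        estimates = running_value(simulate_cue(p, 2000))
        print(f"p={p:.2f}  estimate after 2000 trials: {estimates[-1]:.3f}")
```

With enough trials, the frequency-based estimate for each simulated cue approaches its underlying reward probability, so cues with higher reward probability end up with higher estimated value.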

Article and author information

Author details

  1. Armin Lak

    Institute of Ophthalmology, University College London, London, United Kingdom
    For correspondence
    arminlak@gmail.com
    Competing interests
    No competing interests declared.
    ORCID: 0000-0003-1926-5458
  2. William R Stauffer

    Department of Physiology, Development, and Neuroscience, University of Cambridge, Cambridge, United Kingdom
    Competing interests
    No competing interests declared.
  3. Wolfram Schultz

    Department of Physiology, Development, and Neuroscience, University of Cambridge, Cambridge, United Kingdom
    Competing interests
    Wolfram Schultz, Reviewing editor, eLife.
    ORCID: 0000-0002-8530-4518

Funding

Wellcome (WT106101)

  • Armin Lak

Wellcome

  • Wolfram Schultz

European Research Council

  • Wolfram Schultz

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. Michael J Frank, Brown University, United States

Ethics

Animal experimentation: All experimental protocols and procedures were approved by the Home Office of the United Kingdom (project licence number: 80/2416).

Version history

  1. Received: May 21, 2016
  2. Accepted: October 25, 2016
  3. Accepted Manuscript published: October 27, 2016 (version 1)
  4. Version of Record published: November 15, 2016 (version 2)

Copyright

© 2016, Lak et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Armin Lak, William R Stauffer, Wolfram Schultz (2016)
Dopamine neurons learn relative chosen value from probabilistic rewards
eLife 5:e18044.
https://doi.org/10.7554/eLife.18044
