Dopamine neurons learn relative chosen value from probabilistic rewards

  1. Armin Lak (corresponding author)
  2. William R Stauffer
  3. Wolfram Schultz
  1. University of Cambridge, United Kingdom
8 figures and 2 additional files

Figures

Figure 1
Monkeys rapidly learn the value of cues that predict rewards with different probabilities.

(A) Pavlovian task. Left: example of novel visual cues (fractal images) presented to monkeys. In each trial, animals were presented with a visual cue and received a large (0.4 ml) or small (0.1 ml) …

https://doi.org/10.7554/eLife.18044.003
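The Pavlovian task caption above gives the two reward sizes (0.4 ml and 0.1 ml), so the expected juice volume of a cue follows directly from the probability of the large reward. The sketch below is only an illustration: it assumes the small reward is delivered whenever the large one is not, and the probabilities in the loop are placeholders, since the caption excerpt does not list the ones used. The function name `expected_reward_ml` is hypothetical.

```python
LARGE_ML = 0.4  # large reward volume, from the caption
SMALL_ML = 0.1  # small reward volume, from the caption


def expected_reward_ml(p_large: float) -> float:
    """Expected juice volume (ml) for a cue paying the large reward with probability p_large.

    Assumption: the small reward is delivered with probability 1 - p_large.
    """
    if not 0.0 <= p_large <= 1.0:
        raise ValueError("p_large must be in [0, 1]")
    return p_large * LARGE_ML + (1.0 - p_large) * SMALL_ML


# Placeholder probabilities, for illustration only.
for p in (0.25, 0.5, 0.75):
    print(f"p_large={p:.2f} -> expected reward {expected_reward_ml(p):.3f} ml")
```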
Figure 2 with 1 supplement
Responses of dopamine neurons acquire predictive value from the frequency of rewards.

(A) Peri-stimulus time histograms (PSTHs) of a dopamine neuron in response to novel cues predicting rewards with different probabilities. Pink (0.1–0.2 s after cue onset) and grey (0.2–0.6 s after …

https://doi.org/10.7554/eLife.18044.004
Figure 2—figure supplement 1
Compound novelty-value responses of dopamine neurons to novel cues associated with different probabilistic rewards.

(A) PSTHs of dopamine population responses to novel reward-predicting cues. Neuronal responses in the first, second, third and fourth trials are plotted separately. (B) Neuronal population responses …

https://doi.org/10.7554/eLife.18044.005
Figure 3
Responses of dopamine neurons to reward delivery develop over trials to reflect the learned value of probabilistic cues.

(A) PSTHs of example dopamine neurons in response to delivery of large and small juice rewards (top, bottom). Probabilities indicated in colour refer to the occurrence of the large reward in gambles …

https://doi.org/10.7554/eLife.18044.006
Figure 4 with 1 supplement
A reinforcement learning model with a novelty term and an adaptive learning rate accounts for dopamine responses during learning.

(A) Schematic of RL models fitted on neuronal responses. In each trial, the model updates the value of the stimulus based on the experienced reward prediction error. Six variants of RL models were …

https://doi.org/10.7554/eLife.18044.007
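The caption above describes a model that updates the value of a stimulus from the reward prediction error, with a novelty term and an adaptive learning rate. The following is a minimal sketch of that idea, not the authors' fitted model: the exponential novelty decay, the hyperbolic learning-rate decay, and every parameter name and default (`alpha0`, `alpha_decay`, `novelty0`, `novelty_tau`) are assumptions chosen for illustration. Only the reward sizes (0.4 ml and 0.1 ml) come from the captions.

```python
import math
import random


def simulate_cue_learning(p_large, n_trials=30, alpha0=0.3, alpha_decay=0.05,
                          novelty0=0.2, novelty_tau=2.0, seed=0):
    """Return per-trial (value, novelty) estimates for one novel cue.

    All functional forms and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    value = 0.0  # a novel cue starts with no learned value
    trace = []
    for t in range(n_trials):
        # Assumed novelty term: decays exponentially as the cue becomes familiar.
        novelty = novelty0 * math.exp(-t / novelty_tau)
        trace.append((value, novelty))  # modelled cue response = value + novelty

        # Probabilistic outcome: large (0.4 ml) or small (0.1 ml) reward.
        reward = 0.4 if rng.random() < p_large else 0.1

        # Reward prediction error and value update.
        delta = reward - value
        # Assumed adaptive learning rate: shrinks over trials.
        alpha = alpha0 / (1.0 + alpha_decay * t)
        value += alpha * delta
    return trace


trace = simulate_cue_learning(p_large=0.75)
for trial, (v, n) in enumerate(trace[:5], start=1):
    print(f"trial {trial}: value={v:.3f} novelty={n:.3f} sum={v + n:.3f}")
```

In this sketch the novelty term dominates the first few trials and the learned value takes over afterwards, which is the qualitative pattern a compound novelty and value response would show.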
Figure 4—figure supplement 1
A reinforcement learning model with a novelty term and an adaptive learning rate accounts for dopamine responses during learning.

(A) Novelty + value estimates of the superior model (i.e. the model with a novelty term and adaptive learning rate) overlaid on neuronal population responses measured 0.1–0.6 s after the cue onset …

https://doi.org/10.7554/eLife.18044.008
Figure 5
Monkeys rapidly learn to make meaningful choices among probabilistic reward-predicting cues.

(A) Choice task. In each trial, after successful central fixation for 0.5 s, the animal was offered a choice between two cues, the familiar cue and the novel cue. The animal indicated its choice by …

https://doi.org/10.7554/eLife.18044.009
Figure 6 with 1 supplement
Dopamine responses to cues differentiate as monkeys learn the value of novel cues in the choice task.

(A) Neuronal population responses to cues over consecutive trials of the choice task, measured during 0.1–0.2 s after the cue onset (dopamine novelty responses, see inset). Only trials in which …

https://doi.org/10.7554/eLife.18044.010
Figure 6—figure supplement 1
Neuronal responses to cues in the choice task.

The responses were averaged in the time window indicated in each panel. In each panel, only trials in which the animal chose the novel cue are shown. Responses very early after cue onset only reflect …

https://doi.org/10.7554/eLife.18044.011
Figure 7 with 1 supplement
During learning, dopamine neurons acquire choice-sensitive responses that emerge prior to response initiation.

(A) Population dopamine PSTHs to cues in the choice task. Grey horizontal bar indicates the temporal window used for statistical analysis. In all plots, all trials of learning blocks are included. …

https://doi.org/10.7554/eLife.18044.012
Figure 7—figure supplement 1
Population dopamine responses to cues over trials in which animals chose the familiar cue over the novel cues.

After nine choice trials, neuronal responses depended on the value of the unchosen cue. Responses to cues in the first and second trials are not shown because in these trials animals almost …

https://doi.org/10.7554/eLife.18044.013
Figure 8 with 2 supplements
Dopamine neurons encode relative chosen values.

(A) Left: Animals' choices were simulated using standard reinforcement learning (RL) models (see Figure 8—figure supplements 1 and 2 and Materials and methods). Dotted lines show the performance of …

https://doi.org/10.7554/eLife.18044.014
Figure 8—figure supplement 1
Schematic of the RL model used for simulating monkeys’ choice behaviour.

In each trial, the model makes a choice by comparing values associated with familiar and novel cues (for models with novelty term: value vs novelty + value associated with familiar and novel cues …

https://doi.org/10.7554/eLife.18044.015
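The schematic caption above says the model chooses by comparing the values of the familiar and novel cues, with the novelty term added to the novel cue's value in the novelty variants. One common way to turn such a comparison into a choice probability is a two-option softmax (logistic) rule. The sketch below uses that rule as an assumption; the caption excerpt does not state the choice rule, and the function name `p_choose_novel` and the inverse temperature `beta` are hypothetical.

```python
import math


def p_choose_novel(v_familiar, v_novel, novelty=0.0, beta=5.0):
    """Probability of choosing the novel cue under an assumed logistic choice rule.

    novelty is added to the novel cue's value; set it to 0.0 for a model
    without a novelty term. beta is an illustrative inverse temperature.
    """
    diff = (v_novel + novelty) - v_familiar
    return 1.0 / (1.0 + math.exp(-beta * diff))


# Equal learned values: choice is at chance without novelty,
# and biased toward the novel cue with a positive novelty bonus.
print(p_choose_novel(0.25, 0.25))
print(p_choose_novel(0.25, 0.25, novelty=0.2))
```

With a positive novelty bonus the model prefers the novel cue early on even when its learned value is no higher than the familiar cue's, and the preference then tracks learned value as the bonus fades.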
Figure 8—figure supplement 2
Estimated learning rates of the RL model and regression of dopamine novelty responses to model-driven novelty estimates.

(A) Average estimated learning rates of the superior model for familiar and novel cues. (B) Regression of neuronal population responses measured 0.1–0.2 s after the cue onset onto novelty estimates …

https://doi.org/10.7554/eLife.18044.016

Additional files

Supplementary file 1

Estimated parameters for six RL models fitted on dopamine responses in the Pavlovian task.

https://doi.org/10.7554/eLife.18044.017
Supplementary file 2

Estimated parameters for six RL models fitted on monkeys’ choices.

https://doi.org/10.7554/eLife.18044.018
