Audiovisual cues must be predictable and win-paired to drive risky choice

  1. Brett A Hathaway (corresponding author)
  2. Dexter R Kim
  3. Salwa BA Malhas
  4. Kelly M Hrelja
  5. Lauren Kerker
  6. Tristan J Hynes
  7. Celyn Harris
  8. Angela Langdon
  9. Catharine Winstanley (corresponding author)
  1. Graduate Program in Neuroscience, Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Canada
  2. National Institute of Mental Health Intramural Research Program, United States
  3. Department of Psychology, Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Canada
8 figures, 3 tables and 2 additional files

Figures

Figure 1
The rat gambling task (rGT).

(A) Schematic of the rGT. A nose poke response in the food tray extinguished the traylight and initiated a new trial. After an inter-trial interval (ITI) of 5 s, four stimulus lights were turned on in holes 1, 2, 4, and 5, each of which was associated with a different number of sugar pellets. The order of the options from left to right was counter-balanced within each cohort to avoid development of a simple side bias (version A [shown]: P1, P4, P3, P2; version B: P4, P1, P3, P2). The animal was required to respond at a hole within 10 s. This response was then rewarded or punished depending on the reinforcement schedule for that option. If the animal lost, the stimulus light in the chosen hole flashed at a frequency of 0.5 Hz for the duration of the time-out penalty, and all other lights were extinguished. The maximum number of pellets available per 30 min session shows that P1 and P2 are more optimal than P3 and P4. The percent choice of the different options is one of the primary dependent variables. A score variable is also calculated, as for the IGT, to determine the overall level of risky choice as follows: [(P1 + P2) – (P3 + P4)].

(B) Distinct variants of the rGT. On the uncued variant, no audiovisual cues were present. The standard task featured audiovisual cues that scaled in complexity and magnitude with reward size. The reverse-cued variant inverted this relationship, such that the simplest cue was paired with the largest reward, and vice versa. Audiovisual cues were paired with both wins and losses for the outcome-cued variant. For the random-cued variant, cues were played on 50% of trials, regardless of outcome. Lastly, for the loss-cued variant, cues were only paired with losing outcomes, at the onset of the time-out penalty.
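The decision score defined in panel A can be sketched in a few lines. This is a minimal illustration, assuming percent choice of each option is already available per rat; the function name `decision_score` and the example values are hypothetical and are not study data.

```python
def decision_score(choice_pct):
    """Return (P1 + P2) - (P3 + P4) from the percent choice of each option."""
    return (choice_pct["P1"] + choice_pct["P2"]) - (choice_pct["P3"] + choice_pct["P4"])


# Hypothetical percent-choice profiles for two illustrative rats (not study data).
optimal_rat = {"P1": 10.0, "P2": 70.0, "P3": 5.0, "P4": 15.0}
risky_rat = {"P1": 5.0, "P2": 20.0, "P3": 15.0, "P4": 60.0}

print(decision_score(optimal_rat))  # 60.0
print(decision_score(risky_rat))    # -50.0
```

A positive score means the rat chose P1 and P2 more often than P3 and P4; a negative score means the reverse.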

© 2016, Winstanley and Floresco. Panel A is reproduced from Figure 1 from Winstanley and Floresco, 2016, published under the CC-BY 4.0. Further reproductions must adhere to the terms of this license.

Figure 2 with 1 supplement
Differences in baseline performance between task variants.

Comparative baseline performance on variants of the rGT. (A) Percent choice of each option in the six rGT task variants. (B) Average decision score shows risk preference is significantly modulated by the presence and contingency of outcome-paired cues, with preference for the high-risk options (P3 and P4) strongly enhanced in task variants in which the audiovisual cues scale with outcome magnitude and occur on winning trials. (C) Premature responding across the rGT variants, and (D) for risky versus optimal decision-makers. (E) Latency for reward collection on winning trials across the variants of the rGT. Data are expressed as mean + SEM. Black asterisk indicates significant difference (p<0.05) from uncued task; red caret indicates significant difference from standard cued task. ***p<0.001. N = 176 rats.

Figure 2—figure supplement 1
Comparative baseline performance for other metrics on variants of the rGT.

(A) Latency to choose an option across the rGT variants. (B) Number of omitted trials per session across task variants. (C) Number of completed trials across task variants. No significant differences were found between task variants for any of these metrics. Data are expressed as mean + SEM. N = 176 rats.

Figure 3
Effects of sucrose pellet devaluation on choice preference.

(A) P1–P4 choice preference after reinforcer devaluation compared to baseline preference for risk-preferring rats. Devaluation failed to shift choice patterns specifically in task variants featuring consistent win-paired cues (standard, outcome-cued, reverse-cued). (B) P1–P4 choice preference after reinforcer devaluation compared to baseline preference in optimal rats. Reinforcer devaluation induced a slight shift in choice preference, with no differences found between tasks. Data are expressed as the mean change in % choice from baseline + SEM to highlight effects independent of differences in preference for each option between cohorts. Asterisk indicates significant choice × devaluation effect, p<0.05. n = 31 risk-preferring rats, n = 95 optimal rats.

Figure 4 with 1 supplement
Difference in WAIC between each model and the basic model across the rGT task variants.

Differences were calculated by subtracting each model’s WAIC from the basic model WAIC for that task variant. Larger differences indicate a better explanation of the data. Error bars are SEM of the WAIC difference.
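The subtraction described above can be sketched as follows. The WAIC values are hypothetical placeholders chosen only to show the direction of the comparison, and the helper name `waic_differences` is an assumption for illustration.

```python
# Hypothetical WAIC values for one task variant (illustrative only, not study results).
waic = {
    "basic": 1000.0,
    "scaled": 990.0,
    "scaled + offset": 975.0,
    "nonlinear": 972.5,
}


def waic_differences(waic_by_model, baseline="basic"):
    """Subtract each model's WAIC from the baseline model's WAIC."""
    base = waic_by_model[baseline]
    return {
        name: base - value
        for name, value in waic_by_model.items()
        if name != baseline
    }


diffs = waic_differences(waic)
# The largest positive difference marks the model that improves most on the basic model.
best = max(diffs, key=diffs.get)
print(diffs)
print(best)
```

Because lower WAIC indicates a better fit, subtracting in this direction makes a larger difference correspond to a better explanation of the data.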

Figure 4—figure supplement 1
Visualization of how different values of the m, r, and b parameters affect the transformation of time-out penalty durations into equivalent cost in pellets.

Values were chosen to represent the spread of estimated values from model fits. (A) m parameter visualization: the scaled and scaled + offset models assume a linear relationship between time-out duration and equivalent cost, with the m parameter determining the slope of this relationship. Lower values produce a flatter relationship, such that long durations have relatively less impact on latent Q-values. (B) b parameter visualization (scaled + offset model): the global offset parameter allows a uniform cost to impact latent Q-values on all loss trials. (C) r parameter visualization: the nonlinear model preserves the assumption that equivalent costs increase monotonically with duration, but allows for nonlinearity in the shape of the relationship, such that longer durations yield a smaller increase in cost. (D) As in (B), the b parameter allows a uniform cost to impact Q-values on all loss trials.
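The three transformations can be sketched as simple functions. The linear forms follow the caption directly (slope m, optional offset b). The caption does not give the exact functional form of the nonlinear model, so the concave power function below is an assumption chosen only because it is monotonic and yields diminishing increases in cost; the durations and parameter values are likewise hypothetical.

```python
def scaled_cost(duration_s, m):
    """Scaled model: cost grows linearly with time-out duration, with slope m."""
    return m * duration_s


def scaled_offset_cost(duration_s, m, b):
    """Scaled + offset model: linear cost plus a uniform offset b on every loss."""
    return b + m * duration_s


def nonlinear_cost(duration_s, r, b):
    """Nonlinear model sketch.

    ASSUMPTION: a power function with 0 < r < 1 stands in for the model's
    actual (unstated) form. It increases monotonically with duration, and each
    extra second adds less cost than the one before.
    """
    return b + duration_s ** r


# Hypothetical durations (seconds) and parameter values, for illustration only.
for duration in (5, 10, 30, 40):
    print(
        duration,
        scaled_cost(duration, m=0.5),
        scaled_offset_cost(duration, m=0.5, b=1.0),
        round(nonlinear_cost(duration, r=0.5, b=1.0), 3),
    )
```

In all three sketches a lower m or r flattens the curve, so long time-outs weigh relatively less on the learned value of an option.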

Figure 5
Average simulated decision score and P1–P4 choice (sessions 36–40) for the nonlinear and scaled + offset models, simulated with the subject-level parameter estimates for each task variant.

(A) Average simulated decision score for the nonlinear model. (B) As in (A), for the scaled + offset model. (C) Average simulated P1–P4 choice for the nonlinear model. (D) As in (C), for the scaled + offset model. Black asterisk indicates significant difference from uncued task; red caret indicates significant difference from standard cued task. N = 165 rats (simulated from subject-level parameter estimates).

Figure 6
Group-level posterior estimates of nonlinear cost model parameters.

Asterisks within the inset tables mark parameters for which the 95% HDI of the sample difference did not contain zero, indicating a credible difference. For each distribution, the line demarcates the mean, the box demarcates the interquartile interval, and the whiskers demarcate the 95% HDI. N = 165 rats.
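The credible-difference criterion above can be sketched as follows: take the difference between paired posterior samples for two task variants, find the narrowest interval holding 95% of those differences, and check whether it contains zero. This is a minimal sample-based sketch, not the authors' analysis code; the helper names and the simulated posterior draws are hypothetical.

```python
import math
import random


def hdi(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = math.ceil(mass * n)  # number of samples the interval must cover
    widths = [ordered[i + window - 1] - ordered[i] for i in range(n - window + 1)]
    start = widths.index(min(widths))
    return ordered[start], ordered[start + window - 1]


def credibly_different(samples_a, samples_b, mass=0.95):
    """True when the HDI of the sample-wise difference excludes zero."""
    diffs = [a - b for a, b in zip(samples_a, samples_b)]
    low, high = hdi(diffs, mass)
    return not (low <= 0.0 <= high), (low, high)


# Hypothetical posterior draws for one parameter in three task variants.
rng = random.Random(0)
group_a = [rng.gauss(0.60, 0.05) for _ in range(4000)]
group_b = [rng.gauss(0.30, 0.05) for _ in range(4000)]
group_c = [rng.gauss(0.31, 0.05) for _ in range(4000)]

print(credibly_different(group_a, group_b)[0])  # well-separated groups
print(credibly_different(group_b, group_c)[0])  # nearly identical groups
```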

Figure 7 with 2 supplements
Group-level posterior estimates of scaled + offset model parameters.

Asterisks within the inset tables mark parameters for which the 95% HDI of the sample difference did not contain zero, indicating a credible difference. For each distribution, the line demarcates the mean, the box demarcates the interquartile interval, and the whiskers demarcate the 95% HDI. N = 165 rats.

Figure 7—figure supplement 1
Group-level posterior estimates of scaled model parameters.

Asterisks within the inset tables mark parameters for which the 95% HDI of the sample difference did not contain zero, indicating a credible difference. For each distribution, the box demarcates the interquartile interval and the whiskers demarcate the 95% HDI. N = 165 rats.

Figure 7—figure supplement 2
Group-level posterior estimates of scaled reward model parameters.

Asterisks within the inset tables mark parameters for which the 95% HDI of the sample difference did not contain zero, indicating a credible difference. For each distribution, the box demarcates the interquartile interval and the whiskers demarcate the 95% HDI. N = 165 rats.

Figure 8
Relationships between subject-level parameter estimates and final decision score across rats.

Simple linear regression models were fit for each parameter to assess whether parameter values covaried with decision score at the end of training. Dotted lines indicate 95% confidence intervals around the regression lines. (A) Subject-level global offset (b) estimates from nonlinear model versus final decision score. (B) Subject-level punishment learning rate (η–) estimates from scaled + offset model versus final decision score. (C) Subject-level beta (β) estimates from nonlinear model versus final decision score. (D) As in (C), with beta estimates from the scaled + offset model. N = 165 rats.
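A simple linear regression of this kind can be sketched with ordinary least squares. The example regresses final decision score on a subject-level parameter estimate; the data points are hypothetical, the function name `fit_line` is an assumption, and the confidence bands shown in the figure are omitted for brevity.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical subject-level estimates and final decision scores (not study data).
param_estimates = [0.2, 0.5, 0.9, 1.3, 1.8, 2.4]
final_scores = [55.0, 40.0, 22.0, 5.0, -20.0, -48.0]

slope, intercept = fit_line(param_estimates, final_scores)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

A negative slope in this toy data would mean that rats with larger parameter estimates ended training with riskier (lower) decision scores.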

Tables

Table 1
Decision score comparisons.
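Tukey HSD

| Task | Compared with | Mean difference | Significance |
|---|---|---|---|
| Uncued | Standard | 45.51 | <0.0001 |
| Uncued | Reverse | 25.32 | 0.006 |
| Uncued | Outcome | 51.75 | <0.0001 |
| Uncued | Random | –3.29 | 0.99 |
| Uncued | Loss | –22.23 | 0.03 |
| Standard | Reverse | –20.19 | 0.06 |
| Standard | Outcome | 6.24 | 0.95 |
| Standard | Random | –48.81 | <0.0001 |
| Standard | Loss | –67.75 | <0.0001 |
| Reverse | Outcome | 26.43 | 0.006 |
| Reverse | Random | –28.62 | 0.002 |
| Reverse | Loss | –47.56 | <0.0001 |
| Outcome | Random | –55.05 | <0.0001 |
| Outcome | Loss | –73.99 | <0.0001 |
| Random | Loss | –18.94 | 0.11 |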
  1. Comparisons of decision score between task variants using Tukey’s HSD test. Bolded values indicate a significant difference.

Table 2
Devaluation in risk-preferring rats: P1–P4 choice.
| Task | F value | Degrees of freedom | p value |
|---|---|---|---|
| Uncued/random/loss (n=7) | 4.17 | 3,18 | 0.04 |
| Standard (n=7) | 0.14 | 3,15 | 0.93 |
| Reverse (n=3) | 2.98 | 3,6 | 0.12 |
| Outcome (n=14) | 1.47 | 3,39 | 0.24 |
  1. Choice × devaluation interactions for each task in risk-preferring rats. Bolded values indicate a significant difference.

Table 3
Devaluation in risk-preferring rats: Decision score.
| Task | F value | Degrees of freedom | p value |
|---|---|---|---|
| Uncued/random/loss (n=7) | 55.49 | 1,6 | 0.005 |
| Standard (n=7) | 0.02 | 1,6 | 0.89 |
| Reverse (n=3) | 4.88 | 1,2 | 0.16 |
| Outcome (n=14) | 3.45 | 1,13 | 0.09 |
  1. Decision score × devaluation interactions for each task in risk-preferring rats. Bolded values indicate a significant difference.


Brett A Hathaway, Dexter R Kim, Salwa BA Malhas, Kelly M Hrelja, Lauren Kerker, Tristan J Hynes, Celyn Harris, Angela Langdon, Catharine Winstanley (2026)
Audiovisual cues must be predictable and win-paired to drive risky choice
eLife 14:RP105951.
https://doi.org/10.7554/eLife.105951.3