A process model account of the role of dopamine in intertemporal choice

  1. Alexander Soutschek (corresponding author)
  2. Philippe N Tobler
  1. Department of Psychology, Ludwig Maximilian University Munich, Germany
  2. Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Switzerland
  3. Neuroscience Center Zurich, University of Zurich, Swiss Federal Institute of Technology Zurich, Switzerland

Abstract

Theoretical accounts disagree on the role of dopamine in intertemporal choice and assume that dopamine either promotes delay of gratification by increasing the preference for larger rewards or that dopamine reduces patience by enhancing the sensitivity to waiting costs. Here, we reconcile these conflicting accounts by providing empirical support for a novel process model according to which dopamine contributes to two dissociable components of the decision process, evidence accumulation and starting bias. We re-analyzed a previously published data set where intertemporal decisions were made either under the D2 antagonist amisulpride or under placebo by fitting a hierarchical drift diffusion model that distinguishes between dopaminergic effects on the speed of evidence accumulation and the starting point of the accumulation process. Blocking dopaminergic neurotransmission not only strengthened the sensitivity to whether a reward is perceived as worth the delay costs during evidence accumulation (drift rate) but also attenuated the impact of waiting costs on the starting point of the evidence accumulation process (bias). In contrast, re-analyzing data from a D1 agonist study provided no evidence for a causal involvement of D1R activation in intertemporal choices. Taken together, our findings support a novel, process-based account of the role of dopamine for cost-benefit decision making, highlight the potential benefits of process-informed analyses, and advance our understanding of dopaminergic contributions to decision making.

Editor's evaluation

This important study reanalyzes a prior dataset testing effects of D2 antagonism on choices in a delay discounting task. While the prior report, which used standard analyses, showed no effects, the current study used a DDM to examine more carefully possible contrasting effects on different subcomponents of the decision process. This approach revealed convincing evidence of contrasting effects, with D2 blockade increasing the effect of reward size differences to favor selection of the larger, later reward, while also shifting the bias toward selection of the small immediate reward. The authors speculate that these opposing effects explain the variability in effects across studies, since they mean that effects would depend on which of these factors is more important in a particular design.

https://doi.org/10.7554/eLife.83734.sa0

Introduction

Many decisions require trading-off potential benefits (rewards) against the costs of actions, such as the time one has to wait for reward to occur (Soutschek and Tobler, 2018). The neurotransmitter dopamine is thought to play a central role in such cost-benefit trade-offs by increasing the tolerance for action costs in order to maximize net subjective benefits (Beeler, 2012; Robbins and Everitt, 1992; Salamone and Correa, 2012; Schultz, 2015). Tonic dopaminergic activity was hypothesized to implement a ‘cost control’ which moderates whether a reward or goal is considered to be worth its costs (Beeler and Mourra, 2018). Prominent accounts of dopaminergic functioning thus predict that dopamine should strengthen the preference for costly larger-later (LL) over less costly smaller-sooner (SS) rewards. However, empirical studies modulating dopaminergic neurotransmission during intertemporal decision making provided inconsistent evidence for these hypotheses (for a review, see Webber et al., 2021). Blocking dopaminergic activation even seems to increase rather than to reduce the preference for delayed outcomes (Arrondo et al., 2015; Soutschek et al., 2017; Wagner et al., 2020; Weber et al., 2016), in apparent contrast to accounts proposing that lower dopaminergic activity should decrease the attractiveness of costly rewards (Beeler and Mourra, 2018; Robbins and Everitt, 1992; Salamone and Correa, 2012). Thus, the link between dopamine and cost-benefit weighting in intertemporal choice remains elusive. Yet, a plausible account of how dopamine affects cost-benefit weighting is important given that deficits in delay of gratification belong to the core symptoms of several psychiatric disorders and that dopaminergic medication plays a central role in the treatment of these and other disorders (Hasler, 2012; MacKillop et al., 2011).

To account for the conflicting findings on the role of dopamine in intertemporal choice, recent proximity accounts hypothesized that dopamine – in addition to strengthening the pursuit of valuable goals – also increases the preference for proximate over distant rewards (first formulated by Westbrook and Frank, 2018; see also Soutschek et al., 2022). While proximity and action costs often correlate negatively (as cost-free immediate rewards are typically more proximate than costly delayed rewards), they can conceptually be distinguished: perceived costs depend on an individual’s internal state (e.g., available resources to wait for future rewards), whereas proximity is determined by situational factors like familiarity or concreteness (Westbrook and Frank, 2018). The hypothesis that dopamine increases the proximity advantage of sooner over later rewards is consistent with the observed stronger preference for LL options after D2R blockade, which could not be explained by standard accounts of the role of dopamine in cost-benefit decisions (Beeler and Mourra, 2018; Salamone and Correa, 2012).

Still, the question remains as to how the proximity account can be reconciled with the large body of evidence for a motivating role of dopamine in domains other than intertemporal choice (Webber et al., 2021). We recently suggested that both accounts may be unified within the framework of computational process models like the drift diffusion model (DDM) (Soutschek et al., 2022). DDMs assume that decision makers accumulate evidence for two reward options until a decision boundary is reached. The dopamine-mediated cost control may be implemented via dopaminergic effects on the evaluation of reward magnitudes and delay costs during the evidence accumulation process (drift rate), while a proximity advantage for sooner over delayed rewards may shift the starting bias toward the decision boundary for sooner rewards (Soutschek et al., 2022; Westbrook and Frank, 2018). Such proximity effects on the starting bias could reflect an automatic bias toward immediate rewards as posited by dual process models of intertemporal choice (Figner et al., 2010; McClure et al., 2004), whereas the influence of reward and delay on the drift rate involves more controlled and attention-demanding weighting of costs and benefits. Combining these two effects of dopamine, which are independent but partially opposing in their consequences for overt choices, in a unified and tractable account could reconcile conflicting findings. In turn, such a process account might provide a knowledge basis to advance our understanding of the neurochemical basis of the decision-making deficits in clinical disorders and improve the effectiveness of pharmaceutical interventions.

Here, we tested central assumptions of the proposed account by re-analyzing the data from two previous studies that investigated how the dopamine D2 receptor antagonist amisulpride and the D1 agonist PF-06412562 impact cost-benefit weighting in intertemporal choice (Soutschek et al., 2017; Soutschek et al., 2020a). D1Rs are prevalent in the direct ‘Go’ pathway and facilitate action selection via mediating the impact of phasic bursts elicited by high above-average rewards (Evers et al., 2017; Kirschner et al., 2020). D2Rs, in contrast, dominate the indirect ‘Nogo’ pathway (which suppresses action) and are more sensitive to small concentration differences in tonic dopamine levels (Missale et al., 1998), which is thought to encode the background, average reward rate (Kirschner et al., 2020; Volkow and Baler, 2015; Westbrook and Frank, 2018; Westbrook et al., 2020). Comparing the influences of the two compounds on the choice process during intertemporal decisions allowed us to test the hypothesized dissociable roles of D1Rs and D2Rs for decision making. Previously reported analyses of these data had shown no influence of D2R blockade or D1R stimulation on the mean preferences for LL over SS options (Soutschek et al., 2017; Soutschek et al., 2020a). However, they had not asked whether the pharmacological agents moderate the influences of reward magnitudes and delay costs on subcomponents of the decision process within the framework of a DDM. We re-analyzed the data sets with hierarchical Bayesian drift diffusion modeling to test central assumptions of the proposed account on dopamine’s role in cost-benefit weighting. First, if D2R activation implements a cost threshold moderating the evaluation of whether a reward is worth the action costs, then blocking D2R activation with amisulpride should increase the influence of reward magnitude on the speed of evidence accumulation, with costly small rewards becoming less acceptable than under placebo. 
Second, if D2R-mediated tonic dopaminergic activity also moderates the impact of proximity on choices (which affects the starting bias rather than the speed of the evidence accumulation process), D2R blockade should attenuate the effects of waiting costs on the starting bias. Third, we expected D1R stimulation to modulate the sensitivity to rewards during evidence accumulation (via increasing activity in the direct ‘Go’ pathway), without affecting proximity costs which were related to tonic rather than phasic dopaminergic activity (Westbrook and Frank, 2018).

Results

To disentangle how dopamine contributes to distinct subcomponents of the choice process, we re-analyzed a previously published data set where 56 participants had performed an intertemporal choice task under the D2 antagonist amisulpride (400 mg) and placebo in two separate sessions (Soutschek et al., 2017; Figure 1). First, we assessed amisulpride effects on intertemporal choices with conventional model-based and model-free analyses, as they are employed by other pharmacological studies on cost-benefit weighting. Hyperbolic discounting of future rewards was not significantly different under amisulpride (mean log-k=–2.07) compared with placebo (mean log-k=–2.19), Bayesian t-test, HDImean = 0.21, HDI95% = [–0.28; 0.70], and there were also no drug effects on choice consistency (inverse temperature), HDImean = –0.28, HDI95% = [–0.71; 0.13]. Model-free Bayesian mixed generalized linear models (MGLMs) revealed a stronger preference for LL over SS options with increasing differences in reward magnitudes, HDImean = 6.32, HDI95% = [5.03; 7.83], and with decreasing differences in delay of reward delivery, HDImean = –1.27, HDI95% = [–1.87; –0.60]. The impact of delays on choices was significantly reduced under amisulpride compared with placebo, HDImean = 0.75, HDI95% = [0.02; 1.67] (Figure 1C/D and Table 1). When we explored whether dopaminergic effects changed over the course of the experiment, we observed a significant main effect of trial number (more LL choices over time), HDImean = 0.58, HDI95% = [0.19; 0.99]. However, this effect was unaffected by the pharmacological manipulation, HDImean = –0.06, HDI95% = [–0.61; 0.48]. We also re-computed the MGLM reported above on log-transformed decision times, adding predictors for choice (SS vs. LL option) and Magnitudesum (combined magnitudes of SS and LL rewards). 
Participants made faster decisions the higher the sum of the two rewards, HDImean = –0.12, HDI95% = [–0.18; –0.06]; however, we observed no significant drug effects on decision times. Thus, based on these conventional analyses one would conclude that reduction of D2R neurotransmission lowers the sensitivity to delay costs, which on the one hand agrees with one line of previous findings (Arrondo et al., 2015; Wagner et al., 2020; Weber et al., 2016). On the other hand, this result seems to contradict the widely held assumption that dopamine increases the preference for costly over cost-free outcomes (Beeler and Mourra, 2018; Webber et al., 2021; Westbrook et al., 2020), because according to this view lower dopaminergic activity should increase, rather than decrease, the impact of waiting costs on LL choices. However, analyses that consider only the observed choices do not allow disentangling dopaminergic influences on distinct subcomponents of the choice process.
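The conventional model-based analysis described above can be illustrated with a minimal sketch of hyperbolic discounting combined with a softmax (logistic) choice rule. The functional forms are the standard textbook versions and are assumed here rather than taken from the authors' code. The function names, the reading of log-k as a natural logarithm, and the inverse temperature of 0.05 are illustrative assumptions.

```python
import math


def hyperbolic_value(amount: float, delay_days: float, k: float) -> float:
    """Discounted subjective value, V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay_days)


def p_choose_ll(ss_amount: float, ss_delay: float,
                ll_amount: float, ll_delay: float,
                log_k: float, inv_temp: float) -> float:
    """Softmax probability of choosing the larger-later (LL) option."""
    k = math.exp(log_k)  # assumption: log-k is a natural logarithm
    value_diff = (hyperbolic_value(ll_amount, ll_delay, k)
                  - hyperbolic_value(ss_amount, ss_delay, k))
    return 1.0 / (1.0 + math.exp(-inv_temp * value_diff))


# Example options: 100 francs now vs. 250 francs in 60 days.
# The inverse temperature (0.05) is a hypothetical value chosen for illustration.
for label, log_k in [("placebo", -2.19), ("amisulpride", -2.07)]:
    p = p_choose_ll(100, 0, 250, 60, log_k, inv_temp=0.05)
    print(f"{label}: log-k = {log_k}, P(LL) = {p:.3f}")
```

A more negative log-k means weaker discounting, so the same option pair yields a higher probability of an LL choice.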

Task design and experimental procedures.

(A) Participants made choices between alternatives that provided smaller-sooner rewards (e.g., 100 Swiss francs in 0 days) or larger-later rewards (e.g., 250 Swiss francs in 60 days). (B) In a double-blind crossover design, participants performed the intertemporal decision task after administration of the D2 antagonist amisulpride or placebo on two separate days. (C) Model-free Bayesian analyses revealed weaker influences of delay costs on decision making under amisulpride compared with placebo, consistent with previous findings that D2R antagonism strengthens the preference for delayed rewards. (D) Individual coefficients for the impact of delay on choices in the amisulpride and placebo conditions. (E) Illustration of the choice process in the framework of a drift-diffusion model. After a non-decision time τ (not shown here), evidence is accumulated from a starting point ζ with the weighted difference between benefits and action costs determining the speed of the accumulation process (drift rate v) toward the boundaries for the larger-later or smaller-sooner option. (F) Delay discounting under placebo (log-k; dots correspond to individual participants, with more negative values indicating weaker delay discounting) decreased with the difference in weights assigned to rewards and delay costs during evidence accumulation, replicating previous findings (Amasino et al., 2019).

Table 1
Results of Bayesian generalized linear model regressing binary choices (larger-later [LL] vs. smaller-sooner [SS] option) in the amisulpride study on predictors for Drug, Magnitudediff, Delaydiff, and the interaction terms.

Standard errors of the mean of the posterior distributions are in brackets.

| Predictor | Mean | 2.5% | 97.5% |
|---|---|---|---|
| Intercept | 3.10 (0.64) | 1.91 | 4.43 |
| Drug | –0.10 (0.53) | –1.12 | 0.96 |
| Delaydiff | –1.27 (0.32) | –1.87 | –0.60 |
| Magnitudediff | 6.32 (0.72) | 5.03 | 7.83 |
| Drug×Delaydiff | 0.75 (0.41) | 0.02 | 1.67 |
| Drug×Magnitudediff | 0.25 (0.71) | –1.05 | 1.77 |
| Delaydiff×Magnitudediff | 0.16 (0.43) | –0.61 | 1.10 |
| Drug×Delaydiff×Magnitudediff | 0.73 (0.60) | –0.30 | 2.07 |

DDMs paint a fuller picture of the decision process than pure choice data by integrating information from observed choices and decision times. DDMs assume that agents accumulate evidence for the choice options (captured by the drift rate v) from a starting point ζ until the accumulated evidence reaches a decision threshold (boundary parameter a; Figure 1E). Following previous procedures analyzing intertemporal choices with DDMs (Amasino et al., 2019), we assumed that the drift rate v integrates reward magnitudes and delays of choice options via attribute-wise comparisons (DDM-1). In addition, we allowed the starting bias to vary as a function of differences in delay costs, in line with recent proximity accounts of dopamine (Westbrook and Frank, 2018).
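The accumulation process described above can be sketched as a simple random-walk simulation. This is a minimal illustration, not the authors' hierarchical Bayesian implementation: it assumes a linear attribute-wise drift, a starting point that varies linearly with the delay difference, unit diffusion noise, and a fixed step size. All function names and parameter values are hypothetical.

```python
import random


def drift_rate(mag_diff: float, delay_diff: float,
               w_mag: float, w_delay: float) -> float:
    """Attribute-wise drift: weighted magnitude difference plus weighted delay difference."""
    return w_mag * mag_diff + w_delay * delay_diff


def start_point(bias_intercept: float, bias_delay: float, delay_diff: float) -> float:
    """Relative starting point between 0 (SS boundary) and 1 (LL boundary)."""
    z = bias_intercept + bias_delay * delay_diff
    return min(max(z, 0.01), 0.99)  # keep the start strictly between the boundaries


def simulate_trial(v: float, a: float, z: float, t0: float,
                   rng: random.Random, dt: float = 0.005,
                   max_steps: int = 100_000):
    """Simulate one trial and return (choice, rt); 'LL' is the upper boundary."""
    x = z * a            # absolute starting position between 0 (SS) and a (LL)
    sd = dt ** 0.5       # unit diffusion noise, scaled to the step size
    steps = 0
    while 0.0 < x < a and steps < max_steps:
        x += v * dt + sd * rng.gauss(0.0, 1.0)
        steps += 1
    # x has crossed a boundary (or, if max_steps was hit, the nearer boundary is used).
    choice = "LL" if x >= a / 2 else "SS"
    return choice, t0 + steps * dt


# Hypothetical parameter values, chosen only to show the mechanics.
rng = random.Random(1)
v = drift_rate(mag_diff=1.0, delay_diff=0.5, w_mag=2.0, w_delay=-1.0)
z = start_point(bias_intercept=0.55, bias_delay=-0.02, delay_diff=0.5)
trials = [simulate_trial(v, a=2.0, z=z, t0=0.3, rng=rng) for _ in range(500)]
share_ll = sum(choice == "LL" for choice, _ in trials) / len(trials)
print(f"drift = {v:.2f}, start = {z:.2f}, share of LL choices = {share_ll:.2f}")
```

In this sketch the drift rate and the starting point are separate routes by which delay can influence choice, which is the distinction the model relies on.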

A sanity check revealed that larger differences between the reward magnitudes of the LL and SS options bias evidence accumulation toward the LL option, HDImean = 2.41, HDI95% = [1.93; 2.95], whereas larger differences in delays bias accumulation in favor of the SS option, HDImean = –1.13, HDI95% = [–1.53; –0.78]. Moreover, we assessed the relationship between the difference in DDM parameters (reward magnitude – delay) and the hyperbolic discount parameter log-k as a purely choice-based indicator of impulsiveness. Replicating previous findings, we found that across individuals the weights relate to delay discounting, r=–0.61, p<0.001 (Amasino et al., 2019; Figure 1F), such that individuals weighting reward magnitudes more strongly over delays make more patient choices. Thus, our model parameters capture essential subprocesses of intertemporal decision making.

Next, we tested the impact of our dopaminergic manipulation on evidence accumulation: D2R blockade strengthened the impact of differences in reward magnitude on evidence accumulation, Drug×Magnitudediff: HDImean = 0.81, HDI95% = [0.04; 1.71], while the contribution of differences in delay costs remained unchanged, Drug×Delaydiff: HDImean = –0.30, HDI95% = [–0.85; 0.20] (Figure 2A–F and Table 2). The drug-induced increase in sensitivity to variation in reward magnitude suggests that low rewards are considered less valuable under amisulpride compared with placebo (Figure 2C). This finding is consistent with the cost control hypothesis (Beeler and Mourra, 2018) according to which low dopamine levels reduce the attractiveness of smaller, below-average rewards.
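To make the interaction concrete, the sketch below plugs the posterior means reported for DDM-1 into a linear drift equation. It rests on assumptions not stated in this section: Drug is coded 0 (placebo) and 1 (amisulpride), the predictors are standardized so that 0 corresponds to an average difference, and the linear term is used directly as the drift (the vmax parameter is ignored). The function name is hypothetical.

```python
# Posterior means for the drift-rate regressors of DDM-1 (amisulpride study).
W_MAG, W_DELAY = 2.41, -1.13
W_DRUG_MAG, W_DRUG_DELAY = 0.81, -0.30


def linear_drift(mag_diff: float, delay_diff: float, drug: int) -> float:
    """Linear drift term; assumes drug is coded 0/1 and predictors are standardized."""
    mag_weight = W_MAG + W_DRUG_MAG * drug
    delay_weight = W_DELAY + W_DRUG_DELAY * drug
    return mag_weight * mag_diff + delay_weight * delay_diff


for mag_diff in (-1.0, 0.0, 1.0):  # small, average, large magnitude difference
    placebo = linear_drift(mag_diff, delay_diff=0.0, drug=0)
    amisulpride = linear_drift(mag_diff, delay_diff=0.0, drug=1)
    print(f"mag_diff = {mag_diff:+.1f}: placebo = {placebo:+.2f}, "
          f"amisulpride = {amisulpride:+.2f}")
```

Under these assumptions, the steeper slope under amisulpride gives a more positive drift for large magnitude differences and a more negative drift for small ones.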

D2R blockade affects multiple components of the intertemporal decision process.

(A) Larger differences in reward magnitude between the larger-later (LL) and smaller-sooner (SS) option increased drift rates, speeding up evidence accumulation toward LL options under placebo. (B) The impact of differences in reward magnitude was significantly stronger under amisulpride than under placebo. (C) Drug-dependent impact of differences in reward magnitude on the drift rate. Because the sensitivity to differences in reward magnitude was stronger under amisulpride than under placebo (steeper slope), D2R blockade sped up evidence accumulation toward the boundary for LL choices if differences in reward magnitude between the LL and SS options were large. In contrast, if the difference in reward magnitude was small, the drift rate was more negative under amisulpride compared with placebo, speeding up evidence accumulation toward the SS option. (D) Larger differences between delay of reward promoted evidence accumulation toward the (negative) boundary for SS choices under placebo, (E, F) but the impact of delay was not significantly altered by amisulpride. (G) The starting point of the accumulation processes was closer to the boundary for LL than SS choices under placebo, (H) and this starting bias toward the LL option was significantly reduced by amisulpride. (I) For larger differences in waiting costs, the starting point of the evidence accumulation process was increasingly shifted toward the SS option under placebo. (J) This impact of delay costs on the starting bias was significantly reduced under amisulpride. As illustrated in (K), reducing dopaminergic action on D2R with amisulpride shifted the starting bias toward the boundary for the SS option predominantly if no option possessed a clear proximity advantage (small difference between delays). In A, B, D, E, G, H, I, and J, yellow bars close to x-axis indicate 95% HDIs.

Table 2
Results of hierarchical DDM-1 for the amisulpride data.

SEM: standard errors of the mean of the posterior distributions.

| Parameter | Regressor | Mean (SEM) | 2.5% | 97.5% |
|---|---|---|---|---|
| Drift rate | Delaydiff | –1.13 (0.19) | –1.53 | –0.78 |
| | Magnitudediff | 2.41 (0.26) | 1.93 | 2.95 |
| | Drug×Delaydiff | –0.30 (0.27) | –0.85 | 0.20 |
| | Drug×Magnitudediff | 0.81 (0.42) | 0.04 | 1.71 |
| | vmax | 0.80 (0.05) | 0.70 | 0.90 |
| | Drug×vmax | –0.00 (0.06) | –0.12 | 0.12 |
| Decision threshold | Placebo | 3.96 (0.15) | 3.67 | 4.26 |
| | Drug | 0.17 (0.15) | –0.11 | 0.46 |
| Starting bias | Placebo | 0.57 (0.01) | 0.54 | 0.60 |
| | Drug | –0.04 (0.02) | –0.08 | –0.001 |
| | Delaydiff | –0.02 (0.01) | –0.03 | –0.002 |
| | Drug×Delaydiff | 0.02 (0.01) | 0.001 | 0.04 |
| Non-decision time | Placebo | 1.48 (0.06) | 1.36 | 1.60 |
| | Drug | –0.10 (0.07) | –0.24 | 0.03 |

When we assessed dopaminergic effects on the starting bias, we observed that under placebo increasing differences in delay shifted the starting point toward the SS option, HDImean = –0.02, HDI95% = [–0.03; –0.002], suggesting that the bias parameter is closer to the proximate (SS) option the stronger the proximity advantage of the SS over the LL option. Amisulpride shifted the starting bias toward the SS option for smaller differences in delay, main effect of Drug: HDImean = –0.04, HDI95% = [–0.08; –0.001], but also attenuated the impact of delay, Drug×Delaydiff: HDImean = 0.02, HDI95% = [0.001; 0.04]. Thus, dopamine appears to moderate the impact of temporal proximity on the starting bias (Figure 2G–K), providing support for recent proximity accounts of dopamine (Soutschek et al., 2022; Westbrook and Frank, 2018). Moreover, compared to the model-free analysis, our process model (which uses not only binary choice but also response time data) provides a fuller picture of the subcomponents of the choice process affected by the dopaminergic manipulation.
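The pattern of a main effect plus an attenuated delay slope can be worked through with the posterior means listed for the starting bias in Table 2. The sketch assumes a linear model on the relative starting point, Drug coded 0 (placebo) and 1 (amisulpride), and a standardized delay predictor. These coding choices and the function name are assumptions made for illustration.

```python
# Posterior means for the starting-bias regressors of DDM-1 (amisulpride study).
B_PLACEBO, B_DRUG = 0.57, -0.04
B_DELAY, B_DRUG_DELAY = -0.02, 0.02


def starting_bias(delay_diff: float, drug: int) -> float:
    """Relative starting point; values above 0.5 lie closer to the LL boundary."""
    intercept = B_PLACEBO + B_DRUG * drug
    slope = B_DELAY + B_DRUG_DELAY * drug
    return intercept + slope * delay_diff


for delay_diff in (-1.0, 0.0, 1.0):  # small, average, large delay difference
    placebo = starting_bias(delay_diff, drug=0)
    amisulpride = starting_bias(delay_diff, drug=1)
    print(f"delay_diff = {delay_diff:+.1f}: placebo = {placebo:.2f}, "
          f"amisulpride = {amisulpride:.2f}, shift = {amisulpride - placebo:+.2f}")
```

Under these assumptions the delay slope is close to zero under amisulpride, so the drug-induced shift toward the SS boundary is largest when the delay difference is small.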

Next, we investigated the relation between the drug effects on the drift rate and on the starting bias. We found no evidence that the two effects correlated, r=0.07, p=0.60, suggesting that amisulpride effects on these subprocesses were largely independent of each other. Control analyses revealed no effects of amisulpride on non-decision times, HDImean = –0.10, HDI95% = [–0.24; 0.03], or the decision threshold, HDImean = 0.17, HDI95% = [–0.11; 0.46]. Thus, the results of DDM-1 suggest that dopamine moderates the influence of choice attributes on both the speed of evidence accumulation and on the starting bias, consistent with recent accounts (Soutschek et al., 2022; Westbrook and Frank, 2018) of dopamine’s role in cost-benefit weighting.

To test the robustness of our DDM findings, we computed further DDMs where we either removed the impact of Delaydiff on the starting bias (DDM-2) or the impact of Magnitudediff and Delaydiff on the drift rate (DDM-3). In a further model (DDM-4), we explored whether the starting bias is affected by the overall proximity of the options (sum of delays, Delaysum) rather than the difference in proximity (Delaydiff; see Table 3 for an overview over the parameters included in the various models). Importantly, our original DDM-1 (DIC = 9478) explained the data better than DDM-2 (DIC = 9481), DDM-3 (DIC = 10,224), or DDM-4 (DIC = 9492; Figure 3A). Nevertheless, amisulpride moderated the impact of Magnitudediff on the drift rate also in DDM-2, HDImean = 0.86, HDI95% = [0.18; 1.64], and DDM-4, HDImean = 0.83, HDI95% = [0.04; 1.75], and amisulpride also lowered the impact of Delaydiff on the starting bias in DDM-3, HDImean = –0.02, HDI95% = [–0.04; –0.001]. Thus, the dopaminergic effects on these subcomponents of the choice process are robust to the exact specification of the DDM.

Model comparison.

(A) The deviance information criterion (DIC; lower numbers correspond to better fit) suggests that DDM-1 explained the data slightly better than DDM-2, DDM-4, and DDM-6 and clearly outperformed DDM-3 and DDM-5. (B, C) Posterior predictive checks on the group level (collapsed across all participants), separately for (B) placebo and (C) amisulpride. Particularly DDM-1, DDM-2, and DDM-3 described the empirically observed data well, whereas decisions simulated based on DDMs 4–6 more strongly deviated from observed behavior.

Table 3
Overview over drift diffusion model (DDM) parameters included in the DDMs.

Note that we modeled drug effects for all parameters included in the DDMs.

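| Parameter | Regressor | DDM-1 | DDM-2 | DDM-3 | DDM-4 | DDM-5 | DDM-6 |
|---|---|---|---|---|---|---|---|
| Drift rate | Delaydiff | | | | | | |
| | Magnitudediff | | | | | | |
| | vmax | | | | | | |
| Decision threshold | Intercept | | | | | | |
| Starting bias | Intercept | | | | | | |
| | Delaydiff | | | | | | |
| | Delaysum | | | | | | |
| Non-decision time | Intercept | | | | | | |
| | τdiff | | | | | | |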

We compared the winning account also with alternative process models of intertemporal choice. While in DDM-1 the drift rate depends on separate comparisons between choice attributes, one might alternatively assume that they compare the discounted subjective reward values of both options (Wagner et al., 2020), as given by the hyperbolic discount functions. However, a DDM where the drift rate was modeled as the difference between the hyperbolically discounted reward values (with the discount factor as free parameter; DDM-5) showed a worse model fit (DIC = 10,720) than DDM-1. This replicates previous findings according to which intertemporal choices can better be explained by attribute-wise than by option-wise comparison strategies (Amasino et al., 2019; Dai and Busemeyer, 2014; Reeck et al., 2017).

Next, we investigated an alternative to the proposal that differences in delay affect the starting bias via proximity effects. Specifically, we tested whether evidence for delay costs is accumulated earlier than evidence for reward magnitude (relative-starting-time (rs)DDM; Amasino et al., 2019; Lombardi and Hare, 2021). From the perspective of rsDDMs, evidence accumulation for delays would start after a shorter non-decision time than for rewards, which is expressed by the variable τdiff (if τdiff > 0, non-decision time is shorter for delays than for rewards, and vice versa if τdiff < 0). However, this rsDDM (DDM-6) also explained the data less well (DIC = 9,548) than DDM-1. Thus, DDM-1 explains the current data better than alternative DDMs.
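The model comparison can be summarized by expressing each model's DIC relative to the best-fitting model. The snippet below only rearranges the DIC values reported in the text for the amisulpride data; the variable names are illustrative.

```python
# DIC values reported for the amisulpride data (lower values indicate better fit).
dic = {
    "DDM-1": 9478,
    "DDM-2": 9481,
    "DDM-3": 10224,
    "DDM-4": 9492,
    "DDM-5": 10720,
    "DDM-6": 9548,
}

best_model = min(dic, key=dic.get)
delta_dic = {model: value - dic[best_model] for model, value in dic.items()}

# List the models from best to worst fit.
for model, delta in sorted(delta_dic.items(), key=lambda item: item[1]):
    print(f"{model}: DIC = {dic[model]}, difference to {best_model} = {delta}")
```

The differences to DDM-2 and DDM-4 are small, whereas DDM-3 and DDM-5 fit markedly worse.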

The currently used dose of amisulpride (400 mg) is thought to have predominantly postsynaptic effects on D2Rs, while lower doses (50–300 mg) might show presynaptic rather than postsynaptic effects (Schoemaker et al., 1997). Given that we used the same dose in all participants, one might argue that we may have studied presynaptic effects in individuals with relatively high body mass (which lowers the effective dose). However, we observed no evidence that individual random coefficients for the drug effects on the drift rate or on the starting bias correlated with body weight, all r<0.22, all p>0.10. There were also no significant correlations between DDM parameters and performance in the digit span backward task as a proxy for baseline dopamine synthesis capacity (Cools et al., 2008), all r<0.17, all p>0.22. There was thus no evidence that pharmacological effects on intertemporal choices depended on body weight as a proxy for effective dose or on working memory performance as a proxy for baseline dopaminergic activity.

As a further check of the explanatory adequacy of DDM-1, we performed posterior predictive checks and parameter recovery analyses. Plotting the observed RTs (split into quintiles according to Magnitudediff and Delaydiff) against the simulated RTs based on the parameter estimates from the different DDMs suggests that the DDMs provide reasonable accounts of the observed data both on the group and the individual level, at least for DDMs 1–3 (Figure 3B/C and Figure 4). Moreover, the squared differences between observed and simulated RTs were smaller for DDM-1 (0.83) than for alternative DDMs (DDM-2: 0.85; DDM-3: 0.98; DDM-4: 1.05; DDM-5: 0.89; DDM-6: 1.63). To assess parameter recovery, we re-computed DDM-1 on 10 simulated data sets based on the original DDM-1 parameter estimates. All group-level parameters from the simulated data were within the 95% HDI of the original parameter estimates, except for the non-decision time τ (which suggests that our model tends to overestimate the duration of decision-unrelated processes). Nevertheless, all parameters determining the outcome of the decision process (i.e., the choice made) as well as the dopaminergic effects on the parameters could reliably be recovered by DDM-1.
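A posterior predictive check of this kind can be sketched as follows: trials are sorted on a predictor, split into five bins, and the mean RT per bin is compared between observed and simulated data. The exact binning and the way the squared differences were aggregated are not specified in this section, so equal-sized bins and a mean over bins are assumptions. The function names and the toy data are hypothetical.

```python
from statistics import mean


def quintile_means(predictor, rts):
    """Mean RT within each of five bins formed by sorting trials on the predictor."""
    if len(predictor) != len(rts) or len(rts) < 5:
        raise ValueError("need two equally long sequences with at least five trials")
    ordered = [rt for _, rt in sorted(zip(predictor, rts), key=lambda pair: pair[0])]
    n = len(ordered)
    return [mean(ordered[i * n // 5:(i + 1) * n // 5]) for i in range(5)]


def mean_squared_difference(observed_means, simulated_means):
    """Average squared deviation between observed and simulated bin means."""
    return mean((o - s) ** 2 for o, s in zip(observed_means, simulated_means))


# Toy data: ten trials ordered by magnitude difference (not real RTs).
mag_diff = list(range(10))
observed_rt = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0, 5.0, 5.0]
simulated_rt = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0, 6.0, 6.0]

obs = quintile_means(mag_diff, observed_rt)
sim = quintile_means(mag_diff, simulated_rt)
print("observed bin means: ", obs)
print("simulated bin means:", sim)
print("mean squared difference:", mean_squared_difference(obs, sim))
```

A smaller value indicates that the simulated RTs track the observed RTs more closely across the range of the predictor.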

Posterior predictive checks.

For each individual participant (p1–p56), observed RTs (in black) are plotted against the RTs simulated based on the parameters for drift diffusion model (DDM) 1–6, separately for differences in (A) reward magnitude and (B) delay (quintiles). The plots suggest that the DDMs provide reasonable accounts of the observed RTs.

To assess the receptor specificity of our findings, we conducted the same analyses on the data from a study (published previously in Soutschek et al., 2020a) testing the impact of three doses of a D1 agonist (6, 15, 30 mg) relative to placebo on intertemporal choices (between-subject design). In the intertemporal choice task used in this experiment, the SS reward was always immediately available (delay = 0), contrary to the task in the D2 experiment where the delay of the SS reward varied from 0 to 30 days. Again, the data in the D1 experiment were best explained by DDM-1 (DICDDM-1=19,657) compared with all other DDMs (DICDDM-2=20,934; DICDDM-3=21,710; DICDDM-5=21,982; DICDDM-6=19,660; note that DDM-4 was identical with DDM-1 for the D1 agonist study because the delay of the SS reward was 0). Neither the best-fitting nor any other model yielded significant drug effects on any drift diffusion parameter (see Table 4 for the best-fitting model). Also model-free analyses conducted in the same way as for the D2 antagonist study revealed no significant drug effects (all HDI95% included zero). There was thus no evidence for any influence of D1R stimulation on intertemporal decisions.

Table 4
Results of the best-fitting DDM-1 for the D1 agonist experiment.

Standard errors of the mean of the posterior distributions are in brackets.

| Parameter | Regressor | Mean | 2.5% | 97.5% |
|---|---|---|---|---|
| Drift rate: Delaydiff | Placebo (0 mg) | –0.42 | –0.74 | –0.20 |
| | 6 mg vs. 0 mg | –0.02 | –0.35 | 0.28 |
| | 15 mg vs. 0 mg | –0.15 | –0.52 | 0.19 |
| | 30 mg vs. 0 mg | –0.12 | –0.50 | 0.14 |
| Drift rate: Magnitudediff | Placebo (0 mg) | 0.86 | 0.49 | 1.31 |
| | 6 mg vs. 0 mg | 0.17 | –0.26 | 0.63 |
| | 15 mg vs. 0 mg | 0.11 | –0.53 | 0.73 |
| | 30 mg vs. 0 mg | 0.23 | –0.23 | 0.83 |
| Drift rate: vmax | Placebo (0 mg) | 1.72 | 1.15 | 2.66 |
| | 6 mg vs. 0 mg | –0.36 | –1.37 | 0.60 |
| | 15 mg vs. 0 mg | 0.02 | –1.18 | 1.35 |
| | 30 mg vs. 0 mg | –0.39 | –1.60 | 0.75 |
| Decision threshold | Placebo (0 mg) | 2.82 | 2.59 | 3.04 |
| | 6 mg vs. 0 mg | –0.13 | –0.46 | 0.20 |
| | 15 mg vs. 0 mg | –0.11 | –0.45 | 0.24 |
| | 30 mg vs. 0 mg | –0.06 | –0.39 | 0.25 |
| Starting bias: Intercept | Placebo (0 mg) | 0.59 | 0.54 | 0.64 |
| | 6 mg vs. 0 mg | –0.01 | –0.07 | 0.06 |
| | 15 mg vs. 0 mg | –0.01 | –0.08 | 0.06 |
| | 30 mg vs. 0 mg | –0.03 | –0.09 | 0.04 |
| Starting bias: Delaydiff | Placebo (0 mg) | 0.00 | –0.01 | 0.02 |
| | 6 mg vs. 0 mg | –0.01 | –0.03 | 0.02 |
| | 15 mg vs. 0 mg | 0.01 | –0.01 | 0.04 |
| | 30 mg vs. 0 mg | –0.01 | –0.03 | 0.01 |
| Non-decision time | Placebo (0 mg) | 0.85 | 0.78 | 0.93 |
| | 6 mg vs. 0 mg | 0.03 | –0.09 | 0.15 |
| | 15 mg vs. 0 mg | –0.02 | –0.13 | 0.09 |
| | 30 mg vs. 0 mg | 0.03 | –0.06 | 0.13 |

Discussion

Dopamine is hypothesized to play a central role in human cost-benefit decision making, but existing empirical evidence does not conclusively support the widely shared assumption that dopamine promotes the pursuit of high benefit-high cost options (for reviews, see Soutschek et al., 2022; Webber et al., 2021). By manipulating dopaminergic activity with the D2 antagonist amisulpride, we provide empirical evidence for a novel process model of cost-benefit weighting that reconciles conflicting views by assuming dissociable effects of dopamine on distinct subcomponents of the decision process.

D2R blockade (relative to placebo) increased the sensitivity to variation in reward magnitudes during evidence accumulation, such that only relatively large future rewards were considered to be worth the waiting cost, whereas small delayed rewards were perceived as less valuable than sooner rewards. This dopaminergic impact on the drift rate is consistent with the view that D2R-mediated tonic dopamine levels implement a cost control determining whether a reward is worth the required action costs (Beeler and Mourra, 2018). From this perspective, lowering D2R activity with amisulpride resulted in a stricter cost control such that only rather large delayed rewards were able to overcome D2R-mediated cortical inhibition (Lerner and Kreitzer, 2011). While this effect is consistent with the standard view according to which dopamine increases the preference for large costly rewards (Robbins and Everitt, 1992; Salamone and Correa, 2012; Schultz, 2015), the dopaminergic effects on the starting bias parameter yielded a different pattern. Here, inhibition of D2R activation reduced the impact of delay costs on the starting bias, such that for shorter delays (where the immediate reward has only a small proximity advantage) D2R inhibition shifted the bias toward the SS option. This finding represents the first evidence for the hypothesis that tonic dopamine moderates the impact of proximity (e.g., more concrete vs. more abstract rewards) on cost-benefit decision making (Soutschek et al., 2022; Westbrook and Frank, 2018). Pharmacological manipulation of D1R activation, in contrast, showed no significant effects on the decision process. This provides evidence for the receptor specificity of dopamine’s role in intertemporal decision making (though, as a caveat, the differences between the tasks administered in the D1 and the D2 studies should be kept in mind).

Conceptually, the assumption of proximity effects on the starting bias is consistent with dual process models of intertemporal choice assuming that individuals are (at least partially) biased toward selecting immediate over delayed rewards (Figner et al., 2010; McClure et al., 2004). This automatic favoring of immediate rewards is reflected in a shift of the starting bias and thus occurs before the evidence accumulation process, which relies on attention-demanding cost-benefit weighting (Zhao et al., 2019). In agreement with this notion, DDM-1 with temporal proximity-dependent bias showed a better fit than DDM-6 with variable non-decision times for rewards and delays. We note that the hierarchical modeling approach allowed us to compare models at the group level only, such that in some individuals behavior might be better explained by a different model than DDM-1. Such model comparisons on the individual level, however, were beyond the scope of the current study and might not yield robust results given the limited number of trials per individual. We also emphasize that alternative process models like the linear ballistic accumulator (LBA) model make different assumptions than DDMs, for example by positing the existence of separate option-specific accumulators rather than only one as assumed by DDMs. However, proximity effects as investigated in the current study might be incorporated in LBA models as well by varying the starting points of the accumulators as a function of proximity.

A dopaminergic modulation of proximity effects provides an elegant explanation for the fact that in most D2 antagonist studies D2R reduction increased the preference for LL options (Arrondo et al., 2015; Soutschek et al., 2017; Wagner et al., 2020; Weber et al., 2016), contrary to the predictions of energization accounts (Beeler and Mourra, 2018; Salamone and Correa, 2012). Notably, the dopaminergic effects on evidence accumulation and on the starting bias promote potentially different action tendencies, as the impact of amisulpride on evidence accumulation lowered the weight assigned to small future rewards, whereas the amisulpride effects on the starting bias increased the likelihood of LL options being chosen. Rather than generally biasing impulsive or patient choices, the impact of dopamine on decision making may therefore crucially depend on the rewards at stake and the associated waiting costs (Figure 4). In our model, lower dopamine levels strengthen the preference for high reward-high cost options predominantly in two situations: first, if differences in reward magnitude are high (e.g., choosing between your favorite meal vs. a clearly less liked dish), and second, if the less costly option has a clear proximity advantage over the costlier one (having dinner in a restaurant close-by or a preferred restaurant on the other side of town). Conversely, if differences in both expected reward and waiting costs are small, lower dopamine may bias choices in favor of low-cost rewards over high-cost rewards. By extension, higher dopamine levels should increase the preference for an SS option if the SS option has a pronounced proximity advantage over the LL option, and bias the acceptance of LL options if both options are associated with similar waiting costs. We note though that the effects of increasing dopamine levels are less predictable than the effects of lowering dopaminergic activity due to possible inverted-U-shaped dopamine-response curves (Floresco, 2013); potentially, the dopaminergic effects on drift rate and starting bias might even follow different dose-response functions. Taken together, our process model of the dopaminergic involvement in cost-benefit decisions allows reconciling conflicting theoretical accounts and (apparently) inconsistent empirical findings by showing that dopamine moderates the effects of reward magnitudes and delay costs on different subcomponents of the choice process.

We note that the moderating roles of differences in delays are also reflected in the significant interaction between drug and delay from the model-free analysis, although this analysis could provide no insights into which subcomponents of the choice process are affected by dopamine. As the influence of dopamine on decision making varies as a function of the differences in reward magnitude and waiting costs, the outcomes of standard analyses like mean percentage of LL choices or hyperbolic discount parameters may be specific to the reward magnitudes and delays administered in a given study. For example, if an experimental task includes large differences between rewards and delays, dopamine antagonists may reduce delay discounting, whereas studies with smaller differences between these choice attributes may observe no effect of dopaminergic manipulations (Figure 5). Standard analyses that measure patience by one behavioral parameter only (e.g., discount factors) may thus result in misleading findings. In contrast, process models of decision making do not just assess whether a neural manipulation increases or reduces patience; instead, they quantify the influence of a manipulation on the weights assigned to rewards and waiting costs during different phases of the choice process, with these weights being less sensitive to the administered choice options in a given experiment. Process models may thus provide a less option-specific picture of the impact of pharmacological and neural manipulations.

Illustration of how dopaminergic effects on intertemporal choices depend on differences in both reward magnitude and delay in the proposed framework, separately for (A) placebo, (B) amisulpride, and (C) the difference between amisulpride and placebo.

Plots are based on simulations assuming the group-level parameter estimates we observed under placebo and amisulpride. As dopaminergic effects on decision making affect both reward processing (via the drift rate) and cost processing (via the starting bias), the specific combination of rewards and delays determines whether D2R blockade increases or decreases the probability of larger-later (LL) choices. Low dopamine levels reduce the proximity advantage of smaller-sooner (SS) over LL options particularly if differences in action costs between reward options are large, promoting choices of the LL option. In contrast, if no option possesses a proximity advantage (small differences between delays), dopaminergic effects on evidence accumulation dominate, such that the LL option is perceived as less worth the waiting costs, particularly if its reward magnitude differs only little from that of the alternative SS option.

As a potential alternative explanation for the enhanced influence of reward magnitude under amisulpride, one might argue that D2R blockade generally increases the signal-to-noise ratio for decision-relevant information. However, this notion is inconsistent with the proposed role of D2R activation for precise action selection (Keeler et al., 2014), because that role would have predicted amisulpride to result in noisier (less precise action selection) rather than less noisy evidence accumulation. Moreover, our data provide no evidence for drug effects on the inverse temperature parameter measuring choice consistency, and there were also no significant correlations between amisulpride effects on reward and delay processing, contrary to what one should expect if these effects were driven by the same mechanism.

While higher doses of amisulpride (as administered in the current study) antagonize postsynaptic D2Rs, lower doses (50–300 mg) were found to primarily block presynaptic dopamine receptors (Schoemaker et al., 1997), which may result in amplified phasic dopamine release and thus increased sensitivity to benefits (Frank and O’Reilly, 2006). At first glance, the stronger influence of differences in reward magnitude on drift rates under amisulpride compared with placebo might therefore speak in favor of presynaptic (higher dopamine levels) rather than postsynaptic mechanisms of action in the current study. However, amisulpride vs. placebo increased evidence accumulation toward LL rewards (more positive drift rate) only for larger differences between larger (later) and smaller (sooner) rewards, whereas for smaller reward differences amisulpride enhanced evidence accumulation toward SS choices (more negative drift rate; see Figure 2C). The latter finding appears inconsistent with presynaptic effects, as higher dopamine levels are thought to increase the preference for costly larger rewards (Webber et al., 2021). Instead, the stronger influence of reward differences on drift rates under amisulpride could be explained by a stricter cost control (Beeler and Mourra, 2018). In this interpretation, individuals more strongly distinguish between larger rewards that are worth the waiting costs (large difference between LL and SS rewards) and larger rewards that are not worth the same waiting costs (small difference between LL and SS rewards). While this speaks in favor of postsynaptic effects, we acknowledge that the amisulpride effects for larger reward differences are compatible with presynaptic mechanisms.

The result pattern for the starting bias parameter, in turn, suggests the presence of two distinct response biases, reflected by the intercept and the delay-dependent slope of the bias parameter (see Figure 2K), which are both under dopaminergic control but in opposite directions. First, participants seem to have a general bias toward the LL option in the current task (intercept), which is reduced under amisulpride compared with placebo, consistent with the assumption that dopamine strengthens the preference for larger rewards (Beeler and Mourra, 2018; Salamone and Correa, 2012; Schultz, 2015). Second, amisulpride reduced the impact of increasing differences in delay on the starting bias, as predicted by the proximity account of tonic dopamine (Westbrook and Frank, 2018). Both of these effects are compatible with postsynaptic effects of amisulpride. However, we note that in principle one might make the assumption that proximity effects are stronger for smaller than for larger differences in delay, and under this assumption the results would be consistent with presynaptic effects. On balance, the current results thus appear more likely under the assumption of postsynaptic rather than presynaptic effects but the latter cannot be entirely excluded. Unfortunately, the lack of a significant amisulpride effect on decision times (which should be reduced or increased as consequence of presynaptic or postsynaptic effects, respectively) sheds no additional light on the issue. Lastly, while the actions of amisulpride on D2/D3 receptors are relatively selective, it also affects serotonergic 5-HT7 receptors (Abbas et al., 2009). Because serotonin has been related to impulsive behavior (Mori et al., 2018), it is worth keeping in mind that amisulpride effects on serotonergic, in addition to dopaminergic, activity might contribute to the observed result pattern.

An important question refers to whether our findings for delay costs can be generalized to other types of costs as well, including risk, social costs (i.e., inequity), effort, and opportunity costs. We recently proposed that dopamine might also moderate proximity effects for reward options differing in risk and social costs, whereas the existing literature provides no evidence for a proximity advantage of effort-free over effortful rewards (Soutschek et al., 2022). However, these hypotheses need to be tested more explicitly by future investigations. Dopamine has also been ascribed a role for moderating opportunity costs, with lower tonic dopamine reducing the sensitivity to opportunity costs (Niv et al., 2007). While this appears consistent with our finding that amisulpride (under the assumption of postsynaptic effects) reduced the impact of delay on the starting bias, it is important to note that choosing delayed rewards did not involve any opportunity costs in our paradigm, given that participants could pursue other rewards during the waiting time. Thus, it needs to be clarified whether our findings for delayed rewards without experienced waiting time can be generalized to choice situations involving experienced opportunity costs.

To conclude, our findings may shed a new light on the role of dopamine in psychiatric disorders that are characterized by deficits in impulsiveness or cost-benefit weighting in general (Hasler, 2012), and where dopaminergic drugs belong to the standard treatments for deficits in value-related and other behavior. Dopaminergic manipulations yielded mixed results on impulsiveness in psychiatric and neurologic disorders (Acheson and de Wit, 2008; Antonelli et al., 2014; Foerde et al., 2016; Kayser et al., 2017), and our process model regarding the role of dopamine for delaying gratification explains some of the inconsistencies between empirical findings (on top of factors like non-linear dose-response relationships). As similarly inconsistent findings were observed also in the domains of risky and social decision making (Soutschek et al., 2022; Webber et al., 2021), the proposed process model may account for the function of dopamine in these domains of cost-benefit weighting as well. By deepening the understanding of the role of dopamine in decision making, our findings provide insights into how abnormal dopaminergic activation, and its pharmacological treatment, in psychiatric disorders may affect distinct aspects of decision making.

Materials and methods

Participants

D2 antagonist study

In a double-blind, randomized, within-subject design, 56 volunteers (27 female, Mage = 23.2 years, SDage = 3.1 years) received 400 mg amisulpride or placebo in two separate sessions (2 weeks apart) as described previously (Soutschek et al., 2017). Participants gave informed written consent before participation. The study was approved by the Cantonal ethics committee Zurich (2012-0568).

D1 agonist study

Detailed experimental procedures for the D1 experiment are reported in Soutschek et al., 2020a. A total of 120 participants (59 females, mean age = 22.57 years, range 18–28) received either placebo or one of three different doses (6, 15, 30 mg) of the D1 agonist PF-06412562 (between-subject design). The study was approved by the Cantonal ethics committee Zurich (2016-01693) and participants gave informed written consent prior to participation. The D1 agonist study was registered on ClinicalTrials.gov (identifier: NCT03181841).

Task design

In the D2 antagonist study, participants made intertemporal decisions 90 min after drug or placebo intake. We used a dynamic version of a delay discounting task in which the choice options were individually selected such that the information provided by each decision was optimized (dynamic experiments for estimating preferences; Toubia et al., 2013). On each trial, participants decided between an SS (reward magnitude 5–250 Swiss francs, delay 0–30 days) and an LL option (reward magnitude 15–300 Swiss francs, delay 3–90 days). Participants pressed the left or right arrow keys on a standard keyboard to choose the option presented on the left or right side of the screen. On each trial, the reward options were presented until participants made a choice. The next choice options were displayed after an intertrial interval of 1 s. Participants made a total of 20 choices between SS and LL options.

In the D1 agonist experiment, participants performed a task battery including an intertemporal decision task 5 hr after drug administration (the procedures and results for the other tasks are described in Soutschek et al., 2020b, and Soutschek et al., 2020b). In the intertemporal decision task, the magnitude of the immediate reward option varied between 0 and 16 Swiss francs (in steps of 2 Swiss francs), while for the LL option a fixed amount of 16 Swiss francs was delivered after a variable delay of 0–180 days. A total of 54 trials was administered where each combination of SS and LL reward options was presented once. SS and LL options were randomly presented on either the right or left screen side until a choice was made, and participants indicated their choices by pressing the right arrow key (for the option presented on the right side) or the left arrow key (for the option on the left side).

Statistical analysis

Drift diffusion modeling

We analyzed drug effects on intertemporal decision making with hierarchical Bayesian drift diffusion modeling using the JAGS software package (Plummer, 2003). JAGS utilizes Markov Chain Monte Carlo sampling for Bayesian estimation of drift diffusion parameters (drift rate ν, boundary α, bias ζ, and non-decision time τ) via the Wiener module (Wabersich and Vandekerckhove, 2014) on both the group and the participant level. In our models, the upper boundary (decision threshold) was associated with a choice of the LL option, the lower boundary with a choice of the SS option. A positive drift rate thus indicates evidence accumulation toward the LL option, a negative drift rate toward the SS option. We first describe how the models were set up for the D2 antagonist study. As we were interested in how dopamine modulates different subcomponents of the choice process, in DDM-1 we assumed that the drift rate v is influenced by the comparisons of reward magnitudes and delays between the SS and LL options (Amasino et al., 2019; Dai and Busemeyer, 2014):

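(1) $\nu' = \beta_1(\mathrm{Magnitude}_{diff}) + \beta_2(\mathrm{Drug} \times \mathrm{Magnitude}_{diff}) + \beta_3(\mathrm{Delay}_{diff}) + \beta_4(\mathrm{Drug} \times \mathrm{Delay}_{diff})$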

Magnitudediff indicates the difference between the reward magnitudes of the LL and SS options; Delaydiff indicates the difference between the corresponding delays. Both Magnitudediff and Delaydiff were z-transformed to render the size of the parameter estimates comparable (Amasino et al., 2019). Following previous procedures, we transformed ν′ with a sigmoidal link function, as this procedure explains observed behavior better than linear link functions (Fontanesi et al., 2019; Wagner et al., 2020). Indeed, the current data were also better explained by a DDM with (DIC = 9478) than without (DIC = 10,283) a sigmoidal link (where vmax indicates the upper and lower borders of the drift rate):

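(2) $\nu = \dfrac{2 \times \left(\beta_5(v_{max}) + \beta_6(\mathrm{Drug} \times v_{max})\right)}{1 + \exp(-\nu')} - \left(\beta_5(v_{max}) + \beta_6(\mathrm{Drug} \times v_{max})\right)$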
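
To make the two-step construction of the drift rate concrete, the following is a minimal Python sketch of Equations 1 and 2. It is an illustration only and not the authors' JAGS code: the function names, the 0/1 coding of Drug, the reading of β5 as the placebo value of vmax with β6 as its drug-related shift, and all numeric coefficients are assumptions made for this example.

```python
import math

def linear_drift(mag_diff_z, delay_diff_z, drug, b1, b2, b3, b4):
    """Equation 1: linear predictor nu' from z-scored attribute differences.

    drug is assumed to be coded 0 (placebo) or 1 (amisulpride).
    """
    return (b1 * mag_diff_z + b2 * drug * mag_diff_z
            + b3 * delay_diff_z + b4 * drug * delay_diff_z)

def sigmoid_link(nu_prime, drug, b5, b6):
    """Equation 2: squash nu' into the interval (-vmax, +vmax).

    2 * vmax / (1 + exp(-x)) - vmax is algebraically equal to
    vmax * tanh(x / 2); the tanh form avoids overflow for extreme x.
    """
    vmax = b5 + b6 * drug  # assumed reading of beta5 and beta6
    return vmax * math.tanh(nu_prime / 2.0)

# Hypothetical coefficients, chosen only to show the mechanics.
nu_prime = linear_drift(1.0, -0.5, drug=1, b1=0.8, b2=0.4, b3=-0.3, b4=0.1)
nu = sigmoid_link(nu_prime, drug=1, b5=2.0, b6=0.5)
print(nu_prime, nu)
```

The link leaves small values of ν′ nearly proportional to the linear predictor and saturates large ones at ±vmax, which is how vmax acts as the upper and lower border of the drift rate.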

Next, we assessed whether delay costs affect the starting bias parameter ζ, as assumed by proximity accounts (Soutschek et al., 2022; Westbrook and Frank, 2018):

(3) $\zeta = \beta_7(\mathrm{Intercept}) + \beta_8(\mathrm{Drug}) + \beta_9(\mathrm{Delay}_{diff}) + \beta_{10}(\mathrm{Drug} \times \mathrm{Delay}_{diff})$

We also investigated whether the drug affected the decision threshold parameter α (Equation 4) or the non-decision time τ (Equation 5):

(4) $\alpha = \beta_{11}(\mathrm{Intercept}) + \beta_{12}(\mathrm{Drug})$
(5) $\tau = \beta_{13}(\mathrm{Intercept}) + \beta_{14}(\mathrm{Drug})$
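
As an illustration of how the starting bias, threshold, and non-decision time enter the decision process, the sketch below simulates single diffusion trials in Python. It is not the authors' estimation procedure (which relied on the Wiener likelihood in JAGS); the Euler step size, the unit diffusion coefficient, the clipping of ζ, the function names, and all parameter values are assumptions for this example.

```python
import random

def starting_bias(delay_diff_z, drug, b7, b8, b9, b10):
    """Equation 3: relative starting point zeta between the two boundaries."""
    zeta = b7 + b8 * drug + b9 * delay_diff_z + b10 * drug * delay_diff_z
    # Clipping to (0, 1) is a safeguard added for this sketch only.
    return min(max(zeta, 0.01), 0.99)

def simulate_trial(nu, alpha, zeta, tau, rng, dt=0.001, max_t=10.0):
    """Simulate one trial; returns (choice, rt) or (None, None) on timeout.

    The upper boundary (alpha) codes an LL choice, the lower boundary (0)
    an SS choice. Evidence starts at zeta * alpha and the non-decision
    time tau is added to the decision time.
    """
    x = zeta * alpha
    t = 0.0
    sd = dt ** 0.5  # unit diffusion coefficient (assumed)
    while t < max_t:
        x += nu * dt + sd * rng.gauss(0.0, 1.0)
        t += dt
        if x >= alpha:
            return "LL", tau + t
        if x <= 0.0:
            return "SS", tau + t
    return None, None

rng = random.Random(2023)
zeta = starting_bias(delay_diff_z=1.0, drug=0, b7=0.55, b8=-0.02, b9=-0.05, b10=0.03)
trials = [simulate_trial(nu=1.0, alpha=2.0, zeta=zeta, tau=0.3, rng=rng) for _ in range(500)]
share_ll = sum(1 for choice, _ in trials if choice == "LL") / len(trials)
print(zeta, share_ll)
```

In such a simulation, moving ζ away from 0.5 changes choices most when the drift rate is weak, whereas the drift rate dominates when evidence is strong.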

As the experiment followed a within-subject design, we modeled all parameters both on the group level and on the individual level by assuming that individual parameter estimates are normally distributed around the mean group-level effect with a standard deviation λ (which was estimated separately for each group-level effect). We tested for significant effects by checking whether the 95% HDIs of the posterior samples of group-level estimates contained zero. Note that all statistical inferences were based on assessment of group-level estimates, as individual estimates might be less reliable due to the limited number of trials for each participant. We excluded the trials with the 2.5% fastest and 2.5% slowest response times to reduce the impact of outliers on parameter estimation (Amasino et al., 2019; Wagner et al., 2020). As priors, we assumed standard normal distributions for all group-level effects (with mean = 0 and standard deviation = 1) and gamma distributions for λ (Wagner et al., 2020). For model estimation, we computed two chains with 500,000 samples (burn-in = 450,000, thinning = 5). The R̂ statistic was used to assess model convergence in addition to visual inspection of chains. For all effects, R̂ was below 1.01, indicating model convergence.
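
The HDI-based significance criterion can be sketched as follows. This is a generic Python illustration of a highest density interval computed from posterior samples (the narrowest interval containing the requested share of samples); the text does not state how the authors computed their HDIs, so the function names and the algorithm are assumptions.

```python
import math

def hdi(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    s = sorted(samples)
    n = len(s)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[best], s[best + k - 1]

def excludes_zero(samples, mass=0.95):
    """True if the HDI lies entirely above or entirely below zero."""
    lower, upper = hdi(samples, mass)
    return lower > 0 or upper < 0

# Hypothetical posterior samples of a group-level drug effect.
posterior = [0.01 * i for i in range(1, 101)]
print(hdi(posterior), excludes_zero(posterior))
```

Under this criterion, an effect is treated as significant only when zero falls outside the interval returned by `hdi`.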

We compared DDM-1 also with alternative process models. DDM-2 was identical to DDM-1 but did not estimate starting bias as free parameter, assuming ζ=0.5 instead, whereas DDM-3 left out the influences of Magnitudediff and Delaydiff on the drift rate. DDM-4 assessed whether the starting bias is modulated by the sum of the delays (as measure of overall proximity, Delaysum) rather than Delaydiff. In DDM-5 we assumed that the drift rate depends on the comparison of the hyperbolically discounted subjective values of the two choice options rather than on the comparison of choice attributes (Konovalov and Krajbich, 2019). In particular, the drift rate ν’ (prior to being passed through the sigmoidal link function) was calculated with:

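(6) $\nu' = \dfrac{\text{LL reward magnitude}}{1 + (\beta_1 + \beta_2(\mathrm{Drug})) \times \text{LL delay}} - \dfrac{\text{SS reward magnitude}}{1 + (\beta_1 + \beta_2(\mathrm{Drug})) \times \text{SS delay}}$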

Here, β1 corresponds to the hyperbolic discount factor, which determines the hyperbolically discounted subjective values of the available choice options.

Finally, we considered a model without influence of Delaydiff on the starting bias but with separate non-decision times for rewards and delays. In more detail, DDM-6 included an additional parameter τdiff which indicated whether the accumulation process started earlier for delays than for rewards (τdiff > 0) or vice versa (τdiff < 0). For example, if τdiff > 0, evidence accumulation for delays starts directly after the non-decision time τ, whereas the accumulation process for reward magnitudes starts at τ + τdiff (and then influences the drift rate together with Delaydiff until the decision boundary is reached). A recent study showed that such time-varying drift rates can be calculated as follows (Lombardi and Hare, 2021):

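(7) $\nu' = \begin{cases} \beta_1(M_{diff}) + \beta_2(\mathrm{Drug} \times M_{diff}) & \text{if } \tau_{diff} < 0 \;\&\; \tau_{diff} + \tau < RT \le \tau \\ \beta_3(D_{diff}) + \beta_4(\mathrm{Drug} \times D_{diff}) & \text{if } \tau_{diff} > 0 \;\&\; RT \le \tau_{diff} + \tau \\ \dfrac{|\tau_{diff}|}{RT - \tau + |\tau_{diff}|} \times \left(\beta_1(M_{diff}) + \beta_2(\mathrm{Drug} \times M_{diff})\right) + \dfrac{RT - \tau}{RT - \tau + |\tau_{diff}|} \times \left(\beta_1(M_{diff}) + \beta_2(\mathrm{Drug} \times M_{diff}) + \beta_3(D_{diff}) + \beta_4(\mathrm{Drug} \times D_{diff})\right) & \text{if } \tau_{diff} < 0 \;\&\; \tau < RT \\ \dfrac{\tau_{diff}}{RT - \tau} \times \left(\beta_3(D_{diff}) + \beta_4(\mathrm{Drug} \times D_{diff})\right) + \dfrac{RT - \tau - \tau_{diff}}{RT - \tau} \times \left(\beta_1(M_{diff}) + \beta_2(\mathrm{Drug} \times M_{diff}) + \beta_3(D_{diff}) + \beta_4(\mathrm{Drug} \times D_{diff})\right) & \text{if } \tau_{diff} > 0 \;\&\; \tau_{diff} + \tau < RT \end{cases}$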

For the ease of reading, Magnitudediff and Delaydiff are abbreviated as Mdiff and Ddiff, respectively.

For the D1 agonist study, we computed the same DDMs as for the D2 antagonist study. However, because the D1 agonist experiment followed a between-subject design, we estimated separate group-level parameters for the four between-subject drug groups (placebo, 6, 15, 30 mg). We tested for significant group differences by computing the 95% HDI for the differences between the posterior samples of group-level estimates. For model estimation, we computed two chains with 100,000 samples (burn-in = 50,000, thinning = 5), which ensured that R̂ values for all group-level effects were below 1.01.

We compared model fits between the different DDMs with the deviance information criterion (DIC) as implemented in the rjags package. We note that JAGS does not allow computing more recently developed model selection criteria such as the Pareto smoothed importance sampling leave-one-out (PSIS-LOO) approach. However, a recent comparison of model selection approaches found that PSIS-LOO had a slightly higher false detection rate than DIC, but in general both PSIS-LOO and DIC led to converging conclusions (Lu et al., 2017). There is therefore good reason to assume that our findings were not biased by the employed model selection approach.

Posterior predictive checks and parameter recovery analyses

We performed posterior predictive checks to assess whether the DDMs explained key aspects of the empirical data. For this purpose, we simulated 1000 RT distributions based on the individual parameter estimates from all DDMs. We then binned trials into quintiles based on differences in reward magnitude and plotted the observed empirical data and the simulated data (averaged across the 1000 simulations) as a function of these bins, separately for each individual participant. We performed the same analysis by binning trials based on differences in delay instead of reward magnitude.
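
The binning step of the posterior predictive check can be sketched as below. This Python example only illustrates rank-based quintile binning and per-bin averaging for one participant; the function names, the tie handling (ties are split by their order in the data), and the toy data are assumptions, and the authors' own implementation may differ.

```python
def quintile_bins(values):
    """Assign each value to one of five rank-based bins (0 = lowest fifth)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        bins[idx] = min(4, rank * 5 // n)
    return bins

def mean_by_bin(outcomes, bins, n_bins=5):
    """Average an outcome (e.g., 1 = LL choice, or RT) within each bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for outcome, b in zip(outcomes, bins):
        sums[b] += outcome
        counts[b] += 1
    return [sums[b] / counts[b] if counts[b] else None for b in range(n_bins)]

# Toy data for one hypothetical participant: magnitude differences,
# observed LL choices, and LL choice rates averaged over simulations.
mag_diff = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
observed = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
simulated = [0.1, 0.2, 0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.9]

bins = quintile_bins(mag_diff)
print(mean_by_bin(observed, bins))
print(mean_by_bin(simulated, bins))
```

Plotting the two per-bin profiles against each other shows whether the simulated data reproduce how choices change with the binned attribute; the same functions apply when binning by differences in delay.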

We conducted a parameter recovery analysis by re-computing DDM-1 on 10 randomly selected data sets which were simulated based on the original DDM-1 parameters. We checked parameter recovery by assessing whether group-level parameters from the simulated data lie within the 95% HDI of the original parameter estimates.

Model-free analyses

We analyzed choice data also in a model-free manner and with a hyperbolic discounting model. In the model-free analysis of the D2 antagonist study, we regressed choices of LL vs. SS options on fixed-effect predictors for Drug, Magnitudediff, Delaydiff, and the interaction terms using Bayesian mixed generalized linear models (MGLMs) as implemented in the brms package in R (Bürkner, 2017). For the D1 agonist study, the same MGLM was used with the only difference that Drug (0, 6, 15, 30 mg) represented a between- rather than a within-subject factor. All predictors were also modeled as random slopes in addition to participant-specific random intercepts. Finally, the hyperbolic discounting model was fit using the hBayesDM toolbox (Ahn et al., 2017), using a standard hyperbolic discounting function:

(8) $SV_{discounted} = \dfrac{\text{reward magnitude}}{1 + k \times \text{delay}}$

To translate subjective value into choices, we fitted a standard softmax function to each participant’s choices:

(9) $P(\text{choice of LL option}) = \dfrac{1}{1 + e^{-\beta_{temp} \times (SV_{LL} - SV_{SS})}}$

We estimated parameters capturing the strength of hyperbolic discounting (k) and choice consistency (βtemp) separately for each participant and experimental session by computing two chains of 4000 iterations (burn-in = 2000). We then performed a Bayesian t-test on the log-transformed individual parameter estimates under placebo vs. amisulpride using the BEST package (Kruschke, 2013).
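
For readers who want to see Equations 8 and 9 at work, here is a small Python sketch that maps a pair of options onto the probability of choosing the LL option. It only evaluates the two formulas and does not reproduce the hBayesDM fitting routine; the function names and the example values for k and βtemp are hypothetical.

```python
import math

def discounted_value(magnitude, delay, k):
    """Equation 8: hyperbolically discounted subjective value."""
    return magnitude / (1.0 + k * delay)

def p_choose_ll(ll_mag, ll_delay, ss_mag, ss_delay, k, beta_temp):
    """Equation 9: softmax probability of choosing the LL option."""
    sv_ll = discounted_value(ll_mag, ll_delay, k)
    sv_ss = discounted_value(ss_mag, ss_delay, k)
    return 1.0 / (1.0 + math.exp(-beta_temp * (sv_ll - sv_ss)))

# Example: 100 francs in 30 days vs. 50 francs today, with assumed
# k = 0.05 per day and beta_temp = 0.1.
print(discounted_value(100, 30, 0.05))
print(p_choose_ll(100, 30, 50, 0, k=0.05, beta_temp=0.1))
```

With these illustrative values the delayed 100 francs are worth 40 francs, less than the immediate 50 francs, so the model predicts an LL choice with probability below 0.5. Larger values of βtemp make that preference more deterministic.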

Data availability

The data supporting the findings of this study and the data analysis code are available on Open Science Framework (https://osf.io/dp2me/).

The following data sets were generated
    1. Soutschek A (2023) Open Science Framework. ID dp2me. Intertemporal Choice_amisulpride.

References

  1. Plummer M (2003) JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. Paper presented at the Proceedings of the 3rd international workshop on distributed statistical computing.
  2. Schoemaker H, Claustre Y, Fage D, Rouquier L, Chergui K, Curet O, Oblin A, Gonon F, Carter C, Benavides J, Scatton B (1997) Neurochemical characteristics of amisulpride, an atypical dopamine D2/D3 receptor antagonist with both presynaptic and limbic selectivity. The Journal of Pharmacology and Experimental Therapeutics 280:83–97.

Decision letter

  1. Geoffrey Schoenbaum
    Reviewing Editor; National Institute on Drug Abuse, National Institutes of Health, United States
  2. Christian Büchel
    Senior Editor; University Medical Center Hamburg-Eppendorf, Germany
  3. Geoffrey Schoenbaum
    Reviewer; National Institute on Drug Abuse, National Institutes of Health, United States

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "A Process Model Account of the Role of Dopamine in Intertemporal Choice" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, including Geoffrey Schoenbaum as Reviewing Editor and Reviewer #1, and the evaluation has been overseen by Christian Büchel as the Senior Editor.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

Overall the reviewers thought the work had great merit and was well done. The conclusions are novel and provide potential resolution of conflicting data in the literature. However, there were a number of concerns raised.

Many of these suggestions are contained in each reviewer's comments; however, on discussion we reached a general consensus that there were three main areas that require some additional analysis and/or changes to the framing or interpretation.

1) The first is the generalization from a single task or paradigm – delay discounting – to broader conclusions about cost evaluation. Temporal discounting is obviously just one factor in value or costs. Comments or discussion of this and the likelihood or lack thereof that conclusions will generalize outside the realm of time would be appreciated.

2) The second is similar and it concerns limitations on the use of a receptor specific pharmacological agent. As indicated in the review, it is somewhat unclear if this impacts phasic or tonic functions of dopamine or both, and of course it leaves open the role of other receptor systems as well as the locus of action – pre vs post-synaptic. Acknowledging these limitations and commenting on whether there is any insight into them would also be an improvement.

3) Finally, it was suggested the authors might look for other similar studies and/or publicly available data that could be similarly analyzed. If so, applying the current model to data from a different type of cost/benefit choice task or to data from a discounting task where a different pharmacological agent was used would greatly expand the potential impact of the conclusions and perhaps allow points 1 and 2 to be directly addressed.

Reviewer #1 (Recommendations for the authors):

I have a few comments however.

One is that I wonder if the authors could frame the results more in terms of current proposed functions of dopamine. I gather this is nothing related to error signaling, learning functions, or other effects of phasic activity. That is, the current ideas apply – appropriately since the evidence is pharmacological – to the role of tonic dopamine levels? Can this be more clearly stated? What is the evidence that these factors are not affected?

Second the antagonist is D2 specific. This is not much mentioned. How important is this? What do studies of D1 or general antagonism find?

Third would the effects here be explained by the more general hypothesis that tonic dopamine relates to the value of time? I refer to a proposal made by Yael Niv among others. It seems to me that blocking dopamine would in effect lower the value of time, which would be expected to impact the measures described here, for example by making a subject more willing to wait across a delay? Could this idea be related to the findings?

Reviewer #2 (Recommendations for the authors):

Overall, I appreciated the detailed modeling work, and the thoughtful alternative constructions of the models. Given this, I wanted to see more of the data and results myself – more informative plots that show both main features of the behavior, and more detailed visual presentation of the actual posterior estimates themselves. I think it would also be useful to plot model comparisons – both globally and for individuals – these give a sense of how close different models are. It was hard to keep track of all the different parameters too so a table of the model parameters for each flavor DDM would be extremely useful for understanding and transparency.

I did not find the posterior predictive check particularly convincing – the authors say that the DDMs are a reasonable account, but from purely visual assessment I can see plenty of individual subjects in which the models (all of the variants) are not an unambiguously good approximation. Some kind of summary statistic of the posterior predictive check might be more interpretable?

Reviewer #3 (Recommendations for the authors):

It is possible that the Authors could gain some inferential leverage by examining the effect of drug on reaction times alone. If amisulpride increases post-synaptic dopamine signaling, we might see faster reaction times overall. We also might see reaction times speed when Participants are offered larger overall valued offer pairs and this effect might be larger on amisulpride versus placebo. In any case, it would be helpful for the Authors to report RT effects of the drug. What were the main effects of the drug on reaction times? What were the marginal effects of the drug on reaction times, controlling for differences in reward amount, delay, and choice, etc.?

It is also possible that there are trial-wise dynamics which may be informative. I am curious whether the Authors examined trial-number effects on the propensity to select the LL option. If the propensity either grew or shrank over trials, then it may be possible to test whether possible post-synaptic dopamine signaling amplified this bias towards the LL option when it was smaller versus larger.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "A Process Model Account of the Role of Dopamine in Intertemporal Choice" for further consideration by eLife. Your revised article has been evaluated by Christian Büchel (Senior Editor) and Geoffrey Schoenbaum (Reviewing Editor).

The manuscript has been improved but there are some remaining issues that need to be addressed. Specifically, Reviewer 3 disagrees with some of the interpretations being advanced. We wish to give you an opportunity to make any further modifications to your manuscript/rebuttal in light of these comments in a final revision.

Reviewer #1 (Recommendations for the authors):

Thanks for addressing all my concerns - no further suggestions!

Reviewer #2 (Recommendations for the authors):

The amendments to the paper address all my previous comments and have improved the manuscript overall.

Reviewer #3 (Recommendations for the authors):

The Authors have been responsive to my prior comments. The test of the model in a sample with D1 agonists is welcome and informative – if disappointingly null. In general, I think the Authors' edits are suitable.

I disagree with the Authors' response about how the drug effects on the starting point bias and the effects of costs should be interpreted. I still think the data tell a more consistent story, and coincidentally one that is also more consistent with the Westbrook and Frank (2018) hypothesis, if interpreted under the assumption of pre- rather than post-synaptic effects.

Before giving my reasoning, I want to be clear that I think the Authors' interpretation is plausible under narrower assumptions. Also, it is good that they have not ruled out the possibility of pre-synaptic drug effects. If anything, I think they could make a bit more clear in the discussion that an alternative, presynaptic account is plausible (and why this would also support DA-proximity interactions). Ultimately, though, I think this is up to the Authors' discretion. As such, I wanted to articulate my reasoning here in case I might persuade them to reconsider the inferences they make.

There are two points on which I disagree with the Authors' response:

1) the effect of amisulpride on the magnitude effect and 2) drug effects on the DDM starting point bias.

Regarding (1): contrary to the Authors' take, Figure 2B indicates that amisulpride vs. placebo increased the effect of reward magnitude on choice, which supports a pre-synaptic effect. 2C further supports this pre-synaptic account because the drift rate also appears to steepen (as a function of differences in reward magnitude) on amisulpride versus placebo. That is – both figures support a larger reward magnitude/benefit effect on choice, consistent with pre-synaptic amisulpride action, yielding stronger post-synaptic dopamine signaling. In contrast, there is no reliable interaction between drug and delay (Figure 2E) which contradicts the Authors' assertions that D2R activation is implementing a "cost control".

Regarding (2): I'm also inclined to disagree with the Authors' interpretation of the drug effect on the DDM starting bias. While an overall starting bias, and its variation by delay are reduced on amisulpride, the key theoretical question is whether amisulpride alters the proximity bias (stemming from the differences in delay), not the DDM starting bias (though, theoretically, the proximity bias may be reflected in a DDM starting bias).

As noted in my prior review, the drug effect on the DDM starting point may reflect either a reduction of a tendency to select the LL option, or an increase in a proximity bias (arising from differences in delay), or both. I think we are in agreement that all three are possible and that it is not possible to discern between these accounts. Evidence that a proximity bias exists includes that the DDM starting point shifts toward the SS threshold as the differences in delay increase (Figure 2K, on placebo).

The original theory proposes that greater DA signaling increases the SS proximity bias (which predicts a shift in the DDM starting point towards the SS boundary). Contrary to the Authors' claims, there is a main effect of the drug, shifting the DDM starting point towards the SS boundary (Figure 2H). Again, this could be explained by a weakening of the tendency to select the LL option, a strengthening of the proximity bias towards the SS option, or both. Any of these effects could explain Figure 2H. Importantly, however, Figure 2H is inconsistent with the interpretation that "amisulpride weakens the proximity advantage of SS over LL rewards".

Regarding Figure 2K, it is unclear why presynaptic amisulpride binding would strengthen the proximity bias mostly for smaller differences in delay. Nevertheless, the fact that the drug caused a shift in the starting point towards the SS option more for small versus large differences in delay does not contradict the hypothesis that amisulpride pre-synaptically causes a shift in the starting point towards SS rewards. It merely implies that the effects were largest when delay differences were smallest. Perhaps this is true because there is a ceiling effect on how much the proximity bias can influence choice in this paradigm, and it's already maxed out (on placebo) for large delay differences such that the drug can only amplify the proximity bias for small delay differences.

https://doi.org/10.7554/eLife.83734.sa1

Author response

Essential revisions:

Overall the reviewers thought the work had great merit and was well done. The conclusions are novel and provide potential resolution of conflicting data in the literature. However there were a number of concerns raised.

Many of these suggestions are contained in each reviewer's comments; however, on discussion we reached a general consensus that there were three main areas that require some additional analysis and/or changes to the framing or interpretation.

We thank the reviewers for the positive evaluation of our study. Revising the manuscript according to the reviewers’ comments further improved the quality of the manuscript.

1) The first is the generalization from a single task or paradigm – delay discounting – to broader conclusions about cost evaluation. Temporal discounting is obviously just one factor in value or costs. Comments or discussion of this and the likelihood or lack thereof that conclusions will generalize outside the realm of time would be appreciated.

We agree that temporal delay is just one cost factor influencing value. In the revised manuscript, we clarify that one must be cautious with generalizing our findings for delay discounting to other cost types (for details, see responses to reviewers 1 and 2).

2) The second is similar and it concerns limitations on the use of a receptor specific pharmacological agent. As indicated in the review, it is somewhat unclear if this impacts phasic or tonic functions of dopamine or both, and of course it leaves open the role of other receptor systems as well as the locus of action – pre vs post-synaptic. Acknowledging these limitations and commenting on whether there is any insight into them would also be an improvement.

We added a discussion on whether the observed amisulpride effects can best be explained by presynaptic or postsynaptic effects. We clarify that amisulpride is thought to modulate D2R-mediated tonic dopamine levels, but it may also affect 5-HT7 receptors (see response to reviewer 2).

3) Finally, it was suggested the authors might look for other similar studies and/or publicly available data that could be similarly analyzed. If so, applying the current model to data from a different type of cost/benefit choice task or to data from a discounting task where a different pharmacological agent was used would greatly expand the potential impact of the conclusions and perhaps allow points 1 and 2 to be directly addressed.

We thank the reviewers for this interesting suggestion! In the revised manuscript, we now report data from a study on the impact of a D1 agonist on intertemporal choice. Conducting the same analyses on this data set as for the D2 antagonist study yielded no significant effects of D1R stimulation on drift diffusion parameters. Based on these findings, it seems more likely that the impact of amisulpride on the choice process is moderated by effects on D2R rather than D1R activation.

Reviewer #1 (Recommendations for the authors):

I have a few comments however.

One is that I wonder if the authors could frame the results more in terms of current proposed functions of dopamine. I gather this is nothing related to error signaling, learning functions, or other effects of phasic activity. That is, the current ideas apply – appropriately since the evidence is pharmacological – to the role of tonic dopamine levels? Can this be more clearly stated? What is the evidence that these factors are not affected?

In the revised manuscript, we now clarify that amisulpride is thought to affect mainly tonic dopaminergic activity, given the half-life of amisulpride (about 12 hours) and given that D2 receptors are more susceptible to slow, tonic changes in dopamine concentrations rather than to fast, phasic dopaminergic bursts. Tonic dopaminergic activity is thought to encode background reward rates, in contrast to the learning processes mediated by phasic dopamine (as described by the reviewer). We now clarify this on p.5:

“D1Rs are prevalent in the direct “Go” pathway and facilitate action selection via mediating the impact of phasic bursts elicited by high above-average rewards (Evers, Stiers, and Ramaekers, 2017; Kirschner, Rabinowitz, Singer, and Dagher, 2020). D2Rs, in contrast, dominate the indirect “Nogo” pathway (which suppresses action) and are more sensitive to small concentration differences in tonic dopamine levels (Missale, Nash, Robinson, Jaber, and Caron, 1998), which is thought to encode the background, average reward rate (Kirschner et al., 2020; Volkow and Baler, 2015; Westbrook and Frank, 2018; Westbrook et al., 2020).”

Second the antagonist is D2 specific. This is not much mentioned. How important is this? What do studies of D1 or general antagonism find?

We agree that the receptor specificity of our findings is an important issue. While we are not aware of a publicly available dataset using a D1R antagonist, we now report findings from a D1 agonist study (results from this study were already published in Soutschek et al., 2020, Biological Psychiatry). A re-analysis of this data set (i.e., using the same models as for the D2 antagonist study) provided no evidence for an influence of D1R stimulation on drift diffusion parameters. It thus seems more likely that the amisulpride effects on the decision process are moderated via D2R rather than D1R activity.

The findings for the D1 agonist study are reported in the revised manuscript on p.16:

“To assess the receptor specificity of our findings, we conducted the same analyses on the data from a study (published previously in Soutschek et al. (2020)) testing the impact of three doses of a D1 agonist (6 mg, 15 mg, 30 mg) relative to placebo on intertemporal choices (between-subject design). In the intertemporal choice task used in this experiment, the SS reward was always immediately available (delay = 0), contrary to the task in the D2 experiment where the delay of the SS reward varied from 0-30 days. Again, the data in the D1 experiment were best explained by DDM-1 (DICDDM-1 = 19,657) compared with all other DDMs (DICDDM-2 = 20,934; DICDDM-3 = 21,710; DICDDM-5 = 21,982; DICDDM-6 = 19,660; note that DDM-4 was identical with DDM-1 for the D1 agonist study because the delay of the SS reward was 0). Neither the best-fitting nor any other model yielded significant drug effects on any drift diffusion parameter (see Table 4 for the best-fitting model). Also model-free analyses conducted in the same way as for the D2 antagonist study revealed no significant drug effects (all HDI95% included zero). There was thus no evidence for any influence of D1R stimulation on intertemporal decisions.”
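The model comparison quoted above can be read as differences in DIC relative to the best-fitting model. The following minimal sketch uses only the DIC values reported in that passage; the variable names are illustrative, and the rule that a lower DIC indicates a better fit is the standard convention rather than something derived here.

```python
# DIC values for the D1 agonist data set, copied from the quoted passage.
# DDM-4 is omitted because it was identical to DDM-1 in this data set.
dic = {
    "DDM-1": 19657,
    "DDM-2": 20934,
    "DDM-3": 21710,
    "DDM-5": 21982,
    "DDM-6": 19660,
}

# Lower DIC indicates a better trade-off between fit and complexity.
best_model = min(dic, key=dic.get)

# Difference of each model's DIC from the best-fitting model.
delta_dic = {name: value - dic[best_model] for name, value in dic.items()}

for name, delta in sorted(delta_dic.items(), key=lambda item: item[1]):
    print(f"{name}: DIC = {dic[name]}, delta = {delta}")
```

On these numbers, DDM-6 differs from DDM-1 by only 3 DIC points, whereas the remaining models differ by more than 1,000 points.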

We added a discussion of the receptor specificity of our findings on p.17:

“This finding represents first evidence for the hypothesis that tonic dopamine moderates the impact of proximity (e.g., more concrete versus more abstract rewards) on cost-benefit decision making (Soutschek, Jetter, and Tobler, 2022; Westbrook and Frank, 2018). Pharmacological manipulation of D1R activation, in contrast, showed no significant effects on the decision process. This provides evidence for the receptor specificity of dopamine’s role in intertemporal decision making (though as caveat it is worth keeping the differences between the tasks administered in the D1 and the D2 studies in mind).”

Third would the effects here be explained by the more general hypothesis that tonic dopamine relates to the value of time? I refer to a proposal made by Yael Niv among others. It seems to me that blocking dopamine would in effect lower the value of time, which would be expected to impact the measures described here, for example by making a subject more willing to wait across a delay? Could this idea be related to the findings?

We agree that it is important to relate our current findings to the hypothesis that dopamine encodes the opportunity costs of actions (Niv et al., 2007). Opportunity costs occur in tasks where participants actually experience the waiting time for a reward and cannot engage in other goal-directed behaviors during this waiting time. The intertemporal decision task we used in our study is virtually free from any opportunity costs (other than those associated with taking part in the experiment), as participants could engage in other behaviors while waiting for the delivery of delayed rewards. Thus, while our finding that D2R blockade reduces the impact of delay costs on the starting bias appears to be consistent with Niv’s hypothesis that blocking dopamine lowers the sensitivity to opportunity (waiting) costs, future studies may have to test whether dopamine also modulates the impact of experienced waiting costs on the starting bias. We discuss this issue on p.22:

“Dopamine has also been ascribed a role of moderating opportunity costs, with lower tonic dopamine reducing the sensitivity to opportunity costs (Niv, Daw, Joel, and Dayan, 2007). While this appears consistent with our finding that amisulpride (under the assumption of postsynaptic effects) reduced the impact of delay on the starting bias, it is important to note that choosing delayed rewards did not involve any opportunity costs in our paradigm, given that participants could pursue other rewards during the waiting time. Thus, it needs to be clarified whether our findings for delayed rewards without experienced waiting time can be generalized to choice situations involving experienced opportunity costs.”

Reviewer #2 (Recommendations for the authors):

Overall, I appreciated the detailed modeling work, and the thoughtful alternative constructions of the models. Given this, I wanted to see more of the data and results myself – more informative plots that show both main features of the behavior, and more detailed visual presentation of the actual posterior estimates themselves. I think it would also be useful to plot model comparisons – both globally and for individuals – these give a sense of how close different models are. It was hard to keep track of all the different parameters too so a table of the model parameters for each flavor DDM would be extremely useful for understanding and transparency.

We added further plots visualizing the main features of behavior as well as modelling results. We added plots showing model-free results (Figure 1C/D) as well as the posterior parameter distributions for the influences of reward and delay on the drift rate or the starting bias (Figure 2). We also added a figure illustrating the model comparisons (Figure 3A). Lastly, we added a table providing an overview of the parameters included in the different DDMs (Table 3).

I did not find the posterior predictive check particularly convincing – the authors say that the DDMs are a reasonable account, but from purely visual assessment I can see plenty of individual subjects in which the models (all of the variants) are not an unambiguously good approximation. Some kind of summary statistic of the posterior predictive check might be more interpretable?

We followed the reviewer’s recommendation and added summary statistics for the posterior predictive checks by plotting the empirically observed and simulated reaction times collapsed across all participants (Figure 3B/C). The summary statistics reveal that at least DDMs 1-3 provide good explanations for the empirically observed data, whereas DDMs 4-6 more strongly deviated from the observed behavior.

Reviewer #3 (Recommendations for the authors):

It is possible that the Authors could gain some inferential leverage by examining the effect of drug on reaction times alone. If amisulpride increases post-synaptic dopamine signaling, we might see faster reaction times overall. We also might see reaction times speed when Participants are offered larger overall valued offer pairs and this effect might be larger on amisulpride versus placebo. In any case, it would be helpful for the Authors to report RT effects of the drug. What were the main effects of the drug on reaction times? What were the marginal effects of the drug on reaction times, controlling for differences in reward amount, delay, and choice, etc.?

We thank the reviewer for this interesting suggestion. We conducted a Bayesian mixed model regressing log-transformed reaction times on predictors for Drug, Choice, Magnitudediff, Delaydiff, and the sum of the reward magnitudes (Magnitudesum). As predicted by the reviewer, reaction times were faster if offers included larger overall reward magnitudes, HDImean = -0.12, HDI95% = [-0.18; -0.06], but neither this nor any other effect was significantly modulated by amisulpride (all HDI95% include zero). The decision time data therefore do not allow resolving the question of whether dopamine showed presynaptic (which would predict faster RTs) or postsynaptic (resulting in slower RTs) mechanisms of action in the current study. We now report the reaction time analysis in the revised manuscript on p.6-7:

“We also tested for drug effects on decision times by re-computing the MGLM reported above on log-transformed decision times, adding predictors for choice (SS versus LL option) and Magnitudesum (combined magnitudes of SS and LL rewards). Participants made faster decisions the higher the sum of the two rewards, HDImean = -0.12, HDI95% = [-0.18; -0.06], however we observed no significant drug effects on decision times.”

It is also possible that there are trial-wise dynamics which may be informative. I am curious whether the Authors examined trial-number effects on the propensity to select the LL option. If the propensity either grew or shrank over trials, then it may be possible to test whether possible post-synaptic dopamine signaling amplified this bias towards the LL option when it was smaller versus larger.

Following the reviewer’s recommendation, we added trial number (z-transformed) as an additional factor to the Bayesian mixed model on choices. Participants indeed made more LL choices with increasing trial number, 95% HDI = [0.19; 0.99], but there was no evidence that amisulpride moderated this influence of trial number on choices, 95% HDI = [-0.61; 0.48]. We report this analysis in the revised manuscript on p.6:

“When we explored whether dopaminergic effects changed over the course of the experiment, we observed a significant main effect of trial number (more LL choices over time), HDImean = 0.58, HDI95% = [0.19; 0.99]. However, this effect was unaffected by the pharmacological manipulation, HDImean = -0.06, HDI95% = [-0.61; 0.48].”
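The inference rule used throughout this response (an effect counts as significant when its 95% HDI excludes zero) can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the posterior draws are hypothetical normal samples whose means and spreads only loosely mimic the two intervals reported above, and the function names `hdi` and `excludes_zero` are invented for this example.

```python
import math
import random


def hdi(samples, mass=0.95):
    """Return the narrowest interval that contains `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = math.ceil(mass * n)  # number of samples inside the interval
    # Slide a window with a fixed sample count and keep the narrowest one.
    best_start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[best_start], ordered[best_start + window - 1]


def excludes_zero(interval):
    """True if the whole interval lies on one side of zero."""
    low, high = interval
    return low > 0 or high < 0


# Hypothetical posterior draws (NOT the study's posteriors).
rng = random.Random(1)
main_effect = [rng.gauss(0.58, 0.20) for _ in range(4000)]   # trial number
interaction = [rng.gauss(-0.06, 0.28) for _ in range(4000)]  # drug x trial number

for label, draws in (("trial number", main_effect),
                     ("drug x trial number", interaction)):
    interval = hdi(draws)
    print(label, interval, "excludes zero:", excludes_zero(interval))
```

Applied to the intervals reported above, the rule treats [0.19; 0.99] as evidence for an effect and [-0.61; 0.48] as no evidence for a drug modulation.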

References:

Beeler, J. A., and Mourra, D. (2018). To do or not to do: dopamine, affordability and the economics of opportunity. Frontiers in Integrative Neuroscience, 12, 6.

Evers, E., Stiers, P., and Ramaekers, J. (2017). High reward expectancy during methylphenidate depresses the dopaminergic response to gain and loss. Soc Cogn Affect Neurosci, 12(2), 311-318.

Frank, M. J., and O'Reilly, R. C. (2006). A mechanistic account of striatal dopamine function in human cognition: psychopharmacological studies with cabergoline and haloperidol. Behavioral Neuroscience, 120(3), 497.

Kirschner, M., Rabinowitz, A., Singer, N., and Dagher, A. (2020). From apathy to addiction: Insights from neurology and psychiatry. Progress in Neuro-Psychopharmacology and Biological Psychiatry, 101, 109926.

Missale, C., Nash, S. R., Robinson, S. W., Jaber, M., and Caron, M. G. (1998). Dopamine receptors: from structure to function. Physiological Reviews, 78(1), 189-225.

Niv, Y., Daw, N. D., Joel, D., and Dayan, P. (2007). Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology (Berl), 191(3), 507-520. doi:10.1007/s00213-006-0502-4

Salamone, J. D., and Correa, M. (2012). The mysterious motivational functions of mesolimbic dopamine. Neuron, 76(3), 470-485. doi:10.1016/j.neuron.2012.10.021

Schoemaker, H., Claustre, Y., Fage, D., Rouquier, L., Chergui, K., Curet, O.,... Benavides, J. (1997). Neurochemical characteristics of amisulpride, an atypical dopamine D2/D3 receptor antagonist with both presynaptic and limbic selectivity. Journal of Pharmacology and Experimental Therapeutics, 280(1), 83-97.

Schultz, W. (2015). Neuronal Reward and Decision Signals: From Theories to Data. Physiol Rev, 95(3), 853-951. doi:10.1152/physrev.00023.2014

Soutschek, A., Gvozdanovic, G., Kozak, R., Duvvuri, S., de Martinis, N., Harel, B.,... Tobler, P. N. (2020). Dopaminergic D1 Receptor Stimulation Affects Effort and Risk Preferences. Biol Psychiatry, 87(7), 678-685. doi:10.1016/j.biopsych.2019.09.002

Soutschek, A., Jetter, A., and Tobler, P. N. (2022). Towards a Unifying Account of Dopamine’s Role in Cost-Benefit Decision Making. Biological Psychiatry Global Open Science.

Volkow, N. D., and Baler, R. D. (2015). NOW vs LATER brain circuits: implications for obesity and addiction. Trends in Neurosciences, 38(6), 345-352.

Westbrook, A., and Frank, M. (2018). Dopamine and Proximity in Motivation and Cognitive Control. Curr Opin Behav Sci, 22, 28-34. doi:10.1016/j.cobeha.2017.12.011

Westbrook, A., van den Bosch, R., Maatta, J. I., Hofmans, L., Papadopetraki, D., Cools, R., and Frank, M. J. (2020). Dopamine promotes cognitive effort by biasing the benefits versus costs of cognitive work. Science, 367(6484), 1362-1366. doi:10.1126/science.aaz5891

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Reviewer #3 (Recommendations for the authors):

The Authors have been responsive to my prior comments. The test of the model in a sample with D1 agonists is welcome and informative – if disappointingly null. In general, I think the Authors' edits are suitable.

I disagree with the Authors' response about how the drug effects on the starting point bias and the effects of costs should be interpreted. I still think the data tell a more consistent story, and coincidentally one that is also more consistent with the Westbrook and Frank (2018) hypothesis, if interpreted under the assumption of pre- rather than post-synaptic effects.

Before giving my reasoning, I want to be clear that I think the Authors' interpretation is plausible under narrower assumptions. Also, it is good that they have not ruled out the possibility of pre-synaptic drug effects. If anything, I think they could make a bit more clear in the discussion that an alternative, presynaptic account is plausible (and why this would also support DA-proximity interactions). Ultimately, though, I think this is up to the Authors' discretion. As such, I wanted to articulate my reasoning here in case I might persuade them to reconsider the inferences they make.

There are two points on which I disagree with the Authors' response:

1) the effect of amisulpride on the magnitude effect and 2) drug effects on the DDM starting point bias.

Regarding (1): contrary to the Authors' take, Figure 2B indicates that amisulpride vs. placebo increased the effect of reward magnitude on choice, which supports a pre-synaptic effect. 2C further supports this pre-synaptic account because the drift rate also appears to steepen (as a function of differences in reward magnitude) on amisulpride versus placebo. That is – both figures support a larger reward magnitude/benefit effect on choice, consistent with pre-synaptic amisulpride action, yielding stronger post-synaptic dopamine signaling. In contrast, there is no reliable interaction between drug and delay (Figure 2E) which contradicts the Authors' assertions that D2R activation is implementing a "cost control".

Regarding (2): I'm also inclined to disagree with the Authors' interpretation of the drug effect on the DDM starting bias. While an overall starting bias, and its variation by delay are reduced on amisulpride, the key theoretical question is whether amisulpride alters the proximity bias (stemming from the differences in delay), not the DDM starting bias (though, theoretically, the proximity bias may be reflected in a DDM starting bias).

As noted in my prior review, the drug effect on the DDM starting point may reflect either a reduction of a tendency to select the LL option, or an increase in a proximity bias (arising from differences in delay), or both. I think we are in agreement that all three are possible and that it is not possible to discern between these accounts. Evidence that a proximity bias exists includes that the DDM starting point shifts toward the SS threshold as the differences in delay increase (Figure 2K, on placebo).

The original theory proposes that greater DA signaling increases the SS proximity bias (which predicts a shift in the DDM starting point towards the SS boundary). Contrary to the Authors' claims, there is a main effect of the drug, shifting the DDM starting point towards the SS boundary (Figure 2H). Again, this could be explained by a weakening of the tendency to select the LL option, a strengthening of the proximity bias towards the SS option, or both. Any of these effects could explain Figure 2H. Importantly, however, Figure 2H is inconsistent with the interpretation that "amisulpride weakens the proximity advantage of SS over LL rewards".

Regarding Figure 2K, it is unclear why presynaptic amisulpride binding would strengthen the proximity bias mostly for smaller differences in delay. Nevertheless, the fact that the drug caused a shift in the starting point towards the SS option more for small versus large differences in delay does not contradict the hypothesis that amisulpride pre-synaptically causes a shift in the starting point towards SS rewards. It merely implies that the effects were largest when delay differences were smallest. Perhaps this is true because there is a ceiling effect on how much the proximity bias can influence choice in this paradigm, and it's already maxed out (on placebo) for large delay differences such that the drug can only amplify the proximity bias for small delay differences.

We agree with the reviewer that, depending on the underlying assumptions, the current result pattern can be explained by both postsynaptic and presynaptic mechanisms. Accordingly, we added a more balanced discussion of the arguments in favor of pre- versus postsynaptic effects.

Regarding the drug effects on reward processing, we agree with the reviewer that amisulpride leads to a steeper slope in the relationship between reward magnitudes and the drift rate. This means that for larger differences between reward magnitudes, amisulpride speeds up evidence accumulation towards larger-later choices (which is consistent with presynaptic effects), whereas for smaller differences between reward magnitudes amisulpride promotes choices of SS rewards (as the drift rate becomes more negative under amisulpride compared with placebo). We believe that the latter effect is difficult to reconcile with presynaptic effects, as it is commonly assumed that higher dopamine levels (reflecting presumed presynaptic effects) should strengthen the preference for larger rewards. However, the opposite seems to be true for small reward differences: the more negative drift rate under amisulpride suggests that for such offers amisulpride increases the preference for smaller-sooner over larger-later rewards. An alternative interpretation of the result pattern therefore is that under amisulpride participants more clearly distinguish between rewards that are worth the waiting costs (large differences between reward magnitudes) and rewards that are not worth tolerating the same waiting costs (if the larger (later) reward is only minimally larger than the smaller (sooner) reward). This in turn is consistent with the idea of a stricter cost control according to the account of Beeler and Mourra (2018), speaking in favor of reduced dopaminergic activity as a result of postsynaptic effects. In the revised manuscript, we explain this reasoning in more detail, but we emphasize that for larger differences in reward magnitudes the effect could in principle also be explained by presynaptic effects (p.20-21):

“At first glance, the stronger influence of differences in reward magnitude on drift rates under amisulpride compared with placebo might therefore speak in favor of presynaptic (higher dopamine levels) rather than postsynaptic mechanisms of action in the current study. However, amisulpride versus placebo increased evidence accumulation towards LL rewards (more positive drift rate) only for larger differences between larger (later) and smaller (sooner) rewards, whereas for smaller reward differences amisulpride enhanced evidence accumulation towards SS choices (more negative drift rate; see Figure 2C). The latter finding appears inconsistent with presynaptic effects, as higher dopamine levels are thought to increase the preference for costly larger rewards (Webber et al., 2020). Instead, the stronger influence of reward differences on drift rates under amisulpride could be explained by a stricter cost control (Beeler and Mourra, 2018). In this interpretation, individuals more strongly distinguish between larger rewards that are worth the waiting costs (large difference between LL and SS rewards) and larger rewards that are not worth the same waiting costs (small difference between LL and SS rewards). While this speaks in favor of postsynaptic effects, we acknowledge that the amisulpride effects for larger reward differences are compatible with presynaptic mechanisms.”
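The crossover pattern described in this passage can be made concrete with a small sketch. It assumes, purely for illustration, that the drift rate is a linear function of a standardized reward-magnitude difference; the intercepts and slopes below are hypothetical and are not estimates from the study.

```python
# Hypothetical coefficients (NOT estimates from the study): the drift rate is
# modeled as a linear function of the standardized reward-magnitude difference.
PLACEBO = {"intercept": -0.2, "slope": 0.5}
AMISULPRIDE = {"intercept": -0.5, "slope": 0.9}  # steeper slope under the drug


def drift_rate(params, magnitude_diff):
    """Positive values favor the LL option, negative values the SS option."""
    return params["intercept"] + params["slope"] * magnitude_diff


def crossover(a, b):
    """Magnitude difference at which two linear drift functions are equal."""
    return (a["intercept"] - b["intercept"]) / (b["slope"] - a["slope"])


x_cross = crossover(PLACEBO, AMISULPRIDE)

for diff in (0.25, x_cross, 2.0):
    v_placebo = drift_rate(PLACEBO, diff)
    v_drug = drift_rate(AMISULPRIDE, diff)
    print(f"diff = {diff:.2f}: placebo = {v_placebo:+.3f}, amisulpride = {v_drug:+.3f}")
```

Below the crossover point the steeper line yields the more negative drift rate (faster accumulation towards SS), and above it the more positive drift rate (faster accumulation towards LL), which is the qualitative pattern the passage attributes to Figure 2C.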

Regarding amisulpride’s influence on the starting bias, the reviewer notes that some aspects of the data are difficult to explain via presynaptic effects: “it is unclear why presynaptic amisulpride binding would strengthen the proximity bias mostly for smaller differences in delay”. In principle though, presynaptic mechanisms could be reconciled with the current results under the additional assumption that proximity effects (at least in the current paradigm) are stronger for smaller than for larger differences in delay. We agree with this view and now clarify in the manuscript that presynaptic effects might also explain the result pattern under certain assumptions. Moreover, we reformulated our previous statement that “amisulpride weakens the proximity advantage of SS over LL rewards”. The discussion of amisulpride effects on the starting bias now reads as follows (p.21):

“The result pattern for the starting bias parameter, in turn, suggests the presence of two distinct response biases, reflected by the intercept and the delay-dependent slope of the bias parameter (see Figure 2K), which are both under dopaminergic control but in opposite directions. First, participants seem to have a general bias towards the LL option in the current task (intercept), which is reduced under amisulpride compared with placebo, consistent with the assumption that dopamine strengthens the preference for larger rewards (Beeler and Mourra, 2018; Salamone and Correa, 2012; Schultz, 2015). Second, amisulpride reduced the impact of increasing differences in delay on the starting bias, as predicted by the proximity account of tonic dopamine (Westbrook and Frank, 2018). Both of these effects are compatible with postsynaptic effects of amisulpride. However, we note that in principle one might make the assumption that proximity effects are stronger for smaller than for larger differences in delay, and under this assumption the results would be consistent with presynaptic effects.”
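The two response biases described in this passage (intercept and delay-dependent slope) can be sketched in the same spirit. The sketch assumes a relative starting point between 0 and 1, with values above 0.5 lying closer to the LL boundary, and it assumes that larger delay differences shift the starting point towards SS under placebo. The direction of that slope, the function name `starting_bias`, and all numbers are illustrative assumptions, not fitted parameters from the study.

```python
def starting_bias(delay_diff, intercept, slope):
    """Relative starting point; values above 0.5 start closer to the LL boundary."""
    return intercept + slope * delay_diff


# Hypothetical coefficients, NOT fitted values from the study.
# Assumption: longer delay differences move the starting point towards SS (negative slope).
PLACEBO = {"intercept": 0.60, "slope": -0.002}
AMISULPRIDE = {"intercept": 0.56, "slope": -0.001}  # lower intercept, weaker delay effect

if __name__ == "__main__":
    for delay in (0, 20, 80):
        z_pla = starting_bias(delay, **PLACEBO)
        z_ami = starting_bias(delay, **AMISULPRIDE)
        print(f"delay difference {delay}: placebo {z_pla:.2f}, amisulpride {z_ami:.2f}")
```

Under these assumptions, amisulpride lowers the general LL bias (intercept) while also flattening the delay-dependent slope, so the drug shifts the starting point towards SS mainly for small delay differences and away from SS for large ones.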

https://doi.org/10.7554/eLife.83734.sa2

Article and author information

Author details

  1. Alexander Soutschek

    Department of Psychology, Ludwig Maximilian University Munich, Munich, Germany
    Contribution
    Conceptualization, Formal analysis, Investigation, Visualization, Methodology, Writing - original draft
    For correspondence
    alexander.soutschek@psy.lmu.de
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-8438-7721
  2. Philippe N Tobler

    1. Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zürich, Switzerland
    2. Neuroscience Center Zurich, University of Zurich, Swiss Federal Institute of Technology Zurich, Zurich, Switzerland
    Contribution
    Conceptualization, Supervision, Funding acquisition, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4915-9448

Funding

Deutsche Forschungsgemeinschaft (SO 1636/2-1)

  • Alexander Soutschek

Swiss National Science Foundation (100019_176016)

  • Philippe N Tobler

Velux Stiftung (981)

  • Philippe N Tobler

Swiss National Science Foundation (100014_165884)

  • Philippe N Tobler

Swiss National Science Foundation (CRSII5_177277)

  • Philippe N Tobler

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

PNT received funding from the Swiss National Science Foundation (Grants 100019_176016, 100014_165884, and CRSII5_177277) and from the Velux Foundation (Grant 981). AS received an Emmy Noether fellowship (SO 1636/2-1) from the German Research Foundation.

Ethics

Clinical trial registration: The D1 agonist study was registered on ClinicalTrials.gov (identifier: NCT03181841).

Human subjects: Participants gave informed written consent before participation. The Cantonal ethics committee Zurich approved both the D2 antagonist study (2012-0568) and the D1 agonist study (2016-01693).

Senior Editor

  1. Christian Büchel, University Medical Center Hamburg-Eppendorf, Germany

Reviewing Editor

  1. Geoffrey Schoenbaum, National Institute on Drug Abuse, National Institutes of Health, United States

Reviewer

  1. Geoffrey Schoenbaum, National Institute on Drug Abuse, National Institutes of Health, United States

Version history

  1. Received: September 27, 2022
  2. Preprint posted: October 25, 2022
  3. Accepted: February 27, 2023
  4. Version of Record published: March 8, 2023 (version 1)

Copyright

© 2023, Soutschek and Tobler

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Alexander Soutschek, Philippe N Tobler (2023) A process model account of the role of dopamine in intertemporal choice. eLife 12:e83734. https://doi.org/10.7554/eLife.83734
