Abstract
Affective states are dynamic, fluctuating in response to recent events: an unexpected pleasure, a disappointing loss. Affective biases, which cause disruptions in these dynamics, are core components of mental ill-health, but the specific effects of treatments on these biases are poorly understood. Here, we investigate the impact of common psychiatric treatments on subjective assessments of happiness, confidence, and engagement during a reinforcement learning task (N = 935; 130 taking antidepressant medications). Half (N = 459) of the participants were randomised to practice a common psychotherapeutic technique—cognitive distancing—throughout the task. From a joint computational model of learning and affect, we find evidence for distinct and overlapping impacts of psychiatric treatments on affective dynamics. Cognitive distancing attenuates downward drift in happiness and engagement and increases recency bias in the affective impact of recent choices. Conversely, antidepressant use increases baseline happiness and confidence in individuals with similar levels of current symptoms, and decreases recency bias such that more past events influence affective states. Crucially, both cognitive distancing and antidepressant use converge to dampen negative biases in happiness and confidence specifically in participants experiencing higher levels of anxiety and depression symptoms. Together, our results indicate that common treatments for mental ill-health may alter symptoms through their impact on affective dynamics, but via distinct mechanisms.
1 Background
How are you feeling right now? Research across economics, psychology, and health sciences suggests the answer to this question—your subjective well-being—is closely tied to objective quality of life1,2 and health across the lifespan3. But feelings are far from static, fluctuating from moment to moment in response to recent events4–6 and even individual choices. Frequently asking participants to rate their feelings enables a read-out of moment-to-moment changes in subjective well-being, or their affective dynamics.
In influential work, Rutledge et al. (2014)5 demonstrated momentary happiness ratings during a gambling task could be accurately predicted by a computational model incorporating the average reward for a gamble (expected value) and the outcome of the gamble minus this average (prediction error). Using a task where reward magnitude and probability were uncorrelated, Blain & Rutledge (2020)7 subsequently showed that momentary happiness is particularly sensitive to changes in learning-related variables—specifically, prediction errors for reward probability—compared to reward information that was relevant to behaviour but not learning. Links between happiness ratings and learning-related quantities may extend to subjective assessments of other affective states. Theoretical accounts posit that momentary confidence is the approximate probability a choice is correct8,9 (though see10,11), while effort costs decrease the value of choices independently of reward probability12,13, in turn influencing momentary engagement. Together, these results suggest that affective dynamics are closely coupled with objective quantities that drive choices during learning.
Biases in subjective affect are a core feature of mental ill-health. Symptoms of depression have been consistently linked to lower7,14 and less stable15,16 momentary happiness, while transdiagnostic features of mental ill-health have been linked to biased confidence judgements at different timescales17–20 and impairments in motivation and engagement13,21–23. Affective biases maintain symptoms of mental ill-health by inducing changes in emotion processing and perception24,25. For example, negatively biased perception—a common feature of depression26—may cause low mood by making outcomes appear less rewarding; low mood in turn further negatively biases perception, causing a positive feedback loop which spirals toward a depressive episode27. Successful psychiatric treatment may act to perturb these maladaptive cycles. Short-term selective serotonin reuptake inhibitor (SSRI) administration induces positive perceptual biases in healthy participants28, suggesting that affective biases may be an early target of antidepressant drugs, acting to shift choices away from those that maintain low mood29. Crucially, given that affective biases are precipitated and maintained by negative thinking patterns—a core target of cognitive psychotherapy30,31— they may represent a transdiagnostic treatment target of psychological and pharmacological interventions for symptoms of mental ill-health.
Here, we aimed to link choice behaviour to affective dynamics throughout a reinforcement learning (RL) task32–34 (Figure 1A) and to relate this to mental ill-health symptoms and treatments. We asked online participants (N =935) to rate their feelings (from 0–100) on one of three different affect scales—happiness, confidence, and engagement—after receiving feedback on each choice they made. Half (49.1%) of the participants were randomised to an acute psychological intervention known as “cognitive distancing”, a common35 and effective31 component of psychological therapy which alters learning in this task34. We also collected information on current antidepressant medication use in a demographic questionnaire (reported by 13.9% of participants), and derived transdiagnostic mental health symptom factor scores from a psychiatric questionnaire battery36,37. We then assessed how participants’ affect ratings across each of the three scales covaried with learning-related outcomes throughout the task, accounting for underlying affective biases, using computational modelling. By quantifying the associations between model-derived measures of affective dynamics and transdiagnostic features of mental ill-health, the cognitive distancing intervention, and self-reported antidepressant medication use, we asked whether affective dynamics may be a common target of treatments for symptoms of mental ill-health.

Task design, affect model posterior predictions and model comparison.
A. The task design. B. Mean affect ratings for each rating type (engaged, happy, or confident), by distancing group, compared to model predictions (light-coloured lines). C. Example comparison of predictions from the best-fitting model (light-coloured lines) to raw affect ratings from three different individuals with the median pseudo-R2 for each rating type. D. Model fit compared to the best-fitting model (overall time elapsed, with dual learning rate) in terms of their ELPD (i.e., higher ELPD [or less negative ELPD compared to the best-fitting model] is better), estimated via Bayesian approximate LOO cross-validation47. Ribbons in B-C and error bars in D denote standard errors.
2 Methods
2.1 Online experiment and sample
A total of 995 participants were recruited via Prolific38 over three weeks in April–May 2021. Participants were recruited in batches with fixed pre-screeners for age range, gender, and history of any mental health diagnosis, which resulted in a sample broadly representative of the UK population in terms of these characteristics (see34 for details). After completing a demographic questionnaire, which included questions on current medication usage and mental health diagnoses, participants completed the RL task described below. They then took a short test of working memory (visual digit span), and answered questions from several psychiatric questionnaires, answers from which were used to derive three validated transdiagnostic features of mental health (anxiety/depression, compulsive behaviour, and social withdrawal)36, using methods for computational factor modelling37 described in detail elsewhere34,39. Sixty participants were excluded for meeting pre-registered criteria34. The study was approved by the University of Cambridge Human Biology Research Ethics Committee (HBREC) (HBREC.2020.40) and jointly sponsored by the University of Cambridge and Cambridge University Hospitals National Health Service (NHS) Foundation Trust (IRAS ID 289980). All participants provided written informed consent through an online form, in line with University of Cambridge HBREC procedures for online studies.
2.2 Reinforcement learning task
The reinforcement learning task in the present study—the probabilistic selection task32,33— involved learning which symbol in each of three pairs was more likely correct. Consistent choice of the “better” symbol in each pair enabled a participant to accumulate more points (and maximise their chances of winning a monetary bonus). One symbol in each pair was always more likely correct, but the contingencies varied across the pairs, from 0.8/0.2 (‘AB’) to 0.7/0.3 (‘CD’), to 0.6/0.4 (‘EF’). All participants saw the same six symbols, but the pairs were randomised across individuals and counterbalanced across trials. After making a choice, participants received feedback (“Correct!” or “Incorrect.”), and then rated their subjective happiness, confidence, or engagement (Figure 1A).
One of the three questions was asked after each trial outcome, with each question asked twenty times per block of sixty trials, and never more than twice in a row. Participants were also asked to rate (again from 0–100) how fatigued they felt compared to the beginning of the block after the end of each of the sixty-trial training blocks. After six training blocks, participants were tested on all fifteen unique pairs without feedback. We previously reported the effects of cognitive distancing on task performance and learning, including results of the test phase, in the same sample34.
2.3 Acute psychological intervention and antidepressant use
Half of the participants (n=459; 49.1%) were randomised to be taught, and then practice throughout the task, a psychotherapeutic technique termed “cognitive distancing”, which encouraged them to “take a step back” from their emotional reactions to feedback throughout the task (see34 for further details). Apart from an additional instructional video before the task started and a small prompt to “Distance yourself...” which appeared with each fixation cross (Figure 1A), the task was identical for distanced and non-distanced participants. To explore similarities between the effects of this psychological intervention and a pharmacological treatment for mental ill-health (antidepressant medication), we also asked participants to report their current medication use: 130 participants (13.9%) reported current antidepressant use, with the majority (n=94; 72.3%) taking an SSRI.
2.4 Computational modelling: joint RL-affect models
For consistency with previous literature, we used the well-characterised model of momentary happiness first described by Rutledge et al. (2014)5 as a baseline model. This model assumes fluctuations around a baseline (i.e., longer-term mean) can be captured by a weighted sum of recent expected values and prediction error and, importantly, does not condition on previous ratings.
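The general form of this model (a reconstruction consistent with the cited work as adapted here; the original gambling version also includes a certain-reward term, and exact symbol names may differ) is:

```latex
\text{Affect}_t \;=\; w_0
  \;+\; w_1 \sum_{j=1}^{t} \gamma^{\,t-j}\,\mathrm{EV}_j
  \;+\; w_2 \sum_{j=1}^{t} \gamma^{\,t-j}\,\delta_j ,
```

where $w_0$ is baseline affect, $\mathrm{EV}_j$ and $\delta_j$ are the expected value and prediction error on trial $j$, and $\gamma \in (0,1)$ is a forgetting factor that down-weights older trials.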
The Rutledge et al. (2014)5 model has been primarily validated in tasks (e.g., gambling tasks) where expected values and prediction errors are explicitly available to participants5,40. As such, we extended it to account for learning in this task. Specifically, our model—which we term a joint RL-affect model—comprised two components: (1) a Q-learning model to infer expected values and prediction errors from participants’ choices (which has been shown to accurately capture choice behaviour in this task33,34); and (2) a model for momentary affect which assumes fluctuations around a baseline can be captured by a recency-weighted sum of Q-learning model-derived expected values and prediction errors5,41. Hierarchical models were simultaneously fitted to task choices plus happiness, confidence, and engagement self-reports, assuming different parameter weights across all scales and participants.
2.4.1 RL models
A Q-learning model infers expected values (termed Q-values) from participant choices—the action (at) of choosing one symbol over the other in each of the three pairs (denoted as states, st)—by assuming they update at each trial t based on prediction errors δt, with the update magnitude controlled by a learning rate α ∈ [0,1]:
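The standard single learning rate Q-learning update, consistent with the definitions above (with $r_t \in \{0,1\}$ denoting the trial outcome), takes the form:

```latex
\delta_t = r_t - Q_t(s_t, a_t), \qquad
Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t) + \alpha\,\delta_t .
```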
We additionally considered a dual learning rate Q-learning model for choices, in which the learning rate is assumed to differ depending on whether the outcome was rewarding (αreward) or not (αloss)33:
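A plausible sketch of this variant replaces the single $\alpha$ with an outcome-dependent learning rate:

```latex
Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t) +
\begin{cases}
  \alpha_{\text{reward}}\,\delta_t & \text{if } r_t = 1 , \\
  \alpha_{\text{loss}}\,\delta_t   & \text{if } r_t = 0 .
\end{cases}
```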
In both cases, the difference in Q-values between the chosen (at) and avoided action (āt) is transformed to a choice probability using a softmax function, weighted by an inverse temperature β:
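For a two-option choice, this softmax reduces to a logistic function of the Q-value difference (a standard formulation, consistent with the description above):

```latex
P(a_t \mid s_t) = \frac{1}{1 + \exp\!\left\{-\beta\left[\,Q_t(s_t, a_t) - Q_t(s_t, \bar{a}_t)\,\right]\right\}} .
```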
2.4.2 Affect models
Baseline model
Affect ratings scaled to [0, 1] for each participant and rating type (i.e., happiness, confidence or engagement) were assumed to be drawn from independent Beta distributions with a mean-variance reparameterization which models the shape parameters as functions of a (conditional) mean and precision42. Extreme ratings (0 or 1) were allowed in the task, so we transformed ratings to the (0,1) interval using a simple transformation43.
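Under the mean–precision parameterisation, a rating $y \in (0,1)$ with mean $\mu$ and precision $\phi$ is modelled as $y \sim \mathrm{Beta}(\mu\phi,\,(1-\mu)\phi)$. A standard transformation for mapping scaled ratings off the boundary (and plausibly the one cited, though we cannot confirm the exact reference) is:

```latex
y' = \frac{y\,(N - 1) + 0.5}{N},
```

where $N$ is the number of observations; this compresses ratings of exactly 0 or 1 slightly into the open interval.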
where t is the overall trial number at rating number i for rating type p, γ is the discount or ‘forgetting’ factor which imposes a strict weighting on recent trials, and
Accounting for drift over time
We also fit models including an extra weight
where τt is some measure of time elapsed up to trial t: either trial number, block number, overall time elapsed, or time elapsed since the start of that block (see Model comparison).
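In other words, these models add a single linear drift term to the affect predictor (a sketch; the weight symbol $w_\tau$ is our notation):

```latex
\text{Affect}_t \;=\; w_0 \;+\; w_\tau\,\tau_t
  \;+\; w_1 \sum_{j=1}^{t} \gamma^{\,t-j}\,\mathrm{EV}_j
  \;+\; w_2 \sum_{j=1}^{t} \gamma^{\,t-j}\,\delta_j .
```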
2.4.3 Fits to data
Models were fitted in a hierarchical Bayesian manner, with approximate posteriors derived via automatic differentiation variational inference (ADVI)45 implemented in CmdStan46. All models were fit to choices and ratings on all three affect scales simultaneously across both distancing and non-distancing participants, with separate weights and decay factors assumed for each person and question, and separate group-level (hyper)priors on each parameter. In other words, participant-level parameter distributions are assumed to be conditionally independent given the group-level distribution over that parameter. Individual-level predictive accuracy was assessed by comparing responses predicted from each participant’s approximate posterior to their observed affect ratings via pseudo-R2, defined following previous work42 as the squared correlation between observed and mean posterior predictions.
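The pseudo-R2 described here is simply the squared Pearson correlation between observed ratings and mean posterior predictions; a minimal sketch (the function name is ours):

```python
def pseudo_r2(observed, predicted):
    """Squared Pearson correlation between observed affect ratings and
    mean posterior predictions for one participant and rating type."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov ** 2 / (var_obs * var_pred)
```

Note that any perfectly linear relationship yields a pseudo-R2 of 1 regardless of scale or offset, so this metric assesses how well fluctuations are tracked rather than absolute calibration.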
2.4.4 Model comparison
In the affect model, we tested for “drift over time”44; and in the Q-learning model we tested for separate learning rates for rewarding and non-rewarding outcomes. We assumed the Rutledge et al. (2014)5 model (equation 5) to be the baseline model, and so the parameters in this model were included in all models.
Drift over time in affect may be particularly relevant to our task, as participants were able to take as long as they wished to rate their subjective feelings, and time between blocks was additionally unconstrained. As such, we compared four models with different measures of time elapsed (i.e., variants of equation 6) to the baseline model (equation 5), and either single or dual learning rates. The extra parameter added linear weights on either trial number, block number, or total time elapsed. We also tested a final model with two extra parameters: weights on both total time elapsed and time elapsed since the beginning of that block. We then compared all ten models in terms of their approximate leave-one-out (LOO) expected log pointwise predictive density (ELPD), a metric of estimated out-of-sample predictive accuracy47, corrected for the use of variational approximations to the true posterior48.
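A pairwise ELPD comparison of this kind operates on pointwise log predictive densities: the ELPD difference is their summed difference, with a standard error derived from the variance of the pointwise differences. A minimal sketch of that computation (following the general approach of the cited LOO literature; the function name is ours):

```python
import math
import statistics

def compare_elpd(lp_model_a, lp_model_b):
    """ELPD difference (A minus B) and its standard error, given one
    pointwise log predictive density per held-out observation for each
    model."""
    diffs = [a - b for a, b in zip(lp_model_a, lp_model_b)]
    n = len(diffs)
    se = math.sqrt(n * statistics.variance(diffs))
    return sum(diffs), se
```

A positive difference several standard errors from zero favours model A.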
2.5 Statistical analysis
For consistency with computational modelling, we adopt a fully Bayesian approach for statistical analyses where possible. As such, results are given as estimates with a highest density interval (HDI), which (unlike a confidence interval (CI)49) can be interpreted as the probability that the true value falls within a given range. We report 95% HDIs to align with convention but interpret results in terms of strength of evidence throughout; an overlap with the null value should not be seen as evidence for lack of an effect, but rather weakened evidence for it.
2.5.1 Associations between model parameters and mental health symptoms & treatments
We tested the effect of differences in transdiagnostic mental health symptoms, current self-reported antidepressant use, or cognitive distancing on model parameters using generalised linear models (GLMs), adjusted for age, gender, and working memory capacity (measured with visual digit span), separately for each rating type. GLMs relating model parameters to antidepressant use were run with and without adjustment for concurrent anxiety/depression symptoms (i.e., factor score), as medication use was not randomised. We also considered whether the effect of cognitive distancing and current antidepressant use on affect model parameters may differ in relation to their association with transdiagnostic mental health, by including factor score as an interaction term in GLMs.
Posterior samples for GLM coefficients were obtained via Markov chain Monte Carlo (MCMC) implemented in CmdStan46, with 2,000 warm-up and 10,000 sampling iterations for each of four chains, using models and priors from the rstanarm R package50. Response distributions were assumed Gaussian for all parameters except for learning and decay rates, which were modelled via Gamma family GLMs (with log link functions). See Interpretation of model-derived parameters in the Supplementary Methods for intuition as to how these regression coefficients are interpreted.
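For the Gamma family models, coefficients act multiplicatively on the expected parameter value because of the log link. A hypothetical illustration (the coefficient value below is made up for exposition, not an estimate from this study):

```python
import math

# Under a log link, E[parameter] = exp(intercept + b * treatment), so a
# coefficient b on a binary treatment indicator multiplies the expected
# parameter value by exp(b).
b = 0.1  # hypothetical coefficient for, e.g., a forgetting factor
multiplier = math.exp(b)
percent_change = 100 * (multiplier - 1)  # ~10.5% higher in treated group
```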
2.5.2 Differences in associations with baseline affect between transdiagnostic factors
To account for potential collinearities in the three transdiagnostic mental health symptom factor scores obtained via computational factor modelling, we used partial least squares (PLS) regression to test which of the transdiagnostic factor scores was most strongly associated with baseline affect (
In line with best practices39, we first identified the number of components that best described our data in a training set (80% of participants) in terms of mean squared error (MSE) and R2 via ten-fold cross-validation. We then validated the predictive accuracy of this number of components in held-out test data (20% of participants), and formally tested this using a permutation test, in which the PLS regression model was re-trained on 10,000 training datasets with shuffled outcome labels (providing a null distribution), and the fraction of these datasets where the MSE between the test data and the predictions from the permuted model was lower than the true train-test MSE was taken as the p-value. The PLS regression with the chosen number of components was then refitted to the whole dataset, to obtain component loadings on each of the responses and predictors. Lastly, we computed bias-corrected and accelerated (BCa) CIs for each of the loadings, plus the differences in loadings between transdiagnostic factors, from 10,000 bootstrap samples.
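The p-value step of this permutation test can be sketched as follows (a minimal illustration of the final comparison only; the actual procedure refits the PLS regression on each label-shuffled training set, which is omitted here, and the function name is ours):

```python
def permutation_p_value(true_mse, null_mses):
    """One-sided permutation p-value: the fraction of permuted-label
    refits whose held-out MSE is lower than the MSE of the model
    trained on the true labels."""
    return sum(mse < true_mse for mse in null_mses) / len(null_mses)
```

With 10,000 permutations, an empirical p-value of zero is conventionally reported as p < 0.0001.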
3 Results
3.1 A computational model of subjective happiness accounting for learning and affective drift also captures momentary confidence and engagement
Following model comparison, we found that the best-fitting RL-affect model included separate learning rates for rewarding and non-rewarding outcomes and a linear effect of time elapsed since the beginning of the task (equation 6; Figure 1D). This model, when fitted to all affect ratings and participants simultaneously, explained variance in participants’ happiness, confidence, and engagement assessments with similar accuracy (mean [standard deviation (SD)] pseudo-R2 = 0.40–0.42 [0.23–0.26]; see Figure 1C for example individuals).
3.2 Baseline affect is negatively associated with transdiagnostic mental health
We first assessed whether individuals’ estimated model parameters from the best-fitting joint RL-affect model were associated with transdiagnostic features of mental health.
In line with previous work7,14, we found strong evidence for a negative association between baseline happiness (

Associations between affect parameters and transdiagnostic mental health dimensions, and results of PLS regression.
A-C. Estimated differences in baseline affect (
Higher compulsive behaviour and social withdrawal factor scores were also each associated with lower baseline happiness (e.g., mean 2.20-point lower
3.3 Baseline affect is most strongly associated with anxiety/depression symptoms
The three transdiagnostic mental health symptom factor scores obtained via computational factor modelling were highly correlated (r = 0.47 [95% CI: (0.42, 0.52)] between anxiety/depression and compulsive behaviour; r = 0.61 [95% CI: (0.57, 0.65)] between anxiety/depression and social withdrawal; r = 0.42 [95% CI: (0.37, 0.47)] between compulsive behaviour and social withdrawal). As such, to compare the strength of associations between baseline affect and transdiagnostic symptom factors, we used a partial least squares regression model to relate baseline affect to the three scores plus age, gender, digit span, and distancing.
We found that a three-component model represented the best compromise between predictive accuracy (in training data) and parsimony (Figure 2D), and there was strong statistical evidence that this model could accurately predict responses in held-out test data (permutation test p<0.0001). The first component of the model negatively loaded on baseline happiness (loading = −0.185, BCa bootstrapped 95% CI = [−0.217, −0.135]), confidence (loading = −0.112, BCa bootstrapped 95% CI = [−0.157, −0.049]), and engagement (loading = −0.131, BCa bootstrapped 95% CI = [−0.171, −0.068]) (Figure 2F). It also positively loaded on each of the transdiagnostic symptom factors: anxiety/depression (loading = 0.660, BCa bootstrapped 95% CI = [0.608, 0.733]), compulsive behaviour (loading = 0.491, BCa bootstrapped 95% CI = [0.415, 0.545]), and social withdrawal (loading = 0.577, BCa bootstrapped 95% CI = [0.529, 0.623]). The other two components did not show this pattern (Figure 2E). There was strong evidence that the first component’s loading was higher for anxiety/depression than both compulsive behaviour and social withdrawal (component 1 loading difference [BCa bootstrapped 95% CI] = 0.169 [0.083, 0.299] and 0.083 [0.037, 0.152] respectively; Figure 2G).
3.4 Cognitive distancing slows affective drift and antidepressant use positively modulates baseline affect
We then assessed the evidence for effects of cognitive distancing and antidepressant use on choice-independent affective dynamics: baseline affect and its drift.
Previously, in the same sample, we reported evidence from linear mixed models that participants practising cognitive distancing declined slightly less in happiness and engagement, but not confidence, over the course of the task34. Evidence from our RL-affect model was consistent with a decrease in affect drift over time: distancing individuals on average drifted less across the task in happiness (estimated mean 21.3% higher odds of increase in happiness per hour; 95% HDI for multiplier = [1.01, 1.46]; Figure 3Aii), in spite of lower baseline engagement (estimated mean = −2.89 points; 95% HDI = [−5.42, −0.464]; Figure 3Ai). There was also some weak evidence of less drift in engagement in participants randomised to the intervention (17.4% higher

Associations between treatments and affect model parameters.
A-B. Estimated mean differences in individuals’ baseline affect (
There was limited evidence for any difference in affective drift in participants taking antidepressants (Figure 3Bii). However, there was evidence that participants self-reporting current antidepressant use had 3.50-point higher baseline happiness (95% HDI = [0.311, 6.60]) and 2.69-point higher baseline confidence (with much weaker evidence: 95% HDI = [−1.16, 6.54]), after adjusting for anxiety/depression symptom scores (Figure 3Bi).
3.5 Cognitive distancing and antidepressant use have opposite effects on the weighting of choices in subjective happiness
Next, we quantified the associations between treatments—cognitive distancing and antidepressant use—and choice-dependent parameters (i.e.,
There was limited evidence for an association between either treatment and weights on recent prediction errors (
Current antidepressant use was associated with less forgetting of choices and outcomes in happiness ratings (12.5% higher γhappy; 95% HDI for multiplier = [1.04, 1.21]) and engagement ratings (6.52% higher γengaged; 95% HDI for multiplier = [1.01, 1.13]; Figure 3Biii), suggesting higher weighting of earlier trials’ expected values and prediction errors in subsequent affect ratings. Evidence for this association remained, albeit slightly weakened, after additionally adjusting for current anxiety/depression symptoms (Figure 3Biii), which were themselves positively associated with γhappy and (to a lesser extent) γengaged (Figure S3Aiii). Notably, the converse effect was seen in distancing participants, with the psychological intervention associated with lower happiness forgetting factors, albeit with weak evidence (4.69% lower γhappy; 95% HDI for multiplier = [0.906, 1.004]; Figure 3Aiii).
3.6 Cognitive distancing and antidepressant use dampen negative associations between baseline affect and anxiety/depression symptoms
Lastly, we explored whether the negative associations between choice-independent affective dynamics and transdiagnostic anxiety/depression symptoms were altered by cognitive distancing or current antidepressant use, by including treatment by symptom interactions in outcome GLMs.
We found that both the distancing intervention and antidepressant use weakened the negative associations between baseline happiness and confidence, but not engagement. Specifically, distancing individuals with higher anxiety/depression scores had higher baseline happiness and confidence (mean 1.37-point higher
4 Discussion
Here, we applied a computational model of momentary happiness which assumes fluctuations in affect ratings depend simply on baseline affect, its drift over time, and recency-decayed expected and received outcomes. By extending this model to also capture learning, we were able to link objective behaviour to subjective feelings across distinct affective states—happiness, confidence, and engagement ratings—and show that a core component of psychological therapy, cognitive distancing, and antidepressant medication use have different effects on affective dynamics, but converge to alter affective biases associated with symptoms of mental ill-health.
There were distinct effects of both treatments on affective dynamics. Randomisation to a psychotherapeutic intervention, cognitive distancing, attenuated declines in happiness and engagement over time, adding to our previously reported findings that this psychotherapeutic technique alters aspects of reward learning34. Self-reported antidepressant use, meanwhile, was associated with higher baseline happiness and confidence after adjustment for current anxiety/depression symptoms (as antidepressant use was not randomised), which is consistent with evidence that antidepressants exert positive affective biases28. Subsequent exploration of changes in affect revealed further mechanistic divergence: current antidepressant use was associated with lower recency biases across all scales (i.e., forgetting factors closer to one), while cognitive distancing reduced the weighting of expectations and increased recency bias in happiness ratings. Together, these results suggest that psychiatric treatments act to alter the contribution of objective learning-related quantities to subjective value judgements.
Consistent with extensive evidence14,18,19,26, we found negative associations between model-derived baseline affect (across all scales) and transdiagnostic psychiatric symptom measures derived from computational factor modelling36,37. This effect, which was strongest for anxiety/depression symptom scores, indicates a consistent, time-invariant negative affective bias which scales with mental ill-health symptom load. Critically, we found evidence for a convergent treatment by symptom interaction with baseline affect across both cognitive distancing and antidepressant use: negative associations between anxiety/depression symptoms and baseline happiness and confidence were attenuated in participants practising cognitive distancing or taking antidepressants. These results support the cognitive neuropsychological model of antidepressant action, whereby antidepressants are proposed to act acutely to revert negative or maladaptive affective biases29,52, and suggest that cognitive distancing31 and other components of psychotherapy may also act clinically to alter affective biases contributing to symptoms of mental ill-health. We propose that changes in affective dynamics should be investigated further in longitudinal studies as a computational predictor of subsequent symptom change.
Our methodological approach also extends previous work in two ways. First, we applied a theory-driven computational model which allows for fluctuations in momentary happiness as a function of the history of expected values and prediction errors resulting from those expectations, and which has been primarily validated in tasks where learning is not required5. We not only show that this model can capture happiness ratings in a task where expected values are never explicitly available and have to be learned from experience, but also that it can accurately capture variation in subjective ratings of confidence and engagement. Second, we found evidence for drift in happiness over time, replicating recent work which characterised ‘mood drift over time’44, and extended this to both confidence and engagement. We also found that this drift was strongly associated with self-reported fatigue (Figure S5; see Supplementary Results for more details). We note that we did not find evidence of a previously reported effect—reduced mood drift over time with increased depressive symptoms44—instead finding evidence to the contrary (Figure 1Aii). However, this work primarily reported evidence from short gambling tasks44; our results indicate that associations between affective drift and mental ill-health symptoms are not task-invariant.
We note several limitations. First, there were limitations in our outcome measures. Transdiagnostic measures of mental health psychopathology were estimated using questionnaire subsets34,39, precluding investigation of associations between parameters and individual diagnostic scales. Antidepressant use, meanwhile, was self-reported and non-randomised, and we did not collect information concerning length of treatment. Second, our use of a single computational model for all three affect scales is a powerful approach, but limited in its ability to truly contrast trial-to-trial fluctuations in each individual rating scale, as we are only comparing the contribution of a small number of computational components (i.e., affective weights) to a fraction of their variation; the residual (scale-specific) variation is likely also important in explaining how these ratings overlap and differ. Third, as model complexity meant that only approximate (mean-field variational, rather than sampling-based) inference was viable, we were unable to account for uncertainty in individual-level posterior mean parameter estimates when relating them to quantities of interest (e.g., via precision-weighted GLMs): the mean-field posterior covariance matrix cannot accurately capture local interdependencies, so parameter precisions are not reliable enough for uncertainty-weighted outcome models45.
To conclude, we integrated objective choice behaviour in a learning task with trial-to-trial affect ratings across three distinct states—happiness, confidence, and engagement— within a unified computational model. This enabled us to uncover associations between model parameters and treatments for mental health conditions, offering new insights into their underlying mechanisms of action. Our results demonstrate the critical importance of affective biases in the maintenance and updating of affective states in mental ill-health, and indicate that existing, effective treatments can be understood at least in part as acting to shift these biases towards the healthy range.
Supplementary Material
Supplementary Methods
Interpretation of model-derived parameters
In the Results, we report intercept (i.e., baseline affect) parameters (
For intuition, consider an estimated GLM coefficient of 0.1 for the distancing group in relation to baseline happiness (
Between-rating RL-affect model
Model definition and fit
In an exploratory analysis, we modified (6) to partition the weights on expected values and prediction errors (
where I denotes the number of ratings between the current rating and the last rating of the same type; in this task, I ≤ 4.
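The resulting affect drive can be illustrated as a lag-indexed weighted sum. Below is a minimal Python sketch (not the authors' code; all names and numbers are illustrative) of weighting expected values and prediction errors separately at each rating lag:

```python
import numpy as np

def affect_drive(q_values, pred_errors, w_ev, w_pe):
    """Combine expected values (Q) and prediction errors (PEs) from recent
    trials, with a separate free weight at each rating lag, rather than a
    single exponentially decaying factor. Arrays are most-recent-first;
    with I <= 4 intervening ratings there are up to five lag weights."""
    n = min(len(q_values), len(w_ev))
    return sum(w_ev[i] * q_values[i] + w_pe[i] * pred_errors[i]
               for i in range(n))

# Illustrative weights that happen to halve at each lag, as an
# exponential-decay model with gamma = 0.5 would imply
w_ev = [0.5, 0.25, 0.125, 0.0625, 0.03125]
w_pe = [0.4, 0.2, 0.1, 0.05, 0.025]
drive = affect_drive(np.array([0.8, 0.2, -0.1]),
                     np.array([0.3, -0.4, 0.1]), w_ev, w_pe)
```

Freeing each lag's weight lets the data indicate whether the most recent expected value dominates affect or whether intervening outcomes also contribute.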
Associations between between-rating RL-affect model parameters and treatments
The associations between the weights on expected values and prediction errors for individual trials and cognitive distancing and antidepressant use were quantified using multilevel Bayesian linear regression models implemented in the brms R package53 and CmdStan46. All models controlled for the same covariates (i.e., age, gender, digit span), and additionally included a participant-level random intercept and random slopes (on trial lag), to account for the fact that there were five parameters per person for each affect rating, as well as the main effect of trial lag and its interaction with each treatment. Separate models were fit for each affect rating and parameter (i.e., w2(−t′) and w3(−t′)).
Parameter recovery
Joint RL-affect model with mood drift over time
To test whether we could recover known parameter values from the best-fitting model (i.e., the dual learning rate model with mood drift over time), we simulated one hundred datasets (including choices and affect ratings) with parameters drawn from the following distributions:
We then fit the model to these simulated data with approximate inference, and compared the posterior mean parameter estimates to the values known to have generated the data. We found that all parameters could be recovered with high accuracy (r >0.87; Figure S1A).
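The logic of such a recovery check can be sketched in miniature with a single-learning-rate Q-learner fit by grid-search maximum likelihood (an illustrative, simplified stand-in for the paper's dual-learning-rate model and ADVI-based fitting; all settings are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_qlearning(alpha, beta, n_trials=500, p_reward=(0.8, 0.2)):
    """Simulate a two-armed bandit played by a single-learning-rate
    Q-learner with a softmax (inverse temperature beta) choice rule."""
    q = np.zeros(2)
    choices, rewards = [], []
    for _ in range(n_trials):
        p1 = 1.0 / (1.0 + np.exp(-beta * (q[1] - q[0])))
        c = int(rng.random() < p1)
        r = float(rng.random() < p_reward[c])
        q[c] += alpha * (r - q[c])  # prediction-error update
        choices.append(c)
        rewards.append(r)
    return np.array(choices), np.array(rewards)

def neg_log_lik(alpha, beta, choices, rewards):
    """Negative log-likelihood of the observed choices under (alpha, beta)."""
    q, nll = np.zeros(2), 0.0
    for c, r in zip(choices, rewards):
        p1 = 1.0 / (1.0 + np.exp(-beta * (q[1] - q[0])))
        nll -= np.log(max(p1 if c == 1 else 1.0 - p1, 1e-12))
        q[c] += alpha * (r - q[c])
    return nll

def recover_alpha(choices, rewards, beta=5.0):
    """Grid-search MLE of the learning rate, with beta assumed known."""
    grid = np.linspace(0.05, 0.95, 19)
    return grid[np.argmin([neg_log_lik(a, beta, choices, rewards) for a in grid])]

# Recovery check: simulate with known alphas, refit, then correlate
true_alphas = np.linspace(0.1, 0.9, 9)
recovered = [recover_alpha(*simulate_qlearning(a, beta=5.0)) for a in true_alphas]
r = np.corrcoef(true_alphas, recovered)[0, 1]
```

A high correlation between generating and recovered values indicates the data constrain the parameter well; scatter plots such as Figure S1 additionally reveal any systematic bias.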

Parameter distributions used to simulate data and test parameter recovery.
+i.e., so β ∈ [0,10]

Parameter recovery for A) the joint RL-affect model, and B) the between-rating RL-affect model.
Between-rating RL-affect model
Parameter recovery for the between-rating RL-affect model was tested similarly, with one hundred simulated datasets. The parameter settings were identical to the above except for the time-dependent parameters (and the absence of γp in the model). Specifically,
Comparison between results from models fit to choices alone (with sampling-based inference) vs. in the joint RL-affect model (with variational inference)
We previously reported results in this dataset where we fit Q-learning models to choices alone and compared parameters in distanced and non-distanced participants34.
Besides the obvious difference (the models were fit to choices alone, rather than to both choices and affect ratings), the models in our previous work34 were fit via sampling-based inference (MCMC), as opposed to variational inference (ADVI47). Despite these differences, individuals’ posterior mean Q-learning parameters from (i) the dual learning rate models fit to choices alone with MCMC, as previously reported34, and (ii) the best-fitting joint RL-affect model fit to choices and affect ratings simultaneously with ADVI are highly correlated in the same sample (αreward: r = 0.61 [95% CI: 0.57, 0.65]; αloss: r = 0.67 [95% CI: 0.63, 0.70]; β: r = 0.74 [95% CI: 0.70, 0.76]; Figure S2A). We also replicate a key result from our earlier work: higher inverse temperatures (β) in the distancing group (Figure S2B–C).
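For intuition on the reported correlation intervals, a confidence interval for a Pearson r can be obtained by resampling participants with replacement. The sketch below uses a simple percentile bootstrap on synthetic data (a BCa bootstrap, as abbreviated in the acronym list, would additionally correct for bias and skew; the data and settings here are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def bootstrap_r_ci(x, y, n_boot=2000, alpha=0.05):
    """Percentile-bootstrap confidence interval for a Pearson correlation:
    resample (x, y) pairs with replacement, then take quantiles of the
    resampled correlations."""
    n = len(x)
    rs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample participants
        rs[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    return np.quantile(rs, [alpha / 2, 1 - alpha / 2])

# Synthetic example with a true correlation of roughly 0.71
x = rng.normal(size=200)
y = 0.7 * x + rng.normal(scale=0.7, size=200)
ci_low, ci_high = bootstrap_r_ci(x, y)
```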

Comparison of Q-learning parameters and effects of distancing between dual learning rate models fit to choices alone and the joint RL-affect model additionally fit to affect ratings.
Supplementary Results
Group-level parameter estimates for the best-fitting joint RL-affect model
At the group level, model-predicted baseline affect (
Compulsive behaviour and social withdrawal factor scores are also associated with altered affective dynamics
In addition to differences in baseline affect and its drift over time detailed in the main text, there was evidence that participants with higher compulsive behaviour scores placed more weight on recent expected values (higher

Associations between higher transdiagnostic psychiatric symptom factor scores and additional affect parameters (A-C), correlation between variance in happiness rating and compulsive behaviour score (D), and the simulated effect on happiness ratings of higher
We additionally found some weak evidence that higher anxiety/depression factor scores were associated with slightly higher weighting of previous trials’ expected values and prediction errors for happiness (γhappy; e.g., the trial-before-last was weighted an estimated 4.04% higher; 95% HDI for multiplier = [1.003, 1.079]; Figure S3Aiii). There was somewhat stronger evidence for a positive association between social withdrawal factor scores and the decay factors for happiness and engagement (Figure S3Ciii), suggesting marginally higher weighting of previous trials’ expected values and prediction errors in the computation of affect ratings in those with higher levels of social withdrawal symptoms.
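To see how such multipliers map onto the exponential-decay weighting, note that the weight on the trial k steps back scales as the decay factor raised to the power k. A small sketch (the gamma values below are arbitrary, chosen only to reproduce the 4.04% figure):

```python
import numpy as np

def lag_weights(gamma, n_lags=5):
    """Relative weight on the trial k steps back under exponential decay:
    proportional to gamma**k (the most recent trial has weight 1)."""
    return gamma ** np.arange(n_lags)

# A 4.04% higher decay factor gives a 4.04% higher trial-before-last (k = 1)
# weight, and a compounding (1.0404**k - 1) increase at longer lags
low, high = lag_weights(0.50), lag_weights(0.50 * 1.0404)
pct_change_lag1 = 100 * (high[1] / low[1] - 1)  # 4.04
pct_change_lag2 = 100 * (high[2] / low[2] - 1)  # about 8.24
```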
Antidepressant use is associated with increased weighting of previous choices’ expected values in affect ratings
To further unpick the effects of treatments on the weighting of previous outcomes, we fit a more flexible between-rating RL-affect model (equation 7). This model, which allowed for different weights on previous expected values (
Parameters from this between-rating model provided limited evidence for a difference between distanced and non-distanced participants in the weighting of the most recent or intervening outcomes in their affective judgements (Figure S4A). There was also no evidence of an effect of either treatment on the weighting of prediction errors from previous trials (Figure S4Aiii–iv & Figure S4Biii–iv). There was, however, some evidence of a small effect of antidepressant use on between-rating changes in affect: higher weighting of the most recent expected value in subjective affect ratings (Figure S4Bi). This evidence was strongest for engagement, with a unit increase in the most recent Q-value associated with 4.33% higher odds of an increase in engagement rating (95% HDI for multiplier = [1.001, 1.089]). There was also limited evidence of an accompanying interaction effect (Figure S4Bii), suggesting that the contribution of less recent expected values to engagement ratings was also marginally higher in participants taking antidepressants, which may in turn explain their higher forgetting factor γengaged (Figure 3Bv).
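As a quick aid to interpreting such odds multipliers, the sketch below converts a multiplicative change in odds into a change in probability (the 50% baseline is chosen purely for illustration):

```python
def shift_probability(p_baseline, odds_multiplier):
    """Apply a multiplicative change in odds to a baseline probability:
    odds' = odds_multiplier * p / (1 - p), then p' = odds' / (1 + odds')."""
    odds = odds_multiplier * p_baseline / (1.0 - p_baseline)
    return odds / (1.0 + odds)

# A 4.33% increase in the odds of an upward engagement-rating change,
# starting from an (illustrative) 50% baseline probability
p_new = shift_probability(0.50, 1.0433)  # about 0.5106
```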

Effects of cognitive distancing (A) and antidepressant use (B) on expected value and prediction error parameters, derived from the between-rating RL-affect model.
Affective drift over time is associated with self-reported fatigue
Previous work on ‘mood drift over time’ has suggested it is mostly distinct from boredom and mind wandering44. Here, we were able to test an additional aspect of this phenomenon, namely its relation to fatigue, as we asked participants the following question at the end of each of the six blocks: “How fatigued do you feel compared to the beginning of the block?”. We hence examined the association between
We found strong evidence, after adjusting for age, gender, digit span, and distancing group, that higher mean post-block fatigue ratings were associated with lower baseline affect (lower

Associations between baseline affect and affective drift, and self-reported fatigue.
Data and code availability
All code to replicate the analyses here can be found in accompanying Jupyter notebooks, alongside cleaned, anonymised data. See the GitHub repository for more details.
Acknowledgements
This study was funded by an AXA Research Fund Fellowship awarded to C.L.N. (G102329) and the Medical Research Council (MC_UU_00030/12). C.L.N. is funded by a Wellcome Career Development Award (226490/Z/22/Z) and acknowledges support by the NIHR Cambridge NIHR Biomedical Research Centre (BRC-1215-20014). Q.D. is funded by a Wellcome Trust PhD studentship. Q.D. and Q.J.M.H acknowledge support by the NIHR UCLH BRC. Q.J.M.H. acknowledges grant funding from the NIHR, Wellcome Trust, Carigest S.A. and Koa Health. R.B.R. is supported by the National Institute of Mental Health (R01MH124110). R.B.R. holds equity in Maia.
Additional information
Rights retention
For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising from this submission.
Acronyms
ADVI: Automatic Differentiation Variational Inference
BCa: Bias-Corrected and accelerated
CI: Confidence Interval
ELPD: Expected Log Pointwise Predictive Density
GLM: Generalised Linear Model
HDI: Highest Density Interval
HBREC: Human Biology Research Ethics Committee
LOO: Leave One Out
MSE: Mean Squared Error
MCMC: Markov Chain Monte Carlo
NHS: National Health Service
PLS: Partial Least Squares
RL: Reinforcement Learning
SD: Standard Deviation
SSRI: Selective Serotonin Reuptake Inhibitor
References
- 1. Objective Confirmation of Subjective Measures of Human Well-Being: Evidence from the U.S.A. Science 327:576–79. https://doi.org/10.1126/science.1180606
- 2. Advances in Subjective Well-Being Research. Nature Human Behaviour 2:253–60. https://doi.org/10.1038/s41562-018-0307-6
- 3. Subjective Wellbeing, Health, and Ageing. The Lancet 385:640–48. https://doi.org/10.1016/S0140-6736(13)61489-0
- 4. Events and Subjective Well-Being: Only Recent Events Matter. Journal of Personality and Social Psychology 70:1091–1102. https://doi.org/10.1037/0022-3514.70.5.1091
- 5. A Computational and Neural Model of Momentary Subjective Well-Being. Proceedings of the National Academy of Sciences of the United States of America 111:12252–57. https://doi.org/10.1073/pnas.1407535111
- 6. Hedonism and the Choice of Everyday Activities. Proceedings of the National Academy of Sciences of the United States of America 113:9769–73. https://doi.org/10.1073/pnas.1519998113
- 7. Momentary Subjective Well-Being Depends on Learning and Not Reward. eLife 9:e57977. https://doi.org/10.7554/eLife.57977
- 8. Confidence and Certainty: Distinct Probabilistic Quantities for Different Goals. Nature Neuroscience 19:366–74. https://doi.org/10.1038/nn.4240
- 9. Comparing Bayesian and Non-Bayesian Accounts of Human Confidence Reports. PLOS Computational Biology 14:1006572. https://doi.org/10.1371/journal.pcbi.1006572
- 10. The Idiosyncratic Nature of Confidence. Nature Human Behaviour 1:810–18. https://doi.org/10.1038/s41562-017-0215-1
- 11. Confidence Reports in Decision-Making with Multiple Alternatives Violate the Bayesian Confidence Hypothesis. Nature Communications 11. https://doi.org/10.1038/s41467-020-15581-6
- 12. Weighing up the Benefits of Work: Behavioral and Neural Analyses of Effort-Related Decision Making. Neural Networks 19:1302–14. https://doi.org/10.1016/j.neunet.2006.03.005
- 13. Cognitive Effort-Based Decision-Making in Major Depressive Disorder. Psychological Medicine 1–8. https://doi.org/10.1017/S0033291722000964
- 14. Association of Neural and Emotional Impacts of Reward Prediction Errors With Major Depression. JAMA Psychiatry 74:790–97. https://doi.org/10.1001/jamapsychiatry.2017.1713
- 15. Quality of Life in Depression: Daily Life Determinants and Variability. Psychiatry Research 88:173–89. https://doi.org/10.1016/s0165-1781(99)00081-5
- 16. Mood Homeostasis, Low Mood, and History of Depression in 2 Large Population Samples. JAMA Psychiatry 77:944–51. https://doi.org/10.1001/jamapsychiatry.2020.0588
- 17. Psychiatric Symptom Dimensions Are Associated With Dissociable Shifts in Metacognition but Not Task Performance. Biological Psychiatry 84:443–51. https://doi.org/10.1016/j.biopsych.2017.12.017
- 18. Abnormalities of Confidence in Psychiatry: An Overview and Future Perspectives. Translational Psychiatry 9:268. https://doi.org/10.1038/s41398-019-0602-7
- 19. How do confidence and self-beliefs relate in psychopathology: a transdiagnostic approach. Nature Mental Health 1:337–345. https://doi.org/10.1038/s44220-023-00062-8
- 20. Distorted learning from local metacognition supports transdiagnostic underconfidence. Nature Communications 16:1854. https://doi.org/10.1038/s41467-025-57040-0
- 21. Why Don’t You Try Harder? An Investigation of Effort Production in Major Depression. PLoS ONE 6:23178. https://doi.org/10.1371/journal.pone.0023178
- 22. Why Not Try Harder? Computational Approach to Motivation Deficits in Neuro-Psychiatric Diseases. Brain 141:629–50. https://doi.org/10.1093/brain/awx278
- 23. Neuroscience of Apathy and Anhedonia: A Transdiagnostic Approach. Nature Reviews Neuroscience 19:470–84. https://doi.org/10.1038/s41583-018-0029-9
- 24. Neural mechanisms of the cognitive model of depression. Nature Reviews Neuroscience 12:467–477. https://doi.org/10.1038/nrn3027
- 25. Variation in the recall of socially rewarding information and depressive symptom severity: a prospective cohort study. Acta Psychiatrica Scandinavica 135:489–498. https://doi.org/10.1111/acps.12729
- 26. Emotional Reactivity to Daily Events in Major and Minor Depression. Journal of Abnormal Psychology 120:155–67. https://doi.org/10.1037/a0021662
- 27. Mood as Representation of Momentum. Trends in Cognitive Sciences 20:15–24. https://doi.org/10.1016/j.tics.2015.07.010
- 28. Serotonin and Emotional Processing: Does It Help Explain Antidepressant Drug Action? Neuropharmacology 55:1023–28. https://doi.org/10.1016/j.neuropharm.2008.06.036
- 29. How Do Antidepressants Work? New Perspectives for Refining Future Treatment Approaches. The Lancet Psychiatry 4:409–18. https://doi.org/10.1016/S2215-0366(17)30015-9
- 30. Thinking and depression: II. Theory and therapy. Archives of General Psychiatry 10:561–571. https://doi.org/10.1001/archpsyc.1964.01720240015003
- 31. “Asking Why” from a Distance: Its Cognitive and Emotional Consequences for People with Major Depressive Disorder. Journal of Abnormal Psychology 121:559–69. https://doi.org/10.1037/a0028808
- 32. By Carrot or by Stick: Cognitive Reinforcement Learning in Parkinsonism. Science 306:1940–43. https://doi.org/10.1126/SCIENCE.1102941
- 33. Genetic Triple Dissociation Reveals Multiple Roles for Dopamine in Reinforcement Learning. Proceedings of the National Academy of Sciences of the United States of America 104:16311–16. https://doi.org/10.1073/pnas.0706111104
- 34. A Core Component of Psychological Therapy Causes Adaptive Changes in Computational Learning Mechanisms. Psychological Medicine 1–11. https://doi.org/10.1017/S0033291723001587
- 35. Regulating emotion through distancing: A taxonomy, neurocognitive model, and supporting meta-analysis. Neuroscience & Biobehavioral Reviews 96:155–173. https://doi.org/10.1016/j.neubiorev.2018.04.023
- 36. Characterizing a Psychiatric Symptom Dimension Related to Deficits in Goal-Directed Control. eLife 5:e11305. https://doi.org/10.7554/eLife.11305
- 37. Identifying Transdiagnostic Mechanisms in Mental Health Using Computational Factor Modeling. Biological Psychiatry 93:690–703. https://doi.org/10.1016/j.biopsych.2022.09.034
- 38. Prolific.ac – A Subject Pool for Online Experiments. Journal of Behavioral and Experimental Finance 17:22–27. https://doi.org/10.1016/J.JBEF.2017.12.004
- 39. Associations between Aversive Learning Processes and Transdiagnostic Psychiatric Symptoms in a General Population Sample. Nature Communications 11:4179. https://doi.org/10.1038/s41467-020-17977-w
- 40. Computational Models of Subjective Feelings in Psychiatry. Neuroscience & Biobehavioral Reviews 145:105008. https://doi.org/10.1016/j.neubiorev.2022.105008
- 41. The effect of reward prediction errors on subjective affect depends on outcome valence and decision context. Emotion 24:894–911. https://doi.org/10.1037/emo0001310
- 42. Beta Regression for Modelling Rates and Proportions. Journal of Applied Statistics 31:799–815. https://doi.org/10.1080/0266476042000214501
- 43. A Better Lemon Squeezer? Maximum-Likelihood Regression with Beta-Distributed Dependent Variables. Psychological Methods 11:54–71. https://doi.org/10.1037/1082-989X.11.1.54
- 44. A Highly Replicable Decline in Mood during Rest and Simple Tasks. Nature Human Behaviour 7:596–610. https://doi.org/10.1038/s41562-023-01519-7
- 45. Automatic Differentiation Variational Inference. arXiv. https://doi.org/10.48550/arXiv.1603.00788
- 46. Stan Modelling Language Users Guide and Reference Manual. https://mc-stan.org/docs/2_31/cmdstan-guide-2_31.pdf
- 47. Practical Bayesian Model Evaluation Using Leave-One-out Cross-Validation and WAIC. Statistics and Computing 27:1413–32. https://doi.org/10.1007/S11222-016-9696-4
- 48. Leave-One-Out Cross-Validation for Bayesian Model Comparison in Large Data. arXiv. https://doi.org/10.48550/arXiv.2001.00980
- 49. Statistical Tests, P Values, Confidence Intervals, and Power: A Guide to Misinterpretations. European Journal of Epidemiology 31:337–50. https://doi.org/10.1007/s10654-016-0149-3
- 50. rstanarm: Bayesian applied regression modeling via Stan. R package. https://mc-stan.org/rstanarm/
- 51. The Collinearity Problem in Linear Regression. The Partial Least Squares (PLS) Approach to Generalized Inverses. SIAM Journal on Scientific and Statistical Computing 5:735–43. https://doi.org/10.1137/0905052
- 52. Why do antidepressants take so long to work? A cognitive neuropsychological model of antidepressant drug action. British Journal of Psychiatry 195:102–108. https://doi.org/10.1192/bjp.bp.108.051193
- 53. brms: An R Package for Bayesian Multilevel Models Using Stan. Journal of Statistical Software 80:1–28. https://doi.org/10.18637/jss.v080.i01
Article and author information
Cite all versions
You can cite all versions using the DOI https://doi.org/10.7554/eLife.107269. This DOI represents all versions, and will always resolve to the latest one.
Copyright
© 2025, Dercon et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.