Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.
Editors
- Reviewing Editor: Sacha Nelson, Brandeis University, Waltham, United States of America
- Senior Editor: Sacha Nelson, Brandeis University, Waltham, United States of America
Reviewer #1 (Public review):
Summary:
The authors aimed to determine whether individual serotonin neurons encode a slowly evolving estimate of environmental value during a dynamic Pavlovian conditioning task. They used a Bayesian modeling framework to fit neural activity and behavior to reward history across multiple timescales. A key goal was to distinguish value coding from other influences, particularly thirst, by comparing model fits across neurons. Ultimately, they sought to quantify the prevalence and properties of value coding in single serotonin neurons and assess its relationship to behavior.
Strengths:
The authors employ a Bayesian modeling framework that allows for nuanced hypothesis testing on long timescales of reward history. This approach is well-suited to the complexity of single-neuron data, where noise and variability can obscure meaningful patterns. By fitting generative models to both neural activity and behavior, the authors move beyond descriptive statistics to infer latent variables such as value and thirst, and quantify their contributions to firing rate.
The use of hierarchical Bayesian models enables partial pooling across neurons and sessions, improving parameter estimation while accounting for individual variability. The mixture modeling strategy further strengthens the analysis by explicitly testing whether neurons encode value, thirst, or neither, rather than assuming a single coding scheme. This avoids overfitting and provides a principled way to assess the prevalence and properties of value coding in the serotonergic population.
The authors also validate their modeling choices through cross-validation and comparisons with null and trend models, demonstrating that their value model explains neural activity better than simpler alternatives. This lends credibility to their claim that serotonin neurons encode slowly evolving estimates of value.
Weaknesses:
The authors' decision to analyze neural activity during the ITI is methodologically sound in terms of maximizing spike counts and improving statistical power for single-unit modeling. Their generative model performs best when applied to ITI firing, and the longer duration and higher spike density of this period make it well-suited for capturing slow dynamics in serotonergic neurons.
However, this strength simultaneously introduces a conceptual limitation. The behavioral readout, anticipatory licking, occurs during the cue periods, not the ITI. This creates a temporal disconnect between the neural and behavioral data streams. While the authors cite theoretical work suggesting that ITI value scales with trace period value, this assumption is not directly validated in the current dataset. As a result, it remains unclear whether ITI firing reflects behaviorally relevant value signals or merely captures slow fluctuations unrelated to immediate behavioral output. For example, after all of the analyses performed, the final point of the Results section reads: "Taken together, anticipatory licking is explained partially by value integration occurring at a faster time scale than seen in serotonergic cells and partially by value integration happening at a timescale that matches the serotonergic cells, but the part of the behaviour matching the timescale seen in serotonergic cells is better explained by a model of thirst than a model of value." This appears to negate much of the work of the prior analyses.
The manuscript lacks sufficient population-level illustrations of behavior. Figure 1 presents a single-session example, which does not allow the reader to assess consistency across mice or neurons. Figure 2 improves on this by showing individual traces and means, but the data are already processed and smoothed, obscuring raw behavioral variability.
Additionally, key behavioral metrics are not clearly defined. For instance, the calculation of "reward collection probability" is ambiguous. It is unclear whether this refers to licking during the cue, the outcome window, or some other period. The relationship between reward collection probability and anticipatory licking is also not explicitly described, making it difficult to interpret how these behavioral measures relate to the modeled value signals. The reader is also not shown what licking looks like during the ITI, the precise period the authors analyse and focus on.
Thirst plays a central role in the manuscript, both as a behavioral driver and as a confounding variable in interpreting serotonergic activity. However, the method used to quantify thirst, a linear decrease from an initial value following each drinking event, is overly simplistic and potentially misleading. This approach assumes that thirst diminishes uniformly with each reward, without accounting for the physiological complexity of hydration and satiety regulation.
In reality, thirst is influenced by multiple factors, including fluid balance, timing of intake, and individual variability. Modeling it as a monotonic function of reward consumption risks conflating motivational state with mere reward history. Given how prominently thirst features in the analysis and interpretation, a more nuanced or empirically validated measure would strengthen the manuscript's conclusions.
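To make the concern concrete, the following is a minimal sketch of a thirst proxy of the kind described above: a level that starts at an initial value and drops by a fixed amount with each consumed reward. It is not the authors' implementation; the function name `linear_thirst` and the parameters `initial` and `drop_per_reward` are hypothetical and chosen only for illustration.

```python
def linear_thirst(rewards, initial=1.0, drop_per_reward=0.01):
    """Thirst proxy at the start of each trial (illustrative assumption).

    The level starts at `initial` and falls by `drop_per_reward` for every
    unit of reward consumed, floored at zero. Nothing else enters the proxy.
    """
    thirst = []
    level = initial
    for r in rewards:
        thirst.append(level)                      # thirst before this trial's outcome
        level = max(0.0, level - drop_per_reward * r)
    return thirst


# Hypothetical outcome sequence: 1 = reward consumed, 0 = no reward.
rewards = [1, 0, 1, 1]
thirst = linear_thirst(rewards, initial=1.0, drop_per_reward=0.25)
```

Because the proxy is a deterministic function of cumulative reward, it carries no information beyond the reward history itself, which is the sense in which it may conflate motivational state with reward history.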
Minor, but I did not find Panel A of Figure S1 to be helpful to the manuscript. The panel says height, while the caption says hairline. This manuscript is not about faculty, height, or hairline.
Reviewer #2 (Public review):
Summary:
The authors recently published a seminal work (Nature 2025), in which they proposed that the activity of serotonin neurons encodes a "prospective code for value" (value with low-pass filtered negative feedback, roughly resulting in rate-of-change + (compressed) value) and validated this proposal by analyzing several data sets and showing that their theory provided a better fit than other existing theories. In the present work, the authors analyzed the activity of serotonin neurons and the licking behavior in reference to their theory, using data from mice performing a dynamic Pavlovian task in which the reward probability occasionally changed without a cue in a block-wise manner. While serotonin neuronal activity during task trials in the same data set was analyzed in their previous work, in the present work the authors focused on the activity during inter-trial intervals and on longer time-scale changes. The authors' analyses using Bayesian model fitting revealed that serotonin neurons' activities reflected reward history over long time scales (on average about 100 trials, or 10 to 20 minutes) and that the time scales of individual neurons varied considerably (30 to 300 trials, 5 to 60 minutes). Analysis of licking, on the other hand, revealed that licking frequency mainly reflected reward history over shorter time scales, and the remaining long-time-scale components could be mostly explained by (gradually decreasing) thirst.
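As an illustration of what integrating reward history over different time scales means, here is a minimal sketch of a leaky integrator with a time constant expressed in trials. It is a generic exponential filter, not the authors' Bayesian model or their prospective value code; the function name `leaky_value`, the block structure, and the values of `tau` are assumptions made for the example.

```python
def leaky_value(rewards, tau, initial=0.0):
    """Running reward-history estimate with time constant `tau` (in trials).

    Each trial moves the estimate a fraction 1/tau of the way toward the
    latest outcome, so larger `tau` means slower integration.
    """
    alpha = 1.0 / tau
    value = initial
    trace = []
    for r in rewards:
        value += alpha * (r - value)
        trace.append(value)
    return trace


# Hypothetical blocks: 50 rewarded trials followed by 50 unrewarded trials.
rewards = [1.0] * 50 + [0.0] * 50
fast = leaky_value(rewards, tau=5)     # short time scale
slow = leaky_value(rewards, tau=100)   # long time scale, about 100 trials
```

With these assumed settings, the short-time-scale estimate tracks each block change within a few trials, whereas the long-time-scale estimate is still adjusting when the block ends.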
Strengths:
(1) The results supported and further elaborated the authors' prospective value coding theory of serotonin.
(2) The results also raised a question about what then determines the frequency of licking behavior and how.
Weaknesses:
(1) A limitation of the current analyses is the lack of consideration of the effort cost of licking. Given that both the involvement of serotonin in effort cost computation (Meyniel et al., 2016 eLife 17282) and the existence/influence of an effort cost of licking (Hage et al., 2023 eLife 87238) have been suggested, it would be desirable to consider (ideally, formally analyze) such an effect in the current data set. A simple way of incorporating effort cost would be to assume a small (free parameter) negative reward for every single lick (anticipatory and other) and combine these negative rewards with positive (liquid) rewards in the calculation of value. This may not drastically change the main claims of the present work, but could still provide insights into whether/how serotonin is involved in cost-benefit computation (or whether/how reward and cost are combined in the serotonin system).
(2) Another possibility related to effort cost is that the accumulation of the effort cost of licking over a long time scale may cause fatigue. Since such fatigue is expected to gradually increase across the entire session, potentially with a time course similar to that of thirst (but with a positive rather than negative slope), it may be necessary to ask whether the suggested positive effect of thirst on licking (i.e., decrease of licking due to decrease of thirst) could be (partially) explained by a negative effect of fatigue (i.e., decrease of licking due to increase of fatigue).
(3) Is it also possible that the decrease of licking (partially) reflects a decrease in the degree of exploration (over the selection between licking and no-licking) and/or meta learning about the occasional sudden changes in the reward probability, such as the meta learning observed in animals engaging in a repetitive reversal learning task (Hattori et al., 2023 Nat Neurosci)?
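The effort-cost suggestion in point (1) can be sketched as follows: each lick contributes a small negative reward, and the net outcome is what enters the value calculation. This is only an illustration under stated assumptions, not an analysis of the data set; `net_rewards`, `leaky_value`, `cost_per_lick`, `tau`, and the example numbers are all hypothetical, and a simple leaky integrator stands in for whichever value model is used.

```python
def net_rewards(liquid, licks, cost_per_lick):
    """Liquid reward minus a fixed cost for every lick on that trial."""
    return [r - cost_per_lick * n for r, n in zip(liquid, licks)]


def leaky_value(outcomes, tau, initial=0.0):
    """Leaky integration of outcomes with time constant `tau` (in trials)."""
    alpha = 1.0 / tau
    value = initial
    trace = []
    for x in outcomes:
        value += alpha * (x - value)
        trace.append(value)
    return trace


# Hypothetical trials: liquid reward delivered and total licks emitted.
liquid = [1.0, 0.0, 1.0, 1.0]
licks = [4, 2, 5, 3]

net = net_rewards(liquid, licks, cost_per_lick=0.05)  # cost is a free parameter
value_no_cost = leaky_value(liquid, tau=10)
value_with_cost = leaky_value(net, tau=10)
```

In a formal analysis, `cost_per_lick` would be fitted alongside the other model parameters, and a posterior concentrated near zero would suggest that effort cost adds little to the account of the neural or behavioral data.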
Reviewer #3 (Public review):
Summary:
The authors are reanalyzing previously published data to test the hypothesis that serotonin neurons encode state value. Here, the authors focus on analyzing the firing rate of serotonin neurons during the inter-trial interval, in which no cues or outcomes are delivered. The goal is to quantify and find neurons whose activity is explained by value encoding, and for those that have this property, determine what the timescale of reward integration is (e.g., a few trials, tens of trials, or the entire session) in individual neurons.
Strengths:
The major strengths are the use of a Bayesian modelling approach to extract value and thirst coding features from individual neurons, and comparison of the time course of adaptation of serotonin neurons with a behavioral output, licking in this case. I also appreciate the use of a separate dataset to establish prior distributions for baseline firing rate to be used in the modelling done here, which is an attempt to deal with the main weakness of this study:
Weaknesses:
The weakness of this study is the small number of neurons available for analysis, resulting in a small number of neurons that are unequivocally modulated by value.
The authors did achieve their aims, but the results show that it is hard to unequivocally separate value-coding neurons with long timescales from thirst-coding neurons, which is acknowledged by the authors.
While the experimental results do not allow for a strong conclusion regarding the distinction of value versus thirst coding in serotonin neurons, the methods employed and the rationale for using them are of great utility to the community and for considerations of behavioral task design and data analysis in future studies. This is a point that the authors could discuss/develop more.
Additional significance of the work:
The comparison between time courses for behavior (anticipatory licking) and serotonin activity (as well as the reference to dopamine activity's time course from a previous study) is of great significance for any researcher studying behavioral control. Mounting evidence suggests that multiple brain circuits contribute to any given action selection. Therefore, expecting a perfect alignment between the time course of neuromodulator activity and behavioral output might be unreasonable. For future studies, modelling behavioral output as a combination of policies determined by multiple brain circuits or neuromodulators might be a promising approach.