Mixed representations of choice and outcome by GABA/glutamate cotransmitting neurons in the entopeduncular nucleus

  1. Department of Anatomy and Neurobiology, Boston University Chobanian & Avedisian School of Medicine, Boston, USA
  2. Howard Hughes Medical Institute, Department of Neurobiology, Harvard Medical School, Boston, USA

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.


Editors

  • Reviewing Editor
    Jesse Goldberg
    Cornell University, Ithaca, United States of America
  • Senior Editor
    Kate Wassum
    University of California, Los Angeles, Los Angeles, United States of America

Reviewer #1 (Public Review):

Summary:

In this series of studies, Locantore et al. investigated the role of SST-expressing neurons in the entopeduncular nucleus (EPN Sst+) in probabilistic switching tasks, a paradigm that requires continued learning to guide future actions. In prior work, this group demonstrated that EPN Sst+ neurons co-release glutamate and GABA and project to the lateral habenula (LHb), and that LHb activity is necessary for the outcome evaluation required to perform probabilistic decision-making tasks. Previous slice physiology work has shown that the balance of glutamate/GABA co-release is plastic, altering the net effect of the EPN on downstream brain areas and neural circuit function. The authors used a combination of in vivo calcium monitoring with fiber photometry and computational modeling to demonstrate that EPN Sst+ neural activity represents movement, choice direction, and reward outcomes in their behavioral task. However, viral-genetic manipulations to synaptically silence these neurons or selectively eliminate glutamate release had no effect on behavioral performance in well-trained animals. The authors conclude that, despite their representation of task variables, EPN Sst+ neuron synaptic output is dispensable for task performance.

Strengths and Weaknesses:

Overall, the manuscript is exceptionally scholarly, with a clear articulation of the scientific question and a discussion of the findings and their limitations. The analyses and interpretations are careful and rigorous. We appreciate the thorough explanation of the behavioral modeling and of the GLM used to deconvolve the photometry signal around behavioral events, as well as the transparency and thoroughness of the analyses in the supplemental figures. This extra care makes the work more accessible to non-experts and bolsters confidence in the results. To further aid the reader's understanding of the results, it would be interesting to see the same mouse represented across panels (i.e., Figure 1F-J, Supplementary Figure 1F, K, etc.), for example via the inclusion of faint hash lines connecting individual data points across variables. Additionally, Figure 3E demonstrates that eliminating the 'reward' and 'choice and reward' terms from the GLM significantly worsens model performance; to demonstrate the magnitude of this effect, it would be interesting to include a reconstruction of the photometry signal after holding out one or both of these terms, alongside the 'original' and 'reconstructed' photometry traces in panel D. This would give context for how model performance degrades when those key terms are excluded. Finally, the authors claim that calcium activity increased following ipsilateral movements. However, Figure 3C clearly shows that the beta coefficients increase for both SXcontra and SXipsi. Instead, choice direction may be represented in these neurons, given that the beta coefficients increase following CXipsi and before SEipsi, presumably when animals make executive decisions. Could the authors clarify their interpretation on this point? Also, it is not clear whether there is a photometry response related to motor parameters (i.e., head direction, locomotion, or licking), which could change the interpretation of the reward outcome response if it is related to a motor response. Could the authors show the photometry signal from representative 'high licking' or 'low licking' reward trials, or from spontaneous periods of high vs. low locomotor speed (if the sessions are recorded), to clarify this point?
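The suggested held-out comparison can be illustrated with a deliberately simplified toy. The sketch below is not the authors' GLM (which fits time-lagged kernels to the photometry trace); it assumes one synthetic "signal" value per trial, hypothetical effect sizes of 0.5 for choice and 1.0 for reward, and ±1 coding so that no intercept is needed. It fits both terms, then refits with the reward term held out, and compares the variance explained by each reconstruction.

```python
import random

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def fit_two(x1, x2, y):
    """Least-squares weights for two regressors (no intercept), via the 2x2 normal equations."""
    s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    t1, t2 = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det

def r_squared(y, y_hat):
    """Fraction of variance in y explained by the reconstruction y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def demo(n_trials=500, seed=0):
    """Return (R^2 of full model, R^2 with the reward term held out) on synthetic data."""
    rng = random.Random(seed)
    # Hypothetical per-trial regressors: +1 / -1 for ipsi / contra and rewarded / unrewarded.
    choice = [rng.choice((-1.0, 1.0)) for _ in range(n_trials)]
    reward = [rng.choice((-1.0, 1.0)) for _ in range(n_trials)]
    # Synthetic signal with assumed effect sizes plus Gaussian noise.
    signal = [0.5 * c + 1.0 * r + rng.gauss(0.0, 0.2) for c, r in zip(choice, reward)]

    b_choice, b_reward = fit_two(choice, reward, signal)
    full = [b_choice * c + b_reward * r for c, r in zip(choice, reward)]

    # Held-out model: refit using the choice regressor only.
    b_only = dot(choice, signal) / dot(choice, choice)
    reduced = [b_only * c for c in choice]
    return r_squared(signal, full), r_squared(signal, reduced)

if __name__ == "__main__":
    r2_full, r2_reduced = demo()
    print(f"full R^2 = {r2_full:.3f}, reward held out R^2 = {r2_reduced:.3f}")
```

Plotting the `full` and `reduced` reconstructions against the signal, as requested for Figure 3D, would show directly which features of the trace the held-out term accounts for.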

A few limitations in the design and timing of the synaptic manipulations should be discussed or clarified. The authors take care to validate the intersectional genetic strategies: Tetanus Toxin virus (which eliminates synaptic vesicle fusion) or CRISPR editing of Slc17a6, which prevents glutamate loading into synaptic vesicles. The magnitude of the effect in the slice physiology results is striking. However, this validation relies on co-infection with a second AAV to express channelrhodopsin, and the two infected populations will surely not overlap completely. Alternative means of glutamate packaging (other VGluT isoforms, other transporters, etc.) could also compensate for the partial absence of VGluT2, which should be discussed. The authors do not perform a complementary experiment to delete GABA release (i.e., via VGAT editing), which is understandable given the absence of an effect with the pan-synaptic manipulation. A more significant concern, as the authors acknowledge, is the timing of these manipulations. The manipulations are all done in well-trained animals, which continue to perform throughout the period of viral expression. Moreover, after carefully showing that mice use different strategies on the 70/30 version vs. the 90/10 version of the task, the authors assess only performance on the 90/10 version after the manipulation. Together, the observation that manipulating EPN Sst+ output does not alter performance on a well-learned, 90/10 switching task decreases the impact of the findings, as this population may play a larger role during task acquisition or under more dynamic task conditions. Additional experiments could be done to strengthen the current evidence, although the limitation is transparently discussed by the authors.

Finally, intersectional strategies target LHb-projecting neurons, although in the original characterization, it is not entirely clear that the LHb is the only projection target of EPNsst neurons. A projection map would help clarify this point.

Overall, the authors used a pertinent experimental paradigm and common cell-specific approaches to address a major gap in the field, which is the functional role of glutamate/GABA co-release from the major basal ganglia output nucleus in action selection and evaluation. The study is carefully conducted, their analyses are thorough, and the data are often convincing and thought-provoking. However, the limitations of their synaptic manipulations with respect to the behavioral assays reduce generalizability and to some extent the impact of their findings.

Reviewer #2 (Public Review):

Summary:

This paper aimed to determine the role EP sst+ neurons play in a probabilistic switching task.

Strengths:

The in vivo recording of EP sst+ neuron activity in the task is one of the strongest parts of this paper. Previous work had recorded from the EP-LHb population in rodents and primates in head-fixed configurations; recording this population in a freely moving context is a valuable addition to these studies and highlights more clearly that these neurons respond at the time of both choice and outcome.

The use of a refined intersectional technique to record specifically the EP sst+ neurons is also an important strength of the paper. This is because previous work has shown that there are two genetically different types of glutamatergic EP neurons that project to the LHb. Previous work had not distinguished between these types in their recordings so the current results showing that the bidirectional value signaling is present in the EP sst+ population is valuable.

Weaknesses:

(1) One of the main weaknesses of the paper is to do with how the effect of the EP sst+ neurons on the behavior was assessed.

(a) All the manipulations (blocking synaptic release and blocking glutamatergic transmission) are chronic and, more importantly, the mice are given weeks of training after the manipulation before the behavioral effect is assessed. This means that, as the authors point out in their discussion, the mice have time to adjust to and compensate for the manipulations. The results do show that mice can adapt to these chronic manipulations and that the EP sst+ neurons are not required to perform the task. What is unclear is whether the mice have compensated for the loss of EP sst+ neurons and whether these neurons play a role in the task under normal conditions. Acute manipulations, or chronic manipulations without additional training, would be needed to assess this.

(b) Another weakness is that the effect of the manipulations was assessed only in the 90/10 contingency version of the task. Under these contingencies, mice integrate past outcomes over fewer trials to determine their choice, and animals act closer to a simple win-stay, lose-switch strategy. Because of this, it is unclear whether the EP sst+ neurons would play a role in the task when mice must integrate over a larger number of conditions in the less deterministic 70/30 version of the task.
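To make the contrast between contingencies concrete, here is a toy simulation of a pure win-stay, lose-switch agent. It is only a sketch and not the authors' behavioral model: the fixed block length, trial count, and function name `simulate_wsls` are assumptions chosen for illustration, and the real task may switch blocks differently.

```python
import random

def simulate_wsls(p_high, p_low, n_trials=20000, block_len=50, seed=0):
    """Fraction of trials on which a win-stay, lose-switch agent picks the
    currently high-probability port in a two-port switching task.

    Assumption: the high-probability port swaps sides every `block_len` trials.
    """
    rng = random.Random(seed)
    high_port = 0   # which port (0 or 1) currently pays off with p_high
    choice = 0      # the agent's current port
    n_high = 0
    for t in range(n_trials):
        if t > 0 and t % block_len == 0:
            high_port = 1 - high_port          # block switch
        p_reward = p_high if choice == high_port else p_low
        rewarded = rng.random() < p_reward
        n_high += choice == high_port
        if not rewarded:
            choice = 1 - choice                # lose: switch; win: stay
    return n_high / n_trials

if __name__ == "__main__":
    print("90/10:", simulate_wsls(0.9, 0.1))
    print("70/30:", simulate_wsls(0.7, 0.3))
```

In this toy, the one-trial-back rule tracks the better port far more reliably under 90/10 than under 70/30, which is one way to see why the 70/30 version places more demand on integrating outcomes over several trials.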

The authors show an intriguing result that the EP sst+ neurons are excited when mice make an ipsilateral movement in the task either toward or away from the center port. This is referred to as a choice response, but it could be a movement response or related to the predicted value of a specific action. Recordings while mice perform movement outside the task or well-controlled value manipulations within the session would be needed to really refine what these responses are related to.

(2) The authors conclude that they do not see any evidence for bidirectional prediction errors. It is not possible to conclude this. First, they see a large response in the EP sst+ neurons to the omission of an expected reward, which is what would be expected of a negative reward prediction error. Much more specific, well-controlled tests for this are commonplace in head-fixed and freely moving paradigms and could be used to probe it. The authors do look at the effect of previous trials on the response and do not see strong, consistent results, but this is not a strong formal test of what would be expected of a prediction error, either positive or negative. The other way they assess this is by comparing the size of the responses across recording sessions with different reward contingencies. They claim that the size of the reward expectation and prediction error should scale with the different reward probabilities. If all the reward probabilities were present in the same session, this should be true, as many others have shown for RPE. However, because these data were taken from different sessions, the responses are not expected to scale: reward prediction errors have been shown to adaptively scale to cover the range of values on offer (Tobler et al., Science 2005). A better test of positive prediction error would be to give a larger-than-expected reward on a subset of trials. Either way, their data already contain evidence that responses reflect a negative prediction error, and more specific tests would be needed to formally rule prediction error coding in or out, especially as previous primate and rodent recordings have shown it is present.

(3) There are a lot of variables in the GLM that occur extremely close in time such as the entry and exit of a port. If two variables occur closely in time and are always correlated it will be difficult if not impossible for a regression model to assign weights accurately to each event. This is not a large issue, but it is misleading to have regression kernels for port entry and exits unless the authors can show these are separable due to behavioral jitter or a lack of correlation under specific conditions, which does not seem to be the case.
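The separability concern can be illustrated with a small sketch. It assumes hypothetical binned event times (entries at a fixed point in each trial, exits a base lag plus optional jitter later) and is not derived from the authors' data. When the lag is fixed, the exit regressor is an exact time-shifted copy of the entry regressor, so a kernel with lagged entry columns already contains it; jitter breaks that dependence.

```python
import random

def event_train(times, n_bins):
    """Binary regressor with a 1.0 in each bin where an event occurs."""
    x = [0.0] * n_bins
    for t in times:
        if 0 <= t < n_bins:
            x[t] = 1.0
    return x

def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def shifted_overlap(jitter_bins, n_trials=200, trial_len=100, base_lag=5, seed=0):
    """Correlation between the exit regressor and the entry regressor delayed
    by `base_lag` bins (a column that a lagged entry kernel would include).

    All timing parameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_bins = n_trials * trial_len
    entries = [i * trial_len + 10 for i in range(n_trials)]
    exits = [e + base_lag + rng.randint(0, jitter_bins) for e in entries]
    entry = event_train(entries, n_bins)
    exit_ = event_train(exits, n_bins)
    entry_delayed = [0.0] * base_lag + entry[:-base_lag]
    return pearson(entry_delayed, exit_)

if __name__ == "__main__":
    print("no jitter:  ", shifted_overlap(0))
    print("with jitter:", shifted_overlap(20))
```

A correlation near 1 in the no-jitter case means the regression cannot apportion weight between the two kernels; reporting the empirical distribution of entry-to-exit intervals would show which regime the recorded sessions fall into.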

Reviewer #3 (Public Review):

Summary:

The authors find that Sst-EPN neurons, which project to the lateral habenula, encode information about response directionality (left vs right) and outcome (rewarded vs unrewarded). Surprisingly, impairment of vesicular signaling in these neurons onto their LHb targets did not impair probabilistic choice behavior.

Strengths:

Strengths of the current work include extremely detailed and thorough analysis of data at all levels, not only of the physiological data but also an uncommonly thorough analysis of behavioral response patterns.

Weaknesses:

Overall, I saw very few weaknesses, with only two issues, both of which should be possible to address without new experiments:

(1) The authors note that the neural response difference between rewarded and unrewarded trials is not an RPE, as it is not affected by reward probability. However, the authors also show the neural difference is partly driven by the rapid motoric withdrawal from the port. Since there is also a response component that remains different apart from this motoric difference (Figure 2, Supplementary Figure 1E), it seems this is what needs to be analyzed with respect to reward probability, to truly determine whether there is no RPE component. Was this done?

(2) The current study reaches very different conclusions than a 2016 study by Stephenson-Jones and colleagues despite using a similar behavioral task to study the same Sst-EPN-LHb circuit. This is potentially very interesting, and the new findings likely shed important light on how this circuit really works. Hence, I would have liked to hear more of the authors' thoughts about possible explanations of the differences. I acknowledge that a full answer might not be possible, but in-depth elaboration would help the reader put the current findings in the context of the earlier work, and give a better sense of what work still needs to be done in the future to fully understand this circuit.

For example, the authors suggest that the Sst-EPN-LHb circuit might be involved in initial learning, but play less of a role in well-trained animals, thereby explaining the lack of observed behavioral effect. However, it is my understanding that the probabilistic switching task forces animals to continually update learned contingencies, rendering this explanation somewhat less persuasive, at least not without further elaboration (e.g. maybe the authors think it plays a role before the animals learn to switch?).

Also, as I understand it, the 2016 study used manipulations that likely impaired phasic activity patterns, e.g. precisely timed optogenetic activation/inhibition, and/or deletion of GABA/glutamate receptors. In contrast, the current study's manipulations (blockade of vesicle release using tetanus toxin or deletion of VGlut2) would likely have blocked both phasic and tonic activity patterns. Do the authors think this factor, or any others they are aware of, could be relevant?
