Peer review process
Revised: This Reviewed Preprint has been revised by the authors in response to the previous round of peer review; the eLife assessment and the public reviews have been updated where necessary by the editors and peer reviewers.
Editors
- Reviewing Editor: Brice Bathellier, Centre National de la Recherche Scientifique, Paris, France
- Senior Editor: John Huguenard, Stanford University School of Medicine, Stanford, United States of America
Reviewer #1 (Public review):
What neurophysiological changes support the learning of new sensorimotor transformations is a key question in neuroscience. Many studies have attempted to answer this question at the neuronal population level, with varying degrees of success, but few, if any, have studied changes in the activity of the apical dendrites of layer 5 cortical neurons. Neurons in layer 5 of the sensory cortex appear to play a key role in sensorimotor transformations: they show prominent decision- and reward-related signals and are the main source of cortical and subcortical projections from the cortex. In particular, pyramidal tract (PT) neurons project directly to subcortical regions related to motor activity, such as the striatum and brainstem, and could initiate rapid motor action in response to given sensory inputs. Additionally, layer 5 cortical neurons have large apical dendrites that extend to layer 1, where different neuromodulatory and long-range inputs converge, providing motor and contextual information that could be used to modulate the output of layer 5 neurons and/or to establish the synaptic plasticity required for learning a new association.
In this study, the authors aimed to test whether the learning of a new sensorimotor transformation could be supported by a change in the evoked response of the apical dendrites of layer 5 neurons in the mouse whisker primary somatosensory cortex. To do this, they performed longitudinal functional calcium imaging of the apical dendrites of layer 5 neurons while mice learned to discriminate between two multi-whisker stimuli. The authors used a simple conditioning task in which one whisker stimulus (upward or backward air puff, CS+) is associated with reward after a short delay, while the other whisker stimulus (CS-) is not. They found that task learning (measured by the probability of anticipatory licking just after the CS+) was not associated with a significant change in the average population response evoked by the CS+ or the CS-, nor with a change in the average population selectivity. However, when considering individual dendritic tufts, they found interesting changes in selectivity, with approximately equal numbers of dendrites becoming more selective for the CS+ and for the CS-.
One of the major challenges when assessing changes in neural representation during the learning of such Go/NoGo tasks is that movements and rewards may themselves elicit strong neural responses and thus act as confounding factors: inexperienced mice do not lick in response to the CS+, whereas trained mice do. In this study, the authors addressed this issue in three ways. First, they carefully monitored the orofacial movements of the mice and showed that task learning is not associated with changes in evoked whisker movements. Second, they showed that whisking or licking evokes very little activity in the dendritic tufts compared to whisker stimuli (CS+ and CS-). Finally, they included in the task design a post-conditioning session, following the last conditioning session, during which the CS+ and the CS- are presented but no reward is delivered. During this post-conditioning session, the mice gradually stopped licking in response to the CS+. A better design might have been to perform the pre-conditioning and post-conditioning sessions in non-water-restricted, unmotivated mice to completely exclude any lick response, but the fact that the change in selectivity persists after the mice stopped licking in the last blocks of the post-conditioning session (in mice relying only on their whiskers to perform the task) is convincing.
The clever task design and careful data analysis provide compelling evidence that learning this whisker discrimination task does not result in a massive change in sensory representation in the apical dendritic tufts of layer 5 neurons in the primary somatosensory cortex on average. Nevertheless, individual dendritic tufts do increase their selectivity for one or the other sensory stimulus, likely enhancing the ability of S1 neurons to accurately discriminate the two stimuli and trigger the appropriate motor response (to lick or not to lick).
One limitation of the present study is the lack of evidence for the necessity of the primary somatosensory cortex in the learning and execution of the task. As the authors have strongly emphasized in their previous publications, the primary somatosensory cortex may not be necessary for the learning and execution of simple whisker detection tasks, especially when the stimulus is very salient. Although this new task requires discrimination between two whisker stimuli, the simplicity and salience of the whisker stimuli used could make this task cortex-independent, especially considering that some mice seem not to rely entirely on their whiskers to execute the task.
Nevertheless, this is an important result that shows, for the first time, changes in the selectivity to sensory stimuli at the level of individual apical dendritic tufts in correlation with the learning of a discrimination task. This study sheds new light on the cortical cellular substrates of reward-based learning and opens interesting perspectives for future research in this area. In future studies, it will be important to determine whether the change in selectivity of dendritic calcium spikes is causally involved in learning the task or whether it simply correlates with learning, as a consequence of changes in synaptic inputs caused by reward. The dendritic calcium spikes may be involved in establishing the synaptic plasticity required for learning and may shape the output of layer 5 pyramidal neurons to trigger the appropriate motor response. It would also be important to study the changes in selectivity in the apical dendrites of identified projection neurons.
Comments on revisions:
The authors have addressed all my questions. I have no further recommendations.
Reviewer #2 (Public review):
Summary:
The authors did not find an increased representation of the CS+ over the course of reinforcement learning in the tuft dendrites of Rbp4-positive neurons from layer 5B of the barrel cortex, in contrast to what was previously reported for somata from layer 2/3 of the visual cortex.
Instead, the authors observed an increased selectivity to both stimuli (CS+ and CS-) during reinforcement learning. This feature 1) was not present with repeated exposures (without reinforcement), 2) was not explained by the animal's behaviour (choice, licking and whisking) and 3) was long-lasting, being present even when the mice disengaged from the task.
Importantly, increased selectivity was correlated with learning (% correct choices), and neural discriminability between stimuli increased with learning.
In conclusion, the authors show that tuft dendrites from layer 5B of the barrel cortex increase the representation of the rewarded (CS+) and unrewarded (CS-) stimuli applied to the whiskers during reinforcement learning.
Strengths:
The results presented are very consistent throughout the entire study, and therefore very convincing:
(1) The results are very similar across two different imaging techniques: two-photon (planar) imaging and SCAPE (volumetric) imaging (Fig. 3 and Fig. 4, respectively).
(2) The results are similar using "different groups" of tuft dendrites for the analysis (e.g. initially unresponsive and responsive pre- and post-learning) (Fig. 5).
(3) The results are similar for a specific set of trials (with the same sensory input, but different choices) (Fig. 7).
(4) Additionally, the selectivity of tuft dendrites from layer 5B of the barrel cortex was higher in the mice that exclusively used their whiskers to respond to the stimuli (CS+ and CS-).
The results presented are controlled against a group of mice that received the same stimulus presentations, but without the reinforcement (reward).
Additionally, behavioural outputs such as choice, whisking and licking could not account for the results observed.
Although there are no causal experiments, the correlation between selectivity and learning (% of correct choices), as well as the increased neural discriminability with learning, but not in repeated exposure, are very convincing.
Weaknesses:
The biggest weakness is the absence of causality experiments. Although specifically inhibiting the tuft dendritic activity of layer 5 pyramidal neurons in layer 1 is very challenging, tuft dendritic activity in layer 1 could be silenced through optogenetic experiments, as in Abs et al. 2018. By manipulating NDNF-positive neurons, the authors could specifically modify tuft dendritic activity in the barrel cortex during CS presentations and test whether silencing tuft dendritic activity in layer 1 leads to a lack of selectivity and an impairment of reinforcement learning. Additionally, this experiment would test whether the selectivity observed during reinforcement learning is due to changes in the local network, namely changes in local synaptic connectivity, or solely due to changes in the long-range inputs.