The role of higher-order thalamus during learning and correct performance in goal-directed behavior

  1. Danilo La Terra
  2. Ann-Sofie Bjerre
  3. Marius Rosier
  4. Rei Masuda
  5. Tomás J Ryan
  6. Lucy M Palmer (corresponding author)
  1. Florey Institute of Neuroscience and Mental Health, University of Melbourne, Australia
  2. School of Biochemistry and Immunology and Trinity College Institute for Neuroscience, Trinity College Dublin, Ireland
  3. Child & Brain Development Program, Canadian Institute for Advanced Research (CIFAR), Canada

Abstract

The thalamus is a gateway to the cortex. Cortical encoding of complex behavior can therefore only be understood by considering the thalamic processing of sensory and internally generated information. Here, we use two-photon Ca2+ imaging and optogenetics to investigate the role of axonal projections from the posteromedial nucleus of the thalamus (POm) to the forepaw area of the mouse primary somatosensory cortex (forepaw S1). By recording the activity of POm axonal projections within forepaw S1 during expert and chance performance in two tactile goal-directed tasks, we demonstrate that POm axons increase activity in the response and, to a lesser extent, reward epochs specifically during correct HIT performance. When performing at chance level during learning of a new behavior, POm axonal activity was decreased to naive rates and did not correlate with task performance. However, once evoked, the Ca2+ transients were larger than during expert performance, suggesting POm input to S1 differentially encodes chance and expert performance. Furthermore, the POm influences goal-directed behavior, as photoinactivation of archaerhodopsin-expressing neurons in the POm decreased the learning rate and overall success in the behavioral task. Taken together, these findings expand the known roles of the higher-thalamic nuclei, illustrating the POm encodes and influences correct action during learning and performance in a sensory-based goal-directed behavior.

Editor's evaluation

The thalamus is the hub connecting sensory inputs to cortical processing. The elegant study here used 2-photon calcium imaging and behavioral tasks to reveal a role for the posteromedial nucleus of the thalamus in goal-directed forepaw behaviors in mice.

https://doi.org/10.7554/eLife.77177.sa0

Introduction

Goal-directed behavior is crucial for survival in a dynamic environment. It involves the encoding and integration of sensory information that leads to specific rewarded behaviors (Kepecs et al., 2008; Li et al., 2015; Takahashi et al., 2016; Xu et al., 2012). This process must be dynamic, as flexible switching of learnt behaviors is required throughout life. The thalamus is a fundamental hub for the transfer of sensory information to the cortex, sending and receiving widespread innervation from numerous cortical and subcortical structures (Oh et al., 2014; Sherman and Guillery, 1996). Despite being positioned to coordinate the relay and integration of sensory information required during sensory-based behavior, historically, the thalamus has been viewed as a passive sensory relay center with negligible contribution to higher-order brain function and behavior. Recent studies have challenged this classical view, illustrating the thalamus plays crucial roles in cognitive tasks such as attention (Schmitt et al., 2017; Wimmer et al., 2015; Zhou et al., 2016), sensory perception (Saalmann and Kastner, 2011; Wilke et al., 2009), motor preparation and suppression (Casagrande et al., 2005; Yu et al., 2016), cortical plasticity (Audette et al., 2019; Gambino et al., 2014), and learning (Williams and Holtmaat, 2019).

The higher-order thalamus is an enigmatic class of nonspecific (diffuse projecting) thalamic nuclei which send feedback input to sensory cortical areas (Sherman and Guillery, 1996). These thalamic nuclei are thought to play an important role in behavioral flexibility (Wimmer et al., 2015) as reported changes in firing patterns within the higher-order thalamus (Ramcharan et al., 2005; Urbain et al., 2015) may underlie cortical state changes during adaptive behavior (Bruno and Sakmann, 2006; Poulet et al., 2012). Specifically, the posteromedial nucleus of the thalamus (POm) is the higher-order thalamic nucleus subtending sensory processing in the primary somatosensory cortex (S1) (Deschênes et al., 1998; Jones, 2007). Sending dense projections to layers 1 and 5 of S1 (Meyer et al., 2010), the POm specifically targets a complex cortical microcircuit (Audette et al., 2018) which influences the encoding of somatosensory inputs (Castejon et al., 2016; Mease et al., 2016; Urbain et al., 2015; Zhang and Bruno, 2019) and decision-related information (El-Boustani et al., 2020). The POm is reciprocally connected with S1, but also receives and sends projections to secondary sensory, motor, premotor, and association cortices as well as many subcortical regions including the zona incerta and striatum (Alloway et al., 2017; Oh et al., 2014; Trageser and Keller, 2004; Yamawaki and Shepherd, 2015). Based on its influence on cortical sensory processing and known extensive connectivity, the POm may play an important role during learning and performance in behaviors which require both the perception and integration of sensory information, such as sensory-based goal-directed behavior. To test this, we used two-photon Ca2+ imaging and optogenetics to investigate the role of POm projections in the forepaw S1 during learning, and chance and expert performance in tactile goal-directed behavior.

Results

Ca2+ imaging of POm axonal projections in forepaw S1 during ‘action’ goal-directed behavior

To assess the activity of POm projection axons during tactile-based behavior, two-photon Ca2+ imaging of POm axons projecting to forepaw S1 was performed in mice (p50–70) previously injected with the Ca2+ indicator GCaMP6f (AAV1.Syn.GCaMP6f.WPRE.SV40) into the POm (see Methods; Figure 1—figure supplement 1). Mice were then trained to associate forepaw tactile stimulation (200 Hz, 500 ms) with a reward in a goal-directed tactile detection task (see Methods; Figure 1B). If mice correctly responded by licking a reward port within 1.5 s after receiving the tactile stimulus, a sucrose water reward (10% sucrose water) was delivered. We refer to this behavioral paradigm as the ‘action’ goal-directed task (action task). Mice rapidly learnt this task, taking on average 4.38 ± 0.37 days to reach expert level (>80% correct responses to tactile stimulation; Figure 1—figure supplement 2). Once expert, Ca2+ transients were recorded from POm axons that project to layer 1 of the forepaw S1 (48 ± 6.8 μm from the pial surface; Figure 1C, D and Figure 1—figure supplement 3). POm axons were excluded if they had greater than 95% correlated activity with other axons within the POm (see Methods and Figure 1—figure supplement 4). During correct performance in the tactile goal-directed task, large Ca2+ transients (>2 SD of the baseline fluorescence; see Methods) were evoked in 90% of POm axons. This task-evoked POm axonal activity was greater than tactile-evoked Ca2+ activity in the naive state, with a significantly higher probability of evoking a Ca2+ transient in POm axons during the tactile task (0.35 ± 0.03 vs 0.21 ± 0.02; p = 0.0007; Figure 1E and Figure 1—figure supplement 5). Once evoked, the amplitude of the Ca2+ transients was not significantly different between naive and expert mice (p = 0.58; F test).
To further assess the activity of POm axons during tactile goal-directed behavior, we categorized all axons according to their peak activity during baseline (−2 to −1 s prestimulus), stimulus/response (response; 0–1 s poststimulus), or reward (2–3 s poststimulus) epochs of the task (Figure 1F). Since only a small proportion of axons responded during the stimulus epoch alone during expert behavior (6%; see Methods), the stimulus and response epochs were merged. Here, POm axonal activity was greatest during the response epoch, increasing signaling by more than fourfold above baseline (probability per trial, 0.32 ± 0.02 vs 0.08 ± 0.003; n = 418 axons, 11 mice; p < 0.0001; Figure 1G). POm axons also significantly increased activity above baseline during the reward epoch (probability per trial, 0.12 ± 0.007; n = 418 axons, 11 mice, p < 0.0225). However, when compared with the response-evoked activity, active POm axons were reduced in number (n = 275 vs 359 axons) and evoked rate (probability per trial, p < 0.0001), suggesting that POm axons preferentially encode the response epoch (Figure 1G). Direct comparison of the Ca2+ transient amplitudes from POm axons with both spontaneous and evoked activity (n = 239 axons, 11 mice) illustrates that Ca2+ transients evoked during the response epoch (0.97 ± 0.04 ∆F/F) were also significantly larger than transients evoked during both baseline (0.756 ± 0.02 ∆F/F, p = 0.0005) and reward delivery (0.679 ± 0.0219 ∆F/F; p < 0.0001), further highlighting the enhanced POm axonal signaling during the behavioral response (Figure 1H). Together, these results illustrate that the POm increases signaling in S1 during both response and reward delivery, with greatest activity during the behavioral response to tactile goal-directed behavior. Licking motion itself did not influence POm axon activity in forepaw S1, as there was no correlation between licking frequency and POm axonal activity (p = 0.923; Figure 1I).
Furthermore, there was no detectable change from baseline in POm axonal Ca2+ activity during spontaneous licking (0.09 ± 0.027 Hz; p = 0.29; n = 71 axons, 3 mice). Therefore, overall, POm axons in forepaw S1 encode response and, to a lesser extent, reward information during a tactile goal-directed task.
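The epoch analysis described above can be illustrated with a minimal sketch. It assumes one axon's ∆F/F traces are stored as one list per trial, that a transient is any sample exceeding the baseline mean by more than 2 SD, and that the probability per trial is the fraction of trials with at least one such sample in an epoch. The function names and the data layout are hypothetical; the detection procedure actually used is given in the Methods and may differ.

```python
from statistics import mean, pstdev

# Epoch windows in seconds relative to stimulus onset (values taken from the text).
EPOCHS = {"baseline": (-2.0, -1.0), "response": (0.0, 1.0), "reward": (2.0, 3.0)}


def sd_threshold(baseline_samples, n_sd=2.0):
    """Assumed event threshold: baseline mean plus n_sd standard deviations."""
    return mean(baseline_samples) + n_sd * pstdev(baseline_samples)


def epoch_event_probability(trials, times, threshold):
    """Fraction of trials with at least one supra-threshold sample per epoch.

    trials:    list of per-trial dF/F traces (lists of floats) for one axon
    times:     sample times in seconds relative to stimulus onset
    threshold: dF/F value above which a sample counts as part of a transient
    """
    probabilities = {}
    for name, (start, stop) in EPOCHS.items():
        # Indices of the samples that fall inside this epoch window.
        idx = [i for i, t in enumerate(times) if start <= t < stop]
        n_active = sum(
            1 for trace in trials if any(trace[i] > threshold for i in idx)
        )
        probabilities[name] = n_active / len(trials)
    return probabilities
```

Passing the threshold in explicitly keeps the choice of baseline period (for example, a quiescent recording segment) separate from the per-epoch counting.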

Figure 1 with 5 supplements
Ca2+ activity of POm axonal projections in forepaw S1 during tactile goal-directed behavior.

(A) The Ca2+ indicator GCaMP6f was locally injected into the POm (bottom) which sends axonal projections to layers 1 and 5 of the forepaw S1 (top). Inset, in vivo two-photon Ca2+ image of POm axonal projections in forepaw S1 (depth, 60 μm; scale bar, 10 μm). (B) Two-photon Ca2+ imaging of GCaMP6f-expressing POm axons in forepaw S1 was performed in head-restrained mice trained to report the detection of a tactile stimulus (200 Hz, 500 ms) by licking a reward port. Correct responses (HIT) were rewarded with sucrose water reward (10 μl, 10% sucrose). (C) Top, raster plot showing a typical behavioral response (licks) sorted into correct HIT performance and Catch (no-stimulus) trials. Gray, spontaneous; red, tactile stimulus; green, response epoch; blue, reward epoch. Blue line, reward delivery. Bottom, example of Ca2+ activity pattern during correct performance and Catch trials from the POm axon in (A). Each row represents a single trial, sorted according to trial number. (D) Mass average with standard error of the mean (SEM; shaded area) of all stimulus-evoked Ca2+ transients in all axons during correct goal-directed performance (HIT; black). Behavioral epochs indicated by color bars (red, stimulus; green, response; blue, reward). (E) Probability of evoking a Ca2+ response during correct HIT behavior (black) compared with tactile-evoked activity in the naive state (gray, n = 113 axons; Mann–Whitney test). (F) Top, Ca2+ activity pattern during HIT performance in the tactile goal-directed task. Each row is an independent axon normalized to maximum fluorescence and sorted by the timing of the peak amplitude (gray, baseline; red, stimulus; green, response epoch; blue, reward epoch). Red lines, stimulus delivery. Dashed line, reward delivery. Bottom, average Ca2+ response in POm axons active during the stimulus and response epoch (green), reward epoch (blue); baseline (no behavior; gray). 
(G) The probability of a Ca2+ transient in POm axons during baseline (gray), response epoch (green), reward epoch (blue). n = 418 axons, 11 mice. Friedman test + Dunn’s multiple comparisons test. (H) The amplitude of Ca2+ transients in POm axons evoked during baseline (gray), response epoch (green), and reward epoch (blue). n = 239 axons, 11 mice with evoked Ca2+ transients. Friedman test + Dunn’s multiple comparisons test. (I) Top, average lick frequency during spontaneous (gray), stim/response (green), and reward (blue) epochs during correct HIT behavior. Bottom, histogram of Ca2+ transient probability in POm axons. *p < 0.05, ***p < 0.001, ****p < 0.0001.

POm axon activity in forepaw S1 is correlated with correct tactile goal-directed behavior

We next assessed whether POm axonal activity in forepaw S1 changes according to behavioral performance. Upon receiving a tactile stimulus in the action task, mice had to lick a reward port within 1.5 s to receive a sucrose water reward (HIT). However, if they did not respond during this epoch, then no water was delivered (MISS; Figure 2A). Despite performing at expert level, mice did not respond (MISS) to on average 12.63% ± 6.63% of tactile stimuli. To assess whether the activity of POm axons is also correlated with MISS behavior, evoked Ca2+ activity was directly compared in POm axons with both HIT and MISS activity (n = 159 axons, 6 mice). Compared to correct HIT behavior, POm axons in forepaw S1 were overall less active during MISS trials (Figure 2B). In addition to an overall decrease in the number of axons active throughout the entire tactile goal-directed task (by 51%), there was also a significant decrease in the probability of evoking an axonal Ca2+ event during the behavioral response in MISS trials (paired; HIT, 0.27 ± 0.02; MISS, 0.12 ± 0.01; p < 0.0001; Figure 2C). In contrast, the peak amplitude of the evoked Ca2+ transients was similar during HIT and MISS trials (paired; response; HIT, 0.954 ± 0.08; MISS 0.888 ± 0.08; p = 0.470, n = 82 axons; reward; HIT, 0.635 ± 0.04; MISS, 0.708 ± 0.06; p = 0.252, n = 73 axons, 6 mice). Behaviorally speaking, HIT and MISS trials differ in the mouse movement, which has been shown to increase overall brain activity (Stringer et al., 2019). To investigate whether the increased POm axonal activity during the HIT response to tactile goal-directed behavior is due to body movement, we compared the evoked Ca2+ activity in POm axons with catch trials where mice spontaneously licked for reward (false alarm, FA). Despite licking during FA trials, POm axons were significantly less active than during HIT trials (unpaired, probability per trial, 0.15 ± 0.01, n = 239 axons, 10 mice, p < 0.0001; Figure 2C). 
Since there was no significant difference between FA and HIT licking rates (p = 0.203, n = 9 mice), these data further suggest that POm axonal activity is not simply due to licking behavior. There was also a significant decrease in the probability of evoking an axonal Ca2+ event during the reward epoch in MISS trials (HIT, 0.13 ± 0.02; MISS, 0.09 ± 0.01; p = 0.0408; n = 159 axons, 6 mice; Figure 2D). During MISS trials, POm axonal activity was similar to baseline rates (0.07 ± 0.007; p > 0.999; n = 159 axons, 6 mice; Figure 2D). Together, these data suggest that the POm encodes behavioral performance, increasing signaling between the POm and forepaw S1 during correct HIT behavior during both the response and reward epochs in a tactile goal-directed task.
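The trial outcomes compared above follow a simple rule in the action task, sketched here under stated assumptions. The 1.5 s response window comes from the text; the function name, the argument layout, and the "CR" (correct rejection) label for catch trials without a lick are illustrative and not taken from the paper.

```python
def classify_action_trial(stimulus_present, first_lick_latency_s, window_s=1.5):
    """Label one action-task trial as HIT, MISS, FA, or CR.

    stimulus_present:     True for tactile-stimulus trials, False for catch trials
    first_lick_latency_s: latency of the first lick after trial onset in seconds,
                          or None if the mouse did not lick
    window_s:             response window (1.5 s in the text)
    """
    licked = (
        first_lick_latency_s is not None
        and 0.0 <= first_lick_latency_s <= window_s
    )
    if stimulus_present:
        return "HIT" if licked else "MISS"
    # Catch (no-stimulus) trials: a lick is a false alarm.
    # "CR" is a standard signal-detection label assumed here for completeness.
    return "FA" if licked else "CR"
```

In the suppression task described later, the mapping for stimulus trials is reversed (withholding the lick is the rewarded HIT), so this sketch applies to the action task only.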

POm axonal projections in forepaw S1 have greatest activity during correct behavioral performance in a tactile goal-directed task.

(A) Behavioral task design. Two-photon Ca2+ imaging of GCaMP6f-expressing POm axons in forepaw S1 was performed in head-restrained mice trained to report the detection of a tactile stimulus (200 Hz, 500 ms) by licking a reward port. Mice received sucrose water reward (10 μl, 10% sucrose) during correct responses (HIT), whereas incorrect responses (MISS) were unrewarded. (B) Ca2+ activity patterns in POm axons with Ca2+ transients evoked during HIT (top) and MISS (bottom) behavior during the tactile goal-directed task (n = 159 axons, 6 mice). Gray, baseline; red, stimulus; green, response epoch; blue, reward epoch. Each row is an independent axon normalized to maximum fluorescence and sorted by the timing of the peak amplitude for both HIT and MISS trials. Orange bar denotes axons that were ‘active’ during the behavior. (C) The probability of a Ca2+ transient evoked during the response epoch in HIT (solid), MISS (empty), and false alarm (FA; dark gray) behavior. Wilcoxon matched-pairs signed rank test (HIT vs MISS) and Mann–Whitney test (FA vs HIT and MISS). (D) The probability of a Ca2+ transient evoked during the same time period as the reward epoch in HIT (solid), MISS (empty), and baseline (light gray). Friedman test + Dunn’s multiple comparisons test. *p < 0.05, ***p < 0.001, ****p < 0.0001.

POm axon activity in forepaw S1 during suppression of a goal-directed action

Goal-directed behavior requires motor actions to be suppressed once they are no longer appropriate to achieve the current goal (Jahanshahi et al., 2015). To investigate the involvement of the higher-order thalamus during suppression of a previously learned goal-directed action, we performed Ca2+ imaging from POm axons during a modified goal-directed paradigm. Here, mice previously injected with the Ca2+ indicator GCaMP6f in the POm were trained in the ‘action’ goal-directed task (as above). Once expert (>80% correct responses to tactile stimulation), the behavioral paradigm was changed such that the mice only received the reward if they suppressed licking in response to the tactile stimulus (Figure 3A). We refer to this behavioral paradigm as the ‘action–suppression’ goal-directed task (suppression task). To monitor cognitive arousal (Bradley et al., 2008), dynamic changes in pupil diameter were recorded while mice were performing the behavioral tasks. Despite the enforced behavioral (licking) suppression, the pupil diameter was significantly increased from baseline during the behavior (0.32 ± 0.05 to 0.44 ± 0.07 mm; p = 0.031; n = 6 mice), indicating mice were engaged in the task. When compared with the action goal-directed task, there was no significant difference in peak pupil diameter during the pretrial baseline (0.32 ± 0.05 vs 0.29 ± 0.06 mm, p = 0.312), pretactile (0.35 ± 0.06 vs 0.31 ± 0.07 mm, p = 0.219), and post-tactile (0.44 ± 0.07 vs 0.41 ± 0.09 mm; p = 0.687; n = 6 mice; Figure 3B) epochs. Furthermore, there was no significant difference in correct performance rates during the action (83% ± 5% correct; n = 11 mice) and suppression (86% ± 7%; n = 6 mice; p = 0.57) tasks. POm projections in S1 were highly active during the suppression task, with evoked Ca2+ transients that were significantly larger in amplitude than spontaneous activity (0.99 ± 0.06 vs 1.19 ± 0.06 ∆F/F; n = 144 axons, 6 mice; p = 0.0002).
Similar to the action task, POm axons were most active during the response epoch in correct HIT trials (evoked rate, 0.24 ± 0.03; n = 144 axons, 6 mice; Figure 3C). Therefore, since the suppression task requires mice to suppress licking during the response epoch, the increased response activity was not correlated with licking behavior. To assess whether POm activity reflected behavioral performance in the suppression task, evoked POm Ca2+ activity was directly compared during MISS trials. Similar to the action task, POm axons in forepaw S1 were less active during MISS behavior, with a significant decrease in the probability of response-evoked activity in MISS trials compared to HIT trials (HIT, 0.24 ± 0.03 vs MISS, 0.15 ± 0.03; n = 144/53 axons, 6 mice; p = 0.029; Figure 3D). Here, MISS behavior involves incorrectly licking for reward during the response epoch, further suggesting that POm axonal activity in S1 does not signal licking behavior. Taken together, in both the action and suppression tasks, POm axons in forepaw S1 preferentially encode the response epoch during correct performance (HIT trials). On average, the peak amplitudes (1.19 ± 0.06 vs 1.26 ± 0.05 ∆F/F, p = 0.3812) and durations (623 ± 50 vs 666 ± 35 ms; p = 0.2234) of Ca2+ transients evoked during the action and suppression tasks were comparable (Figure 3F). However, during the suppression task, the probability of POm signaling during the response epoch was significantly decreased compared to the action task (p = 0.0007; Figure 3F). This contrasts with the similar probability of evoked POm signaling during the reward epoch (p = 0.87; Figure 3F). Together, these results further support the increased signaling of POm axons within forepaw S1 during correct (HIT) goal-directed active behavior.

Ca2+ dynamics in POm axonal terminals during suppression of a goal-directed action.

(A) Behavioral task design. Two-photon Ca2+ imaging of POm axon terminals was performed in head-restrained mice trained to suppress a previously learned goal-directed action. Mice were trained to withhold licking in response to forepaw stimulation (200 Hz, 500 ms) for 1.5 s to get a reward (10 μl, 10% sucrose water). (B) Top, average pupil diameter with SEM (shaded area) during correct performance in the ‘suppression’ goal-directed task (red) and ‘action’ goal-directed task (black). Bottom, comparison of pupil dilation during the ‘action’ and ‘suppression’ goal-directed tasks in baseline, pre-tactile stimulus (pre-tac) and post-tactile stimulus (post-tac) epochs (n = 6 mice: Wilcoxon matched-pairs signed rank test). Gray line, trial start; red line, stimulus, blue line, reward delivery. (C) Top, raster plot showing the typical licking response during correct performance of the task. Gray, spontaneous; red, stimulus; green, response epoch; blue, reward epoch. Blue line, reward delivery. Middle, Ca2+ activity pattern in an example axon during HIT trials. Bottom, average Ca2+ activity pattern with SEM (shaded area) in HIT trials for the example axon (n = 17 trials). Red line, stimulus delivery; blue line, reward delivery. (D) Probability of evoking a Ca2+ transient during HIT (correct suppression of licking behavior; red) and MISS (no suppression of licking behavior; red empty). n = 144 axons, 6 mice; Wilcoxon matched-pairs signed rank test. (E) Overlay of the mass average with SEM (shaded area) of the normalized Ca2+ activity pattern during correct performance in the suppression goal-directed task shown in (C) (red) and action goal-directed task (black). (F) Probability of evoked Ca2+ transients during baseline, response, and reward epochs in the ‘suppression’ goal-directed task (red) and ‘action’ goal-directed task (black). Mann–Whitney test. **p < 0.01, ***p < 0.001, ****p < 0.0001.

POm axon activity during switching of tactile goal-directed behavior

Flexibly switching motor actions in response to changing conditions is crucial for survival. Termed ‘behavioral flexibility’, this enables changes in the behavioral response to sensory information in dynamic environments. To investigate the role of POm during switching of rewarded behavior, we performed Ca2+ imaging from POm axons in forepaw S1 as mice transitioned from the ‘action’ goal-directed task to the ‘action–suppression’ goal-directed task (Figure 4A). We refer to this behavioral paradigm as ‘switching’. On average, mouse performance returned to chance level (50% correct performance) 2.25 ± 0.47 training sessions after switching the rewarded behavior. To monitor task engagement, pupil tracking was performed during the switch in behavior. Compared with correct performance in the active goal-directed behavior, there was no significant difference in peak pupil diameter during pre-trial baseline (0.28 ± 0.05 vs 0.30 ± 0.07 mm, p = 0.6871, n = 6 mice), pre-tactile (0.29 ± 0.05 vs 0.32 ± 0.07 mm, p = 0.4372, n = 6 mice) and post-tactile epochs (0.40 ± 0.06 vs 0.43 ± 0.01 mm; p = 0.6874; n = 6 mice, Figure 4B). Although mice were equally engaged in the task, the activity of POm axonal projections in forepaw S1 was overall reduced during chance (50% correct) nonexpert behavior. Unlike expert behavior, the evoked rate of POm activity during chance performance did not reflect task performance, with similar evoked rates during both HIT (no lick, rewarded) and MISS (lick, unrewarded) responses (probability per trial, 0.15 ± 0.02 vs 0.15 ± 0.01, p = 0.74; n = 121 axons, 4 mice; Figure 4C–E). This rate of evoked activity during chance performance was similar to that in naive mice (p = 0.159), and significantly reduced when compared to expert performance (Figure 4E).
Furthermore, during chance performance in nonexpert mice, POm projections in S1 signaled neither correct performance nor reward delivery, as there was no difference in the evoked rate of POm axonal Ca2+ activity during the behavioral response and reward epochs (probability per trial, 0.15 ± 0.02 vs 0.14 ± 0.02; p = 0.62; n = 121 axons, 4 mice). Taken together, unlike expert behavior, POm axonal activity in forepaw S1 was reduced and not correlated with the behavioral response during chance, nonexpert, performance in a goal-directed task.

Ca2+ activity of POm axonal projections in forepaw S1 during chance performance and behavioral switching.

(A) Behavioral task design. Ca2+ imaging from POm axons in forepaw S1 was performed as mice transitioned from the ‘action’ goal-directed task to the ‘action–suppression’ goal-directed task (50% correct performance, green). (B) Average pupil dilation during baseline, pre-tactile stimulation (pre-tac) and post-tactile stimulation (post-tac) during the ‘switch’ (green) and ‘action’ task (black; n = 6). Red bar, tactile stimulus; blue bar, reward delivery. (C) Example licking behavior and associated Ca2+ responses from an example axon during HIT (top) and MISS (bottom) trials. (D) (left) Individual and (right) overlay of average with SEM (shaded area) of evoked Ca2+ transients during correct (green) and incorrect (light blue) performance from example in (C). (E) The probability of a Ca2+ transient in MISS (gray) and HIT (green) trials during chance performance (gray), and expert HIT performance in ‘action’ (black) and ‘suppression’ (red) tasks. (F) Peak amplitude of evoked Ca2+ transients during HIT trials in the action (black), switch (green), and suppression (red) behavioral task. Error bars indicate the mean ± SEM. **p < 0.01, ***p < 0.001, ****p < 0.0001.

To further investigate the potential role of the POm in behavioral switching, direct comparison of the amplitude of POm axonal transients was performed in mice which performed all tasks (Action, Switch, and Suppression tasks; n = 4 mice). Although POm axons were less active overall during chance behavior, the amplitude of Ca2+ transients, when evoked, was significantly larger than during expert behavior for both HIT (Action, 1.19 ± 0.08 ∆F/F; Switch, 1.51 ± 0.09 ∆F/F, Suppression, 1.13 ± 0.09 ∆F/F; p = 0.0003; n = 77/69/47 axons, 4 mice; Figure 4F) and MISS (Action, 0.76 ± 0.05 ∆F/F; Switch, 1.37 ± 0.07 ∆F/F, Suppression, 1.02 ± 0.08 ∆F/F; p = 0.0001; n = 79/88/30 axons, 4 mice) performance. The lower evoked rate but larger POm axonal transients during chance performance in a goal-directed task suggest a shift in the activity of POm input to forepaw S1 during nonexpert behavior, as mice are adjusting their behavioral strategy while learning the new goal-directed task.

The influence of POm input during expert goal-directed behavior

The results above suggest that POm axonal activity in forepaw S1 is greatest during the correct HIT behavioral response in expert, but not chance, performance of tactile goal-directed behavior. To investigate the role of this POm input on the correct performance during expert behavior, the POm was photoinhibited while expert mice performed the goal-directed task. Here, the inhibitory opsin, archaerhodopsin (ArchT; AAV1.CAG.ArchT.GFP.WPRE.SV40, 60 nl) was unilaterally injected into the POm. First, the effectiveness of 565 nm LED photoinhibition of POm neurons expressing ArchT was tested using patch clamp electrophysiology in the thalamic brain-slice preparation. Although photoinhibition did not completely abolish action potentials in POm neurons, the evoked firing rate was significantly decreased by 64% ± 13% (p = 0.031; n = 6 neurons; Figure 5—figure supplement 1). Next, we tested the influence of this decrease in POm activity on active goal-directed behavior in expert mice. A fiber-optic cannula was chronically inserted into the POm which was previously injected with ArchT (see Methods and Figure 5A) and mice were trained in the ‘action’ goal-directed task. Importantly, the duration of training and baseline performance was not affected by the injection of the inhibitory opsin into the POm (Figure 5—figure supplement 2). Once expert (>80% correct performance), the POm was initially photoinactivated with interleaved yellow LED light (565 nm, 5 mW, 2 s) during the stim/response epoch as this was when the POm was most active during the behavior (see Methods).
Our findings illustrate that partial photoinactivation of the POm during the stimulus and response epoch produced a significant reduction in the overall behavioral performance (d prime, 2.58 ± 0.15 vs 2.23 ± 0.26; n = 9 mice; p = 0.04; Figure 5B) while no change was observed in the control group injected with green fluorescent protein (GFP) in the POm (d prime, 2.62 ± 0.24 vs 2.83 ± 0.29; n = 9 mice; p = 0.26; Figure 5—figure supplement 2). Specifically, the reduction in correct performance following POm partial photoinactivation was due to a significant decrease in performance during HIT trials (z-score, 2.09 ± 0.09 vs 1.77 ± 0.15; n = 9 mice, p = 0.02) and not the rate of FAs (z-score, 0.48 ± 0.16 to 0.46 ± 0.20; n = 9 mice, p = 0.82; Figure 5C). Despite this change in performance, POm photoinactivation during the stim/response epoch did not alter licking behavior as there was no significant difference in the latency to the first lick (control, 351 ± 29 ms vs ArchT, 342 ± 26 ms, n = 9 mice, p = 0.1282, Figure 5D). The specific influence of POm partial photoinactivation during the stim/response epoch is consistent with the increased signaling of POm axons within forepaw S1 during this epoch in expert mice (Figures 2 and 3). Since POm axons within S1 also increased activity above baseline during reward delivery, albeit less than during the behavioral response, we next tested whether photoinactivation of the POm during the reward epoch influenced behavioral performance. Here, no change was observed in overall behavioral performance when POm was photoinactivated during reward delivery (ArchT d prime, LED ON 3.54 ± 0.27 vs LED OFF 3.64 ± 0.09; n = 5 mice; p = 0.99; Figure 5E). There was also no change in overall behavioral performance during LED ON in reward delivery in control (GFP) mice (GFP d prime, LED ON 3.70 ± 0.19 vs LED OFF 3.69 ± 0.12; n = 6 mice; p = 0.75). 
Furthermore, similar to photoinactivation during the stim/response epoch, there was no influence on licking behavior when the POm was photoinactivated during the reward epoch (Figure 5F). Taken together, decreasing POm activity during the stim/response epoch in expert mice influenced correct performance in a goal-directed task, suggesting the higher-order thalamus specifically influences correct, but not incorrect, goal-directed responses.
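For readers unfamiliar with the d prime measure reported above, the following is a minimal sketch of the standard signal-detection definition, d′ = z(HIT rate) − z(FA rate), where z is the inverse of the standard normal cumulative distribution. The log-linear correction (adding 0.5 to each count and 1 to each total) is an assumption added here to avoid infinite z-scores at rates of 0 or 1; the text does not state how the authors handled extreme rates, and the function name is hypothetical.

```python
from statistics import NormalDist


def d_prime(n_hit, n_miss, n_fa, n_cr):
    """Standard d' from trial counts.

    n_hit / n_miss: stimulus trials with / without a response
    n_fa / n_cr:    catch trials with / without a response
    Assumption: log-linear correction keeps both rates strictly inside (0, 1).
    """
    hit_rate = (n_hit + 0.5) / (n_hit + n_miss + 1.0)
    fa_rate = (n_fa + 0.5) / (n_fa + n_cr + 1.0)
    z = NormalDist().inv_cdf  # inverse standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

Note that the HIT and FA z-scores reported above sum approximately to the reported d prime values (2.09 + 0.48 ≈ 2.58 and 1.77 + 0.46 = 2.23), which suggests the FA z-score is reported with its sign already inverted. This is an inference from the numbers, not something stated in the text.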

Figure 5 with 2 supplements
Optogenetic inactivation of the POm during an active goal-directed task.

(A) Left, experimental design. The inhibitory opsin, archaerhodopsin (ArchT) was unilaterally injected into the POm and a fiber-optic cannula was chronically inserted into the brain. Right, localized ArchT spread in POm and fiber-optic track (dotted line), bar = 1 mm. POm was photoinactivated (590 nm, 5 mW, 2 s) either 500 ms prior to, and during the stimulus (S) and response (Rs) epochs (Stim/Resp), or during the reward epoch (Rw) in expert mice performing the ‘action’ goal-directed task. (B) Behavioral performance (d prime) for LED OFF vs LED ON during the stim/response epoch (n = 9 mice). Wilcoxon matched-pairs signed rank test. (C) z-Score during (left) HIT and (right) false alarm for LED OFF vs LED ON during the stim/response epoch (n = 9 mice). Wilcoxon matched-pairs signed rank test. (D) Latency to the first response lick in LED OFF vs LED ON during the stim/response epoch. Wilcoxon matched-pairs signed rank test. (E) Behavioral performance (d prime) during LED OFF and LED ON during the reward epoch in expert mice performing the ‘action’ goal-directed task (n = 5 mice). Wilcoxon matched-pairs signed rank test. (F) Normalized latency to the first response lick during LED ON in the stim/response epoch (solid) and reward (empty) epoch (normalized to the latency to the first lick during LED OFF). Line, median. Mann–Whitney test. Individual values are shown. *p < 0.05.

The influence of POm input during learning of goal-directed behavior

Our findings suggest that the POm changes activity patterns from naive to expert performance, raising the possibility that the POm plays a role in learning a sensory-based goal-directed task. To test this, a fiber-optic cannula was chronically inserted into the POm, which had previously been injected with ArchT, and mice were trained in the ‘action’ goal-directed task. During each training session, the POm was photoinactivated with yellow LED light (565 nm, 5 mW, 2 s) during the stim/response epoch, so as not to interfere with possible feedback pathways activated during reward and because this was also when the POm was most active during expert behavior. When the POm was photoinactivated during learning, mice took on average 7.6 ± 1.3 sessions to reach expert (>80% correct) performance (n = 5 mice; Figure 6A). This is significantly greater than the number of sessions it took to reach expert performance in mice that were previously injected with either ArchT (LED OFF during learning; 3.6 ± 0.4 sessions; n = 9 mice) or GFP (LED ON during learning; 4.5 ± 0.2 sessions; n = 6 mice; p = 0.004; Figure 6B). Therefore, decreasing POm activity during training in the goal-directed task influenced the rate of learning, with mice requiring more sessions to reach expert performance during POm photoinactivation (Figure 6C). Together, these findings suggest that the higher-order thalamus plays an important role in the learning of sensory-based tasks.
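The learning measure used here, the number of sessions needed to reach expert performance, can be sketched as follows. This is a minimal illustration that assumes one percent-correct value per training session and a strict >80% criterion; the function name and the example learning curve are hypothetical.

```python
def sessions_to_expert(percent_correct_by_session, criterion=80.0):
    """Return the 1-based number of the first session above criterion.

    Returns None if the criterion is never exceeded (for example, if
    training stopped at a maximum number of sessions).
    """
    for session_number, percent_correct in enumerate(percent_correct_by_session, start=1):
        if percent_correct > criterion:
            return session_number
    return None


# Hypothetical learning curve for one mouse (percent correct per session).
print(sessions_to_expert([55.0, 62.0, 71.0, 83.0, 90.0]))
```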

Figure 6
Optogenetic inactivation of the POm during learning of a goal-directed task.

(A) Mice were injected with either a control AAV (GFP; green) or the inhibitory opsin, archaerhodopsin (ArchT) into the POm and trained in the ‘action’ goal-directed task. A fiber-optic cannula was inserted to the POm and LED (590 nm, 5 mW, 2 s) was either ON (ArchT, orange; GFP, green) or OFF (ArchT, gray) during all training sessions. Dotted line indicates expert (>80% correct) performance. (B) The number of training sessions required for mice to reach expert (>80% correct) performance in mice injected with ArchT (LED OFF, gray; LED ON, orange) or GFP (LED ON, green). Kruskal–Wallis test. (C) The number of sessions for mice to reach expert (>80% correct) performance. **p < 0.01.

In summary, our findings suggest that POm axonal projections in forepaw S1 preferentially encode the behavioral response during learning and correct performance in tactile goal-directed behavior. Overall, POm axons were more active during expert performance, with greatest evoked rates in active behavior which required licking for reward, and POm photoinactivation during learning of the goal-directed behavior increased the number of training sessions required to reach expert performance. Taken together, these findings suggest that POm input to forepaw S1 shifts in strength and rate dynamically during learning and performance of a behavioral task, specifically encoding correct goal-directed action in the expert mouse.

Discussion

The results presented here highlight the role of the POm during sensory-based goal-directed behavior. We used two-photon Ca2+ imaging to illustrate that POm axonal activity in forepaw S1 encodes correct behavioral response during expert performance in tactile goal-directed behavior. Specifically, POm axons increased activity in the response and, to a lesser extent, reward epochs during correct performance in expert behavior. This is in contrast to chance performance, where POm axonal activity did not correlate with task performance. Furthermore, the POm influences learning and performance in goal-directed behavior, as photoinactivation of archaerhodopsin-expressing neurons in the POm decreased learning rates and correct performance in expert behavior. Taken together, these findings illustrate that POm input to forepaw S1 specifically encodes correct performance during goal-directed behavior and influences sensory-based learning.

The POm is a higher-order nonspecific thalamic nucleus that is reciprocally connected with S1, but also receives and sends projections to motor, premotor, association cortices, and the brainstem (Groh et al., 2014) as well as many subcortical regions including the zona incerta and striatum (Alloway et al., 2017; Oh et al., 2014; Trageser and Keller, 2004; Yamawaki and Shepherd, 2015). Given the extensive and heterogeneous organization of its afferent inputs, it is difficult to determine the input source driving POm activity during complex goal-directed behavior. However, since the POm receives input from the primary motor cortex (Yamawaki and Shepherd, 2015), it could be speculated that the increased axonal activity during the behavioral response has a motor origin. In fact, previous studies have shown that the thalamus is a circuit hub in motor preparation (Guo et al., 2017). While a general increase in POm activity has been reported during active states (Urbain et al., 2015), encoding of whisking related movement in the POm is relatively poor (Moore et al., 2015). In agreement, our findings illustrate that POm input in forepaw S1 does not specifically encode movement, as (1) POm activity is enhanced during the response epoch in both the action (licking) and suppression (no licking) tasks, (2) there is not a strong correlation between POm activity and licking frequency, and (3) POm activity is minimal during spontaneous licking and FA trials.

Here, we illustrate that POm activity is correlated with task performance in expert mice with greater signaling in POm axonal projections within forepaw S1 during correct HIT behavior in both the action and suppression tasks. Similar results were recently found in the thalamocortical circuit subserving the anterior lateral motor cortex (Takahashi et al., 2021) illustrating this may be a universal role of the thalamus. Although the POm encodes sensory information in naive mice, in the expert state, our findings suggest that this increased activity in POm axons during the response epoch is not primarily due to enhanced sensory encoding. Here, despite receiving exactly the same tactile stimulus, POm signaling in forepaw S1 is increased during correct HIT trials compared with MISS trials in both the action and suppression tasks. This difference in POm activity was not due to differences in licking behavior nor arousal, as POm activity was similar during the action and suppression tasks (which involved licking and not licking for reward) and did not reflect levels of arousal measured using pupil tracking. The difference in POm activity during the HIT and MISS trials was also not due to stimulus delivery as all experiments were monitored online via a behavioral camera to examine the location of the forepaw on the stimulus during all trials, and trials where the paw was not clearly resting on the stimulating rod were excluded from analysis. However, we cannot rule out that nondetectable changes in postures/paw grip may occur which may alter the effectiveness of the stimulus.

Although, overall, POm axons in forepaw S1 were predominantly active in the response epoch during correct performance in a tactile goal-directed behavior, the activity patterns of individual axons were heterogeneous, which may be due to a heterogeneous population of POm neurons projecting to S1 (Clascá et al., 2012). Our findings illustrate that a subset of axons were correlated with the sensory stimulus and reward epoch during expert behavior. It is also possible that single POm axons may have heterogeneous encoding which, since our findings are based on overall average activity per POm axonal projection, would not be evident in our study. Delving into encoding at the level of a single thalamic axon is an exciting direction for future research. How does POm encoding of goal-directed behavior compare to the activity of other thalamic nuclei which also project to forepaw S1? Of particular interest is the ventral posterolateral nucleus of the thalamus (VPL) which, in contrast to POm, targets the middle cortical layers of forepaw S1. Viewed as a feedforward (sensory) pathway, perhaps the VPL axons would be more active during the stimulus delivery and, in contrast to POm axons, their activity would be similar between the different behavioral tasks (action, suppression, and switch). It is of great interest to compare and contrast these different pathways to gain a holistic view of the role of the thalamus during goal-directed behavior, which will be the focus of exciting future studies.

In agreement with the POm axonal activity within S1, behavioral performance in expert mice was disrupted when the POm was photoinactivated during the stimulus and response epoch, but not during reward delivery. Together, these results suggest that the POm predominantly encodes the behavioral response. The behavioral effect was small which may be due to the following. Firstly, we illustrate that photoinhibiting LED light (565 nm) caused a significant decrease in the evoked action potential rate in POm neurons expressing archaerhodopsin in vitro. Taking into account the high firing rate of POm neurons in vivo, photoinhibition would not completely abolish POm activity in vivo. Therefore, during the goal-directed behavior, the POm is presumably still active, albeit at a reduced rate. Secondly, there was large behavioral variability which may reflect different rates of transfection and optical fiber placement. Thirdly, previous studies have illustrated that similar sensory-based goal-directed behaviors do not require primary cortical areas (Hong et al., 2018). Therefore, it is not expected that partially inhibiting an input stream to the forepaw S1 would have a large effect on the behavioral performance. Combined with the reported increase in POm axonal activity during correct performance in expert mice, the influence of POm photoinactivation on task performance further supports the finding that POm encodes correct performance in goal-directed action. In this study, POm was also photoinactivated during learning of the tactile goal-directed behavior. Here, dampening POm activity significantly decreased the rate of learning, causing a greater than twofold increase in the number of training sessions required to reach expert performance. In our study, the influence of POm photoinactivation on goal-directed behavior was measurably greater during learning than expert performance, suggesting that the POm plays a vital role in learning. 
The role of the POm during learning requires more in-depth investigation, and since the POm targets many cortical and subcortical regions (Alloway et al., 2017; Oh et al., 2014; Trageser and Keller, 2004; Yamawaki and Shepherd, 2015), future studies with target-specific photoinhibition are required to illustrate which POm projection pathway specifically influences the learning and execution of goal-directed behavior.

In this study, considerable effort was made to ensure the specific targeting of POm. The POm was stereotaxically targeted with small volumes and the resulting fluorescence at both the thalamic injection site and the cortical layer targeted by the axonal projections was scrutinized after every experiment (Gambino et al., 2014). We note that our stereotaxic injections were not flawless and virus occasionally spread into ventral posterior nuclei, or along the injection pipette track and into higher-order visual thalamic nuclei, superficial to the POm. If fluorescence was detected in nontargeted areas, then the experiments were excluded from analysis. It is possible that there was weak (undetectable) expression outside of the POm; however, these neighboring thalamic nuclei do not predominantly target layer 1 of the forepaw area of S1 (Kamishina et al., 2009; Meyer et al., 2010; van Groen and Wyss, 1992) and therefore would not significantly contribute to our calcium imaging findings. In the optogenetic photoinhibition experiments, targeting of the fiber-optic cannula to the POm was confirmed after every experiment and weak expression of ArchT outside of the POm would therefore also have minimal impact on our findings.

To probe whether thalamocortical projections to S1 are dynamic and change activity patterns according to changes in reward expectation and delivery, we recorded the activity of POm axonal projections in S1 following a switch in the task contingency. In accordance with thalamic function playing an important role in behavioral flexibility (Wimmer et al., 2015), evoked axonal Ca2+ activity was altered during the switch in rewarded behavior, suggesting a shift in action potential firing in POm thalamocortical neurons. Changes in firing mode have been reported in sensory higher-order thalamus of behaving rodents and primates (Ramcharan et al., 2005; Urbain et al., 2015) and may underlie cortical state changes during uncertain conditions and changes in reward expectation (Bruno and Sakmann, 2006; Poulet et al., 2012). These changes in firing patterns could drive different microcircuits (Allen et al., 2017; Morgenstern et al., 2016; Tye and Uchida, 2018) as POm inputs to the cortex directly target both excitatory and inhibitory neurons (Audette et al., 2018); however, more in-depth studies are required to directly investigate the influence of changing POm input on cortical microcircuits. Increased activity in the higher-order thalamus has also been associated with the expected value and significance of rewarded sensory stimuli (Komura et al., 2001) and may reflect learning-dependent strengthening of specific POm thalamocortical synapses (Audette et al., 2019). Our findings show that POm activity is enhanced during reward delivery in the tactile goal-directed task, although the absolute POm signaling is less than during the behavioral response.

Considering that patterns of cortical activity during behavior have been associated with task engagement, brain state, attention, motivation, or reward (Kobak et al., 2016; Lacefield et al., 2019; Poort et al., 2015; Poulet and Petersen, 2008; Reimer et al., 2014), we monitored pupil dynamics during the goal-directed tasks. We report that while overall POm activity increased concomitantly with pupil diameter during the behavioral response, this trend was reversed during reward delivery. By sorting POm axons according to their peak activity during the tactile goal-directed task, we revealed a subgroup of POm axons highly responsive during the reward epoch. This finding highlights the heterogeneity of the higher-order thalamus, with subsets of POm axonal projections specifically encoding either the stimulus, response, or reward delivery. In line with this finding, a recent report further supports the functional heterogeneity of POm cortical input and suggests it has a modulatory role in various brain regions during decision making in a goal-directed task (El-Boustani et al., 2020). However, overall, the results presented here illustrate that the POm predominantly transfers behaviorally relevant information to forepaw S1 during the response epoch of goal-directed behavior. Specifically, POm input to S1 is greatest in the response epoch during correct HIT performance in expert behavior. Although this increased activity during the behavioral response epoch may not be necessary for maintaining the tactile information, these findings suggest that the POm does not simply encode sensory information, but it also reports behavioral outcome in learnt behavior and changes in behavioral state. Since the POm projects to various cortical and subcortical regions (Oh et al., 2014; Yamashita et al., 2018), the POm may also send task relevant information to other brain regions (El-Boustani et al., 2020). 
Likewise, since S1 also receives input from various brain regions, it would be of interest to investigate whether other input pathways send complementary information during goal-directed behavior.

In summary, we show that the higher-order thalamus encodes correct performance during goal-directed behavior and influences the rate of learning. This finding expands the known roles of the higher-order thalamic nuclei, from sensory encoding to influencing learning and correct performance in goal-directed behavior. Overall, the thalamus is not a simple relay system. It encodes and influences learning of goal-directed behaviors which are crucial for survival in a dynamic environment.

Materials and methods

All procedures were approved by the Florey Institute of Neuroscience and Mental Health Animal Care and Ethics Committee (17-091-FINMH) and followed the guidelines of the Australian Code of Practice for the Care and Use of Animals for Scientific Purposes.

Mice

Wild type C57BL/6 female mice (PN30–80) were used in this study. Mice were housed in groups of six in a 12:12 natural light/dark cycle. All behavioral tests were performed during the light phase.

Virus injection

All surgical procedures were conducted under isoflurane anesthesia (~1–2% in O2). Body temperature was maintained at ~36°C and the depth of anesthesia was monitored throughout the experiment. Mice (~PN30–40) were placed in a stereotaxic frame (Narishige) and eye ointment was applied to prevent dehydration. The skin was disinfected with 70% ethanol and betadine before lidocaine (1%, wt/vol) was topically applied to the wound edges for additional local anesthesia. An incision in the skin (10 mm) was made to expose the skull and a small craniotomy (~0.5 × 0.5 mm) was made over the left posteromedial (POm) complex of the thalamus using the following stereotaxic coordinates: rostrocaudal (RC), 1.7 mm; mediolateral (ML), 1.25 mm; dorsoventral (DV), 3.00 mm from bregma. AAV1.Syn.GCaMP6f.WPRE.SV40 (Addgene plasmid # 100837, 1 × 10^13 vg/ml) or AAV1.CAG.ArchT.GFP.WPRE.SV40 (Addgene plasmid # 29777, 1 × 10^13 vg/ml) was slowly injected from a glass pipette (60 nl, Wiretrol, Drummond) for at least 5 min using an oil hydraulic manipulator system (MMO-220A, Narishige). The skin was then sutured and Meloxicam (3 mg/kg) was injected intraperitoneally (i.p.) for additional postoperative analgesia and anti-inflammatory action. Mice were then returned to their home cage for recovery.

Chronic cranial window surgery

Mice previously injected with the Ca2+ indicator GCaMP6f were anaesthetized (isoflurane, ~1–2% in O2, vol/vol). Body temperature was maintained at ~36°C and the depth of anesthesia was monitored throughout the experiment. Eye ointment was applied to prevent dehydration, the top of the head was disinfected with 70% ethanol and betadine, and lidocaine (1%, wt/vol) was topically applied for additional local anesthesia. The skin covering the skull was removed, and a craniotomy was performed over the left forepaw area of the primary somatosensory cortex (centered at coordinates: RC, 0 mm; ML, 2.3 mm; from bregma). The dura was left intact and a circular coverslip (3 mm diameter) was placed over the open craniotomy and sealed to the skull with acrylic glue. A custom-made aluminum head bar (2 × 1 × 0.1 cm) was then attached to the skull for head-fixation using dental cement (C&B metabond, Parkell Inc). Meloxicam (3 mg/kg) was injected i.p. for additional postoperative analgesia and anti-inflammatory action. Mice were then returned to their cages to recover until behavioral training (~2 weeks).

Habituation and behavior

Mice were trained to perform a goal-directed tactile task using a custom-made behavioral platform (Micallef et al., 2017). A 3- to 4-day habituation period preceded the beginning of the operant conditioning. During this period, mice were handled and acclimatized to the behavioral setup. Mice were head restrained for incremental periods of time until habituated to head restraint. To maximize task engagement, a day prior to the beginning of behavioral training, mice were water restricted (1 ml/day of 10% sucrose water) and from this day onward this water regimen was maintained until the end of the experiment. Behavioral sessions lasted ~300 trials during which the mice typically obtained their daily water intake (1 ml/day) otherwise extra water was supplemented. Ca2+ imaging was performed following this habituation phase for naive data.

Behavioral platform

Mice were head-fixed to the recording frame and their paws rested unaided on either an active (contralateral) or inactive (ipsilateral) rod coupled to a stepper motor driven by an Arduino Uno microprocessor. The stepper motor delivered a pure-frequency forepaw tactile stimulus (500 ms, 200 Hz). A water port was used to deliver a water reward (10 μl, 10% sucrose water) and licking frequency was recorded via a custom-made piezo-based lick sensor attached to the lick port. All behavioral tests were carried out in the dark while the animal behavior was monitored with an infrared-sensitive camera (Microsoft Lifecam). During the first training sessions, mice were habituated to tactile stimulus and reward delivery (typically one to two sessions). To establish an association between stimulus and reward, mice were able to self-initiate a trial by licking the water port, which instantaneously triggered both stimulus and reward. After this habituation phase, operant conditioning was performed. Action goal-directed task: Background white noise (~40 dB) was played for the duration of each trial to indicate task onset and mask nontask-related sounds. Tactile stimulation (200 Hz, 500 ms) was delivered after a 3 s baseline period. Following stimulus presentation, mice were given a 1.5 s interval to report the detection of the tactile stimulus by licking the lick port (response epoch), after which reward was made available and cued by an auditory sound (400 Hz, 200 ms). Mice were then given a 2 s time window to retrieve the reward, after which the trial terminated, followed by an intertrial interval (ITI) of randomized duration (between 4 and 7 s). Only correct responses (licks during the response epoch) were rewarded (Correct) while failure to report stimulus detection was considered an incorrect response (Incorrect). Trials with no tactile stimulation (catch trials) were randomly interleaved with stimulus trials.
Licking within the response epoch during a catch trial was considered a FA and punished with a timeout of incremental duration (2–7 s), whereas withholding licking was the correct response (correct rejection, CR), which was not rewarded. Implementing catch trials and randomized ITI ensured that animals could not solve the task by adopting a time-based strategy. To facilitate learning, during the first training session the frequency of stimulus/catch trials was set to 90%/10%, respectively. The frequency of catch trials was progressively increased up to 40% and maintained at this ratio until mice could reliably perform at expert level (≥80% correct response rate). On average, mice reached expert level within 4.38 ± 0.37 training sessions. Action–suppression goal-directed task: Background white noise (~40 dB) was played for the duration of each trial to indicate task onset and mask nontask-related sounds. As in the action goal-directed task, tactile stimulation (200 Hz, 500 ms) was delivered after a 3-s baseline period. However, following stimulus presentation, mice were trained to withhold their licking for a 1.5-s interval. Mice were then given a 2-s time window to retrieve the reward, after which the trial terminated, followed by an ITI of randomized duration (between 4 and 7 s). Correct suppression of licking during this epoch was rewarded with sucrose water (10 μl, 10%). Conversely, if mice licked during this interval (early lick) no reward was delivered and the trial was aborted. Catch trials were used as in the action goal-directed task. Mice learned to reliably suppress licking (≥80% correct response rate) after an average of 6 ± 0.85 training sessions (Figure 5—figure supplement 2). Switch goal-directed task: Recordings were performed as mice transitioned from the action goal-directed task to the action–suppression task.
On average, mice expert in the action task decreased performance to chance level after 2.25 ± 0.47 training sessions on the action–suppression task, at which point recordings were performed.
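The outcome rules of the action task described above can be summarized in a short sketch. The 1.5 s response window and the 4–7 s ITI come from the text; the function names, the encoding of lick times relative to the start of the response epoch, and the use of a uniform ITI distribution are assumptions made for illustration, not the code that ran the behavioral platform.

```python
import random

RESPONSE_WINDOW_S = 1.5   # response epoch duration (from the text)
ITI_RANGE_S = (4.0, 7.0)  # randomized intertrial interval (from the text)


def classify_action_trial(stimulus_present, first_lick_s):
    """Classify one 'action' trial as HIT, MISS, FA, or CR.

    first_lick_s is the time of the first lick relative to the start of the
    response epoch, or None if the mouse did not lick (hypothetical encoding).
    """
    licked = first_lick_s is not None and 0.0 <= first_lick_s <= RESPONSE_WINDOW_S
    if stimulus_present:
        return "HIT" if licked else "MISS"
    return "FA" if licked else "CR"


def draw_trial(catch_fraction, rng):
    """Draw whether the next trial has a stimulus, and its ITI in seconds.

    Assumption: the ITI is drawn uniformly between 4 and 7 s.
    """
    stimulus_present = rng.random() >= catch_fraction
    iti_s = rng.uniform(*ITI_RANGE_S)
    return stimulus_present, iti_s


rng = random.Random(0)
print(draw_trial(0.4, rng))               # 40% catch trials, as at expert level
print(classify_action_trial(True, 0.4))   # lick within the window on a stimulus trial
```

In the action–suppression task the mapping is inverted on stimulus trials: withholding licks during the 1.5 s interval is the rewarded response.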

Two-photon Ca2+ imaging

Imaging of POm axons in forepaw S1 expressing the Ca2+ indicator GCaMP6f was performed in awake behaving mice through a chronic cranial window approximately 3 weeks after virus injection. Head-fixed mice were placed under a two-photon microscope (Thorlabs A-scope) and POm axons located 48 ± 6.8 μm below the pia surface were excited using a Ti:Sapphire laser (Spectra Physics Mai Tai DeepSee) tuned to 940 nm and passed through a 16x water immersion objective (Nikon, 0.8 NA). GaAsP photomultiplier tubes (Hamamatsu) were used for detection. The field of view (FOV) spanned 512 × 512 pixels and images were acquired at 30 Hz. To minimize photodamage, the excitation power was adjusted online to the minimal value sufficient to record Ca2+ transients and the number of imaged trials for a given FOV was restricted to a maximum of 40. During each trial, animal behavior was monitored with an infrared-sensitive camera (Microsoft Lifecam). Forepaw position on the tactile stimulator was recorded using an infrared webcam and analyzed post hoc. The low resolution of the video recording did not allow detailed tracking of the paw. However, gross forepaw location on the tactile stimulator could be determined, and any trials where the forepaw was not in contact with the stimulator were removed from further analysis.

Cannula implant and photoinactivation of POm complex during learning and expert behavior

For optical inactivation of the POm complex, mice were injected ipsilaterally into the left POm with the inhibitory opsin AAV1.CAG.ArchT.GFP.WPRE.SV40 (60 nl; see Virus injection). Following virus injection, a custom-made fiber-optic cannula (FT400EMT, 400 µm 0.39 NA, 2.5 mm fiber, Thorlabs) was slowly lowered down the injection track using a stereotaxic arm until the desired depth was reached (2.5 mm from pia). Dental cement (C&B metabond, Parkell Inc) was then applied around the edges of the cannula to secure it to the skull and left to dry for ~5 min. The same dental cement was used to attach a custom-made aluminum head bar (2 × 1 × 0.1 cm) to the skull for head-fixation. Meloxicam (3 mg/kg) was injected i.p. for additional postoperative analgesia and anti-inflammatory action. Mice were then returned to their cages to recover until behavioral training (~3 weeks). Behavioral procedures: After recovery, mice were trained on the action goal-directed task (see Habituation and behavior). All behavioral procedures were performed using the Bpod behavioral platform (Bpod State Machine r1, Sanworks). Photoinactivation: Photoinactivation of the POm complex was achieved by delivering a light pulse (565 nm, 5 mW) through a 400 µm optical fiber (FT400EMT, Thorlabs) directly inserted into the cannula (FT400EMT, 400 µm 0.39 NA, 2.5 mm fiber, Thorlabs). A LED light source (LEDD1B, Thorlabs) coupled to a 565-nm LED filter (M565F3, Thorlabs) was used to generate the photostimulus. A custom-made light shield was placed over the animal’s head to prevent scattered light from entering the animal’s visual field. Custom routines in Matlab were used to operate the behavioral platform and data acquisition. Photoinactivation was either performed during learning, or once mice reached expert level (≥80% correct response rate).
During expert performance, the light pulse was delivered to inactivate the POm during the stim/response epoch (2-s duration; onset 500 ms prior to stimulus onset) or during the reward epoch (2-s duration; onset at reward delivery). During a typical experimental session (~300 trials), LED-ON and LED-OFF trials were randomly interleaved at a rate of 50% each. To photoinactivate the POm during learning of the goal-directed task, the LED was delivered during the stimulus and response epoch in all trials throughout learning (2-s duration; onset 500 ms prior to stimulus onset) until mice had reached expert performance or for a maximum of 10 consecutive days of training. For control experiments, mice were stereotaxically injected into their left POm (see virus injections) with AAV1-PAM MuseeGFP (kindly provided by Daniel Scott, 60 nl) and experiments carried out as above.
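The LED protocol for expert sessions can be sketched as a trial schedule plus a light window. The 50% interleaving, the 2-s pulse, and the 500 ms lead before stimulus onset come from the text; the exact balancing of ON and OFF trials by shuffling (rather than an independent coin flip per trial), the function names, and the example stimulus time are assumptions for illustration only.

```python
import random


def led_schedule(n_trials, seed=None):
    """Return a shuffled list of booleans (True = LED ON).

    Assumption: ON and OFF trials are balanced exactly (half each for an
    even number of trials) and then shuffled.
    """
    n_on = n_trials // 2
    schedule = [True] * n_on + [False] * (n_trials - n_on)
    random.Random(seed).shuffle(schedule)
    return schedule


def stim_response_led_window(stimulus_onset_s, lead_s=0.5, duration_s=2.0):
    """Return (LED onset, LED offset) in seconds for the stim/response epoch."""
    onset = stimulus_onset_s - lead_s
    return onset, onset + duration_s


schedule = led_schedule(300, seed=1)
print(sum(schedule), len(schedule))
print(stim_response_led_window(3.0))  # stimulus after a 3 s baseline
```

During learning, by contrast, the text specifies that the LED was delivered on all trials, so no interleaved schedule is needed there.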

Pupil tracking and analysis

To monitor engagement during the task, pupil tracking was performed in a subset of mice previously trained on the action goal-directed task for the ArchT experiments (see above). Pupil tracking was performed when mice were expert on both the action task and the action–suppression task (see Habituation and behavior). Pupil tracking was also performed during the transition between these tasks (switching) when their correct response rate dropped to chance level (∼50%). Mice were head-fixed and the right eye illuminated with infrared light (850 nm LED, Thorlabs). This illumination did not affect pupil diameter. Behavioral sessions were performed on the same apparatus used for two-photon imaging inside an aluminum soundproof optical enclosure. However, some illumination (3.48 lux) was provided as we found that the pupil became maximally dilated and adynamic in complete darkness. An IR-sensitive camera (Basler acA1300-200um) fitted with a 50 mm lens (Kowa 50 mm/F2.8) was used to image pupil dynamics at 15 frames per second. Frames were triggered externally using an Arduino microprocessor connected to a Bpod (Bpod State Machine r1, Sanworks) which was then used to operate the behavioral paradigm. Changes in pupil diameter were recorded and measured online using custom routines kindly provided by Bahr, Kremkow, Sachdev, and colleagues (Bergmann et al., 2019).

Ex vivo whole-cell recordings and photoinhibition of POm neurons by ArchT activation

Mice (P40–45) previously injected with ArchT in the POm (>14 days prior) were anaesthetized with isoflurane (3–5% in 0.75 l/min O2) before decapitation. The brain was then rapidly transferred and cut in an ice-cold, oxygenated solution containing (in mM): 110 choline chloride, 11.6 Na-ascorbate, 3.1 Na-pyruvate, 26 NaHCO3, 2.5 KCl, 1.25 NaH2PO4, 0.5 CaCl2, 7 MgCl2, and 10 D-glucose (Sigma). Coronal slices of the POm (300 µm thick) were cut with a vibrating microslicer (Leica Vibratome 1000 S) and incubated in an incubating solution containing (in mM): 125 NaCl, 3 KCl, 1.25 NaH2PO4, 25 NaHCO3, 1 CaCl2, 6 MgCl2, and 10 D-glucose at 35°C for 20 min, followed by incubation at room temperature for at least 30 min before recording. All solutions were continuously bubbled with 95% O2/5% CO2 (Carbogen). Whole-cell patch clamp somatic recordings were made from visually identified pyramidal neurons using differential interference contrast microscopy. During recording, slices were constantly perfused at ~1.5 ml/min with carbogen-bubbled artificial cerebrospinal fluid containing (in mM): 125 NaCl, 25 NaHCO3, 3 KCl, 1.25 NaH2PO4, 1.2 CaCl2, 0.7 MgCl2, and 10 D-glucose maintained at 30–34°C. Patch pipettes were pulled from borosilicate glass, had an open tip resistance of 5–7 MΩ, and were filled with an intracellular solution containing (in mM): 135 potassium gluconate, 70 KCl, 10 sodium phosphocreatine, 10 HEPES, 4 Mg-ATP, 0.3 Na2-GTP, and 0.3% biocytin adjusted to pH 7.25 with KOH. Photoinhibition of POm neurons was achieved by shining a 565 nm LED light (1 s) onto the slice surface during somatic current injection steps (2 s). Firing rates before and during light application were quantified and compared to the same time period of the current step injection when no light was applied (Figure 5—figure supplement 1).

Histology

At completion of each experiment, mice were transcardially perfused with phosphate buffer (0.1 M) and 4% paraformaldehyde (PFA) solution. Brains were collected and post fixed overnight (~12 hr) in 4% PFA at 4°C before being cut into 200 μm coronal slices using a vibratome (Leica VT1000 Automated Vibratome) and mounted on glass slides using mounting medium containing the nuclear staining dye DAPI (Fluoroshield, Sigma). Images of the brain slices were acquired using wide-field fluorescent microscopy (Zeiss Axio Imager 2). Images were taken such that excitation light (EYFP, 555 nm; DAPI, 430 nm) was optimized below the maximum pixel saturation value for each fluorophore. To evaluate virus (GCaMP6f, ArchT) expression profiles in the POm complex, images of brain sections were registered to the corresponding coronal plates of the Paxinos mouse brain atlas (Paxinos and Franklin, 2001). Data from off-target injections or failed viral expression were removed from further analysis.

Data analysis and statistical methods

Ca2+ data

All analyses were performed using ImageJ and custom-written routines in MATLAB or Python. Horizontal and vertical drifts of imaging frames due to animal motion were corrected by registering each frame to a reference image based on whole-frame cross-correlation. The reference image was generated by averaging frames for a given FOV in which motion drifts were minimal (< 15 pixels). Regions of interest (ROIs) of axonal shafts or boutons were selected using the standard deviation of the entire imaging session (~6000–8000 frames) and manually drawn using the freehand tool in ImageJ. ROIs were selected so that each ROI represented a single POm axon. The activity profile was compared across all ROIs in a FOV. ROIs with similar activity profiles (where events were temporally correlated in greater than 95% of trials) were presumed to be axonal branches or boutons of the same neuron and replicates were excluded from analysis. On average, each FOV had 19 ± 2 ROIs. Across sessions the FOVs were overlapping; however, due to the size and sheer density of axonal projections, individual axons were not imaged across sessions. To calculate the baseline fluorescence (F0) for each ROI, first the average baseline fluorescence intensity (across 60 frames prior to stimulus onset, 2 s) of each trial was taken. Second, the rolling median of these average baseline values was measured and used as F0. Fluorescence traces are expressed as relative fluorescence changes, ∆F/F = (F – F0)/F0. Only Ca2+ transients which were greater than 2x the baseline standard deviation (F0 + (2x s.d.)) and above the threshold for a period longer than 200 ms were selected. ROIs were only considered for analysis if there was at least one Ca2+ transient reported during the trial (termed ‘active axons’). The onset of a Ca2+ transient was defined as the time point at which a transient crossed the detection threshold (F0 + (2x s.d.)).
Both peak amplitude and probability of an evoked Ca2+ transient per trial were typically measured. Ca2+ transient amplitude may reflect the number of action potentials, whereas Ca2+ transient probability is independent of the number of evoked action potentials. Average Ca2+ transient probability was measured as (Σ events_time)/(Σ trials). The peak amplitude (∆F/F) was measured as the local maximum between the event onset and offset (i.e., when the falling edge of the transient crossed the threshold again). The duration (ms) of a Ca2+ transient was calculated as the time between the event onset and offset.
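
The detection steps above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' routine: the function names, the rolling-median window, the 30 Hz frame rate (inferred from 60 frames spanning 2 s) and the use of the population standard deviation are all assumptions. The threshold is applied in ∆F/F units, i.e. baseline mean plus 2 s.d. of the baseline ∆F/F.

```python
from statistics import mean, median, pstdev

FRAME_RATE_HZ = 30.0  # assumption: 60 baseline frames correspond to 2 s


def rolling_median(values, window=5):
    """Centered rolling median of per-trial baseline means (window is hypothetical)."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]


def delta_f_over_f(trace, f0):
    """Relative fluorescence change, (F - F0) / F0, for one trial."""
    return [(f - f0) / f0 for f in trace]


def detect_transients(dff, n_baseline=60, n_sd=2.0, min_duration_s=0.2,
                      frame_rate=FRAME_RATE_HZ):
    """Find threshold crossings that stay above threshold for longer than min_duration_s.

    Returns one dict per event with onset/offset frame indices, peak dF/F
    between onset and offset, and duration in ms.
    """
    baseline = dff[:n_baseline]
    threshold = mean(baseline) + n_sd * pstdev(baseline)
    events, onset = [], None
    # A -inf sentinel closes any event still above threshold at the end of the trace.
    for i, value in enumerate(list(dff) + [float("-inf")]):
        if value > threshold and onset is None:
            onset = i  # rising edge crosses the detection threshold
        elif value <= threshold and onset is not None:
            duration_s = (i - onset) / frame_rate
            if duration_s > min_duration_s:
                events.append({"onset": onset, "offset": i,
                               "peak": max(dff[onset:i]),
                               "duration_ms": 1000.0 * duration_s})
            onset = None
    return events
```

In this sketch, F0 for a given trial would be the matching entry of `rolling_median` applied to the list of per-trial baseline means, and `detect_transients` would be run on the resulting ∆F/F trace of each ROI and trial.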

Three behaviorally relevant epochs were selected (1 s duration): spontaneous activity (-2 to -1 s, relative to stimulus onset), response activity (0 to +1 s, relative to stimulus onset), and reward activity (0 to +1 s, relative to reward delivery). In a subset of axons (n = 107 axons, 3 mice), Ca2+ transients were further subcategorized as occurring during either the stimulus (0–500 ms) or the response epoch (500–1000 ms) of the goal-directed task. Here, only a small portion of the axons (6%) were active only during the stimulus, whereas most axons were active during the response (only, 38% or combined, 56%). Therefore, to ensure accurate analysis of Ca2+ transients by using an expanded temporal window, the stimulus and response epochs were merged in the reported results. For probability comparisons, all ROIs were used, while only the subset of ROIs (i.e., axons) with detectable events (greater than the threshold) were used to measure amplitude and duration. This accounts for the difference in the number of axons used for each analysis. For direct comparison of POm activity during different epochs/behaviors, the subset of active axons with detectable Ca2+ events was typically used for analysis. On occasion, a mass average Ca2+ response was used instead, which is the average across all axons whether or not they had a response. Where appropriate, the variance of the peak Ca2+ amplitudes was compared using an F test. For displaying population activity, each row of the Ca2+ activity pattern is an individual axon, and rows are sorted by the timing of the peak amplitude for the particular behavioral condition.
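
As a rough illustration of the epoch-based probability measure, the sketch below counts, for one axon, the fraction of trials with at least one transient onset inside a window. The epoch boundaries come from the text; the function names, and the choice to assign an event to an epoch by its onset time and to count at most one event per trial, are assumptions rather than the authors' stated procedure.

```python
# Epoch windows in seconds. "spontaneous" and "response" are relative to
# stimulus onset; "reward" is relative to reward delivery.
EPOCHS = {
    "spontaneous": (-2.0, -1.0),
    "response": (0.0, 1.0),
    "reward": (0.0, 1.0),
}


def realign(onsets, reference_time):
    """Shift event onset times so that reference_time becomes t = 0."""
    return [t - reference_time for t in onsets]


def transient_probability(onsets_per_trial, window):
    """Fraction of trials with at least one transient onset inside window.

    onsets_per_trial: list (one entry per trial) of lists of onset times (s).
    window: (start, stop) in the same time reference as the onsets.
    """
    start, stop = window
    n_active = sum(
        any(start <= t < stop for t in onsets) for onsets in onsets_per_trial
    )
    return n_active / len(onsets_per_trial)
```

For the reward epoch, onsets expressed relative to stimulus onset would first be passed through `realign` with each trial's reward delivery time before calling `transient_probability`.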

Pupil tracking

Videos of pupil tracking and animal behavior were acquired and checked post hoc to remove potential artifacts due to sudden eyelid closing. Analysis of pupil dynamics was performed using a custom-written algorithm in Python. Briefly, pupil tracking for the entire session was split into single trials (11 s duration) according to behavioral outcome. The average response profile was then calculated for each trial type for each mouse. Pupil dilation was monitored during a 4-s baseline period preceding the beginning of each trial. The average peak diameter was measured as the local maximum of the average pupil response during the baseline epoch (-4 to 0 s, relative to trial start; baseline), pre-tactile epoch (-3 to 0 s, relative to stimulus onset; pre-tac), and post-tactile epoch (0 to +4 s, relative to stimulus onset; post-tac).

Behavior

The correct response rate was determined either as d prime, the difference between the z-transformed HIT rate and false alarm (FA) rate (d' = z(H) - z(F)), or as the fraction of correct trials over the total number of trials: (HIT trials + correct rejection trials)/(stimulus trials + catch trials). The behavioral effects of POm photoinactivation were quantified by comparing correct responses in photoinactivation (LED-ON) trials vs. control (LED-OFF) trials, typically 150 each per experimental session. LED-ON trials and LED-OFF trials were randomly interleaved. The latency to first lick was calculated as the time of the first lick after stimulus onset.
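
The two performance measures can be written out as a short Python sketch. The function names are hypothetical, and the clipping of rates to [1/(2N), 1 - 1/(2N)] is a common convention assumed here so that the z transform stays finite at rates of 0 or 1; the text does not state how such cases were handled.

```python
from statistics import NormalDist


def _rate(count, total):
    """Proportion clipped to [1/(2N), 1 - 1/(2N)] (assumed correction, see lead-in)."""
    floor = 1.0 / (2 * total)
    return min(max(count / total, floor), 1.0 - floor)


def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(H) - z(F), with z the inverse standard normal CDF."""
    z = NormalDist().inv_cdf
    hit_rate = _rate(hits, hits + misses)  # stimulus trials
    fa_rate = _rate(false_alarms, false_alarms + correct_rejections)  # catch trials
    return z(hit_rate) - z(fa_rate)


def fraction_correct(hits, misses, false_alarms, correct_rejections):
    """(HIT + correct rejection trials) / (stimulus + catch trials)."""
    total = hits + misses + false_alarms + correct_rejections
    return (hits + correct_rejections) / total
```

Under this sketch, a mouse responding at chance (equal HIT and FA rates) yields d' = 0, while the fraction-correct measure depends on the ratio of stimulus to catch trials in the session.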

Statistical analysis

No predetermined sample sizes were calculated prior to experiments. All statistics were performed using Prism software. The significance level was set at 0.05. Normality of all value distributions was assessed by Shapiro–Wilk test (α = 0.05). Standard parametric tests were used only when data passed the normality test (p > 0.05). Nonparametric tests were used otherwise. Only two-sided tests were used. Specific statistical tests used and sample sizes are shown in figure captions or text.

Data availability

The source code for the behavioral system can be found online at https://github.com/palmerlab/behaviour_box (copy archived at swh:1:rev:d4fad09624941bd2f25f9878c1ef304e84a6981a), with additional documentation at https://palmerlab.github.io. Calcium imaging data are available on Dryad at https://doi.org/10.5061/dryad.1rn8pk0wb.

The following data sets were generated
    1. Palmer L
    (2022) Dryad Digital Repository
    Data from: The role of higher order thalamus during learning and correct performance in goal-directed behavior.
    https://doi.org/10.5061/dryad.1rn8pk0wb

References

  1. Book
    1. Paxinos G
    2. Franklin KJ
    (2001)
    The Mouse Brain in Stereotaxic Coordinates
    Elsevier.

Decision letter

  1. Ishmail Abdus-Saboor
    Reviewing Editor; Columbia University, United States
  2. Joshua I Gold
    Senior Editor; University of Pennsylvania, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Decision letter after peer review:

Thank you for submitting your article "Higher order thalamus flexibly encodes correct goal-directed behavior" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Joshua Gold as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1) As is, the optogenetic experiments have some weaknesses. It would be helpful if the authors inhibited during different epochs. For example, their interpretation of what the activity during the reward epoch means is unclear. It would help if the authors inhibited POm during reward; if done in naïve animals, is learning affected? Another potential experiment could be to perform imaging in another circuit terminating in S1. Relatedly, reviewers were concerned that POm projections to areas beyond somatosensory cortex may contribute to the effects observed in the optogenetic inhibition experiments.

2) The main conclusion about the role of POm remains unclear, and additional alternatives should be considered. For example, terms like "behavioral flexibility" are used to describe its purpose, but the connection of this term to POm is not explained.

Reviewer #1:

1. Figure 1 – Supp 1 suggests that virus expression was always limited to POm. Drawing borders expressing areas from epifluorescence images is probably very dependent on imaging parameters. The Methods indicate that the authors scaled so that no pixels were saturated. This could mean that there was some weak expression of GCaMP6f or ArchT outside of POm. As I understand it, the authors set exposure/gains by the brightest points in the image. The limited extent of the infection in the figures might just reflect its center, which is brightest, rather than its full extent. If there were GCaMP or ArchT in VPL, some results would need to be reinterpreted.

2. Calcium responses are weaker during the naïve state than the expert state (Figure 1D,E), similar to the start of the reversal training (Figure 4G,H). If POm encodes correct actions, why is there any response at all in naïve mice? Is that not also a sign of stimulus encoding? Might there be another correlate of correctness with regard to the task, such as an expert mouse holding their paw more firmly or still on the stimulating rod? This could alter the effective stimulus or involve different motor signals to POm.

3. The authors are rightly concerned that licking might contribute to POm activity and expend some good effort checking this. The reversal is a good control, but doesn't produce identical POm activity. The other licking analyses, while good, did not completely rule out licking effects. First, lines 110-111 state "…as there was no correlation between licking frequency and POm axonal activity (Figure 1I)", but Figure 1I doesn't seem to support that statement. Second, the authors analyze isolated spontaneous licks, but these probably involve less licking and less overall motion than during a real response.

4. Many figures (Figure 1F, 2B, 3C, 4C) make it apparent that a population of axons responds very early to the stimulus itself. I understand the authors' point that many of their analyses show that on average the axons are not strongly modulated by this stimulus, but this is not true of every axon. Either some of these axons are coming from cells outside of POm (see #1) or some POm cells are stimulus driven. In either case, if some axons are strongly stimulus driven, the activity of these axons will correlate with correct choices. The stimulus and correct choices are themselves highly correlated because the animals perform so well. I do not understand how stimulus encoding and choice encoding can be disentangled by either behavior or the two behaviors in comparison. Simple stimulus encoding might be further modulated by arousal or reward expectation that increases with task learning (see #6).

5. I was unable to understand the authors' conclusion about what POm is doing. They use terms like "behavioral flexibility" to describe its purpose, but the connection of this term to POm is not explained. Is a role as a flexibility switch really supported? Why does S1 need POm to signal a correct choice? Figure 6 did not seem helpful here. Couldn't S1 just detect the stimulus on its own and transmit consequent signals to wherever they need to be to generate behavior?

6. Arousal or reward expectation may be better explanations than flexibility. Lines 323-324 say that POm activity increased with pupil diameter normally but reversed during reward delivery. Which data support this statement? With regards to pupil, the Results only seem to indicate that there is no difference in diameter between the two conditions (expert and 50% chance) using 3 bins of data. However, I could not find the time windows used for computing these. Pupil is known to be lagged and the timing could be critical.

7. There are other possible interpretations of the results when the authors target POm for optogenetic suppression (around lines 246-248). The effects here are also consistent with preventing tonic and evoked POm activity from reaching lots of target structures other than S1: S2, PPC, motor cortex, dorsolateral striatum, etc. Maybe one of these cannot respond to the stimulus as well and Hits decrease?

8. Line 689. What alerts the mouse that a catch trial is happening? Is there something like an audio cue for onset of stimulus trials and catch trials? If there is no cue, wouldn't mice be in a different behavioral state during catch trials than during stimulus trials? The trial types could differ by more than the presence of the stimulus.

9. Would it be more thorough to zoom in on areas like VPL, set exposures/gains very high, and show that there is no detectable VPL expression or gradient of expression crossing into VPL?

10. The authors indicate that they used video of the paw to exclude trials where the mouse removes the paw entirely from the rod. Why not quantify the paw movements as well and check if the paw is overall moving less in experienced than naïve/switched states? Quantified comparisons of paw stability and calcium are probably also good checks.

11. An analysis that might help would be to check the relationship of lick number/rate and calcium. Third, the authors point out that FA trials have licks but different POm activity (lines 132-134), but the FA and Hit licks may differ in number or frequency. Some check of this is needed.

12. There are many possible ways the authors might address these, and the choice depends on them and the data.

13. Why not just plot the average pupil diameter traces of the two conditions over fairly long time periods?

14. Like 12, the authors may want to deal with these in a variety of ways. On a related note with 7, wouldn't Figure 5E be more informative if latency was broken out by Hits and FAs separately? Related to #1, it would be problematic if the infection had spread into VPL.

Reviewer #2 (Recommendations for the authors):

In this manuscript, D LaTerra et al. explored the function of POm neurons during a tactile-based, goal-directed reward behavior. They target POm neurons that project to forepaw S1 and use two-photon Ca2+ imaging in S1 to monitor activity as mice performed a task where forepaw tactile stimulation (200 Hz, 500 ms) predicted a reward if mice licked at a reward port within 1.5 seconds. If mice did not lick, there was a time-out instead of a reward. The authors found that POm-S1 axons showed enhanced responses during the baseline period, the response window after the cue, and during reward delivery. They then showed that a subset of neurons were active during the response window during correct trials when the tactile stimulus served as a cue, but not on catch trials where animals spontaneously licked for a reward.

They then showed that POm axonal activity in S1 increased during the response window for "HIT" trials where animals correctly responded to the tactile stimulus with licking but the activity was less during "MISS" trials where animals did not respond. In order to probe whether this activity in the response window was being driven by motor activity, they designed a suppression task in which animals had to learn to suppress licking in response to the tactile stimulus in order to receive a reward. POm neurons also showed increased activity during the response window even though action was being suppressed. However, this activity was less than during the action task. Thus, although POm activity is not encoding action, its activity is significantly different during an action-based task than an action suppression one. They then analyzed calcium activity during the training period between the action task and the suppression task in which animals were learning the new contingency and were not performing as experts. In this non-expert context there was no difference in POm axonal activity between "HIT" and "MISS" trials.

Lastly, they used ArchT to inhibit POm cell body activity during the tactile stimulus and response window of some trials and showed that they reduced performance during the trials when light was on.

Altogether, this paper provides evidence that POm neurons are not simply encoding sensory information. They are modulated by learning and their activity is correlated to performance in this goal-directed task. However, the actual role of the POm input to S1 is not discernable from the current experiments. Subsets of neurons show significant activity during the response window as well as reward. In addition, the role of this input is different during the switch task than during expert performance. There are a number of outstanding questions, which, if answered, would help to directly define the role of these neurons in this specific paradigm. For instance, the authors record specifically from POm axons in S1. How distinct is this activity from other neurons in the POm? Some POm neurons still show significant activity during MISS trials. Do these neurons have a different function than those that show a preferential response during HIT trials? Does POm activity during the switch task, which has a component of extinction training, differ from when the animals are first learning the action-based task? Likewise, are the same neurons that acquire a response during the initial learning of the action-based task, the same neurons that are responding during the action suppression task?

The authors provide great evidence that POm neurons that project to the S1 do not simply encode sensory information or actions, but are instead signaling during correct performance. However, inhibition of cell bodies did not dramatically affect performance and it is still unclear what role this circuit actually plays in this behavior. Finer-tuned optogenetic experiments and analysis of cell bodies within POm may provide greater details that will help define this circuit's role.

1. Perform optogenetic inhibition during specific epochs of task (response window vs reward) in order to better define this circuit's function.

2. Perform optogenetic inhibition during initial training before learning, to assess if this circuit is necessary for learning this task.

3. Calcium imaging was done in POm axons in S1 and was not performed in POm itself, yet inhibition was done in cell bodies in POm and the functional role of the projection to S1 was not isolated. Recording cell bodies in POm might help to better characterize subpopulations of functional ensembles and how they change during learning. Likewise, inhibiting POm axon terminals in S1 would provide a more nuanced functional assessment of the calcium imaging data presented here.

Reviewer #3 (Recommendations for the authors):

In their paper "Higher order thalamus flexibly encodes correct goal-directed behavior", LaTerra et al. investigate the function of projections from the thalamic nucleus POm to primary somatosensory cortex (S1) in the performance of goal-directed behaviors. The authors performed in vivo calcium imaging of POm axons in layer 1 of the forepaw region of S1 (fpS1) to monitor the activity of POm-fpS1 projections while mice performed a tactile detection task. They report that the activity of POm-fpS1 axons on successful ('hit') trials was increased in trained mice relative to naïve mice. Additionally, the authors used an action suppression variant of the task to show that POm-fpS1 axon activity was higher on successful trials over unsuccessful ('miss') trials regardless of the correct motor response required. During transition between task conditions, when mice perform at chance levels, the increase of POm-fpS1 activity during correct trials is no longer seen. Finally, the authors use inhibitory optogenetic tools to suppress POm activity, revealing a modest suppression in behavioral success. The authors conclude from these data that POm-fpS1 axons preferentially "encode and influence correct action selection" during tactile goal-oriented behavior.

This study presents several interesting findings, particularly with respect to the change in activity of POm-fpS1 axons during successful execution of a trained behavior. Additionally, the similarity in responses of POm-fpS1 on both the 'goal-directed action' and 'action suppression' tasks provides convincing evidence that POm-fpS1 activity is not likely to encode the motor response. Overall, these results have important implications for how activity in higher order thalamic nuclei corresponds to learning a sensorimotor behavior, and the authors use several clever experiments to address these questions. Yet, the major claim that POm encodes 'correct performance' should be defined more clearly. As is, there are alternative explanations that could be raised and should be discussed in more depth (Point 1), especially as it relates to any causal role the authors ascribe to POm (Point 2). In addition, some clarification as to which types of signals (i.e. frequency of active axons vs. amplitude of signal in the active axons) the authors feel are most informative would be helpful (Point 3).

1) The authors argue that POm activity reflects 'correct task performance' and that the increased activity of POm-fpS1 axons in the response epoch is not due to sensory encoding. An alternative explanation is that POm-fpS1 axons do convey sensory information, and these connections are facilitated with learning – meaning the activity of pathways conveying sensory signals that are correlated with task success could be facilitated with training, and this facilitation could be disrupted during the switching task. In this sense, the activity profiles do not encode 'correct action' per se, but rather represent the sensory responses whose correlation to rewarded action have been reinforced with training (which would also be a very interesting finding). This would be quite distinct from the "cognitive functions" they ascribe to this pathway (line 341). It might have helped to introduce a delay period in between the sensory stimulus and response epoch to try to distinguish responses that encode information about the sensory stimulus from those that might be involved in encoding task performance. However, as is, it is difficult to distinguish between these two scenarios with this data, and thus the interpretations the authors present could be rephrased with alternatives discussed in more depth.

2) Similarly, while the authors attempt to establish a causal role for POm in task performance by optogenetically inhibiting POm during the response epoch, the results are also consistent with a deficit in sensory processing, and cannot be interpreted strictly as a disruption of the encoding of 'correct action' task performance signals. Furthermore, these perturbation studies do not demonstrate that the POm-fpS1 projections they are studying are implicated in the modest behavioral deficits. As the authors state, POm projects to many targets (lines 63-66), and similar sensory-based, goal-directed behaviors do not require S1 (lines 302-305). In light of these points, some of the statements ascribing a causal role for these projections in task success could be rephrased (e.g. line 33 "to encode and influence correct action selection", line 252 "a direct influence", line 340 "plays an active role during correct performance").

3) Event amplitude and probability were both quantified, but were not consistently reported throughout the manuscript and figures. For example, Figure 1 reports both probability and amplitude (Figure 1G and H), whereas Figure 2 only reports probability. Thus, it was not always clear as to whether the authors were ascribing biological significance to one or both of these measures, given that in some cases differences were found in one and not the other, and which of the measures were reported was occasionally switched. It would be helpful for the authors to clarify the significance they assign to each measure, and report both measures side by side for all experiments if they interpret them both as relevant.

4. It was unclear why the authors did not attempt to use deconvolution and report spike probabilities, especially when considering the kinetics of GCaMP6f and the results presented in Figure 4, where event amplitude and event probability changed in opposing directions, which could reflect a change in burst firing since spikes in short high frequency bursts can appear as a single large amplitude event compared to single spikes. The authors could consider performing further analysis or discuss the caveats of analyzing the Ca signal across these large time windows.

5. It would be helpful to clarify the basis upon which boutons or ROIs were excluded when determining the 'axon subset' of ROIs. How did Ca event probability, event amplitude, and event duration compare between ROIs that were assessed as being from the same axon? It was unclear what was deemed as 'similar activity profiles' for the exclusion of ROIs. It might help to include an additional figure supplement to Figure 1 showing the ROI correlations and exclusion criteria with images showing 'similar' or 'dissimilar' ROIs marked on an example field of view.

6. The heterogeneity of POm axons was briefly shown (Figure 1) but not discussed or explored in depth. It was unclear how the authors interpret the observation that a subset of POm-fpS1 axons showed a larger increase in the reward epoch compared to the stimulus and response epochs. While this diversity in responses could be relevant to their claim that POm-fpS1 axons encode task performance, the authors did not perform experiments inhibiting POm during the reward epoch, leaving unclear the interpretation of what the reward epoch responses might mean. More discussion of the interpretation of POm-fpS1 activity during the reward epoch would be helpful, given that several sections of the Results are dedicated to this point. Also, a clearer description of row sorting in the figures (e.g. Figure 2B) would enable more direct comparison of activity of the same axon across different trial types.

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the second round of review.]

Thank you for resubmitting the paper entitled "Higher order thalamus flexibly encodes correct goal-directed behavior" for further consideration by eLife. Your revised article has been evaluated by a Senior Editor and a Reviewing Editor. We are sorry to say that we have decided that this submission will not be considered further for publication by eLife.

Although reviewers acknowledge improvement in the clarity of the manuscript, key experiments that were requested were not performed, such as inhibiting during different epochs or in naïve animals. Moreover, a major issue that was raised in the first round was to clarify what function the authors are ascribing to POm. Although the reviewers acknowledge improvement in language, the reviewers all felt that it is still not clear why POm is needed to signal correct performance, and the data do not exactly support the conclusion that POm=correct.

https://doi.org/10.7554/eLife.77177.sa1

Author response

Essential revisions:

1) As is, the optogenetic experiments have some weaknesses. It would be helpful if the authors inhibited during different epochs. For example, their interpretation of what the activity during the reward epoch means is unclear. It would help if the authors inhibited POm during reward; if done in naïve animals, is learning affected? Another potential experiment could be to perform imaging in another circuit terminating in S1. Relatedly, reviewers were concerned that POm projections to areas beyond somatosensory cortex may contribute to the effects observed in the optogenetic inhibition experiments.

Due to the COVID pandemic, revising this manuscript and performing additional experiments and analysis has been extremely slow. In the past 1.5 years, Melbourne, Australia has endured 260+ days of lockdown, reportedly the strictest and longest in the world. This has affected our ability to perform the requested experiments as the laboratory was shut down for months on end, mice were culled and experiments ceased.

Despite the delays, we were able to perform the requested additional experiments:

1) We have now performed additional experiments where we inhibit POm during learning of the tactile goal-directed behavior. Here, we found learning was severely disrupted and mice took approximately twice as long to learn the task. These data are now included in new Figure 6.

2) To assess the role of POm activity during the reward delivery, we photo-inactivated the POm during the reward epoch in expert mice. Here we found no influence of POm photo-inactivation during reward delivery on mouse performance. This result, which corresponds to the lower rates in the activity of POm axons within S1 during reward delivery, is now included in Figure 5.

3) We attempted to use pathway-specific DREADDs to specifically target the POm input that synapses in S1. Using dual injection of retrograde Cre into S1 and Cre-dependent DREADDs into POm, this strategy should have enabled us to specifically inactivate the S1-targeting POm pathway during the task. Unfortunately, after much effort, we were unable to confirm specificity of the labelling and therefore were ultimately unable to persist with this dataset.

2) The main conclusion about the role of POm remains unclear, and additional alternatives should be considered. For example, terms like "behavioral flexibility" are used to describe its purpose, but the connection of this term to POm is not explained.

In the revised manuscript, the main conclusions about the POm are now clearly stated and the use of the term ‘behavioral flexibility’ has been limited to improve the clarity of the findings.

Reviewer #1 (Recommendations for the authors):

1. Figure 1 – Supp 1 suggests that virus expression was always limited to POm. Drawing borders expressing areas from epifluorescence images is probably very dependent on imaging parameters. The Methods indicate that the authors scaled so that no pixels were saturated. This could mean that there was some weak expression of GCaMP6f or ArchT outside of POm. As I understand it, the authors set exposure/gains by the brightest points in the image. The limited extent of the infection in the figures might just reflect its center, which is brightest, rather than its full extent. If there were GCaMP or ArchT in VPL, some results would need to be reinterpreted.

We agree with the reviewer that the determined expression areas are dependent on imaging parameters, however, we are confident that the virus expression used for analysis in this study are confined to the POm. In this study, our analysis of targeting of POm is three-fold. First, we optimized the volume of virus loaded to the minimum necessary to observe POm projections in S1 (a single targeted injection of 60 nl). Second, we analyzed the fluorescence spread using fluorescence microscopy after every experiment. We set exposure to use the full dynamic range of the image as previously described (Gambino et al., 2014). Occasionally, the virus spread to the adjacent VPM nucleus and this was easily recognizable by the characteristic VPM projections with the barrels of the barrel cortex. These animals were excluded from this study and not further analyzed. The VPL nucleus is located further caudally in respect to the VPM and again, we were able to identify if the virus has spread to this nucleus via posthoc fluorescence microscopy. These animals were excluded from this study and not further analyzed. We note that our stereotaxic injections were not flawless and the virus occasionally spread along the injection pipette track and into high-order visual thalamic nuclei LP and LD, superficial to POm. This is shown in Figure 1. These two nuclei, however, do not target S1 (Kamishina et al., 2009; van Groen and Wyss, 1992) and were therefore not imaged within our study. Third, we analyze the projection profile in the forepaw area of S1 to ensure that it corresponds to the projection profile of POm and not VPL. If there is fluorescence in nontargeted areas, then the experiments were excluded from analysis.

An additional degree of precision is offered by our imaging and optogenetic strategy. Calcium imaging was performed in layer 1, which is primarily targeted by POm (Meyer et al., 2010), and not VPL, which primarily targets layer 4. Therefore, spillover into VPL would not influence our imaging results, as we only imaged axons in layer 1. Furthermore, during the optogenetic experiments, the fiber optic was targeted to the POm (not the VPL), thus providing a secondary POm localization of the photo-inhibited region. This is now discussed in the revised manuscript.

2. Calcium responses are weaker during the naïve state than the expert state (Figure 1D,E), similar to the start of the reversal training (Figure 4G,H). If POm encodes correct actions, why is there any response at all in naïve mice? Is that not also a sign of stimulus encoding? Might there be another correlate of correctness with regard to the task, such as an expert mouse holding their paw more firmly or still on the stimulating rod? This could alter the effective stimulus or involve different motor signals to POm.

We agree with the reviewer that the POm is encoding the stimulus in the naïve state. This is evident in our study, and others, which show that the POm increases activity during sensory input in naïve mice. In the expert state, stimulus encoding may also be performed by a subset of POm axons, and we now discuss this in the revised manuscript. However, the results from our study, and other studies of different pathways, illustrate that overall sensory encoding reorganizes with learning (Chen et al., 2015; Reinert et al., 2021), with smaller sensory-evoked responses during active sensory-based behavior (Sachidhanandam et al., 2013).

Overall, our findings show that, during task performance, there is a significant increase in POm activity during the stimulus/response epoch which is dependent on the behavioral performance (HIT, MISS). This is not due to licking motion, as there was similar POm activity during the action and suppression tasks, which involved licking and not licking for reward, respectively (Figure 3). Furthermore, all experiments were monitored online via a behavioral camera to examine the location of the forepaw on the stimulus during all trials, and trials where the paw was not clearly resting on the stimulating rod, or where excessive motion was detected, were excluded from analysis. This is now discussed in the revised manuscript.

3. The authors are rightly concerned that licking might contribute to POm activity and expend some good effort checking this. The reversal is a good control, but doesn’t produce identical POm activity. The other licking analyses, while good, did not completely rule out licking effects. First, lines 110-111 state “…as there was no correlation between licking frequency and POm axonal activity (Figure 1I)”, but Figure 1I doesn’t seem to support that statement. Second, the authors analyze isolated spontaneous licks, but these probably involve less licking and less overall motion than during a real response.

We thank the reviewer for acknowledging the effort we made to assess the influence of licking behavior on POm axonal activity. We now include a more direct analysis in the revised manuscript illustrating the relationship between the licking response and POm activity. This analysis shows there is no correlation between licking and POm axonal activity (linear regression, p = 0.923), further suggesting that POm axonal activity is not simply due to licking behavior.
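As an illustration of this kind of check, the following is a minimal sketch of testing for a linear relationship between lick rate and axonal activity. It is not the analysis code used in the study: the function names and the example values are hypothetical, and a permutation test is swapped in for the parametric regression p-value so that the sketch needs only the Python standard library.

```python
import random
from typing import List


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two samples of equal length (at least 3 points)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / (sxx * syy) ** 0.5


def permutation_p(x: List[float], y: List[float],
                  n_perm: int = 2000, seed: int = 0) -> float:
    """Two-sided permutation p-value for the null hypothesis of no correlation."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any pairing between x and y
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical per-session values, invented purely for illustration.
lick_rate_hz = [4.1, 5.0, 6.2, 5.5, 4.8, 6.9, 5.1, 4.4, 6.0]
evoked_event_probability = [0.42, 0.38, 0.45, 0.36, 0.44, 0.40, 0.47, 0.39, 0.41]

r = pearson_r(lick_rate_hz, evoked_event_probability)
p = permutation_p(lick_rate_hz, evoked_event_probability)
print(f"r = {r:.3f}, permutation p = {p:.3f}")
```

For a simple linear regression with one predictor, testing whether the slope differs from zero is equivalent to testing whether the Pearson correlation differs from zero, which is why the correlation coefficient is used here.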

4. Many figures (Figure 1F, 2B, 3C, 4C) make it apparent that a population of axons respond very early to the stimulus itself. I understand the authors’ point that many of their analyses show that on average the axons are not strongly modulated by this stimulus, but this is not true of every axon. Either some of these axons are coming from cells outside of POm (see #1) or some POm cells are stimulus driven. In either case, if some axons are strongly stimulus driven, the activity of these axons will correlate with correct choices. The stimulus and correct choices are themselves highly correlated because the animals perform so well. I do not understand how stimulus encoding and choice encoding can be disentangled by either behavior or the two behaviors in comparison. Simple stimulus encoding might be further modulated by arousal or reward expectation that increases with task learning (see #6).

We agree with the reviewer that individual POm axons are heterogeneous and a subset of axons may respond to the sensory stimulus during the behavior. We have now performed analysis on a subset of POm axons (n = 107 axons) where the Ca2+ transients have been further subcategorized as occurring during either the stimulus or response epochs of the goal-directed task. Here, only a small portion of the axons (6 %) were active only during the stimulus; therefore, although we agree that there is a subset of axons that are stimulus driven, they are a minority during expert behavior. This information is now included in the revised manuscript.
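The subcategorization described above can be sketched as follows. This is an illustrative sketch only: it assumes each axon is summarized by the onset times of its Ca2+ transients relative to stimulus onset, and the epoch boundaries, function names, and example values are hypothetical rather than taken from the manuscript.

```python
from typing import Dict, List, Tuple

Epoch = Tuple[float, float]  # (start_s, end_s), relative to stimulus onset


def classify_axon(onsets_s: List[float], stimulus: Epoch, response: Epoch) -> str:
    """Label an axon by the epoch(s) in which its transient onsets fall."""
    in_stimulus = any(stimulus[0] <= t < stimulus[1] for t in onsets_s)
    in_response = any(response[0] <= t < response[1] for t in onsets_s)
    if in_stimulus and in_response:
        return "both"
    if in_stimulus:
        return "stimulus_only"
    if in_response:
        return "response_only"
    return "neither"


def fraction_stimulus_only(axons: Dict[str, List[float]],
                           stimulus: Epoch, response: Epoch) -> float:
    """Fraction of axons that were active only during the stimulus epoch."""
    if not axons:
        raise ValueError("no axons provided")
    labels = [classify_axon(onsets, stimulus, response) for onsets in axons.values()]
    return labels.count("stimulus_only") / len(labels)


# Assumed epoch boundaries (seconds); the windows used in the study may differ.
STIMULUS_EPOCH: Epoch = (0.0, 0.5)
RESPONSE_EPOCH: Epoch = (0.5, 1.5)

# Hypothetical transient onset times for four axons.
example_axons = {
    "axon_1": [0.12],        # stimulus only
    "axon_2": [0.80, 1.10],  # response only
    "axon_3": [0.30, 0.95],  # both epochs
    "axon_4": [0.70],        # response only
}

print(fraction_stimulus_only(example_axons, STIMULUS_EPOCH, RESPONSE_EPOCH))
```

Using half-open windows means a transient with onset exactly at the boundary is assigned to the response epoch only, so no onset is counted in both epochs.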

In this study, we attempted to disentangle stimulus and choice encoding by comparing the POm axonal activity with the different behavioral performance (HIT or MISS). Here, the same stimulus is always presented (tactile, 200 Hz); however, the mouse response differs. Despite receiving the same tactile stimulus, POm signaling in forepaw S1 is significantly increased during correct HIT trials compared with MISS trials in both the action and suppression task. Therefore, combined with the small number of axons that were stimulus driven, our results illustrate that POm axonal activity is predominantly encoding behavioral information in this task.

We agree that simple stimulus encoding might be further modulated by arousal or reward expectation that increases with task learning. In our study, the increase in POm activity during HIT behavior was not due to elevated task engagement as, despite similar levels of arousal (Figure 4B), POm activity in expert mice differed in comparison to chance performance (switch behavior; Figure 4E,F). This is now discussed in the revised manuscript.

5. I was unable to understand the authors’ conclusion about what POm is doing. They use terms like “behavioral flexibility” to describe its purpose, but the connection of this term to POm is not explained. Is a role as a flexibility switch really supported? Why does S1 need POm to signal a correct choice? Figure 6 did not seem helpful here. Couldn’t S1 just detect the stimulus on its own and transmit consequent signals to wherever they need to be to generate behavior?

We have now revised the manuscript to improve the clarity of our conclusions and have removed (old) Figure 6 as it wasn’t helpful. Overall, our findings suggest that the POm provides input to S1 which preferentially encodes correct (HIT) performance during goal-directed behavior. Specifically, the POm is primarily active during the behavioral response epoch, and not reward delivery. Furthermore, photo-inactivation of the POm during learning illustrates that the POm influences the rate of learning of the goal-directed task.

If S1 simply detected the stimulus on its own and transmitted consequent signals to generate behavior, then important modulatory processes required during behavior would not be possible. Along with other feedback projections, the POm can provide instant, and dynamic, feedback information to S1 to directly alter cortical signals. Specifically, POm targets the upper layers of the cortex, whereas external sensory information targets the layer 4 input layer. At the level of a single pyramidal neuron, this means POm input lands on the tuft dendrites whereas external sensory information lands on the proximal basal dendrites. This segregation of input provides a cellular mechanism for increasing the computational capabilities and modulation of neurons, which would be lost if S1 simply detected the stimulus and transmitted consequent signals to generate behavior.

We have now performed additional experiments where we photo-inactivate the POm during learning (new Figure 6) and also during the reward epoch in expert mice (new panels in Figure 5). These additional experiments have shed more light on the role and influence of the POm during goal-directed behavior. Here, we illustrate the POm plays an important role during the response, and not reward, epoch in correct (HIT) behavior and during learning.

6. Arousal or reward expectation may be better explanations than flexibility. Lines 323-324 say that POm activity increased with pupil diameter normally but reversed during reward delivery. Which data support this statement? With regards to pupil, the Results only seem to indicate that there is no difference in diameter between the two conditions (expert and 50% chance) using 3 bins of data. However, I could not find the time windows used for computing these. Pupil is known to be lagged and the timing could be critical.

We have now revised the manuscript to further expand the reporting of our findings. The statement that ‘POm activity increased with pupil diameter normally but reversed during reward delivery’ stems from data illustrated in Figure 1I and 3B. For space and flow of the manuscript, we were not able to show them on the same graph as Author response image 1. Here, you can see that during reward (blue), POm activity decreased compared to response (green) whereas the pupil diameter was maximum during reward delivery. We now include more information in the methods regarding pupil tracking (see Data analysis and statistical methods; Pupil tracking).

Author response image 1

7. There are other possible interpretations of the results when the authors target POm for optogenetic suppression (around lines 246-248). The effects here are also consistent with preventing tonic and evoked POm activity from reaching lots of target structures other than S1: S2, PPC, motor cortex, dorsolateral striatum, etc. Maybe one of these cannot respond to the stimulus as well and Hits decrease?

We agree with the reviewer that there are other possible interpretations of the optogenetic suppression experiments. We now discuss this in the revised manuscript.

8. Line 689. What alerts the mouse that a catch trial is happening? Is there something like an audio cue for onset of stimulus trials and catch trials? If there is no cue, wouldn't mice be in a different behavioral state during catch trials than during stimulus trials? The trial types could differ by more than the presence of the stimulus.

There is broadband noise during the trial that acts as a cue. This is detailed in the methods and text.

9. Would it be more thorough to zoom in on areas like VPL, set exposures/gains very high, and show that there is no detectable VPL expression or gradient of expression crossing into VPL?

We are confident that minimal VPL expression is not influencing our imaging and photosuppression results due to imaging axons only in the upper cortical layers and targeting the POm with the fiber optic respectively. This was confirmed in all recordings.

10. The authors indicate that they used video of the paw to exclude trials where the mouse removes the paw entirely from the rod. Why not quantify the paw movements as well and check if the paw is overall moving less in experienced than naïve/switched states? Quantified comparisons of paw stability and calcium are probably also good checks.

In this study, we used an infrared webcam to monitor body movement. Using this, we were able to exclude any trials where the stimulus was not delivered to the paw, however, due to low resolution, the video quality does not allow us to perform detailed tracking of the paw. We now acknowledge this limitation in the methods. We did, however, analyze at great length the licking behavior and ensured there was not a correlation between licking and POm activity.

11. An analysis that might help would be to check the relationship of lick number/rate and calcium. Third, the authors point out that FA trials have licks but different POm activity (lines 132-134), but the FA and Hit licks may differ in number or frequency. Some check of this is needed.

We have now analyzed the lick rate and correlated it with POm activity. In brief, there is no correlation between licking rate and POm activity and we now include this information in the revised manuscript.

We have also analyzed evoked lick rates in FA and HIT trials. There was no significant difference between FA and HIT lick rates (p = 0.203; n = 9 mice).

12. There are many possible ways the authors might address these, and depends on them and the data.

We thank the reviewer for these suggestions and we have attempted to address all points vigorously.

13. Why not just plot the average pupil diameter traces of the two conditions over fairly long time periods?

We apologize, but we are not quite sure what the reviewer is suggesting. In our study, tracking pupil dilation was used to test whether the arousal state of the mouse was similar in the different behavioral epochs. Using the analysis we performed, we were able to show that arousal was similar in the different tasks.

14. Like 12, the authors may want to deal with these in a variety of ways. On a related note with 7, wouldn't Figure 5E be more informative if latency was broken out by Hits and FAs separately? Related to #1, it would be problematic if the infection had spread into VPL.

In Figure 5D and Figure 5 —figure supplement 2 (old Figure 5E), we focus on the licking latency only during correct HIT performance in mice injected with Archaerhodopsin (top) and GFP (bottom). This serves as an important control to illustrate that the LED itself does not influence licking behavior/latency.

In the optogenetic photo-inhibition experiments, targeting of the fiber optic cannula to the POm and viral expression were confirmed after every experiment. Weak expression of ArchT outside of the POm would therefore also have minimal impact on our findings. This is now detailed in the revised manuscript.

Reviewer #2 (Recommendations for the authors):

In this manuscript, D LaTerra et al. explored the function of POm neurons during a tactile-based, goal-directed reward behavior. They target POm neurons that project to forepaw S1 and use two-photon Ca2+ imaging in S1 to monitor activity as mice performed a task where forepaw tactile stimulation (200 Hz, 500 ms) predicted a reward if mice licked at a reward port within 1.5 seconds. If mice did not lick, there was a time-out instead of a reward. The authors found that POm-S1 axons showed enhanced responses during the baseline period, the response window after the cue, and during reward delivery. They then showed that a subset of neurons were active during the response window during correct trials when the tactile stimulus served as a cue, but not on catch trials where animals spontaneously licked for a reward.

They then showed that POm axonal activity in S1 increased during the response window for "HIT" trials where animals correctly responded to the tactile stimulus with licking but the activity was less during "MISS" trials where animals did not respond. In order to probe whether this activity in the response window was being driven by motor activity, they designed a suppression task in which animals had to learn to suppress licking in response to the tactile stimulus in order to receive a reward. POm neurons also showed increased activity during the response window even though action was being suppressed. However, this activity was less than during the action task. Thus, although POm activity is not encoding action, its activity is significantly different during an action-based task than an action suppression one. They then analyzed calcium activity during the training period between the action task and the suppression task in which animals were learning the new contingency and were not performing as experts. In this non-expert context there was not a difference in POm axonal activity between "HIT" and "MISS" trials.

Lastly, they used ArchT to inhibit POm cell body activity during the tactile stimulus and response window of some trials and showed that they reduced performance during the trials when light was on.

Altogether, this paper provides evidence that POm neurons are not simply encoding sensory information. They are modulated by learning and their activity is correlated to performance in this goal-directed task. However, the actual role of the POm input to S1 is not discernable from the current experiments. Subsets of neurons show significant activity during the response window as well as reward. In addition, the role of this input is different during the switch task than during expert performance. There are a number of outstanding questions, which, if answered, would help to directly define the role of these neurons in this specific paradigm. For instance, the authors record specifically from POm axons in S1. How distinct is this activity from other neurons in the POm? Some POm neurons still show significant activity during MISS trials. Do these neurons have a different function than those that show a preferential response during HIT trials? Does POm activity during the switch task, which has a component of extinction training, differ from when the animals are first learning the action-based task? Likewise, are the same neurons that acquire a response during the initial learning of the action-based task, the same neurons that are responding during the action suppression task?

The authors provide great evidence that POm neurons that project to the S1 do not simply encode sensory information or actions, but are instead signaling during correct performance. However, inhibition of cell bodies did not dramatically affect performance and it is still unclear what role this circuit actually plays in this behavior. Finer-tuned optogenetic experiments and analysis of cell bodies within POm may provide greater details that will help define this circuit's role.

We thank the reviewer for their comments. We have now revised the manuscript to clearly state the role of the POm during the goal-directed behavioral tasks used in this study. We have provided more information regarding the range of activity patterns in POm axons within S1.

The POm contains a heterogenous population of neurons and since it projects to multiple cortical and subcortical regions, the activity of POm axonal projections in S1 may indeed be different to other projection targets.

The activity of POm axons during MISS behavior may have a different function than those that show a preferential response during HIT trials, however, this evoked rate is not significantly different to baseline and therefore is hard to differentiate from spontaneous activity (see Figure 2). Furthermore, the evoked rate of POm activity during the switch task is not significantly different compared to naïve mice (p = 0.159; Kruskal-Wallis test). This information is now included in the manuscript.

It is unknown whether the same neurons that acquire a response during the initial learning of the action-based task are the same neurons that are responding during the action suppression task as we were unable to conclusively determine whether or not the same POm axons were imaged in the different protocols.

1. Perform optogenetic inhibition during specific epochs of task (response window vs reward) in order to better define this circuit's function.

In this study, we inhibited the POm during the stimulus and response epoch as this is when the POm was most active. We now perform experiments where we restrict optogenetic inhibition of the POm to the reward window. As shown in new Figure 5E and F, photo-inactivating the POm during the reward window did not alter behavioral performance. In combination with the finding that POm axons within S1 are considerably less active during the reward epoch compared with the response epoch, our findings suggest that the POm does not predominantly play a role in reward encoding and behavior. In the revised manuscript, we now include more details about the limitations of using Archaerhodopsin to optogenetically silence the POm.

2. Perform optogenetic inhibition during initial training before learning, to assess if this circuit is necessary for learning this task

We have now performed new experiments where we optogenetically inhibit the POm during learning of the goal-directed task. Here, POm photo-inactivation decreased the rate of learning the tactile goal-directed task, significantly increasing the number of sessions required to reach expert (>80 % correct) performance (p = 0.0036). These findings illustrate that the POm is involved in learning this task and this information is now included as a new Figure 6 in the revised manuscript.
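The learning-rate measure described above (number of sessions required to reach expert performance) can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function name and the example values are hypothetical, and it assumes that a single session above the criterion counts as reaching expert level, which may differ from the definition used in the manuscript.

```python
from typing import List, Optional


def sessions_to_expert(percent_correct: List[float],
                       criterion: float = 80.0) -> Optional[int]:
    """Return the 1-based index of the first session above the criterion.

    Returns None if the criterion is never exceeded.
    """
    for session_number, score in enumerate(percent_correct, start=1):
        if score > criterion:  # expert is defined as >80 % correct
            return session_number
    return None


# Hypothetical per-session performance (% correct) for two example mice.
control_mouse = [52.0, 61.0, 74.0, 83.0, 88.0]
photoinhibited_mouse = [50.0, 55.0, 58.0, 66.0, 72.0, 79.0, 85.0]

print(sessions_to_expert(control_mouse))
print(sessions_to_expert(photoinhibited_mouse))
```

Computing this value per mouse gives one number per animal, which can then be compared between groups with a suitable two-sample test.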

3. Calcium imaging was done in POm axons in S1 and was not performed in POm itself, yet inhibition was done in cell bodies in POm and the functional role of the projection to S1 was not isolated. Recording cell bodies in POm might help to better characterize sub populations of functional ensembles and how they change during learning. Likewise, inhibiting POm axon terminals in S1 would provide a more nuanced functional assessment of the calcium imaging data presented here.

We agree with the reviewer that it would be of great value to record from POm cell bodies during behavior; however, in this study, we were particularly interested in recording the information transferred from the POm to S1. Since the POm projects to different brain regions, here we isolated the POm projections that target S1 by specifically recording from POm axons within S1. Furthermore, it is difficult to perform calcium imaging to the depth of the POm without perturbing the cortex, which may influence sensory-based behavior. Likewise, another technique used to record neural activity is whole-cell patch clamp electrophysiology; however, it is difficult to perform recordings during behavior due to the movement involved in the behavioral response.

We attempted to inhibit POm axon terminals in S1; however, unfortunately, the attempts were unsuccessful. We used the genetically expressed chloride channel, gtACR, to inhibit axonal terminals. However, our control experiments illustrated that photo-activation of gtACR in POm axonal terminals in S1 results in excitation, and not inhibition (Malyshev et al., 2017). Although not optimal, our supplementary control experiments (Figure 5 —figure supplement 1) clearly show that we were able to photo-inhibit POm neurons expressing Archaerhodopsin when the LED was restricted to the cell body.

Reviewer #3 (Recommendations for the authors):

In their paper “Higher order thalamus flexibly encodes correct goal-directed behavior”, LaTerra et al. investigate the function of projections from the thalamic nucleus POm to primary somatosensory cortex (S1) in the performance of goal-directed behaviors. The authors performed in vivo calcium imaging of POm axons in layer 1 of the forepaw region of S1 (fpS1) to monitor the activity of POm-fpS1 projections while mice performed a tactile detection task. They report that the activity of POm-fpS1 axons on successful (‘hit’) trials was increased in trained mice relative to naïve mice. Additionally, the authors used an action suppression variant of the task to show that POm-fpS1 axon activity was higher on successful trials over unsuccessful (‘miss’) trials regardless of the correct motor response required. During transition between task conditions, when mice perform at chance levels, the increase of POm-fpS1 activity during correct trials is no longer seen. Finally, the authors use inhibitory optogenetic tools to suppress POm activity, revealing a modest suppression in behavioral success. The authors conclude from these data that POm-fpS1 axons preferentially “encode and influence correct action selection” during tactile goal-oriented behavior.

This study presents several interesting findings, particularly with respect to the change in activity of POm-fpS1 axons during successful execution of a trained behavior. Additionally, the similarity in responses of POm-fpS1 on both the ‘goal-directed action’ and ‘action suppression’ tasks provides convincing evidence that POm-fpS1 activity is not likely to encode the motor response. Overall, these results have important implications for how activity in higher order thalamic nuclei corresponds to learning a sensorimotor behavior, and the authors use several clever experiments to address these questions. Yet, the major claim that POm encodes ‘correct performance’ should be defined more clearly. As is, there are alternative explanations that could be raised and should be discussed in more depth (Point 1), especially as it relates to any causal role the authors ascribe to POm (Point 2). In addition some clarification as to which types of signals (i.e. frequency of active axons vs. amplitude of signal in the active axons) the authors feel are most informative would be helpful (Point 3).

We thank the reviewer for their helpful comments and assessment of our study. We have now addressed all comments and revised the manuscript accordingly.

1) The authors argue that POm activity reflects ‘correct task performance’ and that the increased activity of POm-fpS1 axons in the response epoch is not due to sensory encoding. An alternative explanation is that POm-fpS1 axons do convey sensory information, and these connections are facilitated with learning – meaning the activity of pathways conveying sensory signals that are correlated with task success could be facilitated with training, and this facilitation could be disrupted during the switching task. In this sense, the activity profiles do not encode 'correct action' per se, but rather represent the sensory responses whose correlation to rewarded action have been reinforced with training (which would also be a very interesting finding). This would be quite distinct from the “cognitive functions” they ascribe to this pathway (line 341). It might have helped to introduce a delay period in between the sensory stimulus and response epoch to try to distinguish responses that encode information about the sensory stimulus from those that might be involved in encoding task performance. However, as is, it is difficult to distinguish between these two scenarios with this data, and thus the interpretations the authors present could be rephrased with alternatives discussed in more depth.

We agree that it would have been beneficial to separate the stimulus from the response period in the behavioral paradigm. However, to avoid engaging working memory, we did not wish to enforce a delay period in this study.

Furthermore, we have now performed analysis on a subset of POm axons (n = 107 axons) where the Ca2+ transients have been further sub-categorized as occurring during either the stimulus or response epochs of the goal-directed task. Here, only a small portion of the axons (6 %) were active only during the stimulus; therefore, although we agree that there is a subset of axons that are stimulus driven, they are a minority during expert behavior. This information is now included in the revised manuscript.

In this study, we attempted to disentangle stimulus and choice encoding by comparing the POm axonal activity with the different behavioral performance (HIT or MISS). Here, the same stimulus is always presented (tactile, 200 Hz); however, the mouse response differs. Despite receiving the same tactile stimulus, POm signaling in forepaw S1 is significantly increased during correct HIT trials compared with MISS trials in both the action and suppression task. Therefore, combined with the small number of axons that were stimulus driven, our results illustrate that POm axonal activity is predominantly encoding behavioral information in this task.

2) Similarly, while the authors attempt to establish a causal role for POm in task performance by optogenetically inhibiting POm during the response epoch, the results are also consistent with a deficit in sensory processing, and cannot be interpreted strictly as a disruption of the encoding of ‘correct action’ task performance signals. Furthermore, these perturbation studies do not demonstrate that the POm-fpS1 projections they are studying are implicated in the modest behavioral deficits. As the authors state, POm projects to many targets (lines 63-66), and similar sensory-based, goal-directed behaviors do not require S1 (lines 302-305). In light of these points, some of the statements ascribing a causal role for these projections in task success could be rephrased (e.g. line 33 “to encode and influence correct action selection”, line 252 “a direct influence”, line 340 “plays an active role during correct performance”).

We agree that the decrease in correct performance during optogenetic inhibition of POm cell bodies may also be explained by a deficit in sensory processing. However, in this study, we went to great lengths to illustrate that the POm is encoding correct action, and not sensory information (detailed in response to 1). This is further expanded upon in the revised manuscript. We also agree that the perturbation studies do not directly demonstrate that the POm to S1 projections are driving the behavioral deficits. We attempted to use pathway-specific DREADDs to specifically inactivate the POm input that targets S1. Using dual injection of retrograde-cre into S1, and cre-dependent DREADDs into POm, this strategy should have enabled us to specifically inactivate the S1-targeting POm pathway during the task. Unfortunately, after much effort, we were unable to confirm specificity of the labelling and therefore were ultimately unable to persist with this dataset. We therefore only refer to the POm itself when discussing the influence on behavior and we have now revised the manuscript accordingly.

3) Event amplitude and probability were both quantified, but were not consistently reported throughout the manuscript and figures. For example, Figure 1 reports both probability and amplitude (Figure 1G and H), whereas Figure 2 only reports probability. Thus, it was not always clear as to whether the authors were ascribing biological significance to one or both of these measures, given that in some cases differences were found in one and not the other, and which of the measures were reported was occasionally switched. It would be helpful for the authors to clarify the significance they assign to each measure, and report both measures side by side for all experiments if they interpret them both as relevant.

We thank the reviewer for this observation and have now included a statement discussing the reporting of Ca2+ transient probability and/or amplitude in the methods. Throughout the Figures, we typically illustrated probability of an evoked transient as this is a reliable measure which was dramatically altered within this study. We now report the Ca2+ transient peak amplitudes during HIT and MISS trials for direct comparison of both measures (Figure 2).

4. It was unclear why the authors did not attempt to use deconvolution and report spike probabilities, especially when considering the kinetics of GCaMP6f and the results presented in Figure 4, where event amplitude and event probability changed in opposing directions, which could reflect a change in burst firing since spikes in short high frequency bursts can appear as a single large amplitude event compared to single spikes. The authors could consider performing further analysis or discuss the caveats of analyzing the Ca signal across these large time windows.

We agree that it would be informative to deconvolve our calcium transients; however, since we don’t know the ground truth, we would not be confident in reporting deconvolved spike probabilities. It is for this reason that we report both evoked rates and amplitudes, and we now include a statement on the caveats of analyzing the calcium signal across large time windows in the revised manuscript.

5. It would be helpful to clarify the basis upon which boutons or ROIs were excluded when determining the ‘axon subset’ of ROIs. How did Ca event probability, event amplitude, and event duration compare between ROIs that were assessed as being from the same axon? It was unclear what was deemed as ‘similar activity profiles’ for the exclusion of ROIs. It might help to include an additional figure supplement to Figure 1 showing the ROI correlations and exclusion criteria with images showing ‘similar’ or ‘dissimilar’ ROIs marked on an example field of view.

In this study, ROIs were excluded if they shared greater than 95% of the same temporal activity pattern (evoked probability) with a neighboring ROI in the same FOV. In other words, the activity patterns were essentially identical, but we allowed a small window for varying signal-to-noise ratios in the different boutons. We now include this additional information in the methods, and include an additional Figure Supplement to Figure 1 to clearly illustrate our exclusion criteria (Figure 1 —figure supplement 4).
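The exclusion rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each ROI is represented by a per-trial list of 1 (evoked transient) or 0 (no transient), similarity is taken as the fraction of trials on which two ROIs agree, and every pair of ROIs in the field of view is compared rather than only spatial neighbors. The function names and the toy data are hypothetical.

```python
from typing import Dict, List


def pattern_overlap(a: List[int], b: List[int]) -> float:
    """Fraction of trials on which two ROIs agree (both evoked or both silent)."""
    if len(a) != len(b) or not a:
        raise ValueError("ROIs must share the same non-empty trial list")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def select_axon_subset(rois: Dict[str, List[int]], threshold: float = 0.95) -> List[str]:
    """Keep an ROI only if it does not overlap any already-kept ROI by more than `threshold`."""
    kept: List[str] = []
    for name, events in rois.items():
        if all(pattern_overlap(events, rois[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy field of view with 40 trials per ROI (illustrative values only).
roi_a = [1, 0] * 20
roi_b = list(roi_a)
roi_b[0] = 0                      # differs from roi_a on a single trial
roi_c = [1, 1, 0, 0] * 10         # clearly different pattern

fov = {"roi_a": roi_a, "roi_b": roi_b, "roi_c": roi_c}
axon_subset = select_axon_subset(fov)
```

Here `roi_b` agrees with `roi_a` on 39 of 40 trials (97.5%), so it is treated as a bouton on the same axon and dropped, whereas `roi_c` agrees on only half of the trials and is retained.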

6. The heterogeneity of POm axons was briefly shown (Figure 1) but not discussed or explored in depth. It was unclear how the authors interpret the observation that a subset of POm-fpS1 axons showed a larger increase in the reward epoch compared to the stimulus and response epochs. While this diversity in responses could be relevant to their claim that POm-fpS1 axons encode task performance, the authors did not perform experiments inhibiting POm during the reward epoch, leaving unclear the interpretation of what the reward epoch responses might mean. More discussion of the interpretation of POm-fpS1 activity during the reward epoch would be helpful, given that several sections of the Results are dedicated to this point. Also, clearer descriptions of row sorting in the figures (e.g. Figure 2B) would enable more direct comparison of activity of the same axon across different trial types.

We have now performed new experiments where we photo-inactivated the POm during the reward epoch in the tactile goal-directed task. Here, we see no change in task performance (see new panels in Figure 5). Together with the finding that POm axons within S1 are less active during the reward epoch compared with the response epoch, our findings suggest that the POm does not predominantly signal reward during expert behavior.

We now provide more detail in the Figure captions and methods regarding row sorting of the Ca2+ activity patterns (heatmaps) to improve clarity and interpretation.

[Editors’ note: what follows is the authors’ response to the second round of review.]

Although reviewers acknowledge improvement in the clarity of the manuscript, key experiments that were requested were not performed, such as inhibiting during different epochs or in naïve animals. Moreover, a major issue that was raised in the first round was to clarify what function the authors are ascribing to POm. Although the reviewers acknowledge improvement in language, the reviewers all felt that it is still not clear why POm is needed to signal correct performance, and the data do not exactly support the conclusion that POm=correct.

This study directly investigates the role of higher order somatosensory thalamus (POm) in the forepaw area of the primary somatosensory cortex during sensory-based goal-directed behavior, and highlights the important role this thalamic input has on behavioral performance and learning. In this new revised manuscript, we have now addressed all of the comments/suggestions from the reviewers and made the following major changes:

1) Optogenetically inhibited the POm during learning of the tactile goal-directed behavior. Here, we found learning was severely disrupted and mice took approximately twice as long to learn the task when the POm was photo-inactivated. This data is now included in new Figure 6.

2) To assess the role of POm activity during the reward delivery, we photo-inactivated the POm during the reward epoch in expert mice. Here we found no influence of POm photo-inactivation during reward delivery on mouse performance. We also performed control experiments where mice were instead injected with GFP. This result, which corresponds to the lower rates in the activity of POm axons within S1 during reward delivery, is now included in Figure 5.

3) Included an additional Figure Supplement to Figure 1 to clearly illustrate our POm axonal inclusion and exclusion criteria (Figure 1 —figure supplement 4).

4) Assessed the efficacy of POm axonal photoinhibition in vitro. This data is now included as a new panel in Figure 5 —figure supplement 1.

5) Included a more direct analysis in the revised manuscript illustrating the relationship between the licking response and POm activity. This analysis shows there is no correlation between licking and POm axonal activity, further suggesting that POm axonal activity is not simply due to licking behavior.

6) Reported the POm axonal Ca2+ transient peak amplitudes during HIT and MISS trials for direct comparison (Figure 2).

7) Compared the incorrect trials across the three task types and included this information in the revised manuscript.

8) Overall, the manuscript has been revised and rewritten to improve the reporting of the results.

Taken together, these new additions to the manuscript have strengthened the findings that the POm encodes and influences correct action selection in learnt behavior, and additional experiments have highlighted an important role of the POm during learning.

https://doi.org/10.7554/eLife.77177.sa2

Article and author information

Author details

  1. Danilo La Terra

    Florey Institute of Neuroscience and Mental Health, University of Melbourne, Melbourne, Australia
    Contribution
    Conceptualization, Data curation, Investigation, Methodology, Visualization, Writing – original draft, Writing – review and editing
    Competing interests
    No competing interests declared
  2. Ann-Sofie Bjerre

    Florey Institute of Neuroscience and Mental Health, University of Melbourne, Melbourne, Australia
    Contribution
    Investigation, Data curation
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-6032-6502
  3. Marius Rosier

    1. Florey Institute of Neuroscience and Mental Health, University of Melbourne, Melbourne, Australia
    2. School of Biochemistry and Immunology and Trinity College Institute for Neuroscience, Trinity College Dublin, Dublin, Ireland
    Contribution
    Data curation, Investigation
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-9732-5543
  4. Rei Masuda

    Florey Institute of Neuroscience and Mental Health, University of Melbourne, Melbourne, Australia
    Contribution
    Data curation, Investigation
    Competing interests
    No competing interests declared
  5. Tomás J Ryan

    1. Florey Institute of Neuroscience and Mental Health, University of Melbourne, Melbourne, Australia
    2. School of Biochemistry and Immunology and Trinity College Institute for Neuroscience, Trinity College Dublin, Dublin, Ireland
    3. Child & Brain Development Program, Canadian Institute for Advanced Research (CIFAR), Toronto, Canada
    Contribution
    Funding acquisition, Supervision
    Competing interests
    No competing interests declared
  6. Lucy M Palmer

    Florey Institute of Neuroscience and Mental Health, University of Melbourne, Melbourne, Australia
    Contribution
    Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – original draft, Writing – review and editing
    For correspondence
    lucy.palmer@florey.edu.au
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-3676-657X

Funding

National Health and Medical Research Council (APP1130716)

  • Lucy M Palmer
  • Tomás J Ryan

National Health and Medical Research Council (APP1063533)

  • Lucy M Palmer

National Health and Medical Research Council (APP1085708)

  • Lucy M Palmer

Australian Research Council (DP160103047)

  • Lucy M Palmer

Sylvia and Charles Viertel Charitable Foundation

  • Lucy M Palmer

The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.

Acknowledgements

We would like to thank members of the Palmer laboratory and Matthew Larkum for their helpful discussions and comments on the manuscript. We would also like to thank Verena Wimmer for her POm expertise and Ronny Bergmann, Viktor Bahr, Jens Kremkow, and Robert Sachev for use of their pupil-tracking software.

Ethics

All procedures were approved by the Florey Institute of Neuroscience and Mental Health Animal Care and Ethics Committee and followed the guidelines of the Australian Code of Practice for the Care and Use of Animals for Scientific Purposes.

Senior Editor

  1. Joshua I Gold, University of Pennsylvania, United States

Reviewing Editor

  1. Ishmail Abdus-Saboor, Columbia University, United States

Publication history

  1. Preprint posted: July 6, 2020
  2. Received: January 18, 2022
  3. Accepted: March 2, 2022
  4. Accepted Manuscript published: March 8, 2022 (version 1)
  5. Accepted Manuscript updated: March 9, 2022 (version 2)
  6. Version of Record published: March 21, 2022 (version 3)

Copyright

© 2022, La Terra et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Danilo La Terra
  2. Ann-Sofie Bjerre
  3. Marius Rosier
  4. Rei Masuda
  5. Tomás J Ryan
  6. Lucy M Palmer
(2022)
The role of higher-order thalamus during learning and correct performance in goal-directed behavior
eLife 11:e77177.
https://doi.org/10.7554/eLife.77177