A stable, distributed code for cue value in mouse cortex during reward learning

David J. Ottenheimer; Madelyn M. Hjort; Anna J. Bowen; Nicholas A. Steinmetz; Garret D. Stuber

doi:10.7554/eLife.84604.2

Introduction

Association of environmental stimuli with rewards and the subsequent orchestration of value-guided reward-seeking behavior are crucial functions of the nervous system linked to the pre-frontal cortex (PFC) (Klein-Flügge et al., 2022; Miller and Cohen, 2001). PFC is heterogeneous, with many studies noting subregional differences in both neural coding of (Hunt et al., 2018; Kennerley et al., 2009; Sul et al., 2010; Wang et al., 2020a) and functional impact on (Buckley et al., 2009; Dalley et al., 2004; Kesner and Churchwell, 2011; Rudebeck et al., 2008) value-based reward seeking in primates and rodents. Further, functional manipulations of PFC subregions exhibiting robust value signals do not always cause a discernible impact on reward-guided behavior (Chudasama and Robbins, 2003; Dalton et al., 2016; St. Onge and Floresco, 2010; Verharen et al., 2020; Wang et al., 2020a), encouraging investigation of differences between value signals across PFC. Within individual PFC subregions, multiple studies have observed evolving neural representations across time, calling into question the stability PFC signaling (Hyman et al., 2012; Malagon-Vina et al., 2018). A systematic comparison of coding properties across rodent PFC and related motor and sensory regions, as well as across days and stimulus sets, is necessary to provide a full context for the contributions of PFC subregions to reward processing.

Identifying neural signals for value requires a number of considerations. One issue is that other task features can vary either meaningfully or spuriously with value. In particular, action coding is difficult to parse from value signaling, given the high correlations between behavior and task events (Musall et al., 2019; Zagha et al., 2022) and widespread neural coding of reward-seeking actions (Steinmetz et al., 2019). Additionally, without a sufficiently rich value axis, it is possible to misidentify neurons as ‘value’ coding even though they do not generalize to valuations in other contexts (Hayden and Niv, 2021; Stalnaker et al., 2015; Zhou et al., 2021). Because reports of value have come from different experiments across different species, it is difficult to compare the presence of value signaling even across regions within prefrontal cortex (Hayden and Niv, 2021; Hunt et al., 2018; Kennerley et al., 2009; Namboodiri et al., 2019; Otis et al., 2017; Stalnaker et al., 2015; Sul et al., 2010; Wang et al., 2020a; Zhou et al., 2021).

In this work, we sought to address the existing ambiguity in the distribution and stability of value signaling. We implemented an olfactory Pavlovian conditioning task that permitted identification of value correlates within the domain of reward probability across two separate stimulus sets. With acute in vivo electrophysiology recordings, we were able to assess coding of this task across 11 brain regions, including five PFC subregions, as well as olfactory and motor cortex, in a single group of mice, permitting a well-controlled comparison of coding patterns across a large group of task-relevant regions in the same subjects. Unexpectedly, in contrast to the graded cue and lick coding across these regions, the proportion of neurons encoding cue value was more consistent across regions, with a slight enrichment in PFC but with similar value decoding performance across all regions. To assess coding stability, we performed 2-photon calcium imaging of neurons in PFC for multiple days and determined that the cue and lick codes we identified were stable over time. Our data demonstrate universality and stability of cue-reward coding in mouse cortex.

Results

Distributed neural activity during an olfactory Pavlovian conditioning task

We trained mice on an olfactory Pavlovian conditioning task with three cue (conditioned stimulus) types that predicted reward on 100% (‘CS+’), 50% (‘CS50’), or 0% (‘CS-’) of trials (Fig. 1A). Each mouse learned two odor sets (odor sets A and B), trained and imaged on separate days and then, for electrophysiology experiments, presented in six alternating blocks of 51 trials during recording sessions (Fig. 1B). Mice developed anticipatory licking (Fig. 1C-D), and the rate of this licking correlated with reward probability (Fig. S1), indicating that subjects successfully learned the meaning of all six odors.

Electrophysiology and calcium imaging during olfactory Pavlovian conditioning.
(A) Trial structure in Pavlovian conditioning task.
(B) Timeline for mouse training.
(C) Mean (+/− standard error of the mean (SEM)) lick rate across mice (n = 5) on each trial type for each odor set during electrophysiology sessions. CS50(r) and CS50(u) are rewarded and unrewarded trials, respectively. Inset: mean anticipatory licks (change from baseline) for the CS+ and CS50 cues for every session, color-coded by mouse. F (1, 66) = 36.6 for a main effect of cue in a two-way ANOVA including an effect of subject.
(D) Same as (C), for the third session of each odor set (n = 5 mice). t(4) = 5.4 for a t-test comparing anticipatory licks on CS+ and CS50 trials.
(E) Neuropixels probe tracks labeled with fluorescent dye (red) in cleared brain (autofluorescence, green). AP, anterior/posterior; ML, medial/lateral; DV, dorsal/ventral. Allen CCF regions delineated in gray. Outline of prelimbic area in purple.
(F) Reconstructed recording sites from all tracked probe insertions (n = 44 insertions, n = 5 mice), colored by mouse.
(G) Sample histology image of lens placement. Visualization includes DAPI (blue) and GCaMP (green) signal with lines indicating cortical regions from Allen Mouse Brain Common Coordinate Framework.
(H) Location of all lenses from experimental animals registered to Allen Mouse Brain Common Coordinate Framework. Blue line indicates location of lens in (A). The dotted black line represents approximate location of tissue that was too damaged to reconstruct an accurate lens track. The white dotted line indicates PL borders.
(I) ML and DV coordinates of all neurons recorded in one example session, colored by region, and spike raster from example PL neurons.
(J) ROI masks for identified neurons and fluorescence traces from 5 example neurons.

Using Neuropixels 1.0 and 2.0 probes (Jun et al., 2017; Steinmetz et al., 2021), we recorded the activity of individual neurons in PFC, including anterior cingulate area (ACA), frontal pole (FRP), prelimbic area (PL), infralimbic area (ILA), and orbital area (ORB) (Laubach et al., 2018; Wang et al., 2020b). We also recorded from: secondary motor cortex (MOs), including anterolateral motor cotex (ALM), which has a well-characterized role in licking (Chen et al., 2017); olfactory cortex (OLF), including dorsal peduncular area (DP), dorsal taenia tecta (TTd), and anterior olfactory nucleus (AON), which receive input from the olfactory bulb (Igarashi et al., 2012; Mori and Sakano, 2021); and striatum, including caudoputamen (CP) and nucleus accumbens (ACB) (Fig. 1E-F), which are major outputs of PFC (Heilbronner et al., 2016). In a separate group of mice, we performed longitudinal 2-photon calcium imaging through a Gradient Refractive Index (GRIN) lens to track the activity of individual neurons in PL across several days of behavioral training (Fig. 1G-H). Both techniques permitted robust measurement of the activity of neurons of interest and generated complementary results (Figs. 1I-J, S2).

Graded cue and lick coding across the recorded regions

In the electrophysiology experiment, we isolated the spiking activity of 5332 individual neurons in regions of interest across 5 mice (449-1550 neurons per mouse, Figs. 2A, S3A). The activity of neurons in all regions exhibited varying degrees of modulation in response to the six cues (Fig. 2B). Broadly, there was strong modulation on CS+ and CS50 trials that appeared to be common to both odor sets (Fig. S3B). Across regions, there was heterogeneity in both the magnitude and the timing of the neural modulation relative to odor onset (Fig. S3C).

To quantify the relative contribution of cues and conditioned responding (licking) to the activity of neurons in each region, we implemented reduced rank kernel regression (Steinmetz et al., 2019), using cues, licks, and rewards to predict neurons’ activity on held out trials (Figs. 2C, S5A). To determine the contribution of cues, licks, and rewards to each neuron’s activity, we calculated unique variance explained by individually removing each predictor from the model and calculating the reduction in model performance (Fig. 2D).

We identified individual neurons encoding cues, licks, or rewards as those for which that predictor uniquely contributed to 2% or more of their variance (a cutoff permitting no false positives and identifying neurons with robust task modulation, see Methods and Fig. S4). Neurons encoding cues (24% of all neurons), licks (11%), or both (16%) were most common. Neurons with any response to reward (independent of licking) were rare (5%) (Horst and Laubach, 2013). Cue neurons were characterized by sharp responses aligned to odor onset; in contrast, lick neurons’ responses were delayed and peaked around reward delivery (Fig. S5B-C), consistent with the timing of licks (Fig. 1C). The activity of cue neurons on rewarded and unrewarded CS50 trials validated our successful isolation of neurons with cue but not lick responses (Fig. S5D). The spatial distributions of cue and lick cells were noticeably different (Fig. 2E). The differences could be described as graded across regions, with the most lick neurons in ALM, and the most cue neurons in olfactory cortex and ORB, though each type of neuron was observed in every region (Fig. 2F-G, S6). Thus, our quantification of task encoding revealed varying proportions of cue and lick signaling across all regions.

Cue value coding is present in all regions

To expand upon our analysis identifying cue-responsive neurons, we next assessed the presence of cue value coding in this population. The three cue types (CS+, CS50, or CS-) in our behavioral task varied in relative value according to the predicted probability of reward (Eshel et al., 2016; Fiorillo et al., 2003; Winkelmeier et al., 2022). We reasoned that a neuron encoding cue value should have activity that scaled with the relative value of the cues (Fig. 3A). We modeled this relationship on a per-neuron basis by scaling a single cue kernel by its reward probability (0, 0.5, or 1, see Methods, Fig. 3B). This model describes cue activity as similar across odors of the same value, and scaling in magnitude according to each odor’s value. To consider alternative cue coding patterns, we also fit each neuron with 152 additional models containing all possible permutations of these values across the six cues, as well as models with selective responses for 1, 2, 3, 4, 5, or 6 cues, and determined which model best fit each neuron (Fig. S7). If cue responses were exclusively sensory and followed known olfactory coding properties (Pashkovski et al., 2020; Stettler and Axel, 2009), there would be no bias toward the ranked value model (CS+ > CS50 > CS-). We found, however, that this model was the most frequent best model, accounting for 14% of cue neurons (Fig. 3C). We refer to these neurons as value cells. There were two additional patterns that emerged across the population of cue neurons. First, there was a large fraction best explained by the model with equivalent responses to all 6 cues, which we term untuned cells (14% of cue neurons). Second, many of the alternative models had coding patterns that were similar to the ranked value model, and these appeared to be overrepresented among cue neurons, as well. We quantified the similarity to ranked value by correlating the values assigned to each cue in each model with those assigned to the cues in the ranked value model; this approach revealed an enrichment in neurons best fit by models most similar to ranked value (35% of cue neurons, Fig. 3C-D). We refer to neurons best fit by models most similar to the value model as value-like cells.

Robust value encoding and decoding among cue cells.
(A) Normalized activity of an example value cell with increasing modulation for cues with higher reward probability.
(B) For the same neuron, model-fit cue kernel for the original value model and with one of the 152 alternatively-permuted cue coding models.
(C) Distribution of best model fits across all cue neurons. Light blue is value model, purple is value-like models, gray is untuned model, and the remaining models are dark blue. Value-like models are shaded according to their correlation with ranked value, as illustrated in (D). Dashed line is chance proportion when assuming even distribution.
(D) Schematic of value assigned to each of the 6 cues for many of the cue coding models (full schematic in Fig. S7). Value-like models are sorted by their correlation with the ranked value model.
(E) Left: normalized activity of every value cell, sorted by mean firing 0 - 1.5 s following odor set A CS+ onset. Right: mean normalized activity of all value cells, grouped by whether peak cue activity (0 - 2.5 s) was above (top) or below (bottom) baseline in held out trials. Number of neurons noted for each plot.
(F) As in (E), for value-like cells.
(G) Accuracy (mean +/− SEM across neurons) of decoded cue identity for single neurons of value, value-like, and untuned neurons. * indicates where value, value-like, and untuned neurons significantly differed from each other and baseline (all *p <* 0.001, Bonferroni corrected). All pairwise comparisons in Supplemental Table 2.
(H) Accuracy (mean+/−std across bootstrapped iterations) of decoded cue identity using different numbers of neurons.
(I) Left: estimated value (mean +/− SD across 1000 bootstrapped iterations) of held out CS+ (top) and CS- (bottom) trials using linear models trained on the activity of value, value-like, or untuned neurons. Right: accuracy (mean+/−std across bootstrapped iterations) of decoded cue value using these value estimates. * indicates where the accuracy of value neurons exceeded value-like and untuned neurons (all *p <* 0.016, bootstrapped). All pairwise comparisons in Supplemental Table 2.

Interestingly, the frequency of value cells was similar across the recorded regions (Fig. 4A). Despite the regional variability in number of cue cells broadly (Fig. 2F-G), there were very few regions that statistically differed in their proportions of value cells (Figs. 4A, S9). Overall, there were slightly more value cells across all of PFC than in motor and olfactory cortex (Figs. 4A, S9). Although olfactory cortex had the most cue cells, these were less likely to encode value than cue cells in other regions (Fig. S10). Value-like cells were also widespread; they were less frequent in motor cortex as a fraction of all neurons, but they were equivalently distributed in all regions as a fraction of cue neurons (Figs. 4B, S9, S10).

Widespread cue value coding.
(A) Fraction of neurons in each region and region group classified as value cells (blue) and other cue neurons (gray), as well as fraction (+/− 95% CI) estimated from a linear mixed effects model with random effect of session (see Methods). PFC has more value cells than motor (p = 0.002) and olfactory (p = 0.00005) cortex. All pairwise comparisons in Supplemental Table 3.
(B) As in (A), for value-like cells. Motor cortex has fewer value-like cells than PFC (p = 8 10⁻⁶) and olfactory cortex (p = 4 10⁻⁸). All pairwise comparisons in Supplemental Table 3.
(C) First principal component value cells from all regions.
(D) As in (C), for value-like cells.
(E) Accuracy of decoded cue value (mean +/− SD across 1000 bootstrapped iterations) as in Fig. 3I, using 5 (with replacement) value cells from each region (left) and 25 value cells from each region group (right) using cue-evoked (blue) and baseline (black) activity. No regions or region groups significantly differed from each other (*p >* 0.46, Bonferroni corrected). All pairwise comparisons in Supplemental Table 3.

We next investigated the robustness of the value representation in each of our recorded regions. Principal component analysis on value and value-like cells from each region revealed similarly strong value-related dynamics across motor, prefrontal, and olfactory regions (Fig. 4C-D). We quantified the robustness of value coding in each region by decoding cue value using selections of value cells from each region and found similar performance across all regions (Fig. 9E). Taken together, these data illustrate that, in contrast to cue and lick coding broadly, value coding is similarly represented across the regions we sampled. In fact, this observation extended to the striatal regions we sampled as well, indicating that such value coding is widespread even beyond cortex (Fig. S11).

Because cue valuations can be influenced by preceding reward outcomes, we next considered whether the cue value signaling we detected was sensitive to the history of reinforcement (Nakahara et al., 2004; Ottenheimer et al., 2020; Winkelmeier et al., 2022). To estimate the subjects’ trial-by-trial cue valuation, we fit a linear model predicting the number of anticipatory licks on each trial using cue type, overall reward history, and cue type-specific reward history as predictors. We found a strong influence of cue type-specific reward history and a more modest influence of overall reward history (Fig. 5A). We used the model prediction of licks per trial as our estimate of trial value; the effects of reward history on lick rate were apparent when grouping trials by the value estimates from the trial value model (Fig. 5B).

A subset of cue cells incorporate reward history.
(A) Coefficient weight (+/− standard error from model fit) for reward outcome on the previous 10 trials of any type (left) and on the previous 10 trials of the same cue type (right) for the ‘trial value’ model: a linear model predicting the number of anticipatory licks on every trial of every session. Lick rates were normalized so that the maximum lick rate for each session was equal to 1. Colored lines are models fit to each individual mouse.
(B) Mean (+/− SEM) lick rate across mice (n = 5 mice) on trials binned according to value estimated from the trial value model.
(C) Normalized activity of an example history value cell with increasing modulation for cues of higher value.
(D) For the same neuron, model-predicted activity with the original value model (left) and with the history model, which uses trial-by-trial value estimates from the trial value model (right).
(E) For the same neuron, model-predicted activity using licks. Inset: variance explained using licks versus history for history neurons.
(F) The activity of all cells in each category projected onto the coding dimension maximally separating CS- and CS+ for trials binned by value estimated from the trial value model.
(G) The mean (+/− std across 5000 bootstrapped selections of neurons) activity (1 - 2.5 s from odor onset) along the coding dimension maximally separating CS- and CS+ for trials binned by value estimated from the lick model.
(H) The mean (+/− std across 5000 bootstrapped selections of neurons) slope of the activity on CS50 trials regressed onto the trial value model estimate for those trials. History and lick cells had greater slopes than the other groups (*p <* 0.0003, see Supplemental Table 4).
(I) Fraction of neurons in each region and region group classified as history cells (light blue) and other cue neurons (gray), as well as estimated fraction (+/−95% CI) with random effect of session (see Methods). PFC had more history cells than motor (p = 0.0016) and olfactory (p = 0.00053) cortex. All pairwise comparisons in Supplemental Table 4.

We therefore investigated whether value cells showed similar trial-by-trial differences in their cue-evoked firing rates (Fig. 5C). To test this, we compared the fit of our original cue coding models (Fig. 3B-D) with an alternative model in which the kernel scaled with the pertrial value estimates from our trial value model (Fig. 5D). Overall, 5% of cue cells, including 15% of value cells, were best fit by the history model. Although the number of anticipatory licks per trial was used to generate the trial value estimates, the precise licking pattern on those trials was a poorer predictor of neural responses than the trial value-scaled cue kernel model (Fig. 5E). To further evaluate the history component of these neurons, we calculated these neurons’ activity on CS50 trials of varying value estimates from the trial value model and projected it onto the population dimension maximizing the separation between CS+ and CS-. We hypothesized that high value CS50 trials would be closer to CS+ activity while low value CS50 trials would be closer to CS-activity. Indeed, history cells (and lick cells) demonstrated graded activity along this dimension, in contrast to non-history value, value-like, and untuned cells (Fig. 5F-H). Finally, we examined the regional distribution of history cells and found low numbers across all regions, but with higher prevalence overall in PFC than in motor and olfactory cortex (Fig. 5I), lending additional support for slightly enhanced value coding in PFC.

Cue coding emerges along with behavioral learning

To determine the timescales over which these coding schemes emerged and persisted, we performed longitudinal 2-photon calcium imaging and tracked the activity of individual neurons across several days of behavioral training (Fig. 6A). We targeted a GRIN lens to PL, a location with robust cue and lick coding (Fig. 2F) and where cue responses were predominantly value or value-like (Figs. 3E-F, S10). Mice (n = 8) developed anticipatory licking during the first sessions of odor set A (A1) that differentiated CS+ trials from CS50 (t(7) = 3.2, p = 0.015) and CS-(t(7) = 7.0, p = 0.0002) trials and CS50 trials from CS- (t(7) = 3.7, p = 0.008) trials (Fig. 6B-C). Visualizing the normalized activity across the imaging plane following CS+ presentation early and late in session A1 revealed a pronounced increase in modulation across this first session (Fig. 6D-E). Individual neurons (n = 705, 41-165 per mouse) also displayed a notable increase in modulation in response to the CS+ after task learning (Fig. 6F).

Acquisition of conditioned behavior and cue encoding in PFC.
(A) Training schedule for 5 of the mice in the calcium imaging experiment. An additional 3 were trained only on odor set A.
(B) Mean (+/− SEM) licking on early (first 60) and late (last 60) trials from day 1 of odor set A (n = 8 mice).
(C) Mean (+/− SEM) baseline-subtracted anticipatory licks for early and late trials from each day of odor set A. Thin lines are individual mice (n = 8 mice).
(D) Standard deviation of fluorescence from example imaging plane.
(E) Normalized activity of each pixel following CS+ presentation on early and late trials of session A1.
(F) Normalized deconvolved spike rate of all individual neurons on early and late trials of session A1.
(G) Proportion of neurons classified as coding cues, licks, rewards, and all combinations for each third of session A1.
(H) Mean(+/− SEM) unique variance explained by cues, licks, and rewards for neurons from each mouse. Thin lines are individual mice. Unique variance was significantly different across session thirds for cues (F (2, 21) = 3.71, p = 0.04) but not licks (F (2, 21) = 0.37, p = 0.69) or reward (F (2, 21) = 0.65, p = 0.53, n = 8 mice, one-way ANOVA).
(I) Mean (+/− SEM) normalized deconvolved spike rate for cells coding cues, licks, both, or neither on early and late trials, sorted by whether peak cue activity (0 - 2.5 s) was above (top) or below (bottom) baseline for late trials.

To determine whether this increase in activity was best explained by a cue-evoked response, licking, or both, we again used kernel regression to fit and predict the activity of each neuron for early, middle, and late trials in session A1. The number of individual neurons encoding cues more than doubled from early to late A1 trials (Fig. 6G). The unique variance cues increased across this first session, in contrast to licks and reward (Fig. 6H). This stark change in cue coding was also noticeable when plotting neurons encoding cues, licks, or both, as defined at the end of the session, on both early and late trials (Fig. 6I). These data indicated that PFC neural activity related to cues (but not licks) rapidly emerges during initial learning of the behavioral task.

Cue and lick coding is stable across days

We next assessed whether cue and lick coding were stable across days. By revisiting the same imaging plane on each day of training, we were able to identify neurons that were present on all three days of odor set A training (n = 371, 20-65 per mouse) (Fig. 7A-B). There was remarkable conservation of task responding across days, both on an individual neuron level (Fig. 7C) and across all imaged neurons (Fig. 7D). In fact, neurons were much more correlated with their own activity on the subsequent day than would be expected by chance (Figs. 7E, S12A). To further quantify coding stability, we fit our kernel regression to the activity of each neuron on session A3 (Fig. 7F) and then used these models to predict activity on early, middle, and late trials on sessions A1-3. Session A3 model predictions were most highly correlated with true activity during A3, but they outperformed shuffle controls at all time points, demonstrating preservation of a learned coding scheme (Figs. 7G, S12B). We then asked more specifically whether cells coding cues, licks, and both maintained their coding preferences across days. For each group of cells, we calculated their unique cue, lick, and reward variance at each time point. The preferred coding of each group, as defined on session A3, was preserved in earlier days (Fig. 7H). Thus, cue and lick coding are stable properties of PFC neurons across multiple days of behavioral training.

A subset of mice (n = 5) also learned a second odor set (odor set B), presented on separate days. Activity was very similar for both odor sets, evident across the entire imaging plane (Fig. 8A), for individual tracked neurons (n = 594, 81-153 per mouse) (Fig. S2B), and for kernel regression classification of these neurons (Fig. 8B). Notably, odor set A models performed similarly well at predicting both odor set A and odor set B activity (Fig. 8C). Moreover, cue, lick and both neurons maintained their unique variance preference across odor sets (Fig. 8D). Finally, to investigate the presence of value coding across odor sets over separate days, we fit tracked cue neurons with the value model and its shuffles. Even with odor sets imaged on separate days (days 5 and 6 of training, A3 and B3), we again found that the value and value-like models were the best models for sizable fractions (9% and 47%, respectively) of cue neurons, demonstrating that value coding is conserved across stimulus sets on consecutive days (Fig. 8E-G). Given the prominence of value-like signals in this imaged population, we then assessed the stability of cue cells with preferential CS+ responses across the tracked A1-3 sessions and found conservation of a value-like coding pattern (Fig. 8H) and, as with the whole population (Fig. 7G), greater correlation in activity across days than expected by chance (Fig. 8I).

Cue and lick coding in separately trained odor sets.
(A) Normalized activity of all pixels in the imaging plane following CS+ presentation on the third day of each odor set (A3 and B3, days 5 and 6 of training).
(B) Fraction of neurons coding for cues, licks, rewards, and their combinations in A3 and B3 (days 5 and 6).
(C) Mean (+/− SEM, across mice) correlation between activity predicted by odor set A3 models and its training data (A3, cross-validated) or activity in B3, for true (black) and trial shuffled (gray) activity. Thin lines are individual mice. F (1, 16) = 3.2, p = 0.09 for main effect of odor set, F (1, 16) = 135, *p <* 10⁻⁸ for main effect of shuffle, F (1, 16) = 2.2, p = 0.16 for interaction, n = 5 mice, two-way ANOVA.
(D) Mean (+/− SEM, across mice) unique cue, lick, and reward variance for cells classified as coding cues, licks, both, or neither for odor set A. For each category, odor set A unique variance preference was maintained for odor set B (*p <* 0.04) except for both cells, for which lick and reward variance were not different in odor set B (p = 0.22, Bonferroni-corrected, n = 5 mice).
(E) Distribution of best model fits across all cue cells, with colors from Fig. 3C. Dashed line is chance proportion when assuming even distribution.
(F) Left: normalized activity of every value cell, sorted by mean firing 0 - 1.5 s following odor set A CS+ onset. Right: mean normalized activity of all value cells, grouped by whether peak cue activity (0 - 2.5 s) was above (top) or below (bottom) baseline in held out trials. Number of neurons noted for each plot.
(G) As in (E), for value-like cells.
(H) Mean (+/− SEM, across neurons) activity of cue cells tracked across A1, A2, and A3 with preferential CS+ firing, defined on half of A3 trials and plotted for the other half of A3 trials and all of A1 and A2 trials.
(I) For neurons in (H), correlation between a neuron’s activity in one session and its own activity in the subsequent session, quantified as a percentile out of correlations with the activity of all other neurons on the subsequent day. Plotted as the median for each subject (n = 7 with CS+ preferring cue cells) and the mean (+/− SEM) across these values. Real data was more correlated than shuffled data (p = 0.016 A1:A2, p = 0.031 A2:A3, Wilcoxon signed-rank test).

Discussion

Our experiments assessed how coding for reward-predicting cues and reward-seeking actions differed across brain regions and across multiple days of training. We found coding for cues and licks in all regions we sampled, but their proportions varied in a graded way across those regions. In contrast to regional differences in the proportion of cue-responsive neurons, cue value cells were present in all regions and value could be decoded from them with similar accuracy regardless of region. Coding for cue value was greatly overrepresented compared to alternative cue coding schemes and, in a subset of neurons, incorporated the recent reward history. Cue coding was established within the first day of training and neurons encoding cues or licks maintained their coding preference across multiple days of the task; the value characteristics of cue cells were also maintained across days. These results demonstrate widespread value coding and stability of cue and lick codes in PFC.

Graded cue and lick coding across regions

We found robust and separable coding for licks and cues (and combined coding of both) in all regions using electrophysiology and in PL using calcium imaging. The widespread presence of lick coding is consistent with recent reports of distributed movement and action coding (Musall et al., 2019; Steinmetz et al., 2019; Stringer et al., 2019); however, we saw sizable differences in the amount of lick coding across recorded regions. Notably, ALM had the greatest number of lick neurons, as well as the fewest cue neurons, perhaps reflecting its specialized role in the preparation and execution of licking behavior (Chen et al., 2017). Conversely, the olfactory cortical regions DP, TTd, and AON had the most cue neurons (especially non-value coding cue neurons), suggesting a role in early odor identification and processing (Mori and Sakano, 2021). PFC subregions balanced lick and cue coding, consistent with their proposed roles as association areas (Klein-Flügge et al., 2022; Miller and Cohen, 2001), but there was variability within PFC as well. In particular, ORB had a greater fraction of cue cells than any other subregions, consistent with its known dense inputs from the olfactory system (Ekstrand et al., 2001; Price, 1985; Price et al., 1991). Thus, our results establish that the neural correlates of this Pavlovian conditioned behavior consist of a gradient of cue and response coding rather than segmentation of sensory and motor responses.

Widespread value signaling

Value signals can take on many forms and occur throughout task epochs. In our experiments, we focused on the predicted value associated with each conditioned stimulus, which is crucial for understanding how predictive stimuli produce motivated behavior (Berridge, 2004). Surveys of value coding in primate PFC have found individual neurons correlated with stimulus-predicted value in many subregions, with the strongest representations typically in ORB (Hunt et al., 2018; Kennerley et al., 2009; Roesch and Olson, 2004; Sallet et al., 2007). In rodents, there is also a rich literature on value signaling in ORB (Kuwabara et al., 2020; Namboodiri et al., 2019; Schoenbaum et al., 2003; Stalnaker et al., 2014; Sul et al., 2010; van Duuren et al., 2009; Wang et al., 2020a), but there have also been many reports of value-like signals in frontal cortical regions beyond ORB (Allen et al., 2019; Kondo and Matsuzaki, 2021; Otis et al., 2017; Wang et al., 2020a). In our present experiment, we sought to expand upon these rodent results by separating cue activity from licking, which tracks value and may confound interpretation, by including more than two cue types, which provided a rich space to assess value coding, and by sampling from many frontal regions in the same experiment.

When considering the number of neurons responsive to cues rather than licks, our data confirmed the importance of ORB, which has more cue-responsive neurons than motor and other prefrontal regions, but, beyond cue responsivity, we were interested in identifying specific cue coding patterns pertaining to value. By analyzing the activity of cue-responsive neurons across all 6 odors predicting varying probabilities of reward, we were able to isolate neurons coding value, as well as those with value-like signals that could easily be misconstrued as value-coding in a task with fewer cues and value levels. Included in the value-like models are coding patterns that bias their activity for higher value odors without fitting our strict linear ranked value criteria; for instance, selective firing for one or two of the CS+ odors. The enrichment of these models among cue responsive neurons, even in olfactory cortex, indicates the prevalence of value-biased coding schemes for odor-responsive neurons across brain regions. The question remains of where odor information is first shaped according to value. There have been multiple reports of some association-related modification of odor representations as early as the olfactory bulb (Chu et al., 2016; Doucette et al., 2011; Koldaeva et al., 2019; Li et al., 2015). Considering we detected value and especially value-like coding in AON, DP, and TTd, perhaps these regions are a crucial first step in processing and amplifying task-related input from the olfactory bulb. Because they provide input to PFC (Bhattarai et al., 2021; Igarashi et al., 2012), they may be an important source of the cue coding we observed there.

The distribution of cue cells with linear coding of value was mostly even across regions, with slight enrichment overall in PFC compared to motor and olfactory cortex, but no subregional differences in PFC. Importantly, cue value could be decoded from value cells in each region with similar accuracy. One consequence of a widely distributed value signal is that manipulating only one subregion would be less likely to fully disrupt value representations, which is consistent with the results of studies comparing functional manipulations across PFC (Chudasama and Robbins, 2003; Dalton et al., 2016; St. Onge and Floresco, 2010; Verharen et al., 2020; Wang et al., 2020a). Different subregional impacts on behavior may reveal biases in how the value signal in each region contributes to reward-related behaviors, for instance during learning or expression of a reward association (Namboodiri et al., 2019; Otis et al., 2017; Wang et al., 2020a). A related interpretation is that, in this task, there may be other properties that correlate with cue value, and the homogeneous value representation we observed across regions masks regional differences in tuning to these other correlated features, such as motivation (Roesch and Olson, 2004) and a host of related concepts, including salience, uncertainty, vigor, and arousal (Hayden and Niv, 2021; Stalnaker et al., 2015; Zhou et al., 2021), which can have different contributions to behavior. This interpretation is consistent with broader views that observations of ‘value’ signals are often misconstrued (Zhou et al., 2021) and that pure abstract value may not be encoded in the brain at all (Hayden and Niv, 2021). Although the identification of value in our task was robust to three levels of reward probability across two stimulus sets, the fact that this signal was widespread contributes to the case for revisiting the definition and interpretation of value to better understand regional specialization.

In our analysis, we uncovered a distinction between neurons encoding the overall value of cues and those with value representations that incorporated the recent reward history. Neurons with history effects were rare and most frequent in PFC. These neurons may have a more direct impact on behavioral output in this task, because the lick rate also incorporated recent reward history. Notably, the impact of reward history on these neurons was noticeable even prior to cue onset, consistent with a previously proposed mechanism for persistent value representations encoded in the baseline firing rates of PFC neurons (Bari et al., 2019).

Stability of PFC codes

Previous reports have observed drifting representations in PFC across time (Hyman et al., 2012; Malagon-Vina et al., 2018), and there is compelling evidence that odor representations in piriform drift over weeks when odors are experienced infrequently (Schoonover et al., 2021). On the other hand, it has been shown that coding for odor association is stable in ORB and PL, and that coding for odor identity is stable in piriform (Wang et al., 2020a), with similar findings for auditory Pavlovian cue encoding in PL (Grant et al., 2021; Otis et al., 2017) and ORB (Namboodiri et al., 2019). We were able to expand upon these data in PL by identifying both cue and lick coding and showing separable, stable coding of cues and licks across days and across sets of odors trained on separate days. We were also able to detect value coding common to two stimulus sets presented on separate days, and conserved value features across the three training sessions. Notably, the model with responses only to CS+ cues best fit a larger fraction of imaged PL neurons than the ranked value model, a departure from the electrophysiology results. It would be interesting to know if this is due to a bias introduced by the calcium imaging approach, the slightly reduced CS50 licking relative to CS+ licking in the imaging cohort, or the shorter imaging experimental timeline.

The consistency in cue and lick representations we observed indicates that PL serves as a reliable source of information about cue associations and licking during reward seeking tasks, perhaps contrasting with other representations in PFC (Hyman et al., 2012; Malagon-Vina et al., 2018). Interestingly, the presence of lick, but not cue coding at the very beginning of the first session of training suggests that lick cells in PL are not specific to the task but that cue cells are specific to the learned cue-reward associations. Future work could expand upon these findings by examining stimulus-independent within session value coding across many consecutive days.

Overall, our work emphasizes the importance of evaluating regional specialization of neural encoding with systematic recordings in many regions using the same task. Future work will clarify whether cue value is similarly widely represented in other reward-seeking settings and whether there are regional differences in the function of the value signal.

Supporting information

Supplemental Table 2

Supplemental Table 3

Supplemental Table 4

Supplemental Table 5

Supplemental Table 1

Acknowledgements

Thank you to Vijay Namboodiri and Charles Zhou for assistance with the imaging. Thank you to Noam Roth for the spike sorting quality control metrics and feedback on the decoding analysis. This work was supported by National Institutes of Health grants F32DA053714 (D.O.), F31DA053706 (M.H.), T32DK007247 (A.B.), R37DA032750 (G.S.), and P30DA048736 (G.S.), a UW Center for the Neurobiology of Addiction, Pain, and Emotion 2-photon pilot project grant (D.O.), a Klingenstein-Simons Fellowship in Neuroscience (N.S.), and the Pew Biomedical Scholars Program (N.S.).

Author contributions

Conceptualization: D.O., M.H., N.S., G.S.; data collection: D.O., M.H., A.B.; data curation: D.O., A.B.; data interpretation: D.O., M.H., N.S., A.B., G.S.; formal analysis: D.O.; visualization: D.O., M.H., A.B.; writing - original draft: D.O., M.H., A.B.; writing - review & editing: D.O., M.H., A.B., N.S., G.S.; funding acquisition: N.S., G.S..

Declaration of interests

The authors declare no competing interests.

Data and code availability

The data and code for this manuscript are publicly available at https://doi.org/10.5281/zenodo.6686927 (Ottenheimer et al., 2022).

Methods

Subjects

Subjects (n = 5 for electrophysiology, n = 8 for calcium imaging) were male and female C57BL/6 mice single-housed on a 12hr light/dark cycle and aged 12-28 weeks at the time of recordings. Imaging experiments were performed during the dark cycle, electrophysiology during light cycle. Mice were given free access to food in their home cages for the duration of the experiment. Mice were water restricted for the duration of the experiments and maintained at around 85% of their baseline weight (Guo et al., 2014a). All experimental procedures were performed in strict accordance with protocols approved by the Animal Care and Use Committee at the University of Washington.

Surgical procedures

Mice were anesthetized with isoflurane (5%) and maintained under anesthesia for the duration of the surgery (1-2%). Mice received injections of carprofen (5 mg/kg) prior to incision.

Electrophysiology

A brief (1 h) initial surgery was performed, as previously described (Guo et al., 2014b; Steinmetz et al., 2017, 2019), to implant a steel headbar (approximately 15 3 0.5 mm, 1 g) for head fixation and a 3D-printed recording chamber exposing the skull for subsequent craniotomies. Briefly, an oval incision was made extending from the interparietal bone to the frontonasal suture, skirting the ocular area. The skin and periosteum were removed to expose the entire dorsal surface of the skull. Skull yaw, pitch, and roll were leveled, and exposed bone was texturized with a brief application of green activator (Super-Bond C&B, Sun Medical). The incision was secured to the skull with application of cyanoacrylate (VetBond; World Precision Instruments), and the 3D-printed recording chamber was attached to the skull with L-type radiopaque polymer (Super-Bond C&B). A thin layer of cyanoacrylate was applied to the skull inside the chamber and allowed to dry. Multiple (2-4) thin layers of UV-curing optical glue (Norland Optical Adhesives #81, Norland Products) were applied to the skull inside the chamber and cured with UV light to protect the exposed bone. The headbar was attached to the skull over the interparietal bone posterior to the chamber with Super-Bond polymer, and more polymer was applied around the headbar and chamber. Following recovery, a second brief (15-30 min) surgery was conducted to perform craniotomies for Neuropixels probe insertion. Briefly, following induction of anesthesia a small (2 1.5 mm (w h)) craniotomy was made over frontal cortex (+2.5 - 1 mm AP, 2.5 - 0.3 mm ML) with a handheld dental drill. The craniotomy was covered with a soft silicone gel (DOWSIL 3-4680) and the recording chamber was covered with a 3D-printed lid sealed with Kwik-Cast elastomer to protect craniotomy from dust.

Calcium imaging

A Gradient-Refractive Index (GRIN) lens and metal headcap were implanted following previously described procedures (Namboodiri et al., 2019) with the following modifications. In most mice, once the dura was removed from the craniotomy, we performed two injections of 0.5 µL of virus (1 µL total) containing the GCaMP gene construct (AAVDJ-CamKIIa-GCaMP6s, 5.3 10¹² viral particles/mL from UNC Vector core lot AV6364) using a glass pipette microinjector (Nanoject II) at Bregma +1.94 mm AP, 0.3 and 1.2 mm ML, -2 mm DV. Ten minutes elapsed before microinjector withdrawal to allow virus to diffuse away from each infusion site. Then, mice were implanted with a 1×4mm GRIN lens (Inscopix) aimed at +1.94 mm AP, 0.6 mm ML, -1.8 mm DV. A subset of mice did not receive viral injections; instead, a lens with the imaging face coated 1 µL of the GCaMP6s virus mixed with 5 percent aqueous silk fibroin solution (Jackman et al., 2018) was implanted at the same coordinate. GCaMP expression and transients were similar in both preparations. Mice were allowed to recover for at least 5 weeks before experiments began.

Behavioral training

Mice were headfixed during training and recording sessions using either a headring (imaging experiments) or headbar (electrophysiology experiments). After initial habituation to head fixation, mice were first trained to lick for 2.5µL rewards of 10% sucrose solution, delivered every 8 - 12 s through a miniature inert liquid valve (Parker 003-0257-900). After 4 - 5 days of lick training, mice experienced their first odor exposure (without reward delivery). Odors were delivered for a total of 1.5 s using a 4-channel olfactometer (Aurora 206A) with 10% odor flow rate and 800 SCCM overall flow rate of medical air. Odors were randomly assigned to sets and cue identities, counterbalanced across mice. Odors were -carvone, -limonene, alpha- pinene, butanol, benzaldehyde, and geranyl acetate (Sigma Aldrich 124931, 218367, 147524, 281549, 418099, 173495, respectively), selected because of they are of neutral valence to naive mice (Devore et al., 2013; Saraiva et al., 2016). Odors were diluted 1:10 in mineral oil and 10 µL was pipetted onto filter paper within the odor delivery vials (Thermo Fisher SS246-0040) prior to each session. Airflow was constant onto the mouse’s nose throughout the session and switched from clean air to scented air for the 1.5 s duration odor delivery on each trial.

On days 1 - 2 of Pavlovian conditioning, mice received 50-75 trials each of 3 odor cues (odor set A), followed by reward on 100% (CS+), 50% (CS50), or 0% (CS-) of trials, 2.5 s following odor onset, with 8 - 12 s between odor presentations. On days 3-4 mice then received training for 2 days with a second odor set (odor set B) with three new odors. For electrophysiology experiments, the odors were subsequently presented in the same sessions in 6 blocks of 51 trials. Odor set order alternated and was counterbalanced across days. For imaging experiments, mice received the third day of odor set A on day 5 and the third day of odor set B on day 6 of conditioning. An additional 3 imaging mice were only trained on one odor set.

Electrophysiological recording and spike sorting

During recording sessions, mice were headfixed. Recordings were made using either Neuropixels 1.0 or Neuropixels 2.0 electrode arrays (Jun et al., 2017; Steinmetz et al., 2021), which have 384 selectable recording sites. Recordings were made with either 1.0 (1 shank, 960 sites, 2.1 (1 shank, 1280 sites) or 2.4 (4 shanks, 5120 sites) probes, depending on the regions of interest. Probes were mounted to a dovetail and affixed to a steel rod held by a micromanipulator (uMP4, Sensapex Inc.). For later electrode track localization within the brain, probes were coated with a fluorescent dye (DiI, ThermoFisher Vybrant V22888) by holding 2 µl in a droplet on the end of a micropipette and painting the probe shank. In each session, one or two probes were advanced through the silicone gel covering the craniotomy over frontal cortex, then advanced to their final position at approximately 3 µm s⁻¹. Electrodes were allowed to settle for around 15 min before starting recording. Recordings were made in internal reference mode using the ‘tip’ reference site, with a 30 kHz sampling rate. Recordings were repeated at different locations on each of multiple subsequent days, performing an additional craniotomy over contralateral frontal cortex. The resulting data were automatically spike sorted with Kilosort2.5 and Kilosort3 (https://github.com/MouseLand/Kilosort). Extracellular voltage traces were preprocessed with common-average referencing by subtracting each channel’s median to remove baseline offsets, then subtracting the median across all channels at each time point to remove common electrical artifacts. Sorted units were curated using automated quality control (International-Brain-Laboratory et al., 2022): exclusions were based on spike floor violations (the estimated proportion of spikes that were missed because they fell below the noise level of the recording, i.e. estimated false negative rate), and refractory period violations (the estimated proportion of spikes arising from the non-primary neuron, i.e. the estimated false positive rate due to contamination, with a 10% cutoff). Quality control accuracy was assessed by manually reviewing a subset of the data using the phy GUI (https://github.com/kwikteam/phy). Because Kilosort2.5 and Kilosort3 use different clustering algorithms that can be advantageous for different types of recordings (stability, region, number of channels), for each session, we used units sorted with either Kilosort2.5 or Kilosort3 depending on which yielded the greatest number of high quality units for that session. Brain regions were only included for subsequent analysis if there were recordings from at least three subjects and a total over 100 neurons in the region. When we analyzed all of motor cortex together, we included ALM and MOs neurons. When we analyzed all of olfactory cortex, we included DP, TTd, AON, and other neurons in PIR, EPd, and OLF. We relabeled PIR and EPd as OLF because there were not enough neurons to analyze them as separate regions.

Imaging and ROI extraction

During imaging sessions, mice were headfixed and positioned under the 2-photon microscope (Bruker Ultima2P Plus) using a 20x air objective (Olympus LCPLN20XIR). A Spectra-Physics InSight X3 tuned to 920nm was used to excite GCaMP6s through the GRIN lens. Synchronization of odor and 10 percent sucrose delivery, lick behavior recordings, and 2-photon recordings was achieved with custom Arduino code. After recording, raw TIF files were imported into suite2p (https://github.com/MouseLand/suite2p). We used their registration, region-of-interest (ROI) extraction, and spike deconvolution algorithms, inputting a decay factor of τ = 1.3 to reflect the dynamics of GCaMP6s, and manually reviewed putative neuron ROIs for appropriate morphology and dynamics. To find changes in activity across the entire imaging plane, found the mean pixel intensity for frames in the time of interest (2 to 2.5 s from CS+), subtracted the mean intensity of each pixel prior to cue onset (-2 to 0 s from all cues), and divided by the standard deviation for each pixel across those frames prior to cue onset.

Histology

Animals were anesthetized with pentobarbital or isoflurane. Mice were perfused intracardially with 0.9% saline followed by 4% paraformaldehyde (PFA).

Electrophysiology

Brains were extracted immediately following perfusion and post-fixed in 4% paraformaldehyde for 24 h. In preparation for light sheet imaging brains were cleared using organic solvents following the 3DISCO protocol (Ertürk et al., 2012) (https://idisco.info/), with some modification. Briefly, on day 1 brains were washed 3X in PBS and dehydrated in a series of increasing MeOH concentrations (20%, 40%, 60%, 80%, 100%, 100%; 1-h each) then incubated overnight for lipid extraction in 66% dichloromethane (DCM) in MeOH. On day 2 brains were washed 2X twice in 100% MeOH for 1-h each, then bleached overnight in 5% H₂O₂ in MeOH at 4^◦C. On day 3 brains were washed 2X in 100% MeOH, then final lipid extraction was accomplished in a series of DCM incubations (3-h in 66% DCM in MeOH, 2X 100% DCM for 15 min each) before immersion in dibenzyl ether (DBE) for refractive index matching. Brains were imaged on a light sheet microscope (LaVIsion Biotec UltraScope II) 2-7 days after clearing. Brains were immersed in DBE in the imaging well secured in the horizontal position, and illuminated by a single light sheet (100% width, 4 µm thick) from the right. Images were collected through the 2X objective at 1X magnification, from the dorsal surface of the brain to the ventral surface in 10 µm steps in 488 nm (autofluorescence, 30% power) and 594 nm (DiI, 2-10% power) excitation channels. The 1000 raw TIF images were compiled into a single multi-image file with 10 µm voxels, then spatially downsampled to 25 µm voxels for transformation to the Allen common-coordinate framework (CCF) volume (Wang et al., 2020b) using the Elastix algorithm (Shamonin et al., 2014). CCF-transformed volumes were used to generate CCF fluorescent probe tract locations (pixel coordinates along the probe tract) using Lasagna (https://github.com/SainsburyWellcomeCentre/lasagna). Probe tract CCF pixel coordinates (origin front, top, left) were transformed to bregma coordinates (origin bregma, x==ML, y==AP and z==DV) in preparation for final integration with electrophysiology recordings using the International Brain Lab electrophysiology GUI (Faulkner M, Ephys Atlas GUI; 2020. https://github.com/int-brain-lab/iblapps/tree/master/atlaselectrophysiology). For recording alignment, sorted spikes and RMS voltage on each channel were displayed spatially in relation to the estimated channel locations in Atlas space from the tracked probe. The recording sites were then aligned to the Atlas by by manually identifying a warping such that recording sites were best fit to the electrophysiological characteristics of the brain regions (e.g. matching location of ventricles or white matter tracts with low firing activity bands). This procedure has been estimated to have 70 µm error (Liu et al., 2021; Steinmetz et al., 2019). Individual neuron locations were determined using the recording channel brain coordinates of each unit’s maximum-amplitude waveform. We additionally assigned MOs neurons to anterolateral motor cortex (ALM) if they were within a 0.75 mm radius of 2.5 mm AP, 1.5 mm ML (Chen et al., 2017).

Calcium imaging

Following perfusion, intact heads were left in PFA for an additional week before brain extraction. Brains were then sliced on a Leica Vibratome (VT1000S) at 70 µm before mounting and nuclear staining via Fluoroshield with DAPI (Sigma-Aldrich F6057-20ML). Slices with GRIN lens tracks were then imaged on a Zeiss Axio Imager M2 Upright Trinocular Phase Contrast Fluorescence Microscope with ApoTome. The resulting images were manually aligned to the to the Allen Brain Atlas to reconstruct the location of each GRIN lens.

Neuron tracking

To identify the same neurons across imaging sessions, we used two approaches. To track neurons across the two odor sets on days 5 and 6, we concatenated the TIF files from each session and extracted ROIs simultaneously. To track neurons across training days 1 - 3 for a single odor set, we manually identified ROIs from the ROI masks outputted by suite2p. We linked the ROIs using a custom Python script that permitted selection of the same ROI across the 3 imaging planes using OpenCV and saved the coordinates on each day. The tracking results across days 1 - 3 from one subject is displayed in Fig. 7B.

Behavioral analysis

For electrophysiology experiments, the subject was illuminated with infrared light (850 nm, CMVision IR30) and eye and face movements were monitored. The right eye was monitored with a camera (FLIR CM3-U3-13Y3M-CS) fitted with a zoom lens (Thorlabs MVL7000) and long-pass filter (Thorlabs FEL0750), recording at 70 fps. Face movements were monitored with another camera (FLIR CM3-U3-13Y3M-CS, zoom lens Thorlabs MVL16M23, long-pass filter Thorlabs FEL0750) directed at a 2 3 cm mirror reflecting the left side of the face, recording at 70 fps. Licks were detected from the face video by thresholding the average intensity of an ROI centered between the lips and the lick spout, calculated for every frame. Interlick intervals were thresholded at 0.083 s for a maximum lick rate of 12 licks s⁻¹. For calcium imaging experiments, eye and face movements were not monitored, and licks were detected with a capacitance sensor (MPR121, Adafruit Industries) connected to an Arduino board. To determine the impact of cues and previous outcomes on anticipatory licking, we fit a linear model on all electrophysiology sessions simultaneously (and for each mouse). We predicted the number of licks 0 to 2.5 s from odor onset using cue identity, outcomes on previous 10 trials, outcomes on previous 10 of that cue type, and total number of presentations of that cue type so far (to account for cue-specific satiety) using ‘fitlm’ in MATLAB. When dividing sessions into ‘early’ and ‘late’, we used the first 60 and last 60 trials of the session. When dividing sessions into thirds for the GLM (‘early’, ‘middle’, ‘late’), we used even splits of trials into thirds.

PSTH creation

Peri-stimulus time histograms (PSTHs) were constructed using 0.1s bins surrounding cue onset.

Electrophysiology

Neuron spike times were first binned into 0.02s bins and smoothed with a half-normal causal filter (σ = 300 ms) across 50 bins. PSTHs were then constructed in 0.1s bins surrounding each cue onset. Each bin of the PSTH was z-scored by subtracting the mean firing rate and dividing the standard deviation across the 0.1s bins in the 2s before all trials. When splitting responses by polarity (above/below baseline, Figs. 2B, 3E-F, 8H, S5B), we used even trials to determine polarity and plotted the mean across odd trials for cross validation.

Calcium imaging

Frames were collected at 30Hz with 2-frame averaging, so the fluorescence for each neuron and the estimated deconvolved spiking were collected at 15Hz. We interpolated the smoothing filter from the electrophysiology analysis (which was calculated at 50Hz) and applied it to the deconvolved spiking traces. We then constructed PSTHs in 0.1s bins surrounding each cue onset and z-scored (same as electrophysiology).

Licks

Licking PSTHS were constructed in 0.1s bins surrounding cue onset. Each trial was then smoothed with a half-normal causal filter (σ = 800 ms). For the GLM, the lick rate was calculated across the whole session by first counting licks in either the 0.02s (electrophysiology) or 15Hz (imaging) bins, smoothed with a half-normal causal filter over 25 bins, and then converted to 0.1s bins relative to each cue.

Kernel regression

To identify coding for cues, licks, and rewards in individual neurons, we fit reduced rank kernel-based linear model (Steinmetz et al., 2019).

Data preparation

The discretized firing rates f_n(t) for each neuron n were calculated as described above for PSTH creation. We used the activity -1 to 6.5 s from each cue onset on every trial for our GLM analysis.

Predictor matrix

The model included predictor kernels for cues (CS+, CS50, and CS- for each odor set, as relevant), licks (individual licks, lick bout start, and lick rate), and reward (initiation of consummatory bout). The cue kernels were supported over the window 0 to 5s relative to stimulus onset. The lick predictor kernels were supported from -0.3 to 0.3 s relative to each lick, from -0.3 to 2 s relative to lick bout start, and lick rate was shifted from -0.4 to 0.6 s in 0.2 s increments from original rate. The reward kernel was supported 0 to 4s relative to first lick following reward delivery. For electrophysiology experiments, the model also included 6 constants that identified the block number, accounting for tonic changes in firing rate across blocks. Because not all cues were present in every block, this strategy prevented the cue kernels from being used to explain baseline changes across blocks. For each kernel to be fit we constructed a Toeplitz predictor matrix of size T × l, in which T is the total number of time bins and l is the number of lags required for the kernel. The predictor matrix contains diagonal stripes starting each time an event occurs and 0 otherwise. The predictor matrices were horizontally concatenated to yield a global prediction matrix P of size T × L containing all predictor kernels. Rate vectors of all N neurons were horizontally concatenated to form F, a T × N matrix.

Reduced-rank regression

To prevent noisy and overfit kernels we implemented reduced-rank regression (Steinmetz et al., 2019), which allows regularized estimation by factorizing the kernel matrix K into the product of a L × r matrix B and a r × N matrix W, minimizing the total error: E = ||F PBW||². The T × r matrix PB consists of a set of ordered temporal basis functions that can be linearly combined to estimate the neuron’s firing rate over the whole training set and which results in the best possible prediction from any rank r matrix. To estimate each neuron’s kernel functions we generated the reduced rank predictor matrix PB for r = 20, estimated the weights w_n to minimize the squared error E_n = |f_n − PBw_n|² with elastic net regularization (using the MATLAB function ‘lassoglm’) with parameters α = 0.5 and λ = [0.001, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5], using 4-fold cross-validation to determine the optimal value for λ for each neuron. The kernel functions for neuron n were then unpacked from the L-length vector obtained by multiplying the first r = 20 columns of B by w_n (i.e. k_n = B_1:L,1:20W_n).

Predictor unique contributions

To assess the importance of each group of kernels for predicting a neuron’s activity we first fit the activity of each neuron using the full reduced-rank regression procedure, then fit a reduced model (with 4-fold cross-validation), holding out the kernels of the predictor to be tested (cues, licks, or rewards). If the difference in variance explained between the full and held-out model was > 2%, and the total variance explained by the full model was > 2%, the neuron was deemed selective for those predictors (Steinmetz et al., 2019). We validated this cutoff by comparing our results when adjusting the cutoff from 0.5%to5% (Fig. S4). The pattern of results was similar regardless of the cutoff. When we refit the reduced ranks to neural activity with the onset time of each trial shuffled, the 2% cutoff was the smallest that allowed no false positive identification of any neurons uniquely coding any variable (Fig. S4B). Using a higher cutoff led to mislabeling neurons with clear cue responses as non-coding (Fig. S4E).

Cue coding models

To assess cue coding schemes, we fit a new set of models focusing on a more restricted time window (-1 to 2.5 s from cue onset) using only cues and licks as predictors. Cue and lick neurons were identified as before, and subsequent cue characterization was performed on neurons with only a unique contribution of cues. To identify value coding among cue neurons, we fit a new kernel model with a single cue kernel that scaled according to cue as well as 6 block constants (as above) with full rank. We inputted cue value as 1, 0.5, and 0 for each CS+, CS50, and CS-, respectively, ranked according to their reward probability. We fit 152 additional models with alternative configurations of cue value: all permutations of 1, 1, 0.5, 0.5, 0, 0 across the six cues, as well as all permutations of high responses (1) for 6 (we call this the ‘untuned’ model), 5, 4, 3, 2, or 1 cues, with other cues set to 0. Among the 153 total models, some were more similar to the ranked value model, which we quantified by correlating the 6 cue values of each alternative model with the ranked model. We termed all models with > 0.5 correlation as ‘value-like.’ For each neuron, we found the model that best explained its activity. The models, their correlation with value, and the proportion of cue neurons best fit by each model are illustrated in Fig. S7. To verify the robustness of value coding in the neurons best fit by the ranked value model, we fit each of those neurons with 1000 iterations of the cue value model with shuffled cue order to create a null distribution. The fits of the original value model exceeded the 98th percentile of the null for all value neurons.

History model

For a more nuanced estimation of the value of the cue on each trial, we generated per trial value predictions using the lick linear model (described in section ‘Behavioral analysis’) with cue type, 10 previous outcomes, and 10 previous cue outcomes as predictors. These values were used to scale the height of cue kernel on each trial and were, on average, 0.05, 0.35, and 0.5 for CS-, CS50, and CS+, respectively, but varied on each trial according to the specific reward history. We compared the performance of this model to all the other cue coding models for value and value-like neurons to find neurons better explained by the history model. For neurons better fit by the history model, we also fit 1000 additional models with shuffled trial values within each cue (disrupting the trial history effect). Neurons exceeding the 95th percentile of this null distribution were deemed history neurons.

Coding stability

In the calcium imaging experiments, we used a number of approaches to assess the stability of neural coding. First, for neurons tracked across the first 3 sessions of odor set A, we took the trial-averaged activity of a given neuron for CS+, CS50, and CS- trials in one session and correlated it with the same neuron’s trial-averaged activity from the subsequent session. We quantified the strength of the correlation as its percentile among correlations between that neuron in the first session and every other neuron on the subsequent session and compared this value to shuffle control (neuron identity shuffled) (Figs. 7E and S12A). To assess coding stability of these neurons, we used the kernels resulting from fitting the full model on day 3 and the predictors from each session third to predict neural activity at those time points. We assessed the accuracy of the prediction by correlating it with the true activity versus the correlation with trial-shuffled data and present this data in original form (Fig. S12B) and normalize to model performance when predicting the (cross-validated) training data (Fig. S7G). This shuffle maintains the temporal dynamics of each trial but shuffles the link between predictors on a given trial and the neural activity for that trial; correlation of predictors (like licks) across trials preserves some prediction accuracy even with this shuffle. We also trained models with data from the third day of odor set A training (A3, day 5) and tested on training days A3 and B3 (days 5 and 6). To determine how unique variable contributions (cues, licks, rewards) evolved over time, we fit our kernel regression model independently to each session third (early, middle, late) of session 1-3 for neurons tracked across the three odor set A sessions (Fig. 7H). To assess value coding across the third sessions of odor set A and B (A3 and B3, days 5 and 6) we fit the 153 cue coding models described in Cue coding models to the neurons imaged on separate days (Fig. 8E), concatenating the data from each odor set and adding a constant for each day to account for day differences found found the model with the best fit for each neurons. We also looked at the stability of value-like signals across the three days of odor set A training by identifying CS+-preferring cue cells using half of trials in session A3 and plotting the activity of those neurons for the remaining A3 trials and all trials from A1 and A2 (Fig. 8H).

Principal component analysis

To visualize the dominant firing pattern of PL neurons (Fig. S2), and of value and value-like cells (Fig. 4), irrespective of direction (excitation or inhibition), we performed principal component analysis (‘PCA’ in MATLAB) on the concatenated PSTHs across all 6 cues for the neurons of interest, with each neuron’s activity normalized by peak modulation so that each neuron’s concatenated PSTH peaked at -1 or 1. We then plotted the score of the first components.

Decoding cue identity

We adapted the approach in (Ottenheimer et al., 2018) to use single units as well as random selections of pseudoensembles to decode cue identity (out of 6 options) (Fig. 3G). First, we binned the activity of each neuron into 0.25 s bins spanning -0.5 to 2.5 s from the onset of every cue in the session. These bins were labeled as 1-6 corresponding to the 6 different cues. For all decoding, we randomly selected 50 trials of each cue to use (most sessions had 51 of each cue, but a few had only 50). For single unit decoding of cue identity, we used five-fold cross-validation to train a linear discriminant model (’fitcdiscr’ in MATLAB) on 80% of the data and tested correct classification of the 6 cues on the remaining 20%. We plotted the mean +/− SEM performance over time for value, value-like, and untuned neurons, and compared their performance using an ANOVA with fixed effects of neuron type and time point and random effect of session, making pairwise comparisons with Bonferroni correction. For population decoding, we pooled the activity between 1 and 2.5 s from cue onset (a period with stable decoding in the single unit analysis) and randomly selected groups of 1, 5, 10, 25, 50, 75, 100, or 200 value, value-like, or untuned neurons from all regions. We used the same linear discriminant model (with regularization γ = 1 in ’fitcdiscr’) and five-fold cross-validation. We performed 1000 selections of neurons at each level, plotted the mean and standard deviation of classification accuracy across these iterations, and made pairwise comparisons across groups by calculating the number of iterations where the second group was greater or equal to the first; we repeated this one-way test for both directions of all pairs of groups and used a Bonferroni corrected α. Pattern of results was unchanged when population activity was standardized with PCA. Pattern of results was also unchanged when we trained on one odor set and tested on the other.

Decoding cue value

Data were prepared as for population decoding of cue identity, but with cues labeled as 0, 0.5, or 1 for CS-, CS50, and CS+ trials, respectively. Instead of a linear discriminant model, we used a linear model (elastic net, α = 0.5) to regress cue value onto the activity of pseudoensembles of neurons. To balance our model, we used 50 of each cue type for training and tested on 50 held out trials for a cue never seen by the model; this setup thoroughly tested the idea that value is encoded on a linear scale and thus should be able to generalize to a new cue in same value domain. For example to predict the value of 50 CS+ in odor set B trials, we used for training 50 trials of CS+A, 0 trials of CS+B, 25 trials of CS50A, 25 trials of CS50B, 25 trials of CS-A, and 25 trials of CS-B, maximizing coverage of the data while maintaining a balanced design. These models produced predicted value for each cue. We plot the predicted value for CS+ and CS- cues on the left in Fig. 3H. To convert these predictions to an accuracy score, we labeled values from -0.25 to 0.25 as CS-, 0.25 to 0.75 as CS50, and 0.75 to 1.25 as CS+ (values outside this range were automatically labeled incorrect). We performed this analysis on random groups of 1, 5, 10, 25, 50, 75, 100, or 200 value, value-like, or untuned neurons (Fig. 3H), as well as random groups of 5 neurons (with replacement) from each region and 25 neurons (with replacement) from each region group (Fig. 4E). We compared region decoding to decoding using a baseline window of -0.5 to 0 s from odor onset using neurons from each region. We performed 1000 selections of neurons at each level, plotted the mean and standard deviation of classification accuracy across these iterations, and made pairwise comparisons as for cue identity.

Cue coding dimension

To project population activity onto the coding dimensions separating CS- activity from CS+ and CS50 activity, respectively, we adapted an approach from Li et al. (2016). We first max normalized the odor set A PSTH activity of each neuron to prevent neurons with particularly large z-score values from dominating the dimension. We then defined coding dimensions from the even trials of odor set A. To find the ‘consensus’ cue-difference coding dimension across the group defined by each neuron’s maximal difference across cue responses, we found the 0.5s bin in the range 0 to 2.5s from cue onset with the peak difference between CS- and CS+ activity or CS- and CS50 activity, for each neuron. This comprised a difference vector of length N defining the maximum cue difference coding across the neuron group. This difference vector was then multiplied by the original z-score values of each neuron’s peak difference bin to find the values of peak CS+ vs CS- coding; these values were used to transform the data onto a 0 (CS-) to 1 (CS+) relative cue coding scale. To transform population activity onto the CS- to CS+ dimension at each moment in CS+, CS50, and CS- trials, we multiplied the activity of all neurons in each 0.1s bin of remaining odd odor set A trials (z-score) by the difference vector and used the same 0 to 1 scale conversion (‘same odor set’). We also multiplied the activity of neurons for cues in odor set B by the difference vector (‘other odor set’). We repeated the same process for CS- and CS50 activity. To find the angle between the CS+ and CS50 projections, we bootstrapped the vectors that connected baseline activity to peak activity of CS50 and CS+ along the CS- / CS+ and CS- / CS50 axes and found the angle between these vectors. To find population activity along the CS+ / CS- dimension at each moment for CS50 trials of various values, we multiplied the activity (z-score) of all neurons in each 0.1s bin of the CS50 PSTHs (grouped by value estimated from the lick linear model) by the difference vector and used the same conversion to 0 to 1 scale. To estimate the distribution of values along the CS+ / CS- dimension for each CS50 value condition, we bootstrapped (5000 iterations, with replacement) the population projection and took the mean 1 - 2.5 s from odor onset. We calculated the slope of the activity on CS50 trials by linearly regressing the estimated position of the population on the CS+ / CS- dimension against the value from the lick linear model used to group those trials (5000 iterations, with replacement). To compare slopes across cell groups, we generated a p-value by calculating the number of iterations where the second group was greater or equal to the first; we repeated this one-way test for both directions of all pairs of groups and used a Bonferroni corrected α.

Statistics

All statistical tests were performed in MATLAB (MathWorks). To compare the fraction of neurons of a specific coding type across regions, we fit a generalized linear mixed-effects model (‘fitglme’ in MATLAB) with logit link function and with fixed effects of intercept and region and a random effect of session and then found the estimated mean and 95% confidence interval for each region. For pairwise comparisons across regions, we used a specific contrast for each pair of regions (’coefTest’ in MATLAB) to find the p-value that these regions differed from each other and used a Bonferroni-corrected alpha for significance. To compare the number of anticipatory licks on different trial types, we found the mean number of anticipatory licks for each cue in each session and then performed a two-way ANOVA with effects of cue and subject and session as our n (Fig. 1C). To compare variance explained during each third of the first session, we found the mean value across neurons from each mouse and then performed a one-way ANOVA on those means with mouse as our n (Fig. 6H). To compare day 3 model performance on true and shuffled data across each time point (Fig. 7F), we found the mean value across neurons from each mouse at each time point and then performed a two-way ANOVA with main effects of shuffle and time point, with mouse as our n. We then calculated pairwise statistics using ‘multcompare’ in MATLAB with Bonferroni correction. To compare cue, lick, and reward unique variance at each time point for each cell category (determined on day 3, Fig. 7G), we found the mean from the cells in that category in each mouse at each time point and performed a two-way ANOVA with main effects of variable and day, with mouse as our n. We then calculated pairwise statistics using ‘multcompare’ in MATLAB with Bonferroni correction.

Significance of findings

Strength of evidence

Abstract

Summary

Introduction

Results

Distributed neural activity during an olfactory Pavlovian conditioning task

Electrophysiology and calcium imaging during olfactory Pavlovian conditioning.

Graded cue and lick coding across the recorded regions

Graded cue and lick coding across the recorded regions.

Cue value coding is present in all regions

Robust value encoding and decoding among cue cells.

Widespread cue value coding.

A subset of cue cells incorporate reward history.

Cue coding emerges along with behavioral learning

Acquisition of conditioned behavior and cue encoding in PFC.

Cue and lick coding is stable across days

Cue and lick coding is stable across days.

Cue and lick coding in separately trained odor sets.

Discussion

Graded cue and lick coding across regions

Widespread value signaling

Stability of PFC codes

Supporting information

Acknowledgements

Author contributions

Declaration of interests

Data and code availability

Methods

Subjects

Surgical procedures

Electrophysiology

Calcium imaging

Behavioral training

Electrophysiological recording and spike sorting

Imaging and ROI extraction

Histology

Electrophysiology

Calcium imaging

Neuron tracking

Behavioral analysis

PSTH creation

Electrophysiology

Calcium imaging

Licks

Kernel regression

Data preparation

Predictor matrix

Reduced-rank regression

Predictor unique contributions

Cue coding models

History model

Coding stability

Principal component analysis

Decoding cue identity

Decoding cue value

Cue coding dimension

Statistics

Anticipatory licking during the electrophysiology sessions.

Similar neural activity in prelimbic area using electrophysiology and calcium imaging.

Task-related neural activity across brain regions.

Validation of variance cutoff for variable coding.

Identification of cue and lick cells with GLM.

Comparing proportions of cue and lick neurons across regions.

Schematic of value model shuffles.

Additional analysis of value coding schemes.

Relative proportions of value and value-like cells across regions.

Value coding as a proportion of cue cells.

Comparing PFC and striatum.

Correlation across days in PL.

References

Article and author information

Author information

David J. Ottenheimer5

Madelyn M. Hjort5

Anna J. Bowen5

Nicholas A. Steinmetz

Garret D. Stuber

Version history

Cite all versions

David J. Ottenheimer

Madelyn M. Hjort

Anna J. Bowen