1. Neuroscience
Download icon

Opposite initialization to novel cues in dopamine signaling in ventral and posterior striatum in mice

Research Article
Cited
2
Views
2,149
Comments
0
Cite as: eLife 2017;6:e21886 doi: 10.7554/eLife.21886

Abstract

Dopamine neurons are thought to encode novelty in addition to reward prediction error (the discrepancy between actual and predicted values). In this study, we compared dopamine activity across the striatum using fiber fluorometry in mice. During classical conditioning, we observed opposite dynamics in dopamine axon signals in the ventral striatum (‘VS dopamine’) and the posterior tail of the striatum (‘TS dopamine’). TS dopamine showed strong excitation to novel cues, whereas VS dopamine showed no responses to novel cues until they had been paired with a reward. TS dopamine cue responses decreased over time, depending on what the cue predicted. Additionally, TS dopamine showed excitation to several types of stimuli including rewarding, aversive, and neutral stimuli whereas VS dopamine showed excitation only to reward or reward-predicting cues. Together, these results demonstrate that dopamine novelty signals are localized in TS along with general salience signals, while VS dopamine reliably encodes reward prediction error.

https://doi.org/10.7554/eLife.21886.001

eLife digest

New experiences trigger a variety of responses in animals. We are surprised by, move towards, and often explore new objects. But how does the brain control these responses?

Dopamine is a molecule that controls many processes in the brain and plays critical roles in various mental disorders, diseases that affect movement, and addiction. Rewarding experiences (like a glass of cold water on a hot day) can trigger dopamine neurons and studies have also shown that dopamine neurons respond to new experiences. This suggested that novelty may be rewarding in itself, or that novelty may signal the potential for future reward. On the other hand, it may be that different groups of dopamine neurons play different roles in responding to new or rewarding experiences.

In 2015, it was reported that dopamine neurons connected to the rear part of an area in the brain called the striatum receive signals from different parts of the brain than most other dopamine neurons. The dopamine neurons connected to this “tail” of the striatum preferentially received inputs from regions involved in arousal rather than reward, suggesting that they may have a unique role and transmit a different type of information.

Now, Menegas et al. have shown that dopamine signals in different areas of the striatum separate reward from novelty and other signals in mice. The results demonstrate that new odors activate dopamine neurons projecting to the tail of the striatum, but that this activity fades as the novelty wears off (as the mice learn to associate the odor with a particular outcome). By contrast, dopamine neurons projecting to the front of the striatum do not respond to novelty, but rather become more active as mice learn which odors accompany rewards (only responding to odors that predict reward). The experiments also show that dopamine neurons in the tail of the striatum encode information about the importance of a stimulus.

Together, these findings indicate that some of the roles dopamine plays in the brain may not be related to reward, but are instead linked to the novelty and importance of the stimulus. The next challenge will be to find out how the separate reward and novelty signals in dopamine neurons relate to the animals’ behavior. This may help us to better understand dopamine-related psychiatric conditions, such as depression and addiction.

https://doi.org/10.7554/eLife.21886.002

Introduction

Animals respond to new stimuli in a characteristic way across species, historically characterized as an ‘orienting reflex’ or a ‘what is it reflex’ (Pavlov and Anrep, 1927; Sechenov, 1935; Sokolov, 1963). Detection of novel stimuli is advantageous for survival because novel stimuli can signal potential rewards or potential threats. Orienting towards a novel stimulus and understanding it through exploration can allow future exploitation of potential rewards. In addition to behavioral advantages, novelty detection is fundamental for computation in our brain. For example, novelty detectors, or ‘novelty filters’ (Kohonen and Oja, 1976; Marsland et al., 2002), can reduce the amount of total information so that we can focus on unexpected perceptions as inputs to pay attention to and to learn from. Indeed, behavioral studies have repeatedly shown that both humans and other animals have enhanced memory for novel items (Kishiyama et al., 2009; Restorff, 1933).

Physiologically, it is widely accepted that novelty responses are distributed over a network of many brain areas (Courchesne et al., 1975; Kishiyama et al., 2009; Knight, 1996). Among these, single unit recordings have shown that dopamine neurons in the midbrain increase their firing in response to the presentation of a novel stimulus in several species and behavioral paradigms (Horvitz et al., 1997; Ljungberg et al., 1992; Schultz, 2015; Steinfels et al., 1983).

As animals experience the repeated association of a stimulus and a reward, they learn to expect the reward when the stimulus is presented (Pavlov and Anrep, 1927). Dopamine neurons are thought to be the neural substrate underlying this type of learning because they signal reward prediction error: the difference between actual and expected reward values (Bayer and Glimcher, 2005; Bromberg-Martin et al., 2010; Clark et al., 2012; Cohen et al., 2012; Schultz et al., 1997). These neurons are thought to guide decision-making by broadcasting this information to many regions of the forebrain and reinforcing behaviors that lead to reward (Barto et al., 1999; Dayan and Niv, 2008; Haber, 2014; Montague et al., 2004; Steinberg et al., 2013).

Novelty responses in dopamine neurons (Horvitz et al., 1997; Schultz, 2015) were initially puzzling because animals cannot know whether a novel stimulus will reliably predict an outcome with a positive or negative value. One hypothesis was that dopamine neurons take an optimistic approach toward novel stimuli, assuming that they will predict a valuable outcome until proven wrong (Hazy et al., 2010; Kakade and Dayan, 2002). This ‘optimistic initialization’ in dopamine neurons may have advantages. For example, the novelty responses in dopamine neurons may induce orienting behaviors towards novel stimuli, similar to dopamine responses to reward or reward-predicting cues that induce orienting behaviors (Hazy et al., 2010; Kakade and Dayan, 2002). Further, dopamine novelty responses may allow computational exploration (Dayan and Sejnowski, 1996), or storage of the novel stimulus in working memory (Braver and Cohen, 1999), so that animals have a better chance to associate novel stimuli to potential rewards (Hazy et al., 2010; Kakade and Dayan, 2002). However, these hypotheses do not necessarily fit with conflicting observations of animals’ behavioral responses to novel stimuli (Gershman and Niv, 2015). Indeed, depending on the experimental context, animals sometimes approach and sometimes avoid novel options compared to familiar ones (Gershman and Niv, 2015).

One explanation for why some dopamine neurons respond to novel stimuli could be that some subpopulations of dopamine neurons are not strictly related to reward prediction error coding. Recent studies have shown that there is substantial diversity among dopamine neurons at the molecular level (Grimm et al., 2004; Lacey et al., 1989; Lammel et al., 2008; Roeper, 2013) as well as in their activity (Brischoux et al., 2009; Bromberg-Martin et al., 2010). For example, single unit recordings in monkeys showed that some dopamine neurons are inhibited by aversive outcomes and others are excited by them (Matsumoto and Hikosaka, 2009). This suggests that there are distinct types of dopamine neurons and that some do not encode pure value. Instead, the data suggest that some dopamine neurons encode value and others might encode ‘motivational salience’ (the absolute value of ‘value’).

Recent anatomical studies have revealed that dopamine neurons with different projection targets are embedded in separate circuits. The entire set of inputs to dopaminergic nuclei includes a large number of regions (Geisler et al., 2007; Geisler and Zahm, 2005). Neural circuit tracing using a modified rabies virus (Wickersham et al., 2007) enabled us to specifically label the monosynaptic inputs onto dopamine neurons, revealing that the ventral tegmental area (VTA) and the substantia nigra compacta (SNc) dopamine neurons receive slightly different inputs (Watabe-Uchida et al., 2012). More recent studies have shown that dopamine neurons with different projection targets receive different inputs (Beier et al., 2015; Lerner et al., 2015; Menegas et al., 2015). Specifically, we found that dopamine neurons projecting to the posterior ‘tail’ of the striatum (TS) have unique inputs compared to dopamine neurons projecting to many other brain regions, including the ventral striatum (VS), suggesting that these neurons could have a distinct function (Menegas et al., 2015).

Based on our previous anatomical findings, in this study, we compared the dopamine axon activity in VS and TS while mice learned new odor-outcome associations (we will call the bulk calcium signal that we observed from the axons of DAT+ midbrain dopamine neurons in the striatum ‘VS dopamine’ and ‘TS dopamine’ in the following sections). Our results revealed opposite dynamics for learning new cue-outcome associations in VS dopamine and TS dopamine. We observed a large response to novel cues in TS dopamine which subsequently decreased over the course of associative learning. On the other hand, we saw no response to novel cues in VS dopamine. Instead, VS dopamine gradually developed responses to reward-predicting cues during learning. These findings revealed that dopamine novelty coding is localized to the posterior part of the striatum, while VS dopamine faithfully encodes reward prediction error. Thus, novelty responses in dopamine may be better formalized separately from the reward prediction error (RPE) framework, rather than being included in the RPE framework.

Results

Recording activity from dopamine axons in the striatum

We used optical fiber fluorometry (fiber photometry) (Kudo et al., 1992) to record bulk calcium signals from the axons of midbrain dopamine neurons projecting to several regions of the striatum (Kim et al., 2016; Parker et al., 2016). We chose four regions: the ventral striatum (VS), dorsomedial striatum (DMS), dorsolateral striatum (DLS), and the posterior tail of the striatum (TS) (Figure 1). To measure activity from dopamine axons in these regions, we infected midbrain dopamine neurons with a genetically encoded calcium indicator, GCaMP6m (Akerboom et al., 2012; Chen et al., 2013). To target dopamine neurons specifically, we injected a cre-dependent virus (AAV-flex-GCaMP6m) into both the VTA and SNc of transgenic mice expressing Cre recombinase under the control of dopamine transporter (DAT-cre mice) (Bäckman et al., 2006) crossed with reporter mice expressing red fluorescent protein (tdTomato) (Jackson Lab).

Figure 1 with 4 supplements see all
Recording dopamine activity across the striatum using fiber fluorometry.

(A) Schematic of GCaMP virus injection and optic fiber implantation sites. Detailed schematic of recording setup is shown in Figure 1—figure supplement 1. Sample raw data are shown in Figure 1—figure supplement 2. (B) Distribution of optic fibers (sagittal max-projection) used for recording labeled red (VS), orange (DMS), blue (DLS), and purple (TS) with dotted lines denoting ½ mm increments. Coronal sections are shown in Figure 1—figure supplement 3. (C) Schematic of the basic trial structure. An odor cue (CS) (1 s duration) is followed by an outcome (US) or no outcome after 1 s delay, followed by a random inter-trial interval (ITI) of 6–12 s. At a low frequency, unexpected outcomes are also delivered. (D) Licking in response to odors predicting reward (blue) or nothing (grey). Odor onset is t = 0 and water delivery time is t = 2, so anticipatory licking occurs between t = 0 and t = 2 (quantified on the right). (E) An example of GCaMP virus infection. Green indicates AAV-flex-GCaMP6m infection (top), red indicates genetically encoded tdTomato in DAT-cre-expressing neurons (middle), and the bottom panel is an overlay of the two signals. Labeled axons in the striatum are shown in Figure 1—figure supplement 4. (F) Example single trial responses to unpredicted water from GCaMP (top) and tdTomato (middle) from a single session in a mouse with a fiber implanted in VS. The average GCaMP signal across trials in that session are plotted in the bottom panel.

https://doi.org/10.7554/eLife.21886.003

We chronically implanted optic fibers into the striatum of these mice to deliver 473 nm and 561 nm light and collect GCaMP and tdTomato signals (Chen et al., 2012; Gunaydin et al., 2014; Kim et al., 2016) (Figure 1A, Figure 1—figure supplement 1). For these experiments, we continuously excited GCaMP with 473 nm light and continuously recorded GCaMP emission (Figure 1—figure supplement 2). We recorded from 65 fibers total, targeted to either VS (n = 25), DMS (n = 8), DLS (n = 8), or TS (n = 24) (Figure 1B, Figure 1—figure supplement 3). In fixed tissue, we observed fluorescence from GCaMP6m+ axons primarily in the striatum of these mice (Figure 1—figure supplement 4).

Mice were presented with odors paired with water delivery or no outcome (Figure 1C). In some experiments, odors were paired with an aversive air puff or a mild tone. Infrequently, mice also received unpredicted water, air puff or tone (≤10% of trials). After training, mice licked with an increased frequency in response to the reward-predicting odor (anticipating the reward), but not in response to odors that predicted no outcome (Figure 1D), indicating that mice had learned an association between an odor and reward. We observed large responses to unpredicted reward in GCaMP, but not tdTomato, signals (example traces shown in Figure 1E–F and Figure 1—figure supplement 1B) and recorded from the same fibers over the course of several weeks with relative stability (Figure 1—figure supplement 1C).

We identified the fiber implant sites by clearing brains using CLARITY (Chung and Deisseroth, 2013), imaging them as intact volumes using a light sheet microscope (Tomer et al., 2014), and aligning them to a single reference space (Menegas et al., 2015). We categorized the location of fibers in the dorsal striatum into the DMS, DLS, or TS based on their medial-lateral and anterior-posterior positions (see Materials and methods). VS fibers were spread throughout the core and lateral shell of the ventral striatum (Figure 1—figure supplement 3). TS fibers were located near the posterior end of the dorsal striatum (Figure 1—figure supplement 3). We will focus on VS and TS dopamine, because VS and TS dopamine displayed the most contrasting input patterns in our previous anatomical study (Menegas et al., 2015).

Excitation to novel cues in TS dopamine

In order to examine dopamine activity in VS and TS during associative learning, we recorded both during the initial learning of new odor-outcome associations (first time association, Figure 2 and Figure 3) and also during repeated learning where animals experienced new associations every day (Figure 4).

Figure 2 with 4 supplements see all
Responses to novel odors in VS and TS dopamine.

Comparison of VS and TS responses to novel odors in naïve animals. (A) Average response to the first presentation of a novel odor in VS dopamine, with SE bars. (B) Average responses over the course of the first 30 trials are shown in bins of 5 trials. (C) A heat map of responses to a novel odor over the course of a single session (each row is one trial) with yellow indicating an increase in signal and cyan indicating a decrease in signal. (DF) TS dopamine responses to novel odors, plotted as in AC. (G) Comparison of first-trial water responses in VS and TS (left) and first-trial responses to novel odors (right). See Materials and methods. (H) Time course of responses to a novel odor in VS (circles) and TS (squares) over the course of 30 trials in bins of 5 trials. This data was analyzed based on odor decay rates to show that there was no large effect of odor decay in Figure 2—figure supplement 1. Motion artifacts were examined in Figure 2—figure supplement 3. GCaMP signal decay was measured in Figure 2—figure supplement 2. Finally, response latencies are shown in Figure 2—figure supplement 4.

https://doi.org/10.7554/eLife.21886.008
Figure 3 with 1 supplement see all
Opposite dynamics of VS and TS dopamine during initial learning of new odor-reward associations.

Learning dynamics for VS dopamine (A) and TS dopamine (B) over the course of 3 weeks of training, as naïve animals learn an association between an odor and reward. Odor onset (CS) and water delivery time (US) are shown as dotted lines. Responses are compared on day 1, day 7, day 14, and day 21. The average traces are plotted in blue (predicted reward) and black (unpredicted reward), with the standard error of the mean (SEM). Individual animals’ responses can be found in Figure 3—figure supplement 1. (C) Average licking in response to reward-predicting odor (blue) compared to average licking in response to unexpected reward (black). (D) A quantification of the CS and US responses in VS from the above traces, over training compared to responses to unexpected water (black). (E) A quantification of the CS and US responses in TS from the above traces, over training, compared to responses to unexpected water (black). (F) The average number of anticipatory licks in the period between odor presentation and water delivery, compared over days of training.

https://doi.org/10.7554/eLife.21886.013
Figure 4 with 3 supplements see all
Opposite dynamics of VS and TS dopamine during repeated learning of new odor-reward associations.

Responses to new cues predicting nothing (AC) or water (DF) in VS. Responses to new cues predicting nothing (GI) or water (JL) in TS. Responses to aversive air puffs are found in Figure 4—figure supplement 1 and quantified in Figure 4—figure supplement 2. In the panels on the left, trials are color-coded such that red indicates the first trial and blue indicates the last trials of the session. Trials were quantified in bins of 5 trials. The middle panels show the average CS (magenta) and US (green) responses over the course of a session, again quantified in bins of 5 trials. The panels on the right are heat maps, where every line is a single trial. In these heat maps, yellow indicates an increase in signal and cyan indicates a decrease. Odor discrimination latency is quantified in Figure 4—figure supplement 3.

https://doi.org/10.7554/eLife.21886.015

For initial training, mice were first habituated in a recording set-up with head-fixed preparation for 2–3 days (see Materials and methods). After this initial habituation, animals were presented with four randomly interleaved trial types: (1) unpredicted water, (2) odor predicting water followed by water delivery, (3) odor predicting water followed by no outcome (omission), or (4) odor predicting no outcome.

We first compared VS dopamine and TS dopamine responses to an odor predicting no outcome over the course of the session (Figure 2). At the time of novel odor presentation, TS dopamine was excited very strongly by new odors (Figure 2D). By contrast, VS dopamine did not respond to novel odors – not even on the very first trial (Figure 2A). We examined 13 animals with fibers implanted into VS and 12 animals with fibers implanted into TS. TS dopamine showed significant excitation above baseline following the presentation of novel odors (p = 1.76x10−6, t-test, n = 12 animals, Figure 2G), whereas responses to novel odors in VS dopamine were not significantly different from baseline (p = 0.354, t-test, n = 13 animals, Figure 2G).

The responses to novel odors in TS dopamine decreased significantly over the course of the first 30 odor presentations (p = 5.10x10−11, repeated measures ANOVA, n = 12 animals, Figure 2E–F). The responses after five odor representations (6–10 trials) were significantly smaller than the responses in the first five trials (p = 1.45x10−4, paired t-test, n = 12 animals). To determine whether this decrease in signal was caused by a decrease in effective odor concentration within sessions, we compared responses to fast-decaying and slow-decaying odors (Figure 2—figure supplement 1). TS dopamine responses to novel odors, both fast-decaying and slow-decaying odors, decreased over the course of a session (Figure 2—figure supplement 1F), whereas TS dopamine responses to familiar odors did not change (Figure 2—figure supplement 2C). In VS, no responses were seen over the course of the session (Figure 2B–C). Both VS dopamine and TS dopamine showed excitation to water (responses to the first presentation of unpredicted water are shown in Figure 2G).

To determine whether the signals we observed could have been caused by a movement-related artifact, we used video analysis to quantify the total body movements of mice performing the task (Figure 2—figure supplement 3). We found that mice did not show major body movements in response to odors predicting no outcome (Figure 2—figure supplement 3B) or novel odors (Figure 2—figure supplement 3C), although they performed a stereotypical approach behavior in response to odors predicting reward (Figure 2—figure supplement 3A). To determine whether changes in signal intensity observed within a session were likely to have been related to bleaching, we compared responses to free water and familiar odors over the course of sessions (Figure 2—figure supplement 2). We found that both VS dopamine responses (Figure 2—figure supplement 2A) and TS dopamine responses (Figure 2—figure supplement 2B) to free water remained constant within sessions. Similarly, TS dopamine responses to familiar odors predicting no reward remained constant as well (Figure 2—figure supplement 2C).

Opposite initialization to novel cues in VS and TS dopamine

On the first day of odor-outcome association learning, TS dopamine was strongly excited by new odors (Figure 3, Figure 3—figure supplement 1). The responses to novel odors that predicted water were significantly larger than responses to unpredicted water itself (p = 1.36x10−4, paired t-test, n = 19 animals, Figure 3B). Dopamine responses to odors gradually decreased over 21 days (p = 1.095x10−8, n = 19 animals, repeated measures ANOVA). The responses to water-predicting odors in TS dopamine were significantly smaller on the seventh day of training than on the first day (p = 1.93x10−5, n = 19 animals, day 1 vs day 7, Figure 3B). On the other hand, responses to predicted water did not change significantly (p = 0.641 paired t-test, n = 19 animals, day 1 vs day 7, Figure 3B).

By contrast, VS dopamine did not respond to novel odors predicting water (Figure 3, Figure 3—figure supplement 1). Instead, we observed an initially large excitation in response to water itself (Figure 3A). Over the course of the first 7 days, responses to predicted water (US responses) significantly decreased (p = 1.96x10−4, paired t-test, n = 10 animals, Figure 3A). Notably, this was independent of any CS response developing. VS dopamine did not display significant responses to odor cues that predicted reward on the first day (p = 0.099, t-test, n = 10 animals, CS responses compared to baseline on day 1) or on day 7 (p = 0.054, t-test, n = 10 animals, CS responses compared to baseline on day 7) (Figure 3D). In fact, responses to reward-predicting cues appeared only after 2 weeks of training (Figure 3A). Of note, responses to unpredicted water remained constant over the course of learning (Figure 3A–B). Anticipatory licking gradually increased in frequency over the course of learning (p = 0.0052, repeated measures ANOVA, n = 10 animals, Figure 3C).

We tested whether repeated training affected the observed pattern for novel cues and reward signaling in VS and TS dopamine. We trained nine mice with VS fiber implants and 11 mice with TS fiber implants by introducing a new odor paired with a reward or no outcome every day for a week, and then measured dopamine activity while learning new odor-water or odor-nothing associations (Figure 4). We found that repeated training with odor-reward associations did not change responses to new odors in VS dopamine or TS dopamine. VS dopamine did not respond to new odors (p = 0.8749, t-test, n = 9 animals, trial one or p = 0.322, t-test, n = 9 animals, trial 1–5 vs baseline, Figure 4B–E) and TS dopamine strongly responded to new odors (p = 0.0059, t-test, n = 11 animals, trial one or p = 0.0027, t-test, n = 11 animals, trial 1–5 vs baseline, Figure 4H–K). Indeed, TS dopamine showed excitation to 91% of new odor presentations (response in trial one vs baseline). Dopamine axon signals in mice repeatedly trained on learning odor-outcome contingencies displayed the same trends in the dynamics, but at a much faster rate: within a single session (Figure 4) rather than over the course of weeks (Figure 3). VS dopamine showed a decrease in US response followed by an increase in CS response, with no response to novel stimuli (Figure 4A–F). TS dopamine decreased responses to either novel odor (nothing-predicting or water-predicting) (Figure 4G–L).

To better understand the dopamine response to novel odors, we also paired new odors with an aversive air puff in these well-trained mice (Figure 4—figure supplement 1). As in the cases of novel odors predicting water or nothing, VS dopamine showed no odor responses (Figure 4—figure supplement 1A–C) and TS dopamine responded strongly to the novel odor (Figure 4—figure supplement 1D–F). Notably, the decrease in TS dopamine response to novel odors predicting air puff was much smaller than the decrease of TS dopamine response to novel odors predicting no outcome (Figure 4—figure supplement 2), indicating that the dynamics of the response depend on what the novel odor cue predicts.

Anticipatory licking in response to the rewarded odor increased after a few trials (p = 0.0387, paired t-test, n = 20 animals, trial 1–5 water CS lick vs baseline) (Figure 5, Figure 5—figure supplement 1). The animals showed differences in anticipatory licking frequency depending on cues within 10 trials of training (p = 0.000259, paired t-test, n = 20 animals, trial 6–10 water CS lick vs nothing CS lick), indicating learning of the outcomes of the odor cue (Figure 5C). VS dopamine did not show differences in responses to cues before 15 trials (p = 0.7736, paired t-test, n = 9 animals, water CS vs nothing CS trial 11–15, Figure 5D), whereas responses to predicted water decreased quickly (p = 0.020, paired t-test, n = 9 animals, trial 1–5 vs 6–10, Figure 5E). Plotting the CS and US responses as a function of anticipatory licks (rather than time) showed that mice behaviorally responded to reward-predicting odors faster than VS dopamine CS responses developed, while TS dopamine CS responses were present in all trials (Figure 5—figure supplement 2). TS dopamine decreased responses to cues depending on what the cue predicted within five trials (p = 0.0213, paired t-test, n = 11 animals, water CS vs nothing CS trial 1–5, Figure 5D) and the difference became smaller later in the session.

Figure 5 with 2 supplements see all
Dynamics of anticipatory licking behaviors and VS and TS dopamine.

Licking in response to new odors predicting reward (A) or no outcome (B) over the course of a session, in animals that have been trained with many new odor associations, as in Figure 4 (see Materials and methods). Separate plots for VS-implanted mice and TS-implanted mice are shown in Figure 5—figure supplement 1. (C) A quantification of the number of anticipatory licks elicited by each odor in VS-implanted animals (left) and TS-implanted animals (right). The difference between licks following a rewarding odor and an unrewarding odor are shown as open circles. (D) A comparison of the CS responses to rewarding and unrewarding new odors in VS dopamine (left) and TS dopamine (right). (E) A comparison of the US responses to either predicted water or predicted nothing in VS dopamine (left) and TS dopamine (right). The relationship between GCaMP responses in VS and TS and anticipatory licking is shown in Figure 5—figure supplement 2.

https://doi.org/10.7554/eLife.21886.019

We examined the temporal dynamics of the responses to novel odors in TS dopamine. The median onset latency of responses to novel odors in TS dopamine was 140 ms (Figure 2—figure supplement 4) and the median onset latency of discrimination between novel odors and familiar odors in TS dopamine was 170 ms (Figure 4—figure supplement 3).

In summary, dopamine axon signals in VS and TS showed opposite initialization while learning stimulus-outcome relationships. Dopamine axon signals in TS showed strong excitation to novel cues that gradually decreased, whereas dopamine axon signals in VS did not respond to cues with an unknown outcome and instead gradually developed cue responses to odors reliably predicting reward. Further, TS dopamine quickly discriminated cues, resulting in differential decrease rates of responses to novel cues depending on the predicted outcome (air puff, water, or nothing).

Responses to rewarding and non-rewarding stimuli in VS and TS dopamine

In order to understand the relationship between novelty responses and value coding, we next examined responses to rewarding and non-rewarding stimuli in VS dopamine and TS dopamine. Mice were trained to associate odors with water or with no outcome. After several weeks of this training, in some sessions, trials with odors predicting either a mild tone (55 dB) or an air puff were interleaved in addition to trials with water and trials with no outcome. We chose a very mild tone with a similar intensity to the background noise in the room to try to minimize the aversiveness of this stimulus. To estimate the aversiveness of auditory stimuli, we measured the behavioral responses to tones of different volumes in a different set of mice (Figure 6—figure supplement 1). We found that quiet tones did not cause freezing. When comparing the VS dopamine and TS dopamine responses to all stimuli, we observed that VS dopamine showed excitation only to reward and reward-predicting cues (Figure 6A–C, Figure 6—figure supplement 2), while TS dopamine was excited in response to a variety of stimuli including water, tone, air puff, odor cues predicting any of these outcomes, and also odor cues predicting no outcome (Figure 6D–F, Figure 6—figure supplement 2).

Figure 6 with 2 supplements see all
Responses to rewarding, aversive and neutral stimuli in VS and TS dopamine.

Dopamine responses to water (A), tone (B), and air puff (C) in the ventral striatum and the posterior tail of the striatum (DF). Plots of average traces from each region contain dotted lines indicating odor (CS) and outcome (US) delivery times. (A, D) Responses to unpredicted reward (cyan), predicted reward (blue), omitted reward (purple), and nothing odor (grey) are plotted in the left panels. For each trace, a quantification of the average peak response to the CS / US is shown on the right. (B, E) Responses to unpredicted tone (dark green), predicted tone (light green), omitted tone (yellow), and nothing odor (grey) are plotted in the left panels. For each trace, a quantification of the average peak response to the CS / US is shown on the right. (C, F) Responses to unpredicted air puff (red), predicted air puff (orange), omitted air puff (yellow), and nothing odor (grey) are plotted in the left panels. For each trace, a quantification of the average peak response to the CS / US is shown on the right. Data from individual sessions is shown in Figure 6—figure supplement 2. Behavioral responses to the tone are shown in Figure 6—figure supplement 1.

https://doi.org/10.7554/eLife.21886.022

We next examined reward prediction error coding, which consists of three key characteristics: (1) larger responses to reward-predicting cues than unrewarded cues, (2) smaller responses to predicted rewards than unpredicted rewards, and (3) a decrease in activity following reward omission. With respect to reward prediction, both VS and TS dopamine had a larger excitatory response to reward-predicting cues than cues predicting nothing (VS: p = 1.8x10−10, TS: p = 6.5x10−6, t-test, Figure 6A,D). Similarly, both VS and TS dopamine had a smaller response to a predicted reward than an unpredicted reward (VS: p = 5.1x10−11, TS: p = 1.0x10−4, t-test, Figure 6A,D). However, there was a significant difference in the response to the omission of a predicted reward: whereas VS dopamine showed a dip below baseline following reward omission (VS: p = 4.8x10−7, t-test, Figure 6A,D), TS dopamine axon signal was still significantly higher than baseline following omission (p = 2.2x10−8, t-test, Figure 6A,D). In TS, reward prediction elicited sustained activity over the interval between odor presentation and reward onset, and reward delivery caused only a small increase above this level. In fact, although the average peak response was slightly higher in rewarded trials than unrewarded trials (p = 0.0014), the total response (area under each curve) after the outcome (reward delivery or omission) did not differ significantly (p = 0.81).

Interestingly, we found that the signals observed in TS dopamine displayed components of prediction error in response to non-rewarding stimuli as well. For example, TS dopamine showed less excitation to predicted air puff or tone than unpredicted air puff or tone (p = 0.00016, p = 0.00074, Figure 6E–F). Additionally, TS dopamine cue responses to air puff or tone predicting cues were larger than the responses to cues predicting no outcome (air puff: p = 6.8x10−4, tone: p = 3.4x10−4, Figure 6E–F). Similar to water omission, the omission of an expected tone or expected air puff did not cause a dip or increase in signal (Figure 6E–F).

In summary, VS dopamine encodes RPE whereas TS dopamine responds to salient stimuli in general. TS dopamine encodes the prediction of salient stimuli, and decreases the responses to salient stimuli once they are predicted, which are characteristics of prediction error. Notably, however, we did not observe clear responses to the omission of expected salient stimuli.

Organization of dopamine novelty responses in VS and TS

Finally, we examined the relationship between different responses in dopamine and location in the striatum more carefully. For this purpose, in addition to VS and TS, dopamine axon signals in more anterior parts of the dorsal striatum (DMS and DLS) (Figure 1B) were recorded. Signals in each animal were pooled across sessions and the average was compared in relation to the location of recording sites (Figure 7). We first examined whether responses to novel odors were localized within VS or TS. VS consists of multiple sub-nuclei (Zahm and Brog, 1992) and it is suggested that there are functional differences between medial VS and lateral VS (Ikemoto, 2007). We did not observe systematic differences of novelty responses along dorsal-ventral or medial-lateral axis within VS (Figure 7A, anterior), although our spatial resolution could not completely distinguish each sub-nucleus. We did not observe noticeable differences between novelty responses in different sub-regions of TS either (Figure 7A, posterior).

Figure 7 with 2 supplements see all
Maps of dopamine responses in VS and TS.

The distribution of responses to novelty (A), reward (B), familiar odor predicting nothing (C), air puff (D), and reward omission (E). In the left panels, a 3D view of the average response from each animal. Novelty responses are the first responses to a novel odor either in the naïve case (i.e. Figure 2) or the trained case (i.e. Figure 4). Reward responses are the average response to unpredicted reward. Nothing responses are the response to cues predicting no outcome. Air puff responses are the average response to unpredicted air puff. Omission responses are the average response to the omission of expected water. In the middle panels, coronal max projections are shown from the 3D view. On the right, the correlation between signals from the fibers and their positions on the A-P axis is shown, along with a yellow line indicating the best fit. The plots of these responses are shown for VS, DMS, DLS, and TS in Figure 7—figure supplement 2. Examples of the whole-brain images used to find recording sites are shown in Figure 7—figure supplement 1.

https://doi.org/10.7554/eLife.21886.025

We examined how novelty responses were localized in the striatum. The observed distribution supported the idea that dopamine novelty responses are localized in TS. Novelty responses were correlated with location along the anterior-posterior axis (r = −0.92, p = 1.68x10−9, n = 40 animals, Pearson’s correlation). We next examined responses to cues predicting rewarding, neutral or aversive outcomes, and responses to the omission of reward. Responses for all these factors were correlated with the location along anterior-posterior axis to various degrees (water: r = 0.61, p = 5.76x10−5, nothing: r = −0.16, p = 2.61x10−7, air puff: r = −0.76, p = 4.58x10−8, and omission: r = −0.38, p = 1.68x10−9, n = 40 animals). Finally, novelty responses were positively correlated with responses to nothing, air puff-predicting cues, and reward omission (nothing: p = 8.76x10−7, air puff: p = 1.01x10−14, omission: p = 9.02x10−10, n = 40 animals). By contrast, water responses were observed in both VS and TS, although the amplitudes of water responses were slightly anti-correlated with novelty responses (p = 0.0138, n = 40 animals). We observed that water responses were found in all parts of the striatum (Figure 7—figure supplement 2).

Together, the differences in novelty responses and value coding between VS and TS dopamine suggested that novelty responses and value coding in dopamine axons are at least partially segregated in the striatum; lack of inhibitory responses to reward omission and excitatory responses to neutral or aversive stimulus are localized in TS and coincide with excitatory responses to novel stimuli.

Discussion

In the present study, we examined dopamine novelty responses in different parts of the striatum. Our data demonstrate dramatic differences between dopamine axon signals in the ventral striatum (VS) (hereafter, VS dopamine) and dopamine axon signals in the tail of the striatum (TS) (hereafter, TS dopamine). Over the course of associative learning, TS dopamine and VS dopamine exhibited opposite initialization to novel cues. Large TS dopamine responses to novel cues gradually decayed, whereas VS dopamine had no response to novel cues and gradually developed responses to cues that predicted reward. Our previous study showed that dopamine neurons that project to the TS are anatomical outliers compared to many other dopamine neurons including dopamine neurons that project to the VS (Menegas et al., 2015). These new results suggest that dopamine neurons that project to TS are not only anatomically, but also functionally, distinct compared to dopamine neurons that project to VS.

Opposite initialization to novel cues in VS and TS dopamine

The novelty responses we observed in TS dopamine, and especially the lack of novelty responses in VS dopamine, stand in stark contrast to popular theories explaining how novelty responses could fit into the dopamine RPE framework (Hazy et al., 2010; Kakade and Dayan, 2002). These theories proposed that dopamine neurons show ‘optimistic initialization’ to novel cues, promoting physical and/or computational exploration in search of potential rewards (often referred to as a ‘novelty bonus’). However, our results suggest that the ‘pure value’-coding VS dopamine axon signal does not include a novelty response, while the salience-coding TS dopamine axon signal does include a novelty response.

Over the course of classical conditioning, dopamine axon signals in VS and TS showed different coding principles. For a cue with an unknown outcome, VS dopamine initialized with no value prediction and gradually accumulated evidence for value, whereas TS dopamine began with a large excitation to novel cues and gradually decreased its responses. These dynamics can be conceptualized such that CS-value in VS dopamine was initialized with no value, while CS-salience in TS dopamine was initialized with very high salience.

The opposite dynamics in VS and TS dopamine during learning are reminiscent of two different views of associative learning: US associability and CS associability. Many models, including delta-rule, Rescorla-Wagner and temporal difference (TD) models (Rescorla and Wagner, 1972; Sutton and Barto, 1998; Widrow and Hoff, 1960), emphasize US associability; when an animal learns to associate a CS to a US, the nature of the US determines how well the animal can learn the association. In these models, US associability is determined by the prediction error of the US (i.e. the discrepancy between prediction and reality). Once the prediction error decreases, the US loses its effectiveness for creating an association with a CS. By contrast, other models (Mackintosh, 1975; Pearce and Hall, 1980) emphasize CS associability; when an animal learns to associate a CS to a US, the nature of the CS determines how well the animal can learn the association. In these models, CS associability alters attention to a stimulus, or the storage of a stimulus in working memory, to promote association with a US.

Dopamine dynamics in VS are well suited to serve as US associability, the teaching signal of the Rescorla-Wagner model (Rescorla and Wagner, 1972). TS dopamine does not follow the dynamics that the Rescola-Wagner model predicts. During the learning of novel odor-outcome associations, rather than US responses, CS responses were predominant and then decreased with learning. The dynamics of TS dopamine suggest that they may serve as CS associability, providing the attention signals for learning.

Novel stimuli may excite TS dopamine because of the unpredictability of the novel stimulus itself (i.e. it occurs with no cue or context predicting it) or because the outcome of these novel stimuli is unknown (i.e. they could predict a positive outcome, a neutral outcome, or a negative outcome). These two phenomena have been implemented in the framework of CS associability. For example, Wagner (Wagner, 1978) proposed that CS associability is correlated with the weakness of CS-context association (or the ‘unpredictability’ of the CS in the context). On the other hand, Pearce and Hall (Pearce and Hall, 1980) defined the CS associability as the unpredictability of the US (by the CS) in the previous trial. In either case, the novelty would promote learning because of high CS associability, although the latter model did not define CS associability on the very first trial. Our data could be explained with either framework, but one observed phenomenon prefers the latter model. The responses to a novel odor decreased more slowly when the odor was associated with some outcome (water or air puff) than when it was associated with no outcome. If novelty responses are determined solely by odor-context association, responses to both odors should decrease at the same speed. Furthermore, this would lead to an equal response to all familiar odors, which we did not observe.

Importantly, previous learning models incorporated teaching signals in the Rescorla-Wagner model and attention signals by novelty responses in one dopamine RPE system, with novelty responses as an exception or a bonus to the system (Hazy et al., 2010; Kakade and Dayan, 2002). Here, we propose that CS associability, or the ‘attention’ term in dopamine signals may not be an exception in RPE, but instead may be a separate system – localized in particular brain regions such as in TS. Thus, dopamine in TS signals novelty and salience, and dopamine in VS signals RPE, although both systems may co-exist in some brain areas. In general, one big limitation of the current reinforcement learning algorithms is the so-called ‘curse of dimensionality’. As the number of stimuli in an environment becomes large as in most natural environments, it quickly becomes difficult to properly assign credit to relevant objects. Attention would be critical to reduce the amount of information for learning to a more realistic amount. Dopamine in TS may be specialized for this function. In line with this idea, a recent study suggested that putative dopamine neurons in lateral SNc in monkey represented ‘cognitive salience’, which was correlated with working memory load (Matsumoto and Takada, 2013). Behaviorally, previous studies suggested that SNc is important for the acquisition of enhanced CS associability (Lee et al., 2008, 2006). Mechanistically, dopamine in the prefrontal cortex has been modeled to serve a ‘gating’ function to provide flexible updating in working memory (Braver and Cohen, 1999), and a similar mechanism may apply to dopamine in TS. Of note, different from adjacent striatal areas, TS, categorized in caudal extreme (Hintiryan et al., 2016), does not receive projections from sensorimotor cortex, suggesting a functional distinction from other areas in the striatum. All in all, TS dopamine may function to ‘pre-process’ sensory inputs to weigh potentially important stimuli to make reinforcement learning more efficient in a complex environment.

Thus, during the learning of an association between a stimulus and a reward, dopamine signals in VS and TS may cooperate. Salience prediction error in TS dopamine may serve as the CS associability of the stimulus, whereas value prediction error in VS dopamine may serve as the US associability of the reward. Alternatively, similar to value prediction error in VS dopamine during stimulus-reward association, salience prediction error in TS dopamine may reinforce stimulus-stimulus associations.

Different time-courses of CS and US responses in dopamine axons during learning

In contrast to TS dopamine, VS dopamine activity appeared to faithfully signal RPE. Several theories have arisen to explain how the reward prediction error could be computed in the brain. Of these, popular models such as Houk’s model (Houk and Adams, 1995) and temporal difference (TD) models (Sutton, 1988; Sutton and Barto, 1998) assume that a single system controls both CS-related and US-related dopamine firing. Under these models, higher expectation causes CS responses to become larger and US responses to become smaller at the same time. However, we observed that changes in CS and US responses were not simultaneous. In VS, the decrease in response to predicted rewards (US) was faster, occurring over the course of a single session, whereas the increase in response to reward-predicting odors (CS) required weeks of training. Over-training accelerated the time course of these events, but not their temporal order.

The development of CS responses in VS dopamine was also much slower than the associative learning observed at the behavioral level. By contrast, US responses decreased as anticipatory licking increased. These results demonstrate that US responses in VS dopamine are well suited as prediction error signals in the Rescorla-Wagner learning model, whereas CS responses are not time-locked to this learning.

The time course of CS responses we observed in VS dopamine is not easily explained by simple TD models. In these models, during learning, RPE signals gradually transfer from the timing of the reward to the timing of the preceding stimulus (Schultz et al., 1997). Although the step-wise transfer may explain a delay between the decrease of US responses and emergence of CS responses, such a gradual transfer has not been observed in single neuron recording of dopamine neurons (Pan et al., 2005). With any learning rate longer than one trial, transferred signals may become temporally smeared until they become time-locked to the CS (Pan et al., 2005; Schultz et al., 1997), which may not be detected in recording of single units. This theory would predict that we would observe some increase in signal between the stimulus and outcome (either smeared or sharp) during learning because the monitoring of population activity likely provides more reliable detection of small signals. However, this type of increase was not apparent with our bulk recording method. Instead, we observed a gradual development of a CS response directly following cue presentation. It is possible that distinct mechanisms could cause VS dopamine excitation in response to a reward-predicting CS and VS dopamine suppression in response to a predicted US, as proposed in several previous models (Brown et al., 1999; O'Reilly et al., 2007).

Relation to previous studies

There are different types of novelty (Schomaker and Meeter, 2015). One example is spatial novelty, which could be signaled by different arrangements of objects/stimuli in the environment. This kind of environmental novelty is known to induce exploration of animals and accelerate learning in this environment (Li et al., 2003; Otmakhova et al., 2013). Another example is stimulus novelty, which is associated with objects/stimuli that animals have never encountered or do not remember. It has been reported that dopamine activity in VTA and dopamine in the ventral striatum increased in the former case, in novel environments (Segovia et al., 2010; Takeuchi et al., 2016). On the other hand, in the present study, we focused on the latter type of novelty. Dopamine axon responses to novel stimuli were localized in TS. One potential explanation is that depending on the training history and environments, a given type of novelty may cause the animals to expect potential rewards, resulting in the excitation of the value system. Interestingly, previous studies found that different brain areas are responsible for different kinds of novelty (Schomaker and Meeter, 2015). Further studies are needed to determine how dopamine in different striatal areas responds to different kinds of novelty in different training environments.

In our previous study, dopamine neurons that project to TS were mainly observed in the lateral SNc of mice (Menegas et al., 2015). A previous study proposed that putative dopamine neurons in the lateral SNc of monkeys encode ‘motivational salience’, which is the absolute value of positive or negative ‘value’ (Matsumoto and Hikosaka, 2009). On the other hand, another study proposed that excitation of dopamine neurons in response to non-rewarding stimuli encodes the stimulus intensity, regardless of value (Fiorillo et al., 2013). In the present study, the excitation of TS dopamine elicited by various neutral stimuli suggested that the responses in TS could be related to more general salience rather than motivational salience, although we cannot rule out the possibility that the tone and odor predicting nothing had positive or negative motivational values. On the other hand, the fact that signals encoded by TS dopamine are modulated by prediction suggests that they are not encoding pure physical salience (i.e. stimulus intensity). Instead, TS dopamine appears to encode general stimulus salience prediction error, which includes prediction-dependent suppression and prediction. The novelty responses we observed may be the extreme case of salience prediction error, causing large excitation because of minimum prediction, rather than an exception.

A recent study found that putative dopamine neurons which project to the tail of the caudate (part of posterior striatum) in monkey formed another group of dopamine neurons. These dopamine neurons did not respond to water reward but encoded ‘sustained values’ of visual cues, whereas putative dopamine neurons projecting to the anterior caudate encoded ‘updating values’, when cue-outcome contingency was frequently changed (Kim et al., 2015). However, our results indicate that the difference between VS and TS dopamine extends beyond their flexibility. The dynamics between them are different in nature, not only in learning speed. Most importantly, we found that TS dopamine did not encode ‘values’.

Where do salience signals come from? How are salience signals regulated by novelty and experiences? A map of monosynaptic inputs to TS-projecting dopamine neurons should provide critical information (Menegas et al., 2015). Previous studies showed that various brain areas including olfactory and visual systems are modulated by experience (Boehnke et al., 2011; Kato et al., 2012). Whether dopamine neurons receive this processed information from sensory systems or whether more abstract information about salience and novelty is passed to dopamine neurons and sensory systems in parallel is an open question. Of note, behavioral responses to novel odors are very quick, within one respiration cycle in rats (Wesson et al., 2008). The responses to novel odors in TS dopamine that we observed began within 200 ms, most likely within one respiration cycle, suggesting a potential contribution at the early stages of novelty.

Technical considerations

Optical fiber fluorometry (fiber photometry) was developed by Kudo et al. (Kudo et al., 1992) and has been applied in many studies to record the population activity of neurons from cell bodies, dendrites, or axons (Adelsberger et al., 2005; Davis and Schmidt, 2000; Murayama et al., 2007). In this study, we recorded the population activity of dopamine axons in the striatum using GCaMP6m (Akerboom et al., 2012; Chen et al., 2013; Kim et al., 2016; Parker et al., 2016). We should point out several limitations associated with the present technique. First, previous studies (Fiorillo et al., 2013; Schultz, 2015) proposed that there is a temporal separation of two signals (stimulus intensity and value) in single dopamine neurons. However, we may only be able to measure the sum of these signals because of the limited temporal resolution of our method (population calcium imaging using GCaMP6m). Second, a recent study found that dopamine axons with distinct signals (locomotion and reward) coexist in the dorsal-most part of the dorsal striatum (Howe and Dombeck, 2016). Axons signaling different information might also co-exist in other areas of the striatum, and this could not be resolved with our bulk-imaging method (because such signals would effectively be ‘averaged’). Third, because the spatial resolution of z-axis is large with fluorometry (~500 µm), we have to be careful in interpreting the analysis of differences along dorsal-ventral axis.

Dopamine axons passing through and below the ventral striatum to the cortex (Aransay et al., 2014) may have contributed to the signals in VS dopamine, although calcium transients in passing axons are smaller than in axon terminals and boutons (Koester and Sakmann, 2000; Llano et al., 1997). Finally, the activity of axons of dopamine neurons may not directly correspond to amounts of dopamine release at synapses or spike activities in cell bodies. Dopamine neurons that project to the ventral striatum (mainly medial shell) are able to co-release glutamate (Stuber et al., 2010). Neuronal activities can be modulated locally at axon terminals in the striatum by cholinergic neurons (Threlfell et al., 2012). Most importantly, observed calcium transients may not reflect spike counts, because of autofluorescence, bleaching, motion artifacts and inevitable normalization. Although we only applied baseline normalization (i.e. signals were subtracted with and then divided by the average signal in a 1 s period before CS in each trial) in this study, additional methods using activity-independent wavelength of excitation (Kudo et al., 1992; Lerner et al., 2015) or examination of emission spectrum (Cui et al., 2013) may improve fidelity, especially in freely moving animals.

Conclusion

We found that dopamine responses to novel stimuli are more localized than previously believed. We propose to revise current RPE models so that novelty-driven and salience-driven attention is attributed to TS dopamine, rather than added to the RPE framework as a bonus (Kakade and Dayan, 2002). Thus, TS dopamine may be specialized for functions apart from value, such as attentional orientation (Redgrave et al., 1999), working attention, and/or as a filter for learning (Braver and Cohen, 1999; Dayan and Sejnowski, 1996; Matsumoto and Takada, 2013; Pearce and Hall, 1980). Further, our proposal includes another important point: RPE is not contaminated or distorted in VS dopamine. VS dopamine purely signals RPE, increasing the validity of the original ideas regarding dopamine’s role in reinforcement learning (Schultz et al., 1997).

Materials and methods

Animals

85 male adult mice were used. These mice were the result of a cross between DAT (Slc6a3)-Cre mice (recombinase under the control of the dopamine transporter, DAT) (B6.SJL-Slc6a3tm1.1(cre)Bkmn/J, Jackson Laboratory; RRID:IMSR_JAX:006660) (Bäckman et al., 2006) and tdTomato mice such that they were heterozygous for DAT-Cre and also heterozygous for tdTomato (Gt(ROSA)26Sortm9(CAGtdTomato)Hze, Jackson Laboratory). Animals were housed on a 12 hr dark/12 hr light cycle (dark from 07:00 to 19:00), one to a cage, and performed the task at the same time each day. All procedures were performed in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals and approved by the Harvard Animal Care and Use Committee.

Viral injections, fiber implants, and head-plate installation

To prepare animals for recording, we performed a single surgery with three key components: (1) AAV-FLEX-GCaMP virus infection into the midbrain, (2) head-plate installation, and (3) one or more optic fiber implants into the striatum. At the time of surgery, all mice were 2–3 months old. All surgeries were performed under aseptic conditions with animals anesthetized with isoflurane (1–2% at 0.5–1.0 l/min). Analgesia (ketoprofen, 5 mg/kg, I.P.; buprenorphine, 0.1 mg/kg, I.P.) was administered for 3 days following each surgery.

To express GCaMP specifically in dopamine neurons, we unilaterally injected 250 nl of AAV5-CAG-FLEX-GCaMP6m (1 × 1012 particles/ml, Penn Vector Core) into both the VTA and SNc (500 nl total). To target the VTA, we injected virus at Bregma −3.0, Lateral 0.6, at all depths between 4.5 and 4.0 mm. To target SNc, we injected virus at Bregma −3.3, Lateral 1.6, at all depths between 4.0 and 3.5 mm. Virus injection lasted several minutes, and then the injection pipette was slowly removed over the course of several minutes to prevent damage to the tissue.

So that mice could be head-fixed during recording, we installed a head-plate onto each mouse. To do this, we removed the skin above the surface of the brain, dried the skull using air, and glued the head-plate onto the top of the skull with C and B Metabond adhesive cement. We used circular head-plates to ensure that the skull above the striatum would be accessible for fiber implants. Finally, during the same surgery, we also implanted optic fibers into the VS, DMS, DLS, and TS (1–4 fibers per mouse). To do this, we first slowly lowered optical fibers (either 200 µm or 400 µm diameter, Doric Lenses) into the striatum. The coordinates we used for targeting were as follows: (VS) Bregma 1.0, Lateral 1.25, Depth 4.1, (DS) Bregma 0.0, Lateral 1.5, Depth 2.25, (DLS) Bregma −0.5, Lateral 2.75, Depth 2.5, (TS) Bregma −2.0, Lateral 3.25, Depth 2.5. Once fibers were lowered, we first attached them to the skull with UV-curing epoxy (Thorlabs, NOA81), and then a layer of black Ortho-Jet dental adhesive (Lang Dental). After waiting fifteen minutes for this glue to dry, we applied a very small amount of rapid-curing epoxy (Devcon, A00254) to attach the fiber cannulas even more firmly to the underlying glue and head-plate. After waiting fifteen minutes for the epoxy to dry, the surgery was complete.

Fiber fluorometry

Fiber fluorometry (photometry) (Kudo et al., 1992) allows for recording of the activity of genetically defined neural populations in behaving mice by expressing a genetically encoded Ca2+ indicator, GCaMP6m (Akerboom et al., 2012; Chen et al., 2013) and chronically implanting an optic fiber. The optic fiber (200 µm or 400 µm diameter, Doric Lenses) allows chronic, stable, minimally disruptive access to deep brain regions and interfaces with a flexible patch cord (Doric Lenses) on the skull surface to simultaneously deliver excitation light (473 nm and 561 nm, Laserglow Technologies) and collect GCaMP and tdTomato fluorescence emission (see Figure 1—figure supplement 1).

Activity-dependent fluorescence emitted by cells in the vicinity of the implanted fiber’s tip was spectrally separated from the excitation light using a dichroic, passed through a single band filter, and focused onto a photodetector connected to a current preamplifier (SR570, Stanford Research Systems). To record Ca2+ transients from dopamine terminals, we injected a Cre-dependent adeno-associated virus (AAV) carrying the GCaMP6m gene into the VTA and SNc of transgenic DAT-Cre mice and implanted 200 µm or 400 µm diameter optic fibers in the striatum.

During recording, optic fibers were connected (1–2 per recording session) to patch cables which delivered excitation light (473 nm and 561 nm) and collected all emitted light. The emitted light was subsequently split and filtered (see Figure 1—figure supplement 1) and collected by a photodetector connected to a current preamplifier. This preamplifier output a voltage signal which was collected by a NIDAQ board. The NIDAQ board was connected to the same computer that was used to control odor, water, tone, and air puff delivery with Labview, so GCaMP and tdTomato signals could be readily aligned to task events such as odor delivery or reward delivery.

Behavior

After surgery, mice were given three weeks to recover and become habituated to the installed head-plate and implanted optic fibers. Additionally, this allowed time for viral expression. After this recovery period, mice were handled for 2–3 days and water deprived. Weight was maintained above 90% of baseline body weight.

In the first 2–3 sessions, mice were head-fixed and given unexpected water at random intervals (randomly drawn, between 1 and 20 s, with a mean of 10 s and a normal distribution). This allowed mice to become habituated to being head-fixed and allowed us to determine the appropriate laser power (typically between 0.1 mW and 0.25 mW) to record >5% dF/F free water responses (typically between 10% and 50%). These sessions were important, so that recordings during odor-water association could begin from the very first odor presentation on the first day of data collection (see Experimental Timeline).

The volume of water was constant for all reward trials (predicted or unpredicted) in all conditions. Similarly, the same mild tone (15 kHz, 0.5 s, ~50 dB) was used in all tone trials and the same intensity air puff was used in all air puff trials. Each behavioral trial began with an odor cue (a puff of odor lasting 1 s), followed by a 1 s delay, and then an outcome (either water, nothing, tone, or air puff). Odors were delivered using a custom olfactometer (Uchida and Mainen, 2003). Each odor was dissolved in mineral oil at 1:10 dilution. 30 µl of diluted odor was placed inside a filter-paper housing (Thomas Scientific, Swedesboro, NJ). Example PID measurements are shown in Figure 2—figure supplement 1. Odors were selected pseudorandomly for each animal. Odorized air was further diluted with filtered air by 1:14 to produce a 1500 ml/min total flow rate. A variable inter-trial interval of 6–12 s (random) was placed between trials. All trial types were randomized in all of the sessions. Each day, the mice did about 300 trials over the course of about an hour. On a recording day, they performed the same task, and we recorded for ~45 min, which is approximately 250 trials, with constant excitation from the laser and continuous recording. Recordings from the same fiber were interspersed with at least two days of no recording.

Experimental timeline

On the first day of classical conditioning, odors were presented to mice for the first time, and either predicted no outcome or reward. We quantified the ‘novelty response’ as the response to the first odor presentation that the mouse experienced, which was associated with no outcome (for n = 13 VS-implanted mice and n = 12 TS-implanted mice). For comparison, the response to the first unpredicted water presentation in those sessions was quantified as well. These ‘novelty responses’ were the first trials of the first day of classical conditioning (Figure 2), while the average responses for these sessions are reported as ‘Day 1’ and compared with later sessions in Figure 3. Due to technical difficulties we encountered in recording the first response of a session, some of the first responses were not recorded. Therefore, the sample size (number of animals) is lower for first trial responses (n = 12 mice for TS) in Figure 2 than for average responses during ‘Day 1’ (n = 19 mice for TS) in Figure 3.

During classical conditioning, odor cues (also called ‘conditioned stimuli’ or ‘CS’) were associated with either reward or no outcome. In the case of reward trials, water (the ‘unconditioned stimulus’ or ‘US’) would follow odor presentation after 2 s, 90% of the time (i.e. 10% omission). In ~10% of trials, unpredicted water was delivered without odor presentation. During training, GCaMP responses were recorded at time points (Figure 3) rather than daily, to minimize bleaching or tissue damage. After 3 weeks of this classical conditioning training (with one water predicting odor, one nothing-predicting odor, and occasional unpredicted water), mice were introduced to new odor-outcome association types.

At this point in training, mice were also presented with odor-tone associations (20% of trials) or odor-air puff associations (20% of trials), in addition to the two familiar odors associated with water and with no outcome, allowing us to multiplex data from learning onto this data regarding value or salience coding and prediction error coding. Unpredicted tone or air puff was also delivered in ~5% of trials. Data from these sessions was used in Figure 6 and Figure 7, including data from DMS-implanted or DLS-implanted mice.

Finally, a subset of these mice (n = 11) were trained with two new odors each day (one associated with water and one associated with no outcome), every day for a week, until mice began to discriminate between odors behaviorally within a few trials (see Figure 5) rather than over the course of many days (see Figure 3). We referred to these mice as ‘overtrained mice’. After this training, we recorded GCaMP and licking signals from these overtrained mice as they learned either new odor-water (one third of sessions) or new odor-nothing associations (one third of sessions) in Figure 4. In one third of sessions, a new odor associated with air-puff was introduced in addition to the two familiar odors associated with water or with no outcome (Figure 4—figure supplement 3). We randomized whether the new odor of a session would predict water, nothing, or air puff to ensure that mice could not generalize that novel odors reliably predicted a particular outcome. Mice performed one session per day.

Fiber fluorometry and licking data analysis

GCaMP and tdTomato signals were collected as voltage measurements from the current preamplifiers using Labview (Figure 1—figure supplement 1). The 'dF/F' measurement was calculated by comparing the average signal in a 1 s period before each trial (‘F1’) with the signal at any given point during the trial (‘F2’). The calculation for each point in the trial (calculated in 1 ms bins) was then simply dF/F = (F2 – F1) / F1. We used this measurement because it readily normalized signals (i.e. in the case of low signal to noise ratio, the denominator would be larger). The average responses to a stimulus type within a session (often ~50 trials per stimulus type) were averaged, and these session averages were used as the data in each figure (individual session averages can be found in Figure 3—figure supplement 1 and Figure 6—figure supplement 1, and example individual single trial traces are shown in Figure 1 and Figure 1—figure supplement 1).

These session averages were compared across animals in two basic ways. (1) Traces were averaged and plotted (as the average of all session averages) along with the standard error (the total number of sessions being the sample size) as in Figure 5, left panels. (2) Peak responses to cues/outcomes were quantified by finding the point with the maximum absolute value during 2 s following cue/outcome for each trial, then comparing the averages between sessions as in Figure 5, right panels. Because traces were aligned using task events (i.e. cue on time) rather than behavioral events (i.e. first inhalation), comparing peak responses ensured that signals, which were slightly offset in time relative to odor presentation, could be compared.

To compute the latency of responses to novel cues, each trial was tested for difference from baseline in the first five novel odor trials using time bins of 50 ms. We called the ‘latency’ of the response (in each trial) the center of the first time bin where five consecutive time bins all showed significant difference from baseline. To compute the latency of novelty discrimination, the sessions were tested for significant difference between familiar and novel odors in the first five trials using time bin of 50 ms. We called the ‘latency’ of discrimination the center of the first time bin where five consecutive time bins all showed significance.

While recording GCaMP signals, we also recorded licking. To measure licking, we used a detector that output a voltage based on the disruption of its infra-red light path. We set a threshold for signal corresponding to a ‘lick’ and then made the signal binary by finding each time point where the signal crossed the threshold so that it could easily be quantified. Our main quantification for licking was counting the number of ‘anticipatory licks’, the licks following an odor (CS) and preceding the arrival of the outcome (US).

GCaMP responses and licking responses were collected through Labview during the training for offline analysis. Statistical analyses (i.e. t-tests, ANOVA) were run using Matlab (Mathworks). All analyses considered a value of p≤0.05 significant, and exact p-values are reported in the text. To quantify body movement, we used a video camera to capture images of the mice while they performed the task and made a rough estimate of total body movement by subtracting each frame of the video from the last frame, using the ‘imabsdiff’ function in Matlab. We reported these values, which we took to be a proxy for body movement, as ‘arbitrary units’ or ‘a.u.’ because they were measured in pixels.

Tissue clearing using CLARITY

Brains were cleared as previously described (Menegas et al., 2015) at 37°C for 2 days, with a constant current of 1.2 amps. A Niagra 120 V (Grey Beard Pumps #316, Mt Holly Springs, PA, United States) pump was used to circulate clearing solution. A Precision Adjustable 60 V/5A power supply (Korad Technology #KA6005D, Shenzhen, China) was used to provide current. A 5-gallon plastic container (US Plastic #97,028, Lima, Ohio, United States) was used as a clearing solution reservoir and tubing was run though a second 5-gallon plastic container filled with water to cool the solution flowing through it. Chambers were constructed as previously described (Chung and Deisseroth, 2013) using a Nalgene chamber (Nalgene 2118–0002, Rochester, NY, United States) and platinum wire (Sigma-Aldrich 267228, St. Louis, MO, United States). Clearing was done in a room held at 37°C.

Imaging using light-sheet microscope

Images were acquired with the Zeiss Z.1 Light-sheet microscope (Carl Zeiss, Jena, Germany). Brains were glued to the tip of a 1 ml syringe (without needle) such that the posterior tip of the cerebellum was in contact with the syringe. A 488 nm laser was used to excite GFP and a 647 nm laser was used to produce autofluorescence. Images were collected through a 5× objective with PCO-Edge scMOS 16 bit cameras (PCO, Kelheim, Germany) with 1920 × 1920 pixels. Each frame was 2000 × 2000 μm, so each pixel was roughly 1.04 μm. The step size between images was set to 5.25 μm, so the voxels were not isotropic. Brains were imaged horizontally from the dorsal side, and then rotated 180° for horizontal imaging from the ventral side. Each view was tiled with 7 × 6 tiles (14,000 × 12,000 μm) and the two views were combined to create a continuous image. Autofluorescence images were subsequently downsized to 1400 × 1200 × 700 pixels for alignment to the reference space. In these downsized images, voxels have 10 μm spacing in all three dimensions. Brains were aligned to a previously described reference space comprised of the average of 25 brains (Menegas et al., 2015). Alignment to this reference space was performed using Elastix (Klein et al., 2010). We performed affine alignment followed by B-spline alignment based on mutual information, as previously proposed for human magnetic resonance imaging (MRI) image registration (Metz et al., 2011). After alignment, fiber positions were manually determined by tracing the fiber paths to their termination points.

Classification of implant sites

After clearing and imaging each brain as a whole volume and aligning these images to determine the exact location of each fiber, we classified each fiber as either (1) an implant into VS, (2) an implant into DS, (3) an implant into TS, or (4) an incorrectly targeted fiber. Our analysis in the current paper focuses on comparing VS-implanted fibers to TS-implanted fibers. Most implants were successfully targeted to VS or TS. We classified implants as successful based on the following criteria. For VS: any fiber within the nucleus accumbens core or shell, between Bregma 2.0 and Bregma 0. For DS: any fiber within the striatum anterior to Bregma 1.5. For TS: any fiber within the striatum posterior to Bregma −1.5. We discarded data from eight animals which had fibers incorrectly targeted. The fibers in these animals were often in areas of cortex directly adjacent to the intended recording site. We observed very little or no signal (compared to our other recordings) in these cases, likely due to the relatively sparse dopaminergic innervation of cortex relative to striatum in mouse.

References

  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
  10. 10
  11. 11
    How the basal ganglia use parallel excitatory and inhibitory learning pathways to selectively respond to unexpected rewarding cues
    1. J Brown
    2. D Bullock
    3. S Grossberg
    (1999)
    Journal of Neuroscience 19:10502–10511.
  12. 12
  13. 13
  14. 14
  15. 15
  16. 16
  17. 17
  18. 18
  19. 19
  20. 20
  21. 21
  22. 22
  23. 23
  24. 24
  25. 25
  26. 26
  27. 27
  28. 28
  29. 29
  30. 30
  31. 31
  32. 32
  33. 33
    Models of Information Processing in the Basal Ganglia
    1. JC Houk
    2. JL Adams
    (1995)
    249, A model of how the basal ganglia generate and use neural signals that predict reinforcement, Models of Information Processing in the Basal Ganglia.
  34. 34
  35. 35
  36. 36
  37. 37
  38. 38
  39. 39
  40. 40
  41. 41
  42. 42
  43. 43
  44. 44
  45. 45
  46. 46
    Two cell types in rat substantia nigra zona compacta distinguished by membrane properties and the actions of dopamine and opioids
    1. MG Lacey
    2. NB Mercuri
    3. RA North
    (1989)
    Journal of Neuroscience 9:1233–1241.
  47. 47
  48. 48
  49. 49
  50. 50
  51. 51
  52. 52
    Responses of monkey dopamine neurons during learning of behavioral reactions
    1. T Ljungberg
    2. P Apicella
    3. W Schultz
    (1992)
    Journal of Neurophysiology 67:145–163.
  53. 53
  54. 54
  55. 55
  56. 56
  57. 57
  58. 58
  59. 59
  60. 60
  61. 61
  62. 62
  63. 63
    Intrinsically Motivated Learning in Natural and Artificial Systems
    1. N Otmakhova
    2. E Duzel
    3. AY Deutch
    4. J Lisman
    (2013)
    235–254, The hippocampal-VTA loop: the role of novelty and motivation in controlling the entry of information into long-term memory, Intrinsically Motivated Learning in Natural and Artificial Systems, Springer.
  64. 64
  65. 65
  66. 66
    Conditioned Reflexes: An Investigation of the Physiological Activity of the Cerebral Cortex
    1. IP Pavlov
    2. GV Anrep
    (1927)
    London: Oxford University Press: Humphrey Milford.
  67. 67
  68. 68
  69. 69
    Classical Conditioning II: Current Research and Theory
    1. RA Rescorla
    2. AR Wagner
    (1972)
    64–99, A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and non-reinforcement, Classical Conditioning II: Current Research and Theory.
  70. 70
  71. 71
  72. 72
  73. 73
  74. 74
  75. 75
    Selected Works: English Translation, State Publishing House for Biological and Medical Literature
    1. I Sechenov
    (1935)
    Reflexes of the brain (1863), Selected Works: English Translation, State Publishing House for Biological and Medical Literature, Moscow-Leningrad.
  76. 76
  77. 77
  78. 78
  79. 79
  80. 80
  81. 81
  82. 82
    Reinforcement Learning: An Introduction
    1. RS Sutton
    2. AG Barto
    (1998)
    Cambridge, Mass: MIT Press.
  83. 83
  84. 84
  85. 85
  86. 86
  87. 87
    Cognitive Processes in Animal Behavior
    1. AR Wagner
    (1978)
    177–209, Expectancies and the priming of STM, Cognitive Processes in Animal Behavior.
  88. 88
  89. 89
  90. 90
  91. 91
    Adaptive Switching Circuits
    1. B Widrow
    2. ME Hoff
    (1960)
    IRE WESCON Convention Record .
  92. 92

Decision letter

  1. Gary L Westbrook
    Reviewing Editor; Vollum Institute, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Opposite initialization to novel cues in dopamine signaling in ventral and posterior striatum" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation was overseen by Gary Westbrook as the Senior Editor. The following individual involved in review of your submission has agreed to reveal her identity: Kay M Tye (Reviewer #2).

Both reviewers were very positive about your work. The reviewers have discussed the reviews with one another and the Senior Editor has drafted a consolidated summary of the reviews to help you prepare a revised submission.

Summary:

Menegas and colleagues studied dopamine neuron responses during classical conditioning and compare the responses across different striatal subregions. They used fiber photometry to record calcium responses from DA axons and tracked these signals across learning. The major result is a dramatic difference between dopamine signals in the ventral striatum (VS) compared to the tail of the striatum (TS). While VS projecting DA neurons followed the classical reward prediction error patterns and did not respond to novel cues, TS projecting DA neurons responded strongly to novel cues and also to the overall salience of cues and outcomes. This is a beautiful manuscript reporting on a very timely and important topic. The design of the experiments is elegant and the observed difference between DA responses is striking. The manuscript is very well written and describes an important finding with attention to the points listed below.

Essential revisions:

1) It would be important to measure the timecourse of the odor signals to assess whether cue responses in the TS are concordant with what is expected based on adaptation in the olfactory bulb or piriform cortex. Please use PID measurements of olfactometer output. This would also inform the issue of whether TS implements a novelty filter or rather it inherits novelty signals.

2) Why did the authors use TdTomato as the control instead of the 405 emission? Did the authors also collect this? Do the authors have some evidence that different emission wavelengths produce identical mechanical deflections, or might there be chromatic aberrations that make this different? At this point, the appropriate controls for fiber photometry are still up for discussion, so it would be useful to have the authors directly discuss this issue.

3) The description of TS signals beyond novelty and their relationship to RPE is confusing. Would it be correct to simplify the responses as the absolute value of RPE plus a novelty bonus? Either way, please clarify.

4) It was not convincing that the neutral tone stimuli (Figure 6) are perceived as truly neutral. Why then do they provoke negative inflections in the RPE-type signal observed in VS? Perhaps video tracking of startle response would help to exclude this possibility.

5) The authors should not discuss the signals as "dopamine signals" as they are only looking at the calcium transients within dopamine neurons that may co-release glutamate. They should instead refer to these signals as "signals from VTA DAT+ neurons in the VS or TS" or something similar, and separately discuss what this can reflect.

6) It would be important to see histology showing representative expression of GCaMP-expressing fibers in each of the regions sampled

7) How much of the photometry signal is due to terminals versus axons of passage?

8) Regarding Figure 5: What happens if the authors sort the trials where there was licking versus no licking (in water trials) in terms of the response to the water delivery. No licking may indicate that the mouse did not recognize/detect/attend to the odor cue?

9) It would be nice to see the anatomy: if the authors retrogradely label VTA DA neurons that project to VS versus TS, where do these TS neurons reside? What is the distribution? This would be helpful information in terms of joining this information with the work by Stephan Lammel on the arrangement of VTA DA neurons projecting to mPFC or NAc m shell or NAc l shell.

10) Did the authors control for the order of all of the manipulations? Were the order of manipulations randomized for both the trial types within A, B, C and the epochs (particularly in Figure 6)?

https://doi.org/10.7554/eLife.21886.028

Author response

Essential revisions:

1) It would be important to measure the timecourse of the odor signals to assess whether cue responses in the TS are concordant with what is expected based on adaptation in the olfactory bulb or piriform cortex. Please use PID measurements of olfactometer output. This would also inform the issue of whether TS implements a novelty filter or rather it inherits novelty signals.

These are very important points, and we appreciate this advice. During odor-association learning, we observed that novel-odor-cue-evoked calcium signals in TS-projecting dopamine axons decreased within a session (Figure 2 for initial learning and Figure 4 for repeated learning), as well as over days (Figure 3).

One potential explanation for the within-session decrease in signal could be that the effective concentration of the odorants decreased over time within a session, and that the TS dopamine axon signals encode stimulus intensity and merely reflect the concentration change. To test this idea, we quantified the change in the effective concentration of the odorants used in our study over trials within a session, as well as the latency of odor delivery within single trials, using a photoionization detector (PID) (Aurora Scientific, 200B: miniPID Dispersion Sensor) (Figure 2—figure supplement 1). We found that some odors were slow to decay (Figure 2—figure supplement 1A-B), and others decayed significantly over the course of a session (Figure 2—figure supplement 1C-D). Therefore, we split our odors into two groups, based on whether they decayed slowly or quickly (Figure 2—figure supplement 1E). We then compared TS dopamine axon responses to novel odor cues based on the stability of the odorant over time. We found that responses to both types of odors decreased dramatically over the course of a session (Figure 2—figure supplement 1F). Another way to address this potential concern is to look at TS dopamine axon responses to a familiar odor over the course of a session. We performed this analysis and found that responses to familiar cues were stable over the course of a session (Figure 2—figure supplement 2C). Finally, we also observed a decrease in responses to novel odors across days (Figure 3), which would not be affected by any within-session decay of odor concentration. Taken together, we do not think that changes in odorant concentrations affected our main conclusions. We added sentences to the Results section, as specified below.

“To determine whether this decrease in signal was caused by a decrease in effective odor concentration within sessions, we compared responses to fast-decaying and slow-decaying odors (Figure 2—figure supplement 1). TS dopamine responses to novel odors, both fast-decaying and slow-decaying odors, decreased over the course of a session (Figure 2—figure supplement 1F), whereas TS dopamine responses to familiar odors did not change (Figure 2—figure supplement 2C).”

Whether the decrease of TS dopamine axon responses to novel cues is due to sensory adaptation is also a fundamental question meriting some discussion. First, studies using anesthetized animals have shown that neurons in the piriform cortex, but not in the olfactory bulb, quickly adapt to repeated odor presentations, and recover in a relatively short time period, normally within tens of seconds (Wilson 1998, Best and Wilson 2004). Our observation that dopamine novelty response decreases over days (Figure 3) cannot be explained simply by this type of adaptation. In addition, the decay in response was unique to novel odors and we did not observe similar decay in signal elicited by familiar odors (Figure 2—figure supplement 2C). Further, the speed of decrease in responses to novel odors depends on whether the odor predicted nothing, water, or air puff. We observed that TS dopamine axon responses to novel odors decreased more slowly when the novel odor predicted water (Figure 5) or airpuff (Figure 4—figure supplement 2) than when it did not predict any outcome. It is unlikely that these differences in the decay time course can be explained simply by sensory adaptation.

A recent study suggested that neurons in the olfactory bulb of awake mice (Kato et al., 2012) modulated their responses to the repetitive exposure of odors following more complicated principles than the simple adaptation observed in olfactory sensory neurons (Chaput, 2000). Kato et al. showed that neurons in the olfactory bulb quickly reduced their responses to odors following repetitive exposure (within a session) and that the effects of exposure lasted for over a month. Although we cannot directly compare the time-course because of the difference of tasks (i.e. the odor presentations are interspersed with other trial types in our task, rather than repeated consecutively as in their task), the dynamics in olfactory bulb are comparable to our observation in TS dopamine axon signals.

It is also important to discuss the latency of novelty detection within a trial. Behaviorally, novelty detection occurs within 200 ms; immediately after a single sniff (Wesson et al., 2008). In order to understand which brain area could function as a novelty filter, as the reviewers mentioned, it is very important to examine the temporal dynamics of neuronal signals in response to novelty. We found that the median latency of TS dopamine axon responses to novel odors in naïve mice was 140 ms (Figure 2—figure supplement 4) and the median latency of discrimination between novel and familiar odor in trained mice was 170 ms (Figure 4—figure supplement 3). We plotted the average of the responses to the first 5 presentations of a novel odor in 12 mice (Figure 2—figure supplement 4). The short latency of these responses means that the signal is likely to arise within the first sniff, based on a previously reported sniff rate of 3.4 Hz (294ms/cycle) (Kato et al., 2012). Because we used GCaMP, a calcium-based indicator, our temporal resolution is somewhat limited. This is also an issue with the previous studies that we mentioned (Kato et al., 2012, Wesson et al., 2008). Notably, the peak of the average response we observed in TS dopamine axon responses to novel odors was not until 600 ms after odor delivery (Figure 2—figure supplement 4).

In order to understand the precise temporal relationship between these signals, functional analysis and more precise measurements of timing (single unit and simultaneous awake recording), as well as consideration of temporal coding in the olfactory bulb (Cury and Uchida, 2010) is necessary. In short, our results demonstrate that TS dopamine axons signal novelty quickly (within the first sniff), but future studies will examine which brain area or areas function as novelty filters. We added a brief paragraph to the Discussion (subsection “Relation to previous studies”) to address this issue.

“Where do salience signals come from? How are salience signals regulated by novelty and experiences? A map of monosynaptic inputs to TS-projecting dopamine neurons should provide critical information (Menegas et al., 2015). Previous studies showed that various brain areas including olfactory and visual systems are modulated by experience (Boehnke et al., 2011; Kato et al., 2012). Whether dopamine neurons receive this processed information from sensory systems or whether more abstract information about salience and novelty is passed to dopamine neurons and sensory systems in parallel is an open question. Of note, behavioral responses to novel odors are very quick, within one sniff in rats (Wesson et al., 2008). The responses to novel odors in TS dopamine that we observed began within 200 ms, most likely within one sniff, suggesting a potential contribution at the early stages of novelty.”

2) Why did the authors use TdTomato as the control instead of the 405 emission? Did the authors also collect this? Do the authors have some evidence that different emission wavelengths produce identical mechanical deflections, or might there be chromatic aberrations that make this different? At this point, the appropriate controls for fiber photometry are still up for discussion, so it would be useful to have the authors directly discuss this issue.

Optical fiber fluorometry has long been used in various preparations (Kudo et al., 1992, Duff Davis and Schmidt, 2000, Adelsberger et al., 2005, Murayama et al., 2007). Signals collected from excitation with different wavelengths are sometimes used to normalize signals with the goal of removing the effects of autofluorescence, bleaching, and movement (Kudo et al., 1992, Lerner et al., 2015). This is an especially important consideration in freely moving animals, where the interface between the implanted fiber and the patch cord used to collect signal can cause additional artifacts. In this paper, we used a head-fixed preparation and reported the signal from GCaMP. We did not normalize by the tdTomato signal because we were not confident that the ratio of GCaMP: tdTomato would be constant between mice. Specifically, TdTomato levels were likely to be similar (because this was genetically encoded), but the virally-delivered GCaMP was likely to be present at slightly different levels in each animal. This problem would not likely be solved by using 405 nm excitation to separately collect a baseline as previously outlined (Lerner et al., 2015) because this autofluorescence signal would also not scale with GCaMP infection size. We added sentences in the Discussion (“Technical considerations”).

“Most importantly, calcium transients may not enhance spike counts, because of autofluorescence, bleaching, motion artifacts and inevitable normalization. Although we only applied baseline normalization in this study, additional methods using activity-independent wavelength of excitation (Kudo et al., 1992; Lerner et al., 2015) or examination of emission spectrum (Cui et al., 2013) may improve fidelity, especially in freely moving animals.”

In order to avoid drift in signals due to bleaching, we minimized laser power for each mouse (approximately 0.25 mW, subject to slight fluctuations in power between days) and ensured that our recording was limited to approximately 45 min per day. Additionally, although we trained mice every day, we recorded only at time points during training (Figure 3). We found that responses to free water did not change significantly within sessions in VS dopamine axons (Figure 2—figure supplement 2A) or TS dopamine axons (Figure 2—figure supplement 2B). We added this to our Results section as well (Results “Excitation to novel cues in TS dopamine”).

“To determine whether changes in signal intensity observed within a session were likely to have been related to bleaching, we compared responses to free water and familiar odors over the course of sessions (Figure 2—figure supplement 2). We found that both VS dopamine responses (Figure 2—figure supplement 2A) and TS dopamine responses (Figure 2—figure supplement 2B) to free water remained constant within sessions. Similarly, TS dopamine responses to familiar odors predicting no reward remained constant as well (Figure 2—figure supplement 2C).”

To determine whether our signals were likely to have been contaminated by movement-related artifacts, we collected video of mice (n=6) performing the task and analyzed body movement (Figure 2—figure supplement 3). We quantified total body movement from all regions of the mouse, and found that they were not time-locked with novel (Figure 2—figure supplement 3C) or familiar (Figure 2—figure supplement 3B) odor presentation. However, we did find that there was body movement time-locked to the delivery of odors predicting reward (Figure 2—figure supplement 3A). This is not surprising, because mice often perform a type of approach behavior (taking a few steps) when they expect reward, despite being head-fixed. We added a paragraph regarding motion artifacts to the results (Results “Excitation to novel cues in TS dopamine”). Moreover, the tdTomato traces in Figure 1F show that in the case of free water trials, when mice start licking at water delivery, no change in signal was observed in the tdTomato signal. Taken together, our analysis does not support the idea that the signals we discuss in the manuscript are contaminated by movement artifacts (gross body movements or licking).

“To determine whether the signals we observed could have been caused by a movement-related artifact, we used video analysis to quantify the total body movements of mice performing the task (Figure 2—figure supplement 3). We found that mice did not show gross body movements in response to odors predicting no outcome (Figure 2—figure supplement 3B) or novel odors (Figure 2—figure supplement 3C), although they performed a stereotypical approach behavior in response to odors predicting reward (Figure 2—figure supplement 3A).”

3) The description of TS signals beyond novelty and their relationship to RPE is confusing. Would it be correct to simplify the responses as the absolute value of RPE plus a novelty bonus? Either way, please clarify.

There is a controversy about the excitation of dopamine neurons in response to aversive stimuli, and we therefore tried to avoid oversimplifying our results.

Hikosaka and colleagues (Matsumoto and Hikosaka, 2009) found that some dopamine neurons are activated by both aversive and appetitive stimuli. They interpreted this result as dopamine neurons encoding “motivational salience” (the absolute value of RPE). In this definition, it is the “aversiveness” or “negative value” of the stimulus that makes these stimuli motivationally salient. However, Fiorillo (Fiorillo et al., 2013) acquired a very similar dataset, but proposed that the excitation of dopamine neurons was better explained by stimulus intensity. He argued that this excitation was not correlated with aversion, which he estimated based on choice preferences. Unfortunately, our task was not designed to distinguish between these possibilities. Nonetheless, our data do not perfectly match with either of the ideas. First, we observed similar amplitudes of excitation in response to water and a very quiet tone in thirsty mice. For comparison to other studies, a 90 dB tone did not affect the utility function in thirsty monkeys (Fiorillo et al., 2013), and we used a 55 dB tone with our thirsty mice. Second, we also observed large responses to novel cues. We think that it is important to distinguish these TS dopamine axon signals from those proposed in previous studies. We therefore propose that signals in TS are general salience prediction error signals (containing both salience and novelty) rather than conceptualizing them as the absolute value of salience plus a separate bonus for novelty.

We cannot formally exclude other possibilities regarding TS dopamine axon signals. For example, if novel odors and mild tones are both aversive, TS dopamine axons could be signaling the absolute value of RPE. Or, as a reviewer suggested, responses to novel cues might be differentially conceptualized as a “bonus” added to the signal rather than an aspect of salience. We rephrased parts of our Results and Discussion to try to make our proposals clearer (Discussion “Relation to previous studies”).

“A previous study proposed that putative dopamine neurons in the lateral SNc of monkeys encode ‘motivational salience’, which is the absolute value of positive or negative ‘value’ (Matsumoto and Hikosaka, 2009). On the other hand, another study proposed that excitation of dopamine neurons in response to non-rewarding stimuli encodes stimulus intensity, regardless of value (Fiorillo et al., 2013). In the present study, the excitation of TS dopamine elicited by various neutral stimuli suggested that the responses in TS could be related to more general salience rather than motivational salience, although we cannot rule out the possibility that the tone and odor predicting nothing had positive or negative motivational values. On the other hand, the fact that signals encoded by TS dopamine are modulated by prediction suggests that they are not encoding pure physical salience (i.e. stimulus intensity). Instead, TS dopamine appears to encode general stimulus salience prediction error, which includes prediction-dependent suppression and prediction. The novelty responses we observed may be the extreme case of salience prediction error, causing large excitation because of minimum prediction, rather than an exception.”

4) It was not convincing that the neutral tone stimuli (Figure 6) are perceived as truly neutral. Why then do they provoke negative inflections in the RPE-type signal observed in VS? Perhaps video tracking of startle response would help to exclude this possibility.

To address this concern, we have performed behavioral measurements of responses to tones of various sizes. We found that there was not a large behavioral response to the size of tone that we used. However, we cannot say with certainty that the tone was not slightly aversive, so we changed our language in the text from “neutral tone” to “mild tone” or “quiet tone”.

To try to minimize the aversiveness of the tone, we used a very small 55 dB tone for this experiment. For reference, ambient noise in the room where recordings were performed was approximately 50 dB. Startle responses have been observed in response to very loud tones, in experiments with background tones that were louder than the tone that we chose (105 dB startle stimulus with 65-80 dB background with monkeys, Schneider et al., 2013, 139 dB with 70 dB background with rats, Hoffman and Searle, 1967, 110 dB with mice, Ouagazzal et al., 2001, 118 dB startle stimulus with 72 dB background with mice, Sallinen et al., 1998). It should also be noted that startle experiments could also actually have a component of perceptual salience (or, “surprise”), rather than pure aversion.

Because we used a tone that was not much louder than the background noise in the room, and because mice became familiar with this tone, we do not think that it was perceived as aversive. To test this, we measured the behavioral responses of mice when presented with tones of different volumes: 55 dB to 105 dB (Figure 6—figure supplement 1). We found that mice did not generally respond to smaller tones (Figure 6—figure supplement 1A), and a louder tone was required to cause mice to freeze or remain motionless (Figure 6—figure supplement 1B). We plotted the relationship between tone intensity and motion, and found that freezing occurred in response to tones larger than approximately 80 dB in our condition (Figure 6—figure supplement 1C). Again, for comparison, the tone that we used in experiments was only 55 dB – and mice did not tend to freeze in response to this tone.

Fiorillo examined the effect of a louder tone (90 dB) on the utility function in a water-reward-choice tasks, and showed that their tone did not cause aversion in monkeys (Fiorillo et al., 2013). This means that the sound did not bias the monkey’s choice such that it would prefer an equally valuable option less often to prevent the sound from being played. Based on these (admittedly indirect) pieces of evidence, we suggest that signals in TS are likely to encode general salience rather than aversion or the absolute value of “value”. However, we cannot formally exclude the possibility that the 55 dB tone we chose was aversive to mice. Therefore, we rephrased the related sections of the paper. Specifically, we replaced “neutral” to “non-rewarding” in the Results and added a discussion on the behavioral responses we observed to this tone (Results section). Please also see Essential Revision #3.

“Responses to rewarding and non-rewarding stimuli in VS and TS dopamine

After several weeks of this training, in some sessions, trials with odors predicting either a mild tone (55 dB) or an air puff were interleaved in addition to trials with water and trials with no outcome. We chose a very mild tone with a similar intensity to the background noise in the room to try to minimize the aversiveness of this stimulus. To estimate the aversiveness of auditory stimuli, we measured the behavioral responses to tones of different volumes in a different set of mice (Figure 6—figure supplement 1). We found that quiet tones did not cause freezing.”

The question of why we observed decreases in VS dopamine in response to tone is interesting, and likely to be related. Our tasks include a high probability of reward (~50% of trials are rewarded). In such high reward contexts, neutral stimuli might be interpreted as a signal for “no reward” because they predict that reward will not come in the very near future (Fiorillo, 2013, Matsumoto et al., 2016). Because of this, we have to be careful when interpreting inhibition in VS caused by tone-predicting odor cues or by tones themselves. Notably, familiar odors predicting no outcome also caused a negative inflection in the RPE-type signal that we observed in VS that was similar in size (Figure 6).

5) The authors should not discuss the signals as "dopamine signals" as they are only looking at the calcium transients within dopamine neurons that may co-release glutamate. They should instead refer to these signals as "signals from VTA DAT+ neurons in the VS or TS" or something similar, and separately discuss what this can reflect.

We agree with the reviewer. We changed our wording in many parts of the paper (Abstract, Introduction and other places) to make it clear that the signals we observed are due to increases in calcium in the axons of midbrain dopamine neurons (DAT+) found in either VS or TS. We added a detailed explanation of our nomenclature in the Introduction and at the beginning of the Results section and added some discussion of the meaning of these axon signals in the Discussion section (Introduction, Results “Recording activity from dopamine axons in the striatum”, Discussion “Technical considerations”).

“We will call the bulk calcium signal that we observed from the axons of DAT+ midbrain dopamine neurons in the striatum “VS dopamine” and “TS dopamine” in the following sections”.

“We used optical fiber fluorometry (fiber photometry) to record bulk calcium signals from the axons of midbrain dopamine neurons projecting to several regions of the striatum (Kim et al., 2016; Kudo et al., 1992; Parker et al., 2016).”

“In this study, we recorded the population activity of dopamine axons in the striatum using GCaMP6m (Chen et al., 2013; Akerboom et al., 2012; Kim et al., 2016; Parker et al., 2016).”

6) It would be important to see histology showing representative expression of GCaMP-expressing fibers in each of the regions sampled

We added images of GCAMP-expressing fibers throughout the striatum in Figure 1—figure supplement 3. Specifically, we showed representative images from an animal with no virus injection (first row), VTA virus injection (second row), SNC virus injection (third row), and VTA+SNC virus injection (fourth row). All data in the paper was collected from animals with injections into both VTA and SNC.

7) How much of the photometry signal is due to terminals versus axons of passage?

It is difficult to exclude the possibility of having collected signals from passing axons, we estimated that most of the signals we observed come from axon terminals based on four arguments to this effect.

First, the striatum is the main target of dopamine neurons and projections to other areas are much weaker than projection to the striatum (Figure 1—figure supplement 3). Second, tracing studies of single dopamine neurons have shown that the striatum is the termination point of dopamine neurons, such that single dopamine neurons have a massive arborization in the striatum and that dopamine neurons which project to other brain areas do not pass through the striatum, with the notable exception of dopamine neurons that project to the cortex (Aransay et al., 2015, Cebrian and Prensa, 2010, Matsuda et al., 2009, Gauthier et al., 1999). Although the cortex shows much weaker labeling of dopamine axons than other areas (Figure 1—figure supplement 3), axons passing under VS potentially contributed to the signals in VS (Aransay et al., 2015). The axon tract from the midbrain to the forebrain runs along the ventral part of the brain, so any contamination of our signal was much more likely in VS than TS. Third, dopamine neurons that project to TS and VS are largely segregated and we did not observe axons passing through the opposite targets. Notably, dopamine neurons that target a particularly part of the striatum seldom pass through other parts of the striatum (Aransay et al., 2015, Cebrian and Prensa, 2010, Matsuda et al., 2009, Gauthier et al., 1999). Fourth, calcium transients in passing axons are much weaker than in axon terminals and boutons (Koester and Sakmann, 2000, Llano et 10, 1997). We added these points to our discussion (Discussion “Technical considerations”).

“Dopamine axons passing through and below the ventral striatum to the cortex (Aransay et al., 2014) may have contributed to the signals in VS dopamine, although calcium transients in passing axons are smaller than in axon terminals and boutons (Koester and Sakmann, 2000; Llano et al., 1997).”

8) Regarding Figure 5: What happens if the authors sort the trials where there was licking versus no licking (in water trials) in terms of the response to the water delivery. No licking may indicate that the mouse did not recognize/detect/attend to the odor cue?

We sorted our data based on anticipatory licking (rather than time) and plotted responses to new water-predicting odors and water delivery in VS and TS. We added a figure (Figure 5—figure supplement 2). Particularly this figure makes it clear that VS dopamine cue responses do not develop until after behavioral learning occurs. Additionally, it shows that TS cue responses were present before changes in behavior (Figure 5—figure supplement 2). This is in agreement with our proposal regarding “CS effectiveness” and “US effectiveness” (discussed below). Trial-to-trial correlation was also observed in a previous study of VTA dopamine neurons (Matsumoto et al., 2016). These data may support the relationship between responses and task engagement and/or learning, but we do not have direct evidence that they are related to attention and/or perception.

9) It would be nice to see the anatomy: if the authors retrogradely label VTA DA neurons that project to VS versus TS, where do these TS neurons reside? What is the distribution? This would be helpful information in terms of joining this information with the work by Stephan Lammel on the arrangement of VTA DA neurons projecting to mPFC or NAc m shell or NAc l shell.

We compared 8 subpopulations of dopamine neurons, including ones projecting to mPFC, and published this data in our previous paper (Menegas et al., 2015). VS-projecting DA neurons are mostly found in the VTA and TS-projecting DA neurons are mostly found in the lateral part of the SNC. This segregation along the medial-lateral axis could roughly coincide with the segregation of axons in the striatum along the anterior-posterior axis. We referenced this in the Discussion (“Relation to previous studies”).

“In our previous study, dopamine neurons that project to TS were mainly observed in the lateral SNc of mice (Menegas et al., 2015). A previous study proposed that putative dopamine neurons in the lateral SNc of monkeys encode ‘motivational salience’, which is the absolute value of positive or negative ‘value’ (Matsumoto and Hikosaka, 2009).”

10) Did the authors control for the order of all of the manipulations? Were the order of manipulations randomized for both the trial types within A, B, C and the epochs (particularly in Figure 6)?

Trial types in Figure 6 were randomized. Reward trials and nothing trials were used in all the sessions. In some sessions, either air puff trials or tone trials were added in addition to reward trials and nothing trials. Within a session, all trial types were randomized. Reward sessions, reward and air puff sessions, and reward and tone sessions were also randomized. We apologize for having poorly described our task parameters. We added an explanation of trial types and the order of sessions (Results (“Responses to rewarding and non-rewarding stimuli in VS and DS dopamine”) and Materials and methods (“Behavior”)). We also added a new section “Experimental Timeline” in the Materials and methods.

“After several weeks of this training, in some sessions, trials with odors predicting either a mild tone (55 dB) or an air puff were interleaved in addition to trials with water and trials with no outcome”.

“All trial types were randomized in all of the sessions”.

“After introduction of tone and air puff, reward sessions, reward and air puff sessions, and reward and tone sessions were randomized.”

https://doi.org/10.7554/eLife.21886.029

Article and author information

Author details

  1. William Menegas

    1. Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, United States
    Contribution
    WM, Conceptualization, Data curation, Formal analysis, Writing—original draft, Writing—review and editing
    Competing interests
    No competing interests declared.
  2. Benedicte M Babayan

    1. Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, United States
    Contribution
    BMB, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared.
  3. Naoshige Uchida

    1. Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, United States
    Contribution
    NU, Supervision, Writing—review and editing
    Competing interests
    NU: Reviewing editor, eLife
    ORCID icon 0000-0002-5755-9409
  4. Mitsuko Watabe-Uchida

    1. Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, United States
    Contribution
    MW-U, Conceptualization, Supervision, Writing—original draft, Writing—review and editing
    For correspondence
    1. mitsuko@mcb.harvard.edu
    Competing interests
    No competing interests declared.
    ORCID icon 0000-0001-7864-754X

Funding

National Institute of Mental Health (R01MH095953)

  • Naoshige Uchida

Harvard Mind Brain and Behavior

  • Naoshige Uchida

National Institute of Mental Health (R01MH101207)

  • Naoshige Uchida

National Institute of Mental Health (R01MH110404)

  • Naoshige Uchida

Fondation pour la Recherche Médicale (SPE20150331860)

  • Benedicte M Babayan

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We would like to thank members of the Uchida lab for comments on the manuscript and helpful discussions. We thank S Matias, Z Mainen (Champalimaud Institute of Unknown) and C Burgess, M Andermann (Harvard Medical School) for advice on fiber fluorometry. Additionally, we would like to thank the staff of the Harvard Center for Biological Imaging, Edward Soucy (CBS neuro-engineering platform), and Louis Baum (Harvard physics department) for technical support. Finally, we would like to also thank Vivek Jayaraman, PhD, Rex A Kerr, PhD, Douglas S Kim, PhD, Loren L Looger, PhD, and Karel Svoboda, PhD from the GENIE Project, Janelia Farm Research Campus, Howard Hughes Medical Institute for AAV-GCaMP6m. This work was supported by National Institutes of Health grants R01MH095953 (NU), R01MH101207 (NU), R01MH110404 (NU), Harvard Mind Brain and Behavior faculty grant (NU) and Fondation pour la Recherche Medicale grant SPE20150331860 (BB).

Ethics

Animal experimentation: This study was performed in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. All of the animals were handled according to approved Harvard animal care and use committee (IACUC) protocols (#26-03) of Harvard University. All surgery was performed under isofluorane anesthesia, and every effort was made to minimize suffering.

Reviewing Editor

  1. Gary L Westbrook, Reviewing Editor, Vollum Institute, United States

Publication history

  1. Received: September 27, 2016
  2. Accepted: January 4, 2017
  3. Accepted Manuscript published: January 5, 2017 (version 1)
  4. Version of Record published: January 27, 2017 (version 2)

Copyright

© 2017, Menegas et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 2,149
    Page views
  • 879
    Downloads
  • 2
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Comments

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Developmental Biology and Stem Cells
    Tang Hai et al
    Research Article