Introduction

The prefrontal cortex (PFC) contributes to a variety of higher cognitive functions, achieving the flexible control of behaviors that enables animals to adapt to a changing environment (Miller and Cohen, 2001; Fuster, 2015). The PFC is involved, for instance, in stimulus selection, working memory, rule switching, and decision making (Miller and Wallis, 2009). PFC processing and circuits are highly sensitive to neuromodulators, including dopamine (Seamans and Yang, 2004; Arnsten et al., 2012). Indeed, studies using pharmacological or optogenetic manipulation of dopamine signaling have suggested roles of dopamine in gating sensory signals (Popescu et al., 2016; Vander Weele et al., 2018), maintaining working memory (Sawaguchi and Goldman-Rakic, 1994), and relaying decisions to motor structures (Ott et al., 2014). Consistently, dysregulation of dopamine signaling in the PFC may underlie a wide array of neuropsychiatric disorders, including schizophrenia, depression, attention-deficit/hyperactivity disorder, and post-traumatic stress disorder (Okubo et al., 1997; Lindstrom et al., 1999; Granon et al., 2000; Arnsten and Dudley, 2005; Howes and Kapur, 2009; Hoexter et al., 2012; Grace, 2016).

The PFC receives dopaminergic inputs from a subset of dopamine neurons in the midbrain, but the information encoded by these neurons in vivo remains unclear. Decades of investigations have revealed that midbrain dopamine neurons in the ventral tegmental area (VTA) generally encode reward prediction errors (Schultz et al., 1997): the neurons increase their firing to unexpected reward delivery and shift their response to cues that precede reward delivery after instrumental learning or classical conditioning (Rescorla and Wagner, 1972; Sutton and Barto, 1981). However, several studies have reported that a subpopulation of dopamine neurons show phasic responses to aversive stimuli as a part of salience signaling (Chiodo et al., 1980; Mantz et al., 1989; Guarraci and Kapp, 1999; Matsumoto and Hikosaka, 2009), implying that midbrain dopamine neurons may not be functionally homogeneous. Depending on the projection target, dopamine neurons can have distinct molecular, anatomical, and electrophysiological features (Lammel et al., 2008; Poulin et al., 2016). In particular, dopamine neurons that project to the PFC show unique genetic profiles (Poulin et al., 2016), and optogenetic stimulation of these neurons does not reinforce specific actions (Popescu et al., 2016; Ellwood et al., 2017; Vander Weele et al., 2018). In addition, these neurons might respond not only to rewarding stimuli but also to aversive stimuli. Microdialysis, amperometry, and voltammetry measurements in the PFC have demonstrated an increase of dopamine in response to appetitive stimuli (Hernandez and Hoebel, 1990; Ahn and Phillips, 1999; St Onge et al., 2012), aversive stimuli (Thierry et al., 1976; Abercrombie et al., 1989; Finlay et al., 1995; Vander Weele et al., 2018), or both (Bassareo et al., 2002). Similarly, measurements of the bulk calcium activity of mesocortical dopaminergic fibers have shown responses to appetitive (Ellwood et al., 2017) and aversive (Kim et al., 2016) stimuli. 
This apparent discrepancy is difficult to reconcile because none of these approaches could investigate the activity of individual dopamine neurons. Moreover, most previous studies evaluated the effects of either a rewarding or an aversive stimulus, rather than both. Consequently, it remains unknown whether the same or different mesocortical dopamine neurons respond to behaviorally opposing stimuli. It is also not known how these dopamine neurons change their response during classical conditioning, where rewarding or aversive stimuli are paired with conditioned cues.

To address these knowledge gaps, we developed an approach for imaging individual dopamine axons based on in vivo two-photon imaging with a microprism (Low et al., 2014). We optimized the microprism design and imaged dopamine axon terminals expressing genetically encoded calcium sensors in the mouse medial PFC (mPFC). We then head-fixed the mice to give rewards or aversive stimuli and trained the mice to associate the stimuli with preceding auditory cues (classical conditioning). During classical conditioning, we tracked the activity of dopamine axons over a period of days. We found that the dopamine axons showed diverse preferences for unconditioned (rewarding or aversive) stimuli. During classical conditioning, activity preferences for conditioned auditory cues were enhanced only for aversive-preferring axons. Moreover, in aversive-preferring axons, a machine learning-based analysis revealed that cue activity became more selective when the behavior of animals was judged as correct. We conclude that mesocortical dopamine axon activity is involved in aversive processing that is modulated by both classical conditioning across days and trial-by-trial judgements of conditioned cues within a day.

Results

Two-photon imaging shows dopaminergic axons in the mPFC of awake mice

To investigate the signal sent by dopamine neurons to the mPFC in mice, we developed an approach based on two-photon imaging using a microprism (Andermann et al., 2013; Low et al., 2014). We first expressed an axon-targeted (Broussard et al., 2018), genetically encoded calcium sensor, axon-jGCaMP8m, in dopamine neurons in the VTA. We injected Cre-dependent AAV into the midbrain regions of transgenic mice (DAT-Cre), which express Cre-recombinase in dopamine neurons (Kim et al., 2016) (Figure 1A, see Methods). After 2–3 weeks, using sectioned slices, we confirmed that GCaMP expression in cell bodies in the VTA (and substantia nigra pars compacta [SNc]) (Figure 1C) coincides with the expression of tyrosine hydroxylase, an endogenous marker for dopamine neurons (Figure 1D). Dopamine neurons in the VTA are known to project sparsely to the mPFC, including the superficial layers (Vander Weele et al., 2018) (Figure 1–figure supplement 1), but the mPFC itself is located deep in the medial bank (Figure 1B), rendering two-photon imaging of GCaMP (which is typically excited at 920–980 nm) infeasible. Therefore, we inserted a microprism into the longitudinal fissure between the two medial banks (two hemispheres) to optically access the mPFC (Figure 1E–F). The right-angle microprism bends the optical axis within the brain, providing optical access to the fissure wall and the mPFC surface (Low et al., 2014). We optimized the microprism assembly (Figure 1B, Figure 1–figure supplement 2C) in order to reach 2 mm in depth from the dorsal surface (Figure 1F). The assembly incorporated double-layer glass at the top (Komiyama et al., 2010), stabilizing the brain from both the medial and dorsal sides, which significantly reduced the movement of the brain (Figure 1–figure supplement 2E-H). Through the microprism, we could visualize GCaMP-expressing axons in the superficial layers of the mPFC in live animals (Figure 1G, Figure 1–figure supplement 3). 
The GCaMP signal can indicate the calcium influx into axons and terminals, which is triggered by axonal action potentials (Petreanu et al., 2012; Howe and Dombeck, 2016; Lutas et al., 2019), thereby providing a measure of the activity of dopamine neurons that send projections to the mPFC. In contrast, when we inserted a gradient refractive index (GRIN) lens into the mPFC (Kamigaki and Dan, 2017), we could not reliably see GCaMP-expressing dopamine axons, unlike the case for dopamine axons in the basal amygdala (Lutas et al., 2019). This difference might indicate that the dopamine axons in the mPFC have weaker signals requiring a lens with a larger numerical aperture (GRIN lens: NA 0.5 vs. Nikon objective lens: NA 0.8) or that these axons are less resilient to mechanical suction in close vicinity.

Two-photon imaging of dopaminergic axons projecting to the mPFC.

(A, B) Experimental design. The activity of midbrain dopamine neurons projecting to the mPFC was measured by two-photon calcium imaging of their axons. The axons were accessed through a microprism that bends the optical axis inside the brain (gray arrows in B). (C) GCaMP was expressed virally in dopamine neurons in DAT-Cre transgenic mice. AAV-axon-DIO-jGCaMP8m was injected into the VTA (coronal section). (D) jGCaMP8m-expressing neurons were positive for tyrosine hydroxylase (TH), a marker for dopamine neurons. (E, F) Dorsal view of a mouse head implanted with a microprism assembly. The microprism was 1 x 2 mm. (G) A sample in vivo image of jGCaMP8m-expressing axons.

Dopaminergic axons in the mPFC have diverse responses to rewarding and aversive stimuli

Using our imaging approach, we first investigated whether individual dopamine axons respond to unexpected rewards, one of the most well-documented features of dopamine neurons (Schultz et al., 1997). As a reward, we delivered drops of water through a spout with random timing to water-deprived mice (Figure 2A). In response to the reward delivery, the mice licked the water spout, and we filmed this behavior to quantify the tongue position (Figure 2B and Figure 2–figure supplement 1). Upon the delivery of a water drop, the mice started licking (licking latency: 0.538 ± 0.065 s, n = 8 animals) (Otis et al., 2017). Two-photon calcium imaging revealed that the water reward evoked brief calcium transients in many dopamine axons (40.1% of axons in 8 animals, example in Figure 2G). The brief calcium response to the reward is consistent with increased phasic firing in dopamine neurons at the time of unexpected reward, as previously reported in many studies in primates (Schultz et al., 1997) and rodents (Engelhard et al., 2019; Amo et al., 2022).

Dopaminergic axonal response to unexpected rewarding or aversive stimuli.

(A) Experimental design. Mice were placed under a two-photon microscope on a linear treadmill and were given unexpected rewarding (water drops) or aversive (electrical shock to the tail) stimuli. The mouse’s face was filmed with an infrared web camera to track the tongue position. (B) Example behavioral response to rewarding (top, tongue position) or aversive (bottom, treadmill speed) stimuli. (C) Average behavioral response on a single day for the same animal shown in B. (D) Representative image of dopamine axons labeled with jGCaMP8m. Scale bar: 50 μm. (E, F) Heatmaps of responses to rewarding (E) or aversive (F) stimuli for the same imaging plane shown in D. (G–I) Calcium response to rewarding (left) and aversive (right) stimuli for three sample axons. (J) Comparison between reward (x-axis) and aversive (y-axis) responses for dopamine axons (n = 162). Statistically significant axons were labeled in either cyan (reward-preferring axons, n = 25) or magenta (aversive-preferring axons, n = 75). (K) Histogram of the polar angle of the scatter plot in J (n = 162). The solid line indicates probability density, estimated by kernel smoothing. A value of 0° represents axons that solely prefer rewards, whereas 90° represents those that solely prefer aversive stimuli.

In contrast to conventional midbrain dopamine neurons, mPFC dopamine axons are proposed to play a key role in aversive processing (Weele et al., 2019). To investigate the calcium response to an unexpected aversive stimulus, we delivered mild electrical shocks to the tail of the mice (Kim et al., 2016; Patriarchi et al., 2018; Lutas et al., 2019) (Figure 2A) that were randomly interleaved with reward delivery (one shock for every seven rewards on average). The mild shock evoked calcium transients in many dopamine projections (Figure 2F, H, I), together with locomotion (Figure 2B). These transients could simply reflect locomotion initiation, similar to dopamine axons in the dorsal striatum (Howe and Dombeck, 2016). To explore this possibility, we investigated whether locomotion without aversive stimuli is accompanied by increased calcium activity. We found no significant calcium increase at the time point of spontaneous locomotion initiation (Figure 2–figure supplement 2). Therefore, unlike the axons projecting from the SNc to the dorsal striatum (Howe and Dombeck, 2016; Ma et al., 2022), mPFC dopamine axons do not encode the initiation of movement; rather, these axons are more involved in aversive processing.

Previous studies have demonstrated that the overall dopamine release at the mPFC or the summed activity of mPFC dopamine axons exhibits a strong response to aversive stimuli (e.g., tail shock), but little to rewards. We evaluated the preference of individual axons for rewarding and aversive signals at single-axon resolution by computing the polar angle for individual axons on a Cartesian representation of reward and shock activity (Figure 2J, K). In the polar representation, an angle of 0° indicates a strong preference for reward information, whereas 90° indicates a preference for aversive information. The polar angle distribution revealed that many axons preferred aversive stimuli, although some preferred reward. As a result, the probability density, estimated by kernel smoothing, showed a bimodal distribution (Figure 2K, solid line) with a trough at around 45–50°. In addition, axons showing significant responses were categorized into two clusters based on k-means clustering (Figure 2J), the separation of which coincided roughly with 45–50° (Figure 2K). Hereafter, we refer to these clusters as aversive- or reward-preferring axons (colored magenta or cyan, respectively). We could not find any anatomical patterns for aversive- or reward-preferring axons. These axons were present in either half of the prism view (i.e., anterior or posterior; ventral or dorsal), implying no obvious functional projection patterns within the mPFC. We note that the strength of preference could be quantitatively changed. Indeed, we found that the reward response to 10 μL nearly reached saturation, but the aversive response could be further increased at a stronger current (Figure 2–figure supplement 3). Therefore, the exclusive preference for aversive stimuli observed in some studies might be explained by a smaller reward volume and/or stronger aversive stimuli. 
Altogether, our two-photon imaging revealed, for the first time, that individual axons show diverse preferences for rewarding and aversive stimuli and that the population-level preference is biased toward aversive stimuli.
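The polar-angle measure and the kernel-smoothed density described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names, the response amplitudes, and the 10° bandwidth are all hypothetical, and the sketch assumes non-negative response amplitudes (the text does not state how negative responses were handled, and atan2 would place them outside 0–90°).

```python
import math

def polar_angle_deg(reward_resp, aversive_resp):
    """Angle of the point (reward, aversive) measured from the reward axis.

    0 deg = responds only to reward, 90 deg = responds only to the aversive
    stimulus. Assumes both amplitudes are non-negative (an assumption).
    """
    return math.degrees(math.atan2(aversive_resp, reward_resp))

def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical (reward, aversive) response amplitudes for seven axons.
responses = [
    (1.0, 0.1), (0.9, 0.2), (0.8, 0.15),             # reward-leaning
    (0.1, 1.0), (0.2, 1.2), (0.15, 0.9), (0.1, 0.8),  # aversive-leaning
]
angles = [polar_angle_deg(r, a) for r, a in responses]

# Bandwidth of 10 degrees is an arbitrary choice for illustration.
density = gaussian_kde(angles, bandwidth=10.0)
```

With data like these, the estimated density is low near 45° and higher near the two groups of angles, which is the kind of bimodal shape the text describes.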

Aversive cue processing is enhanced in aversive-preferring axons during classical conditioning

How do the reward and aversive activities of individual axons change while animals are learning that the reward and aversive events are preceded and predicted by sensory cues? This paradigm, known as classical conditioning, is a key framework for capturing learning-related changes in midbrain dopamine neurons (Sutton and Barto, 1981; Schultz et al., 1997) and mPFC neurons (Takehara-Nishiuchi and McNaughton, 2008; Otis et al., 2017). We presented mice with a 2-s pure tone as a conditioned stimulus (CSreward and CSaversive: 9 and 13 kHz, or 13 and 9 kHz), and then, after a 1-s delay, we presented either a rewarding or an aversive unconditioned stimulus (Figure 3A). Through this conditioning process, the mice learned the contingency between the conditioned stimulus (tone) and the outcome (reward or electrical shock), which was reflected in changes of their behavior (Figure 3B, C). To quantify such behavioral changes during learning, we separated the learning into three stages plus the first day (Figure 3D–G). On the first day, the animals licked the water spout only after the reward was delivered (first day; Figure 3B). However, during the middle and late stages, animals gradually showed licking behavior even before the reward delivery, representing an anticipation of reward (Figure 3D). We observed this anticipatory licking more frequently after CSreward than CSaversive (Figure 3D vs. 3E, p = 0.031 for the middle phase, p = 0.031 for the late phase, Wilcoxon signed-rank test, n = 6 animals), indicating that the animals behaviorally learned to discriminate the two conditioned stimulus tones. Similarly, running before the delivery of the unconditioned stimulus was more frequent after CSaversive than CSreward at the late phase (Figure 3F vs. 3G, p = 0.031, Wilcoxon signed-rank test, n = 6 animals), again indicating that the two conditioned auditory cues were behaviorally discriminated. 
Therefore, as in previous studies, anticipatory licking (Otis et al., 2017) and anticipatory running (Lutas et al., 2019) can capture whether animals behaviorally discriminate conditioned cues in classical conditioning.
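As a worked illustration of the statistics above: if an exact test is assumed, p = 0.031 with n = 6 is the smallest two-sided p-value a Wilcoxon signed-rank test can return for six pairs (2/2^6 = 0.03125), reached when all six paired differences share the same sign. The sketch below enumerates the exact null distribution; the function name and the per-animal values are hypothetical, and the code assumes no ties and no zero differences.

```python
from itertools import product

def wilcoxon_exact_p(x, y):
    """Two-sided exact Wilcoxon signed-rank p-value for paired samples.

    Assumes no zero differences and no tied absolute differences.
    """
    diffs = [a - b for a, b in zip(x, y)]
    if any(d == 0 for d in diffs):
        raise ValueError("zero differences are not handled in this sketch")
    abs_sorted = sorted(abs(d) for d in diffs)
    if len(set(abs_sorted)) != len(abs_sorted):
        raise ValueError("tied differences are not handled in this sketch")

    n = len(diffs)
    rank = {v: i + 1 for i, v in enumerate(abs_sorted)}
    total = n * (n + 1) // 2
    w_plus = sum(rank[abs(d)] for d in diffs if d > 0)
    w_obs = min(w_plus, total - w_plus)

    # Enumerate all 2^n sign assignments of the ranks under the null.
    as_extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(range(1, n + 1), signs) if s)
        if min(w, total - w) <= w_obs:
            as_extreme += 1
    return as_extreme / 2 ** n

# Hypothetical anticipatory-licking rates for six animals (not real data).
after_cs_reward = [0.62, 0.55, 0.71, 0.48, 0.66, 0.59]
after_cs_aversive = [0.10, 0.22, 0.15, 0.31, 0.05, 0.27]
p_value = wilcoxon_exact_p(after_cs_reward, after_cs_aversive)
```

Because every animal in this made-up example licks more after CSreward, `p_value` equals 2/64. A smaller p-value would not be attainable with six animals under an exact test.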

Classical conditioning induced behavioral and neural changes.

(A) Experimental design. Auditory cues were presented before unconditioned stimuli (rewarding or aversive stimuli). (B, C) Behavioral changes in one example animal across 12 days (B: licking, C: running). In the reward condition, the animal gradually developed anticipatory licking (B, left). In the aversive condition, the animal usually ran after the shock delivery, but sometimes even before the delivery (C, right). Licking and running traces were normalized to the instantaneous maximum values for this animal and then averaged over a single day. (D-G) Anticipatory behavior during classical learning (n = 5 animals). In the late stage of learning, anticipatory licking was primarily observed in the reward condition (D) but not in the aversive condition (E). Anticipatory running was seen more often in the aversive condition (G) than in the reward condition (F). (H) Activity change in one sample axon across 12 days. The axon was from the same animal shown in B, C. (I, J) Learning-induced changes in response to conditioned cues (I) and unconditioned stimuli (J) for aversive-preferring axons (magenta) and reward-preferring axons (cyan) together with non-significant axons (black). Aversive- and reward-preferring axons were defined before the start of classical training. The x-axis represents the reward condition, and the y-axis represents the aversive condition. n = 47 for aversive-preferring axons, n = 12 for reward-preferring axons. (K) Learning-induced change in the unconditioned response of aversive-preferring axons for the aversive condition (magenta dashed line) and reward condition (magenta dotted line) and that of reward-preferring axons for the aversive condition (cyan dashed line) and reward condition (cyan dotted line). The number of axons is the same as in I, J. (L) Similar to K, but for the polar angle of the scatter plot in J. The magenta line represents aversive-preferring axons and the cyan line represents reward-preferring axons. 
(M) Similar to K, but for the conditioned response. (N) Similar to L, but for the conditioned response.

Through the classical conditioning paradigm, our long-term two-photon imaging revealed that aversive-preferring dopamine axons maintained their preference for the unconditioned response but enhanced their selectivity for the aversive cue activity (Figure 3H and Figure 3–figure supplement 1). We evaluated the activity change at the time of the unconditioned stimuli throughout the learning process for aversive- and reward-preferring axons (Figure 3J, magenta and cyan, respectively). On the first day, aversive-preferring dopamine axons showed stronger activity for the aversive stimuli (Figure 3J, top, Figure 3–figure supplement 1A), similar to the response without classical conditioning (Figure 2H, I). Across learning, the activity for the rewarding and aversive unconditioned stimuli gradually decreased (Figure 3K, magenta, p = 0.002 for rewarding stimuli, p = 0.007 for aversive stimuli, n = 47; Wilcoxon signed-rank test, comparison between the first day and the last phase), maintaining similar preferences for rewarding and aversive stimuli (Figure 3L, magenta, p = 0.24, n = 47; circular statistics, comparison between the first day and the last phase). Similarly, reward-preferring axons maintained their preferences over the course of classical conditioning (Figure 3L, cyan, p = 0.77, n = 12; circular statistics, Figure 3–figure supplement 1B).

Next, we quantified the activity change at the time of the conditioned auditory cues (CSreward and CSaversive, Figure 3I, Figure 3–figure supplement 1) throughout the learning process. On the first day, aversive-preferring axons already showed a transient response to conditioned cues, implying that the conditioned stimulus response was not acquired through learning (Figure 3M, Figure 3–figure supplement 1, first day). In addition, the conditioned stimulus response showed no particular reward/aversive preference (Figure 3N, first day, for aversive-preferring axons, p = 0.66, n = 47), indicating that aversive-preferring axons did not distinguish the two conditioned cues. However, at the later stages of learning, the conditioned stimulus response was enhanced for CSaversive in aversive-preferring axons (Figure 3M, magenta, p < 0.0001, n = 47; Wilcoxon signed-rank test, comparison between the first day and the last phase) and slightly attenuated for CSreward (p < 0.001), resulting in a stronger preference for aversive processing (late stage in Figure 3N, magenta, p < 0.001). In contrast, for reward-preferring axons, the conditioned stimulus response increased both for CSaversive (Figure 3M, cyan, non-significantly, p = 0.09, n = 12, Wilcoxon signed-rank test) and for CSreward (significantly, p < 0.007), resulting in an unchanged preference (Figure 3N, cyan, p = 0.77).

We also tested whether the dopamine axons showed suppressed activity when the predicted reward was omitted, one of the major features of reward prediction error coding (Schultz et al., 1997; Engelhard et al., 2019; Amo et al., 2022). Such activity suppression has been detected with GCaMP6m at cell bodies of dopamine neurons (Engelhard et al., 2019) as lowered activity at 0–4 s after the delivery of reward. Therefore, we included one condition for an unexpected reward omission on the last day of classical conditioning (Figure 3–figure supplement 2). We found that upon the reward omission, the reward-preferring dopamine axons did not show activity suppression, indicating that the mPFC dopamine axons do not respond to reward omission. Taken together, our two-photon imaging revealed that only a minority of mPFC dopamine axons encode reward activity (reward-preferring axons), and that these axons are not involved in reward prediction error in a classical learning paradigm. In contrast, the majority of dopamine axons are strongly involved in aversive processing (aversive-preferring axons), and the preference for aversive processing is enhanced by the conditioned cue through classical learning.

Dopamine axons show enhanced selectivity of cue activity in trials with correct discrimination

In the classical conditioning paradigm, an enhanced preference of aversive-preferring dopamine axons for aversive cues (Figure 3N) was accompanied by improved behavioral discrimination of the two conditioned cues (Figure 3D-G). Given this finding, we asked whether correct cue discrimination is also accompanied by an enhanced neural preference when animals make trial-by-trial judgements in discriminating cues, even after conditioning. To investigate trial-by-trial judgements of conditioned cues, we classified the trials into four groups (Figure 4A) based on correct or incorrect discriminating behavior. First, we focused on the presence or absence of anticipatory licking, as the licking behavior can discriminate the two conditioned stimulus tones, particularly at the late stage of learning (Figure 4–figure supplement 1, based on the random forest classifier). The first group exhibited licking after CSreward (correct reward discrimination), the second group exhibited no licking after CSreward (incorrect reward discrimination), the third group displayed no licking after CSaversive (correct aversive discrimination), and the fourth group displayed licking after CSaversive (incorrect aversive discrimination). The classification is invalid when animals make random guesses (discrimination of 50%), so we focused on results from the late stage of learning (Figure 4–figure supplement 1). We began by assessing the CS activity of aversive-preferring axons, as the activity difference between reward- and aversive-predictive cues was significant only in these axons.
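The four-group trial classification described above can be written as a small lookup on the cue identity and the presence of anticipatory licking. The sketch below is illustrative only: the function names, group labels, and trial log are hypothetical, and it takes the licking judgement as a given boolean rather than reproducing how licking was detected.

```python
def classify_trial(cue, licked):
    """Assign a trial to one of the four groups described in the text.

    cue:    "reward" or "aversive" (which conditioned tone was played).
    licked: True if anticipatory licking occurred before the outcome.
    """
    if cue == "reward":
        return "correct_reward" if licked else "incorrect_reward"
    if cue == "aversive":
        return "incorrect_aversive" if licked else "correct_aversive"
    raise ValueError(f"unknown cue: {cue!r}")

def error_rate(trials, cue):
    """Fraction of trials with the given cue that fall in an incorrect group."""
    groups = [classify_trial(c, l) for c, l in trials if c == cue]
    if not groups:
        raise ValueError(f"no trials with cue {cue!r}")
    return sum(g.startswith("incorrect") for g in groups) / len(groups)

# Hypothetical trial log: (cue, anticipatory licking observed?)
trials = [
    ("reward", True), ("reward", False), ("reward", True),
    ("aversive", False), ("aversive", False), ("aversive", True),
]
reward_errors = error_rate(trials, "reward")      # 1 of 3 reward trials
aversive_errors = error_rate(trials, "aversive")  # 1 of 3 aversive trials
```

Per-cue error rates computed this way correspond to the proportions of second-group and fourth-group trials among reward and aversive trials, respectively.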

Axonal cue response in trials with correct or incorrect cue discrimination.

(A) Classification of trials based on the behavioral response that occurred between the conditioned stimulus onset and unconditioned stimulus onset. Such behaviors include anticipatory licking or facial expressions. (B) Comparison of reward cue response between correct (x-axis) and incorrect (y-axis) trials based on anticipatory licking (magenta, aversive-preferring axons, n = 47; cyan, reward-preferring axons, n = 12). (C) Similar to B, but for aversive cue responses (magenta, aversive-preferring axons, n = 47; cyan, reward-preferring axons, n = 12). (D) Similar to B, but based on facial expressions (magenta, aversive-preferring axons, n = 47; cyan, reward-preferring axons, n = 12). (E) Similar to C, but based on facial expressions (magenta, aversive-preferring axons, n = 47; cyan, reward-preferring axons, n = 12).

An incorrect discrimination of the aversive cue is accompanied by the presence of anticipatory licking, resulting in error trials in our machine learning-based analysis. Such error trials (Figure 4A, fourth group) occurred in 1.2% of cases, showing a weaker aversive cue response than correct trials (third group) (p < 0.0001, n = 47 axons, Wilcoxon signed-rank test, Figure 4B, right). In contrast, the absence of anticipatory licking despite the reward-predictive cue comprises another type of error (second group, 49.5%). In such error trials, the reward cue response was not significantly different from that in the correct trials (first group) (p = 0.29, n = 47 axons, Wilcoxon signed-rank test, Figure 4B, left). Overall, the reward/aversive preference was stronger in correct discrimination trials than in incorrect trials (left vs. right in Figure 4C, magenta, 82.1° ± 1.1° vs. 74.3° ± 7.0°, p = 0.02, circular statistics).

In addition to anticipatory licking, the discrimination of predictive cues can be inferred by the facial expressions of mice. Facial expressions of mice can capture emotional states (Dolensek et al., 2020), and have been used to make binary judgements of the presence or absence of pain with the application of a deep neural network (Tuttle et al., 2018). In this study, we combined a pretrained deep neural network (ResNet3D) (Tran et al., 2018) and a machine learning classifier (random forest classifier) (Breiman, 2001) to make binary judgements of whether the animals experienced reward or aversive conditions, based on facial expressions during the cue presentation (Figure 4–figure supplement 2). The percentage of errors in discrimination during the 2-s cue presentation was 15.2% ± 0.9% (n = 5 animals), comparable to the result for anticipatory licking during the cue plus delay periods (25.3% ± 5.9%). However, error trials defined by facial expressions and anticipatory licking did not fully overlap; discrimination based on facial expressions resulted in a higher number of error trials in aversive conditions than discrimination based on licking (14.1% vs. 1.2%), and a lower number in reward conditions (18.0% vs. 49.5%). This discrepancy might be explained either by temporal discrepancy between the cue period (facial expression) and the delay period (most cases of anticipatory licking) or by the fact that anticipatory licking might represent reward uncertainty rather than reward expectation (Ogawa et al., 2013). Nonetheless, a trial-by-trial error analysis based on facial expressions (Figure 4D) revealed that the axonal activity to CSaversive was stronger in the correct trials in the aversive-preferring axons (Figure 4D, right, p < 0.03, n = 47 axons, Wilcoxon signed-rank test), consistent with the analysis based on anticipatory licking (Figure 4B, right). 
In addition, the facial expression error analysis demonstrated that the response to CSreward was significantly weaker in the correct trials (Figure 4D, left, p < 0.02, n = 47 axons, Wilcoxon signed-rank test). As a result, the reward/aversive preference was stronger in correct discrimination trials than in incorrect trials for the aversive-preferring axons (left vs. right in Figure 4E, magenta, 84.7° ± 3.0° vs. 78.6° ± 4.0°, p = 0.019, circular statistics).

In contrast to the aversive-preferring axons, correct discrimination had no effect on the CS activity of the reward-preferring axons. We found that the response to CSreward and CSaversive was not significantly different between correct and incorrect judgement trials (cyan points in Figure 4B and D, anticipatory licking: CSaversive p = 0.25, facial expressions: CSreward p = 0.08, CSaversive p = 0.20, n = 12, Wilcoxon signed-rank test) except for the CSreward response based on anticipatory licking (p < 0.001, n = 12). As a result, selectivity for CSreward/CSaversive was not improved in correct trials (anticipatory licking: p = 0.18, facial expressions: p = 0.39, Figure 4C, E). Therefore, correct/incorrect discrimination impacts aversive- and reward-preferring axons differentially.

Altogether, when animals exhibited the correct behavioral response (either anticipatory licking or facial expression), aversive-preferring but not reward-preferring axons showed a higher selectivity for aversive cue processing (Figure 4C, E).

Discussion

Dopamine projections to the mPFC are considered one of the key neuromodulatory inputs that enable flexibility in neural processing of the mPFC. However, due to technical difficulties in recording the dopamine neurons of specific projections, little is known about the signals conveyed by mPFC projections, including the basic question of whether individual projections signal reward prediction errors or aversive information. In this study, we optimized a two-photon imaging approach based on a microprism to image the calcium activity of dopaminergic axons in the mPFC. We uncovered differences in reward/aversive preferences in individual dopamine axons with an overall preference of the mPFC for aversive stimuli. In addition, we demonstrated that aversive-preferring axons responded equally to reward- and aversive-predictive conditioned cues in classical conditioning on the first day; however, this response became strongly biased toward the aversive conditioned cue through the conditioning. Finally, based on a trial-by-trial analysis of the animals’ behavior following reward- or aversive-predictive cues, we found that aversive-preferring axons exhibited higher selectivity for cues when the cues were successfully discriminated behaviorally.

Our study addresses a long-standing question of whether dopamine neurons send reward- or aversion-related signals to the mPFC (Weele et al., 2019; Verharen et al., 2020). The activity of mPFC-projecting dopamine neurons can be investigated extracellularly by incorporating antidromic stimulation (Mantz et al., 1989), but this approach is laborious. Thus, many studies have used more technically feasible but less direct approaches, particularly for awake animals, such as measuring dopamine release with microdialysis (Abercrombie et al., 1989; Bassareo et al., 2002), measuring catecholamine release with fast-scan cyclic voltammetry (recently combined with optogenetic and pharmacological identification; Vander Weele et al., 2018), and measuring bulk calcium activity from dopamine axons with fiber photometry (Kim et al., 2016; Ellwood et al., 2017). These studies have led to somewhat inconsistent conclusions: some studies have reported reward signals whereas others have reported aversive signals. Reconciling these findings is challenging, as different studies have used different approaches to assess the effects of either a rewarding or an aversive stimulus, but not both. Our two-photon imaging approach provided a unique opportunity to compare rewarding and aversive signals of individual projection neurons (i.e., individual axon projections). Our comparison revealed diversity in the dopamine axons, and that many dopamine axons responded to both rewarding and aversive stimuli, with a strong bias for aversive stimuli at the population level. However, this population bias was not fixed; rather, the bias depended on both the reward volume and the intensity of the aversive stimulus, possibly explaining the varying conclusions from different studies.

mPFC-projecting dopamine neurons increase their firing rate to opposite hedonic valences, rewarding and aversive stimuli, which cannot be simply explained by reward prediction error coding performed by conventional midbrain dopamine neurons. Instead, their firing might be captured by motivational salience signals (Bromberg-Martin et al., 2010). Similarly, the response to reward- and aversive-predictive cues can be explained by motivational salience signals and alerting signals, both of which are considered a part of salience coding (Bromberg-Martin et al., 2010). On the first day of classical conditioning, when the animals had not yet established a link between conditioned and unconditioned stimuli, the aversive-preferring axons showed a transient activity increase to two types of conditioned cues with no bias, implying that activity serves as an alerting signal (Vander Weele et al., 2018). After days of training, the activity became strongly biased to the aversive cue, indicating that the activity might no longer be a simple alerting signal, but instead a motivational salience signal (Bromberg-Martin et al., 2010; Lee et al., 2021). To clarify the nature of the motivational salience signal, further study is necessary to systematically vary the physical features of conditioned and unconditioned stimuli.

Consistent with salience coding without hedonic valences, phasic optogenetic stimulation of dopamine axons in the mPFC does not reinforce or suppress any behavioral actions (Popescu et al., 2016; Ellwood et al., 2017; Vander Weele et al., 2018) (but see Gunaydin et al., 2014). Instead, optogenetic stimulation can increase the signal-to-noise ratio of aversive processing in mPFC neurons for competitive situations in which rewarding and aversive cues are simultaneously presented (Vander Weele et al., 2018). Our results are consistent with the recent view that dopamine at the mPFC gates sensory inputs for aversive processing (Ott and Nieder, 2019; Weele et al., 2019).

Using aversive classical conditioning, we revealed that aversive learning can induce activity changes in dopamine axons. Classical conditioning is a form of aversive learning distinct from instrumental aversive learning [including punishment and active avoidance (Jean-Richard-Dit-Bressel et al., 2018)]. Although all types of aversive learning are processed in the mPFC and dopamine systems, each type may be expected to involve distinct neural circuits. Further research is necessary to reveal the detailed processing of aversive learning in the mPFC and dopamine projections.

Our study provides new insight into functional diversity in dopamine neurons that constitute mesocortical pathways. Studies that employed fiber photometry imaging have identified functional diversity among dopamine neurons with distinct projection pathways; dopamine axons in the ventral nucleus accumbens medial shell (de Jong et al., 2019; Yuan et al., 2019), in the tail of the striatum (Menegas et al., 2018), and in the basal amygdala (Lutas et al., 2019) do not show activity that matches reward prediction error coding, but instead show increased activity for aversive stimuli (Verharen et al., 2020). Our two-photon imaging demonstrates that even the same projection-defined dopamine neurons can be inhomogeneous, with some preferring aversive signals and others preferring reward signals. The aversive response found in some dopamine projections, including mesocortical dopamine projections, might be linked to glutamate corelease, as vesicular glutamate transporter 2 genes are expressed in dopamine neurons projecting to the ventral nucleus accumbens medial shell, the tail of the striatum, and the mPFC (Poulin et al., 2016) and AMPA-receptor-mediated excitatory postsynaptic currents have been confirmed upon the stimulation of dopamine axon terminals in the basal amygdala (Lutas et al., 2019). As of now, it is not clear how functional diversity within the same projections is linked to molecular diversity. Clarifying such a link will further advance our understanding of distinct dopamine subsystems and may shed light on how dopamine subsystems are dysregulated in psychiatric diseases.

Acknowledgements

We thank J. Zhang for technical help. This work was supported by grants from JSPS KAKENHI (20K16465), and JST PRESTO to K.M. (JPMJPR2128), JSPS KAKENHI to T.I. (JP21K07459), BBRF Young Investigator Grant (29268), National Institute on Drug Abuse (COCA Pilot grant, P50DA046373), National Institute of Aging (R03 AG070517), National Institute of Neurological Disorders and Stroke (R21 NS125571, R01 NS131549) to T.R.S. and JST PRESTO (JPMJPR1883), NHMRC Ideas Grant (APP1184899), and KAKENHI (20K23378) to T.K.S.

Declaration of interests

The authors declare no conflicts of interest.

Author contributions

T.R.S. and T.K.S. conceived the project; T.R.S. and T.K.S. designed the experiments. K.M., H.I., T.T., T.R.S. and T.K.S. prepared the analysis codes. K.A., Z.H., T.I., T.R.S. and T.K.S. carried out experiments; K.A., Z.H., A.M., E.M., T.R.S. and T.K.S. analysed the data. T.R.S. and T.K.S. interpreted the data and wrote the manuscript.

Materials and Methods

Key resources table

Experimental model details

Animals

All experimental procedures were approved by the Medical University of South Carolina, Monash University, Kagoshima University, and local institutions supervising animal experiments. Heterozygous dopamine transporter (DAT)-Cre mice (Slc6a3tm1.1(cre)Bkmn, Jackson Laboratory, #006660, crossed with wild-type C57BL/6) were used in this study, including 12 mice for two-photon imaging and 10 for histology. Previous research utilized the same mouse line to express GCaMP6f in dopamine axon terminals in the mPFC that could be detected by one-photon fiber photometry (Kim et al., 2016). Mice of both sexes, aged >8 weeks, were included. The mice were maintained in group housing (up to five mice per cage) and experiments were performed during the dark period of a 12-h light/12-h dark cycle.

Method details

Surgery

All surgical procedures were performed aseptically, with the mice under isoflurane anesthesia. Lidocaine (subcutaneously at the incision), atropine (0.3 mg/kg, intraperitoneally), carprofen (5 mg/kg, intraperitoneally), and dexamethasone (2 mg/kg, intraperitoneally) were applied to prevent pain and brain edema. After surgery, the mice were allowed to recover for at least three days. No experimenter blinding was done.

Headplate implant and virus injection

A custom-made headpost was glued and cemented to the skull, and then a small craniotomy (<0.5 mm) was performed over the VTA (∼2.9–3.5 mm posterior and ∼0.5 mm lateral from the bregma). Inside the small craniotomy, axon-GCaMP virus (Broussard et al., 2018) (AAV2/1-hSynapsin1-FLEx-axon-jGCaMP8m-WPRE-SV40) was volume-injected (Nanoject III, Drummond Scientific) to the VTA through a pulled capillary glass (40–60 nL/site; depth: 4200–4400 μm; 15 min/injection). After the injection, the craniotomy was sealed with a small piece of cover glass and silicone sealant (Kwik-Cast) and animals were returned to their home cage.

Microprism implant

After a 3-week waiting period of adeno-associated virus (AAV) expression, a microprism was inserted for two-photon imaging as described previously (Low et al., 2014). A rectangular craniotomy (4 x 2 mm) was introduced over the bilateral PFC (∼1.5–3.5 mm anterior from the bregma), and the dura was removed over the right hemisphere. Then, a microprism implant assembly was inserted into the subdural space within the fissure (Figure 1B, E, F). The microprism was centered ∼2.5 mm anterior to the bregma to avoid damaging bridging veins. Once implanted, the prism sat flush against the opposing fissure wall, which contained the medial wall of the PFC (mainly the prelimbic area) in the left hemisphere. The front face of the prism was oriented along the midline.

The assembly consisted of a right-angle microprism (2 x 2 x 1 mm, Prism RA N-BK7, Tower Optical Corp.) and two coverslip layers (bottom layer: 4.5 x 3.0 mm, top layer: 3.6 x 1.8 mm), which were glued by ultraviolet curing optical adhesive (Norland #81). The top layer of glass was cemented to the skull with dental acrylic. Our assembly design (microprism of 2 x 2 x 1 mm, plus double-layer glass) is different from the original report (microprism of 1.5 x 1.5 x 1.5 mm plus single-layer glass) (Low et al., 2014) for the following reasons. First, the thinner microprism (1 mm in the anterior-posterior axis) was easier to insert into the bank, avoiding superficial veins branching from the superior sagittal sinus. Second, the longer prism (2 mm in the dorsal-ventral axis) could spare a wider imageable region below the superior sagittal sinus. Third, the double-layer glass helped suppress brain movements.

Behavior

After the microprism implant surgery, the mice were allowed to recover in their home cages for one week. After recovery, the mice underwent water scheduling (receiving 0.8–1 mL of water per day). Then, the mice were pretrained for head fixation and for drinking water from a spout on a linear passive treadmill (SpeedBelt, Phenosys) in a sound-proof blackout box for two days. After the initial days of reward only, the animals received infrequent electrical shocks interspersed with the reward.

To monitor licking behavior, the face of each mouse was filmed with a camera at 60 Hz (CM3-U3-13Y3M-CS, FLIR) using infrared illumination (850-nm light-emitting diode, IR30, CMVision or M850F2, Thorlabs). To detect locomotion, the running speed on the treadmill was recorded at 30 kHz.

Rewarding and aversive stimuli

The mice received rewarding or aversive stimuli with unpredictable timing. The stimuli were administered in a randomized order (rewarding stimuli: 7 out of 9 cases; aversive stimuli: 1 out of 9; control period: 1 out of 9), with a randomized inter-trial interval of 55–65 s. The mice exhibited comfortable behavior on the treadmill for 1.5–2 h.
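
The trial structure described above can be illustrated with a short Python sketch. The source states only the proportions (7:1:1) and the 55–65 s range of the inter-trial interval; the block-wise shuffling of nine trials, the uniform distribution of the interval, and the name `make_schedule` are assumptions made here for illustration and are not taken from the authors' acquisition code.

```python
import random

def make_schedule(n_blocks, seed=None):
    """Return a list of (trial_type, inter_trial_interval_s) tuples.

    ASSUMPTION: trials are shuffled within blocks of nine so that each block
    holds 7 reward, 1 aversive, and 1 control trial; the source only gives
    the overall proportions.
    """
    rng = random.Random(seed)
    block = ["reward"] * 7 + ["aversive"] + ["control"]
    schedule = []
    for _ in range(n_blocks):
        trials = block[:]          # copy so the template is not reordered
        rng.shuffle(trials)        # randomize the order within the block
        for trial_type in trials:
            # ASSUMPTION: the interval is drawn uniformly from 55-65 s.
            iti_s = rng.uniform(55.0, 65.0)
            schedule.append((trial_type, iti_s))
    return schedule
```

With a mean interval of about 60 s, ten such blocks (90 trials) would take roughly 1.5 h, which is in line with the session duration stated above.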

As a reward, 10 μL of sugar water was delivered through a water spout (Figure 2A), controlled by a syringe pump (PHM-107, Med Associates, Inc., USA). Based on previous literature, a 10-μL reward is relatively large (Tsutsui-Kimura et al., 2020). In some experiments, the reward volume was varied between 0 and 15 μL (Figure 2–figure supplement 3). As an aversive stimulus (Kim et al., 2016; de Jong et al., 2019; Lutas et al., 2019), a 1-s, 0.2-mA electrical current was delivered via a stimulator (AM2100, A-M systems, USA) between two electrode pads attached to the mouse’s tail (Figure 2A). This current was considered to be mild, just strong enough to evoke locomotion. When the current was doubled (Figure 2–figure supplement 3), the locomotion tended to become stronger, but some animals stopped drinking water. Similarly, when the frequency of the aversive stimuli was increased (e.g., 50% of trials), some mice were no longer motivated to drink the reward water.

Classical conditioning

After three days of reward and aversive stimulus sessions, we trained the mice in reward and aversive trace conditioning. The structure of the task was the same as that for the reward and aversive stimulus sessions (reward condition: 7 out of 9 cases; aversive condition: 1 out of 9; control condition: 1 out of 9; inter-trial interval: 55–65 s), except that auditory stimuli (9 or 13 kHz) were presented 1–3 s before presenting the unconditioned stimuli (rewarding or aversive stimulus). In three animals, a 9 kHz tone was used for the reward-predictive cue and a 13 kHz tone for the aversive-predictive cue. In the remaining three animals, the tone association was reversed. When the anticipatory licking (Figure 3B) was stably manifested, we included one condition for an unexpected reward omission among the seven reward conditions (Figure 3–figure supplement 2) and continued for two more days.

Two-photon imaging

In vivo two-photon imaging was performed using a table-mounted microscope (Bergamo II, Thorlabs or MOM, Sutter Instruments) and a data acquisition system. The light source was a pulsed Ti:sapphire laser (MaiTai DeepSee HP, SpectraPhysics, or Chameleon Ultra II, Coherent), with the laser wavelength set to 980 nm (Hasegawa et al., 2017; Itokazu et al., 2018), which causes a higher fluorescence change in the GCaMP signal and less scattering in the tissue than 920 nm. The laser power at the apochromatic objective lens (16×, 0.80 NA, Nikon) was <70 mW, and we observed no bleaching. Green fluorescent photons were filtered (ET525/70 m-2p) and collected by a hybrid photodetector (R11322U-40-01, Hamamatsu Photonics) (Tischbirek et al., 2015) and a high-speed current amplifier (DHPCA-100, Femto). Imaging frames were acquired at ∼60 Hz and were downsampled offline. Images were collected at a depth of 30–100 μm from the dural surface (up to ∼200 x 200 μm). The small field of view at a high sampling rate makes it possible to collect weak signals from small structures, as in spine functional imaging (Jia et al., 2014).

Imaging fields were searched based on the presence of fiber morphology with at least occasional calcium transients in the fibers, not based on the behavioral correlation of the transients. For each mouse, imaging was performed for a single field per day in order to gain a sufficient number of repeats with a 1-min inter-trial interval. In reward/aversive preference characterization (Figure 2), 1–2 sites were imaged on different days. For the classical conditioning, only a single site was imaged during the course of conditioning. Once the imaging site was determined on the first day, the reference image of two-photon imaging was captured, in addition to the surface vessel image of one-photon imaging. On subsequent days, these images were used to return to the same imaging site, comparing and overlaying the reference image and the ongoing imaging view.

Calcium imaging data analysis

Data processing

Imaging data were processed for motion correction and registration. Axons were detected for region-of-interest (ROI) drawing using Suite2p (Pachitariu et al., 2016) and a custom-made MATLAB program (Itokazu et al., 2018). A fluorescent trace for each ROI was generated, and then the trace was normalized by the baseline fluorescence (F0, set as the 50th percentile fluorescence over a 30-s sliding window in order to remove any slow drifts in the baseline) to produce a ΔF/F trace.
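
The baseline normalization can be sketched in Python as follows (the original analysis used Suite2p and a custom MATLAB program, so this is an illustration and not the authors' code). The function name `delta_f_over_f`, the centering of the window on each frame, and the truncation of the window at the edges of the recording are assumptions.

```python
import numpy as np

def delta_f_over_f(trace, frame_rate_hz, window_s=30.0, percentile=50.0):
    """Normalize a raw fluorescence trace by a sliding-percentile baseline F0."""
    trace = np.asarray(trace, dtype=float)
    # Half-width of the sliding window in frames (window centered on each frame).
    half = max(1, int(round(window_s * frame_rate_hz / 2)))
    f0 = np.empty_like(trace)
    for i in range(trace.size):
        lo = max(0, i - half)                 # ASSUMPTION: window is truncated
        hi = min(trace.size, i + half + 1)    # at the start and end of the trace
        f0[i] = np.percentile(trace[lo:hi], percentile)
    # dF/F = (F - F0) / F0
    return (trace - f0) / f0
```

Because F0 is the median within each window, brief calcium transients barely shift the baseline, whereas slow drifts lasting longer than the window are removed.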

Dopamine axons were sparsely labeled in the mPFC, but multiple ROIs could still originate from the same axon; such duplicates were excluded based on a correlation analysis among ROI pairs (Petreanu et al., 2012; Sun et al., 2016; Itokazu et al., 2018). The correlation coefficients of ΔF/F traces were calculated for axons in each plane, and pairs showing a high correlation (>0.65) (Itokazu et al., 2018) were considered to arise from the same axon. The high-correlation pairs were grouped into clusters, and the ROI with the largest ΔF/F signal in each cluster was assigned to represent the cluster. Our results remained similar for different correlation threshold values.
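
One way to implement this grouping is sketched below in Python. Two points are assumptions rather than details given in the text: clusters are formed as connected components of the above-threshold pairs (so ROIs can be linked through an intermediate ROI), and the "largest ΔF/F signal" is taken to mean the largest peak ΔF/F. The name `merge_correlated_rois` is hypothetical.

```python
import numpy as np

def merge_correlated_rois(dff, threshold=0.65):
    """dff: array of shape (n_rois, n_frames). Returns representative ROI indices."""
    dff = np.asarray(dff, dtype=float)
    n = dff.shape[0]
    corr = np.corrcoef(dff)  # pairwise Pearson correlation between ROI traces

    # Union-find: ROIs joined by an above-threshold pair share a root.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if corr[i, j] > threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)

    # ASSUMPTION: "largest signal" is interpreted as the largest peak dF/F.
    peak = dff.max(axis=1)
    return sorted(max(members, key=lambda k: peak[k]) for members in clusters.values())
```

Each returned index then stands for one putative axon in the subsequent analyses.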

Reward, aversive, cue, and locomotion activity

For each axon, reward and aversive activity were evaluated. Reward activity was quantified as an increase in ΔF/F by comparing the average ΔF/F between the control range (−2 to 0 s from the onset of the reward TTL to a syringe pump) and the signal range (0 to 2 s). Similarly, aversive activity was quantified as an increase in ΔF/F, based on the difference in the average ΔF/F between the control range (−2 to 0 s from the onset of the electrical shock TTL) and the signal range (0 to 2 s). Axons were considered to exhibit a significant response if the magnitude of either activity was statistically larger than that of the baseline activity (Wilcoxon signed-rank test; p < 0.05). Significant axons were classified as either reward-preferring (cyan) or aversive-preferring (magenta), based on whether the axons were above or below the unity line of the reward/aversive scatter plot, as shown in Figure 2J.
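
The window comparison and the unity-line rule can be sketched in Python as below. The function names are hypothetical, and the significance test is deliberately left out: the per-trial differences returned by `window_response` are the values that would be passed to a signed-rank test. The unity-line rule is expressed here as "whichever response is larger", which is an interpretation that does not depend on the axis assignment in Figure 2J.

```python
import numpy as np

def window_response(dff, event_frames, frame_rate_hz, pre_s=2.0, post_s=2.0):
    """Per-trial response: mean dF/F in (0, post_s) minus mean dF/F in (-pre_s, 0)."""
    dff = np.asarray(dff, dtype=float)
    n_pre = int(round(pre_s * frame_rate_hz))
    n_post = int(round(post_s * frame_rate_hz))
    responses = []
    for f in event_frames:
        if f - n_pre < 0 or f + n_post > dff.size:
            continue  # skip events too close to the edges of the recording
        control = dff[f - n_pre:f].mean()   # control range before the event
        signal = dff[f:f + n_post].mean()   # signal range after the event
        responses.append(signal - control)
    return np.array(responses)

def classify_preference(reward_response, aversive_response):
    """ASSUMPTION: an axon is labeled by whichever mean response is larger."""
    if aversive_response > reward_response:
        return "aversive-preferring"
    return "reward-preferring"
```

For one axon, the mean of the per-trial reward responses and the mean of the per-trial aversive responses would give its coordinates in the reward/aversive scatter plot.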

The locomotion activity was quantified as an increase in ΔF/F by comparing the average ΔF/F between the control range (−2 to 0 s from the locomotion initiation) and the signal range (0 to 2 s). The locomotion initiation is defined in the “Running detection” section below.

During classical conditioning, activity was evaluated in a similar manner. For the conditioned cue activity, the activity increase was computed by comparing the average ΔF/F between the control range (−2 to 0 s from the onset of the predictive cue) and the signal range (0 to 2 s). For the unconditioned response activity (reward or aversive), we compared the control range (−2 to 0 s from the onset of the predictive cue) and the signal range (0 to 2 s from the onset of the unconditioned stimulus). To investigate the preference for reward or aversive processing, we used scatter plots (Figure 3I, J), similar to Figure 2J. The color-coded classification (cyan/magenta) was based on k-means clustering, using the responses before classical conditioning (Figure 2J).

Evaluation of brain movement

To compare the amount of brain movement between the two different microprism assemblies (Figure 1–figure supplement 2), we obtained the x- and y-axis shifts of the acquired images caused by brain movement. The shifts were computed by the Suite2p program, which uses them for image registration (Pachitariu et al., 2016). We quantified the brain shift using two metrics: root mean square and large transient movement. First, the root mean square was computed from the sequential shifts in pixels in the x- and y-dimensions, which were combined trigonometrically (Figure 1–figure supplement 2E). Second, to detect large transient movement events, the combined brain shift traces were filtered (Butterworth, at 1.5 Hz), and events larger than 5 μm were detected as movement events (red dots in Figure 1–figure supplement 2B).
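
A minimal Python sketch of the two metrics is given below, using SciPy for the filter. Several details are not specified in the text and are assumptions here: the x and y shifts are combined as a Euclidean norm, the Butterworth filter is taken to be a second-order low-pass applied with zero phase, and an event is counted each time the filtered trace rises above the 5-μm threshold. The default frame rate (58 Hz) and pixel size (0.31 μm) are the values given in the legend of Figure 1–figure supplement 2; the name `brain_shift_metrics` is hypothetical.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def brain_shift_metrics(x_shift_px, y_shift_px, frame_rate_hz=58.0,
                        um_per_px=0.31, cutoff_hz=1.5, event_um=5.0):
    """Return (root mean square shift in micrometers, number of large events)."""
    x = np.asarray(x_shift_px, dtype=float)
    y = np.asarray(y_shift_px, dtype=float)
    # Combine both axes into one displacement magnitude, converted to micrometers.
    shift_um = np.hypot(x, y) * um_per_px
    rms_um = float(np.sqrt(np.mean(shift_um ** 2)))

    # ASSUMPTION: second-order low-pass Butterworth, applied forward and
    # backward (zero phase). The text gives only the 1.5 Hz frequency.
    b, a = butter(2, cutoff_hz, btype="low", fs=frame_rate_hz)
    smooth = filtfilt(b, a, shift_um)

    # Count upward crossings of the threshold as separate movement events.
    above = smooth > event_um
    n_events = int(np.count_nonzero(above[1:] & ~above[:-1])) + int(above[0])
    return rms_um, n_events
```

Dividing the event count by the recording duration would give an event frequency comparable across recordings of different lengths.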

Behavioral analysis

Licking detection

To track the movement of the tongue, videos of orofacial movement (60 Hz, side view) were processed using DeepLabCut (Mathis et al., 2018) (Figure 2–figure supplement 1). The tip of the tongue, the location of the water spout and the position of the nose were labelled in randomly selected ∼200 frames from 6 animals. In frames when the tongue was inside the mouth and was not visible, we estimated its location from the lips and jaw, instead of not labelling the tongue in these frames. This estimation prevented DeepLabCut from making a completely wrong guess in labelling the tongue for these frames.
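
Once the tongue and spout positions are tracked, licks can be counted from threshold crossings; the legend of Figure 2–figure supplement 1 states that tongue movement crossing the water port position was counted as a lick. The Python sketch below illustrates this rule under the assumptions that the tongue position is expressed along a single image axis and that larger values lie closer to the spout; `count_licks` is a hypothetical name.

```python
import numpy as np

def count_licks(tongue_pos, port_pos):
    """Return the frame indices at which the tongue trace crosses the port position.

    ASSUMPTION: positions are given along one axis, with larger values
    closer to (or beyond) the water port.
    """
    tongue_pos = np.asarray(tongue_pos, dtype=float)
    beyond = tongue_pos >= port_pos
    # A lick starts on the first frame at or beyond the port after a frame short of it.
    return np.flatnonzero(beyond[1:] & ~beyond[:-1]) + 1
```

The number of returned indices within the cue period would then give the anticipatory lick count for a trial.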

The learning process was divided into three equal-duration periods. We confirmed that the division into six periods resulted in a saturating discrimination curve for anticipatory licking in the fifth and sixth periods. These last two periods in the six-period division correspond to the ‘late phase’ of the three-period division that we used.

Running detection

The speed of the treadmill was monitored as the output from a SpeedBelt apparatus (Phenosys). The locomotion period was defined as the duration in which the treadmill speed was above the median + 0.5 x standard deviation for more than 200 ms. Then, the initiation of the locomotion period was defined as a time point preceded by a non-locomotion period (when the running speed was below the threshold) of at least 0.5 s.
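
These criteria can be sketched in Python as follows. The name `locomotion_onsets` is hypothetical, and two choices are assumptions: any supra-threshold sample, even within a run too brief to count as locomotion, interrupts the preceding quiet period, and the start of the recording is treated as the start of a quiet period.

```python
import numpy as np

def locomotion_onsets(speed, sample_rate_hz, min_run_s=0.2, min_quiet_s=0.5):
    """Return (onset sample indices, speed threshold)."""
    speed = np.asarray(speed, dtype=float)
    threshold = np.median(speed) + 0.5 * np.std(speed)
    above = speed > threshold

    # Start (inclusive) and end (exclusive) index of each supra-threshold run.
    padded = np.concatenate(([False], above, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)

    min_run = int(round(min_run_s * sample_rate_hz))
    min_quiet = int(round(min_quiet_s * sample_rate_hz))
    onsets = []
    prev_end = None
    for s, e in zip(starts, ends):
        # Samples below threshold since the previous run (or since recording start).
        quiet = s if prev_end is None else s - prev_end
        prev_end = e
        # Keep runs longer than 200 ms that follow at least 0.5 s below threshold.
        if e - s > min_run and quiet >= min_quiet:
            onsets.append(int(s))
    return onsets, threshold
```

The returned onsets can serve as the alignment points for the locomotion activity analysis.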

Machine learning analysis of facial expressions, licking, and running

The anticipation of animals regarding upcoming unconditioned stimuli (reward or electrical shock) was quantified based on auditory predictive cues using a machine learning classifier (random forest classifier) (Breiman, 2001).

Facial expressions were filmed by an infrared web camera and analyzed with a random forest classifier combined with a deep neural network (Figure 4–figure supplement 2A-D). First, features of facial expressions were extracted from a given temporal series of frames (i.e., a video) using a deep neural network model, the ResNet3D model. The ResNet3D model is a pretrained network consisting of 18 layers, optimized for videos and provided by PyTorch (Tran et al., 2018). The output from the final convolutional layer was fed into the random forest classifier. In our study, training was performed not on the pretrained ResNet3D, but on the random forest classifier. The random forest classifier was trained and tested with independent trials by five-fold cross-validation within each day. To prevent the random forest classifier from overfitting, only the top 400 features of the input were used, which were ranked by the F-value. To train the random forest classifier equally to the reward and aversive conditions despite their imbalanced frequency (7 or 1 out of 8 trials, Figure 3A), an ensemble training technique was used (Wallace et al., 2011). The discrimination accuracy for reward and aversive conditions was computed separately and an average was taken with equal weights as a final discrimination accuracy. The equal weights prevented the accuracy computation from being dominated by the reward condition, which occurred more frequently than the aversive condition. To investigate the time course of the discrimination accuracy, accuracy computation was performed for a 500-ms time window instead of a 2-s window, and the window was systematically shifted by 160 ms (Figure 4–figure supplement 2E).
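
The equal-weight averaging of per-condition accuracy corresponds to what is commonly called balanced accuracy. The Python sketch below illustrates only this metric; it does not reproduce the feature extraction, the random forest, or the ensemble training, and the name `balanced_accuracy` is chosen here for illustration.

```python
import numpy as np

def balanced_accuracy(true_labels, predicted_labels):
    """Average the per-condition accuracy with equal weight for each condition."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    per_class = [
        # Fraction of trials of condition c that were classified as c.
        np.mean(predicted_labels[true_labels == c] == c)
        for c in np.unique(true_labels)
    ]
    return float(np.mean(per_class))
```

With seven reward trials and one aversive trial, a classifier that always answers "reward" reaches a plain accuracy of 7/8 but a balanced accuracy of only 0.5, which is why the equal weighting keeps the frequent condition from dominating.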

The discrimination accuracy based on anticipatory licking was also computed (Figure 4–figure supplement 1). To enable a comparison among facial features and licking, the random forest classifier was used. Instead of 400 features (facial expressions), the random forest classifier was fed with one feature (either the number of licking instances, or the running speed).

Histology

Animals were perfused with 4% paraformaldehyde (PFA) in phosphate-buffered saline (PBS). GCaMP or tyrosine hydroxylase immunostaining was performed using standard procedures (Figure 1C, D). Coronal slices (thickness, 30 µm) were cut using a cryostat (Leica Microsystems) and blocked in carrier solution (5% bovine serum albumin; 0.3% Triton X-100 in 0.1 M PBS) for 2 h at room temperature on a shaker. For GFP staining, slices were then incubated with anti-green fluorescent protein (GFP) primary antibody (anti-GFP, 1:1000, A11122, Invitrogen) for 18 h at 4°C on a shaker. After three rinses with 0.1 M PBS for 30 min, sections were incubated with Alexa-Fluor-488-conjugated donkey anti-rabbit secondary antibody (Invitrogen, 1:500 in carrier solution) for 1 h at room temperature on a shaker. For tyrosine hydroxylase staining, additional incubation with anti-tyrosine hydroxylase (TH) primary antibody (anti-TH, 1:200, ab113) and Alexa-Fluor-568-conjugated donkey anti-sheep secondary antibody (Invitrogen, 1:500) was included. Cell nuclei were stained with DAPI (1:1000; D523, Dojindo). After a few additional rinses in 0.1 M PBS for 30 min, slices were mounted on glass slides for imaging. Images were acquired using a confocal laser-scanning microscope (FV3000, Olympus) and a fluorescence microscope (VS200, Olympus).

Experimental design and statistical analysis

Data are described as the mean ± s.e.m. unless otherwise noted. The statistical significance for behavioral analysis was determined by t-tests (2-tailed) using MATLAB. Differences in neural activity were assessed with the non-parametric Mann–Whitney test.

Significance levels are denoted as * P < 0.05, ** P < 0.01 and *** P < 0.001. P > 0.05 was considered not significant and is denoted as n.s.

Resource availability

Lead contact

Further information and requests for reagents may be directed to the Lead Contact, Takashi R Sato (satot@musc.edu).

Materials availability

These studies did not generate unique reagents.

Data and code availability

All data reported in this study will be shared by the corresponding authors upon request. All the analysis codes from this study are available from the corresponding authors upon request.

Sparse dopaminergic projection to the mPFC.

(A) Example experiment of AAV-GFP injection into the VTA in the Allen Brain Atlas Data Portal (experiment No. 156314762). (B) Sparse GFP-expressing axons in the mPFC region, including the prelimbic area (the same experiment shown in A). (C) Sparse jGCaMP8m expression at the mPFC was observed in our experiment. The approximate position of the image is indicated as a square in B. (D) The level of projected GFP axons at the prelimbic area was 1.3% ± 0.9% of that at the caudoputamen. All experiments at the Allen Portal used wild-type or tyrosine hydroxylase-Cre mice that received an AAV-GFP injection into the VTA (experiment No. 156314762, 120572378, 127796728).

Double layer glass significantly reduces brain movement.

(A) The assembly design similar to Low et al., 2014. (B) Brain movement during imaging with the assembly design similar to Low’s. The brain shift was computed by the Suite2p program, and the shifts in pixels in the x and y dimensions were combined trigonometrically. The frame rate was 58 Hz (200000 frames correspond to ∼57 minutes). One pixel is 0.31 μm. Movement events larger than 5 μm were detected as red dots. (C) Our own assembly design with an additional layer of glass. (D) Similar to B, but with our assembly design. (E) Root mean square of the brain shift for Low’s design (Standard: n = 3) and ours (Extra layer: n = 5). The root mean square is statistically smaller with our design (p = 0.0066, t-test). (F) The frequency of movement events larger than 5 μm. The events are less frequent with our design (n = 3, 5, p = 0.0080, t-test).

Long-term imaging of dopamine axons across days.

(A) The image acquired on the 1st day. (B) Similar to A, but on the 4th day. (C) Similar to A, but on the 9th day. The imaging plane is the same as in Figure 1G.

Lick detection using DeepLabCut.

(A) Sample frames used by DeepLabCut. The tongue tip was labeled as a red dot and the water port as a blue dot (left). When the tongue was not visible (right), we intentionally trained DeepLabCut to consider the tongue position as the inside of the oral cavity (red dot). (B) Example of a tongue position trace. (C) Tongue movement that crossed the water port position (gray line) was counted as a lick (black ticks).

Insignificant locomotion activity of dopaminergic axons.

No significant increase in activity was observed at locomotion initiation (for aversive-preferring axons, p = 0.583, n = 75; for reward-preferring axons, p = 0.92, n = 25; for all axons, p = 0.23, n = 160, Wilcoxon signed-rank test).

Dopaminergic axonal activity depends on the reward volume and shock current.

Axonal activity increased as the reward volume increased. Differences were significant between 0 and 5 μL (p < 0.001), between 2.5 μL and 5 μL (p < 0.001), and between 5 μL and 10 μL (p < 0.001, n = 39, 2 animals). However, the differences were not significant between 0 μL and 2.5 μL (p = 0.06) or between 10 μL and 15 μL (p = 0.97), indicating saturation at 10 μL. Axonal activity increased as the electrical current increased. The difference was significant between 0 and 0.2 mA (p < 0.001, n = 61, 2 animals) and between 0.2 and 0.4 mA (p < 0.001). * indicates p < 0.05, ** P < 0.01 and *** P < 0.001.

Population activity of aversive- and reward-preferring axons throughout learning (aversive-preferring axons, n = 47; reward-preferring axons, n = 12).

(A-D) Averaged calcium traces of aversive-preferring axons for the first day (A), early phase (B), middle phase (C), and late phase (D) of classical training for the reward condition (left) and aversive condition (right). (E-H) Similar to A-D, but for reward-preferring axons.

Activity is not suppressed at reward omission.

(A) Population calcium trace for reward-preferring axons aligned to reward omission (solid line, n = 11 reward-preferring axons). The response in the presence of reward is overlaid (dotted line). (B) No significant change occurred at reward omission (for reward-preferring axons, p = 0.76, n = 11; Wilcoxon signed-rank test).

Machine learning discriminates auditory cues based on anticipatory licking.

On the first day, cue discrimination was nearly 50%, indicating that discrimination was at chance level. However, as the training proceeded, discrimination improved, reaching 77.1 ± 5.4% at the late phase (n = 6 animals).

Discrimination based on facial expressions.

(A) A frame sequence during the cue presentation was analyzed. (B) All pixel values were inputted to a pretrained deep neural network, resulting in more than 400 features. (C) The top 400 features were fed into a machine learning classifier (random forest). (D) The recall rate of the classifier was computed as the discrimination rate. (E) The time course of discrimination was computed (one example animal). High discrimination could be achieved only in the late phase of classical conditioning, even though the early phase was analyzed based on its own classifier trained with frames from the early phase.