Identifying midbrain dopaminergic and GABAergic neurons

(A) Left: We confined ChR2 expression to DA and GABA neurons by injecting the adeno-associated virus FLEX-ChR2 locally into the VTA of transgenic mice expressing Cre recombinase under the control of the promoter of the DA transporter (DAT::Cre) or the vesicular GABA transporter (VGAT::Cre). Approximately 10 days after the virus injection, the silicon probe was inserted into the brain at the same AP and ML coordinates. Each day, the probe was advanced a few microns deeper into the brain, so successive recording sessions were performed at different DV coordinates.

Right: High-pass filtered voltage trace recorded during a light-stimulation session. Thick blue lines indicate light pulses (450 nm, 12 ms). Two light-induced spikes are shown below.

(B) Light response patterns of representative DA (red), GABA (blue), and unidentified (gray) neurons. (Top) Raster plots of spikes discharged during light stimulation (colored dots) and in the inter-stimulus baseline period (baseline, black dots). (Bottom) PSTHs extracted from the light-induced spikes. The black dashed line indicates the upper confidence limit of the baseline activity. If it is exceeded by the light-induced PSTH, then the unit is identified as light-responsive (See Figure S2 for an explanation of this term). Right inset shows, superimposed, the mean waveforms of spontaneous (black) and light-induced (colored) spikes recorded by a single probe shank.

Memory task, performance, and population activity.

(A) Schematic representation of the T-maze apparatus configuration in the memory task. The maze was divided into sections according to their cognitive demands. Every trial commenced when the animal left the starting point, running along the central arm. In the visual cue section, a visual cue signaled the rewarded location. Between the visual cue section and the turning point at the end of the main arm, a brief memory delay was introduced. After reward consumption, the animals voluntarily returned to the starting point to commence a new trial.

(B) Correct performance rates in sessions with electrophysiological recordings. Gray lines illustrate performance rates for left and right trials in every session. Colored lines illustrate performance averages across sessions (mean ± standard deviation) for DAT-Cre (red) and VGAT-Cre (blue) animals.

(C) In some training sessions animals received two blocks of trials with different memory loads. Gray lines illustrate correct performance rates for each block in every session and black lines show the average performance rates (mean ± standard deviation) across sessions, for all animals.

(D) Mean firing rates (thick lines) ± 1 standard error of the mean (shaded areas) of the population activities of DA (red) and GABA (blue) neurons. The averaged population firing activity of GABA neurons increased in the cue and delay sections. In contrast, the averaged population activity of DA neurons remained largely unchanged from the beginning to the end of the trials.

Trajectory-specific activities by DA and GABA neurons in the memory task

(A) Firing patterns of representative DA and GABA neurons. In each example: (Top) Raster plots of spiking events, for every correct trial, and their corresponding firing rate heatmaps as a function of position during right (purple) and left (green) trials. (Bottom) Average firing rates for correct left and right trials. Note that the trial and average firing rates (spikes/sec) are plotted as a function of position but normalized by the amount of time the mouse occupied each position on every trial. Thick lines above the average firing rates represent segments with significantly different firing rates between right and left correct trials (See also Figure S4; permutation test; P < 0.05). It is evident in these examples that midbrain neurons differentiate their discharge rates between left and right trajectories in certain positions.

(B) Heatmaps of neuronal population responses organized by preferred lap trajectory (i.e., the trajectory with the stronger response; first column) and non-preferred lap trajectory (i.e., the trajectory with the weaker response; second column) for DA neurons (Top; n = 104 units, 35 sessions in five mice) and GABA neurons (Bottom; n = 74 units, 25 sessions in four mice). Each row contains preferred and non-preferred trajectory responses of the same neuron. In every row, both responses are normalized by the maximum rate of the preferred trajectory. The third column shows maze segments with significantly different discharge rates between preferred and non-preferred trajectories.
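
The normalization described in panel (B) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, and the preferred trajectory is assumed to be the one with the higher mean rate across positions (the legend only says "stronger response").

```python
def normalize_by_preferred(rate_left, rate_right):
    """Return (preferred, non_preferred) rate profiles scaled by the preferred peak.

    Assumption: the "stronger response" is the trajectory with the higher
    mean rate across positions; the source does not state the exact criterion.
    """
    mean_left = sum(rate_left) / len(rate_left)
    mean_right = sum(rate_right) / len(rate_right)
    if mean_left >= mean_right:
        preferred, non_preferred = rate_left, rate_right
    else:
        preferred, non_preferred = rate_right, rate_left
    peak = max(preferred)
    if peak == 0:
        # A silent neuron cannot be normalized; return the profiles unchanged.
        return list(preferred), list(non_preferred)
    # Both profiles are divided by the SAME value (the preferred maximum),
    # so the two heatmap rows of one neuron stay directly comparable.
    return [r / peak for r in preferred], [r / peak for r in non_preferred]


# Toy position-binned rates (spikes/s) for one hypothetical neuron.
left = [2.0, 4.0, 8.0, 4.0]
right = [1.0, 2.0, 3.0, 2.0]
pref, nonpref = normalize_by_preferred(left, right)
```

Dividing both rows by the preferred maximum keeps the preferred row peaking at 1 while preserving the relative size of the non-preferred response.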

VTA neuronal responses in a T-maze task without visual cues and memory-dependent decisions (no-cue-no-choice task)

(A) Schematic representation of the T-maze apparatus illustrating the sequence of events in the no-cue-no-choice task.

(B) Firing patterns of representative DA (left) and GABA (right) neurons during the memory and no-cue-no-choice tasks. Both examples illustrate that the trajectory-specific firing rate difference in the delay section of the memory task declines prominently in the control task, in which animals neither receive visual cues indicating the reward location nor make memory-dependent decisions.

(C and D) The firing patterns of DA neurons (C; n = 96 units, 30 sessions in four mice) and GABA neurons (D; n = 32 units, 12 sessions in three mice) in the memory task and the no-cue-no-choice task recorded in the same sessions. (Left and Middle columns) Normalized average neuronal responses for preferred (left) and non-preferred (middle) trajectories. The right column represents the maze segments with significantly different discharge rates. The row order of the neurons is the same for the memory task and the control task heatmaps. The data shown here for the memory task are a subset of those shown in Figure 3B.

(E) Average number of position points (mean ± standard deviation) with a significant rate difference, arranged by maze section and behavioral task for DA (left) and GABA (right) neurons (** P < 0.01, *** P < 0.001, paired t-test comparing numbers of significant position points between tasks. Also, the numbers in parentheses describe the number of neurons with a significant rate difference).

(F) Representative example showing the prominent difference in running speeds (cm/s) between the memory (black) and no-cue-no-choice (brown) task trials in a single session.

VTA neuronal responses in a T-maze task with visual cues but no memory-dependent decisions (cue-no-choice task)

(A) Schematic representation of the T-maze apparatus illustrating the sequence of events in the cue-no-choice task.

(B) Firing patterns of representative DA (left) and GABA (right) neurons during the memory and cue-no-choice tasks. Both examples illustrate that the trajectory-specific firing rate difference in the delay section of the memory task becomes notably weaker in the cue-no-choice task when animals do not make memory-dependent decisions, although running speed activities and incentive motivational drives of physical effort are the same between tasks.

(C and D) The firing patterns of DA neurons (C; n = 31 units, 28 sessions in four mice) and GABA neurons (D; n = 28 units, 11 sessions in one mouse) in the memory task and the cue-no-choice task recorded during the same sessions. (Left and Middle columns) Normalized average neuronal firing rates associated with the preferred (left) and non-preferred (middle) trajectories. The right column represents the maze segments with significantly different discharge rates. The row order of the neurons is the same for the memory task and the control task heatmaps. The data shown here for the memory task are a subset of those shown in Figure 3B.

(E) Average number of position points (mean ± standard deviation) with significant rate differences, arranged by maze section and behavioral task for (left) DA and (right) GABA neurons (** P < 0.01, *** P < 0.001, paired t-test comparing numbers of significant position points between tasks. Also, the numbers in parentheses describe the number of neurons with a significant rate difference).

Neuronal responses during reward consumption are not related to the trajectory-specific activities in the memory delay

(A) Firing patterns of representative DA and GABA neurons in the memory task for the period from trial start until the first lick of the waterspout, which triggered the water pump (top, space domain) and during reward consumption (bottom, time domain).

In each example: (Top) Raster plots of the spike trains and their corresponding firing rate heatmaps arranged by trajectory and position (maze) or time (reward) in right (purple) and left (green) trials. (Bottom) Average firing rates for correct left and right trials. The thick lines above the firing rates represent segments with significantly different firing rates between the correct right and left trials.

(B) Firing patterns of DA (top) and GABA (bottom) neurons in the time domain during reward consumption (from the first lick until 1 s later) for preferred (first column) and non-preferred (second column) rewards (DA: n = 104 units, 35 sessions in five mice; GABA: n = 74 units, 25 sessions in four mice). Each row represents the normalized average firing rates (preferred and non-preferred) of a single neuron on a color scale. Neurons were ordered according to the time point of the maximum rate in the preferred arm. The third column shows neurons with significant discrepancies between the left and right reward-related responses (paired t-test for mean firing rates, P < 0.05). The fourth column shows post-delivery reward segments (100 ms each) with significant excitation or inhibition compared with the 100-ms pre-reward segment (paired t-test comparing firing rates, P < 0.05).

(C) Correlations between the mean firing rate difference in the reward section and the difference in every other maze section for DA (left) and GABA (right) neurons (Pearson’s R values with P-values, *** P < 0.001). Only the trajectory-specific firing rate difference in the side arms correlated with the reward-specific rate difference.

Validation of optogenetic identification results. Related to Figure 1.

(A) Serial reconstructions of recording sites along the rostro-caudal axis of coronal midbrain slices for DAT-Cre (red) and VGAT-Cre (blue) animals. fr: fasciculus retroflexus, IF: interfascicular nucleus, IP: interpeduncular nucleus, ml: medial lemniscus, PBP: parabrachial pigmented nucleus, PN: paranigral nucleus, RN: red nucleus, rs: rubrospinal tract, SNc: substantia nigra pars compacta, SNr: substantia nigra pars reticulata.

(B) Photograph of the silicon probe with attached optical fibers. Optical fibers (core diameter 50 μm) were glued to the shanks to ensure a firm, accurate, and durable insertion into the deep midbrain. Optical fibers on shanks 2 and 5 were coupled to blue laser diodes (450 nm), maintaining a certain distance from the tip of the shank to ensure that blue light could irradiate the whole span of the recording electrode array.

(C) Schematic drawing of the mouse brain in the sagittal plane, illustrating the insertion of the silicon probe into the VTA.

(D) Immunofluorescence microscopy image of a coronal brain slice of a DAT-Cre mouse injected with the optogenetic virus in the left VTA. DAT: dopamine transporter (red); YFP: yellow fluorescent protein (green); SN: substantia nigra. Note: the white dashed line highlights the track left by the probe shank in the tissue.

(E) Mean waveforms of light-induced (colored) and spontaneous (black) spikes discharged by identified DA (red) and GABA (blue) neurons in a single light stimulation session.

(F) Pearson’s correlation coefficient between the waveforms of spontaneous and light-induced spikes for the identified DA (red) and GABA (blue) neurons. The light-induced spike waveforms were almost identical to the spontaneous waveforms for all the identified neurons. This was reflected by the high mean correlation coefficient (mean ± standard error of the mean; DA: r = 0.98 ± 0.09, P < 0.05; GABA: r = 0.98 ± 0.03, P < 0.05).

(G) Scatter plot of trough-to-peak spike duration vs. average spontaneous firing rate for the identified DA (red) and GABA (blue) neurons. The side plots show the distributions of the X- and Y-axis values in the main plot (pmf: probability mass function). In agreement with earlier reports (2), DA neurons fired spikes at lower rates (DA: 4.50 ± 5.00 spikes/sec, GABA: 17.34 ± 15.72 spikes/sec; unpaired t-test comparing the spontaneous rates: t(176) = 7.80, P < 0.001) and with wider waveforms (DA: 0.46 ± 0.08 ms, GABA: 0.34 ± 0.10 ms; unpaired t-test comparing trough-to-peak values: t(176) = 5.96, P < 0.001) than the GABA neurons.

(H) Latency of the first spike discharged during light stimulation for the identified DA (red) and GABA (blue) neurons.

(I) Number of light-induced spikes per light pulse for DA (red) and GABA (blue) neurons.
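
As a worked illustration of the unpaired comparison of spontaneous rates in panel (G), the sketch below computes a pooled-variance (Student's) t statistic from summary values. It rests on assumptions not stated in the legend: that the ± values are standard deviations, that the group sizes are the 104 DA and 74 GABA units given in Figure 3B, and that the test used a pooled variance. The function name is hypothetical.

```python
import math


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Student's unpaired t statistic and degrees of freedom from summary values."""
    df = n1 + n2 - 2
    # Pooled variance: each group's variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean2 - mean1) / standard_error, df


# Spontaneous firing rates (spikes/s): DA first, GABA second.
# ASSUMPTION: 5.00 and 15.72 are standard deviations; n = 104 and n = 74.
t_rate, df_rate = pooled_t_from_summary(4.50, 5.00, 104, 17.34, 15.72, 74)
```

Under these assumptions the sketch gives 176 degrees of freedom and a t value of about 7.8, in line with the reported t(176) = 7.80 for the firing-rate comparison.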

Statistical method for optogenetic identification. Related to Figure 1.

To identify the light-responsive units, we compared the light-induced average responses to average spontaneous firing activity. To this end, we used the following jittering method.

(A) Schematic representation of a spike train during a light stimulation session. Vertical black bars represent spiking events. The blue shaded areas indicate the light-stimulation windows. The firing activity between light stimulations is spontaneous.

(B) To identify light-responsive units, we tested the null hypothesis that the relative firing rate during light stimulation, λ̂light(t), did not differ from the spontaneous firing rate, λ̂spont(t). To do so, we employed a jittering method in which a randomly jittered window (12 ms) was placed within the spontaneous activity period preceding each light pulse (that is, the jittered window onset never preceded the offset of the preceding light pulse, and its offset never followed the onset of the succeeding light pulse). We produced as many jittered windows as there were light pulses and repeated this process 500 times in total.

(C) For every neuron, we produced peri-stimulus time histograms (PSTHs; 12 ms, 1-ms bins) of the relative firing rate by superimposing the spike trains of every light-stimulation pulse. If n(k)(t) is the number of spikes of a single neuron at post-light-onset time t of the kth stimulation pulse, then the average firing rate, $\hat{\lambda}(t) = \frac{1}{KT}\sum_{k=1}^{K} n^{(k)}(t)$, where K is the number of stimulation pulses, T is the bin size (1 ms), and t0 is the light-pulse onset time (t is measured relative to t0), represents the relative firing rate at time t. In the end, we produced one λ̂light(t) and 500 λ̂spont(t) firing rates.

(D) From every λ̂i,spont(t) histogram, we extracted the maximum and minimum values and stored them in two separate vectors. The vectors were sorted, and the 95% confidence limits were calculated from them: the upper limit was the 2.5% largest value of the maximum PSTH vector, and the lower limit was the 2.5% smallest value of the minimum PSTH vector (that is, the 13th of the 500 rows). Units whose λ̂light(t) PSTH exceeded the upper confidence limit at any time point (between 1 ms and 11 ms) were identified as light-responsive.

(E) If the λ̂light(t) PSTH fell below the lower confidence limit at any time point, the unit was characterized as light-inhibited due to synaptic transmission (we did not find any light-inhibited units). Units whose PSTH did not violate either confidence limit were characterized as unidentified.
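
The jittering procedure in panels (A) to (E) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all function and parameter names are hypothetical, the gap before the first pulse is assumed to begin at a `session_start` time, jittered onsets are drawn uniformly, and "between 1 ms and 11 ms" is read as the ten 1-ms bins covering that range.

```python
import math
import random


def psth_rate(spike_times, window_onsets, n_bins=12, bin_s=0.001):
    """Average rate per bin across windows: sum_k n_k(t) / (K * T), in spikes/s."""
    counts = [0] * n_bins
    for onset in window_onsets:
        for spike in spike_times:
            lag = spike - onset
            if lag >= 0:
                b = int(lag / bin_s)
                if b < n_bins:
                    counts[b] += 1
    k = len(window_onsets)
    return [c / (k * bin_s) for c in counts]


def jitter_onsets(pulse_onsets, pulse_s, rng, session_start=0.0):
    """One jittered window per pulse, placed in the gap preceding that pulse."""
    jittered = []
    previous_offset = session_start  # ASSUMPTION: gap before the first pulse
    for onset in pulse_onsets:
        # The window must start after the previous pulse ends and must end
        # before the next pulse begins.
        jittered.append(rng.uniform(previous_offset, onset - pulse_s))
        previous_offset = onset + pulse_s
    return jittered


def classify_unit(spike_times, pulse_onsets, pulse_s=0.012, n_jitter=500, seed=0):
    """Label a unit by comparing its light PSTH with jittered baseline PSTHs."""
    rng = random.Random(seed)
    light = psth_rate(spike_times, pulse_onsets)
    maxima, minima = [], []
    for _ in range(n_jitter):
        spont = psth_rate(spike_times, jitter_onsets(pulse_onsets, pulse_s, rng))
        maxima.append(max(spont))
        minima.append(min(spont))
    maxima.sort(reverse=True)
    minima.sort()
    rank = math.ceil(0.025 * n_jitter) - 1  # 13th value when n_jitter = 500
    upper, lower = maxima[rank], minima[rank]
    tested = light[1:11]  # ASSUMPTION: bins spanning 1 ms to 11 ms after onset
    if any(r > upper for r in tested):
        return "light-responsive"
    if any(r < lower for r in tested):
        return "light-inhibited"
    return "unidentified"
```

Taking the extreme value of each surrogate PSTH, rather than a per-bin percentile, makes the limits apply to the whole 12-ms histogram at once, so a single threshold can be compared against every tested bin.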

T-maze configuration, behavior and running speed assessment in the memory task. Related to Figures 2, 3, 4, and 5.

(A) Photograph of the T-maze apparatus with maze sections labeled. The mice were trained on different variations of an associative T-maze task with visual instruction cues. The behavioural apparatus was a figure-eight T-maze (O’Hara & CO., LTD, Tokyo, Japan), 50 cm tall, with a main arm (120 cm) and side arms (30 cm each). Infrared light-beam sensors placed at key position points on the maze defined the beginning and end of the successive maze sections. The apparatus was equipped with sliding doors, which prevented the animals from developing unwanted behavior (e.g., moving backwards). Sensor activations were monitored online, and significant behavioural events (e.g., visual-cue presentation, reward delivery, door opening/closing) were controlled by in-house behavioural software written in MATLAB (Mathworks, MA, USA) through a multi-signal processor interface system (RX6; Tucker-Davis Technologies, FL, USA). The moment-by-moment position of the animal on the maze was monitored continuously by a video tracking system composed of a red light-emitting diode mounted on the mouse head stage and a video camera (39 frames/sec) suspended from the room ceiling. Video data were stored for offline analysis.

(B) Detailed schematic representation of the T-maze apparatus. Infrared light-beam sensors (s1–s6) were mounted securely at key positions on the maze to define the beginning and end of the maze sections (start: s1–s2; visual cue: s2–s3; delay: s3–s4; side-arms: s4–s5/s6; reward: s5/s6). The first activation of s5 or s6 sensors (i.e., 1st lick) triggered the water-delivery pump. Five sliding doors restrained the animals from unwanted moves. Every trial began with the animal activating s1. At this point, the doors d1, d3, and d5 remained closed, isolating the animal in the starting location for approximately 2 s. This served two purposes: (1) to drain the waterspouts and (2) to prevent the animal from making decisions on the next trial guided by the previous trial outcome. After door d1 opened, the animal could run freely along the main arm. Activating s2 and s3 resulted in the visual cue onset and offset, respectively. Activating s5 triggered the left water-delivery pump, and door d3 opened while d1 closed. Activating s6 triggered the right water-delivery pump, and door d5 opened while d1 closed. In the memory task, doors d2 and d4 were always open. In the no-cue-no-choice task, doors d2 and d4 were initially closed. However, when the mouse activated s4, one door opened pseudo-randomly.

(C) To assess the influence of behavioural left-right biasing in decision making, we applied the chi-square (χ2) test of independence. The null hypothesis was that correct performance was independent of behavioural choices. The black line illustrates the chi-square distribution for 1 degree of freedom, along with the χ2 values from the independence test for every session (DAT-Cre: red dots; VGAT-Cre: blue dots). The test was passed in every session (the level of significance was set to 5%, corresponding to χ2 = 3.84 for 1 degree of freedom).

(D) Representative example of trial speeds (cm/sec) in a single recording session of the memory task.

(E) Shown are running speeds (mean ± standard error of the mean) categorized by maze sections and averaged across sessions for left (green) and right (magenta) trials for individual VGAT-Cre and DAT-Cre mice. Although differences were small, they occasionally reached significance. Importantly, we did not detect speed differences between left and right trials in the delay section (unpaired t-test between left and right speed per region, * P < 0.05).
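
The chi-square test of independence described in panel (C) can be illustrated with a 2-by-2 table of choice (left/right) against outcome (correct/incorrect). This is a minimal sketch: the function name and the trial counts are invented for illustration, and no continuity correction is applied, which may differ from the authors' analysis.

```python
def chi2_independence_2x2(table):
    """Pearson chi-square statistic for a 2x2 contingency table (1 degree of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if choice and outcome were independent.
            # Every row and column total must be non-zero.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


CRITICAL_5_PERCENT_DF1 = 3.84  # value quoted in the legend for 1 degree of freedom

# HYPOTHETICAL session: rows = left/right choices, columns = correct/incorrect.
session = [[18, 2],
           [17, 3]]
chi2_value = chi2_independence_2x2(session)
independent = chi2_value < CRITICAL_5_PERCENT_DF1
```

For this invented session the statistic is about 0.23, well below 3.84, so independence of performance and choice side would not be rejected.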

Statistical method for identifying trajectory-specific neurons (permutation method). Related to Figure 3.

We applied a permutation method reported elsewhere (Fujisawa et al., 2008) to identify neurons with trajectory-specific encoding properties. The motivation behind this analysis is that if the lap trajectory contributes to the firing rate difference observed at certain positions, then shuffling the trajectory labels assigned to the spike trains of individual trials would cause a marked reduction in the rate difference. This process is described below.

(A) From the spike trains of the correct left and right trials (top; raster plot), we estimated the trial relative firing rates (λ̂Left(x) and λ̂Right(x); middle). Importantly, we considered the variability in occupation time between positions and trials. Hence, to calculate the firing rates, we divided the number of spikes occurring in every position and trial (trial PSTHs) by the occupation time in the same position and for the same trial (trial occupation time). We subsequently produced the original average rate difference (D0(x), bottom) between the left- and right-correct trials.

(B) We shuffled the trajectory labels assigned to the trial spike trains along with the respective occupation times and re-calculated the difference Di(x) of the permuted labels.

(C) We produced 500 surrogates with permuted differences Di(x). In the end, we created a 500-by-100 matrix (shuffles-by-position) of Di(x) surrogates.

(D, E) From these surrogates, we extracted the 95% confidence interval (lower and upper confidence limits) for the null hypothesis test. To do so, we first calculated the confidence interval at each position point x. The difference values Di(x) at every position (column) were sorted, and every row of the new matrix was treated as a potential confidence limit with a different P-value. We term this the “pointwise acceptance band”. If the original data D0(x) break the pointwise band, the null hypothesis is rejected at position x. However, repeating this procedure for every position x in the maze (i.e., 100 position points) raises the issue of multiple comparisons.

(F, G) To address this, we constructed the “global band”, which controls the rate of any false rejection across multiple indices. We produced another 500 permutations and calculated the percentage of the surrogate data Dj(x) that broke the pointwise band candidate at any position point. If the percentage was more than 5%, we replaced the pointwise band with one corresponding to a lower P-value. We repeated this process until a pointwise band candidate was exceeded by less than 5% of the new surrogates. When this happened, the pointwise band was used as the “global band” (i.e., 95% confidence interval) for the hypothesis test.

(H) The null hypothesis was rejected if the original difference D0(x) exceeded the global band at any position x. The spatial extent of significance was then defined by the number of position points exceeding the 95% confidence interval.
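
Steps (A) to (H) can be sketched as follows. This is a simplified illustration of the pointwise and global band idea under stated assumptions, not the authors' or Fujisawa et al.'s code: all names are hypothetical, each trial is represented as a pair of per-position spike counts and occupation times, and the widest candidate band is returned if no candidate satisfies the 5% criterion.

```python
import random


def mean_rate(trials):
    """Occupancy-normalized rate per position, averaged across trials."""
    n_pos = len(trials[0][0])
    rates = []
    for x in range(n_pos):
        per_trial = [counts[x] / occ[x] for counts, occ in trials if occ[x] > 0]
        rates.append(sum(per_trial) / len(per_trial) if per_trial else 0.0)
    return rates


def rate_difference(left, right):
    """D(x): left minus right mean rate at every position."""
    return [l - r for l, r in zip(mean_rate(left), mean_rate(right))]


def shuffled_differences(left, right, n_shuffles, rng):
    """Surrogate D_i(x): trajectory labels permuted; counts and occupancy move together."""
    pooled = left + right
    surrogates = []
    for _ in range(n_shuffles):
        perm = pooled[:]
        rng.shuffle(perm)
        surrogates.append(rate_difference(perm[:len(left)], perm[len(left):]))
    return surrogates


def global_band(left, right, n_shuffles=500, alpha=0.05, seed=0):
    """Widen the pointwise band until fewer than alpha of new surrogates break it."""
    rng = random.Random(seed)
    first = shuffled_differences(left, right, n_shuffles, rng)
    second = shuffled_differences(left, right, n_shuffles, rng)
    n_pos = len(first[0])
    columns = [sorted(d[x] for d in first) for x in range(n_pos)]
    lower = upper = None
    # k = number of surrogate values trimmed from each tail; a smaller k
    # corresponds to a pointwise band with a lower P-value (a wider band).
    for k in range(int(n_shuffles * alpha / 2), -1, -1):
        lower = [col[k] for col in columns]
        upper = [col[-1 - k] for col in columns]
        breaks = sum(
            any(d[x] < lower[x] or d[x] > upper[x] for x in range(n_pos))
            for d in second
        )
        if breaks / n_shuffles < alpha:
            break
    return lower, upper  # ASSUMPTION: fall back to the widest band if none qualifies


def significant_positions(d0, lower, upper):
    """Indices of positions where the original difference leaves the band."""
    return [x for x, d in enumerate(d0) if d < lower[x] or d > upper[x]]
```

The count of indices returned by `significant_positions` corresponds to the spatial extent of significance in panel (H). With many position points and only 500 surrogates, even the widest pointwise band may be broken by more than 5% of the new surrogates, which is why the sketch includes an explicit fallback.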

Arranging neuronal firing activities by time or position.

(A) Firing activities of representative neurons (DA#2393 and GABA#3943) arranged by time and aligned at the offset of memory delay. The colored bands illustrate the range of the timestamps of key task events across the trials of a single recording session. Timestamps differed significantly across trials, sessions, behavioral protocols, and animals. As a result, we could not define a fixed epoch for every maze section with adequate duration to analyze the neuronal responses.

(B) Firing activities of representative neurons (including DA#2393 and GABA#3943) arranged by position (same plots as in Figure 3A). Notably, the position of key task events was the same across trials, sessions, protocols, and animals, enabling the comparison of neuronal responses between animals and behavioral protocols.

(C) Firing activities of the same neurons as in Figure S5B, arranged by time and aligned by the timestamps of sensor 3 crossings. Consistent with earlier reports (Howe et al., 2013; Kim et al., 2020), while animals navigated the maze, receiving a continuous sensory input and making accurate estimations of the timing of key task events, we did not observe profound discharge rate elevations in response to the visual cue onset that would resemble strong RPE signaling.

Firing activities of midbrain DA and GABA neurons in the memory task. Related to Figure 3.

All plots included in this figure were obtained from data recorded in the memory task. (A and B) Representative discharge activity of DA (A) and GABA (B) neurons. In each example: (Top) Raster plots of the spikes, arranged by trial, and their corresponding firing rate heatmaps as a function of position in right (purple) and left (green) trials. (Bottom) Average firing rates for correct left and right trials. Note that the firing rate (spikes/s) is plotted as a function of position but has been normalized by the amount of time the mouse occupied each position in every trial. The thick lines above the firing rates represent segments with significantly different firing rates between the correct right and left trials.

(C and D) Heatmaps of the standardized average firing rates for the preferred (first column) and non-preferred (second column) trajectories of DA (C) and GABA (D) neurons. These plots show the number of standard deviations by which the average firing rate varied above or below the mean rate of correct trials as a function of the position on the maze. The row arrangement is the same as that in Figure 3B. The third column, showing the significant firing-rate differences, is the same as the third column in Figure 3B.

(E) Heatmaps of the normalized average firing rate for preferred (first column) and non-preferred (second column) trajectories and the significant difference between them (third column) for the unidentified VTA neurons.

Evaluating DA and GABA neuronal responses to specific behavioral variables in the memory task through regression analysis. Related to Figure 3.

(A) Average firing rates of representative DA and GABA neurons extracted from real firing rate data (solid lines) and encoding model predictions (dashed lines) for left (green) and right (magenta) trials.

(B) (Left and Middle columns) Normalized firing rate heatmaps for preferred and non-preferred trajectories for DA and GABA neurons recorded in the memory task (the same as the ones in Figure 3B). (Right column) Significant contributions in the firing rate difference (yellow lines) by individual behavioral variables in the memory delay extracted with the permutation analysis of the encoding model predictions.

(C) (Top) Number of neurons significantly modulated by one of the independent variables. (Bottom) Number of neurons co-modulated by trajectory (trajectory-specific) and one of the individual variables. (Note: In B and C, the trajectory-specific neurons correspond to the significant neurons extracted using the permutation analysis of the original firing rates; third column in Figure 3B).

(D) The distribution of the trial coefficient for DA and GABA neurons along with the mean ± standard deviation values.

(E) Histogram of the number of independent behavioral variables co-modulating neurons in the delay region.

The dopamine signaling model of incentive motivational drive. Related to Figures 2, 3 and 4.

Striatal DA levels ramp up when mice navigate a maze or corridor in search of reward (Howe et al., 2013; Hamid et al., 2016; Kim et al., 2020). Initially, this evidence challenged theoretical neuroscientists, since the time course of this ramping activity (lasting a few seconds) conformed neither to the reward prediction-error (RPE; phasic activity) nor to the reward-rate (tonic activity) theories (Niv, 2013). However, further research provided empirical evidence supporting the theory that DA conveys a signal for incentive motivational drive in the form of a state-action value (value of work; Hamid et al., 2016) or its derivative, RPE (Kim et al., 2020).

Accordingly, the motivational value of future rewards is exponentially discounted with time or distance (the theory was developed within an adaptive decision-making framework; hence, the value of a reward is defined by reward probability; Figure 4 in Hamid et al., 2016). Different reward values correspond to different functions of discounted state-action values. If a cue predicts a reward with a higher value, then the state-action value jumps to the discounted value function of that reward.

In our study, key parameters that could potentially influence incentive motivational drives were equal between left and right trials (reward amount, lap trajectory distance, etc.); moreover, behavioral performance did not indicate choice biasing (Figure S3C). However, we cannot rule out the possibility that left and right rewards were valued differently, producing distinct value functions. If this is true, then the trajectory-specific activities elicited by subgroups of DA and GABA neurons could reflect differences in the state-action values for left and right trials, and not memory-dependent decisions.

(A) The plot illustrates the hypothesis of different discounted value functions (i.e., the state-action value is a function of position and trajectory). In the memory task and the cue-no-choice task the assignment of the state-action value function takes place when the visual cue signals which reward is available.

(B) However, in the no-cue-no-choice task, the assignment occurs only when the mice are given access to one of the side arms.
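
The hypothesis in panels (A) and (B) can be illustrated with a toy exponentially discounted value function. Everything numeric here is an assumption made for illustration: the reward values, the discount constant, the assignment positions, and the choice to use the average of the two reward values before the trajectory is assigned. Only the 150 cm path length follows from the maze dimensions given elsewhere in this document (120 cm main arm plus a 30 cm side arm).

```python
import math


def value_profile(positions, assign_at, reward_values, chosen,
                  reward_at=150.0, discount_per_cm=0.01):
    """Discounted state-action value along the path for one trial type.

    Before `assign_at` the trajectory is unknown, so the average reward value
    is used (ASSUMPTION); from `assign_at` onward the value jumps to the
    discounted function of the chosen reward.
    """
    average_value = sum(reward_values.values()) / len(reward_values)
    profile = []
    for x in positions:
        # Exponential discounting with remaining distance to the reward.
        discount = math.exp(-discount_per_cm * max(reward_at - x, 0.0))
        value = average_value if x < assign_at else reward_values[chosen]
        profile.append(value * discount)
    return profile


positions = [0.0, 30.0, 60.0, 90.0, 120.0, 150.0]
rewards = {"left": 1.0, "right": 0.8}  # HYPOTHETICAL unequal subjective values

# Memory and cue-no-choice tasks: assignment at the (hypothetical) cue position.
cue_left = value_profile(positions, 40.0, rewards, "left")
cue_right = value_profile(positions, 40.0, rewards, "right")
# No-cue-no-choice task: assignment only at side-arm access (hypothetical 120 cm).
late_left = value_profile(positions, 120.0, rewards, "left")
late_right = value_profile(positions, 120.0, rewards, "right")
```

In this toy model the left and right profiles diverge from the cue onward in the cued tasks but only from side-arm access in the no-cue-no-choice task, which is the pattern the hypothesis would predict for trajectory-specific activity in the delay section.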

Running speed differences between the memory task and the cue-no-choice task. Related to Figures 3 and 5

(A) Representative example of running speeds (cm/s) for memory (black) and cue-no-choice (orange) trials in a single session. Speed differences were evident at the T-intersection (between delay and arms).

(B) Average running speed values (mean ± standard error of the mean) grouped by maze section and behavioral task for each of the DAT-Cre and VGAT-Cre animals performing the memory and cue-no-choice tasks. Importantly, no differences were detected in the delay and side-arm sections between memory (black) and cue-no-choice (brown) task trials (* P < 0.05, unpaired t-test on speed).

DA neurons which encode memory information for a specific trajectory do not show preference for the same trajectory reward.

(A) Heatmaps of neuronal population responses organized by preferred lap trajectory (first column) and non-preferred lap trajectory (second column) for DA neurons (Top; n = 104 units, 35 sessions in five mice) and GABA neurons (Bottom; n = 74 units, 25 sessions in four mice). The third column shows maze segments with significantly different discharge rates between preferred and non-preferred trajectories for the start, visual cue, delay, and side arms sections. The fourth column shows neurons with significant discrepancies between the left and right reward-related responses (paired t-test for mean firing rates, P < 0.05).

Note: The first three heatmaps are adopted from Figure 3B (position-aligned firing rate) and the fourth heatmap is adopted from Figure 6B (third column; time-aligned firing rate). In all the heatmaps, each row contains responses of the same neuron.

(B) Scatterplots created by dividing neuronal responses from the third and fourth column heatmaps in plot (A) into six (6) categories, grouped by the trajectory-specific preference in the delay section (x-axis) and during reward consumption (y-axis). “L” corresponds to a significant preference for the left trajectory, “R” for the right trajectory, and “NS” for non-significant firing rate difference (i.e. no preference).

Note: to yield a better visual sense of how many observations belong to every category, we randomly jittered each point along the x- and y-axis.

In both neuronal populations, more neurons showed opposite significant lap-trajectory preferences (red clusters) between delay and reward sections, as opposed to a small minority of neurons that elicited the same lap-trajectory preference (green clusters).

Notably, in agreement with earlier studies reporting preferential contralateral responses of DA neurons (Kim et al., 2015; Parker et al., 2016; Engelhard et al., 2019; Lee et al., 2019; Moss et al., 2020), the majority of memory-specific neurons exhibited a preference for the lap trajectory contralateral to the recording site (left hemisphere). Accordingly, seventeen (17) of the twenty-three (23) DA neurons with trajectory-specific activities in the delay period of the memory task showed a significant preference for right trials and only five (5) for left trials. Among GABA neurons, the proportion was similar, with twenty-six (26) neurons showing a clear preference for right trials and nine (9) for left trials.

Anatomical organization of memory-specific VTA neurons.

(A) Schematic representations of coronal slices containing the ventral tegmental area at different Bregma coordinates. Representations were adopted and modified from The Mouse Brain Atlas in Stereotaxic Coordinates (ref).

(B) The approximate stereotaxic coordinates of the optogenetically identified DA and GABA neurons were extracted from the estimated location of the recording channels. From those, a scatterplot was produced. Solid triangles correspond to DA (red; DAT-significant) and GABA (blue; Vgat-significant) neurons with trajectory-specific differences in memory delay. Open circles illustrate neurons without significant firing rate differences between left and right trials (DAT non-significant or Vgat non-significant).

(C) Among the neuronal populations, significant anatomical segregation was observed only in GABA neurons. Along the mediolateral axis, memory-specific GABA neurons were localized to more lateral regions of the VTA circuit (unpaired t-test, t(72) = -2.38, P = 0.019).

Number of neurons with trajectory-specific firing activities in the memory task grouped by maze section

Data are presented for all recorded neurons and individually for optogenetically identified DA and GABA neurons; DA, dopamine; GABA, gamma-aminobutyric acid.