Neural activity in ACC signals a motivational state to obtain reward

(A) Schematic of virtual reality experimental setup and trial structure. A mouse initiates a trial by running to trigger the onset of cues (olfactory and auditory). After cue onset, a mouse stops to collect a water reward, which ends the trial (see Methods).

(B) Representative traces of speed and licks from one mouse during a session, with shaded portions corresponding to when cues are on. Red arrows correspond to periods when mice are running to trigger cue onset or stopping to trigger water delivery. Black arrows correspond to sections of a session where we can quantify time to initiate trials, initiation speed, cue stops, and rewards.

(C) Quantification per mouse of time to initiate a trial (far left; s), initiation speed (left; cm/s), percentage of trials in which a stop occurred during cue presentation (right), and rewards received per minute (far right). Individual data points are shown (N=12 mice).

(D) Scatter plot of mean time to initiate a trial (s) against rewards received per minute for each mouse (N=12 mice). Individual data points are shown with a best-fit line (solid line). r²=0.8675 and p<0.0001, determined by linear regression.
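As an illustration of how the r² value in panel D can be obtained, the following is a minimal sketch of an ordinary least-squares fit in plain Python. The function name `linear_fit_r2` and the example values are hypothetical and are not taken from the study; the analysis software actually used is not stated in this legend.

```python
def linear_fit_r2(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r2), where r2 is the squared Pearson
    correlation between x and y.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


# Hypothetical per-mouse values, for illustration only (not data from the study).
time_to_initiate_s = [2.0, 4.0, 6.0, 8.0]
rewards_per_min = [3.1, 2.4, 1.9, 1.0]
slope, intercept, r2 = linear_fit_r2(time_to_initiate_s, rewards_per_min)
```

With one point per mouse, a negative slope would indicate that mice that take longer to initiate trials collect fewer rewards per minute.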

(E) Left: bulk neural activity recording experimental design. GCaMP6f was injected into the anterior cingulate cortex (ACC) and neural activity was recorded on a fiber photometry setup (see Methods). Right: Brain histology from a representative mouse showing DAPI in blue, GCaMP6f in green and photometry cannula implantation in ACC (dotted white lines). Scale bar: 1mm.

(F) Top: Trial-averaged plots of ACC activity (z-scored dF/F) and speed (cm/s) aligned to reward onset. Data are mean (solid line) ± s.e.m. (shaded area). Bottom: Relative frequency plots of the time (s) for ACC dF/F to rise above 1 std or for speed to rise above 1 cm/s during rewards (N=105 trials across 12 mice). Data are the relative frequency of rise times. *p<0.05, paired t-test comparing time to rise (s) between ACC activity and speed.
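The time-to-rise measure in panel F can be sketched as the latency of the first threshold crossing after the aligned event. This is a minimal sketch under stated assumptions: the function name `time_to_rise`, the 10 Hz sampling rate, and the example traces are hypothetical, and the legend does not specify how crossings were detected.

```python
def time_to_rise(trace, threshold, sample_rate_hz, event_index=0):
    """Return seconds from event_index to the first sample above threshold.

    Returns None if the trace never exceeds the threshold after the event.
    """
    for i in range(event_index, len(trace)):
        if trace[i] > threshold:
            return (i - event_index) / sample_rate_hz
    return None


# Hypothetical single-trial traces sampled at 10 Hz; index 0 is the event.
acc_z = [0.1, 0.3, 1.2, 1.8, 2.0]        # z-scored dF/F, threshold of 1 std
speed_cm_s = [0.0, 0.2, 0.5, 0.9, 1.4]   # speed, threshold of 1 cm/s

acc_latency = time_to_rise(acc_z, 1.0, 10.0)
speed_latency = time_to_rise(speed_cm_s, 1.0, 10.0)
```

Collecting one latency per trial for each signal gives the two distributions that the paired comparison operates on.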

(G) Same as F, but for trial initiations (N=510 trials across 12 mice).

(H) Injection strategy for DREADDs-based chemogenetic inhibition of ACC during the self-paced task. Coronal section from an animal virally injected with AAV1-CaMKII-hM4D(Gi) in ACC. DAPI is shown in blue and hM4D(Gi) in red. Scale bar: 1mm.

(I) Representative traces of speed and licks from one mouse during the task on a day with saline (top) or CNO (bottom) administration 45 minutes prior to a session, with shaded portions corresponding to when cues are presented.

(J) Left: Quantification of time (s) to initiate a trial across saline and CNO sessions in mCherry-control mice (N=188 trials across 6 mice) and hM4D(Gi)-DREADDs mice (N=215 trials across 4 mice). Right: Same as left, but for rewards received per minute in mCherry-control mice (N=60 minutes across 6 mice) and hM4D(Gi)-DREADDs mice (N=40 minutes across 4 mice). p=0.8707 for mCherry and *p<0.05 for hM4Di (time to initiate); p=0.2073 for mCherry and *p<0.05 for hM4Di (rewards per minute); unpaired t-test between saline and CNO sessions per group. Data are mean ± s.e.m.

Neural activity in ACC scales to match an increased motivational state during learning

(A) Top: Schematic of training in which mice learn to associate stopping at one set of cues with no water reward (“N”) and at another set with water reward (“R”). Bottom: Representative traces of speed and licks from one mouse during sessions on Training Day 2 and Day 4, with shaded portions corresponding to when reward cues (R, blue) or no-reward cues (N, orange) are presented. Red arrows denote the suppression of licks on Day 2 and the rise in speed during no-reward cues on Day 4.

(B) Trial-averaged speed (cm/s; top), lick rate (Hz; middle) and ACC activity (dF/F z-scored; bottom) aligned to cue presentation on days 2 and 4 of training, separated by reward and no-reward cues (blue vs orange). Black arrow signifies the rise in speed after no-reward cue presentation. N=12 mice. Data are mean (dark line) with s.e.m. (shaded area).

(C) Quantification of average cue speed (cm/s; top), lick rate (Hz; middle) and ACC activity (dF/F z-scored; bottom) across training, separated by reward and no-reward cues (blue vs orange). N=12 mice in each group, data are mean ± s.e.m. *p<0.05, paired t-test between reward and no-reward.
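For readers unfamiliar with the paired comparison used in panel C, the following minimal sketch computes a paired t statistic from per-mouse values under the two cue types. The function name `paired_t` and the example values are hypothetical; converting the statistic to a p value requires the t distribution with the returned degrees of freedom, which is left to a statistics package.

```python
import math


def paired_t(a, b):
    """Paired t statistic for matched samples a and b.

    Returns (t, degrees_of_freedom), computed on the per-pair differences.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("a and b must have the same length (at least 2)")
    diffs = [ai - bi for ai, bi in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("differences have zero variance")
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1


# Hypothetical per-mouse values for reward vs no-reward cues (not study data).
reward_cue = [1.0, 2.0, 3.0, 4.0]
no_reward_cue = [0.0, 0.0, 2.0, 2.0]
t_stat, dof = paired_t(reward_cue, no_reward_cue)
```

Because each mouse contributes one value per cue type, the test operates on within-mouse differences rather than on the two group means separately.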

(D) Scatter plots of rewards per minute vs stop discrimination (top), lick discrimination (middle), or dF/F difference (bottom) for each mouse throughout training (N=120 data points; 12 mice on each of 10 days). Data are individual points with a best-fit line. r² and p values are shown, as determined by linear regression.

(E) Top: Trial averaged speed (cm/s) and ACC activity (dF/F z-scored) aligned to cue presentation across 3 trials consisting of a reward, no-reward, and reward cue (RNR). Bottom: Trial averaged ACC activity (dF/F z-scored) aligned to cue presentation across 4 trials consisting of a reward, no-reward, no-reward and reward cue (RNNR). N=12 mice. Data are mean (dark line) with s.e.m. (shaded area).

(F) Quantification of average cue dF/F activity across RNR and RNNR trial sequences. N=12 mice. *p<0.05, one-way repeated measures ANOVA with post-hoc Tukey’s multiple comparison test. Data are mean ± s.e.m.

(G) Top: Injection strategy for AAV1-CaMKII-stGtACR2 into ACC for optogenetic inhibition during training. Middle: Brain histology from a representative mouse showing DAPI in blue, stGtACR2 in red and photometry cannula implantation in ACC. Scale bar: 1mm. Bottom: optogenetic inhibition was targeted to days 1-6 of training and mice were allowed to continue training for days 7-10.

(H) Left: Trial-averaged plots of speed (cm/s) aligned to cue entry on T6 for mCherry controls and GtACR inhibition mice, separated by reward or no-reward cues. Right: Quantification of mean speed during cue presentations. N=8 mice for mCherry, 4 for GtACR early inhibition. *p<0.05, paired t-test.

Mice with extended motivational states during learning display neural activity ramps in OFC

(A) Injection strategy and fiber-based photometry setup to record bulk GCaMP6f of projections to ACC from OFCACC (orbitofrontal cortex), AMACC (anteromedial thalamus), BLAACC (basolateral amygdala), or LCACC (locus coeruleus). Representative traces from a single mouse showing dF/F for each region, speed, and licks. Shaded portions correspond to when reward cues (R, blue) or no-reward cues (N, orange) are presented.

(B) Left: Trial-averaged bulk GCaMP6f dF/F of ACC, OFCACC, AMACC, BLAACC, and LCACC during a sequence of trials on T6 including reward, no-reward, and reward cues (RNR). Black arrows denote the rise in pre-cue activity from the N cue to the following R cue in the RNR sequence. Right: Quantification of pre-cue activity for the N cue and the following R cue. N=19, 12, 5, 4 mice. Data are mean (solid line) ± s.e.m. (shaded area). *p<0.05, paired t-test between N and R cues.

(C) Left: Trial-averaged bulk GCaMP6f dF/F of OFCACC during a sequence of trials including reward, two no-reward, and reward cues (RNNR). Red arrows denote the rise in pre-cue activity from the first N cue to the last R cue in the RNNR sequence. Right: Quantification of pre-cue activity for the first N cue, second N cue and last R cue. N=19 mice. Data are mean (solid line) ± s.e.m. (shaded area). *p<0.05, one-way repeated measures ANOVA with post-hoc Tukey’s multiple comparison test.

(D) Left: Speed (cm/s) for “Learner” (black; reached a discrimination index (DI) > 0.5 for 3 consecutive days) or “Non-Learner” (red) mice on training day 6, aligned to no-reward cue onset. Middle: DI for each group of mice throughout training. Right: Speed during reward and no-reward cues for “Learner” mice. N=7 (“Learner”) and 9 (“Non-Learner”) mice. Data are mean (solid line) ± s.e.m. (shaded area). *p<0.05, unpaired t-test between Learner and Non-Learner DI (middle), paired t-test between reward and no-reward cues (right).

(E) Left: Trial-averaged bulk GCaMP6f dF/F of OFCACC during a sequence of trials including reward, two no-reward, and reward cues (RNNR). Black arrows denote the rise in pre-cue activity from the first N cue to the last R cue in the RNNR sequence. Red arrows denote the absence of this ramp in Non-Learner mice. Right: Quantification of pre-cue activity for the first N cue, second N cue and last R cue. N=7 (“Learner”) and 9 (“Non-Learner”) mice. Data are mean (solid line) ± s.e.m. (shaded area). *p<0.05, one-way repeated measures ANOVA with post-hoc Tukey’s multiple comparison test.

Orbitofrontal cortex projection neurons tile sequences of trials with no-rewards

(A) Injection strategy (top left), histology (top right; scale bar, 1mm) and z-projection images of two-photon recording (bottom left; mean over time; scale bars, 200 μm) of GCaMP-expressing OFC projection neurons with GRIN implants. Bottom right: Sequence of trials with z-scored dF/F for individual neurons, with shaded portions corresponding to when reward cues (R, blue) or no-reward cues (N, orange) are presented. Red arrow denotes a dF/F transient occurring after 2 consecutive N cues.

(B) Stop (black) or lick (grey; see Methods) discrimination index (DI) on the first day stop DI exceeds 0.4 (“after”) and on the two previous days (“before” and “middle”). N=5 mice.

(C) Schematic of OFCACC bulk activity based on Figure 3 results and potential single neuron findings that tile a sequence of trials with two no-rewards followed by a reward cue presentation (NNR).

(D) Representative neurons tuned (std > 0.75 for 3 seconds before or after cue presentation) to separate cues in an NNR trial sequence. Trial-averaged activity of an N (top), NN (middle), and NNR (bottom) neuron, with heat maps showing individual trial responses.
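One way to read the tuning criterion in panel D is sketched below. This is an interpretation, not the authors' stated procedure: it assumes that a neuron counts as tuned to a cue when its z-scored, trial-averaged activity averages above 0.75 std over the 3 s window either before or after cue onset. The function name `is_tuned`, the 1 Hz sampling rate, and the example traces are hypothetical.

```python
def is_tuned(z_trace, cue_index, sample_rate_hz, threshold=0.75, window_s=3.0):
    """Return True if mean z-scored activity exceeds threshold in the
    window_s seconds before OR after cue_index.

    Assumption: the criterion is applied to the window mean; the legend
    does not say whether the mean or every sample must exceed threshold.
    """
    n = int(round(window_s * sample_rate_hz))
    pre = z_trace[max(0, cue_index - n):cue_index]
    post = z_trace[cue_index:cue_index + n]

    def window_mean(values):
        # An empty window can never satisfy the criterion.
        return sum(values) / len(values) if values else float("-inf")

    return window_mean(pre) > threshold or window_mean(post) > threshold


# Hypothetical trial-averaged traces at 1 Hz, cue onset at index 3.
responsive = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
silent = [0.0, 0.0, 0.0, 0.1, 0.0, 0.1]
responsive_tuned = is_tuned(responsive, cue_index=3, sample_rate_hz=1.0)
silent_tuned = is_tuned(silent, cue_index=3, sample_rate_hz=1.0)
```

Applying the same check at each cue position in the NNR sequence would assign a neuron to the N, NN, or NNR group according to where it passes.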

(E) Quantification of neurons tuned to separate cues within an NNR trial sequence and their activity to all other cues. N=17 (N), 18 (NN), 32 (NNR) cells out of 115 cells in total. *p<0.05, one-way repeated measures ANOVA with post-hoc Tukey’s multiple comparison test.

(F) Percentage of neurons tuned to different cues in an NNR trial sequence before (top) or after (bottom) training. N=5 mice. *p<0.05, one-way repeated measures ANOVA with post-hoc Tukey’s multiple comparison test.

(G) Ensemble average plots of neurons tuned to R cues after 2 consecutive N cue presentations (NNR cells) before learning (top) and their activity after learning (bottom). Black arrows denote the rise in activity prior to R cues after learning. N=18 NNR cells out of 81 cells tracked across days.

(H) Quantification of transient time (s) since R cue onset for neurons tracked across days. N=132 and 170 transient events before and after learning, respectively, across 18 NNR cells, and 105 and 59 transient events before and after learning, respectively, across 12 NR cells. *p<0.05, unpaired t-test.

(I) Left: Injection strategy for AAV1-hSyn-SIO-stGtACR2 into OFCACC for optogenetic inhibition during training. Optogenetic inhibition was applied for 6 days of training. Right: Brain histology from a representative mouse showing DAPI in blue, stGtACR2 in red and photometry cannula implantation in ACC. Scale bar: 1mm.

(J) Left: Mean animal speed (cm/s) aligned to cue zone entry after no-reward on T6 for mCherry control or GtACR mice. Black arrow signifies the lack of a speed increase during N cues. Right: Quantification of mean change in speed in the cue zone after no-reward, assessed separately for each cue presentation. N=10 mice for mCherry and 13 mice for GtACR. *p<0.05, paired t-test.