Experimental design and linear track behavior.

(A) TH-cre rats underwent stereotactic surgery to inject virus bilaterally into VTA and implant a tetrode microdrive above dorsal CA1. (B) Co-expression of mCherry (red) and TH (green) in VTA from three example animals. Left panel, mCherry-only virus, scale bar 600 µm; middle panel, hM4Di-mCherry, scale bar 150 µm; right panel, hM4Di-mCherry, scale bar 75 µm. (C) Intraperitoneal injection of saline or CNO (1-4 mg/kg) preceded recording sessions by at least 10 minutes. Rats were placed at one end of a linear track and collected liquid chocolate reward from wells at each end. Each epoch lasted 10-20 laps and reward changes were unsignaled to the animal. For each session, the Incr. end was defined as the reward end with 4X reward in Epoch 2, and the Unch. end was defined as the reward end with 1X reward in Epoch 2. (D) During stopping periods at reward ends, LFP was bandpass filtered in the ripple band (150-250 Hz) and SWR events were detected. (E) Three example ripple-filtered LFP traces from one lap (two stopping periods) are shown. (F) Cumulative distribution of reward end stopping periods at the Unch. reward end in Epoch 1 and 2 for experimental rats (left panel) and control rats (right panel). See also Figures S1-S3. (G) The duration of Unch. reward end stopping periods decreased from Epoch 1 to Epoch 2. Mean ± standard error, Exp Saline, Epoch 1: 6.28±0.17, Epoch 2: 4.74±0.15, two-sample t-test: t(1530)=6.7, p<10⁻⁸; Exp CNO, Epoch 1: 6.96±0.21, Epoch 2: 5.71±0.16, two-sample t-test: t(1352)=4.785, p<10⁻⁵. Con Saline, Epoch 1: 6.55±0.15, Epoch 2: 4.58±0.1, two-sample t-test: t(1149)=11.032, p<10⁻¹⁰; Con CNO, Epoch 1: 6.45±0.12, Epoch 2: 4.39±0.06, two-sample t-test: t(1286)=15.06, p<10⁻¹⁰. Three-way ANOVA with epoch, drug, and animal group: epoch (F[1,5317]=252.26, p<10⁻¹⁰), drug (F[1,5317]=9.93, p=0.0016), group (F[1,5317]=16.09, p=0.0001), epoch X group (F[1,5317]=8.23, p=0.0041), drug X group (F[1,5317]=20.3, p<10⁻⁵).
(H) Cumulative distribution of reward end stopping periods at the Incr. reward end in Epoch 1 and 2 for experimental rats (left panel) and control rats (right panel). (I) The duration of Incr. reward end stopping periods increased from Epoch 1 to Epoch 2. Mean ± standard error, Exp Saline, Epoch 1: 6.314±0.17, Epoch 2: 10.351±0.2, two-sample t-test: t(1514)=-15.315, p<10⁻¹⁰; Exp CNO, Epoch 1: 6.67±0.22, Epoch 2: 11.691±0.25, two-sample t-test: t(1340)=-15.059, p<10⁻¹⁰. Con Saline, Epoch 1: 6.859±0.17, Epoch 2: 11.047±0.17, two-sample t-test: t(1138)=-17.447, p<10⁻¹⁰; Con CNO, Epoch 1: 6.229±0.12, Epoch 2: 10.304±0.11, two-sample t-test: t(1274)=-24.745, p<10⁻¹⁰. Three-way ANOVA with epoch, drug, and animal group: epoch (F[1,5266]=1077.4, p<10⁻¹⁰), drug X group (F[1,5266]=33.8, p<10⁻⁵), epoch X drug X group (F[1,5266]=4.33, p=0.0376).
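As a rough consistency check on the statistics reported for the stopping-period durations, a t statistic can be approximated from the means and standard errors alone. The sketch below assumes the unequal-variance form, t = (m1 - m2) / sqrt(se1² + se2²), and the function name is hypothetical. The legend reports two-sample t-tests on the raw visit durations, so small deviations from the published values are expected.

```python
import math

def t_from_summary(mean1, se1, mean2, se2):
    """Approximate a two-sample t statistic from means and standard errors.

    Assumption: unequal-variance (Welch) form. The reported tests were run on
    raw data, so this only approximates the published statistic.
    """
    return (mean1 - mean2) / math.sqrt(se1 ** 2 + se2 ** 2)

# Exp Saline, Unch. end: Epoch 1 = 6.28±0.17, Epoch 2 = 4.74±0.15
t_approx = t_from_summary(6.28, 0.17, 4.74, 0.15)
print(round(t_approx, 2))  # should land close to the reported t(1530)=6.7
```

A positive value corresponds to a shorter stopping period in Epoch 2, matching the sign convention of the reported tests.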

Modulation of SWR rate by reward, novelty, and VTA inactivation.

(A) SWR rate as a function of time in stopping period in Epoch 1 and 2 for four example sessions in experimental rats; from left to right, saline on familiar track, saline on novel track, CNO on familiar track, and CNO on novel track. In each panel, visits to the Incr. end are on the left and visits to the Unch. end are on the right. Relative to Epoch 1 (black lines), in Epoch 2 (red lines) SWR rate increased at the Incr. end and decreased at the Unch. end in all conditions except for CNO on a novel track (far right), where SWR rate increased at both ends in Epoch 2. SWR rate was binned in 0.25 s windows and smoothed with a two-bin Gaussian. Line, mean; shading, standard error. (B) SWR rate in experimental rats as a function of epoch, drug (saline in solid lines, CNO in dashed lines), reward end (Unch. in black, Incr. in green), and novelty (familiar in left panel, novel in right panel). See also Figures S3 and S4. (C) SWR rate in control rats as a function of epoch, drug (saline in solid lines, CNO in dashed lines), reward end (Unch. in black, Incr. in green), and novelty (familiar in left panel, novel in right panel). (D) Difference between SWR rate at Incr. and Unch. ends in Epoch 2 in experimental rats. Full stopping period, left panel. Trimmed stopping period, with first 1 s and last 1 s of visit excluded to eliminate all slow approaching/leaving movement, right panel. Saline, gray bars; CNO, white bars. Mean and standard error. Full stopping periods, three-way ANOVA with animal group, drug, and novelty: drug (F[1,153]=5.19, p=0.0241), group X drug (F[1,153]=5.16, p=0.0245). Trimmed stopping periods, three-way ANOVA with animal group, drug, and novelty: group X drug (F[1,153]=5.58, p=0.0194). (E) Difference between SWR rate at Incr. and Unch. ends in Epoch 2 in control rats, as in (D). Statistics are reported in (D).
(F) In experimental rats, the difference in SWR rate between reward ends (Incr. – Unch.) in Epoch 2, after subtracting the mean rates in Epoch 1, averaged over a 5-lap sliding window within Epoch 2. Blue lines, novel sessions. Gray lines, familiar sessions. Blue and gray asterisks denote the centers of sliding windows in which the difference in SWR rate was significantly greater than 0 in novel and familiar sessions, respectively (one-sample t-test, p<0.05). Shading denotes 95% confidence interval. See also Figure S4. (G) As in (F), but for control animals.
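The binning and smoothing of SWR rate over the stopping period can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, "two-bin Gaussian" is interpreted here as a kernel with a standard deviation of two bins, and the edge handling (renormalizing the kernel over the available bins) is an assumption.

```python
import math

def binned_swr_rate(visits, bin_s=0.25, n_bins=8):
    """Return SWR rate (events/s) per time bin, pooled over stopping periods.

    visits: list of (duration_s, swr_times_s), times relative to arrival.
    """
    counts = [0] * n_bins
    occupancy = [0.0] * n_bins  # seconds of stopping time covered by each bin
    for duration, swr_times in visits:
        for b in range(n_bins):
            start, stop = b * bin_s, (b + 1) * bin_s
            occupancy[b] += max(0.0, min(stop, duration) - start)
        for t in swr_times:
            b = int(t // bin_s)
            if 0 <= t < duration and b < n_bins:
                counts[b] += 1
    return [c / o if o > 0 else float("nan") for c, o in zip(counts, occupancy)]

def gaussian_smooth(values, sigma_bins=2.0):
    """Gaussian smoothing; renormalizes at the edges and skips NaN bins.

    Assumption: sigma_bins=2.0 is one reading of "two-bin Gaussian".
    """
    radius = int(math.ceil(3 * sigma_bins))
    smoothed = []
    for i in range(len(values)):
        total, weight_sum = 0.0, 0.0
        for j in range(max(0, i - radius), min(len(values), i + radius + 1)):
            if math.isnan(values[j]):
                continue
            w = math.exp(-0.5 * ((j - i) / sigma_bins) ** 2)
            total += w * values[j]
            weight_sum += w
        smoothed.append(total / weight_sum if weight_sum > 0 else float("nan"))
    return smoothed

# Two toy stopping periods: (duration in s, SWR onset times in s)
rates = binned_swr_rate([(1.0, [0.1, 0.6]), (0.5, [0.3])], n_bins=4)
print(gaussian_smooth(rates))
```

Dividing counts by occupancy rather than by the number of visits keeps the rate unbiased in late bins, where short stopping periods no longer contribute data.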

Frequent reward changes modulated SWR rate.

(A) Recording sessions in the volatile reward task were preceded by intraperitoneal injection of saline or CNO by at least 10 minutes. Rats were placed on the stable end to begin each session, which delivered 0.2 ml reward at each visit, while the volatile end delivered 0, 0.1, 0.2, 0.4, or 0.8 ml, pseudorandomly chosen on each lap. Bottom panel, schematic of how value and RPE would modulate SWR. Given a particular current volume, value coding predicts a positive correlation between SWR rate and previous volume, while RPE coding predicts a negative correlation. (B) SWR rate as a function of reward volume and time in end visit in example rat, experimental rat 4. Left panel, saline. Right panel, CNO. In stable panel, traces are colored based on previous volatile end visit volume. In volatile panel, traces are colored based on current volatile volume. See also Figures S5 and S6. (C) SWR rate as a function of reward volume and time in end visit in example control rat 3, as in (B). (D) Top panel, SWR rate at volatile end as a function of current and previous volatile volume, for saline sessions in experimental rats. Middle panel, SWR rate for each non-zero volatile volume plotted as a function of previous volume, with the mean SWR rate for that current volume subtracted. Unfilled symbols, mean of previous volume across all current volumes. Thick dashed line, linear fit to mean values. Pearson correlation between (ripple rate – mean) and previous volume, r=-0.076, p=0.177. Error bars, standard error. Bottom panel, SWR rate as a function of reward volume, separated by recent reward history (median split on average of last 3 visits). Black, recent history below median; red, recent history above median. (E) Same as (D), for CNO sessions in experimental rats. Middle panel, Pearson correlation between (ripple rate – mean) and previous volume, r=-0.109, p=0.049.
GLM fitting SWR rate as a function of drug, current volume, and previous volume: previous volume, z=-2.31, p=0.021; drug and current volume, both p>0.8. Bottom panel, Poisson GLM fitting ripple rate as a function of volume, drug condition, and reward history (above/below median): volume, z=13.86, p<10⁻¹⁰; history, z=-2.23, p=0.026; drug, z=-1.05, p=0.29. (F) The RPE of each volatile end visit was calculated by subtracting the previous volatile volume from the current volume. Two-way ANOVA with drug and RPE sign (+/-): drug (F[1,518]=0.3, p=0.582), RPE sign (F[1,518]=6.42, p=0.0116), drug X RPE sign (F[1,518]=0.07, p=0.785).
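The RPE definition and the mean-subtracted correlation used in this figure can be illustrated with a short sketch. The function names are hypothetical, volumes are given in arbitrary units, and treating the per-volume mean as the mean over the included visits is an assumption.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rpe_per_visit(volumes):
    """RPE for each visit after the first: current minus previous volume."""
    return [cur - prev for prev, cur in zip(volumes, volumes[1:])]

def residual_rate_vs_previous(volumes, rates):
    """Pair previous volume with SWR rate minus the mean rate for the current volume.

    Visits with zero current volume are skipped, as in the middle panels.
    """
    visits = [(volumes[i - 1], volumes[i], rates[i])
              for i in range(1, len(volumes)) if volumes[i] > 0]
    by_volume = {}
    for _, cur, rate in visits:
        by_volume.setdefault(cur, []).append(rate)
    mean_rate = {v: sum(r) / len(r) for v, r in by_volume.items()}
    previous = [prev for prev, _, _ in visits]
    residual = [rate - mean_rate[cur] for _, cur, rate in visits]
    return previous, residual

# Toy volatile-end data (volumes in arbitrary units, rates in events/s)
volumes = [2, 4, 8, 4, 0, 4]
rates = [1.0, 0.5, 0.9, 0.3, 0.0, 0.7]
previous, residual = residual_rate_vs_previous(volumes, rates)
print(rpe_per_visit(volumes), pearson_r(previous, residual))
```

Under this construction a negative correlation is the pattern predicted by RPE coding and a positive correlation is the pattern predicted by value coding.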

Replay recruitment by reward change in novel sessions requires VTA signaling.

(A) Place cells exhibit directional place fields on the linear track. Fields calculated from movement in a particular direction (“right” fields and “left” fields), ordered based on field center location in either running direction (“right” order and “left” order). Example saline session and CNO session from experimental rat 3. See also Figures S7 and S8. (B) Three example replays from Epoch 2 of a novel saline session from experimental rat 3. Red, posterior in upwards map; blue, posterior in downwards map. Title indicates reward end (Incr., Unch.) and replay direction (Reverse, Forward). The horizontal black line indicates rat position. (C) Three example replays from Epoch 2 of a novel CNO session from experimental rat 3, as in (B). (D) The difference in rate of reverse replay at each end (Incr. – Unch.) in novel sessions in experimental rats. Error bars, standard error of the mean. Reward condition is indicated by color (equal reward, epoch 1 and 3, gray; unequal reward, epoch 2, orange), and drug condition is indicated on the x-axis. The difference between equal and unequal reward conditions was assessed with a three-way ANOVA with drug, novelty, and replay directionality: drug X novelty X directionality (F[1,106]=4.64, p=0.0335), all other terms p>0.05. (E) Same as (D), but for familiar sessions. (F) Same as (D), but for forward replay. (G) Same as (F), but for familiar sessions. (H) Same as (D), but for control rats. The difference between equal and unequal reward conditions was assessed with a three-way ANOVA with drug, novelty, and replay directionality: novelty X directionality (F[1,101]=9.04, p=0.0034), all other terms p>0.05. (I) Same as (H), but for familiar sessions. (J) Same as (H), but for forward replay. (K) Same as (J), but for familiar sessions.

Behavioral effects of novelty and VTA inactivation. Related to Figure 1.

(A) In novel sessions, Unch. visit duration decreased from Epoch 1 to Epoch 2, while CNO additionally led to longer visit duration in experimental rats. Mean ± standard error, Exp Saline, Epoch 1: 7.181±0.33, Epoch 2: 4.892±0.2, two-sample t-test: t(348)=5.836, p<10⁻⁵; Exp CNO, Epoch 1: 10.542±0.65, Epoch 2: 6.789±0.41, two-sample t-test: t(297)=5.012, p<10⁻⁵. Con Saline, Epoch 1: 7.594±0.28, Epoch 2: 4.8±0.14, two-sample t-test: t(401)=9.054, p<10⁻¹⁰; Con CNO, Epoch 1: 7.267±0.26, Epoch 2: 4.834±0.12, two-sample t-test: t(426)=8.471, p<10⁻¹⁰. Three-way ANOVA with epoch, drug, and animal group: epoch (F[1,1472]=171.66, p<10⁻¹⁰), drug (F[1,1472]=33.3, p<10⁻⁵), group (F[1,1472]=32.57, p<10⁻⁵), drug X group (F[1,1472]=41.66, p<10⁻⁵), epoch X drug X group (F[1,1472]=4.5, p=0.034). (B) In novel sessions, Incr. visit duration increased from Epoch 1 to Epoch 2, while CNO additionally led to longer visit duration in experimental rats. Mean ± standard error, Exp Saline, Epoch 1: 7.179±0.39, Epoch 2: 9.968±0.3, two-sample t-test: t(343)=-5.668, p<10⁻⁶; Exp CNO, Epoch 1: 10.16±0.74, Epoch 2: 13.721±0.48, two-sample t-test: t(293)=-4.164, p=0.00004. Con Saline, Epoch 1: 7.478±0.29, Epoch 2: 10.907±0.24, two-sample t-test: t(395)=-9.086, p<10⁻¹⁰; Con CNO, Epoch 1: 6.65±0.21, Epoch 2: 10.506±0.15, two-sample t-test: t(420)=-15.18, p<10⁻¹⁰. Three-way ANOVA with epoch, drug, and animal group: epoch (F[1,1451]=192.37, p<10⁻¹⁰), drug (F[1,1451]=31.38, p<10⁻⁵), group (F[1,1451]=31.16, p<10⁻⁵), drug X group (F[1,1451]=65.62, p<10⁻¹⁰). (C) In familiar sessions, Unch. visit duration decreased from Epoch 1 to Epoch 2, with only a modest effect of CNO compared to novel sessions. Mean ± standard error, Exp Saline, Epoch 1: 6.011±0.2, Epoch 2: 4.7±0.18, two-sample t-test: t(1180)=4.806, p<10⁻⁴; Exp CNO, Epoch 1: 6.037±0.18, Epoch 2: 5.371±0.17, two-sample t-test: t(1053)=2.712, p=0.0068.
Con Saline, Epoch 1: 5.969±0.17, Epoch 2: 4.465±0.13, two-sample t-test: t(746)=7.057, p<10⁻¹⁰; Con CNO, Epoch 1: 6.035±0.13, Epoch 2: 4.174±0.07, two-sample t-test: t(858)=13.165, p<10⁻¹⁰. Three-way ANOVA with epoch, drug, and animal group: epoch (F[1,3837]=120.87, p<10⁻¹⁰), group (F[1,3837]=9.22, p=0.0024), epoch X group (F[1,3837]=8.15, p=0.0043), epoch X drug X group (F[1,3837]=4.26, p=0.0391). (D) In familiar sessions, Incr. visit duration increased from Epoch 1 to Epoch 2. Mean ± standard error, Exp Saline, Epoch 1: 6.058±0.19, Epoch 2: 10.463±0.24, two-sample t-test: t(1169)=-14.293, p<10⁻¹⁰; Exp CNO, Epoch 1: 5.77±0.18, Epoch 2: 11.07±0.29, two-sample t-test: t(1045)=-15.654, p<10⁻¹⁰. Con Saline, Epoch 1: 6.519±0.2, Epoch 2: 11.12±0.23, two-sample t-test: t(741)=-14.979, p<10⁻¹⁰; Con CNO, Epoch 1: 6.018±0.15, Epoch 2: 10.206±0.15, two-sample t-test: t(852)=-19.849, p<10⁻¹⁰. Three-way ANOVA with epoch, drug, and animal group: epoch (F[1,3807]=885.39, p<10⁻¹⁰), drug X group (F[1,3807]=7.78, p=0.0053), epoch X drug X group (F[1,3807]=4.44, p=0.0352). (E) Unch. visit duration increased from Epoch 2 to Epoch 3. Mean ± standard error, Exp Saline, Epoch 2: 4.743±0.15, Epoch 3: 7.274±0.23, two-sample t-test: t(1354)=-9.542, p<10⁻¹⁰; Exp CNO, Epoch 2: 5.705±0.16, Epoch 3: 6.898±0.27, two-sample t-test: t(1096)=-4.031, p=10⁻⁵. Con Saline, Epoch 2: 4.58±0.1, Epoch 3: 6.05±0.18, two-sample t-test: t(939)=-7.838, p<10⁻¹⁰; Con CNO, Epoch 2: 4.391±0.06, Epoch 3: 6.049±0.13, two-sample t-test: t(1033)=-12.818, p<10⁻¹⁰. Three-way ANOVA with epoch, drug, and animal group: epoch (F[1,4422]=196.47, p<10⁻¹⁰), group (F[1,4422]=52.77, p<10⁻⁵), epoch X drug (F[1,4422]=5.53, p=0.0187), epoch X drug X group (F[1,4422]=9.74, p=0.0018). (F) Incr. visit duration decreased from Epoch 2 to Epoch 3.
Mean ± standard error, Exp Saline, Epoch 2: 10.351±0.2, Epoch 3: 7.878±0.31, two-sample t-test: t(1369)=6.993, p<10⁻¹⁰; Exp CNO, Epoch 2: 11.691±0.25, Epoch 3: 7.718±0.35, two-sample t-test: t(1116)=9.437, p<10⁻¹⁰. Con Saline, Epoch 2: 11.047±0.17, Epoch 3: 6.578±0.18, two-sample t-test: t(958)=17.166, p<10⁻¹⁰; Con CNO, Epoch 2: 10.304±0.11, Epoch 3: 6.296±0.17, two-sample t-test: t(1057)=20.98, p<10⁻¹⁰. Three-way ANOVA with epoch, drug, and animal group: epoch (F[1,4500]=491.46, p<10⁻¹⁰), group (F[1,4500]=25.7, p<10⁻⁵), epoch X group (F[1,4500]=9.11, p=0.0026), drug X group (F[1,4500]=10.72, p=0.0011), epoch X drug X group (F[1,4500]=8.49, p=0.0036).

Effect of reward change on running velocity. Related to Figure 1.

Running speed towards the Incr. end in Epoch 2 was significantly faster than towards the Unch. end across all animal group, drug, and novelty conditions (one-sample t-test: exp, novel, CNO: t(9)=3.96, p=0.003; exp, novel, saline: t(8)=3.45, p=0.009; exp, familiar, CNO: t(33)=6.9, p<10⁻⁷; exp, familiar, saline: t(35)=3.96, p<10⁻³; control, novel, CNO: t(11)=3.34, p=0.007; control, novel, saline: t(11)=4.26, p=0.001; control, familiar, CNO: t(24)=8.46, p<10⁻⁷; control, familiar, saline: t(22)=5.6, p<10⁻⁴). Filled symbol, saline; unfilled symbol, CNO. Error bars, standard error.

Modulation of SWR rate by reward increase. Related to Figure 2.

(A) In experimental rats, a mixed effects Poisson GLM was fit to the data and 5,000 drug identity shuffles. The difference between model-predicted SWR rate in saline and CNO sessions at each reward end (Unch. top row, Incr. bottom row) and novelty condition (familiar left column, novel right column), in data (red lines) and in bootstrap shuffles (histogram). Significance values reflect one-tailed hypothesis test, with hypotheses that Unch. saline < Unch. CNO and Incr. saline > Incr. CNO. (B) A mixed effects GLM with bootstrap, as in (A), but for control animals.
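The logic of the drug identity shuffle can be sketched with a deliberately simplified stand-in. The legend's test statistic comes from a mixed effects Poisson GLM; the sketch below swaps in a plain difference of session means so that it stays self-contained, and the function names, the seed, and the +1 correction in the p-value are assumptions.

```python
import random

def mean(values):
    return sum(values) / len(values)

def drug_shuffle_test(saline_rates, cno_rates, n_shuffles=5000, seed=0):
    """One-tailed shuffle test of the hypothesis saline > CNO.

    Simplification: the statistic is a difference of means, not a
    GLM-predicted rate difference as in the legend.
    """
    rng = random.Random(seed)
    observed = mean(saline_rates) - mean(cno_rates)
    pooled = list(saline_rates) + list(cno_rates)
    n_saline = len(saline_rates)
    n_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign drug labels at random
        shuffled = mean(pooled[:n_saline]) - mean(pooled[n_saline:])
        if shuffled >= observed:
            n_extreme += 1
    # The +1 terms keep the estimated p-value away from exactly zero
    p_value = (n_extreme + 1) / (n_shuffles + 1)
    return observed, p_value

# Toy per-session SWR rates (events/s), not data from the study
observed, p_value = drug_shuffle_test([0.9, 1.1, 1.2, 1.0], [0.6, 0.8, 0.7, 0.9])
print(observed, p_value)
```

For the opposite hypothesis (saline < CNO, as tested at the Unch. end), the two arguments can be swapped.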

SWR rate in Epoch 3. Related to Figure 2.

(A) SWR rate as a function of time in stopping period in Epoch 2 and 3 for four example sessions in experimental rats, as in Figure 2A. Epoch 2 (red lines), Epoch 3 (dashed gray lines). SWR rate was binned in 0.25 s windows and smoothed with a two-bin Gaussian. Line, mean; shading, standard error. (B) Same as Figure 2F, but for Epoch 3. (C) Same as Figure 2G, but for Epoch 3.

SWR rate at stable end in experimental rats. Related to Figure 3.

(A) At stable end visits in saline sessions, SWR rate was not significantly modulated by the previous volatile end visit reward volume. Pearson correlation between SWR rate and previous volatile volume, r=-0.0643, p=0.21. Two-sample t-test between volatile volume ≤ 2 and volatile volume > 2, t(380)=1.465, p=0.144. (B) At stable end visits in CNO sessions, SWR rate was not significantly modulated by the previous volatile end visit reward volume. Pearson correlation between SWR rate and previous volatile volume, r=-0.0645, p=0.205. Two-sample t-test between volatile volume ≤ 2 and volatile volume > 2, t(386)=1.137, p=0.256. Two-way ANOVA with drug and previous volatile volume ≤ 2: drug (F[1,766]=6.43, p=0.0114), volume ≤ 2 (F[1,766]=3.36, p=0.067), drug X volume (F[1,766]=0.03, p=0.853). Error bars, standard error.

SWR rate in all sessions in volatile reward task. Related to Figure 3.

(A) SWR rate as a function of reward volume and time in end visit, as in Figure 3B, for all sessions combined (including saline and CNO sessions in experimental and control rats). Left panel, stable reward end. Right panel, volatile reward end. In stable panel, traces are colored based on previous volatile end visit volume. In volatile panel, traces are colored based on current volatile volume. (B) SWR rate at volatile end as a function of current and previous volatile volume, as in Figure 3D, for all volatile reward task sessions. (C) SWR rate for each non-zero volatile volume plotted as a function of previous volume, with the mean SWR rate for that current volume subtracted. Unfilled symbols, mean of previous volume across all current volumes. Thick dashed line, linear fit to mean values. Pearson correlation between (ripple rate – mean) and previous volume, r=-0.07, p=0.0014, consistent with RPE coding. Error bars, standard error. (D) Positive RPE was associated with a significantly greater ripple rate than negative RPE (two-sample t-test, t[1661]=2.741, p=0.0062). (E) SWR rate at the stable end was significantly negatively correlated with the most recent volatile volume (r=-0.06, p=0.003). (F) SWR rate at the stable end was significantly greater when the most recent volatile end volume was less than or equal in volume (≤ 2) than when it was greater (two-sample t-test, t[2485]=2.582, p=0.01). (G) SWR rate at the volatile end was significantly higher if recent reward history was lower than the average. Reward volume at the 3 previous visits was averaged, then split above and below the median. Poisson GLM with two terms, current volume and reward history (above/below median): current volume, z=22.21, p<10⁻¹⁰; history, z=-2.03, p=0.042.
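The recent reward history split described in (G) can be sketched as below. The function name is hypothetical, and two details are assumptions: visits without three preceding visits are dropped, and a history equal to the median is labeled as not above it.

```python
from statistics import median

def recent_history_split(volumes, n_back=3):
    """Average the previous n_back volumes per visit, then split at the median.

    Returns (histories, above_median), one entry per visit that has
    n_back earlier visits.
    """
    histories = [sum(volumes[i - n_back:i]) / n_back
                 for i in range(n_back, len(volumes))]
    cutoff = median(histories)
    above_median = [h > cutoff for h in histories]  # ties count as not above
    return histories, above_median

# Toy volatile-end volumes in arbitrary units
histories, above_median = recent_history_split([0, 1, 2, 4, 8, 4, 2])
print(histories, above_median)
```

The resulting labels would serve as the above/below median history term in a model of SWR rate.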

Effect of novelty and VTA inactivation on place cell properties. Related to Figure 4.

(A) Correlation between single lap place fields and session averaged field. Three-way ANOVA with drug, novelty, and animal group: novelty (F[1,3249]=6.75, p=0.0094), novelty X group (F[1,3249]=15.76, p=0.0001), all others, p>0.2. (B) Correlation between unidirectional fields calculated separately in each running direction. Three-way ANOVA with drug, novelty, and animal group: drug (F[1,2816]=5.76, p=0.0164), novelty (F[1,2816]=28.21, p<10⁻¹⁰), drug X novelty (F[1,2816]=5.52, p=0.0188), novelty X group (F[1,2816]=6.56, p=0.011), all others, p>0.17.

Run decoding accuracy in replay analysis sessions. Related to Figure 4.

(A) Mean decoding error during run. Position and running direction were decoded during periods of strong locomotion (animal velocity >20 cm/s and position >20 cm from the reward wells) in 250 ms bins. Sessions with >35 cm mean decoding error were excluded from analysis. Filled and unfilled symbols are saline and CNO sessions, respectively. Error bars, standard error. Three-way ANOVA with animal group, drug, and novelty, all terms n.s. (B) Mean fraction of bins where actual and decoded running direction were the same. Sessions with <60% match were excluded from analysis. Symbols as in (A). Three-way ANOVA with animal group, drug, and novelty, all terms n.s.
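The inclusion criteria in this legend can be expressed as a short sketch. The thresholds are taken from the legend; the function names, the placement of the reward wells at 0 and at the track length, and the use of absolute position difference as the decoding error are assumptions.

```python
def locomotion_bins(speed_cm_s, position_cm, track_length_cm,
                    min_speed=20.0, min_well_dist=20.0):
    """Indices of bins with speed >20 cm/s and position >20 cm from both wells.

    Assumption: wells sit at 0 cm and at track_length_cm.
    """
    return [i for i, (v, x) in enumerate(zip(speed_cm_s, position_cm))
            if v > min_speed and min_well_dist < x < track_length_cm - min_well_dist]

def session_included(actual_pos, decoded_pos, actual_dir, decoded_dir,
                     max_error_cm=35.0, min_match=0.60):
    """Apply both exclusion rules: mean error >35 cm or direction match <60%."""
    errors = [abs(a - d) for a, d in zip(actual_pos, decoded_pos)]
    mean_error = sum(errors) / len(errors)
    match = sum(a == d for a, d in zip(actual_dir, decoded_dir)) / len(actual_dir)
    return mean_error <= max_error_cm and match >= min_match

# Toy 250 ms bins: speed (cm/s), position (cm), on a hypothetical 200 cm track
print(locomotion_bins([25, 10, 30, 40], [50, 60, 10, 150], 200))
print(session_included([50, 100], [60, 120], [1, -1], [1, -1]))
```

Applying `locomotion_bins` first and then scoring only those bins mirrors the order described in the legend, where decoding accuracy is evaluated during strong locomotion.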