Distribution of sessions/rat

Different forms of cognitive effort as expressed in different behavioral strategies.

Sessions were grouped based on the relative strength of a dLP-biased strategy versus an ival-tracking strategy in choice patterns. A strong bias for dLPs was observed in G1 that remained consistent regardless of ival. In contrast, G2 sessions exhibited a strong ival-tracking strategy as the ratio of dLPs:iLPs reversed when ival went from low to high. Finally, G3 exhibited a mix of both strategies such that there were more dLPs when ival was low and the ratio of dLPs to iLPs decreased at higher ivals but never exceeded 0.5.

Quantifying choice behavior using an RL model.

A) The transition matrix showing all possible task states. Task states were defined according to all the possible lever-press to lever-press choice transitions for the task. Sessions always began from an initial state (state 1), which is equivalent to the pre-task period. The rat (or agent) could then choose to perform either a dLP (blue) or an iLP (red) until it made repeated presses on either lever, which would then initiate a forced choice trial (thick lines with open arrows, states 5→7 or 4→6). Task state distributions of G1 (B) and G2 (C). D) Parameter space of the effects of changing the learning rate (α) and the likelihood of exploration (ε-greedy) on dLP:iLP. The heat plot gives the free choice dLP:iLP produced by different model parameters. γ was held at 0.2 for all simulations. E) Simulations that matched the dLP:iLP of G1 (left) and G2 (right) sessions were obtained using the same RL parameters (α=0.95; ε-greedy=0.4; γ=0.2), but a dbias term of 0.4 was included for the simulations shown in the right panel. In B and D, the task state frequency distributions were averaged across sessions and normalized to the maximum state visitation frequency.
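As a rough illustration of how α, ε and dbias could shape the dLP:iLP ratio, the following sketch simulates an ε-greedy Q-learning agent. It is not the model used here: the seven-state transition matrix is collapsed to three hypothetical states (start, after a dLP, after an iLP), the rewards `delay_reward` and `ival` are placeholder values, and `dbias` is assumed to act as an additive bonus on the value of the delay lever at choice time.

```python
import random

DLP, ILP = 0, 1  # action indices: delay lever press, immediate lever press


def simulate(alpha=0.95, epsilon=0.4, gamma=0.2, dbias=0.0,
             ival=0.5, delay_reward=0.5, n_trials=2000, seed=0):
    """Return the fraction of dLP choices made by an epsilon-greedy Q-learner.

    Assumptions (not from the source): three states, fixed placeholder
    rewards, and dbias added to the dLP value only when choosing greedily.
    """
    rng = random.Random(seed)
    # States: 0 = start, 1 = previous press was a dLP, 2 = previous press was an iLP
    q = [[0.0, 0.0] for _ in range(3)]
    state = 0
    n_dlp = 0
    for _ in range(n_trials):
        if rng.random() < epsilon:
            # Explore: pick either lever at random
            action = rng.choice((DLP, ILP))
        else:
            # Exploit: compare values after adding the delay bias
            action = DLP if q[state][DLP] + dbias >= q[state][ILP] else ILP
        reward = delay_reward if action == DLP else ival
        next_state = 1 if action == DLP else 2
        # Standard Q-learning update with learning rate alpha and discount gamma
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
        n_dlp += action == DLP
    return n_dlp / n_trials
```

Calling `simulate(dbias=0.4)` and `simulate(dbias=0.0)` with the same seed gives a quick comparison of the dLP fraction with and without the bias term. Sweeping `alpha` and `epsilon` over a grid would produce a parameter map in the spirit of panel D.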

Neural representations of ival tracking.

Examples of PCs (dark traces) that tracked ival (gray dashed traces) during various epochs of a session from A) G1 (PC1) and B) G2 (PC3). The left panels show the portion of the PC associated with the LP epoch and the right panels show the portion associated with the outcome epoch. To help visualize the relationship between a given PC and ival, the PC was normalized to the range 0-1.
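The 0-1 normalization mentioned above can be sketched as a min-max rescaling. This assumes that "normalized between 0-1" means subtracting the minimum and dividing by the range; the function name `normalize_01` is hypothetical.

```python
import numpy as np


def normalize_01(x):
    """Rescale a trace so its minimum maps to 0 and its maximum to 1."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        # A constant trace has no range; return zeros rather than divide by zero
        return np.zeros_like(x)
    return (x - x.min()) / span
```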

Robust ival tracking was present in all 3 groups.

A) MCML was performed on concatenated spike count matrices of the LP epoch for an example session from G1. The clusters were colored according to ival on the trial in which the LP was performed. B) The MCML components tracking ival across trials from another G1 session. C) The sorted loadings of neurons on the components shown in (B). D) Examples of neurons exhibiting ival tracking. The mean (and s.e.m.) spike rate/0.2s bin in the 1s period preceding the LP on each trial is plotted in black and ival is given by the gray dotted line. (E-H) Same as (A-D) but for G2. I) The mean (and s.e.m.) r² between ival and the ival-tracking component associated with the LP epoch for all sessions in G1 and G2.
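The summary in panel I can be sketched as follows, assuming that r² is the squared Pearson correlation between the component and ival across trials, and that s.e.m. is the sample standard deviation divided by the square root of the number of sessions. The function names are hypothetical.

```python
import numpy as np


def r_squared(component, ival):
    """Squared Pearson correlation between a component and ival across trials."""
    r = np.corrcoef(component, ival)[0, 1]
    return r ** 2


def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    values = np.asarray(values, dtype=float)
    return values.mean(), values.std(ddof=1) / np.sqrt(values.size)
```

Applying `r_squared` to each session and passing the resulting list to `mean_sem` gives one bar and its error per group.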

Group differences in tracking of ival by single neurons.

For each neuron, a linear regression model was used to fit the z-scored mean spike count during the LP epoch across trials with ival. A) Example of a neuron from G1. B) Example of a neuron from G1. In (A) and (B), the model fit line is black, the normalized ival line is dotted gray, and the residuals on iLP trials are given by the red vertical lines. The sign of the residuals relates to whether the fit line was below ival (negative residual) or above ival (positive residual). C) Summary of the mean of the residuals on all iLP trials for G1 (black bars) and G2 (gray bars). Neurons had to have a mean spike count of at least 0.2 spikes/bin during the LP epoch of all trials and an r² of 0.2 or greater to be included in (C). ** denotes p<0.001 based on post-hoc multiple comparison testing (Tukey’s HSD).
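One possible reading of this analysis is sketched below. It assumes an ordinary least-squares line relating z-scored spike counts to ival, applies the stated inclusion thresholds, and computes residuals as fit minus normalized ival to match the sign convention given above. How ival was normalized is not stated, so the normalized trace is passed in as an argument. All function names are hypothetical.

```python
import numpy as np


def fit_neuron(spike_counts, ival):
    """Fit z-scored spike counts against ival; return the fit line and r^2."""
    counts = np.asarray(spike_counts, dtype=float)
    ival = np.asarray(ival, dtype=float)
    z = (counts - counts.mean()) / counts.std()  # assumes non-constant counts
    slope, intercept = np.polyfit(ival, z, 1)
    fit = slope * ival + intercept
    r2 = np.corrcoef(ival, z)[0, 1] ** 2
    return fit, r2


def include_neuron(spike_counts, r2, min_count=0.2, min_r2=0.2):
    """Inclusion rule: mean count >= 0.2 spikes/bin and r^2 >= 0.2."""
    return np.mean(spike_counts) >= min_count and r2 >= min_r2


def mean_ilp_residual(fit, ival_norm, is_ilp):
    """Mean of (fit - normalized ival) on iLP trials.

    Negative when the fit line lies below ival, positive when above.
    """
    resid = np.asarray(fit, dtype=float) - np.asarray(ival_norm, dtype=float)
    return resid[np.asarray(is_ilp, dtype=bool)].mean()
```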

Theta power increases prior to delay choices when they are the preferred option.

Theta oscillations were examined prior to iLPs (A) and dLPs (B). Theta power was highest in G1 prior to a dLP when compared with G2 or an iLP in G1. Stratifying theta power by ival revealed increases ∼10 sec following an iLP in G1 for low ivals (C). Increases in theta power around a dLP were observed for mid and high ivals (D). No effect of ival was observed for iLPs (E) or dLPs (F) in G2. Data are presented as mean ± SEM. Red line denotes Tukey’s HSD, p < 0.05, G1 vs G2; orange line denotes Tukey’s HSD, p < 0.05, G1 dLP vs G2 dLP (A, B). Green line denotes Tukey’s HSD, p < 0.05, ival 0-2 vs 5-6; purple line denotes Tukey’s HSD, p < 0.05, ival 0-2 vs 3-4.

Increased theta entrainment of spiking is observed in G1.

PCA was performed on the autocorrelations from the spike trains of each neuron and the amount of variance captured by each PC is shown (A). The first three PCs are shown in (B); the black and gray traces on top show the mean autocorrelation pattern captured by the positive and negative coefficients, respectively. The bottom panels show the spectrum of the mean autocorrelation for positive and negative loaders for each PC. Neurons that oscillate in the theta band were separated via negative coefficients on PC3 (B3). The distribution of coefficients for PC3 is shown separately for G1 and G2. The three modes in the distribution made it possible to quantify the negative loaders on PC3 as theta entrained and compare between G1 and G2 (C). *** Kolmogorov-Smirnov test, p<0.0001.
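The general approach can be sketched as below, assuming the autocorrelations are already binned into a matrix with one row per neuron and one column per lag. PCA is done here via a singular value decomposition of the mean-centered matrix, and the spectrum is a plain FFT power spectrum of the mean autocorrelation; neither choice is confirmed by the source, and the function names and `bin_size` argument are hypothetical.

```python
import numpy as np


def pca(acorr):
    """PCA over neurons; acorr has shape (n_neurons, n_lags).

    Returns the fraction of variance per PC, the PCs (rows of vt),
    and each neuron's coefficient on each PC.
    """
    x = np.asarray(acorr, dtype=float)
    x = x - x.mean(axis=0)  # center each lag across neurons
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    coeffs = u * s
    return explained, vt, coeffs


def mean_pattern_spectrum(acorr, coeffs, pc, bin_size, negative=True):
    """Power spectrum of the mean autocorrelation of negative (or positive) loaders."""
    acorr = np.asarray(acorr, dtype=float)
    mask = coeffs[:, pc] < 0 if negative else coeffs[:, pc] > 0
    mean_ac = acorr[mask].mean(axis=0)
    power = np.abs(np.fft.rfft(mean_ac - mean_ac.mean())) ** 2
    freqs = np.fft.rfftfreq(mean_ac.size, d=bin_size)
    return freqs, power
```

Because the sign of a PC from an SVD is arbitrary, which side of a component carries the oscillating neurons has to be checked against the mean autocorrelation patterns, as in panel B.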

Modeling approaches for G2. A) α=0.2, ε-greedy=0.8, no dbias. B) α=0.5, ε-greedy=0.5, no dbias. C) α=0.97, ε-greedy=0.4, uniform dbias=1. D) α=0.2, ε-greedy=0.8, uniform dbias=0.5. In each subpanel, the state frequency distribution (top) and choice distribution with respect to ival (bottom) are plotted as in Figure 2.