Hypothesized neural implementation of a context-dependent leak in evidence accumulation.

For perceptual decisions about motion direction, momentary evidence is encoded in the middle temporal area (MT) and then accumulated by downstream circuits to form a decision variable that guides behavior. Models of decision-making often implement flexible, context-dependent evidence accumulation via a single parameter that controls the temporal dynamics (i.e., leakiness) of an accumulator (top). In principle, a flexible leak in the accumulation process could involve changes in evidence encoding, accumulation, or both (bottom). Here we used manipulations of context stability (low- versus high-frequency direction reversals of an adapting motion stimulus) to test for a role of evidence encoding. Because global signals related to arousal can influence decision flexibility and cortical information processing at multiple levels, we further considered potential interactions between adaptation- and arousal-related effects on flexible evidence accumulation.

Manipulating context stability in a random-dot motion task affects evidence-accumulation behavior.

(A) Each trial consisted of an adapting stimulus (2400 ms) followed immediately by a test stimulus (100–1200 ms, drawn from an exponential distribution). During the adapting stimulus, random-dot motion switched directions at either a low (LSF; 1 switch) or high (HSF; 5 switches) frequency, creating two context-stability conditions. Between epochs, there was a 50% probability of an additional change in motion direction, producing switch and non-switch trials. During the test stimulus, the monkeys accumulated motion evidence over variable durations and reported the final direction of motion with a saccade to the corresponding choice target. (B) Running average (5-trial window) of behavioral choice data (dotted line) and psychometric fits (solid lines) for representative sessions from each monkey. LSF (blue) and HSF (orange) conditions were fit separately. Shallower slopes indicate decreased perceptual sensitivity as a function of viewing time, consistent with leakier evidence accumulation. (C) Across-session comparison of fitted slopes for LSF and HSF conditions (Monkey An = 51 sessions, circles; Ch = 34, diamonds; Mi = 76, squares). p-value is from a Wilcoxon signed-rank test for equal medians.
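The running average in panel B can be illustrated with a minimal sketch. It assumes that trials are first sorted by viewing duration and that only full 5-trial windows are averaged; neither detail is stated in the caption, and the function name and data below are hypothetical.

```python
def running_average(values, window=5):
    """Return the mean of every full-length sliding window over `values`."""
    if window < 1:
        raise ValueError("window must be >= 1")
    if len(values) < window:
        return []
    total = sum(values[:window])
    out = [total / window]
    # Slide the window one trial at a time, updating the running sum.
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        out.append(total / window)
    return out


# Hypothetical choices (1 = correct, 0 = error), assumed sorted by viewing duration.
correct = [0, 1, 0, 1, 1, 1, 0, 1, 1, 1]
smoothed = running_average(correct, window=5)
```

With 10 trials and a 5-trial window, the sketch returns 6 smoothed values, one per full window.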

Context stability modulates motion-evidence encoding in MT.

(A) Raster plot (top) and baseline-subtracted average firing rate (bottom; mean ± SEM) for switch trials from a representative MT single unit on low (LSF; blue) and high (HSF; orange) switch-frequency trials. Solid and dashed lines indicate responses to motion in the unit’s preferred and null directions, respectively, defined relative to motion direction during the test stimulus. (B) Baseline-subtracted, normalized firing rate averaged across all recorded MT single units (n = 155; mean ± SEM). (C) Pairwise comparison of average single-unit activity during the test stimulus (50–500 ms) for LSF and HSF preferred-motion trials (Monkey An = 55 single units, circles; Ch = 13, diamonds; Mi = 87, squares). p-value is from a Wilcoxon signed-rank test for equal medians. (D) Pearson’s correlation between MT direction selectivity and the magnitude of LSF–HSF response differences during the test stimulus (50–500 ms). Red solid and dashed lines indicate a linear fit ± 95% CI.

Context-dependent differences in MT evidence encoding emerge following repeated preferred-motion stimulus presentation.

(A) Responses of MT single units (points) to the first (adapting-stimulus onset) and final (test) presentations of preferred-motion stimuli. Horizontal bars represent means. p-values are from two-sample t-tests for equal means of the two distributions. (B) Change in response differences (LSF–HSF) from the first (adapting-stimulus onset) to the final (test) presentation of preferred-motion stimuli. Individual animals are in gray; group average is in black; all points are mean ± SEM across units. p-value is from a two-sample t-test comparing the full distribution of response differences (LSF–HSF) from the two time intervals for all monkeys. (C) Example responses from representative facilitating (top) and adapting (bottom) single units. Colors and line styles as in Fig. 2A. (D) Comparison of average baseline-subtracted and normalized responses across successive presentations of preferred motion within the adapting epoch for HSF switch trials (panel C, orange solid). For adapting (black, n = 87) and facilitating (white, n = 27) groups, responses to each preferred-motion presentation were compared to the first (adapting-stimulus onset) preferred-motion response using Cohen’s d to quantify changes in response magnitude as a function of stimulus number. Points denote mean ± SEM across units. (E, F) Pairwise comparison of average single-unit responses to the test stimulus (50–500 ms) during LSF versus HSF conditions for adapting (E) and facilitating (F) units. Different symbols represent data from different monkeys, as in Fig. 2C. p-values are from a Wilcoxon signed-rank test for equal medians.
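As an illustration of the effect-size measure in panel D, the sketch below computes Cohen’s d with a pooled standard deviation. The caption does not say which variant of Cohen’s d was used, so the pooled form, the sign convention (later minus first), and the example firing rates are all assumptions.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(later, first):
    """Cohen's d (pooled SD) for `later` relative to `first` responses.

    Negative values indicate a weaker later response (adaptation);
    positive values indicate a stronger later response (facilitation).
    """
    n1, n2 = len(later), len(first)
    # Pooled variance from the two unbiased sample variances.
    pooled_var = ((n1 - 1) * variance(later) + (n2 - 1) * variance(first)) / (n1 + n2 - 2)
    return (mean(later) - mean(first)) / sqrt(pooled_var)


# Hypothetical normalized responses to the first and a later preferred-motion presentation.
first_resp = [2.0, 4.0, 6.0]
later_resp = [1.0, 3.0, 5.0]
d = cohens_d(later_resp, first_resp)  # negative: response decreased
```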

MT neural activity relates to evidence-accumulation behavior.

(A) Average ROC area over time for a representative MT single unit showing discriminability between preferred and null motion during low (LSF, blue) and high (HSF, orange) switch-frequency switch trials. (B) Pairwise comparison of average ROC area during the test stimulus (50–500 ms) for LSF and HSF switch trials. p-value is from a Wilcoxon signed-rank test for equal medians. (C) Context-stability-dependent differences in behavioral performance (LSF–HSF, measured as the difference in accuracy for trials that ended 375–600 ms after test-stimulus onset; gray band in D and E, left panels) plotted as a function of MT single-unit discriminability (LSF–HSF ROC area, 200–400 ms after test-stimulus onset; gray band in D and E, right panels). Red solid and dashed lines indicate a linear fit ± 95% CI. p-value is for the Pearson’s correlation coefficient (H0: r = 0). (D) Sessions with shallower psychometric slopes at HSF relative to LSF. Columns show (left) MT evidence encoding (mean ± SEM baseline-subtracted, normalized firing rates averaged across units), (middle) MT evidence discriminability (average ROC area, 200–400 ms after test-stimulus onset; p-values from a Wilcoxon signed-rank test for equal medians), and (right) behavioral performance in stimulus-duration bins with roughly equal numbers of trials (mean ± SEM across sessions). (E) Same as D, but for sessions with shallower or equal slopes at LSF relative to HSF. Different symbols in the four scatterplots represent data from different monkeys, as in Fig. 2C.
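For readers unfamiliar with ROC area as a discriminability measure, here is a minimal sketch. It computes the probability that a randomly chosen preferred-motion response exceeds a randomly chosen null-motion response, counting ties as one half, which equals the area under the ROC curve. The spike counts and function name are hypothetical, and the authors' actual windowing and implementation are not specified in the caption.

```python
def roc_area(pref_counts, null_counts):
    """Area under the ROC curve for preferred versus null responses.

    Computed as P(pref > null) + 0.5 * P(pref == null) over all pairs.
    0.5 means no discriminability; 1.0 means perfect discriminability.
    """
    wins = 0.0
    for p in pref_counts:
        for n in null_counts:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pref_counts) * len(null_counts))


# Hypothetical spike counts from one unit in a single analysis window.
pref = [5, 7, 9]
null = [4, 5, 6]
auc = roc_area(pref, null)
```

The pairwise form is quadratic in trial count, which is acceptable for per-unit data of this size; a rank-based computation would give the same value.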

Evoked pupil responses depend on context stability and relate to evidence-accumulation behavior.

(A) Average evoked pupil responses over time (z-scored per session) for low (LSF; blue) and high (HSF; orange) switch-frequency conditions (mean ± SEM across all sessions), shown separately for the two monkeys. (B) Sliding-window regression coefficients (βcxt; mean ± SEM across all sessions from both monkeys) estimating the effect of context-stability condition (LSF–HSF) on pupil diameter during stimulus viewing while controlling for baseline pupil diameter. Black bar (top) indicates windows in which βcxt differed significantly from zero (p < 0.05, uncorrected for multiple comparisons). (C) Spearman’s rank correlation coefficient between behavioral sensitivity (LSF–HSF psychometric slope) and βcxt coefficients from individual sessions as a function of time relative to test-stimulus onset (computed in 100 ms bins with 10 ms steps). Data from individual animals are shown separately (Monkey An: light grey; Monkey Mi: dark grey) along with the across-animal average (black). Corresponding grayscale bars at the top indicate time points with significant correlations (p < 0.05, uncorrected for multiple comparisons). (D) Spearman’s rank correlation between average evoked pupil diameter differences (βcxt, -500–0 ms relative to test-stimulus onset; gray band in C) and behavioral sensitivity differences (LSF–HSF slope) across sessions. Red solid and dashed lines indicate a linear fit ± 95% CI. p-value is for the Spearman’s correlation coefficient (H0: rho = 0).
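The sliding-window regression in panel B can be sketched as follows, assuming an ordinary least-squares model of the form pupil ~ 1 + context + baseline fit in each window, with context coded as LSF = 1 and HSF = 0. The model form, the coding, and all names are assumptions for illustration. The context coefficient is obtained by residualizing both pupil and context on baseline (the Frisch-Waugh-Lovell approach), which gives the same value as the full multiple regression.

```python
def window_starts(t_min, t_max, width=100, step=10):
    """Start times (ms) of every full window of `width` ms inside [t_min, t_max]."""
    return list(range(t_min, t_max - width + 1, step))


def _mean(xs):
    return sum(xs) / len(xs)


def _residualize(y, x):
    """Residuals of y after simple linear regression on x (with intercept)."""
    mx, my = _mean(x), _mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [(yi - my) - slope * (xi - mx) for xi, yi in zip(x, y)]


def context_beta(pupil, context, baseline):
    """OLS coefficient on `context` in pupil ~ 1 + context + baseline."""
    r_pupil = _residualize(pupil, baseline)
    r_context = _residualize(context, baseline)
    return sum(c * p for c, p in zip(r_context, r_pupil)) / sum(c * c for c in r_context)


# Hypothetical per-trial data for one window: pupil = 1 + 2*context + 3*baseline.
context = [1, 0, 1, 0, 1, 0]               # assumed coding: LSF = 1, HSF = 0
baseline = [0.1, 0.4, -0.2, 0.3, 0.0, -0.5]
pupil = [1 + 2 * c + 3 * b for c, b in zip(context, baseline)]
beta_cxt = context_beta(pupil, context, baseline)
starts = window_starts(-500, 0)            # 100 ms windows, 10 ms steps
```

In a full analysis, `context_beta` would be applied once per window start, using the mean pupil diameter within that window on each trial.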

Context stability jointly and differentially recruits adaptation- and arousal-related mechanisms across sessions.

(A) Distributions of differences in explanatory power (Tjur’s pseudo-R2) for models fit with versus without the neural (purple) or pupil (green) term across sessions for Monkey An (left) and Monkey Mi (right). Mean values are indicated by dashed lines and corresponding colored triangles. p-values are from a one-sample sign test for H0: fit parameters came from a distribution with a median of zero. (B) Context-stability differences (LSF–HSF) in neural (MT firing rate; abscissa) and pupil (evoked diameter; ordinate) model terms for Monkey An (left) and Monkey Mi (right). Points are fits to data from individual sessions/units from logistic models that quantified how much the slope of time-dependent psychometric functions covaried with the given neural or pupil term, computed separately for LSF and HSF trials. Shaded quadrants denote relative dominance of neural (purple) or pupil (green) effects. p-values are from a Wilcoxon signed-rank test for equal medians (top) and Spearman’s correlation (bottom).

Time-dependent logistic model captures choice behavior.

Distribution of Tjur’s pseudo-R2 values from empirical logistic fits (blue) compared with values from shuffled control (“null”) fits (cyan) for each monkey, as indicated. Null fits were obtained by shuffling the association between test-stimulus durations and switch/non-switch trial types across 100 iterations per session. For each session, the reported pseudo-R2 value was the average from separate fits to low and high switch-frequency trials. Dashed lines and triangles indicate mean values for each distribution. Enhanced explanatory power of the time-dependent logistic model versus the shuffled null was consistent across the three monkeys (Wilcoxon rank-sum test for equal medians: Monkey An: p < 0.001; Monkey Mi: p < 0.001; Monkey Ch: p < 0.001).
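Tjur’s pseudo-R2 (the coefficient of discrimination) is the mean fitted probability for trials with outcome 1 minus the mean fitted probability for trials with outcome 0. The sketch below computes it from fitted probabilities; the outcome coding and the example values are hypothetical, and fitting the logistic model itself is outside the scope of the sketch.

```python
def tjur_r2(outcomes, fitted_probs):
    """Tjur's pseudo-R2: mean fitted P(y=1) for y=1 trials minus that for y=0 trials."""
    p_ones = [p for y, p in zip(outcomes, fitted_probs) if y == 1]
    p_zeros = [p for y, p in zip(outcomes, fitted_probs) if y == 0]
    if not p_ones or not p_zeros:
        raise ValueError("both outcome classes must be present")
    return sum(p_ones) / len(p_ones) - sum(p_zeros) / len(p_zeros)


# Hypothetical binary outcomes and fitted probabilities from a logistic model.
y = [1, 1, 0, 0]
p_hat = [0.9, 0.7, 0.2, 0.4]
r2 = tjur_r2(y, p_hat)
```

Values near 0 indicate that the model barely separates the two outcome classes, which is the expected behavior of the shuffled null fits described above.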

Context-dependent differences in evidence accumulation were consistent across animals.

Pairwise comparisons of fitted psychometric slopes for low (LSF) and high (HSF) switch-frequency conditions across sessions for each monkey, as indicated. Slope values were bounded at 0.05, which is roughly the maximum resolvable steepness given our sampling of viewing times. All three monkeys tended to have shallower time-dependent psychometric slopes at HSF relative to LSF (p-values are from a Wilcoxon signed-rank test of equal medians). This result implies reduced sensitivity to evidence as a function of viewing time, which is consistent with leakier evidence accumulation at HSF.

Context-dependent differences in evidence encoding were consistent across animals and reflected changes in response magnitude rather than temporal dynamics.

(A) Pairwise comparisons of mean responses of individual MT units (points) to preferred motion during the test stimulus (50–500 ms after onset) for low (LSF) and high (HSF) switch-frequency conditions across sessions for each monkey, as indicated. All three monkeys exhibited stronger MT responses at LSF relative to HSF (p-values are from a Wilcoxon signed-rank test for equal medians), demonstrating consistent, context-stability-dependent differences in evidence encoding. (B) We quantified the time course of MT neural response by fitting baseline-subtracted, normalized activity during the test stimulus (preferred-motion switch trials only) with a single-exponential function. To facilitate fitting, data for each neuron were divided into three bins (200–330 ms, 340–470 ms, and 480–600 ms following test-stimulus onset) chosen to accommodate initial response latency and include approximately equal trial counts. Fitting was performed separately for low (LSF) and high (HSF) switch-frequency conditions. The exponential time constant (tau, in ms) from best-fitting, single-exponential fits showed no difference between LSF and HSF conditions for each monkey (p-values are from a Wilcoxon signed-rank test for equal medians). Thus, the temporal dynamics of MT responses were not reliably affected by recent temporal statistics.

Context-dependent differences in initial MT responses were minimal and stable over the course of the session.

(A) Average MT single-unit responses (points) during onset of the adapting stimulus (50–400 ms) for low (LSF) versus high (HSF) switch-frequency conditions, shown separately for each monkey. Initial responses to preferred motion were matched between LSF and HSF for Monkey An and Monkey Ch, but were slightly elevated at LSF for Monkey Mi (p-values are from a Wilcoxon signed-rank test for equal medians). (B) For Monkey Mi, this initial difference in activity between LSF and HSF was stable across early and late blocks of each condition, possibly reflecting expectations about switching dynamics that were learned relatively quickly within each block of trials.

Context-dependent differences in MT evidence discriminability and its relationship to behavioral performance varied across monkeys.

(A) Average ROC area during the stimulus (50–500 ms) for low (LSF) and high (HSF) switch-frequency conditions, shown separately for each monkey. ROC area was greater at LSF relative to HSF for Monkey An (Wilcoxon signed-rank test for equal medians). Monkey Ch showed an effect in the same direction and of comparable size, but the sample was much smaller and the effect was not statistically significant. No effect was apparent for Monkey Mi despite a large sample size and context-stability differences in preferred-motion responses during the same stimulus window (Extended Data Fig. 3A). (B) Correlations between context-stability differences in ROC area (LSF–HSF; 200–400 ms) and behavioral performance (LSF–HSF percent correct; 375–600 ms), shown separately for each monkey. A significant positive relationship was observed for Monkey An, with a similar directional trend for Monkey Ch, though power was limited by the smaller number of recorded neurons. No relationship was observed for Monkey Mi despite a large sample size. Red solid and dashed lines indicate linear fits ± 95% CI.

Relationship between MT neural activity and evidence-accumulation behavior was consistent across animals.

Average MT neural activity during the test stimulus (200–400 ms) for low (LSF) and high (HSF) switch-frequency conditions, shown separately for each monkey (rows) and session group (columns). Sessions were divided based on evidence-accumulation behavior: left column, sessions in which psychometric slopes were steeper at LSF than HSF (LSF–HSF > 0); right column, sessions in which psychometric slopes were steeper at HSF or equal across conditions (LSF–HSF ≤ 0). When the monkeys were more sensitive at LSF (left column), MT neurons showed greater activity at LSF relative to HSF. Conversely, when the monkeys were more sensitive at HSF or equally sensitive between conditions (right column), neural responses did not differ between conditions. p-values are from Wilcoxon signed-rank tests for equal medians.

Context-dependent differences in MT activity were specific to correct trials.

Baseline-subtracted and normalized MT firing rates during the test stimulus for low (LSF; blue) and high (HSF; orange) switch-frequency conditions, averaged across incorrect trials for all sessions. Data are shown separately for sessions in which the monkeys were more sensitive (i.e., accumulated more evidence over time) at LSF (left; LSF n = 1883, HSF n = 1939) or HSF (right; LSF n = 463, HSF n = 395). Unlike correct trials, error trials showed no consistent modulation by context stability or evidence-accumulation behavior, further confirming that context-dependent changes in MT neural activity were behaviorally relevant.

Baseline pupil diameter covaried with fixation-acquisition time across trials, sessions, and monkeys but did not differ across context-stability conditions.

(A) Representative sessions illustrating trial-by-trial relationships between fixation-acquisition time (top) and baseline pupil diameter (bottom) for Monkey An (left, grey) and Monkey Mi (right, black). The colored bar above indicates the active context-stability condition on each trial. (B) Distributions of session-wise Spearman’s rank correlation coefficients between fixation-acquisition time and baseline pupil diameter across sessions for Monkey An (grey) and Monkey Mi (black). For both monkeys, the distribution of correlation coefficients was significantly positive (Monkey An: p < 0.001, t = 18.6, df = 48; Monkey Mi: p < 0.001, t = 48.93, df = 74). Mean correlation values are indicated by dashed lines and associated triangles. (C) Session-averaged baseline pupil diameter for LSF versus HSF conditions for Monkey An (left) and Monkey Mi (right). Average baseline pupil size did not reliably differ across context-stability conditions for Monkey An and differed only slightly for Monkey Mi (p-values are from a Wilcoxon signed-rank test for equal medians).

Evoked pupil diameter related to context-dependent evidence-accumulation behavior, but not MT neural activity.

(A) Context-stability effects on evoked pupil diameter for individual monkeys. Sliding-window regression coefficients (βcxt; mean ± SEM across sessions) estimating differences in pupil diameter between low (LSF) and high (HSF) switch-frequency conditions while controlling for baseline pupil size. Black bar (top) indicates windows in which βcxt differed significantly from zero (p < 0.05, uncorrected for multiple comparisons). (B) Average evoked pupil differences (β2, -500–0 ms relative to test-stimulus onset) plotted versus behavioral sensitivity differences (LSF–HSF psychometric slope) from individual sessions for Monkey An (top) and Monkey Mi (bottom). Red solid and dashed lines indicate a linear fit ± 95% CI. (C) Spearman’s rank correlation between context-stability differences in MT neural activity (LSF–HSF, 50–500 ms) and pupil βcxt coefficients from individual sessions as a function of time relative to test-stimulus onset (computed in 100 ms bins with a 10 ms slide). Individual animals are shown separately (Monkey An: light grey; Monkey Mi: dark grey) along with the across-animal average (black). Corresponding colored bars indicate time points with significant correlations (p < 0.05, uncorrected for multiple comparisons). (D) Same analysis for MT discriminability (LSF–HSF ROC area, 50–500 ms) and pupil βcxt coefficients as a function of time relative to test-stimulus onset.

Effects of null-motion adaptation on preferred-motion responses in MT.

Pairwise comparisons of MT responses to preferred motion before (first stimulus; solid orange line in Fig. 3C) versus after null motion (second, null-adapted stimulus; dashed orange line in Fig. 3C), shown separately for high (HSF, left) and low (LSF, right) switch-frequency conditions. Across the population (top row), prior exposure to null motion did not reliably alter subsequent preferred-motion responses. However, null-motion adaptation produced opposite effects in subgroups of adapting and facilitating neurons. Adapting single units (middle row) showed modest decreases in response to preferred motion following null-motion exposure, whereas facilitating units (bottom row) showed increased responses. Notably, the reduction observed in adapting single units was smaller than that following repeated preferred-motion stimulation. p-values are from Wilcoxon signed-rank tests for equal medians.