Figures and data

The monkeys were biased toward choices associated with large reward
A, Task design and timeline. The monkeys made saccades to indicate their perceived motion direction. Correct trials were rewarded based on the reward context (see Table inset). Error trials were not rewarded. “Epochs” illustrate the time windows for epoch-based analyses. B, Average choice (left) and RT (right) behavior of the two monkeys. Monkey C: 93 sessions, 37,706 trials; Monkey F: 63 sessions, 27,113 trials. Filled and open circles are data from the three reward contexts, as indicated at the top of the panel. C, Histograms of reward bias for all sessions, estimated using logistic fits to choice data.
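The per-session reward bias in panel C comes from logistic fits to choice data. The legend does not give the exact model, so the following is only a minimal sketch under stated assumptions: a two-parameter logistic function of signed coherence fitted separately for each reward context, with the bias taken as the horizontal shift between the two fitted curves. The function names and the bias definition are hypothetical, not the authors' code.

```python
import math

def fit_logistic(coherences, n_contra, n_trials, n_iter=50):
    """Fit P(contra) = 1 / (1 + exp(-(b0 + b1 * coh))) by Newton's method.

    coherences: signed motion strength per condition (positive = contralateral)
    n_contra:   number of contralateral choices per condition
    n_trials:   number of trials per condition
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, k, n in zip(coherences, n_contra, n_trials):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = k - n * p            # gradient contribution
            w = n * p * (1.0 - p)        # Hessian weight
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton update: beta += H^{-1} g for the 2x2 system
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def point_of_subjective_equality(b0, b1):
    """Signed coherence at which both choices are equally likely."""
    return -b0 / b1

def reward_bias(fit_large_contra, fit_large_ipsi):
    """Assumed bias measure: horizontal shift between the two contexts."""
    return (point_of_subjective_equality(*fit_large_ipsi)
            - point_of_subjective_equality(*fit_large_contra))
```

With this sign convention, a positive value means the psychometric curve shifts toward the choice paired with the large reward.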

Example STN neurons showing modulation by decision-related factors
A-C, Average activity from the equal-reward task, for trials with contralateral (top row) and ipsilateral (bottom row) choices. Activity was truncated at the median RT for each trial condition. D-F, Average activity from the asymmetric-reward task, for trials with contralateral (top row) and ipsilateral (bottom row) choices. Activity was truncated at the median RT for each trial condition. Purple and green colors indicate blocks with the large reward paired with contralateral and ipsilateral choices, respectively. G-I, Results from a multiple linear regression analysis. Each line shows the timing of significant non-zero coefficients for a specific regressor (t-test, p<0.05). Results for the interaction terms are not shown.

STN activity is modulated by choice, visual evidence, and reward information
A, Fractions of neurons with statistically reliable modulation by task-related factors in the seven task epochs (defined in Figure 1A), as identified by a multiple linear regression (Eq. 2). Horizontal dashed lines: chance levels. Filled circles: fractions that were significantly above chance levels (chi-square test, p<0.05). B, Fractions of neurons with joint modulation, defined as significant modulation by motion coherence (for either choice) and reward context or reward size, as well as significant modulation by the coherence-reward size interaction terms.
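The comparison of a fraction of neurons against chance can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test. This is an illustration only: the chance level of 0.05 (matching the per-neuron criterion) and the function names are assumptions, and the legend does not specify how the test was set up.

```python
import math

def chi_square_vs_chance(n_significant, n_neurons, chance=0.05):
    """Goodness-of-fit test of an observed count against a chance proportion."""
    expected_sig = n_neurons * chance
    expected_non = n_neurons * (1.0 - chance)
    observed_non = n_neurons - n_significant
    stat = ((n_significant - expected_sig) ** 2 / expected_sig
            + (observed_non - expected_non) ** 2 / expected_non)
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

def above_chance(n_significant, n_neurons, chance=0.05, alpha=0.05):
    """True when the fraction exceeds chance and the test is significant."""
    _, p_value = chi_square_vs_chance(n_significant, n_neurons, chance)
    return n_significant / n_neurons > chance and p_value < alpha
```

The direction check in `above_chance` matters because the chi-square statistic alone is two-sided and would also flag fractions well below chance.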

STN activity covaries with multiple DDM parameters
A, Illustration of the DDM. B, Example of average choice and RT performance for trials split by firing rates in Epoch 5. Circles: performance for trials grouped by firing rates and reward contexts (see legend in top panel). Curves: fits by logistic (choice) and linear (RT) functions. C, Identification of neurons with strong task-related modulation. x-axis: z-score of maximal activation for each neuron; y-axis: z-score of maximal suppression for each neuron. Dashed lines: criteria used for identifying task-modulated neurons (modulation z-score>1.5 for activation or suppression). Red circles: neurons included in the analyses in D. D, Histograms of regression coefficients for combinations of activity epochs (rows) and DDM parameters (columns). The parameters include: a, the maximal bound height; B_collapse and B_d, the decay speed and onset specifying the time course of the bound “collapse”; k, a scale factor governing the rate of evidence accumulation; me, an offset specifying a bias in the rate of evidence accumulation; z, an offset specifying a bias in the DV, or equivalently, asymmetric offsets of equal magnitude for the two choice bounds; and t0_Contra and t0_Ipsi, non-decision times for the two choices that capture RT components that do not depend on evidence accumulation (e.g., visual latency and motor delay). Filled bars: neurons showing significant non-zero coefficients (t-test, p<0.05). Red triangles: population median differs from zero (sign rank test), assessed with a criterion of p=0.001 (p = 0.05 after multiple comparisons correction for 3 epochs, 8 parameters and both firing rate and firing rate x reward context effects). Blue squares: the proportion of neurons with significant covariation is above chance (chi-square test), assessed with a criterion of p=0.001.
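The p=0.001 criterion in panel D follows from dividing 0.05 by the number of comparisons listed in the legend (3 epochs, 8 parameters, 2 effect types). A short arithmetic check, assuming a Bonferroni-style division (the helper name is hypothetical):

```python
def bonferroni_criterion(alpha, *family_sizes):
    """Divide alpha by the product of the listed family sizes."""
    n_comparisons = 1
    for size in family_sizes:
        n_comparisons *= size
    return alpha / n_comparisons, n_comparisons

# 3 epochs x 8 DDM parameters x 2 effects (firing rate; firing rate x reward context)
criterion, n_comparisons = bonferroni_criterion(0.05, 3, 8, 2)
```

Here `n_comparisons` is 48 and `criterion` is 0.05/48, which rounds to the stated 0.001.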

STN consists of subpopulations with distinct activity patterns and relationships with DDM components.
A,D,F, Average firing rates for each cluster of neurons for asymmetric-reward trials, separately for correct trials with contralateral (top) and ipsilateral (bottom) choices and aligned to motion (left) and saccade (right) onsets. Colors indicate coherence levels and reward context (see legend). B,E,G, Relationship between activity measured in the three epochs and DDM components for each cluster. Top: sign-test results for the null hypothesis that the regression coefficients for the effect of firing rate in the three epochs on the DDM component have a zero median. Bottom: sign-test results for the null hypothesis that the regression coefficients for the reward context-firing rate interaction have a zero median. Dark brown/teal: positive/negative median with p<0.001. Light brown/teal: positive/negative median with p<0.05. C, Fractions of neurons with significant modulation by task-related factors in the seven task epochs, separately for each cluster. Same format as Figure 3. Horizontal dashed lines: chance levels. Filled circles: fractions that were significantly above chance levels (chi-square test, p<0.05). Triangles: epochs in which there was a significant difference among clusters (chi-square test, p<0.05/7 epochs).

Relationship between reward context modulation of firing rates and reward context modulation of DDM parameters, based on k-means clustering
Each panel shows the scatterplot of reward context modulation of firing rates in an epoch, normalized by the average firing rate in the same epoch, and the difference in a DDM parameter value between the two reward contexts. Each data point represents one neuron in a given cluster. Significant correlation (t-test, p<0.05/18) is indicated by the raw p value and regression line with confidence intervals.

Activity of STN neurons reflects decision evaluation signals.
A, Heatmaps of partial correlation coefficients between neural activity and post-decision choice accuracy and reward expectation for contralateral and ipsilateral choices, after accounting for the effects of reward expectation and choice accuracy, respectively. Values that did not differ from 0 (t-test, p>0.05 after multiple comparison corrections) were set to zero. Only neurons with task-modulated activity were included (n=87). B, Heatmaps for each cluster (column). Clusters were defined using the k-means method (same as Figure 5). C, Fractions of neurons with non-zero partial correlation (top) and with positive partial correlation (bottom). D, Fraction plots for each cluster.
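The partial correlations in panel A relate activity to one variable after accounting for the other. A minimal sketch using the standard first-order partial correlation formula is shown here; the legend does not state how the authors computed theirs, and the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z.

    For example: x = firing rate, y = choice accuracy, z = reward expectation.
    """
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

Swapping the roles of `y` and `z` gives the complementary measure (activity versus reward expectation, controlling for choice accuracy).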

STN neurons with different activity patterns are intermingled
A, Locations of neurons in the two monkeys. Black points are neurons considered “task-modulated”. B, Locations of neurons in different clusters. Distance values are relative to the anterior commissure. AP: anterior/posterior; DV: dorsal/ventral.

Hypothesized functions and connectivity of STN subpopulations
The three STN subpopulations may receive different inputs and serve distinct functions. Cluster 1 neurons may receive direct input from the cortex via the hyperdirect pathway and provide early-onset modulation of bound dynamics. Cluster 2 and 3 neurons may receive inputs from different GPe subpopulations in the indirect pathway, which relay caudate signals related to evidence accumulation and saccade generation, to mediate evidence bias and non-decision times, respectively. These different STN outputs and the outputs from GPe and the caudate direct pathway converge onto SNr to affect decision-related SC activity via inhibition. Abbreviations: GPe: external segment of globus pallidus; SNr: substantia nigra pars reticulata; SC: superior colliculus.

Summary of temporal profiles of modulation of STN activity by decision-related factors.
For each regression factor, heatmaps of significant regression coefficients are plotted in the first row for activity aligned to motion (left) and saccade (right) onsets. Neurons are sorted by the timing of peak magnitude of modulation. The fractions of neurons with significant non-zero coefficients are plotted in the second row. Significance for a coefficient was assessed using a t-test (p<0.05). For the fraction plots, the dashed horizontal lines represent chance level. Time bins with values that are significantly above chance (chi-square test, p<0.05) are indicated by thicker lines.

Comparison between STN, FEF, and caudate neurons
Each panel shows the fractions of neurons in the three regions (see legend for colors) with significant coefficients (t-test, p<0.05) for a specific regressor (rows) and activity alignment (columns). Horizontal bars indicate results of chi-square tests using a criterion of p<0.05/3 (alignments)/7 (regressors). Black horizontal bars indicate significant differences between FEF and STN populations. Red horizontal bars indicate significant differences between caudate and STN populations. FEF and caudate data are from Fan et al., eLife, 2020.

Dendrograms for linkage-based analysis
A, Dendrogram computed using all neurons with task-modulated activity (n = 86). A cutoff of 0.7 (red dashed line) was used to identify “outlier” neurons that did not belong to major subpopulations. B, Dendrogram computed using filtered neurons (n = 76). A cutoff of 0.85 was used to cluster the neurons into three groups.

Results from linkage clustering analysis.
Same format as Figure 5.

Cluster-average activity for clusters identified using only neurons from monkey C (A) or monkey F (B).
Same format as Figure 5A.

Clustering results are stable with subsets of trials.
Ten sets of vectors were generated by sampling, with replacement, a fraction of trials from each neuron. K-means and linkage clustering were performed on these resampled vectors. Left column: mean silhouette scores for each fraction value. Right column: mean Rand index between cluster identities of the resampled vectors and the original cluster assignment based on all trials.
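The stability measure in the right column can be illustrated with a small sketch of trial resampling and the Rand index. It assumes the unadjusted Rand index (the fraction of neuron pairs on which two clusterings agree); the legend does not say whether an adjusted variant was used, and the helper names are hypothetical.

```python
import random
from itertools import combinations

def resample_trials(trials, fraction, rng):
    """Draw a fraction of trials with replacement (at least one trial)."""
    n_draw = max(1, round(fraction * len(trials)))
    return rng.choices(trials, k=n_draw)

def rand_index(labels_a, labels_b):
    """Fraction of item pairs grouped consistently by both labelings.

    A pair counts as an agreement when both labelings place the two items
    in the same cluster, or both place them in different clusters.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)
```

Because the index depends only on pairwise co-membership, it is unaffected by how cluster labels are numbered, which is why it suits comparing a resampled clustering with the original assignment.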

Histograms of significant relationships between neural activity and DDM parameters for the three clusters
Each panel corresponds to a brown/teal box in Figure 5B,E,G. Red lines indicate median values.

Relationship between reward context modulation of firing rates and reward context modulation of DDM parameters, based on linkage clustering
Same format as Figure 6.