Figures and data

Monkeys biased toward choices associated with large reward
A, Task design and timeline. Monkeys made saccades to indicate their perceived motion direction. Correct trials were rewarded based on the reward context (see Table inset). Error trials were not rewarded. “Epochs” illustrate the time windows for epoch-based analyses. B, Average choice (left) and RT (right) behavior of the two monkeys. Monkey C: 91 sessions, 37,291 trials; Monkey F: 59 sessions, 25,481 trials. Filled and open circles are data from the three reward contexts, as indicated at the top of the panel. C, Histograms of reward bias for all sessions, estimated using logistic fits to choice data.

Example STN neurons showing modulation by decision-related factors
A-C, Average activity from the equal-reward task, for trials with contralateral (top row) and ipsilateral (bottom row) choices. Activity was truncated at median RT for each trial condition. D-F, Average activity from the asymmetric-reward task, for trials with contralateral (top row) and ipsilateral (bottom row) choices. Activity was truncated at the median RT for each trial condition. Purple and green colors indicate blocks with the large reward paired with contralateral and ipsilateral choices, respectively. G-I, Results from a multiple linear regression. Each line shows the timing of significant non-zero coefficients for a specific regressor (t-test, p<0.05). Results for the interaction terms are not shown.

STN activity reflects incorporation of visual evidence and reward information
A, Fractions of neurons with significant modulation by task-related factors in the seven task epochs (defined in Figure 1A), as identified by a multiple linear regression (Eq. 2). Horizontal dashed lines: chance levels. Filled circles: fractions that were significantly above chance levels (Chi-square test, p<0.05). B, Fractions of neurons with joint modulation, defined as significant modulation by motion coherence (for either choice) and reward context or reward size, as well as significant modulation by the coherence-reward size interaction terms.

STN consists of subpopulations with distinct activity patterns.
A, Visualization of clusters in the tSNE space. Only neurons that passed visual inspection for task-related modulations were included. Colors indicate cluster identity and are used in other panels. B, Silhouette scores for neurons. C, Rand index values (mean±sd) between multiple iterations of clustering based on different specified cluster numbers. D, Average firing rates for each cluster of neurons for asymmetric-reward trials, separately for trials with contralateral (top) and ipsilateral (bottom) choices and aligned to motion (left) and saccade (right) onsets. Colors indicate coherence levels and reward context (see legend). E, Fractions of neurons with significant modulation by task-related factors in the seven task epochs, separately for each cluster (colors follow those in D). Same format as Figure 3. Horizontal dashed lines: chance levels. Filled circles: fractions that were significantly above chance levels (Chi-square test, p<0.05). Triangles indicate epochs in which there was a significant difference among clusters (Chi-square test, p<0.05/7 epochs).
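As a reading aid for panel B, the following minimal sketch shows how a per-neuron silhouette score can be computed. It assumes Euclidean distance and toy one-dimensional feature vectors; the function name `silhouette_scores` and the data are hypothetical and are not taken from the authors' analysis code.

```python
import math

def silhouette_scores(points, labels):
    """Per-point silhouette s = (b - a) / max(a, b).

    a: mean distance to the other points in the same cluster.
    b: smallest mean distance to the points of any other cluster.
    Euclidean distance is an assumption made for this sketch.
    """
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette scores need at least two clusters")
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        mean_dist = {}
        for c in clusters:
            d = [math.dist(p, q)
                 for j, (q, lab) in enumerate(zip(points, labels))
                 if lab == c and j != i]
            if d:
                mean_dist[c] = sum(d) / len(d)
        if own not in mean_dist:
            # Singleton cluster: the score is conventionally set to 0.
            scores.append(0.0)
            continue
        a = mean_dist[own]
        b = min(v for c, v in mean_dist.items() if c != own)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores

# Toy example: two well-separated groups, then one deliberately mislabeled point.
toy_points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette_scores(toy_points, [0, 0, 1, 1])
bad = silhouette_scores(toy_points, [0, 1, 1, 1])
```

Scores near 1 indicate a point that sits well inside its assigned cluster, whereas a negative score (as for the mislabeled second point in `bad`) indicates a point that is closer on average to another cluster.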

STN subpopulations relate differently to computational components in a DDM.
A, Illustration of the DDM. B, Example of average choice and RT performance for trials split by firing rates in Epoch 5 of neurons in the second cluster. Circles: performance for trials grouped by firing rates and reward contexts. Curves: fits by logistic (choice) and linear (RT) functions. C, Cumulative density function of regression coefficient (bFR) for the fitted me parameter from behavioral data in B. Triangle: median value. Raw p value: sign test. D, Scatterplot of the difference in me between reward contexts (the large reward was paired with contralateral or ipsilateral choices) and the regression coefficient for reward context of neural activity measured in Epoch 5 (normalized by average activity across conditions, i.e., b0 term in the regression). Data are from neurons in the second cluster. Raw p value: t-test. E-H, Relationship between activity measured in the three epochs and DDM components for the four clusters identified in Figure 4. Top row shows simplified plots of average firing rates, pooling across coherence levels. Colors indicate different reward contexts. Solid and dashed lines indicate trials with contralateral and ipsilateral choices, respectively. The next three rows show results based on activity measured in epochs 3-5 (during decision formation). In each matrix panel, top row: sign test results for the null hypothesis that the regression coefficients for the effect of firing rate on the DDM component have a zero median (e.g., see C). Second row: sign test results for the null hypothesis that the regression coefficients for the reward context-firing rate interaction have a zero median. Third row: Pearson correlation results for the null hypothesis that the reward context modulation of neural activity is not correlated with the difference in the DDM component between reward contexts (e.g., see D). Dark brown/teal: positive/negative median with p<0.05/8 (DDM components)/3 (epochs). Light brown/teal: positive/negative median with p<0.05.
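The color code in E-H combines an uncorrected criterion (p<0.05) with a corrected one (p<0.05/8/3, i.e., 0.05/24 ≈ 0.0021). The sketch below spells out that mapping as read from the caption. The helper `pixel_shade` is hypothetical and is meant only to make the two thresholds explicit; it is not the authors' plotting code.

```python
ALPHA = 0.05          # uncorrected criterion stated in the caption
N_DDM_COMPONENTS = 8  # correction factor stated in the caption
N_EPOCHS = 3          # correction factor stated in the caption

def pixel_shade(effect, p_value):
    """Return the caption's color label for one test, or None if not significant.

    `effect` stands for the median regression coefficient (or the correlation
    for the third row); only its sign is used. Hypothetical helper.
    """
    corrected = ALPHA / N_DDM_COMPONENTS / N_EPOCHS  # 0.05 / 24
    if p_value >= ALPHA or effect == 0:
        return None
    hue = "brown" if effect > 0 else "teal"
    depth = "dark" if p_value < corrected else "light"
    return f"{depth} {hue}"
```

For instance, a positive median with p = 0.001 would be drawn dark brown under this reading, whereas a negative median with p = 0.01 would be drawn light teal.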

STN subpopulations relate to decision evaluation signals.
A, Simplified plots of average firing rates, pooling across coherence levels, for the four clusters of neurons. Same as Figure 5E. B,C, Heatmaps of partial correlation coefficients between neural activity and post-decision choice accuracy for contralateral (B) and ipsilateral (C) choices, after accounting for the effects of reward expectation. Each column shows the results from one neuron cluster. D,E, Heatmaps of partial correlation coefficients between neural activity and post-decision reward expectation for contralateral (D) and ipsilateral (E) choices, after accounting for the effects of choice accuracy. F, Fractions of neurons with non-zero correlation coefficients (t-test, p<0.05). Each column shows the results for one neuron cluster. Colors indicate different evaluation signals. Horizontal black bars on top indicate timepoints when the fraction values significantly differed between accuracy and reward expectation (Chi-square test, p<0.05). Pink and green bars indicate timepoints when the fraction values significantly differed between choices for accuracy and reward expectation, respectively. G, Fractions of neurons with positive correlation coefficients. For each time point, only neurons with non-zero correlation coefficients were included. Same format as F.
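Panels B-E report partial correlations, i.e., the correlation between neural activity and one evaluation signal after the linear effect of the other signal is removed. The sketch below uses the standard first-order partial correlation formula on toy numbers. The variable names and data are illustrative assumptions, and the caption does not state which software routine the authors used.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Toy data (hypothetical): both variables share a common driver z, and their
# remaining variation is exactly opposite in sign.
reward_expectation = [1.0, 2.0, 3.0, 4.0]   # z
firing_rate = [2.0, 1.0, 2.0, 5.0]          # x = z + e
accuracy = [0.0, 3.0, 4.0, 3.0]             # y = z - e

raw_r = pearson(firing_rate, accuracy)
partial_r = partial_corr(firing_rate, accuracy, reward_expectation)
```

In this toy case the raw correlation is weakly positive while the partial correlation is strongly negative, which illustrates why accounting for the other evaluation signal can change the apparent relationship.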

Summary of temporal profiles of modulation of STN activity by decision-related factors.
For each regression factor, heatmaps of significant regression coefficients are plotted in the first row for activity aligned to motion (left) and saccade (right) onsets. Neurons are sorted by the timing of the peak magnitude of modulation. The fractions of neurons with significant non-zero coefficients are plotted in the second row. Significance for a coefficient was assessed using a t-test (p<0.05). For the fraction plots, the dashed horizontal lines represent chance level. Time bins with values that were significantly above chance (chi-square test, p<0.05) are indicated by thicker lines.

Comparison between STN, FEF and caudate neurons
Each panel shows the fractions of neurons in the three regions (see legend for colors) with significant coefficients (t-test, p<0.05) for a specific regressor (rows) and activity alignment (columns). Horizontal bars indicate results of chi-square tests using a criterion of p<0.05/3 (alignments)/7 (regressors). Black horizontal bars indicate significant differences between FEF and STN populations. Red horizontal bars indicate significant differences between caudate and STN populations. FEF and caudate data are from Fan et al., 2020. N = 126, 136, and 150 neurons for FEF, caudate, and STN, respectively.

Stability of clustering results
A, Silhouette scores for different combinations of distance metrics and number of clusters. Red triangle indicates the clustering presented in Figure 4. Mean and s.d. values were computed from 50 iterations of clustering. B, Fraction of negative silhouette scores. Same format as A. C, Rand index. Same format as A. D, Visualization of clusters from Figure 4 in the tSNE space calculated from all units. Non-gray circles indicate neurons that passed visual inspection. Colors indicate cluster identities that are used in Figure 4. E, Average firing rates for each cluster of neurons, with the assumption of three clusters. Same format as Figure 4D. Colored boxes indicate clusters that loosely correspond to those in Figure 4. F, Average firing rates for each cluster of neurons, with the assumption of five clusters. Same format as E. G, Pairwise Rand index values for clusterings based on all or only good units, using asymmetric-reward (AR) or equal-reward (ER) data, and assuming 4 or 5 clusters.
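For readers unfamiliar with the Rand index used in C and G, the sketch below computes the unadjusted version: the fraction of neuron pairs that two clusterings treat the same way (both place the pair together, or both place it apart). Whether the authors used the unadjusted or the adjusted variant is not stated in this caption, so the choice here is an assumption. The function name and labels are illustrative.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Unadjusted Rand index: fraction of item pairs on which two clusterings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    if len(labels_a) < 2:
        raise ValueError("need at least two items to form a pair")
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b  # pair counted if both clusterings concur
        total += 1
    return agree / total

# Identical partitions with swapped label names still agree on every pair.
identical = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Partitions that cut across each other agree on few pairs.
crossed = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the index depends only on pair co-membership, it is insensitive to how cluster labels are numbered, which makes it suitable for comparing repeated clustering runs.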

Average choice and RT performance for trials split by firing rates for different neuron clusters and epochs.
Same format as Figure 5B. The columns correspond to the same clusters in Figure 5E.

Raw plots corresponding to the summary in Figure 5E
Top section: average firing rates for the four clusters. Same as the top row of Figure 5E. Second section: Cumulative density functions of regression coefficient (bFR) for DDM parameters. The title for each plot indicates the DDM parameter and the epoch from which firing rates were used to split the trials. Same format as Figure 5C. Third section: Cumulative density functions of regression coefficient (bRewFR). Same format as the second section. Bottom section: Scatterplots of the difference in DDM parameters between reward contexts and the regression coefficient for reward context of neural activity. Same format as Figure 5D. For the last three sections, only relationships with a p<0.05 from sign test (for bFR and bRewFR) or significant Pearson correlation are shown, corresponding to the colored pixels in Figure 5E.

Assessing the robustness of the regression and correlation results with different clustering settings.
A, Regression and correlation results based on dividing the neurons into three clusters. Same format as Figure 5E. B, Regression and correlation results based on dividing the neurons into five clusters. Same format as Figure 5E.