Abstract
Humans and animals exhibit a remarkable ability to selectively filter out irrelevant information based on context. However, the neural mechanisms underlying this context-dependent selection process remain elusive. Recently, the issue of discriminating between two prevalent selection mechanisms— input modulation versus selection vector modulation—with neural activity data has been highlighted as one of the major challenges in the study of individual variability underlying context-dependent decision-making (CDM). Here, we investigated these selection mechanisms through low-rank neural network modeling of the CDM task. We first showed that only input modulation was allowed in rank-one neural networks and additional dimensions of network connectivity were required to endow neural networks with selection vector modulation. Through rigorous information flow analysis, we gained a mechanistic understanding of why additional dimensions are required for selection vector modulation and how additional dimensions specifically contribute to selection vector modulation. This new understanding then led to the identification of novel neural dynamical signatures for selection vector modulation at both single neuron and population levels readily testable in experiments. Together, our results provide a rigorous theoretical framework linking network connectivity, neural dynamics and selection mechanisms, paving the way towards elucidating the circuit mechanisms when studying individual variability in context-dependent computation.
Introduction
Imagine you are playing a card game. Depending on your opponents’ actions, you may change the way you organize the cards and adjust your strategy to increase your chance of winning. This example exemplifies the context-dependent nature of most decisions we made in our daily life (Miller & Cohen, 2001; Roy et al., 2010; Mante et al., 2013; Saez et al., 2015; Siegel et al., 2015; Bernardi et al., 2020; Takagi et al., 2021; Flesch et al., 2022; Barbosa et al., 2023). However, how the brain performs such context-dependent computation remains elusive (Fusi et al., 2016; Cohen, 2017; Badre et al., 2021; Okazawa & Kiani, 2023).
Using a monkey context-dependent decision-making (CDM) behavioral paradigm, together with neurophysiological recordings and recurrent neural network modeling, an influential work uncovered a novel context-dependent selection mechanism, namely selection vector modulation (Mante et al., 2013), distinct from the early sensory input modulation counterpart (Desimone & Duncan, 1995; Noudoost et al., 2010). More recently, aided by a high-throughput animal training approach, a large number of rats were trained to perform a similar CDM task (Pagan et al., 2022). The obtained neural and behavioral data then supported a new theoretical framework revealing a spectrum of possible network mechanisms for the context-dependent selection process, opening the door towards addressing the important issue of individual variability in higher cognition. Critically, this theoretical framework pointed out that current neurophysiological data fell short of distinguishing between selection vector modulation and sensory input modulation, calling for rethinking what kind of evidence is required for differentiating different selection mechanisms.
Here, we investigated these two selection mechanisms through low-rank neural network modeling (Landau & Sompolinsky, 2018; Mastrogiuseppe & Ostojic, 2018; Kadmon et al., 2020; Schuessler et al., 2020; Beiran et al., 2021, 2023; Dubreuil et al., 2022; Valente et al., 2022; Ostojic & Fusi, 2024), with the aim of generating experimentally testable predictions for selection mechanism differentiation. Through introducing novel pathway-based information flow analysis afforded by the low-rank network modeling, we gained a parsimonious pathway-based understanding of selection vector modulation and explained where the two types of selection modulations can occur along different pathways. Critically, our results led to the identification of novel experimentally testable neural dynamical signatures for selection vector modulation at both the single-neuron and population levels. Together, our work provides a theoretical basis for resolving the challenging issue of distinguishing different selection mechanisms with neural activity data, shedding new light on the study of individual variability in neural computation underlying the ubiquitous context-dependent behaviors.
Results
Task paradigm, key concept and modeling approach
The task paradigm we focused on is the pulse-based context-dependent decision-making (CDM) task (Pagan et al., 2022), a novel rat-version CDM paradigm inspired by the previous monkey CDM work (Mante et al., 2013). In this paradigm, rats were presented with sequences of randomly-timed auditory pulses that varied in both location and frequency (Figure 1A). In alternating blocks of trials, rats were cued by an external context signal to determine the prevalent location (in the “LOC” context) or frequency (in the “FRQ” context). Note that compared to the continuous sensory input setting in previous works (e.g., Mante et al., 2013), this pulse-based sensory input setting allowed the experimenters to better characterize both behavioral and neural responses (Pagan et al., 2022). We will also demonstrate the unique advantage of this pulse-based input setting later in the present study (e.g., Fig. 7).
To solve this task, rats had to select the relevant information for the downstream evidence accumulation process based upon the context. There were at least two different mechanisms capable of performing this selection operation, i.e., selection vector modulation and input modulation (Mante et al., 2013; Pagan et al., 2022). To better introduce these mechanisms, we first reviewed the classical linearized dynamical systems analysis and the concept of selection vector (Figure 1B; Mante et al., 2013; Sussillo & Barak, 2013; Maheswaranathan et al., 2019; Maheswaranathan & Sussillo, 2020; Nair et al., 2023). In the linearized dynamical systems analysis, the neural dynamics around the choice axis (Figure 1B, red line) is approximated by a line attractor model. Specifically, the dynamics in the absence of external input can be approximated by the following linear equation where M is a matrix with one eigenvalue being equal to 0 and all other eigenvalues having a negative real part. For brevity, let us denote the left eigenvector of the 0 eigenvalue as s. In this linear dynamical system, the effect of any given perturbation can be decomposed along the directions of different eigenvectors: the projection onto the s direction will remain constant while the projections onto all other eigenvectors will exponentially decay to zero. Thus, for any given input I, only the component projecting onto the s direction (i.e., I ⋅ s) can be integrated along the line attractor (see Methods for more details). In other words, s serves as a vector selecting the input information, which is known as the “selection vector” in the literature (Mante et al., 2013).
Two distinct selection mechanisms can then be introduced based upon the concept of selection vector (Pagan et al., 2022). Specifically, to perform the CDM task, the stimulus input (LOC input, for example) must have a larger impact on evidence accumulation in the relevant context (LOC context) than in the irrelevant context (FRQ context). That is, I ⋅ s must be larger in the relevant context than in the irrelevant context. The difference between these two can be decomposed into two components:
where the Δ symbol denotes difference across two contexts (relevant – irrelevant) and the bar symbol denotes average across two contexts (see Methods for more details). The first component is called input modulation in which the change of input information across different contexts is emphasized (Figure 1B, top). In contrast, the second component is called selection vector modulation in which the change of selection vector across contexts is instead highlighted (Figure 1B, bottom).
While these two selection mechanisms were clearly defined, recent work showed that it is actually challenging to differentiate them through neural dynamics (Pagan et al., 2022). For example, both input modulation and selection vector modulation can lead to similar trial-averaged neural dynamics through targeted dimensionality reduction (Figure 1C; Pagan et al., 2022). Take the input modulation as an example (Figure 1C, top). One noticeable aspect we can observe is that the input information (e.g., location information) is preserved in both relevant (LOC context) and irrelevant contexts (FRQ context), which seems contradictory to the definition of input modulation. What is the mechanism underlying this counterintuitive result? As Pagan et al. pointed out earlier, input modulation is not the input change (ΔI) per se. Rather, it means the change of input multiplied by selection vector (i.e., ). Therefore, for the input modulation, while input information indeed is modulated by context along the selection vector direction, input information can still be preserved across contexts along other directions orthogonal to the selection vector, which explains the counterintuitive result and highlights the challenge of distinguishing input modulation from selection vector modulation in experiments (Pagan et al., 2022).
In this study, we sought to address this challenge using the low-rank RNN modeling approach. In contrast to the “black-box” vanilla RNN approach (e.g., Mante et al., 2013), the low-rank RNN approach features both well-controlled model complexity and mechanistic transparency, potentially providing a fresh view into the mechanisms underlying the intriguing selection process. Specifically, the low-rank RNNs we studied here implemented an input-output task structure similar to the classical RNN modeling work of CDM (Figure 1D; Mante et al., 2013). More concretely, the hidden state x of a low-rank RNN with N neurons evolves over time according to
where is a low-rank matrix with R output vectors mr, r = 1,⋯, R and R input-selection vectors nr, r = 1,⋯, R, τ is the time constant of single neurons, ϕ is the nonlinear activation function, us(t), s = 1, 2 embedded into the network through Is mimic the location and frequency click inputs, and embedded through indicate whether the current context is location or frequency. The output of the network is a linear projection of neural activity (see Methods for more model details). Under this general architectural setting, on one hand, through controlling the rank R of the matrix J during backpropagation training, we can determine the minimal rank required for performing the CDM task and reverse-engineer the underlying mechanism, which will be demonstrated in Figure 2. On the other hand, recent theoretical progress of low-rank RNNs (Beiran et al., 2021, 2021; Dubreuil et al., 2022) enabled us to explicitly construct neural network models with mechanistic transparency, complementing the reverse-engineering analysis (Mante et al., 2013; Sussillo & Barak, 2013), which will be shown in Figure 3.
No selection vector modulation in rank-one models
In the literature, it was found that rank-one RNN models suffice to solve the CDM task (Dubreuil et al., 2022). Here, we further asked whether selection vector modulation can occur in rank-one RNN models. To this end, we trained many rank-1 models (see Methods for details) and found that indeed having rank-1 connectivity (e.g., with the overlap structure listed in Figure 2A; for the detailed connectivity structure, see Figure 2-figure supplement 1) is sufficient to perform the CDM task, consistent with the earlier work. As shown in Figure 2B, in context 1, the decision was made based on input-1 evidence, ignoring input-2 evidence, indicating that the network can effectively filter out irrelevant information. To answer what kind of selection mechanisms underlies this context-dependent computation, we computed the selection vector in two contexts through linearized dynamical systems analysis (Sussillo & Barak, 2013). Cosine angle analysis revealed that selection vectors kept invariant across different contexts (Figure 2C, left), indicating no selection vector modulation. This result was preserved across different hyperparameter settings (such as different regularization coefficients or activation functions) and we also provide a mathematical proof in Methods (see also Pagan et al., 2023).
While the selection vector was not altered by contexts, the direction of input representations changed significantly across different contexts (Figure 2C, left; see Methods for the definition of input representation direction). Further analysis revealed that the overlap between the input representation direction and the unchanged selection vector is large in the relevant context and small in the irrelevant context, supporting the input modulation mechanism (Figure 2D). These results indicate that while a rank-1 network can perform the task, it can only achieve flexible computation through input modulation (Figure 2E). Importantly, when applying a similar targeted dimensionality reduction method to this rank-1 model, we found that the irrelevant sensory input information was indeed well-represented in neural activity state space (Figure 2F), supporting the conclusion made in the Pagan et al. paper that the presence of irrelevant sensory input in neural state space cannot be used as a reliable indicator for the absence of input modulation (Figure 1C).
In summary, we conclude that to study the mechanism of selection vector modulation, instead of limiting to the simplest model of CDM task, it is necessary to explore network models with higher ranks.
A low-rank model with pure selection vector modulation
To study the mechanism of selection vector modulation, we designed a rank-3 neural network model, with one additional rank for each sensory input feature (i.e., for input 1 and for input 2; Figure 3A, left). Specifically, we ensured that I1 (I2) has a positive overlap with and zero overlap with ndv, while has a positive overlap with ndv (Figure 3A, right; see Figure 3-figure supplement 1 and Methods for more details). This configuration implies that the stimulus input 1 (2) is first selected by , represented by , and subsequently selected by ndv before being integrated by the accumulator. In principle, this sequential selection process enables more sophisticated contextual modulations.
We confirmed that a model with such connectivity structure can perform the task (Figure 3B) and then conducted an analysis similar to that performed for rank-1 models. Unlike the rank-1 model, the selection vector for this rank-3 model changes across contexts while the input representation direction remains invariant (Figure 3C). Further analysis revealed that the overlap between the selection vector and the unchanged input representation direction is large in the relevant context and small in the irrelevant context (Figure 3D), supporting a pure selection vector modulation mechanism (Figure 3E) distinct from the input modulation counterpart shown in Figure 2D. When applying a similar targeted dimensionality reduction method to this rank-3 model, as what we expected, we found that both relevant and irrelevant sensory input information was indeed well-represented in neural activity state space (Figure 3F), which was indistinguishable from the input modulation counterpart (Figure 2F).
Together, through investigating these two extreme cases—one with pure input modulation and the other with pure selection vector modulation, we not only reconfirm the challenge of distinguishing input modulation from selection modulation based on neural activity data (Pagan et al., 2022) but also point out the previously unknown link between selection vector modulation and network connectivity dimensionality.
Understanding context-dependent modulation in Figs. 2 and 3 through pathway-based information flow analysis
What is the machinery underlying this link between selection vector modulation and network connectivity dimensionality? One possible way to address this issue is through linearized dynamical systems analysis: first computing the selection vector and the sensory input representation direction through reverse-engineering (Barak & Sussillo, 2013) and then calculating both selection vector modulation and input modulation according to Eq. 1. However, the connection between the network connectivity dimensionality and the selection vector obtained through reverse-engineering is implicit and in general non-trivial (Pagan et al., 2023), hindering further investigation of the underlying machinery. Here, building on recent theoretical progress in low-rank RNNs (Mastrogiuseppe & Ostojic, 2018; Dubreuil et al., 2022), we introduced a novel pathway-based information flow analysis approach, providing an explicit link between network connectivity, neural dynamics and selection mechanisms.
To start with, the low-rank RNN dynamics (i.e., Eq. 2) can be described by an information flow graph, with each task variable as a node and each effective coupling between task variables as an edge (Mastrogiuseppe & Ostojic, 2018; Dubreuil et al., 2022). Take the rank-1 RNN in Figure 2 as an example. A graph with three nodes, including two input variables and one decision variable kdv(t), suffices (Figure 4A; see Methods for more details). In this graph, the dynamical evolution of the task variable kdv can be expressed as:
where the effective coupling from the input variable to the decision variable kdv is equal to the overlap between the input representation direction (each element is defined and the input-selection vector ndv. More precisely, , where ⟨a, b⟩ is defined as for two length-N vectors. Since the input representation direction depends on the single neuron gain ϕ′ and the context input can modulate this gain, the effective coupling is context-dependent. Indeed, as shown in Figure 4A and Figure 4-figure supplement 1A, exhibited a large value in context 1 but was negligible in context 2 while exhibited a large value in context 2 but was negligible in context 1. In other words, information from the input variable can arrive at the decision variable only in the relevant context, which exactly is the computation required by the CDM task.
To get a more intuitive understanding of the underlying information flow process, we discretized the equation and followed the information flow step by step. Specifically, by discretizing Eq. 3 using Euler’s method with a time step equal to the time constant τ of the system, we get
Take context 1 as an example (Figure 4B, top panel). Initially, there is no information in the system. In step 1, pulse inputs of size A1 and A2 are placed in the inp1 and inp2 slots, respectively. The information from these slots, after being multiplied by the corresponding effective coupling, then flows to the dv slot. In context 1, , meaning only the content from the inp1 slot can arrive at the dv slot. Consequently, in step 2, the information content in the dv slot would be . The following steps will replicate step 2 due to the recurrent connectivity of the dv slot. The scenario in context 2 is similar to context 1, except that only the content from the inp2 slot arrives at the dv slot (Figure 4B, bottom panel). The continuous dynamics for each task variable given pulse input is displayed in Figure 4-figure supplement 2, A-C.
The same pathway-based information flow analysis can also be applied to the rank-3 model (Figure 4C and 4D). In this model, similar to the rank-1 models, there are input variables (and ) and decision variable (kdv). Additionally, it includes intermediate variables ( and ) corresponding to the activity along the and axes. In this scenario, instead of flowing directly from the input to the decision variable, the information flows first to the intermediate variables and then to the decision variable. These intermediate variables act as intermediate nodes. Introducing these nodes does more than simply increase the steps from the input variable to the decision variable. In the rank-1 case, the context signals can only modulate the pathway from the input to the decision variable. However, in the rank-3 case, context signals can modulate the system in two ways: from the input to the intermediate variables and from the intermediate variables to the decision variable.
Take the rank-3 model introduced in Figure 3 as an example. Context signals did not alter the representation of input signals, leading to constant effective couplings (i.e., constant and ) from input to intermediate variables across contexts. Instead, it changed the effective coupling from the intermediate variables to the decision variable (i.e., large in context 1 and near zero in context 2; Figure 4C and Figure 4-figure supplement 1B). Consider the context 1 scenario in the discrete case. In step 1, pulse inputs of size A1 and A2 are placed in the inp1 and inp2 slots, respectively. In step 2, information flows to the intermediate slots, with in the iv1 slot and in the iv2 slot. In step 3, only the information in the iv1 slot flows to the dv slot, with the information content being (Figure 4D, top panel). The scenario in context 2 is similar to context 1, except that only the content from the iv2 slot reaches the dv slot in the third step, with the content being (Figure 4D, bottom panel). The continuous dynamics for each task variable given pulse input is displayed in Figure 4-figure supplement 2, D and E.
Together, this pathway-based information flow analysis provides an in-depth understanding of how the input information can be routed to the accumulator depending on the context, laying the foundation for a novel pathway-based information flow definition of selection vector and input contextual modulations.
Information flow-based definition of selection vector modulation and selection vector for more general cases
Based on the understanding gained from the pathway-based information flow analysis, we now provide a novel definition of input modulation and selection vector modulation distinct from the one in Eq. 1. To begin with, we first considered a model with mixed input and selection vector modulation (Figure 5A, left), instead of studying extreme cases (i.e., one with pure input modulation in Figure 2 and the other with pure selection vector modulation in Figure 3). In this more general model, input information can either go directly to the decision variable (with effective coupling Einp→dv) or first pass through the intermediate variables before reaching the decision variable (with effective coupling Einp→iv and Eiv→dv respectively). Applying the same information flow analysis, we see that a pulse input of unit size will ultimately reach the dv slot with a magnitude of Einp→ dv + Einp→iv Eiv→dv (Figure 5A, right; see Methods for more details). In other words, the total effective coupling Etol from the input to the decision variable is equal to Einp→dv 2 Einp→ivEiv→dv. Now, it is straightforward to decompose the context-dependent modulation of Etol in terms of input or selection vector change:
in which the first component stands for the change of input representation (termed as input modulation) and the second component is the one without changing the stimulus input representation (termed as selection vector modulation).
We then asked if this pathway-based definition is equivalent to the one in Eq. 1 based on linearized dynamical system analysis (Figure 5B; Pagan et al., 2022). To answer this question, we numerically compared these two definitions using a family of models with both input and selection vector modulations (Figure 5C; see Methods for model details) and found that these two definitions produced the same proportion of selection vector modulation across a wide range of parameter regimes (Figure 5D). Together with theoretical derivation of equivalence (see Methods), this consistency confirmed the validity of our pathway-based definition of contextual modulation decomposition.
Having elucidated the pathway-based definitions of input and selection vector modulation, we next provide a novel pathway-based definition of selection vector. For the network depicted in Figure 5A, the total effective coupling Einp→dv 2 Einp→ivEiv→dv can be rewritten as , where is the input representation direction and . This reformulation aligns with the insight that the amount of input information that can be integrated by the accumulator is determined by the dot product between the input representation direction and the selection vector. Thus, ntol is the selection vector of the circuit in Figure 5A.
To better understand why selection vector has such a formula, we visualized the information propagation from input to the choice axis using a low-rank matrix (Figure 6A). Specifically, it comprises two components, each corresponding to a distinct pathway. For the first component, input information is directly selected by ndv. For the second component, input information is sent first to the intermediate variable and then to the decision variable. This pathway involves two steps: in the first step, the input representation vector is selected by niv, and in the second step, to arrive at the choice axis, the selected information has to be multiplied by the effective coupling . By concatenating these two steps, information propagation from the input to the choice axis in this pathway can be effectively viewed as a selection process mediated by the vector (termed as the second-order selection vector component). Therefore, ntol provides a novel pathway-based definition of selection vector in the network. We further verified the equivalence between this pathway-based definition and the linearized-dynamical-systems-based classical definition (Mante et al., 2013) in our simple circuit through theoretical derivation (see Methods) and numerical comparison (Figure 6B).
To visualize the pathway-based selection vector in neural activity state space, we found that a minimum of three dimensions is required, including the input representation direction, the decision variable representation direction and the intermediate variable representation direction (Figure 6C, left). This geometric visualization highlighted the role of extra-dimensions beyond the classical two-dimensional neural activity space spanned by the line attractor and selection vector (Figure 6C, right) in accounting for the selection vector modulation. This is simply because only the second-order selection vector component, which depends on the existence of the intermediate variable, is subject to contextual modulation. In other words, without extra-dimensions to support intermediate variable encoding, there will be no selection modulation.
Together, this set of analysis provided a parsimonious pathway-based understanding for both selection vector and its contextual modulation.
Model prediction verification with vanilla RNN models
The new insights obtained from our new framework enable us to generate testable predictions for better differentiating selection vector modulation from input modulation, a major challenge unresolved in the field (Pagan et al., 2022).
First, we predict that it is more likely to have a large proportion of selection vector modulation for a neural network with high-dimensional connectivity. To better explain the underlying rationale, we can simply compare the number of potential connections contributing to the input modulation with those contributing to the selection vector modulation for a given neural network model. For example, in the network presented in Figure 5, there are three connections (including light blue, dark blue and green ones) while only one connection (i.e., the green one) supporting selection vector modulation. For a circuit with many higher-order pathways (e.g., Figure 7A), only those connections with the input as the sender is able to support input modulation. In other words, there exist far more connections potentially eligible to support the selection vector modulation (Figure 7B), thereby leading to a large selection vector modulation proportion. We then tested this prediction on vanilla RNNs trained through backpropagation (Figure 7C; Mante et al., 2013; Song et al., 2016; Yang & Wang, 2020). Using effective dimension (see Methods for a formal definition; Rudelson & Vershynin, 2007; Sanyal et al., 2020) to quantify the dimensionality of the connectivity matrix, we found a strong positive correlation between the effective dimension of connectivity matrix and the selection vector modulation (Figure 7D, left panel and Figure 7-figure supplement 1A).
We then asked if we could generate predictions to quantify the proportion of selection vector modulation purely based on neural activities. To this end, as what has been performed in Pagan et al., we took advantage of the pulse-based sensory input setting to calculate the single neuron response kernel (see Methods for details). For a given neuron, the associated response kernel is defined to characterize the influence of a pulse input on its firing rate in later times. For example, for a neuron encoding the decision variable, the response kernel should exhibit a profile accumulating evidence over time (Figure 7E, top left). In contrast, for a neuron encoding the sensory input, the response kernel should exhibit an exponential decay profile with time constant τ (Figure 7E, top right). For each RNN trained through backpropagation, we then examined the associated single neuron response kernels. We found that for the model with low effective dimension (denoted by a star marker in Figure 7D), there were mainly two types of response kernels, including sensory input profile and decision variable profile (Figure 7E, top). In contrast, for the model with the highest effective dimension (denoted by a square marker in Figure 7D), aside from the sensory input and decision variable profiles, richer response kernel profiles were exhibited (Figure 7E, bottom). In particular, there was a set of single neuron response kernels with peak amplitudes occurring between the pulse input onset and the choice onset (Figure 7E, bottom right). These response kernels cannot be explained by the combination of sensory input and decision variable. Instead, the existence of these response kernels signifies neural dynamics in extra-dimensions beyond the subspaces spanned by the input and decision variable, the genuine neural dynamical signature of the existence of selection vector modulation (Figure 6C).
While single neuron response kernels are illustrative in highlighting the model difference, they lack explanatory power at the population level. Therefore, we employed the singular value decomposition method to extract the principal dynamical modes of response kernels at the population level (see Methods for details). We found that similar dynamical modes, including one persistent choice mode (grey) and three transient modes (blue, orange, green), were shared across both the low and high effective dimension models (Figure 7F, left). The key difference between these two models lies in the percentage of explained variance (PEV) of the second transient mode (orange): while there is near-zero PEV in the low effective dimension model (Figure 7F, top right), there is a substantial PEV in the high effective dimension model (Figure 7F, bottom right), consistent with the single neuron picture shown in Figure 7E. This result led us to use the PEV of extra dynamical modes (including the orange and green ones; see Methods for details) as a simple index to quantify the amount of selection vector modulation in these models. As expected, we found that the PEV of extra dynamical modes can serve as a reliable index reflecting the proportion of selection vector modulation in these models (Figure 7G, right panel and Figure 7-figure supplement 1, B and C). Similar results for vanilla RNNs trained with different hyperparameter settings are displayed in Figure 7-figure supplement 2.
Together, we identified novel neural dynamical signatures of section vector modulation at both the single neuron and population level, suggesting that the potential great utility of these neural dynamical signatures in distinguishing the contribution of selection vector modulation from input modulation in experimental data.
Discussion
Using low-rank RNNs, we provided a rigorous theoretical framework linking network connectivity, neural dynamics, and selection mechanisms, and gained an in-depth algebraic and geometric understanding of both input and selection vector modulation mechanisms, and accordingly uncovered a previously unknown link between selection vector modulation and extra-dimensions in neural state space. This gained understanding enabled us to generate novel predictions linking novel neural dynamic modes with the proportion of selection vector modulation, paving the way towards addressing the intricacy of neural variability across subjects in context-dependent computation.
A pathway-based definition of selection vector modulation
In their seminal work, Mante, Sussillo, and their collaborators developed a numerical approach to compute the selection vector for trained RNN models (Mante et al., 2013). Based on this concept of selection vector, recently, Pagan et al. proposed a new theoretical framework to decompose the solution space of context-dependent decision-making, in which input modulation and selection vector modulation were explicitly defined (i.e., Eq. 1). Here, taking the theoretical advantage of low-rank RNNs (Mastrogiuseppe & Ostojic, 2018; Dubreuil et al., 2022), we went beyond numerical reverse-engineering and provided a complementary pathway-based definition of both selection vector and selection vector modulation (i.e., Eq. 5). This new definition gained us a novel geometric understanding of selection vector modulation, revealed a previously unknown link between extra dimensions and selection vector modulation (Figure 6C), and eventually provided us experimentally identifiable neural dynamical signature of selection vector modulation at both the single neuron and population levels (Figure 7, E-G).
Individual neural variability in higher cognition
One hallmark of higher cognition is individual variability, as the same higher cognition problem can be solved equally well with different strategies. Therefore, studying the neural computations underlying individual variability is no doubt of great importance (Hariri, 2009; Parasuraman & Jiang, 2012; Keung et al., 2020; Nelli et al., 2023). Recent experimental advances enabled researchers to investigate this important issue in a systematic manner using delicate behavioral paradigms and large-scale recordings (Pagan et al., 2022). However, the computation underlying higher cognition is largely internal, requiring discovering novel neural activity patterns as internal indicators to differentiate distinct circuit mechanisms. In the example of context-dependent decision-making studied here, to differentiate selection vector modulation from input modulation, we found the PEV in extra dynamical modes is a reliable index for a wide variety of RNNs (Figure 7D and Figure 7-figure supplement 2). However, cautions have be made here as we can conveniently construct counter-examples deviating from the picture depicted by this index. In fact, in the extreme scenario, we can construct two models with distinct circuit mechanisms (selection vector modulation and input modulation, respectively) but having the same neural activities (Figure 7-figure supplement 3), suggesting that any activity-based index alone would fail to make this differentiation. Therefore, our modeling work suggests that, to address the intricacy of individual variability of neural computations underlying higher cognition, integrative efforts incorporating not only large-scale neural activity recordings but also activity perturbations, neuronal connectivity knowledge and computational modeling may be inevitably required.
Beyond context-dependent decision making
While we mainly focused on context-dependent decision-making task in this study, the issue of whether input or selection vector modulation prevails is not limited to the domain of decision-making. For instance, recent work (Chen et al., 2024) demonstrated that during sequence working memory control (Botvinick & Watanabe, 2007; Xie et al., 2022), sensory inputs presented at different ordinal ranks first entered into a common sensory subspace and then were routed to the corresponding rank-specific working memory subspaces in monkey frontal cortex. Here, similar to the decision-making case (Figure 1C), the same issue arises: where is the input information selected by the context (here, the ordinal rank)? Can the presence of a common sensory subspace (similar to the presence of location information in both relevant and irrelevant contexts in Figure 1C) preclude the input modulation? The pathway-based understanding of input and selection vector modulation gained from CDM in this study may be transferable to address these similar issues.
The role of transient dynamics in extra-dimensions in context-dependent computation
In this study, we linked the selection vector modulation with transient dynamics (Aoi et al., 2020; Soldado-Magraner et al., 2023) in extra-dimensions. While the transient dynamics in extra-dimensions is not necessary in context-dependent decision-making here (Dubreuil et al., 2022; Figure 2), more complex context-dependent computation may require its presence. For example, recent work (Tian et al., 2024) found that transient dynamics in extra subspaces is required to perform the switch operation (i.e., exchanging information in subspace 1 with information in subspace 2). Understanding how transient dynamics in extra-dimensions contribute to complex context-dependent computation warrants further systematic investigation.
In summary, through low-rank neural network modeling, our work provided a parsimonious mechanistic account for how information can be selected along different pathways, making significant contributions towards understanding the intriguing selection mechanisms in context-dependent computation.
Appendix
The general form of RNNs
We investigated networks of N neurons with S input channels, described by the following temporal evolution equation
In this equation, xi(t) represents the activation of neuron i at time t, τ denotes the characteristic time constant of a single neuron, and ϕ is a nonlinear activation function. Unless otherwise specified, we use the tanh function as the activation function. The coefficient Jij represents the connectivity weight from neuron j to neuron i. The input us(t) corresponds to the s-th input channel at time t, with feedforward weight Isi to neuron i, and ϵi(t) represents white noise at time t. The network’s output is obtained from the neuron’s activity ϕ(x) through a linear projection:
The connectivity matrix J, specified as J = {Jij}, i = 1, …, N, j = 1, …, N, can be a low-rank or a full-rank matrix. In the low-rank case, J is restricted to a low-rank matrix , in which mr is the r-th output vector, and nr is the r-th input-selection vector, with each element of mr and nr considered an independent parameter (Dubreuil et al., 2022). In this paper, we set the time constant τ to be 100 ms, and use Euler’s method to discretize the evolution equation with a time step Δt = 20 ms.
Task setting
We modeled the click-version CDM task recently investigated by (Pagan et al., 2022). The task involves four input channels , and , where u1(t) and u2(t) are stimulus inputs and and are context inputs. Initially, there is a fixation period lasting for Tfix = 200 ms. This is followed by a stimulus period of Tsti = 800 ms and then a decision period of Tdecision = 20 ms.
For trial k, at each time step, the total number of pulse inputs (#pulse) is sampled from a Poisson distribution with a mean value of 40Δt = 0.8. Each pulse has two properties: location and frequency. Location can be either left or right, and frequency can be either high or low. We randomly sample a pulse to be right with probability (hence left with probability ) and to be high with probability (hence low with probability). The values of and are independently chosen from the set {39/40, 35/40, 25/40, 15/40, 5/40, 1/40}. The stimulus strength for the location input at trial k is defined as , and for the frequency input, it is defined as . The input u1(t) represents location evidence, calculated as 0.1 times the difference between the number of right pulses and the number of left pulses (#right-#left) at time step t. The input u2(t) represents frequency evidence, calculated as 0.1 times the difference between the number of high-frequency pulses and the number of low-frequency pulses (#high-#low) at time step t.
The context is randomly chosen to be either the location context or the frequency context. In the location context, and throughout the entire period. The target output zk is defined as the sign of the location stimulus strength (1 if , otherwise -1). Thus, in this context, the target output zk is independent of the frequency stimulus input. Conversely, in the frequency context, and . The target output zk is defined as the sign of the frequency stimulus strength (1 if , otherwise -1). Thus, in this context, the target output is independent of the location stimulus input.
Linearized dynamical system analysis (Figure 1)
To uncover the computational mechanism enabling each RNN to perform context-dependent evidence accumulation, we utilize linearized dynamical system analysis to “open the black box” (Mante et al., 2013). The dynamical evolution of an RNN in context c is given by:
where r = ϕ(x) is the neuron activity. First, we identify the slow point of each RNN in each context using an optimization method (Mante et al., 2013). Let x* be the discovered slow point, i.e., where r* = ϕ(x*). We define a diagonal matrix G = diag(ϕ′(x*)), with the i-th diagonal element representing the sensitivity of the i-th neuron at the slow point. Near the slow point, we have Δr = r − r* ≈ Gϕ′(Δx). Then, we can derive:
Thus, the dynamics of neuron activity around the slow point can be approximated by a linear dynamical system:
where E is the identity matrix, M denotes the state transition matrix of the dynamical system, and are stimulus input representation directions. Similar to previous work (Mante et al., 2013), we find that for every network, the linear dynamical system near the slow point in each context is roughly a linear attractor. Specifically, the transition matrix M has a single eigenvalue close to zero, while all other eigenvalues have negative real parts. The right eigenvector of M associated with the eigenvalue close to zero defines the stable direction of the dynamical system, forming the line attractor ρ (unit norm). The left eigenvector of M associated with that eigenvalue defines the direction of the selection vector s. The norm of selection vector s is chosen such that s ⋅ ρ = 1. Previous work (Mante et al., 2013) has shown that a perturbation Δr0 from the line attractor will eventually converge to the line attractor, with the distance from the starting point being s ⋅ Δr0.
Based on linearized dynamical systems analysis, Pagan et al. recently defined input modulation and selection vector modulation (Pagan et al., 2022) as:
Specifically, the input modulation and selection vector modulation for stimulus input 1 are defined as and , respectively. Similarly, the input modulation and selection vector modulation for stimulus input 2 are defined as and , respectively.
Training of rank-1 RNNs using backpropagation (Figure 2)
The rank-1 RNNs in Figure 2 are trained using backpropagation-through-time with the PyTorch framework. These networks are trained to minimize a loss function defined as:
Here, zk is the target output, is the network output, and the indices t and k represent time and trial, respectively. Mt is a temporal mask with value {0, 1}, where Mt is 1 only during the decision period. Lreg is the L2 regularization loss. For full-rank RNNs (Figure 7), and for low-rank RNNs,. The loss function is minimized by computing gradients with respect to all trainable parameters. We use Adam optimizer in PyTorch with the decay rates for the first and second moments set to 0.9 and 0.99, respectively, and a learning rate of 10−3. Each RNN is trained for 5000 steps with a batch size of 256 trials.
For Figure 2, we trained 100 RNNs of N = 512 neurons with the same training hyperparameters. The rank of the connectivity matrix is constrained to be rank-1, represented as . We trained the elements of the input vectors , connectivity vector mdv, ndv, and the readout vector w. All trainable parameters were initialized with random independent Gaussian weights with a mean of 0 and a variance of 1/N2. The regularization coefficient wreg is set to 10−4.
Proof of no selection vector modulation in rank-1 RNNs (Figure 2)
The transition matrix of neuron activity in the rank-1 RNN is given by:
Multiplying MT on the right by ndv, we obtain
where the ⟨⋅,⋅⟩ symbol is defined as for two vectors of length-N. The requirement for the linear attractor approximation, , is met by all our trained and handcrafted RNNs (Figure 4-supplement figure 1). This demonstrates that ndv is the left eigenvector of the transition matrix, and hence is the selection vector in each context. Therefore, the direction of the selection vector is invariant across different contexts for the rank-1 model, indicating no selection vector modulation and this is consistent with the training results shown in Figure 2.
Handcrafting rank-3 RNNs with pure selection vector modulation (Figure 3)
First, we provide the implementation details of the rank-3 RNNs used in Figure 3. The network consists of 30,000 neurons divided into three populations, each with 10,000 neurons. The first population (neurons 1-10,000) receives stimulus input, accounting for both the information flow from the stimulus input to the intermediate variables and the recurrent connection of the decision variable. The second (neurons 10,001-20,000) and third populations (neurons 20,001-30,000) handle the information flow from the intermediate variables to the decision variable and are modulated by the context signal. To achieve this, we generate three Gaussian random matrices (M(1), M(2), M(3)) of shape 10,000 × 3. Let denote the r-th column of matrix M(p). The stimulus input I1 is given by the concatenation of three length-10,000 vectors , where 0 denotes a length-10,000 zero vector. The stimulus input I2 is given by . The context input is given by . The context input is given by . The connectivity vectors miv1, miv2 and mdv are given by , and , respectively. The input-selection vectors niv1, niv2, and ndv are given by , and , respectively, where represents the average gain of the second population in context 1 or the third population in context 2. The readout vector w is given by . We generate 100 RNNs based on this method to ensure that the conclusions in Figure 3 do not depend on the specific realization of random matrices. Linearized dynamical system analysis reveals that all these RNNs perform flexible computation through pure selection vector modulation. Please see the section “The construction of rank-3 RNN models” for a mean-field-theory-based understanding.
Pathway-based information flow graph analysis of low-rank RNNs (Figure 4)
The dynamical evolution equation for low-rank RNN with R ranks and S input channels in context c is given by
Assuming at t = 0, the dynamics of x(t) − x(0) are always constrained in the subspace spanned by {mr, r = 1, …, R} and {Is, s = 1, …, S}. Therefore, x(t) can be expressed as a linear combination of these vectors: , leading to the following evolving dynamics of task variables:
where denotes activation along the s-th input vector, termed the input task variable and kr(t) denotes activation along the j-th output vector, termed the internal task variable.
The rank-1 RNN case
Therefore, for rank-1 RNNs, the latent dynamics of decision variable (internal task variable associated with mdv) in context c is given by
Similar to the linearized dynamical systems analysis introduced earlier, we linearized equation (15) around , obtaining the following linearized equation:
where (termed as the decision variable representation direction) and (termed as the input representation direction), with Gc equal to . By denoting and as and , respectively, together with the fact that is close to zero for all trained rank-1 RNNs, we obtain
which is Eq. 3 in the main text.
The rank-3 RNN case
Using a similar method, we can uncover the latent dynamics of rank-3 RNNs shown in Figure 5. Note that the rank-3 RNN in Figure 3 is a special case of this more general form. The latent dynamics for internal task variables in context c can be written as:
By applying the same first-order Taylor expansion, we obtain the following equations:
We consider the case in which the intermediate variables (, internal task variables associated with , respectively) only receive information from the corresponding stimulus input, and the effective coupling of the recurrent connection is 1 in both contexts.
Specifically, we assume:
Our construction methods for rank-3 RNNs in Figures 3 and 5 guarantee these conditions when the network is large enough (for example, N = 30,000 in our setting). Under these conditions, equations (20) and (21) can be simplified to:
Suppose at t = 0, the network receives a pulse from input 1 with size A1 and a pulse from input 2 with size A2, which correspond to u1(t) = A1τδ(t), u2(t) = A2τδ(t). Under this condition, the expression for kdv is given by
From equation (27), as t → ∞, kdv will converge to ,providing a theoretical basis for the pathway-based information flow formula presented in Figures 4 and 5.
Building rank-3 RNNs with both input and selection vector modulations (Figure 5)
Understanding low-rank RNNs in the mean-field limit
The dynamics of task variables in low-rank RNNs can be mathematically analyzed under the mean-field limit (N → +∞) when each neuron’s connectivity component is randomly drawn from a multivariate Gaussian mixture model (GMM) (Beiran et al., 2021; Dubreuil et al., 2022). Specifically, we assume that, for the i-th neuron, the connectivity component vector is drawn independently from a GMM with P components. The weight for the j-th component is αj, and this component is modeled as a Gaussian distribution with mean zero and covariance matrix Σ(j). Let denote the upper-left (S + R + 2) × (S + R + 2) submatrix of Σ(j), which represents the covariance matrix of within the j-th component. Let denote the covariance of a(i) and b(i), where the i-th neuron belongs to the p-th component.
Given these assumptions, under the mean-field limit (N → 2∞), equation (14) can be expressed as
where if i = c, otherwise . Under the condition of small task variables and kr, r = 1, …, R), is approximately equal to and the quantities ⟨ϕ′⟩p, p = 1, …, P are determined solely by the covariance of the context c signal input within each population. Clearly, the effective coupling from the input task variable to the internal task variable kr is given by , and the effective coupling between the internal task variables kj and kr is given by .
Mean-field-theory-based model construction
Utilizing this theory, we can construct RNNs tailored to any given ratio of input modulation to selection vector modulation by properly setting the connectivity vectors . The RNN we built consists of 30,000 neurons divided into three populations. The first population (neurons 1-10,000) receives the stimulus input, accounting for information flow from stimulus input to intermediate variables (the connection strength is controlled by β) and the recurrent connection of the decision variable. The second (neurons 10,001-20,000) and third populations (neurons 20,001-30,000) receive the stimulus input, accounting for the information flow from the stimulus input to the decision variable (the connection strength is controlled by α) and the information flow from the intermediate variables to the decision variable (the connection strength is controlled by η), and are modulated by contextual input. To achieve this, we generate three Gaussian random matrices (M(1), M(2), M(3)) of shape 10,000 × 3, 10,000 × 5 and 10,000 × 5, respectively. Let denote the r-th column of matrix M(p). The stimulus input I1 is given by the concatenation of three length-10,000 vectors . The stimulus input I2 is given by . The context input is given by .The context input is given by . The connectivity vectors miv1, miv2, and mdv are given by and , respectively. The input-selection vectors and are given by , respectively. ndv is given by where is the average gain of the second population in context 1 or third population in context 2. The readout vector w is given by .
For the network, in context 1, ⟨ϕ′⟩p is only determined by the covariance of the context 1 signal. That is,
Our construction method guarantees that for p = 1,2,3, respectively. Hence, ⟨ϕ′⟩p = 1, g, 1 for p = 1,2,3, respectively. Then the effective coupling from input variable to in context 1 is given by
Similarly, we can get that:
Other unmentioned effective couplings are zero. Similarly, the effective couplings in context 2 are given by
Thus, for each input, the input modulation is α and the selection vector modulation is β × η. Therefore, any given ratio of input modulation to selection vector modulation can be achieved by varying the parameters α, β, and η. The example in Figure 3 with pure selection vector modulation is a special case with , and η = (1 − g2)/3.
Pathway-based definition of selection vector (Figure 6)
Next, we will consider the rank-3 RNN with latent dynamics depicted in equations (18) and (19). The input representation produced by a pulse input us = τAsδ(t), s = 1,2 in a certain context at t = 0 is given by . We have proven that this pulse input will ultimately reach the dv slot with a magnitude of . This equation can be rewritten as
Therefore, we define as the pathway-based definition of the selection vector. Through calculations, we can prove that this pathway-based definition of the selection vector is equivalent to the classical definition based on linearized dynamical systems. In fact, the transition matrix of neuron activity in this rank-3 RNN is given by:
Multiplying MT on the right by ntol, we obtain:
Moreover, under the condition that the choice axis is invariant across contexts, i.e., is invariant across contexts (Mante et al., 2013; Pagan et al., 2022),
This demonstrates that is indeed the left eigenvector of the transition matrix as well as the classical selection vector of the linearized dynamical systems.
The equivalence between two definitions of selection vector modulation (Figure 5)
Here, we use the rank-3 RNN and input 1 as an example to explain why there is an equivalence between our definition of selection vector modulation and the classical one (Pagan et al., 2022). In the previous section, we have proven that the input representation direction and selection vector are given by and , respectively. According to the classical definition (Pagan et al., 2022), the input modulation and selection vector modulation are given by and , respectively. Since and ndv are invariant across different contexts, we have:
Substituting these into equations (7) and (8) yields:
a result fully consistent with the pathway-based definition in Eq. 5 of the main text.
Pathway-based analysis of higher order low-rank RNNs (Figures 7, A and B)
In this section, we will consider RNNs with more intermediate variables (Figure 7A). Here, we consider only one stimulus modality for simplicity and use the same notation convention in the previous sections. The network consists of one input variable kinp, L intermediate variables , and one decision variable kdv. For simplicity, we let be an alias for the input variable and be an alias for the decision variable. We use the shorthand Ei→j and Ei→dv to denote the effective coupling and , respectively. The dynamics of these task variables are given by
Consider the case when, at time t = 0, the network receives a pulse input with unit size (i.e., u(t) = τδ(t)). Then, we have
where . In equation (52), Γ stands for the incomplete Gamma function and . These expressions tell us that, as time goes to infinity, all intermediate task variables will decay to zero and the decision variable will converge to , which means that a pulse input of unit size will ultimately reach the dv slot with a magnitude of . Therefore, we define the total effective coupling from the input variable to the decision variable in this higher-order graph as
The difference of the total effective coupling between relevant context and irrelevant context can be decomposed into:
The first term, caused by changing of stimulus input representation, is defined as the input modulation. The second term, the one without changing the stimulus input representation, is defined as the selection vector modulation.
Using a similar method in rank-3 RNNs, the selection vector for RNNs of higher order is given by:
Training of full-rank vanilla RNNs using backpropagation (Figure 7)
For Figure 7, we trained full-rank RNNs of N = 128 neurons. We trained the elements of the input vectors, the connectivity matrix, and the readout vector. We tested a large range of regularization coefficients ranging from 0 to 0.1. For each rreg chosen from the set {0, 0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008, 0.009, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1}, we trained 100 full-rank RNNs. A larger wreg value results in trained connectivity matrix J with lower rank, making the network more similar to a rank-1 RNN. All trainable parameters were initialized with random independent Gaussian weights with a mean of 0 and variance of 1/N2. Only trained RNNs with their largest eigenvalue of the activity transition matrix falling within the -0.05 to 0.05 range in both contexts were selected for subsequent analysis.
To ensure that the conclusions drawn from Figure 7 are robust and not dependent on specific hyperparameter settings, we conducted similar experiments under different model hyperparameter settings. First, we trained RNNs with the softplus activation function and regularization coefficients chosen from {0.1, 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.008, 0.004, 0.002, 0.001} (Figure 7-figure supplement 2A). Unlike tanh, softplus does not saturate on the positive end. Additionally, we tested the initialization of trainable parameters with a variance of 1/N (Figure 7-figure supplement 2B). Together, these experiments confirmed that the main results do not depend on the specific model hyperparameter settings.
Estimating matrix dimension and extra-dimension (Figure 7)
Effective dimension of connectivity matrix (Figure 7D)
Let σ1 ≥ σ2 ≥⋯≥ σn be the singular values of the connectivity matrix. Then, the effective dimension of matrix J is defined as its stable rank (Sanyal et al., 2020):
Single neuron response kernel for pulse input (Figure 7E)
We apply pulse-based linear regression (Pagan et al., 2022) to access how pulse input affects neuron activity. The activity of neuron i in trial k at time step t is given by:
where choice(k) is the RNN’s choice on trial k defined as the sign of its output during the decision period, context(k) is the context of trial k (1 for context 1 and 0 for context 2), ui,k indicates signed input i evidence as defined previously in Methods. The first three regression coefficients, βchoice;i,t, βcontext;i,t and βtime;i,t, capture the influent of choice, context, and time on the neuron’s activity at each time step, each being a 40-dimension vector. The remaining coefficient, and reflect the impact of pulse input on the neuron activity in specific context, each also a 40-dimension vector. The asterisk (*) indicates the convolution operation between the response kernel and input evidence, described by:
Therefore, there are a total of 280 regression coefficients for each neuron. We obtained these coefficients using ridge regression with 1000 trails for each RNN. The coefficient is termed the single neuron response kernel for input i in context c (Figure 7E).
Response kernel modes and normalized percentage of explained variance (PEV) (Figure 7F)
For each RNN, we construct a matrix B with shape Nt × (2N) from neuron response kernels for input 1 (or input 2) across both contexts, where Nt = 40 represents time steps and N is the number of neurons, with each column as a neuron’s response kernel in a context. Then we apply singular value decomposition (SVD) to the matrix:
where S is a diagonal square matrix of size Nt, with diagonal element being the singular value of the matrix. The first column of U serves as a persistent mode (mode 0), while the second and following columns are transient modes (transient dynamical modes 1, 2, etc.). The normalized PEV of transient dynamical mode r, r ≥ 1 is defined as:
PEV of extra dynamical modes
The PEV of extra dynamical modes for input 1 (or 2) is defined based on the normalized PEV of transient dynamical mode:
The counterintuitive extreme example (Figure 7-figure supplement 3)
We can manually construct two models (RNN1 and RNN2) with distinct circuit mechanisms (input modulation versus selection vector modulation) but showing the same neural activities. Specifically, we generated three Gaussian random matrices (M(1), M(2), M(3)) of shape 10,000 × 5, 10,000 × 5 and 10,000 × 1, respectively. Let denote the r-th column of matrix M(p). The two models have the same input vectors, output vectors, and input-selection vectors for intermediate task variables ( and ), given by:
where . The difference between these two RNNs lies in their input-selection vector ndv for the decision variable. For RNN1, ndv is given by
For RNN2, ndv is given by:
In this construction, the information flow graphs from input 1 to the decision variable (dv) in each context for the two RNNs are shown in Figure 7-figure supplement 3A.
Although the connectivity matrices of the two networks are different (Figure 7-figure supplement 3B), when provided with the same input, the neuron activity of the i-th neuron in RNN1 is exactly the same as that of the i-th neuron in RNN2 at the same time point (Figure 7-figure supplement 3C). The statistical results of similarity between all neuron pairs are given in Figure 7-figure supplement 3D.
References
- Prefrontal cortex exhibits multidimensional dynamic encoding during decision-makingNature Neuroscience 23:1410–1420https://doi.org/10.1038/s41593-020-0696-5
- The dimensionality of neural representations for controlCurrent Opinion in Behavioral Sciences 38:20–28https://doi.org/10.1016/j.cobeha.2020.07.002
- Early selection of task-relevant features through population gatingNature Communications 14https://doi.org/10.1038/s41467-023-42519-5
- Shaping Dynamics With Multiple Populations in Low-Rank Recurrent NetworksNeural Computation 33:1572–1615https://doi.org/10.1162/neco_a_01381
- Parametric control of flexible timing through low-dimensional neural manifoldsNeuron 111:739–753https://doi.org/10.1016/j.neuron.2022.12.016
- The Geometry of Abstraction in the Hippocampus and Prefrontal CortexCell 183:954–967https://doi.org/10.1016/j.cell.2020.09.031
- From Numerosity to Ordinal Rank: A Gain-Field Model of Serial Order Representation in Cortical Working MemoryThe Journal of Neuroscience 27:8636–8642https://doi.org/10.1523/JNEUROSCI.2110-07.2007
- Flexible control of sequence working memory in the macaque frontal cortexNeuron 112:1–13https://doi.org/10.1016/j.neuron.2024.07.024
- Cognitive Control: Core Constructs and Current ConsiderationsThe Wiley Handbook of Cognitive Control Wiley :1–28https://doi.org/10.1002/9781118920497.ch1
- Neural mechanisms of selective visual attentionAnnual Review of Neuroscience 18:193–222https://doi.org/10.1146/annurev.ne.18.030195.001205
- The role of population structure in computations through neural dynamicsNature Neuroscience 25:783–794https://doi.org/10.1038/s41593-022-01088-4
- Orthogonal representations for robust context-dependent task performance in brains and neural networksNeuron 110:1258–1270https://doi.org/10.1016/j.neuron.2022.01.005
- Why neurons mix: High dimensionality for higher cognitionCurrent Opinion in Neurobiology 37:66–74https://doi.org/10.1016/j.conb.2016.01.010
- The Neurobiology of Individual Differences in Complex Behavioral TraitsAnnual Review of Neuroscience 32:225–247https://doi.org/10.1146/annurev.neuro.051508.135335
- Predictive coding in balanced neural networks with noise, chaos and delaysAdvances in Neural Information Processing Systems 33:16677–16688
- A divisive model of evidence accumulation explains uneven weighting of evidence over timeNature Communications 11https://doi.org/10.1038/s41467-020-15630-0
- Coherent chaos in a recurrent neural network with structured connectivityPLOS Computational Biology 14https://doi.org/10.1371/journal.pcbi.1006309
- How recurrent networks implement contextual processing in sentiment analysisProceedings of the 37th International Conference on Machine Learning :6608–6619
- Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamicsAdvances in Neural Information Processing Systems 32
- Context-dependent computation by recurrent dynamics in prefrontal cortexNature 503:78–84https://doi.org/10.1038/nature12742
- Linking Connectivity, Dynamics, and Computations in Low-Rank Recurrent Neural NetworksNeuron 99:609–623https://doi.org/10.1016/j.neuron.2018.07.003
- An Integrative Theory of Prefrontal Cortex FunctionAnnual Review of Neuroscience 24:167–202https://doi.org/10.1146/annurev.neuro.24.1.167
- An approximate line attractor in the hypothalamus encodes an aggressive stateCell 186:178–193https://doi.org/10.1016/j.cell.2022.11.027
- Neural knowledge assembly in humans and neural networksNeuron 111:1504–1516https://doi.org/10.1016/j.neuron.2023.02.014
- TOP-DOWN CONTROL OF VISUAL ATTENTIONCurrent Opinion in Neurobiology 20:183–190https://doi.org/10.1016/j.conb.2010.02.003
- Neural Mechanisms That Make Perceptual Decisions FlexibleAnnual Review of Physiology 85:191–215https://doi.org/10.1146/annurev-physiol-031722-024731
- Computational role of structure in neural activity and connectivityTrends in Cognitive Sciences 28:677–690https://doi.org/10.1016/j.tics.2024.03.003
- A new theoretical framework jointly explains behavioral and neural variability across subjects performing flexible decision-makingbiorxiv https://doi.org/10.1101/2022.11.28.518207
- Brief technical note on linearizing recurrent neural networks (RNNs) before vs after the pointwise nonlinearityarXiv
- Individual differences in cognition, affect, and performance: Behavioral, neuroimaging, and molecular genetic approachesNeuroImage 59:70–82https://doi.org/10.1016/j.neuroimage.2011.04.040
- Prefrontal Cortex Activity during Flexible CategorizationJournal of Neuroscience 30:8519–8528https://doi.org/10.1523/JNEUROSCI.4837-09.2010
- Sampling from large matrices: An approach through geometric functional analysisJournal of the ACM 54https://doi.org/10.1145/1255443.1255449
- Abstract Context Representations in Primate Amygdala and Prefrontal CortexNeuron 87:869–881https://doi.org/10.1016/j.neuron.2015.07.024
- Stable Rank Normalization For Improved Generalization in Neural Networks and GansarXiv
- Dynamics of random recurrent networks with correlated low-rank structurePhysical Review Research 2https://doi.org/10.1103/PhysRevResearch.2.013111
- Cortical information flow during flexible sensorimotor decisionsScience (New York, N.Y 348:1352–1355https://doi.org/10.1126/science.aab0551
- Inferring context-dependent computations through linear approximations of prefrontal cortex dynamicsbioRxiv https://doi.org/10.1101/2023.02.06.527389
- Training Excitatory-Inhibitory Recurrent Neural Networks for Cognitive Tasks: A Simple and Flexible FrameworkPLOS Computational Biology 12https://doi.org/10.1371/journal.pcbi.1004792
- Opening the Black Box: Low-Dimensional Dynamics in High-Dimensional Recurrent Neural NetworksNeural Computation 25:626–649https://doi.org/10.1162/NECO_a_00409
- Adapting non-invasive human recordings along multiple task-axes shows unfolding of spontaneous and over-trained choiceeLife 10https://doi.org/10.7554/eLife.60988
- Mental Programming of Spatial Sequences in Working Memory in Macaque Frontal CortexScience
- Probing the Relationship Between Latent Linear Dynamical Systems and Low-Rank Recurrent Neural Network ModelsNeural Computation 34:1871–1892https://doi.org/10.1162/neco_a_01522
- Geometry of sequence working memory in macaque prefrontal cortexScience 375:632–639https://doi.org/10.1126/science.abm0204
- Artificial Neural Networks for Neuroscientists: A PrimerNeuron 107:1048–1070https://doi.org/10.1016/j.neuron.2020.09.005
Article and author information
Author information
Version history
- Preprint posted:
- Sent for peer review:
- Reviewed Preprint version 1:
Copyright
© 2024, Zhang et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- views
- 70
- download
- 1
- citations
- 0
Views, downloads and citations are aggregated across all versions of this paper published by eLife.