Aligned and oblique dynamics in recurrent neural networks

  1. Faculty of Electrical Engineering and Computer Science, Technical University Berlin, Germany
  2. Science of Intelligence, Research Cluster of Excellence, Berlin, Germany
  3. Champalimaud Research, Lisbon, Portugal
  4. Laboratoire de Neurosciences Cognitives et Computationnelles, INSERM U960, Ecole Normale Superieure - PSL Research University, Paris, France
  5. Rappaport Faculty of Medicine and Network Biology Research Laboratories, Technion - Israel Institute of Technology, Haifa, Israel

Editors

  • Reviewing Editor
    Tatyana Sharpee
    Salk Institute for Biological Studies, La Jolla, United States of America
  • Senior Editor
    Michael Frank
    Brown University, Providence, United States of America

Reviewer #1 (Public Review):

Summary:

In this work, the authors utilize recurrent neural networks (RNNs) to explore when and how neural dynamics and the network's output are related from a geometrical point of view. The authors found that RNNs operate between two extremes: an 'aligned' regime, in which the output weights and the largest PCs are strongly correlated, and an 'oblique' regime, in which the output weights and the largest PCs are poorly correlated. Large output weights led to oblique dynamics, and small output weights to aligned dynamics. This distinction determines whether networks are robust to perturbations along output directions. The results were linked to experimental data by showing that these regimes can be identified in neural recordings from several experiments.
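The correlation described above can be pictured as asking how much of the readout direction lies within the top principal components of the population activity. The sketch below is an illustrative projection-based proxy (a simplified stand-in, not the paper's exact similarity measure; the toy data and all variable names are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def alignment(X, w, k=2):
    """Norm of the unit readout vector w projected onto the top-k PCs of
    activity X (T x N): close to 1 = aligned, close to 0 = oblique."""
    Xc = X - X.mean(axis=0)
    # principal directions of the activity via SVD of the centered data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k]                        # top-k principal directions (k x N)
    w_hat = w / np.linalg.norm(w)
    return np.linalg.norm(P @ w_hat)  # length of w's projection onto the PC subspace

N, T = 50, 500
w = rng.standard_normal(N)

# Aligned toy case: activity variance concentrated along the readout direction
latent = rng.standard_normal((T, 1))
X_aligned = latent @ (w / np.linalg.norm(w))[None, :] + 0.05 * rng.standard_normal((T, N))

# Oblique toy case: large isotropic activity sharing little variance with w
X_oblique = 3.0 * rng.standard_normal((T, N))

print(alignment(X_aligned, w), alignment(X_oblique, w))
```

With these toy statistics the aligned case yields a projection near 1 and the oblique case a projection near the chance level for a random direction in 50 dimensions.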

Strengths:

A diverse set of relevant tasks.

A well-chosen similarity measure.

Exploration of various hyperparameter settings.

Weaknesses:

One of the major connections to experiments found that, in BCI data, neural variance was aligned to the outputs. Maybe I was confused about something, but doesn't this have to be the case by the design of the experiment? The outputs of the BCI are chosen to align with the largest principal components of the data.

The proposed experiments may have already been done ("New neural activity patterns emerge with long-term learning", Oby et al., 2019). My understanding of those results is that activity moved to be aligned as the manifold changed, but more analyses could be done to fully understand the relationship between those experiments and this work.

Analysis of networks was thorough, but connections to neural data were weak. I am thoroughly convinced of the reported effect of large or small output weights in networks. I also think this framing could aid in future studies of interactions between brain regions.

This is an interesting framing to consider the relationship between upstream activity and downstream outputs. As more labs record from several brain regions simultaneously, this work will provide an important theoretical framework for thinking about the relative geometries of neural representations between brain regions.

It will be interesting to compare the relationship between the geometries of representations and neural dynamics across different connected brain areas, from those closer to the periphery to more central ones.

It is exciting to think about the versatility of the oblique regime for shared representations and network dynamics across different computations.

The versatility of the oblique regime could lead to differences between subjects in neural data.

Reviewer #2 (Public Review):

Summary:

This paper tackles the problem of understanding when the dynamics of neural population activity do and do not align with some target output, such as an arm movement. The authors develop a theoretical framework based on RNNs showing that the alignment of neural dynamics to the output can be controlled simply by the magnitude of the read-out weight vector when the RNN is trained. Small-magnitude vectors result in aligned dynamics, where low-dimensional neural activity recapitulates the target; large-magnitude vectors result in "oblique" dynamics, where the encoding is spread across many dimensions. The paper further explores how the aligned and oblique regimes differ; in particular, the oblique regime allows degenerate solutions for the same target output.
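The mechanism in this summary, that the readout magnitude sets how much activity must lie along the output direction, can be illustrated with back-of-the-envelope numbers. The scales and the 50% threshold below are arbitrary assumptions for illustration, not values from the paper:

```python
# For a readout z(t) = w . x(t) producing a target of fixed amplitude,
# the activity component along w must have magnitude |z| / ||w||.
# Small ||w|| forces a large component along the readout (aligned);
# large ||w|| lets the output-coding component be a tiny fraction
# of the total activity (oblique).
target_amp = 1.0            # assumed target output amplitude
total_activity_norm = 10.0  # assumed overall scale of population activity

regimes = {}
for w_norm in [0.1, 1.0, 10.0]:
    along_w = target_amp / w_norm         # activity needed along the readout
    frac = along_w / total_activity_norm  # fraction of activity coding the output
    regimes[w_norm] = "aligned" if frac > 0.5 else "oblique"
    print(f"||w|| = {w_norm:5.1f}: activity along w = {along_w:5.2f} "
          f"({100 * frac:.0f}% of total) -> {regimes[w_norm]}")
```

The point of the arithmetic is only that, for a fixed target, the aligned component scales as 1/||w||, so the initial readout magnitude alone can push the solution toward either regime.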

Strengths:

- A really interesting new idea that different dynamics of neural circuits can arise simply from the initial magnitude of the output weight vector: once written out (Eq 3) it becomes obvious, which I take as the mark of a genuinely insightful idea.

- The offered framework potentially unifies a collection of separate experimental results and ideas, largely from studies of the motor cortex in primates: the idea that much of the ongoing dynamics do not encode movement parameters; the existence of the "null space" of preparatory activity; and that ongoing dynamics of the motor cortex can rotate in the same direction even when the arm movement is rotating in opposite directions.

- The main text is well written, with a wide-ranging set of key results synthesised and illustrated well and concisely.

- The study shows that the occurrence of the aligned and oblique regimes generalises across a range of simulated behavioural tasks.

- A deep analytical investigation of when the regimes occur and how they evolve over training.

- The study shows where the oblique regime may be advantageous: allows multiple solutions to the same problem; and differs in sensitivity to perturbation and noise.

- An insightful corollary result that noise in training is needed to obtain the oblique regime.

- Tests whether the aligned and oblique regimes can be seen in neural recordings from primate cortex in a range of motor control tasks.

Weaknesses:

- The magnitude of the output weights is initially discussed as being fixed, and as far as I can tell all analytical results (sections 4.6-4.9) also assume this. But in all trained models that make up the bulk of the results (Figures 3-6), all three weight vectors/matrices (input, recurrent, and output) are trained by gradient descent. It would be good to see an explanation or results in the main text as to why training always ends up in the same mapping (small -> aligned; large -> oblique) when it could, for example, optimise the output weights instead, which is the usual target (e.g. Sussillo & Abbott, 2009, Neuron).

- It is unclear what it means for neural activity to be "aligned" for target outputs that are not continuous time-series, such as the 1D or 2D oscillations used to illustrate most points here. Two of the modelled tasks have binary outputs; one has a 3-element binary vector.

- It is unclear what criteria are used to assign the analysed neural data to the oblique or aligned regimes of dynamics.

  1. Howard Hughes Medical Institute
  2. Wellcome Trust
  3. Max-Planck-Gesellschaft
  4. Knut and Alice Wallenberg Foundation