Neuroscience

Feedback control of recurrent circuits imposes dynamical constraints on learning

Harsha Gurnani
Weixuan Liu
Bingni W Brunton

Center for Neural Science, New York University, New York, United States
Department of Biology, University of Washington, Seattle, United States
eScience Institute, University of Washington, Seattle, United States
Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, United States
Department of Applied Mathematics, University of Washington, Seattle, United States
Computational Neuroscience Center, University of Washington, Seattle, United States

https://doi.org/10.7554/eLife.111322.1

Open access

Figures and data

Low-dimensional dynamics in networks trained with feedback.
(a) Network schema and behavior. (Left) Model architecture of a recurrent neural network (RNN) controlling a cursor and receiving sensory feedback. The network activity r depends on recurrent interactions as well as two sets of inputs: feedforward and feedback. A linear readout of the network activity controls the velocity of an external cursor that is to be moved to a target location. (Right) Example cursor trajectories showing reaches to each of 8 radial targets without an external perturbation (“unperturbed”) or with an external cursor jump perturbation (example perturbations indicated by black triangles). Reaches to different targets are shown in different colors. Target locations are shown as triangles; endpoints of example reach trajectories are indicated by red circles. (b) Task performance in the presence of external perturbations (cursor jump), as a function of perturbation amplitude; hit rate (left) and mean acquisition time (right). (c) Activity of 13 example RNN units in task-trained networks; (left) an unperturbed trial with stimulus and Go-cue onset times indicated by dashed lines, and (right) a trial with cursor perturbation with the perturbation time indicated by a dashed line. The two examples are reaches to different targets, with small additive activity noise (σ = 0.03). (d) Average hit rate for different levels of activity noise (n=5 networks). (e) Dimensionality of RNN activity, measured as the number of PCs needed to reach 95% cumulative explained variance (CEV) and as the participation ratio computed from the eigenvalues of the covariance matrix. Horizontal lines indicate the 5th, 50th and 95th percentile values. (f) Movement-related activity projected on the movement period PCs (right), colored by trial target. (g) A modified latent linear dynamical system was fit to RNN firing rates as observations.
A low-dimensional latent dynamical system was modeled with linear dynamics A, which was fit along with input weights B, observation weights C, and bias terms d. (Right) Cross-validated R² for models with different numbers of latents (k=1 to 12).
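
The two dimensionality measures named in panel (e) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and the example eigenvalues are hypothetical, and the eigenvalues of the activity covariance matrix are assumed to have been computed elsewhere.

```python
def participation_ratio(eigenvalues):
    """Participation ratio: (sum of eigenvalues)^2 / (sum of squared eigenvalues)."""
    total = sum(eigenvalues)
    sum_sq = sum(ev * ev for ev in eigenvalues)
    return total * total / sum_sq


def n_components_for_cev(eigenvalues, threshold=0.95):
    """Smallest number of leading PCs whose cumulative explained variance reaches threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, ev in enumerate(ordered, start=1):
        cumulative += ev
        if cumulative / total >= threshold:
            return count
    return len(ordered)


# Hypothetical covariance spectrum, for illustration only (not data from the paper).
example_eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]
pr = participation_ratio(example_eigenvalues)
n95 = n_components_for_cev(example_eigenvalues)
```

The participation ratio weights all eigenvalues continuously, so it is usually smaller than the 95% CEV count when the spectrum has a long tail of small eigenvalues.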

Input plasticity leads to differential learning outcomes during adaptation to decoder perturbations.
(a) New decoders W_pert could be either within-manifold perturbations (WMPs, red) or outside-manifold perturbations (OMPs, blue). (Right) Distribution of the angle between top PCs (intrinsic manifold) and WMPs/OMPs across different perturbations. (b) Model architecture, with the weights being retrained during adaptation to new decoders indicated by a red arrow. (c) Example cursor trajectories after decoder perturbations (top) and after successful retraining via changes to input weights (bottom) for an example WMP and OMP. (d) Histogram of performance (hit rate) after decoder perturbation (grey; Pre-retraining) and after retraining for 200 trials (red/blue) for n=170 WMPs (left) and n=135 OMPs (middle). (Right) Histogram of normalized change in performance (hit rate) after 200 trials. (e) Distribution of mean target acquisition time for different OMPs and WMPs after retraining. (f) Behavioral progress asymmetry (across targets) after retraining. (g) (Left) Hit rate over training for 2 example WMPs (in pink and cyan). Each line is a different training iteration, with hit rate quantified as the fraction of successful reaches in a 20-trial period. (Right) Average training curves for the two WMPs (black) are overlaid with a logistic fit, which is used to estimate learning speed. (Right) Distribution of learning speeds (left) across WMPs and OMPs. Dashed lines indicate the medians of the respective distributions.
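
The hit-rate measure in panel (g), the fraction of successful reaches in a 20-trial period, can be sketched as below. Whether the periods overlap is not stated in the caption; this sketch assumes consecutive, non-overlapping windows, and the function name and the trial outcomes are hypothetical.

```python
def windowed_hit_rate(successes, window=20):
    """Fraction of successful reaches in consecutive, non-overlapping windows.

    successes: sequence of 1 (target acquired) or 0 (miss), one entry per trial.
    Trailing trials that do not fill a whole window are ignored.
    """
    rates = []
    for start in range(0, len(successes) - window + 1, window):
        block = successes[start:start + window]
        rates.append(sum(block) / window)
    return rates


# Hypothetical trial outcomes over 60 trials, for illustration only.
outcomes = [0] * 15 + [1] * 5 + [0] * 10 + [1] * 10 + [1] * 20
curve = windowed_hit_rate(outcomes)
```

A learning curve of this form is the kind of input to which a logistic function could then be fit to estimate learning speed.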

Statistical structure is largely conserved under input plasticity.
(a) Activity covariance structure for an example network, for the baseline task (left), after introducing the new perturbed decoder W_pert (middle), and after retraining input weights (right). (b) Fractional variance within original intrinsic manifold (n=8 top PCs) for different WMPs (red, n=170) and OMPs (blue, n=135), before and after retraining. Scatter denotes individual decoder perturbations, bars indicate median across perturbations. (c) Distributions of change in fractional variance within original top PCs. Dashed lines indicate the medians. (d) Ratio of post- to pre-retraining activity variance along the perturbed decoder, shown as a distribution. Horizontal lines indicate the 5th, 50th and 95th percentile values. (e) Similarity of covariance structure was measured as the overlap between vectors describing the relative variance along different original PCs (oPCs). (Right) Distribution of covariance similarity for different WMPs and OMPs. (f) Covariance similarity versus similarity of perturbed and original decoder. (g) Summary: Activity remains within the intrinsic manifold after retraining, with similar covariance structure.
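
The covariance similarity in panel (e) compares vectors of relative variance along the original PCs. A minimal sketch follows, assuming that "overlap" means the cosine of the angle between the two vectors; that choice, the function names, and the example variances are assumptions made here for illustration, not details confirmed by the caption.

```python
import math


def relative_variance(variances):
    """Normalize per-PC variances so that they sum to 1."""
    total = sum(variances)
    return [v / total for v in variances]


def overlap(u, v):
    """Cosine similarity between two vectors (assumed meaning of 'overlap')."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical variances along four original PCs, for illustration only.
baseline = [4.0, 2.0, 1.0, 1.0]
retrained = [8.0, 4.0, 2.0, 2.0]   # same relative structure, larger overall variance
reordered = [1.0, 1.0, 2.0, 4.0]   # variance shifted to different PCs

sim_same = overlap(relative_variance(baseline), relative_variance(retrained))
sim_diff = overlap(relative_variance(baseline), relative_variance(reordered))
```

Under this definition, a uniform rescaling of activity variance leaves the similarity at 1, while a redistribution of variance across PCs lowers it.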

Dynamical constraints in controlled networks.
(a) Neural activity over time is shaped by both recurrent dynamics and control inputs provided via input weights B1 and B2. (Right) An example neural trajectory with net contribution of recurrent flow (blue), and control inputs via B1 (green) and B2 (orange). The corresponding controller output, and hence the control cost, varies across parts of the trajectory as differences in controllability arise due to recurrent dynamics and unequal input weights. (b) Trajectories T1-T4 differ in control costs due to varying alignment with intrinsic dynamics and controllable directions. (The parts of the trajectory that require control inputs are highlighted by green ellipses.) Input weights B1 are larger than B2, creating more and less controllable directions. Trajectory T1 requires minimal input as it is aligned with highly controllable directions whereas T2 requires larger inputs to push along a less controllable direction. T3 requires strong external control inputs to overcome the recurrent flow field. T4 may be infeasible, or require even greater inputs, if it is poorly aligned with the low-dimensional dynamics/controllable subspace.

Dynamical features of adaptation to decoder perturbations.
(a) The controllability spectrum can be calculated locally around any point in neural state space, which defines highly or poorly controllable directions. (Right) Average feedback controllability spectrum across different parts of state space and n=5 networks. The steep falloff suggests only a few directions are highly controllable locally around any point. (b) Overlap of the most controllable directions with the decoder (WMP or OMP W_pert and intuitive decoder) for n=170 WMPs, n=135 OMPs, n=5 intuitive decoders. (c-e) Retraining leads to a change in flow fields (or vector fields, VF), supporting a small dynamical reorganization to produce new trajectories. (c) Target-specific neural trajectories for an example network, after introducing new decoder W_pert (pre-retraining; dashed), and after adapting input weights (post-retraining; solid). Trajectories are visualized in the space of top preparatory (prep) and movement (move) period principal components. (d) Distribution across WMPs and OMPs of the mean change in the total flow (left) and the feedback-driven flow component (right), normalized by the total flow at each point in state space, and averaged across observed neural states during post-retraining task performance. (e) Normalized change in flow along the perturbed decoder direction is much higher as new behavioral trajectories are produced with successful adaptation. (f) Summary: Changes in effective input-driven dynamics produce new neural trajectories within the same intrinsic manifold. The altered neural trajectory, passed through the new BCI decoder, restores desired behavior.
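
For orientation, one standard way to obtain a local controllability spectrum is through the controllability Gramian of the dynamics linearized around a state. The expressions below are a textbook sketch under assumed notation (a discrete-time linearization with local Jacobian A, input weights B, and horizon T); the caption does not state the exact definition used in the paper.

```latex
% Assumed linearization around a point in neural state space:
\delta x_{t+1} = A\,\delta x_t + B\,\delta u_t

% Finite-horizon controllability Gramian over T steps:
W_T = \sum_{k=0}^{T-1} A^{k} B B^{\top} \left(A^{\top}\right)^{k}

% Controllability spectrum: eigenvalues \lambda_i of W_T, with eigenvectors v_i
W_T v_i = \lambda_i v_i, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge 0
```

Directions v_i with large eigenvalues can be reached with little input energy, whereas directions with small eigenvalues require large inputs; a steep falloff in the eigenvalues therefore corresponds to only a few highly controllable directions.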

Dynamical constraints explain learning variability.
(a) (Left) Learning speed versus change in feedback-driven flow, for WMPs (red, n=170) and OMPs (blue, n=135). (Right) Learning speeds predicted using the FBK model, against the true speeds. The FBK model includes feedback controllability and changes in feedback-driven flow as predictors, as well as a group-specific (WMP/OMP) intercept. Scatter points denote individual decoder perturbations, and shaded areas show different quantiles. (b) Similar to (a), but for normalized change in hit rate. (c-d) Cross-validated R² (n=40 iterations) for 4 different models, predicting learning speeds in (c) and normalized change in hit rate in (d). Horizontal lines show extrema and median values of the distribution. All models include 2 predictors along with a group-specific intercept (fixed effect) that captures categorical differences between OMPs and WMPs. Models G1, G2 and G3 use different geometrical predictors; see Methods for details. Also see Figure S5.

Recurrent plasticity does not lead to variability in learning outcomes.
(a-d) Learning outcomes of recurrent plasticity in full-rank RNNs. (a) (Left) Schematic showing retraining of recurrent weights. (Right) Cursor trajectories after decoder perturbation (top) and retraining recurrent weights (bottom). (b) Distribution of normalized change in hit rate, (c) change in variance within original PCs and (d) change in recurrent flow fields, for n=35 WMPs (red) and n=35 OMPs (blue). (e-g) Learning outcomes of recurrent plasticity in low-rank RNNs (rank=4). (f) Distribution of normalized change in hit rate and (g) change in recurrent flow fields, for n=30 WMPs (red) and n=30 OMPs (blue). Also see Figure S7.

Learning outcomes depend on dimensionality of control input.
(a-d) Networks with different controller architectures, where adaptation to decoder perturbations occurs via plasticity of input weights and controller network F. Controller networks in (a)-(c) have low-dimensional output, whereas the controller network in (d) has a higher-dimensional output. Two-layer feedforward controllers in (a), (b) and (d) sample both the cursor position error and the RNN state, process these in a hidden layer, and have different numbers of output units (4 or 12); learning occurs at the weights from the hidden to the output layer of F. In (c), networks have a recurrent controller, with 50 hidden units and a 4-dimensional output layer, and learning occurs at the recurrent weights of F. For all architectures, the projection weights W_fbk that transform the k-dimensional controller output into input currents for the main RNN are fixed. (e) Cumulative variance for varying numbers of activity principal components (averaged across n=3 networks of each architecture). Color indicates different architectures from (a)-(d), as indicated in the key. (f) Feedback (left) and feedforward (right) controllability spectra for network architectures in (a)-(d), averaged across n=3 networks each. (g) Distribution of overlap of different within-manifold (WMP, red) and outside-manifold (OMP, blue) decoders with the feedback controllable subspace, normalized by the overlap of the intuitive decoder (Int). The difference between WMPs and OMPs within each type of network is highlighted as Δ. (h) Amount of learning for within-manifold (WMP) and outside-manifold (OMP) decoders for networks with different controller architectures, as shown in (a)-(d). For networks in (d) with higher-dimensional controller output, both WMPs and OMPs have good learning outcomes.

Network parameters.

Feedback controller (F) parameters.

Initial training parameters.

Altered training parameters for decoder adaptation.
