A theory of fast and slow synaptic plasticity.

a) Left: A synapse must make online adjustments to its strength by integrating local signals, such as its own input and error feedback. We propose that these signals can be optimally exploited through two timescales of plasticity. ‘Fast weights’, δwi, fluctuate rapidly to suppress downstream error, whereas ‘slow weights’, wi, converge gradually to the values required by the task. Right: In the toy-model simulation from panel (b), as the slow weights find the solution, fast-weight fluctuations are reduced. Shown are example weight trajectories from one randomly selected synapse out of 20. b) In an illustrative online regression task, a neuron must learn to match its output to a time-varying target (gray dashed line). With a classical delta rule (black line), the weights adapt over time to eventually correct the output. With two timescales of plasticity (purple line), fast weights can pin the output to the target from the outset, while slow weights evolve in the background to learn the task.
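
As a concrete illustration of the two-timescale scheme, a minimal NumPy sketch of the toy online regression follows. It assumes a linear neuron with output y = (w + δw)·x, a scalar slow learning rate η, and a fast-weight gain β with passive decay; these parameter names and values are illustrative stand-ins, not the rules derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 5000, 20                      # time steps; 20 synapses, as in panel (a)
x = rng.standard_normal((T, n))      # presynaptic inputs
w_star = rng.standard_normal(n)      # hidden target weights
y_star = x @ w_star                  # time-varying target output

eta, beta, decay = 1e-3, 0.9, 0.99   # illustrative values, not fitted to the paper
w = np.zeros(n)                      # slow weights
dw = np.zeros(n)                     # fast weights

for t in range(T):
    y = (w + dw) @ x[t]              # output uses the combined weights
    err = y_star[t] - y              # error feedback
    # Fast weights jump to cancel most of the instantaneous error ...
    dw = decay * dw + beta * err * x[t] / (x[t] @ x[t])
    # ... while slow weights integrate it gradually (classical delta rule)
    w += eta * err * x[t]
```

As the slow weights approach w_star, the residual error shrinks, and with it the fast-weight corrections, reproducing the qualitative behavior in the right-hand panel.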

Synaptic plasticity as optimal control.

a) Performance comparison between the classical gradient-based rule (black) and the Bayesian rule without fast weights (blue), with plastic fast weights and frozen slow weights (red), and with plastic slow and fast weights (purple). Bars denote the root-mean-squared error (RMSE) between the output, y, and the target output, y*, averaged over the entire task. Data points denote different random seeds. The dashed line gives the average error when slow weights are set to their target values at every point in time. The learning rate for the classical rule and the control cost for the Bayesian rule were selected via grid search to minimize output error. b) The Bayesian rules yield faster convergence of weights to their target values than the classical rule, quantified as the RMSE between weight vectors. Shaded areas are standard deviation from 10 random seeds. c) Root-mean-squared output error versus fast-weight fluctuations, the latter computed as the root-mean-squared fast weight divided by the root-mean-squared slow weight (averages taken over the entire task). The size of the fast-weight fluctuations was controlled by the cost parameter λu (see equation (12)). In the full plasticity rule (purple), even small fluctuations of ∼10% lead to a large reduction in error. Without slow-weight learning (red), much more control is needed for a similar level of performance. Points denote the average over seeds for a fixed λu; the shaded ellipses denote standard deviation across seeds. d) Control is effective for feedback delays shorter than the output time constant (here, τy = 100 ms). Shaded areas are standard deviation from 10 random seeds.
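
The two summary statistics reported in panels (a) and (c) can be computed directly from logged simulation traces; a small sketch follows, with array shapes assumed to be (time steps × synapses).

```python
import numpy as np

def rmse(y, y_star):
    """RMSE between output and target, averaged over the task (bars in panel a)."""
    return np.sqrt(np.mean((np.asarray(y) - np.asarray(y_star)) ** 2))

def fluctuation_ratio(dw, w):
    """Fast-weight fluctuation measure from panel (c): root-mean-squared fast
    weight divided by root-mean-squared slow weight, averaged over the task.
    dw, w: arrays of shape (time_steps, n_synapses)."""
    return np.sqrt(np.mean(np.asarray(dw) ** 2)) / np.sqrt(np.mean(np.asarray(w) ** 2))
```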

Control and learning with spiking feedback.

a) When feedback is communicated via noisy spiking activity, optimal plasticity requires the synapse to infer the true underlying continuous error signal. The estimated error is then used to drive both control and learning. Fast weights seek to improve performance by canceling the estimated error (lower panel), while slow weights seek to reduce the error permanently. b) Performance depends on the rate of feedback. High feedback spike rates increase the precision of error estimates, thereby enhancing learning and control. A population of neurons acting on multiple independent feedback signals (dashed purple trace) can compensate for very low individual feedback rates. Shaded areas are standard deviation from 10 random seeds.
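
As a rough sketch of the estimation problem in panel (a), the snippet below decodes a continuous error from Poisson feedback spikes with a simple leaky integrator. The linear rate code (rate = γ0 + k·error), the gain k, and the fixed estimator time constant are assumptions for illustration; the Bayesian rule in the paper performs this inference optimally rather than with a hand-tuned filter.

```python
import numpy as np

dt, T = 1e-3, 10.0                   # 1 ms steps, 10 s of simulated feedback
n = int(T / dt)
gamma0, k = 20.0, 5.0                # baseline rate (spikes/s) and error-to-rate gain (assumed)
tau_est = 0.05                       # estimator time constant in seconds (assumed)

rng = np.random.default_rng(1)
true_err = np.sin(2 * np.pi * 0.5 * dt * np.arange(n))   # hidden continuous error signal

# Poisson feedback whose rate deviates from gamma0 in proportion to the error
rate = np.maximum(gamma0 + k * true_err, 0.0)
spikes = rng.random(n) < rate * dt

# Leaky-integrator estimate of the error, computed from spikes alone
err_hat = np.zeros(n)
for t in range(1, n):
    innovation = (spikes[t] / dt - gamma0) / k   # map rate deviation back to error units
    err_hat[t] = err_hat[t - 1] + (dt / tau_est) * (innovation - err_hat[t - 1])
```

Higher feedback rates (larger γ0 and k) sharpen the estimate, consistent with the rate dependence in panel (b).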

Fast and slow plasticity in the cerebellum.

a) Schematic of the cerebellar microzone model. A group of Purkinje cells receives synaptic input from parallel fibers and projects to a common downstream output. Each Purkinje cell receives feedback spikes from a separate climbing fiber, signaling the error between actual and target outputs. b) Schematic of the learning task. Ten patterns of temporally organized parallel-fiber input must be mapped by a population of Purkinje cells to time-varying target outputs. c) Example outputs early and late in learning, replicating the results from the toy model in Fig. 1b. d) Performance during early (100–200 s) and late (5000–5100 s) stages of learning. Bars denote the RMSE between output and target output, averaged over 100 s. e) Left: After training on 10 input-output mappings, half of the target outputs were changed. Right: Bayesian plasticity confers faster recovery than the classical approach. With fast weights, the output is largely insensitive to the perturbation. Error curves have been smoothed with a moving-average filter of width 10 s. Shaded areas are standard deviations from simulations pooled over 10 random training seeds × 10 perturbations.
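
A minimal sketch of the microzone wiring in panel (a), assuming rate-based Purkinje cells whose summed activity forms the common downstream output; the parallel-fiber count and initial weights are placeholders, not the paper's parameters.

```python
import numpy as np

rng = np.random.default_rng(2)
n_pc, n_pf = 20, 100                           # Purkinje cells; PF count is a placeholder
W = rng.standard_normal((n_pc, n_pf)) * 0.01   # PF -> PC weights (slow + fast combined)

def microzone_step(pf_activity, target):
    """One evaluation of the microzone: Purkinje cells read their parallel-fiber
    input, their summed activity drives a common downstream output, and each
    cell's climbing fiber carries the shared output error back to its synapses."""
    pc_rates = W @ pf_activity            # each PC integrates its PF input
    output = pc_rates.sum()               # common downstream output
    error = target - output               # signaled to every cell
    return output, np.full(n_pc, error)   # one error copy per climbing fiber
```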

Signatures of fast and slow synaptic plasticity.

a) Suppression of Purkinje-cell firing rates after climbing-fiber input is a signature of fast-weight updates. The solid lines denote averages of firing rates aligned to climbing-fiber feedback spikes during the cerebellar learning task. Shaded areas are the standard deviation of the average firing-rate change of 20 neurons × 10 random seeds. b) The duration and magnitude of the predicted firing-rate suppression vary systematically with the time constant of the downstream output and the feedback delay. c) A standard experimental protocol can discriminate between classical and Bayesian plasticity. After training on the cerebellar learning task, synapses were stimulated with conjunctions of parallel-fiber (PF) input bursts and climbing-fiber (CF) feedback spikes (50 repetitions at 2 Hz). d) Left: The Bayesian rules produce LTD over a narrower range of PF-CF intervals than the classical rule, and also exhibit greater variability across synapses. Right: The variability is due to the adaptive learning rate: in the Bayesian rules, unlike the classical rule, the magnitude of weight changes depends on the average rate of input during the task. Solid lines denote synaptic weight changes measured at the end of the protocol, averaged over 100 tested synapses. Shaded areas are standard deviation.
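
For concreteness, the pairing protocol in panel (c) can be sketched as follows; the helper generates PF-burst and CF spike times for a given PF-CF interval. The burst size, intra-burst spacing, and the interval sweep are assumptions for illustration, not values taken from the Methods.

```python
import numpy as np

def protocol_times(delta_t, reps=50, rate=2.0, burst_spikes=5, burst_isi=0.002):
    """Spike times for the simulated pairing protocol (panel c).

    delta_t : PF-CF interval in seconds; positive means the CF spike
              follows PF burst onset.
    Returns (pf_times, cf_times) in seconds. Burst parameters are assumed.
    """
    onsets = np.arange(reps) / rate                                 # one pairing every 0.5 s
    pf = (onsets[:, None] + np.arange(burst_spikes) * burst_isi).ravel()
    cf = onsets + delta_t
    return pf, cf

# Sweep PF-CF intervals as in panel (d); the range is an assumption
for dt_pair in np.arange(-0.2, 0.201, 0.05):
    pf_times, cf_times = protocol_times(dt_pair)
    # ... feed pf_times / cf_times into the plasticity model and record the weight change
```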

Model and simulation parameters.

Approximation of the Bayesian learning rule.

Comparison of the full Bayesian rule, given by equation (76), and the approximation presented for ease of interpretation in equations (13) and (16), in terms of output error (a) and weight error (b). The approximation is valid whenever the output time constant dominates the model dynamics, τr, τI ≪ τy. Here we use parameters τI = 1 ms, τr = 10 ms, and τy = 100 ms.

Simulations with spiking feedback.

Results from the spiking feedback model are consistent with the continuous feedback model (Fig. 2), although learning takes longer and performance depends strongly on the spontaneous feedback rate γ0. a) Performance as a function of feedback rate (replotted from Fig. 3 for context). Shaded areas are standard deviation from 10 random seeds. b) Detailed comparison between learning rules with γ0 = 64 spikes/s. c) Control remains effective in the presence of feedback delays. Shaded areas are standard deviation from 10 random seeds. d) Weights converge to their target values more quickly with increasing rates of feedback.

Parameter dependence of simulated plasticity experiments.

The LTP/LTD curve produced using the protocol in Fig. 5c depends systematically on the parameters of the model and protocol. a) Tuning to the PF-CF interval becomes broader as the output time constant τy increases. b) Peaks in the LTP and LTD lobes of the curve are shifted in proportion to the feedback delay. c) The magnitude of plasticity increases with the number of spikes in the parallel-fiber burst. d) LTD is amplified and shifted to earlier PF-CF intervals with increasing numbers of climbing-fiber spikes.