A neuronal least-action principle for real-time learning in cortical circuits

  1. Walter Senn (corresponding author)
  2. Dominik Dold
  3. Akos F Kungl
  4. Benjamin Ellenberger
  5. Jakob Jordan
  6. Yoshua Bengio
  7. João Sacramento
  8. Mihai A Petrovici
  1. Department of Physiology, University of Bern, Switzerland
  2. Kirchhoff-Institute for Physics, Heidelberg University, Germany
  3. European Space Research and Technology Centre, European Space Agency, Netherlands
  4. Insel Data Science Center, University Hospital Bern, Switzerland
  5. Electrical Engineering, Yale University, United States
  6. MILA, University of Montreal, Canada
  7. Department of Computer Science, ETH Zurich, Switzerland
6 figures, 1 table and 1 additional file

Figures

Somato-dendritic mismatch energies and the neuronal least-action (NLA) principle.

(a1) Sketch of a cross-cortical network of pyramidal neurons described by NLA. (a2) Correspondence between elements of NLA and biological observables such as membrane voltages and synaptic weights. (b1) The NLA principle postulates that small variations δũ (dashed) of the trajectories ũ (solid) leave the action invariant, δA=0. It is formulated in the look-ahead coordinates ũ (symbolized by the spyglass) in which 'hills' of the Lagrangian (shaded gray zones) are foreseen by the prospective voltage, so that the trajectory can turn early enough to circumvent them. (b2) In the absence of output nudging (β=0), the trajectory 𝒖(t) is solely driven by the sensory input, and prediction errors and energies vanish (L=0, outer blue trajectory at bottom). When nudging the output neurons towards a target voltage (β>0), somato-dendritic prediction errors appear, the energy increases (red dashed arrows symbolizing the growing 'volcano'), and the trajectory 𝒖(t) moves out of the L=0 hyperplane, riding on top of the 'volcano' (red trajectory). Synaptic plasticity Ẇ reduces the somato-dendritic mismatch along the trajectory by optimally 'shoveling down the volcano' (blue dashed arrows) while the trajectory settles in a new place on the L=0 hyperplane (inner blue trajectory at bottom).
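In symbols (see Table 1), each neuron's mismatch energy is E_i^M = ½(u_i − Σ_j W_ij r̄_j)², and the Lagrangian sums these energies together with the β-weighted output costs, L = Σ_i E_i^M + β Σ_o C_o. A minimal NumPy sketch of this bookkeeping (function and variable names are ours, not the paper's):

```python
import numpy as np

def lagrangian(u, W, r_bar, beta, u_target, out_idx):
    """L = sum_i E_i^M + beta * sum_o C_o  (see Table 1).

    u        : (N,) somatic voltages of the network neurons
    W        : (N, M) weights from all presynaptic rates onto the N neurons
    r_bar    : (M,) low-pass filtered presynaptic rates (input and network)
    beta     : nudging strength; beta = 0 gives L = 0 at the free fixed point
    u_target : target voltages for the output neurons
    out_idx  : indices of the output neurons within u
    """
    e_bar = u - W @ r_bar                    # somato-basal prediction errors
    E_M = 0.5 * e_bar**2                     # mismatch energy per neuron
    C = 0.5 * (u_target - u[out_idx])**2     # cost per output neuron
    return E_M.sum() + beta * C.sum()
```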

Prospective coding in cortical pyramidal neurons enables instantaneous voltage-to-voltage transfer.

(a1) The instantaneous spike rate of cortical pyramidal neurons (top) in response to a sinusoidally modulated noisy input current (bottom) is phase-advanced with respect to the input (adapted from Köndgen et al., 2008). (a2) Similarly, in the neuronal least-action (NLA) framework, the instantaneous firing rate of a model neuron (r = ρ(u) + τρ̇(u), black) is phase-advanced with respect to the underlying voltage (u, red), postulating that the low-pass filtered rate is a function of the voltage, r̄ = ρ(u). (b) Dendritic input in the apical tree (here called ē) instantaneously causes a somatic voltage modulation (u, modeling data from Ulrich, 2002). The low-pass filtering with τ along the dendritic shaft is compensated by a lookahead mechanism in the dendrite (e = ē + τ dē/dt). In Ulrich, 2002, a phase advance is observed even with respect to the dendritic input current, not only the dendritic voltage, although only for slow modulations (as here). (c) While the voltage of the first neuron (u1) integrates the input rates rin from the past (bottom black upward arrows), the output rate r1 of that first neuron looks ahead in time, r1 = ρ(u1) + τρ̇(u1) (red dashed arrows pointing into the future). The voltage of the second neuron (u2) integrates the prospective rates r1 (top black upward arrows). By doing so, it inverts the lookahead operation, resulting in an instantaneous transfer from u1(t) to u2(t) (blue arrow and circles).
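The instantaneous transfer in (c) rests on the fact that leaky integration, a low-pass filter with time constant τ, exactly inverts the lookahead r = ρ(u) + τρ̇(u). The toy simulation below illustrates this; the tanh rate function, the 20 Hz test signal, and all parameter values are illustrative assumptions:

```python
import numpy as np

tau, dt = 10e-3, 1e-4                 # membrane time constant, Euler step
t = np.arange(0.0, 0.2, dt)
rho = np.tanh                         # assumed rate function

u1 = np.sin(2 * np.pi * 20 * t)       # imposed voltage of the first neuron
r1 = rho(u1) + tau * np.gradient(rho(u1), dt)   # prospective output rate r1

w, u2 = 1.0, 0.0
u2_trace = np.empty_like(t)
for k in range(len(t)):
    u2 += dt / tau * (-u2 + w * r1[k])   # second neuron leakily integrates r1
    u2_trace[k] = u2

# after the initial transient, u2(t) tracks w * rho(u1(t)) with no lag
err = np.abs(u2_trace[len(t)//2:] - w * rho(u1)[len(t)//2:])
print(err.max())
```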

Moving equilibrium hypothesis for motor control and real-time learning of cortical activity.

(a) A voluntary movement trajectory can be specified by the target length of the muscles in time, 𝒖o*, encoded through the γ-innervation of muscle spindles, and the deviation of the effective muscle lengths from the target, 𝒖o − 𝒖o* = −𝒆̄o*. The Ia-afferents emerging from the spindles prospectively encode the error, so that their low-pass filtering is roughly proportional to the length deviation, truncated at zero (red). The moving equilibrium hypothesis states that the low-pass filtered input 𝒓̄in, composed of the movement plan 𝒓̄in^plan and the sensory input (here encoding the state of the plant, e.g., through visual and proprioceptive input, 𝒓̄in^vis and 𝒓̄in^prop), together with the low-pass filtered error feedback from the spindles, 𝒆̄o*, instantaneously generates the muscle lengths, 𝒖o = 𝑭_W(𝒓̄in, 𝒆̄o*), which are thus at any point in time in an instantaneous equilibrium (defined by Equation 7a, Equation 7b). (b1) Intracortical electroencephalogram (iEEG) activity recorded from 56 deep electrodes and projected to the brain surface. Red nodes symbolize the 56 iEEG recording sites, modeled alternately as input or output neurons, and blue nodes symbolize the 40 'hidden' neurons for which no data is available but which are used to reproduce the iEEG activity. (b2) Corresponding NLA network. During training, the voltages of the output neurons were nudged by the iEEG targets (black input arrows; applied to all red output neurons). During testing, nudging was removed for 14 out of these 56 neurons (here represented by neurons 1, 2, 3). (c1) Voltage traces for the three example neurons in b2, before (blue) and after (red) training, overlaid with their iEEG target traces (gray). (c2) Total cost of the 56 output nodes, integrated over a window of 8 s, during training with sequences of the same duration. The cost for the test sequences was evaluated on an 8 s window not used during training.

On-the-fly learning of finger responses to visual input with real-time dendritic error propagation (rt-DeEP).

(a) Functionally feedforward network with handwritten digits as visual input (𝒓in(t) in Figure 3a, here from the MNIST data set, 5 ms presentation time per image), backprojections enabling credit assignment, and the activity of the 10 output neurons interpreted as commands for the 10 fingers (forward architecture: 784×500×10 neurons). (b) Example voltage trace (b1) and local error (b2) of a hidden neuron in the neuronal least-action (NLA) framework (red) compared to an equivalent network without lookahead rates (orange). Note that neither network achieves a steady state due to the extremely short input presentation times. Errors are calculated via exact backpropagation, i.e., by using the error backpropagation algorithm on a pure feedforward NLA network at every simulation time step (with output errors scaled by β), shown for comparison (blue dots). (c) Comparison of network models during and after learning. Color scheme as in (b). (c1) The test error under NLA evolves during learning on par with classical error backpropagation performed at each Euler step dt based on the feedforward activities. In contrast, networks without lookahead rates are incapable of learning such rapidly changing stimuli. (c2) With increasing presentation time, the performance under NLA further improves, while networks without lookahead rates stagnate at high error rates. This is caused by transient but long-lasting misrepresentations of errors following stimulus switches: when plasticity is turned off during transients and is only active in the steady state, comparably good performance can be achieved (dashed orange). (d) Receptive fields of 6 hidden-layer neurons after training, demonstrating that even for very brief image presentation times (5 ms), the combined neuronal and synaptic dynamics are capable of learning useful feature extractors such as edge filters.
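The underlying updates are purely local outer products, Ẇ ∝ ē r̄ᵀ (Table 1), with hidden errors delivered by the backprojections. A minimal sketch of one such update for the 784×500×10 network; the tanh transfer function and all names are our own assumptions, and the backprojection B and the β-scaled output error are taken as given:

```python
import numpy as np

rho = np.tanh
rho_p = lambda u: 1.0 - np.tanh(u)**2   # derivative of the transfer function

def rt_deep_update(u_hidden, r_bar_in, W, B, e_out, eta=1e-3):
    """One rt-DeEP plasticity step, dW ∝ e_bar r_bar^T (Table 1).

    u_hidden : (500,) hidden-layer voltages
    r_bar_in : (784,) low-pass filtered input rates
    W        : [W1 (500, 784), W2 (10, 500)] forward weights
    B        : (500, 10) backprojection carrying the output error down
    e_out    : (10,) beta-scaled target error of the output neurons
    """
    e_hidden = rho_p(u_hidden) * (B @ e_out)     # dendritic error, hidden layer
    W[0] += eta * np.outer(e_hidden, r_bar_in)   # local outer-product updates
    W[1] += eta * np.outer(e_out, rho(u_hidden))
    return W
```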

Hierarchical plastic microcircuits implement real-time dendritic error learning (rt-DeEL).

(a) Microcircuit with 'top-down' input (originating from peripheral motor activity, blue line) that is explained away by the lateral input via interneurons (dark red), with the remaining activity representing the error el. Plastic connections are denoted by a small red arrow, and nudging by a dashed line. (b1) Simulated network with 784-300-10 pyramidal neurons and a population of 40 interneurons in the hidden layer, used for the MNIST learning task where the handwritten digits have to be associated with the 10 fingers. (b2) Test errors for rt-DeEL with joint tabula rasa learning of the forward and lateral weights of the microcircuit. A similar performance is reached as with classical error backpropagation. For comparability, we also show the performance of a shallow network (dashed line). (b3) Angle derived from the Frobenius norm between the lateral pathway 𝑾l^PI𝑾l^IP and the feedback pathway 𝑩l𝑾l+1. During training, both pathways align to allow correct credit assignment throughout the network. Indices are dropped in the axis label for readability.
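The alignment in (b3) is the angle induced by the Frobenius inner product between the two matrix products. A short sketch of this measure (function and variable names are ours):

```python
import numpy as np

def frobenius_angle(A, B):
    """Angle (degrees) between matrices A and B under the Frobenius
    inner product <A, B> = sum(A * B); 0 deg means perfect alignment."""
    cos = np.sum(A * B) / (np.linalg.norm(A) * np.linalg.norm(B))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# e.g. for the hidden layer: frobenius_angle(W_PI @ W_IP, B_l @ W_next)
```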

Appendix 1—figure 1
Recovering presynaptic potentials through short-term depression.

(a1) Relative voltage response of a depressing cortical synapse (recreated from Abbott et al., 1997), identified as the synaptic release probability p. (a2) The product of the low-pass filtered presynaptic firing rate r̄(u) and the synaptic release probability p(r̄) is proportional to the presynaptic membrane potential, p(r̄)·r̄ ∝ u. (a3) Average in vivo firing rate of a neuron in the visual cortex as a function of the somatic membrane potential (recreated from Anderson et al., 2000), which can be qualitatively identified as the stationary rate r̄(u) derived in Equation 43.
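The proportionality p(r̄)·r̄ ∝ u can be checked with any expansive rate function: if r̄ = ρ(u) grows supralinearly with u (as in a3), the release probability that satisfies the proportionality decreases with the rate, as for the depressing synapse in a1. A toy check with an assumed quadratic rate function, not the form actually derived in Equation 43:

```python
import numpy as np

k = 2.0
u = np.linspace(0.2, 3.0, 8)        # suprathreshold somatic voltage (a.u.)
r_bar = k * u**2                    # assumed expansive rate function rho(u)

p = 1.0 / np.sqrt(k * r_bar)        # release probability, decreasing in r_bar
print(np.allclose(p * r_bar, u))    # True: efficacy p * r_bar recovers u
```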

Tables

Table 1
Mathematical symbols.
| Mathematical expression | Naming | Comment |
| --- | --- | --- |
| $u_i$ | Instantaneous (somatic) voltage | only for network neurons |
| $r_i = \rho(u_i) + \tau\dot{\rho}(u_i)$ | Instantaneous firing rate of neuron $i$ | that looks linearly ahead in time |
| $\bar{r}(t) = \frac{1}{\tau}\int_{-\infty}^{t} r(t')\,e^{-\frac{t-t'}{\tau}}\,dt'$ | Definition of low-pass filtering | see Equation 15 |
| $\bar{r}_i = \overline{\rho(u_i) + \tau\dot{\rho}(u_i)} = \rho(u_i)$ | Low-pass filtered firing rate | postulated to be a function of $u_i$ |
| $\bm{r} = \bar{\bm{r}} + \tau\dot{\bar{\bm{r}}}$ | Self-consistency eq. | for low-pass filtered rate |
| $\bm{r}_{\mathrm{in}}$ | Input rate vector, column | projects to selected neurons |
| $\bar{\bm{r}}_{\mathrm{in}}$ | Low-pass filtered input rates | instantaneously propagate |
| $e_i = (u_i + \tau\dot{u}_i) - \sum_j W_{ij} r_j$ | Prospective error of neuron $i$ | in apical dendrite |
| $\bar{e}_i = u_i - \sum_j W_{ij}\bar{r}_j$ | Error of neuron $i$ | in soma |
| $E^{\mathrm{M}}_i = \frac{1}{2}\bar{e}_i^2 = \frac{1}{2}\big(u_i - \sum_j W_{ij}\bar{r}_j\big)^2$ | Mismatch energy in neuron $i$ | between soma and basal dendrite |
| $u_o^*$ | Target voltage for output neuron $o$ | could impose target on $r_o$ or $\bar{r}_o$ |
| $\bar{e}_o^* = u_o^* - u_o$ | Error of output neuron $o$ | also called target error |
| $C_o = \frac{1}{2}(\bar{e}_o^*)^2$ | Cost contribution of output neuron $o$ | mismatch between target and soma |
| $L = \sum_{i\in\mathcal{N}} E^{\mathrm{M}}_i + \beta\sum_{o\in\mathcal{O}} C_o$ | Lagrangian | output set $\mathcal{O}$, network set $\mathcal{N}$ |
| $\tilde{u}(t) = \frac{1}{\tau}\int_t^{\infty} u(t')\,e^{-\frac{t'-t}{\tau}}\,dt'$ | Discounted future voltage | prospective coordinates for NLA |
| $\bm{u} = \tilde{\bm{u}} - \tau\dot{\tilde{\bm{u}}}$ | Self-consistency eq. | for discounted future voltage |
| $A = \int_{t_1}^{t_2} L[\tilde{\bm{u}}(t), \dot{\tilde{\bm{u}}}(t)]\,dt$ | Neuronal least action (NLA) | expressed in prospective coordinates |
| $\frac{\partial L}{\partial \tilde{u}_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot{\tilde{u}}_i} = \left(1 + \tau\frac{d}{dt}\right)\frac{\partial L}{\partial u_i} = 0$ | Euler-Lagrange equations | turned into lookahead operator |
| $\bm{W}_{\mathrm{in}}$ | Weights from input neurons $\bm{r}_{\mathrm{in}}$ | $\dim(\mathcal{N}) \times \dim(\bm{r}_{\mathrm{in}})$, most entries 0 |
| $\bm{W}_{\mathrm{net}}$ | Weights between network neurons | $\dim(\mathcal{N}) \times \dim(\mathcal{N})$ |
| $\bm{W} = (\bm{W}_{\mathrm{in}}, \bm{W}_{\mathrm{net}})$ | Total weight matrix | $\dim(\mathcal{N}) \times (\dim(\bm{r}_{\mathrm{in}}) + \dim(\mathcal{N}))$ |
| $\bm{r} = (\bm{r}_{\mathrm{in}}, \bm{r}_{\mathrm{net}})^T$ | Instantaneous firing rate vector | column (indicated by transpose) |
| $\dot{\bm{W}} \propto \bar{\bm{e}}\,\bar{\bm{r}}^T$ | Plasticity of $\bm{W}$ | $\bar{\bm{e}}$ is a column, $\bar{\bm{r}}^T$ a row vector |
| $\bm{u}_o^*(t) = \bm{F}^*(\bar{\bm{r}}_{\mathrm{in}}(t))$ | Target function formulated for $\bar{\bm{r}}_{\mathrm{in}}(t)$ | a functional of $\bm{r}_{\mathrm{in}}(t)$ |
| $\bm{u}_o(t) = \bm{F}_W(\bar{\bm{r}}_{\mathrm{in}}(t), \bar{\bm{e}}_o^*(t))$ | Function implemented by forward network | instantaneous function of $\bar{\bm{r}}_{\mathrm{in}}(t)$, not $\bm{r}_{\mathrm{in}}(t)$ |
| $N$ | Layers in forward network, without $\bm{r}_{\mathrm{in}}$ | last-layer voltages: $\bm{u}_N = \bm{u}_o$ |
| $\bm{W}_l^{\mathrm{IP}}$ | Weights from pyramidal to interneurons | lateral, within layer $l$ |
| $\bm{W}_l^{\mathrm{PI}}$ | Weights from inter- to pyramidal neurons | lateral, within layer $l$ |
| $\bm{W}_l$ | Bottom-up weights from layer $l-1$ to $l$ | between pyramidal neurons |
| $\bm{B}_l$ | Top-down weights from layer $l+1$ to $l$ | between pyramidal neurons |
| $\bar{\bm{e}}_l^{\mathrm{A}} = \bm{B}_l\bm{u}_{l+1} - \bm{W}_l^{\mathrm{PI}}\bm{u}_l^{\mathrm{I}}$ | Low-pass filtered apical error in layer $l$ | top-down minus lateral feedback |
| $\bar{\bm{e}}_l = \bar{\bm{r}}_l' \odot \bar{\bm{e}}_l^{\mathrm{A}} = \bar{\bm{r}}_l' \odot \bm{B}_l\bar{\bm{e}}_{l+1}$ | Somato-basal prediction error | is correct error for learning |
| $E_l^{\mathrm{IP}} = \frac{1}{2}\big\|\bm{u}_l^{\mathrm{I}} - \bm{W}_l^{\mathrm{IP}}\bar{\bm{r}}_l\big\|^2$ | Interneuron mismatch energy | minimized to learn $\bm{W}_l^{\mathrm{IP}}$ |
| $E_l^{\mathrm{PI}} = \frac{1}{2}\big\|\bm{B}_l\bm{u}_{l+1} - \bm{W}_l^{\mathrm{PI}}\bm{u}_l^{\mathrm{I}}\big\|^2$ | Apical mismatch energy | minimized to learn $\bm{W}_l^{\mathrm{PI}}$ |
| $\eta, \eta^{\mathrm{IP}}, \eta^{\mathrm{PI}}$ | Learning rates for plasticity of … | $\bm{W}_l$; $\bm{W}_l^{\mathrm{IP}}$; $\bm{W}_l^{\mathrm{PI}}$ |
| $\bm{H} = \frac{\partial^2 L}{\partial\bm{u}^2} = \bm{1} - \bm{W}_{\mathrm{net}}\bm{\rho}' - \bar{\bm{e}}'$ | Hessian, $\frac{\partial^2 L}{\partial\bm{u}^2} = \frac{\partial\bm{f}}{\partial\bm{u}}$ | if positive definite ⇒ stable dynamics |
| $\bm{f}(\bm{u},t) = \frac{\partial L}{\partial\bm{u}} = \bm{u} - \bm{W}\bar{\bm{r}}(\bm{u}) - \bar{\bm{e}}(\bm{u})$ | Corrected error | becomes 0 with time constant $\tau$ |
| $\bm{f}(\bm{u},t) + \tau\dot{\bm{f}}(\bm{u},t) = 0$ | Euler-Lagrange equations | solutions satisfy $\bm{f}(\bm{u},t) = \bm{f}_0\,e^{-(t-t_0)/\tau}$ |
| $\bm{f}(\bm{u},t) = 0$ | Always the case after transient | exponentially decaying with $\tau$ |
| $\dot{\bm{u}} = -\frac{1}{\tau}\bm{H}^{-1}(\bm{u})\left(\bm{f}(\bm{u}) + \tau\frac{\partial\bm{f}}{\partial t}\right)$ | Explicit diff. eq. | obtained by solving for $\dot{\bm{u}}$ |
| $\bm{g}(\bm{u},t) = -\frac{1}{\tau}\bm{H}^{-1}(\bm{u})\left(\bm{f}(\bm{u}) + \tau\frac{\partial\bm{f}}{\partial t}\right)$ | Used to write the explicit diff. eq. | $\dot{\bm{u}} = \bm{g}(\bm{u},t)$ |
| $\bm{G}(\bm{y},\dot{\bm{u}}) = \left(1+\tau\frac{d}{dt}\right)\frac{\partial L}{\partial\bm{u}} = \bm{f} + \tau\dot{\bm{f}}$ | Used for contraction analysis, Equation 53 | $\bm{y} = (\bm{r}_{\mathrm{in}}, \bm{u}_o^*, \bm{u})$ |
| $\bm{M}, \bm{K}$ | Used to iteratively converge to $\dot{\bm{u}}$ | see Equation 46 |
| $\breve{\bm{u}} = \bm{u} + \tau\dot{\bm{u}}$ | Linear lookahead voltage | Latent Equilibrium, Appendix 4 |
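To see the last rows of the table in action, the sketch below integrates the explicit differential equation u̇ = g(u,t) for the nudging-free case β = 0, where ē vanishes and hence f(u,t) = u − W r̄. Network size, coupling, and the tanh transfer function are illustrative assumptions; after the initial transient, f stays near zero, i.e., the voltages track the moving equilibrium of the time-varying input:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, tau, dt = 5, 3, 10e-3, 1e-4                  # neurons, inputs, tau, Euler step
W_in = rng.normal(0.0, 0.5, (N, M))
W_net = np.tril(rng.normal(0.0, 0.5, (N, N)), -1)  # feedforward network coupling
rho, rho_p = np.tanh, lambda u: 1.0 - np.tanh(u)**2

r_bar_in = lambda t: np.sin(2*np.pi*20*t + np.arange(M))          # filtered input rates
r_bar_in_dot = lambda t: 2*np.pi*20*np.cos(2*np.pi*20*t + np.arange(M))

u = np.zeros(N)
for k in range(2000):
    t = k * dt
    f = u - W_in @ r_bar_in(t) - W_net @ rho(u)    # f = u - W r_bar (beta = 0)
    df_dt = -W_in @ r_bar_in_dot(t)                # explicit time dependence of f
    H = np.eye(N) - W_net * rho_p(u)               # Jacobian df/du (Table 1)
    u += dt * (-1.0/tau) * np.linalg.solve(H, f + tau * df_dt)

t = 2000 * dt
print(np.linalg.norm(u - W_in @ r_bar_in(t) - W_net @ rho(u)))    # small residual
```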


Cite this article as: Walter Senn, Dominik Dold, Akos F Kungl, Benjamin Ellenberger, Jakob Jordan, Yoshua Bengio, João Sacramento, Mihai A Petrovici (2024) A neuronal least-action principle for real-time learning in cortical circuits. eLife 12:RP89674. https://doi.org/10.7554/eLife.89674.3