A neuronal least-action principle for real-time learning in cortical circuits

  1. Walter Senn (corresponding author)
  2. Dominik Dold
  3. Akos F Kungl
  4. Benjamin Ellenberger
  5. Jakob Jordan
  6. Yoshua Bengio
  7. João Sacramento
  8. Mihai A Petrovici
  1. Department of Physiology, University of Bern, Switzerland
  2. Kirchhoff-Institute for Physics, Heidelberg University, Germany
  3. European Space Research and Technology Centre, European Space Agency, Netherlands
  4. Insel Data Science Center, University Hospital Bern, Switzerland
  5. Electrical Engineering, Yale University, United States
  6. MILA, University of Montreal, Canada
  7. Department of Computer Science, ETH Zurich, Switzerland
6 figures, 1 table and 1 additional file

Figures

Somato-dendritic mismatch energies and the neuronal least-action (NLA) principle.

(a1) Sketch of a cross-cortical network of pyramidal neurons described by NLA. (a2) Correspondence between elements of NLA and biological observables such as membrane voltages and synaptic weights. (b1) The NLA principle postulates that small variations δũ (dashed) of the trajectories ũ (solid) leave the action invariant, δA=0. It is formulated in the look-ahead coordinates ũ (symbolized by the spyglass), in which 'hills' of the Lagrangian (shaded gray zones) are foreseen by the prospective voltage, so that the trajectory can turn early enough to steer around them. (b2) In the absence of output nudging (β=0), the trajectory u(t) is solely driven by the sensory input, and prediction errors and energies vanish (L=0, outer blue trajectory at bottom). When nudging the output neurons towards a target voltage (β>0), somato-dendritic prediction errors appear, the energy increases (red dashed arrows symbolizing the growing 'volcano'), and the trajectory u(t) moves out of the L=0 hyperplanes, riding on top of the 'volcano' (red trajectory). Synaptic plasticity Ẇ reduces the somato-dendritic mismatch along the trajectory by optimally 'shoveling down the volcano' (blue dashed arrows), while the trajectory settles in a new place on the L=0 hyperplane (inner blue trajectory at bottom).
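For concreteness, the quantities in this sketch (mismatch energies E_i^M, output cost C_o, and Lagrangian L; see Table 1) can be evaluated for a toy network. A minimal sketch, assuming a tanh rate function and hypothetical random weights and voltages:

```python
import numpy as np

rng = np.random.default_rng(0)
rho = np.tanh                        # assumed rate nonlinearity

n = 4                                # toy network: 4 neurons, last one is output
W = rng.normal(scale=0.5, size=(n, n))
u = rng.normal(size=n)               # somatic voltages

r_bar = rho(u)                       # low-pass filtered rates, r̄ = ρ(u)
e_bar = u - W @ r_bar                # somato-dendritic mismatch, ē = u − W r̄
E_mismatch = 0.5 * np.sum(e_bar**2)  # Σ_i E_i^M = ½ Σ_i ē_i²

beta, u_target = 0.1, 0.8            # nudging strength and output target
cost = 0.5 * (u_target - u[-1])**2   # C_o = ½ (u_o* − u_o)²
L = E_mismatch + beta * cost         # Lagrangian L = Σ E^M + β Σ C
print(f"L = {L:.4f}")                # > 0 under nudging; plasticity reduces it
```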

Prospective coding in cortical pyramidal neurons enables instantaneous voltage-to-voltage transfer.

(a1) The instantaneous spike rate of cortical pyramidal neurons (top) in response to a sinusoidally modulated noisy input current (bottom) is phase-advanced with respect to the input (adapted from Köndgen et al., 2008). (a2) Similarly, in neuronal least-action (NLA), the instantaneous firing rate of a model neuron (r=ρ(u)+τρ̇(u), black) is phase-advanced with respect to the underlying voltage (u, red; postulating that the low-pass filtered rate is a function of the voltage, r̄=ρ(u)). (b) Dendritic input in the apical tree (here called ē) instantaneously causes a somatic voltage modulation (u; modeling data from Ulrich, 2002). The low-pass filtering with τ along the dendritic shaft is compensated by a lookahead mechanism in the dendrite (e=ē+τē̇). In Ulrich (2002), a phase advance is observed even with respect to the dendritic input current, not only the dendritic voltage, although only for slow modulations (as here). (c) While the voltage of the first neuron (u1) integrates the input rates r_in from the past (bottom black upward arrows), the output rate r1 of that first neuron looks ahead in time, r1=ρ(u1)+τρ̇(u1) (red dashed arrows pointing into the future). The voltage of the second neuron (u2) integrates the prospective rates r1 (top black upward arrows). By doing so, it inverts the lookahead operation, resulting in an instantaneous transfer from u1(t) to u2(t) (blue arrow and circles).
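The claimed inversion of the lookahead by somatic integration can be checked numerically: low-pass filtering the prospective rate r=ρ(u)+τρ̇(u) with the same τ returns ρ(u) without phase lag. A minimal sketch, assuming a tanh rate function and a sinusoidal voltage (all parameters hypothetical):

```python
import numpy as np

tau, dt = 10e-3, 0.05e-3                   # membrane time constant, Euler step
t = np.arange(0.0, 0.5, dt)
u = np.sin(2 * np.pi * 5 * t)              # example voltage trace (5 Hz)

rate = np.tanh(u)                          # ρ(u)
r = rate + tau * np.gradient(rate, dt)     # prospective rate r = ρ(u) + τρ̇(u)

# Low-pass filter r with the same τ:  τ dr̄/dt = r − r̄
r_bar = np.zeros_like(r)
for k in range(1, len(t)):
    r_bar[k] = r_bar[k-1] + dt / tau * (r[k-1] - r_bar[k-1])

# After the initial transient, r̄ tracks ρ(u) instantaneously (no phase lag).
print(np.max(np.abs(r_bar[len(t)//2:] - rate[len(t)//2:])))  # ≈ 0
```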

Moving equilibrium hypothesis for motor control and real-time learning of cortical activity.

(a) A voluntary movement trajectory can be specified by the target length of the muscles in time, u_o*, encoded through the γ-innervation of muscle spindles, and the deviation of the effective muscle lengths from the target, u_o − u_o* = −ē_o*. The Ia-afferents emerging from the spindles prospectively encode the error, so that their low-pass filtered activity is roughly proportional to the length deviation, truncated at zero (red). The moving equilibrium hypothesis states that the low-pass filtered input r̄_in, composed of the movement plan r̄_in^plan and the sensory input (here encoding the state of the plant, e.g., through visual and proprioceptive input, r̄_in^vis and r̄_in^prop), together with the low-pass filtered error feedback from the spindles, ē_o*, instantaneously generate the muscle lengths, u_o = F_W(r̄_in, ē_o*), and are thus at any point in time in an instantaneous equilibrium (defined by Equations 7a and 7b). (b1) Intracortical electroencephalogram (iEEG) activity recorded from 56 deep electrodes and projected to the brain surface. Red nodes symbolize the 56 iEEG recording sites, modeled alternately as input or output neurons, and blue nodes symbolize the 40 'hidden' neurons for which no data is available but which are used to reproduce the iEEG activity. (b2) Corresponding NLA network. During training, the voltages of the output neurons were nudged by the iEEG targets (black input arrows; this applies to all red output neurons). During testing, nudging was removed for 14 out of these 56 neurons (here represented by neurons 1, 2, 3). (c1) Voltage traces for the three example neurons in (b2), before (blue) and after (red) training, overlaid with their iEEG target traces (gray). (c2) Total cost, integrated over a window of 8 s, of the 56 output nodes during training with sequences of the same duration. The cost for the test sequences was evaluated on an 8 s window not used during training.
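The nudging protocol in (b2) amounts to adding a β-weighted pull toward the target to the output dynamics during training and removing it during testing. A minimal single-neuron sketch, with hypothetical constants:

```python
import numpy as np

def output_step(u_out, basal_drive, u_target, *, beta=0.1, tau=10e-3,
                dt=1e-4, nudge=True):
    """One Euler step of a leaky output neuron. During training (nudge=True)
    the voltage is pulled toward its target trace; during testing the nudging
    term is removed and the voltage is generated by the network alone."""
    error = beta * (u_target - u_out) if nudge else 0.0
    return u_out + dt / tau * (-u_out + basal_drive + error)

u = 0.0
for _ in range(2000):                    # training phase: follow the target
    u = output_step(u, basal_drive=0.2, u_target=1.0, nudge=True)
print(round(u, 3))                       # settles between drive and target
```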

On-the-fly learning of finger responses to visual input with real-time dendritic error propagation (rt-DeEP).

(a) Functionally feedforward network with handwritten digits as visual input (r_in^(2)(t) in Figure 3a; here from the MNIST data set, 5 ms presentation time per image), backprojections enabling credit assignment, and the activity of the 10 output neurons interpreted as commands for the 10 fingers (forward architecture: 784×500×10 neurons). (b) Example voltage trace (b1) and local error (b2) of a hidden neuron in neuronal least-action (NLA) (red) compared to an equivalent network without lookahead rates (orange). Note that neither network achieves a steady state, due to the extremely short input presentation times. Errors are calculated via exact backpropagation, i.e., by applying the error backpropagation algorithm to a pure feedforward NLA network at every simulation time step (with output errors scaled by β), shown for comparison (blue dots). (c) Comparison of network models during and after learning. Color scheme as in (b). (c1) The test error under NLA evolves during learning on par with classical error backpropagation performed at each Euler step dt based on the feedforward activities. In contrast, networks without lookahead rates are incapable of learning such rapidly changing stimuli. (c2) With increasing presentation time, the performance under NLA further improves, while networks without lookahead rates stagnate at high error rates. This is caused by transient but long-lasting misrepresentation of errors following stimulus switches: when plasticity is turned off during transients and is only active in the steady state, comparably good performance can be achieved (dashed orange). (d) Receptive fields of 6 hidden-layer neurons after training, demonstrating that even for very brief image presentation times (5 ms), the combined neuronal and synaptic dynamics are capable of learning useful feature extractors such as edge filters.
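A condensed sketch of one such learning step: Euler-integrated voltage dynamics with errors backprojected to the hidden layer, and the local plasticity Ẇ ∝ ē r̄ᵀ from Table 1. The layer sizes follow the figure; the tanh rates, the choice of W2ᵀ as backprojection, and all constants are assumptions, not the paper's exact model:

```python
import numpy as np

rng = np.random.default_rng(1)
rho = np.tanh                                  # assumed rate nonlinearity
drho = lambda u: 1.0 - np.tanh(u)**2

n_in, n_hid, n_out = 784, 500, 10              # forward architecture 784×500×10
W1 = rng.normal(scale=0.05, size=(n_hid, n_in))
W2 = rng.normal(scale=0.05, size=(n_out, n_hid))
tau, dt, beta, eta = 10e-3, 1e-3, 0.1, 1e-3    # hypothetical constants
u1, u2 = np.zeros(n_hid), np.zeros(n_out)

def euler_step(r_in_bar, u_target):
    """One Euler step of the voltage dynamics with backprojected errors,
    followed by local plasticity ΔW ∝ ē r̄ᵀ (a sketch of rt-DeEP, not the
    paper's full prospective dynamics)."""
    global u1, u2, W1, W2
    e2 = beta * (u_target - u2)                # scaled output (nudging) error
    e1 = drho(u1) * (W2.T @ e2)                # error backprojected to layer 1
    u1 += dt / tau * (-u1 + W1 @ r_in_bar + e1)
    u2 += dt / tau * (-u2 + W2 @ rho(u1) + e2)
    W1 += eta * np.outer(u1 - W1 @ r_in_bar, r_in_bar)   # Ẇ ∝ ē r̄ᵀ
    W2 += eta * np.outer(u2 - W2 @ rho(u1), rho(u1))

# Example: one 5 ms digit presentation = 5 Euler steps of 1 ms each.
x, y = rng.random(n_in), np.eye(n_out)[3]      # stand-ins for image and label
for _ in range(5):
    euler_step(x, y)
```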

Hierarchical plastic microcircuits implement real-time dendritic error learning (rt-DeEL).

(a) Microcircuit with 'top-down' input (originating from peripheral motor activity, blue line) that is explained away by the lateral input via interneurons (dark red), with the remaining activity representing the error ē_l. Plastic connections are denoted by a small red arrow, and nudging by a dashed line. (b1) Simulated network with 784-300-10 pyramidal neurons and a population of 40 interneurons in the hidden layer, used for the MNIST learning task where the handwritten digits have to be associated with the 10 fingers. (b2) Test errors for rt-DeEL with joint tabula rasa learning of the forward and lateral weights of the microcircuit. A performance similar to classical error backpropagation is reached. For comparability, we also show the performance of a shallow network (dashed line). (b3) Angle derived from the Frobenius norm between the lateral pathway W_l^PI W_l^IP and the feedback pathway B_l W_{l+1}. During training, both pathways align to allow correct credit assignment throughout the network. Indices are dropped in the axis label for readability.
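The alignment reported in (b3) can be computed as the angle under the Frobenius inner product between the two pathway matrices. A small sketch, with hypothetical shapes matching the network in (b1):

```python
import numpy as np

def frobenius_angle(A, B):
    """Angle (degrees) between two equally shaped matrices under the
    Frobenius inner product <A, B> = Σ_ij A_ij B_ij."""
    cos = np.sum(A * B) / (np.linalg.norm(A) * np.linalg.norm(B))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

rng = np.random.default_rng(2)
W_IP = rng.normal(size=(40, 300))    # pyramidal -> interneuron, layer l
W_PI = rng.normal(size=(300, 40))    # interneuron -> pyramidal, layer l
B    = rng.normal(size=(300, 10))    # top-down weights into layer l
W2   = rng.normal(size=(10, 300))    # bottom-up weights from layer l to l+1

# Lateral (pyr -> inter -> pyr) vs feedback (up -> down) pathway; both are
# 300×300 maps on layer-l activity, so their angle is well defined.
print(frobenius_angle(W_PI @ W_IP, B @ W2))   # ≈ 90° before any training
```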

Appendix 1—figure 1
Recovering presynaptic potentials through short-term depression.

(a1) Relative voltage response of a depressing cortical synapse (recreated from Abbott et al., 1997), identified as the synaptic release probability p. (a2) The product of the low-pass filtered presynaptic firing rate r̄(u) and the synaptic release probability p(r̄) is proportional to the presynaptic membrane potential, p(r̄)·r̄ ∝ u. (a3) Average in vivo firing rate of a neuron in the visual cortex as a function of the somatic membrane potential (recreated from Anderson et al., 2000), which can be qualitatively identified as the stationary rate r̄(u) derived in Equation 43.
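The identity in (a2) can be illustrated directly: with a supralinear stationary rate r̄=ρ(u), a release probability of the form p(r̄)=ρ⁻¹(r̄)/r̄ decreases with the rate, as for a depressing synapse, while the product p(r̄)·r̄ recovers the presynaptic potential. A minimal sketch, assuming ρ(u)=u² as a hypothetical stand-in for the rate function of Equation 43:

```python
import numpy as np

# Assumed supralinear rate function r̄ = ρ(u) = u², a stand-in for the
# stationary rate of Equation 43 (cf. the power-law fit in panel a3).
u = np.linspace(0.1, 2.0, 50)        # presynaptic membrane potentials
r_bar = u**2                         # low-pass filtered presynaptic rate

# Release "probability" (up to a constant scale) p(r̄) = ρ⁻¹(r̄)/r̄ = r̄^(−1/2)
# decreases with the rate, as for a depressing synapse, and the product
# recovers the presynaptic potential.
p = 1.0 / np.sqrt(r_bar)
assert np.allclose(p * r_bar, u)     # p(r̄)·r̄ ∝ u, as in panel (a2)
print(p[0], p[-1])                   # p falls as the rate grows: depression
```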

Tables

Table 1
Mathematical symbols. A numerical sketch of the self-consistency relations follows the table.

Mathematical expression | Naming | Comment
u_i | Instantaneous (somatic) voltage | Only for network neurons
r_i = ρ(u_i) + τρ̇(u_i) | Instantaneous firing rate of neuron i | Looks linearly ahead in time
r̄(t) = (1/τ) ∫_{−∞}^t r(t′) e^{−(t−t′)/τ} dt′ | Definition of low-pass filtering | See Equation 15
r̄_i = ρ(u_i) | Low-pass filtered firing rate | Postulated to be a function of u_i
r = r̄ + τr̄̇ | Self-consistency equation | For the low-pass filtered rate
r_in | Input rate vector (column) | Projects to selected neurons
r̄_in | Low-pass filtered input rates | Propagates instantaneously
e_i = (u_i + τu̇_i) − Σ_j W_ij r_j | Prospective error of neuron i | In the apical dendrite
ē_i = u_i − Σ_j W_ij r̄_j | Error of neuron i | In the soma
E_i^M = ½ ē_i² = ½ (u_i − Σ_j W_ij r̄_j)² | Mismatch energy in neuron i | Between soma and basal dendrite
u_o* | Target voltage for output neuron o | Could impose a target on r_o or r̄_o
ē_o* = u_o* − u_o | Error of output neuron o | Also called target error
C_o = ½ (ē_o*)² | Cost contribution of output neuron o | Between soma and target
L = Σ_{i∈N} E_i^M + β Σ_{o∈O} C_o | Lagrangian | Output O ⊂ network N
ũ(t) = (1/τ) ∫_t^∞ u(t′) e^{(t−t′)/τ} dt′ | Discounted future voltage | Prospective coordinates for NLA
u = ũ − τũ̇ | Self-consistency equation | For the discounted future voltage
A = ∫_{t₁}^{t₂} L[ũ(t), ũ̇(t)] dt | Neuronal least action (NLA) | Expressed in prospective coordinates
∂L/∂ũ_i − (d/dt) ∂L/∂ũ̇_i = (1 + τ d/dt) ∂L/∂u_i = 0 | Euler-Lagrange equations | Turned into a lookahead operator
W_in | Weights from input neurons r_in | dim(N) × dim(r_in), most entries 0
W_net | Weights between network neurons | dim(N) × dim(N)
W = (W_in, W_net) | Total weight matrix | dim(N) × (dim(r_in) + dim(N))
r = (r_in, r_net)^T | Instantaneous firing rate vector | Column (indicated by the transpose)
Ẇ ∝ ē r̄^T | Plasticity of W | ē is a column, r̄^T a row vector
u_o*(t) = F*(r̄_in(t)) | Target function, formulated for r̄_in(t) | A functional of r_in(t)
u_o(t) = F_W(r̄_in(t), ē_o*(t)) | Function implemented by the forward network | Instantaneous function of r̄_in(t), not of r_in(t)
N | Layers in the forward network, w/o r_in | Last-layer voltages: u_N = u_o
W_l^IP | Weights from pyramidal to interneurons | Lateral, within layer l
W_l^PI | Weights from inter- to pyramidal neurons | Lateral, within layer l
W_l | Bottom-up weights from layer l−1 to l | Between pyramidal neurons
B_l | Top-down weights from layer l+1 to l | Between pyramidal neurons
ē_l^A = B_l u_{l+1} − W_l^PI u_l^I | Low-pass filtered apical error in layer l | Top-down minus lateral feedback
ē_l = r̄_l′ ⋅ ē_l^A = r̄_l′ ⋅ B_l ē_{l+1} | Somato-basal prediction error | The correct error for learning
E_l^IP = ½ ‖u_l^I − W_l^IP r̄_l‖² | Interneuron mismatch energy | Minimized to learn W_l^IP
E_l^PI = ½ ‖B_l u_{l+1} − W_l^PI u_l^I‖² | Apical mismatch energy | Minimized to learn W_l^PI
η, η^IP, η^PI | Learning rates | For the plasticity of W_l, W_l^IP, W_l^PI
H = ∂²L/∂u² = 1 − W_net ρ′ − ē′ | Hessian, equal to ∂f/∂u | If positive definite ⇒ stable dynamics
f(u,t) = ∂L/∂u = u − W r̄(u) − ē(u) | Corrected error | Becomes 0 with time constant τ
f(u,t) + τḟ(u,t) = 0 | Euler-Lagrange equations | Satisfied by f(u,t) = f₀ e^{−(t−t₀)/τ}
f(u,t) = 0 | Always the case after the transient | Exponentially decaying with τ
u̇ = −(1/τ) H⁻¹(u) (f(u) + τ ∂f/∂t) | Explicit differential equation | Obtained by solving for u̇
g(u,t) = −(1/τ) H⁻¹(u) (f(u) + τ ∂f/∂t) | Used to write the explicit differential equation | u̇ = g(u,t)
G(y,u̇) = (1 + τ d/dt) ∂L/∂u = f + τḟ | Used for contraction analysis, Equation 53 | y = (r_in, u_o*, u)
M, K | Used to iteratively converge to u̇ | See Equation 46
ŭ = u + τu̇ | Linear lookahead voltage | Latent Equilibrium, Appendix 4


Citation

Walter Senn, Dominik Dold, Akos F Kungl, Benjamin Ellenberger, Jakob Jordan, Yoshua Bengio, João Sacramento, Mihai A Petrovici (2024) A neuronal least-action principle for real-time learning in cortical circuits. eLife 12:RP89674. https://doi.org/10.7554/eLife.89674.3