Evolving interpretable plasticity for spiking networks

  1. Jakob Jordan (corresponding author)
  2. Maximilian Schmidt
  3. Walter Senn
  4. Mihai A Petrovici
  1. Department of Physiology, University of Bern, Switzerland
  2. Ascent Robotics, Japan
  3. RIKEN Center for Brain Science, Japan
  4. Kirchhoff-Institute for Physics, Heidelberg University, Germany
12 figures, 3 tables and 1 additional file

Figures

Figure 1
Artificial evolution of synaptic plasticity rules in spiking neuronal networks.

(A) Sketch of cortical microcircuits consisting of pyramidal cells (orange) and inhibitory interneurons (blue). Stimulation elicits action potentials in pre- and postsynaptic cells, which, in turn, influence synaptic plasticity. (B) Synaptic plasticity leads to a weight change (Δw) between the two cells, here measured by the change in the amplitude of post-synaptic potentials. The change in synaptic weight can be expressed by a function f that, in addition to the spike timings (tpre, tpost), can take into account additional local quantities, such as the concentration of neuromodulators (ρ, green dots in A) or the postsynaptic membrane potential. (C) For a specific experimental setup, an evolutionary algorithm searches for individuals representing functions f that maximize the corresponding fitness function. Offspring are generated by modifying the genome of a parent individual. Several runs of the evolutionary algorithm can discover phenomenologically different solutions (f0, f1, f2) with comparable fitness. (D) An offspring is generated from a single parent via mutation. Mutations of the genome can, for example, exchange mathematical operators, resulting in a different function f.
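The search procedure sketched in panels C and D amounts to a mutation-only evolutionary loop over candidate expressions. The following minimal Python sketch illustrates this idea under simplifying assumptions: mutate, to_function and fitness are hypothetical placeholders for the genome mutation operator, the genotype-to-phenotype mapping and the experiment-specific fitness evaluation, and the actual searches use the population and offspring sizes listed in the appendix tables.

```python
import copy

def evolve(parent_genome, mutate, to_function, fitness,
           n_offspring=4, n_generations=1000, target_fitness=None):
    """Mutation-only (1 + lambda) evolution loop over plasticity-rule genomes.

    `mutate`, `to_function` and `fitness` are placeholders for the genome
    mutation operator, the genotype-to-phenotype mapping and the
    task-specific fitness evaluation, respectively.
    """
    parent = parent_genome
    parent_fitness = fitness(to_function(parent))
    for generation in range(n_generations):
        # generate offspring from the single parent via mutation only
        offspring = [mutate(copy.deepcopy(parent)) for _ in range(n_offspring)]
        scored = [(fitness(to_function(g)), g) for g in offspring]
        best_fitness, best_genome = max(scored, key=lambda pair: pair[0])
        # keep the offspring if it is at least as fit as the parent
        if best_fitness >= parent_fitness:
            parent, parent_fitness = best_genome, best_fitness
        if target_fitness is not None and parent_fitness >= target_fitness:
            break
    return parent, parent_fitness
```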

Figure 2
Representation and mutation of mathematical expressions in Cartesian genetic programming.

(A) The genotype of an individual is a two-dimensional Cartesian graph (top). In this example, the graph contains three input nodes (0-2), six internal nodes (3-8) and a single output node (9). For each node, the genes of a specific genotype are shown, encoding the operator used to compute the node’s output as well as its inputs. Each operator gene maps to a specific mathematical function (bottom); special values (-1, -2) mark input and output nodes. For example, node 4 uses operator 1, which maps to the multiplication operation '*', and receives input from nodes 0 and 2; its output is hence x0*x2. The number of input genes per node is determined by the operator with the maximal arity (here two). Fixed genes that cannot be mutated are highlighted in red; ∅ denotes non-coding genes. (B) The computational graph (phenotype) generated by the genotype in A. Input nodes (x0,x1,x2) represent the arguments of the function f. Each output node selects one of the other nodes as a return value of the computational graph, thus defining a function from input 𝒙 to output 𝒚=𝒇(𝒙). Here, the output node selects node 4 as the return value. Some nodes defined in the genotype are not used by a particular realization of the computational graph (light gray, e.g., node 6); mutations that affect such nodes have no effect on the phenotype and are therefore considered ‘silent’. (C) Mutations in the genome either change the graph connectivity (top, green arrow) or alter the operator used by an internal node (bottom, green node). Here, both mutations affect the phenotype and are hence not silent.
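To make the genotype-to-phenotype mapping concrete, here is a minimal, library-independent Python sketch of decoding such a genome into a callable function. Only node 4 and the output gene follow the example of panel A; the remaining genes, and all operator codes other than '1 → multiplication', are illustrative assumptions.

```python
# Minimal sketch of decoding a Cartesian genetic programming genotype into a
# Python function. Each internal node carries an operator gene followed by
# two input genes, as in panel A; operator codes other than 1 are assumed.
OPERATORS = {
    0: lambda a, b: a + b,   # Add
    1: lambda a, b: a * b,   # Mul (operator 1, as in the example of panel A)
    2: lambda a, b: a - b,   # Sub
}

def decode(genome, n_inputs, n_outputs):
    """Return a function f(x) computing the outputs of the encoded graph.

    `genome` is a list of (operator_gene, input_gene_1, input_gene_2) triples
    for the internal nodes, followed by one (output_gene,) tuple per output
    node selecting which node's value to return.
    """
    internal, outputs = genome[:-n_outputs], genome[-n_outputs:]

    def f(*x):
        assert len(x) == n_inputs
        values = list(x)                      # nodes 0 .. n_inputs-1: inputs
        for op, in1, in2 in internal:         # internal nodes in column order
            values.append(OPERATORS[op](values[in1], values[in2]))
        return [values[sel] for (sel,) in outputs]

    return f

# Illustrative genome: node 4 computes x0 * x2 and is selected by the output,
# as in panel A; all other genes are made up for this example.
genome = [(0, 0, 1), (1, 0, 2), (2, 1, 2), (0, 3, 4), (1, 4, 5), (0, 5, 6), (4,)]
f = decode(genome, n_inputs=3, n_outputs=1)
print(f(2.0, 3.0, 4.0))   # -> [8.0], i.e. x0 * x2
```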

Figure 3
Cartesian genetic programming evolves various efficient reward-driven learning rules.

(A) Network sketch. Multiple input neurons with Poisson activity project to a single output unit. Pre- and postsynaptic activity generate an eligibility trace in each synapse. Comparison between the output activity and the target activity generates a reward signal. R̄ represents the expected reward, while R̄+ and R̄- represent the expected positive and negative reward, respectively. Depending on the hyperparameter settings, either the former or the latter two are provided to the plasticity rule. (B) Raster plot of the activity of input neurons (small black dots) and the output neuron (large golden dots). Gray (white) background indicates patterns for which the output should be active (inactive). Symbols at the top indicate correct (+) and incorrect (-) classifications. We show 10 trials at the beginning (left) and at the end (right) of training using the evolved plasticity rule Δwj=η(R-1)Ejr. (C) Fitness of the best individual per generation as a function of the generation index for multiple example runs of the evolutionary algorithm with different initial conditions but identical hyperparameters. Labels show the expression f at the end of the respective run for three runs resulting in well-performing plasticity rules. Gray lines represent runs with functionally identical solutions or low final fitness. (D) Fitness of a selected subset of evolved learning rules on the 10 experiments used during the evolutionary search (blue) and on 80 additional fitness evaluations, each on 10 new experiments consisting of sets of frozen-noise patterns and associated class labels not used during the evolutionary search (orange). Horizontal boxes represent the mean; error bars indicate one standard deviation over fitness values. The gray line indicates the mean fitness of LR0 for visual reference. Black stars indicate significance (p < 10⁻¹⁶) with respect to LR0 according to Welch’s t-tests (Welch, 1947). See main text for the full expressions for all learning rules.
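For illustration, here is a minimal sketch of how the evolved rule from panel B, Δwj = η(R-1)Ejr, would be applied as an episodic update at the end of a trial. The eligibility traces, the reward and the learning rate used below are placeholders, not the values from the actual experiments.

```python
import numpy as np

def reward_driven_update(w, eligibility, reward, eta=1.0):
    """Episodic weight update of the evolved rule Δw_j = η (R - 1) E_j^r.

    `w` and `eligibility` are arrays over synapses; `reward` is the scalar
    reward R ∈ {-1, 1} obtained in the current trial. Note that the rule
    only changes weights for negatively rewarded trials (R - 1 = -2),
    while positively rewarded trials leave the weights unchanged.
    """
    return w + eta * (reward - 1.0) * eligibility

# Hypothetical usage at the end of one trial with placeholder values:
w = np.zeros(50)                        # one weight per input neuron
eligibility = np.random.rand(50)        # stands in for the traces E_j^r
w = reward_driven_update(w, eligibility, reward=-1)
```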

Figure 4
Cartesian genetic programming evolves efficient error-driven learning rules.

(A) Network sketch. Multiple input neurons with Poisson activity project to two neurons. One of the neurons (the teacher) generates a target for the other (the student). The membrane potentials of teacher and student as well as the filtered pre-synaptic spike trains are provided to the plasticity rule that determines the weight update. (B) Root mean squared error between the teacher and student membrane potential over the course of learning using the evolved plasticity rule: Δwj(t)=η[v(t)-u(t)]s¯j(t). (C) Synaptic weights over the course of learning corresponding to panel B. Horizontal dashed lines represent target weights, that is, the fixed synaptic weights onto the teacher. (D) Fitness of the best individual per generation as a function of the generation index for multiple runs of the evolutionary algorithm with different initial conditions. Labels represent the rule at the end of the respective run. Colored markers represent fitness of each plasticity rule averaged over 15 validation tasks not used during the evolutionary search; error bars indicate one standard deviation.
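Analogously, the following is a minimal sketch of the evolved error-driven rule from panel B, Δwj(t) = η[v(t)-u(t)]s̄j(t). Treating the rule as a discrete increment applied on a fixed update grid is an assumption of this sketch, and all variable values are illustrative.

```python
import numpy as np

def error_driven_update(w, v_teacher, u_student, s_bar, eta=1.7):
    """One application of the evolved rule Δw_j(t) = η [v(t) - u(t)] s̄_j(t).

    `v_teacher` and `u_student` are the teacher and student membrane
    potentials at time t; `s_bar` is the array of filtered presynaptic
    spike trains s̄_j(t). Applying this as a discrete increment at fixed
    update times is an assumption of this sketch.
    """
    return w + eta * (v_teacher - u_student) * s_bar

# Hypothetical usage at a single update step with placeholder values:
w = np.full(5, 5.0)                                   # initial student weights
w = error_driven_update(w, v_teacher=-60.0, u_student=-65.0,
                        s_bar=np.random.rand(5))
```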

Figure 5
Cartesian genetic programming evolves diverse correlation-driven learning rules.

(A) Network sketch. Multiple inputs project to a single output neuron. The current synaptic weight wj and the eligibility trace Ejc are provided to the plasticity rule that determines the weight update. (B) Membrane potential u of the output neuron over the course of learning using Equation 17. Gray boxes indicate presentation of the frozen-noise pattern. (C) Fitness (Equation 13) of the best individual per generation as a function of the generation index for multiple runs of the evolutionary algorithm with different initial conditions. Blue and red curves correspond to the two representative plasticity rules selected for detailed analysis. Blue and red markers show the fitness of these two representative rules, and the orange marker the fitness of the homeostatic STDP rule (Equation 17; Masquelier, 2018), each evaluated on 20 validation tasks not used during the evolutionary search. Error bars indicate one standard deviation over tasks. (D, E) Learning rules evolved by two runs of CGP (D: LR1, Equation 19; E: LR2, Equation 20). (F) Homeostatic STDP rule (Equation 17) suggested by Masquelier, 2018. Top panels: STDP kernels Δwj as a function of the spike-timing difference Δtj for three different weights wj. Bottom panels: homeostatic mechanisms for the same weights. The colors are specific to the respective learning rules (blue for LR1, red for LR2), with different shades representing the different weights wj. The learning rate is η=0.01.
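Since the plasticity rule in this task receives only the current weight wj and the eligibility trace Ejc, any evolved expression can be applied with the same scaffold. The sketch below shows this scaffold with a placeholder expression standing in for an evolved rule such as LR1 or LR2; the eligibility-trace dynamics themselves are omitted and the placeholder is not one of the evolved rules.

```python
import numpy as np

def apply_correlation_rule(f, w, eligibility, eta=0.01):
    """Apply an evolved correlation-driven rule Δw_j = η f(w_j, E_j^c).

    `f` is the evolved expression; `w` and `eligibility` are arrays over
    the synapses onto the output neuron.
    """
    return w + eta * f(w, eligibility)

# Hypothetical usage with a placeholder expression (not one of the evolved rules):
example_rule = lambda w, e: e - w       # stands in for an evolved f(w_j, E_j^c)
w = np.random.rand(50)
eligibility = np.random.rand(50)        # stands in for the traces E_j^c
w = apply_correlation_rule(example_rule, w, eligibility)
```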

Appendix 1—figure 1
Fitness of best individual per generation as a function of the generation index for multiple runs of the evolutionary algorithm with different initial conditions for hyperparameter set 0.
Appendix 1—figure 2
Fitness of best individual per generation as a function of the generation index for multiple runs of the evolutionary algorithm with different initial conditions for hyperparameter set 1.
Appendix 1—figure 3
Fitness of best individual per generation as a function of the generation index for multiple runs of the evolutionary algorithm with different initial conditions for hyperparameter set 2.
Appendix 1—figure 4
Fitness of best individual per generation as a function of the generation index for multiple runs of the evolutionary algorithm with different initial conditions for hyperparameter set 3.
Appendix 1—figure 5
Causal and homeostatic terms of LR1–LR6 over trials.

c+ and c- represent the causal terms (prefactors of the eligibility trace), h+ and h- the homeostatic terms, for positive and negative rewards, respectively.

Appendix 1—figure 6
Cumulative reward of LR1–LR5 over trials.

Solid lines represent the mean; shaded regions indicate plus/minus one standard deviation over 80 experiments. The cumulative reward of LR0 is shown in all panels for comparison. The gray line indicates maximal performance (the maximal reward received in each trial).

Appendix 1—figure 7
Evolution of membrane potential for two evolved learning rules.

Membrane potential u of the output neuron over the course of learning using the two evolved learning rules LR1 (top row, Equation 19) and LR2 (bottom row, Equation 20) (compare Figure 5B). Gray boxes indicate presentation of the frozen-noise pattern.

Tables

Appendix 1—table 1
Description of the network model used in the reward-driven learning task (Section 4.5).

A. Model summary
Populations: 2
Topology: —
Connectivity: Feedforward with fixed connection probability
Neuron model: Leaky integrate-and-fire (LIF) with exponential post-synaptic currents
Plasticity: Reward-driven
Measurements: Spikes

B. Populations
Name | Elements | Size
Input | Spike generators with pre-defined spike trains (see Section 4.5) | N
Output | LIF neuron | 1

C. Connectivity
Source | Target | Pattern
Input | Output | Fixed pairwise connection probability p; synaptic delay d; random initial weights drawn from 𝒩(0, σw²)

D. Neuron model
Type: LIF neuron with exponential post-synaptic currents
Subthreshold dynamics: du(t)/dt = -(u(t) - EL)/τm + Is(t)/Cm if not refractory, u(t) = ur otherwise; Is(t) = Σ_{i,k} wk exp(-(t - t_i^k)/τs) Θ(t - t_i^k), k: neuron index, i: spike index
Spiking: Stochastic spike generation via an inhomogeneous Poisson process with intensity ϕ(u) = ρ exp((u - uth)/Δu); reset of u to ur after spike emission, followed by a refractory period of τr

E. Synapse model
Plasticity: Reward-driven with episodic update (Equation 2, Equation 3)
Other: Each synapse stores an eligibility trace (Equation 22)

F. Simulation parameters
Populations: N = 50
Connectivity: p = 0.8, σw = 10³ pA
Neuron model: ρ = 0.01 Hz, Δu = 0.2 mV, EL = -70 mV, ur = -70 mV, uth = -55 mV, τm = 10 ms, Cm = 250 pF, τr = 2 ms, τs = 2 ms
Synapse model: η = 10, τM = 500 ms, d = 1 ms
Input: M = 30, r = 6 Hz, T = 500 ms, ntraining = 500, nexp = 10
Other: h = 0.01 ms, R ∈ {-1, 1}, mr = 100

G. CGP parameters
Population: μ = 1, pmutation = 0.035
Genome: ninputs ∈ {3, 4}, noutputs = 1, nrows = 1, ncolumns ∈ {12, 24}, lmax ∈ {12, 24}
Primitives: Add, Sub, Mul, Div, Const(1.0), Const(0.5)
EA: λ = 4, nbreeding = 4, ntournament = 1, reorder ∈ {true, false}
Other: maxgenerations = 1000, minimalfitness = 500
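As a reading aid for rows D and F above, the following is a minimal forward-Euler sketch of the LIF dynamics with exponential post-synaptic currents and stochastic spike generation, using the parameter values listed in row F. It is an illustrative re-implementation, not the simulation code used for the experiments, and the synaptic current is treated as a given input signal rather than generated from presynaptic spikes.

```python
import numpy as np

# Parameter values from row F of this table (units: mV, ms, pF, pA).
E_L, u_r, u_th = -70.0, -70.0, -55.0     # resting, reset and threshold potential
tau_m, C_m = 10.0, 250.0                 # membrane time constant and capacitance
tau_r = 2.0                              # refractory period
rho, delta_u = 0.01e-3, 0.2              # escape-rate intensity (1/ms) and width (mV)
h = 0.01                                 # integration step (ms)

def simulate(I_s, rng=np.random.default_rng(0)):
    """Forward-Euler integration of the stochastic LIF neuron from row D.

    `I_s` is the synaptic current (pA) sampled on the grid of width h; in the
    model it is a sum of exponentially decaying kernels triggered by
    presynaptic spikes, here it is simply treated as a given input signal.
    """
    u, refractory_steps = E_L, 0
    spike_times, u_trace = [], []
    for step, current in enumerate(I_s):
        if refractory_steps > 0:                              # clamp during refractoriness
            refractory_steps -= 1
            u = u_r
        else:
            u += h * (-(u - E_L) / tau_m + current / C_m)     # subthreshold dynamics
            # stochastic spiking with intensity phi(u) = rho * exp((u - u_th) / delta_u)
            if rng.random() < h * rho * np.exp((u - u_th) / delta_u):
                spike_times.append(step * h)
                u = u_r
                refractory_steps = int(tau_r / h)
        u_trace.append(u)
    return np.array(spike_times), np.array(u_trace)

# Hypothetical usage: 500 ms of constant 400 pA input current on the 0.01 ms grid.
spikes, potentials = simulate(np.full(50_000, 400.0))
```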
Appendix 1—table 2
Description of the network model used in the error-driven learning task (Section 4.6).

A. Model summary
Populations: 3
Topology: —
Connectivity: Feedforward with all-to-all connections
Neuron model: Leaky integrate-and-fire (LIF) with exponential post-synaptic currents
Plasticity: Error-driven
Measurements: Spikes, membrane potentials

B. Populations
Name | Elements | Size
Input | Spike generators with pre-defined spike trains (see Section 4.6) | N
Teacher | LIF neuron | 1
Student | LIF neuron | 1

C. Connectivity
Source | Target | Pattern
Input | Teacher | All-to-all; synaptic delay d; random weights w ~ 𝒰[wmin, wmax]; weights randomly shifted by wshift on each trial
Input | Student | All-to-all; synaptic delay d; fixed initial weights w0

D. Neuron model
Type: LIF neuron with exponential post-synaptic currents
Subthreshold dynamics: du(t)/dt = -(u(t) - EL)/τm + Is(t)/Cm; Is(t) = Σ_{i,k} Jk exp(-(t - t_i^k)/τs) Θ(t - t_i^k), k: neuron index, i: spike index
Spiking: Stochastic spike generation via an inhomogeneous Poisson process with intensity ϕ(u) = ρ exp((u - uth)/Δu); no reset after spike emission

E. Synapse model
Plasticity: Error-driven with continuous update (Equation 7, Equation 9)

F. Simulation parameters
Populations: N = 5
Connectivity: wmin = -20, wmax = 20, wshift ∈ {-15, 15}, w0 = 5
Neuron model: ρ = 0.2 Hz, Δu = 1.0 mV, EL = -70 mV, uth = -55 mV, τm = 10 ms, Cm = 250 pF, τs = 2 ms
Synapse model: η = 1.7, d = 1 ms, τI = 100.0 ms
Input: rmin = 150 Hz, rmax = 850 Hz, T = 10,000 ms, nexp = 15
Other: h = 0.01 ms, δt = 5 ms

G. CGP parameters
Population: μ = 4, pmutation = 0.045
Genome: ninputs = 3, noutputs = 1, nrows = 1, ncolumns = 12, lmax = 12
Primitives: Add, Sub, Mul, Div, Const(1.0)
EA: λ = 4, nbreeding = 4, ntournament = 1
Other: maxgenerations = 1000, minimalfitness = 0.0
Appendix 1—table 3
Description of the network model used in the correlation-driven learning task (Section 4.7).

A. Model summary
Populations: 2
Topology: —
Connectivity: Feedforward with fixed connection probability
Neuron model: Leaky integrate-and-fire (LIF) with exponential post-synaptic currents
Plasticity: Reward-driven
Measurements: Spikes

B. Populations
Name | Elements | Size
Input | Spike generators with pre-defined spike trains (see Section 4.5) | N
Output | LIF neuron | 1

C. Connectivity
Source | Target | Pattern
Input | Output | Fixed pairwise connection probability p; synaptic delay d; random initial weights drawn from 𝒩(0, σw²)

D. Neuron model
Type: LIF neuron with exponential post-synaptic currents
Subthreshold dynamics: du(t)/dt = -(u(t) - EL)/τm + Is(t)/Cm if not refractory, u(t) = ur otherwise; Is(t) = Σ_{i,k} wk exp(-(t - t_i^k)/τs) Θ(t - t_i^k), k: neuron index, i: spike index
Spiking: Stochastic spike generation via an inhomogeneous Poisson process with intensity ϕ(u) = ρ exp((u - uth)/Δu); reset of u to ur after spike emission, followed by a refractory period of τr

E. Synapse model
Plasticity: Reward-driven with episodic update (Equation 2, Equation 3)
Other: Each synapse stores an eligibility trace (Equation 22)

F. Simulation parameters
Populations: N = 50
Connectivity: p = 0.8, σw = 10³ pA
Neuron model: ρ = 0.01 Hz, Δu = 0.2 mV, EL = -70 mV, ur = -70 mV, uth = -55 mV, τm = 10 ms, Cm = 250 pF, τr = 2 ms, τs = 2 ms
Synapse model: η = 10, τM = 500 ms, d = 1 ms
Input: M = 30, r = 6 Hz, T = 500 ms, ntraining = 500, nexp = 10
Other: h = 0.01 ms, R ∈ {-1, 1}, mr = 100

G. CGP parameters
Population: μ = 8, pmutation = 0.05
Genome: ninputs = 2, noutputs = 1, nrows = 1, ncolumns = 5, lmax = 5
Primitives: Add, Sub, Mul, Div, Pow, Const(1.0)
EA: λ = 8, nbreeding = 8, ntournament = 1
Other: maxgenerations = 2000, minimalfitness = 10.0

Cite this article:
Jakob Jordan, Maximilian Schmidt, Walter Senn, Mihai A Petrovici (2021) Evolving interpretable plasticity for spiking networks. eLife 10:e66273. https://doi.org/10.7554/eLife.66273