1. Computational and Systems Biology
  2. Neuroscience
Download icon

Continuous attractors for dynamic memories

  1. Davide Spalla  Is a corresponding author
  2. Isabel Maria Cornacchia
  3. Alessandro Treves
  1. SISSA – Cognitive Neuroscience, Via Bonomea, Italy
  2. University of Turin – Physics Department, Italy
Research Article
  • Cited 0
  • Views 1,031
  • Annotations
Cite this article as: eLife 2021;10:e69499 doi: 10.7554/eLife.69499

Abstract

Episodic memory has a dynamic nature: when we recall past episodes, we retrieve not only their content, but also their temporal structure. The phenomenon of replay, in the hippocampus of mammals, offers a remarkable example of this temporal dynamics. However, most quantitative models of memory treat memories as static configurations, neglecting the temporal unfolding of the retrieval process. Here, we introduce a continuous attractor network model with a memory-dependent asymmetric component in the synaptic connectivity, which spontaneously breaks the equilibrium of the memory configurations and produces dynamic retrieval. The detailed analysis of the model with analytical calculations and numerical simulations shows that it can robustly retrieve multiple dynamical memories, and that this feature is largely independent of the details of its implementation. By calculating the storage capacity, we show that the dynamic component does not impair memory capacity, and can even enhance it in certain regimes.

eLife digest

When we recall a past experience, accessing what is known as an ‘episodic memory’, it usually does not appear as a still image or a snapshot of what occurred. Instead, our memories tend to be dynamic: we remember how a sequence of events unfolded, and when we do this, we often re-experience at least part of that same sequence. If the memory includes physical movement, the sequence combines space and time to remember a trajectory. For example, a mouse might remember how it went down a hole and found cheese there.

However, mathematical models of how past experiences are stored in our brains and retrieved when we remember them have so far focused on snapshot memories. ‘Attractor network models’ are one type of mathematical model that neuroscientists use to represent how neurons communicate with each other to store memories. These models can provide insights into how circuits of neurons, for example those in the hippocampus (a part of the brain crucial for memory), may have evolved to remember the past, but so far they have only focused on how single moments, rather than sequences of events, are represented by populations of neurons.

Spalla et al. found a way to extend these models, so they could analyse how networks of neurons can store and retrieve dynamic memories. These memories are represented in the brain as ‘continuous attractors’, which can be thought of as arrows that attract mental trajectories first to the arrow itself, and once on the arrow, to the arrowhead. Each recalled event elicits the next one on the arrow, as the mental trajectory advances towards the arrowhead. Spalla et al. determined that memory networks in the hippocampus of mammals can store large numbers of these ‘arrows’, up to the same amount of ‘snapshot’ memories predicted to be stored with similar models.

Spalla et al.’s results may allow researchers to better understand memory storage and recall, since they allow for the modelling of complex and realistic aspects of episodic memories. This could provide insights into processes such as why our minds wander, as well as having implications for the study of how neurons physically interact with each other to transmit information.

Introduction

The temporal unfolding of an event is an essential component of episodic memory. When we recall past events, or we imagine future ones, we do not produce static images but temporally structured movies, a phenomenon that has been referred to as 'mental time travel' (Eichenbaum and Cohen, 2004; Tulving, 2002).

The study of the neural activity of the hippocampus, known for its first-hand involvement in episodic memory, has provided many insights on the neural basis of memory retrieval and its temporal dynamics. An interesting example is the phenomenon of hippocampal replay, that is the reactivation, on a compressed time scale, of sequences of cells active in previous behavioral sessions. Replay takes place during sharp wave ripples, fast oscillations of the hippocampal local field potential that are particularly abundant during sleep and restful wakefulness (Buzsáki et al., 1983; Buzsáki et al., 1992). Indeed, replay has been observed during sleep (Skaggs and McNaughton, 1996; Nádasdy et al., 1999), inter-trial rest periods (Foster and Wilson, 2006; Jackson et al., 2006), and during still periods in navigational tasks (Dupret et al., 2010; Pfeiffer and Foster, 2013). Replay activity has been hypothesized to be crucial for memory consolidation (O'Neill et al., 2010) and retrieval (Karlsson and Frank, 2009), as well as for route planning (Pfeiffer and Foster, 2013; Ólafsdóttir et al., 2018).

A temporally structured activation takes place also before the exposure to an environment (Dragoi and Tonegawa, 2011), a phenomenon known as preplay, and a recent study showed that this dynamical feature emerges very early during development, preceding the appearance of theta rhythm (Farooq and Dragoi, 2019) in the hippocampus. The fact that hippocampal sequences are present before the exposure to the environment suggests that their dynamical nature is not specific to a role in spatial cognition, but is inherent to hippocampal operation in general. Moreover, in a recent study Stella et al., 2019 have shown that the retrieved sequences of positions during slow wave sleep are not always replaying experienced trajectories, but are compatible with a random walk on the low dimensional manifold that represents the previously explored environment. This suggests that what is essential are not the sequences themselves, but the tendency to produce them: neural activity tends to move, constrained to abstract low-dimensional manifolds which can then be recycled to represent spatial environments, and possibly non-spatial ones as well. This dynamic nature extends to multiple timescales, as suggested by the observation that the neural map of the same environment progressively changes its component cells over time (Ziv et al., 2013).

Low-dimensional, dynamic activity is not constrained to a single subspace: replay in sleep can reflect multiple environments (Dragoi and Tonegawa, 2013; Gridchyn et al., 2020), the content of awake replay reflects both the current and previous environments (Karlsson and Frank, 2009), and during behavior fast hippocampal sequences appear to switch between possible future trajectories (Kay et al., 2020). Further evidence comes from a recent study with human participants learning novel word pair associations (Vaz et al., 2020). The study shows that the same pair-dependent neural sequences are played during the encoding and the retrieval phase.

A similar phenomenon – a dynamic activity on low dimensional manifolds – is present in memory schemata, cognitive frameworks that constrain and organize our mental activity (Ghosh and Gilboa, 2014), and have been shown to have a representation in the medial temporal lobe (Baraduc et al., 2019). Yet another example of dynamical, continuous memories is offered by motor programs, which have been described as low-dimensional, temporally structured neural trajectories (Shenoy et al., 2013; Gallego et al., 2017; Oztop and Arbib, 2002), or as dynamical flows on manifolds (Huys et al., 2014; Pillai and Jirsa, 2017).

We refer to these objects as dynamical continuous attractors, since they involve a continuous subspace that constrains and attracts the neural activity, and a dynamical evolution in this subspace. Figure 1 schematically illustrates the concept of dynamical continuous attractors and their possible role in some of the neural processes described above.

Schematic illustration of dynamic continuous attractors as a basis of different neural processes.

Top row: a scheme of continuous attractive manifolds, with a dynamic component in 1D (a), 2D (b) and 3D (c). The neural activity quickly converges on the attractive manifold (dotted arrows), then slides along it (full arrows), producing a dynamics that is temporally structured and constrained to a low dimensional subspace. Bottom row: multiple dynamic memories could be useful for route planning (top left), involved in mind wandering activity (bottom left) or represent multiple learned motor programs (right).

In most cases, computational analyses of low-dimensional neural dynamics are not concerned with memory, and focus on the description of the features of single attractors, more than on their possible coexistence. On the other hand, mechanistic models of memory usually neglect dynamical aspects, treating memories as static objects, either discrete (Amit et al., 1985) or continuous (Battaglia and Treves, 1998; Monasson and Rosay, 2013; Samsonovich and McNaughton, 1997; Spalla et al., 2019). The production of sequences of discrete memories can be implemented with a heteroassociative component (Sompolinsky and Kanter, 1986), usually dependent on the time integral of the instantaneous activity, that brings the network out of equilibrium and to the next step in the sequence. A similar effect can be obtained with an adaptation mechanism in a coarse grained model of cortical networks (Kropff and Treves, 2005), with the difference that in this case the transitions are not imposed, but driven by the correlations between the memories in so-called latching dynamics (Russo et al., 2008; Kang et al., 2017). Moreover, adaptation-based mechanisms have been used to model the production of random sequences on continuous manifolds (Azizi et al., 2013), and shown to be crucial in determining the balance between retrieval and prediction in a network describing CA3-CA1 interactions (Treves, 2004). In the case of continuous attractor networks, movement can be induced also by mechanisms that integrate an external velocity input and make use of asymmetric synaptic strengths. Models of this kind have been used for the description of head direction cells (Zhang, 1996), spatial view cells (Stringer et al., 2005), and grid cells (Fuhs and Touretzky, 2006; Burak and Fiete, 2009), and can represent simultaneously the positions of multiple features and their temporal evolution (Stringer et al., 2004). In the simplest instantiation, these systems do not necessarily reflect long-term memory storage: the activity is constrained on a single attractive manifold, which could well be experience independent.

Here, we propose a network model able to store and retrieve multiple independent dynamic continuous attractors. The model relies on a map-dependent asymmetric component in the connectivity that produces a robust shift of the activity on the retrieved attractive manifold. This connectivity profile is conceived to be the result of a learning phase in which the mechanism of spike timing dependent plasticity (STDP) (Markram et al., 1997) produces the asymmetry. Crucially, the asymmetry is not treated here as a 'pathological' feature, assumed to level out in the limit of long learning, but as a defining trait of the stored memories. The balance between two components – one symmetric and trajectory-averaged, the other asymmetric and trajectory-dependent – is explicit in the formulation of the model, and allows to study their effects on memory storage.

In what follows we develop an analytical framework that allows to derive the dependence of important features of the dynamics, such as the replay speed and the asymmetry of the activity cluster, as a function of the relevant parameters of the model. We show with numerical simulations that the behavior of the model is robust with respect to its details, and depends weakly on the shape of the interactions. Finally, we estimate the storage capacity for dynamical memories and we find it to be of the same order of the capacity for static continuous attractors, and even higher in some regimes.

Modeling framework

A mechanistic model for dynamic retrieval

The model we consider is a continuous attractor neural network, with an additional anti-symmetric component in the connectivity strength. We consider a population of N neurons, with recurrent connectivity described by an interaction matrix Jij, whose entries represent the strength of the interaction between neuron i and j. The activity of each neuron is described by a positive real number Vi+ representing its instantaneous firing rate. The dynamic evolution of the network is regulated by the equations:

(1) τVit+Vi=g[(jiJijVj-h0)]+

where []+ is the threshold linear activation function

(2) [x]+=xθ(x)

with the gain g modulating the slope and the Heaviside step function θ() setting to zero sub-threshold inputs. The first term on the right hand side of Equation 1 represent the excitatory inputs provided to neuron i from the rest of the network through recurrent connections. The threshold h0 and the gain g are global parameters that regulate the average activity and the sparsity of the activity pattern (Treves, 1990).

In numerical simulations, these parameters are dynamically adjusted at each time step to constrain the network to operate at a certain average activity (usually fixed to one without loss of generality) and at a certain sparsity f, defined as the fraction of active neuron at each time (see appendix A). The connectivity matrix J of the network encodes a map of a continuous parameter x spanning a low-dimensional manifold, e.g. the position in an environment. To do so, each neuron is assigned a preferential firing location xi in the manifold to encode, and the strength of the interaction between pairs of neuron is given by a decreasing, symmetric function of the distance between their preferred firing locations

(3) JijKS(|xi-xj|).

This shape of the interactions is a typical one in the framework of continuous attractor neural networks (Samsonovich and McNaughton, 1997; Battaglia and Treves, 1998; Tsodyks, 1999) and is thought to come from a time-averaged Hebbian plasticity rule: neurons with nearby firing fields will fire concurrently and strengthen their connections, while firing fields that are far apart will produce weak interactions. The symmetry of the function KS, usually called interaction kernel, ensures that the network reaches a static equilibrium, where the activity of the neurons represents a certain position in the manifold and, if not pushed, remains still.

The shift mechanism

The assumption of symmetric interactions neglects any temporal structure in the learning phase. In case of learning a spatial map, for example, the order in which recruited neurons fire along a trajectory may produce an asymmetry in the interactions as a consequence of Spike Timing Dependent Plasticity (Markram et al., 1997), that requires the postsynaptic neuron to fire after the presynaptic one in order to strengthen the synapse. This phenomenon can be accounted for in the definition of the interaction kernel. Any asymmetric kernel can be decomposed in two contributions:

(4) K(|xi-xj|)=KS(|xi-xj|)+γKA(|xi-xj|)

where KS is the usual symmetric component and KA is an anti-symmetric function (KA(xi-xj)=-KA(xj-xi)). The parameter γ regulates the relative strength of the two components. The presence of KA will generate a flow of activity along the direction of asymmetry: neuron i activates neuron j that, instead of reciprocating, will activate neurons downstream in the asymmetric direction. Mechanisms of this kind have been shown to produce a rigid shift of the encoded position along the manifold, without loss of coherence (Zhang, 1996; Burak and Fiete, 2009; Fuhs and Touretzky, 2006). In the quantitative analysis that follows we will concentrate, when not stated otherwise, on a kernel K with the exponential form

(5) K(|xi-xj|)=e-|xi-xj|+γsign((xi-xj)n)e-|xi-xj|/ξ

where n is a unit vector pointing in the – constant – direction along which the asymmetry is enforced, and ξ is the spatial scale of the asymmetric component, which is fixed to one where not explicitly stated otherwise. Moreover, we make the simplifying assumption of periodic boundary conditions on the manifold spanned by x, such that the dynamics follows a periodic cycle. Both the kernel form and the periodic boundary conditions simplify the analytical description of the model, but are not a required feature of the model. In fact, all the results presented hold for a large class of interaction kernels, and the boundary conditions can be modified (e.g. with the introduction of interactions between different memories) without compromising the functionality of the network. Both points will be addressed in the analyses of the model in the next sections.

Results

Asymmetric recurrent connections produce dynamic retrieval

The spontaneous dynamics produced by the network is constrained to the low dimensional manifold codified in the connectivity matrix and spanned by the parameter x. The short-range interactions and the uniform inhibition enforced by the firing threshold h0 produce a localized 'bump' of activity in the manifold. The presence of an asymmetry in the connection strengths prevents the system from remaining in a stationary equilibrium. Instead, it generates a steady flow of activity in the direction of the asymmetry.

This flow is illustrated in Figure 2 (a),(b) and (c), obtained with numerical simulation of a network encoding a one-, two-, or three-dimensional manifold, respectively. In the simulation, each neuron is assigned to a preferential firing location xi in the manifold to encode, and the plots show the activity of the network organized according to this disposition. The activity of the population clusters in a bump around a certain position at each time point (t1t2, and t3), and the bump shifts by effect of the asymmetric component of the interactions. In this way, the neural population collectively encodes an evolving coordinate on the manifold spanned by x. The coherence of the representation is not affected by the presence of the asymmetric term: the movement of the activity bump happens without dissipation.

Dynamic retrieval of a continuous manifold.

First row: each plot presents three snapshots of the network activity at three different times (t1, t2 and t3), for a system encoding a one dimensional (a), two dimensional (b) and three dimensional (c) manifold. In (c), activity is color-coded (blue represents low activity, red is high activity, silent neurons are not plotted for better readability). In all cases, the anti-symmetric component is oriented along the x axis. (d) Dependence of the speed on γ and f. Dots are data from numerical simulations, full lines are the fitted curves. (e) Retrieval of two crossing trajectories. Black arrows represent the two intersecting encoded trajectories, each parallel to one of the axis. Full colored lines show the trajectories actually followed by the center of mass of the activity from the same starting point. Blue curve: low γ, the activity switches trajectories when it reaches the crossing point. Orange curve: high γ, successful crossing. In both cases ξ=10. The blue and the orange insets show the activity bumps in the corresponding cases; the top-right inset shows the dependence of the value γ*, required for crossing, on ξ.

The speed of movement of the bump is modulated by the value of the parameter γ and by the sparsity level of the representation f, that is the fraction of neurons active at each given time during the dynamics. The dependence of the speed on these two parameters is illustrated in 2(d). Stronger asymmetry (high γ) produces a faster shift. Interestingly, the sparsity value f acts as a modulator of the influence of γ: sparser representations move more slowly than dense ones.

While γ describes a feature of the synaptic interactions, determined during the learning phase and relatively fixed at the short timescales of retrieval, f can be instantaneously modulated during retrieval dynamics. A change in the gain or the excitability of the population can be used to produce dynamic retrieval at different speeds. Thus, the model predicts an interaction between the sparsity and the speed of the reactivation of a continuous memory sequence, with increased activity leading to faster replay. It is worth noting, however, that the direction of the dynamics is fixed with J: the model is able to retrieve either forward or backward sequences, but not alternate between them.

The dependence of the retrieval speed s on γ and f is well described by the approximate functional form

(6) s(γ,f)=Aγfbγ+cf+dγf+eγf2

This dependence is shown in 2(b), where the dots are the values obtained with numerical simulations and the full curves the fitted relationship. The full understanding of the nature of this functional form remains an open challenge for future analysis. As we will see in the next section, the analytical solution of the model yields the same form to describe the dependence of speed on γ and f, but a closed-form solution is still lacking.

In the model presented, the asymmetry in the interactions is enforced uniformly along a single direction also for two- and three-dimensional manifolds, representing the case in which neural dynamics follows a forced trajectory along one dimension, but is free to move without energy costs along the others. However, the same mechanism can be used to produce one-dimensional trajectories embedded in low-dimensional manifolds, with the introduction of a positional dependence in the direction of the asymmetry (Blum and Abbott, 1996). In this case, an interesting problem is posed by the intersection of two trajectories embedded in the same manifold: is the network, during the retrieval of one trajectory, able to cross these intersections, or do they hinder dynamical retrieval? The investigation of the full phenomenology of position-dependent asymmetric kernels with intersecting trajectories is beyond the scope of the present work, but we present in Figure 2(e) a numerical study of a minimal version of this problem, with two orthogonal trajectories (Figure 2(e), black arrows) embedded in a 2D manifold and memorized simultaneously in the network. Notice that in this case the two trajectories are parallel to the main axis of the square environment, but they do not need to be: any pair of orthogonal trajectories will behave in the same manner. When the network is cued to retrieve the horizontal trajectory, the behavior at the intersection depends on the strength γ and scale ξ of the asymmetric component. At low γ, the dynamics spontaneously switch trajectory at the intersection (Figure 2(e), blue curve), while for γ sufficiently large the retrieval of the horizontal trajectory is successful (Figure 2(e), orange curve). The value γ* required for a successful crossing depends on the spatial scale ξ: larger ξ allow for crossing with lower values of γ, as shown in the top-right inset of Figure 2(e), in which γ*(ξ) is plotted. Intuitively, the ability of the network to retrieve crossing trajectories depends on the shape of the activity bump, which needs to be sufficiently elongated in the direction of retrieval for the successful crossing of the intersection. The blue and orange insets of Figure 2(e) show the difference in shape of the bump in the case of a trajectory switch (blue) and successful crossing (orange).

Analytical solution for the single manifold case

The simplicity of continuous attractor models often allows to extract important computational principles from their analytical solution (Wu et al., 2008; Fung et al., 2010). In our case, the dynamic behavior of the system and its features can be fully described analytically with a generalization of the framework developed by Battaglia and Treves, 1998. For this purpose, it is easier to formulate the problem in the continuum, and describe the population activity {Vi} by its profile V(x) on the attractive manifold parametrized by the coordinate x, and the dynamical evolution as a discrete step map, equivalent to Equation 1.

(7) V(x,t+1)=g[h(x,t)]+
(8) h(x,t)=𝑑xK(x-x)V(x,t)-h0

The requirement of a rigid shift of population activity is then imposed by setting the activity at time t+1 to be equal at the activity at time t, but translated by an amount Δx=τsn, proportional to the speed s of the shift and in the direction n of the asymmetry in the connections. The timescale τ sets the time unit in which the duration of the evolution is measured and does not have an impact on the behavior of the system.

The activity profile V(x) is then found as the self-consistent solution to the integral equation

(9) V(x+Δx)=g[𝑑xK(x-x)V(x)-h0]+

Equation 9 is valid in general. We will focus here, for the explicit derivation (reported in appendix C), on the case of a one dimensional manifold with an exponential interaction kernel

(10) K(x-x)=e-|x-x|+γsign(x-x)e-|x-x|

In this case, the activity bump will take the form:

(11) V(x)={Cek1xcos(k2x)+gθ1-2gif -RxR0if -R>x or x>R

The parameters k1=k1(γ,s) and k2=k2(γ,s) determine the properties of the solution and they depend on the values of γ and speed s=Δx/τ.

k2 is related to the bump width by the relation

(12) R=π2k2

where R is the point at which V(x)=0. k1 is related to the asymmetry of the bump: in the limit case γ=0, s=0 (Figure 3(a), first column) k1=0, and we recover the cosine solution of the symmetric kernel case studied in Battaglia and Treves, 1998. Larger k1 values result in more and more asymmetric shapes (Figure 3(a), second and third columns).

Analytical solution of the model.

(a) The shape of the bump for increasing values of γ. (b) Dependence of the sparsity f on the gain g of the network. (c) Dependence of the speed of the shift on γ, at different values of sparsity. Dots show the numerical solution (note some numerical instability at low f and γ), full curves are the best fits.

From this analytical solution, we can determine the dependence of the speed s on the asymmetry strength γ and on the sparsity f=2R/L (note that in the continuum case the fraction of active neurons is given by the ratio between the bump size 2R and the manifold size L). The sparsity f is modulated by the value of the gain g, as shown in Figure 3(b): a larger gain in the transfer function corresponds to a sparser activity. The exact relation s(γ,f) can be obtained from the numerical solution of a transcendental equation (see appendix C), and can be approximated with a functional shape analogous to the one used for the simulated network:

(13) s(γ,f)=Aγfbγ+cf+dγf+eγf2

The full transcendental solution and the fitted curves are reported in Figure 3(c).

Dynamic retrieval is robust

The analytical solution of the model shows that the network performs dynamical retrieval for all values of the asymmetry strength γ, and that this parameter influences the retrieval speed and the shape of the activity bump. To further investigate the robustness of dynamical retrieval to parameter changes, we investigate with numerical simulations the behavior of the model with respect to another important parameter: the scale ξ of the anti-symmetric component. We run several dynamics of a network with interaction kernel given by

(14) K(d)=e-d+γsign(d)e-d/ξ

varying the parameters γ and ξ and measuring the retrieval speed, the peak value of the activity bump and its skeweness. The joint effects of γ and ξ are shown in Figure 4. In the whole range of parameters analyzed, spanning four orders of magnitude for both parameters, the network was able to produce dynamic retrieval. γ and ξ affect the speed of the shift, the peak values of the activity distribution and the skewness of the activity bump, without hindering network functionality. Moreover, all these feature vary gradually and mildly with the parameters values, producing dynamical behavior qualitatively similar in the full parameter range. This analysis shows that dynamical retrieval does not require any fine tuning of network parameters, but relies on the assumption of an exponential shape for the interaction kernel. How robust is the behavior of the network to the details of the kernel shape?

Dynamical retrieval in a wide range of parameters.

Effect of the kernel strength γ and its spatial scale ξ, in the case of the exponential kernel K(d)=e-|d|+γsign(d)e-|d|/ξ .(a) Retrieval speed (b) Peak value of the activity (c) Skewness of the activity bump.

We addressed this question by simulating the network dynamics with alternative kernel choices. We kept fixed the symmetric component, and explored three different anti-simmetric shapes: a gaussian-derivative shape (Figure 5a), a sinusoidal shape (Figure 5b) and a double step function (Figure 5c). Each of these simulations produced the same retrieval dynamics (a stable bump shifting at constant speed), the only effect of the kernel shape being on the details of the shape of the activity bump (Figure 5, bottom row). This shows that the dynamic retrieval mechanism, much like standard continuous attractors, is robust with respect to the precise shape of the interactions. Importantly, no particular relationship is required between the shape of the symmetric and the anti-symmetric components of the kernel.

Different interaction kernels produce similar behavior.

Three examples of dynamics with the same symmetric component and three different anti-symmetric components. Top row: shape of the anti-symmetric component KA. Bottom row: three snapshots of the retrieval dynamics for the corresponding KA. (a) Gaussian derivative; (b) Sinusoidal; (c) Anti-symmetric step function, θ*=θ(d)θ(1-d)-θ(-d)θ(d-1).

A dynamical memory: storing multiple manifolds

We described in detail the behavior of a neural network with asymmetric connectivity in the case of a single manifold encoded in the synaptic connectivity. For the network to behave as an autoassociative memory, however, it needs to be able to store and dynamically retrieve multiple manifolds. This is possible if we construct the interaction matrix Jij as the sum of the contributions from p different, independently encoded manifolds:

(15) Jij=1Nμ=1pK(xiμ-xjμ)

Here, each xiμ represent the preferred firing location of neuron i in the manifold μ, and K is the same interaction kernel as in Equation 4, containing a symmetric and anti-symmetric component.

The resulting dynamics show multiple continuous attractors, corresponding to the stored manifolds. Given an initial configuration, the networks rapidly converges to the nearest (i.e. most correlated) attractor, forming a coherent bump that then moves along the manifold as a consequence of the asymmetric component of the connectivity. The same dynamics, if projected on the other unretrieved manifolds, appear as random noise. This is illustrated in Figure 6 obtained with numerical simulations of a network encoding three different manifolds (of dimension one in (a), dimension two in (b)), and dynamically retrieving the first one. How does this shifting activity bump relate to the activity of single cells? To clarify this aspect we simulated an electrophysiological recording from the dynamical retrieval of Figure 6(a). We selected a random subset of 15 of the 1000 cells of the network, and generated spike trains using a poisson point process with an instantaneous firing rate proportional to the activity level of each cell yielded by the retrieval dynamics at each time step, plus a small noise. With this procedure we obtain the spike trains of each cell, that we can visualize with a rasterplot (Figure 6c). We can sort the cells according to their firing field position on each of the three manifolds (if these were, e.g., linear tracks, this would correspond to sorting according to place field positions in each of the tracks). When ordered as in the retrieved manifold, the recorded cells show a structured pattern that is considered the hallmark of sequential activity in experimental studies (Figure 6c, first column). If the same spikes are ordered according to the unretrieved manifolds, the pattern is lost (Figure 6c, second and third column), indicating that the population activity is retrieving the dynamical structure of the first manifold specifically.

Dynamic retrieval in the presence of multiple memories.

(a) In one dimension (b) In two dimensions. Each row represents a snapshot of the dynamics at a point in time. The activity is projected on each of the three attractors stored in the network. In both cases, the first attractor is retrieved, and the activity organizes in a coherent bump that shifts in time. The same activity, projected onto the two non-retrieved maps looks like incoherent noise ((a) and (b), second and third columns). (c) Spiking patterns from a simulated recording of a subset of 15 cells in the network. When cells are sorted according to their firing field on the retrieved manifold (first column), they show sequential activity. The same activity is scattered if looked from the point of view of the unretrieved manifolds (second and third columns).

Multiple dynamic manifolds can be memorized and retrieved by the network, with different speeds. Figure 7 (b) shows the result of the numerical simulation of a network with five different one-dimensional manifold stored in its connectivity matrix, each encoded with a different value of γ (see appendix D). These manifold are dynamically retrieved by the network at different speeds, depending on the corresponding γ. This allows the model to simultaneously store memories without the constraint of a fixed dynamical timescale, an important feature for the description of biological circuits that need to be able to operate at different temporal scales.

Retrieval speed and memory interactions.

(a) Multiple mainfolds with different velocity can be stored in a network with manifold-dependent asymmetric connectivity (b) The retrieved position at different timesteps during the retrieval dynamics of five different manifolds, stored in the same network, each with a different value of γ. (c) Manifolds memorized in the same network can be linked together (d) Sequential retrieval of five manifolds. Top row: overlap, measuring the overall coherence with the manifold, as a function of time. Bottom row: retrieved position in each manifold as a function of time.

Different memories stored in the same neural population can interact with each other, building more retrieval schemes in which, for example, the retrieval of a memory cues the retrieval of another one. To investigate this possibility, we have incorporated in the model a mechanism for interaction between memories, in which the endpoint of a dynamical, one-dimensional manifold elicits the activation of the start point of a different one (see Appendix E). This results in the sequential retrieval of multiple memories, one after the other, as illustrated in Figure 7 (d). The top row shows the evolution in time of the overlaps mμ:

(16) mμ(t)=1N2i,jKS(xiμ-xjμ)Vi(t)Vj(t)

These order parameters quantify the coherence of the population activity V(t) with each of the manifolds. Localized activity in manifold μ results in a large mμ, while a low mμ corresponds to an incoherent scattering of the activity. The network retrieves the manifolds in sequence, one at a time, following the instructed transitions encoded in its connectivity. The all-or-nothing behavior of the coherence parameters segments the continuous dynamics of the network into a sequence of discrete states.

The bottom row shows the evolution of the retrieved position, given in each manifold by the center of mass:

(17) <x>μ(t)=1NixiμVi(t)

The dynamic runs across the retrieved manifold, from its beginning to its end, then jumps to next one and repeats the process. Note that the position in each of the un-retrieved manifold fluctuates around L/2, as a consequence of the incoherence of the activity. Within each of the retrieved manifolds, the dynamic retains its continuous nature in the representation of the evolving position.

This sequential dynamic goes beyond the simple cued retrieval of independent memories that is the focus of most autoassociative memory models, and provides an example of a hybrid computational system, encoding both continuous and discrete features.

The interaction mechanism introduced here provides the opportunity to investigate the effect of more complex interactions than the simple memory chain presented here. We present here this first example as a proof of principle of the possibility of storing interacting dynamical memories, and will proceed to the investigation of more complex structures (e.g. interaction networks, probabilistic interactions, etc.) in future studies.

Storage capacity

The number of maps that can be stored and retrieved by an attractor network of this kind is typically proportional to the number of inputs per neuron C (Treves and Rolls, 1991). The memory load α=p/C crucially determines the behavior of the system: when α is increased above a certain threshold value αc, the network is not able to retrieve any of the stored memories, falling instead into a disordered state. Therefore it is the magnitude of αc, that is the storage capacity of the system, that determines how effectively it can operate as a memory. To estimate the storage capacity of dynamic continuous attractors, and to investigate how it is impacted by the presence of asymmetric connections, we proceed along two complementary paths.

In the case a fully connected network, where the analytical tools developed for equilibrium systems are not applicable, we take advantage of the fact that numerical simulations can be effective for the estimation of the capacity, since the number of connections per neuron C (the relevant parameters in the definition of the storage capacity αc=p/C) coincides with the number of neurons, minus one. For a highly diluted system, on the other hand, the number of neurons is much larger than C, making the simulation of the system very difficult in practice. We then resort to an analytical formulation based on a signal-to-noise analysis (Battaglia and Treves, 1998), that exploits the vanishing correlations between inputs of different neurons in a highly diluted network, and does not require symmetry in the connectivity (Derrida et al., 1987). The quantification of the effect of loops in the dense connectivity regime, developed in Shiino and Fukai, 1992 and Roudi and Treves, 2004 for the case of static, discrete attractors, is beyond the scope of the present work and remains an interesting open direction.

In both the fully connected and the highly diluted case we study the dependence of the capacity on two important parameters: the map sparsity, that is the ratio between the width of the connectivity kernel (fixed to one without loss of generality) and the size L of the stored manifolds, and the asymmetry strength γ. Note that the map sparsity 1/L is different from the activity sparsity f: the former is a feature of the stored memories, that we will treat as a control parameter in the following analysis; the latter is a feature of the network dynamics, and its value will be fixed by an optimization procedure in the calculation of the maximal capacity.

Analytical calculation of the capacity in the highly diluted limit

The signal-to-noise approach we follow, illustrated in details in Battaglia and Treves, 1998, involves writing the local field hi as the sum of two contributions: a signal term, due to the retrieved – ‘condensed’ – map, and a noise term consisting of the sum of the contributions of all the other, ‘uncondensed’ maps. In the diluted regime (C/N0), these contributions are independent and can be summarized by a Gaussian term ρz, where z is a random variable with zero mean and unit variance. In the continuous limit, assuming without loss of generality that map μ=1 is retrieved, we can write:

(18) h(x1)=gL𝑑x1K(x1-x1)V(x1)+ρz

The noise will have variance:

(19) ρ2=αyL2K2(x-x)

where L is the size of the map, K2(x-x) is the spatial variance of the kernel and

(20) y=1NiVi2

is the average square activity.

We can write the fixed point equation for the average activity profile m1(x), incorporating the dynamic shift with an argument similar to the one made for the single map case:

(21) m1(x+Δx)=g+Dz(h(x)-h0)

where Dz=(e-z2/2/2π)dz and +f(x)𝑑x=f(x)θ(x)𝑑x. The average square activity y, entering the noise term, reads 

(22) y=g2L𝑑x+Dz(h(x)-h0)2

Introducing the rescaled variables 

(23) w=-h0ρ
(24) v(x)=m1(x)ρ

And the functions

(25) 𝒩(x)=xΦ(x)+σ(x)
(26) (x)=(1+x2)Φ(x)+xσ(x)

where Φ(x) and σ(x) are the Gaussian cumulative and the Gaussian probability mass function respectively, we can rewrite the fixed-point equation as 

(27) v(x+Δx)=g𝒩(𝑑xK(x-x)v(x)+w)
(28) y=ρ2g2dxL(𝑑xK(x-x)v(x)+w)

Substituting Equation 28 in the expression for the noise variance 19 we obtain

(29) 1α=g2LK2𝑑x(𝑑xK(x-x)v(x)+w)

If we are able to solve Equation 27 for the rescaled activity profile v(x), we can use Equation 29 to calculate α. We can then maximize α with respect to g and w: this yields the maximal value αc for which retrieval solutions can be found.

These equations are valid in general and have to be solved numerically. Here we present the results for the case of one-dimensional manifolds and interactions given by the exponential kernel of Equation 36. In this case, we have

(30) K2(x-x)=(1+γ2)KS2(x-x).

where KS(x-x)=e-|x-x| is the symmetric component of the kernel. A simple approximation, illustrated in appendix F along with the detailed solution procedure, allows to decouple the dependence of αc on γ and L, with the former given by the spatial variance given by Equation 30 and the latter by the solution of Equations 27 and 29 in the γ=0 case. We therefore have:

(31) αc(L,γ)αc(L,0)/(1+γ2)

The storage capacity is plotted in Figure 8(a) as a function of γ and L.

Storage capacity.

(a) Storage capacity of a diluted network: dependence on γ and 1/L (represented as log10(1/L)). (b) Storage capacity of a fully connected network: non monotonic dependence of the capacity on γ. Retrieval / no retrieval phase transition for different values of γ, obtained from simulations with N=1000, NS=10 and L=10, for 1D manifolds. Error bars show the standard error of the observed proportion of successful retrievals. The non-monotonic dependence of the capacity from γ can be appreciated here: the transition point moves toward the right with increasing γ up to γ1, then back to the left. (c) Storage capacity of a fully connected network as a function of map sparsity 1/L and asymmetry strength γ, for a one-dimensional and a two-dimensional dynamic continuous attractor.

For sparse maps and small values of the asymmetry, the capacity scales as

(32) αc-1ln(1/L)(1+γ2)

The scaling with 1/L is the same found by Battaglia and Treves, 1998 in the analysis of the symmetric case, as expected: for γ=0 the two models are equivalent.

The presence of asymmetry decreases the capacity, but does not have a catastrophic effect: the decrease is continuous and scales with a power of γ. There is therefore a wide range of values of asymmetry and map sparsity in which a large number of dynamic manifolds can be stored and retrieved.

Numerical estimation of the capacity for a fully connected network

To estimate the storage capacity for a fully connected network, we proceed with numerical simulations. For a network of fixed size N, and for given γ, L and number of maps p, we run a number of simulations NS, letting the network evolve from a random initial configuration. We consider a simulation to have performed a successful retrieval if the global overlap 

(33) mμ=1N2ijViVjKS(xiμ-xjμ)

that quantifies the coherence of the activity with map μ, is large for one map μ* (at least 95% of the overlap value obtained in the case of a single map) and low in all others maps μμ*. We then define the retrieval probability as pr=NR/NS, where NR is the number of observed retrievals.

We repeat the process varying the storage load, that is the number of stored manifolds p. As p is increased, the system reaches a transition point, at which the retrieval probability rapidly goes to zero. This transition is illustrated, for various values of γ, in Figure 8(b).

The number of maps pc at which the probability reaches zero defines the storage capacity αc(γ,L)=pc(γ,L)/N. Repeating this procedure for a range of values of γ and L, we obtain the plots shown in Figure 8(c), for networks encoding one dimensional and two dimensional dynamical memories.

The first thing that can be noticed is that the network can store a large number of maps in the fully connected case as well, for a wide range of γ and L. A network with size in the order of ten thousand neurons could store from tens up to hundreds of dynamical memories.

The capacity for one dimensional attractors is higher than the one for their two dimensional counterparts. This is in line with what was found for symmetric networks (Battaglia and Treves, 1998).

Finally, we see that the peak of the capacity is found not only for intermediate values of map sparsity – again in line with what is known from the symmetric case – but also for intermediate values of the coefficient γ. This shows that moderate values of asymmetry can be beneficial for the storage of multiple continuous attractors, a non-trivial phenomenon that may be crucial for the memory capacity of biological networks. In particular this suggests that the natural tendency of the neural activity to show a rich spontaneous dynamics not only does not hinder the possibility for multiple memories to coexist in the same population, but can be a crucial ingredient for the correct functioning of memory mechanisms.

Discussion

The results presented show how a continuous attractor neural network with memory-dependent asymmetric components in the connectivity can function as a dynamic memory. Our model is simple enough to be treated analytically, robustly produces dynamic retrieval for a large range of the relevant parameters and shows a storage capacity that is comparable to – and in some cases higher than – the capacity for static continuous attractors.

The analytical solution of the single manifold case shows that the interaction between the strength of the asymmetry and the velocity of the shift can be modulated by global features of the network activity such as its sparsity. This makes the network able to retrieve at different velocities in different regimes, without necessarily requiring short term synaptic modifications. The dependence of the retrieval speed from the sparsity of the activity yields a testable prediction in the context of hippocampal replay: faster reactivations are to be expected in association with an increase in the excitability of the population.

The insensitivity of the general features of the dynamics to the fine details of the shape of the interactions suggests that this mechanism could robustly emerge from learning or self organization processes in the presence of noise. The quantitative analysis of the learning process needed to effectively memorize low-dimensional dynamic manifolds is an interesting open direction, which goes beyond the scope of this work. However, the asymmetric Hebbian plasticity rule used here provides a simple and biologically realistic starting point.

Our analysis shows that the simple introduction of an aysmmetric Hebbian plasticity rule is sufficient to describe a dynamic memory able to store and retrieve multiple manifolds with different speeds, and it can incorporate interactions between them, producing the chained retrieval of a sequence of continuous memories.

The central result of the paper is the quantification of the storage capacity for dynamic continuous attractors, that we find to be large in magnitude, only mildly impacted by the asymmetry in diluted networks, and even higher than the capacity for static attractors in fully connected networks with moderate degrees of asymmetry. The storage capacity of out of equilibrium continuous attractors has been calculated, in a different scenario, by Zhong et al., 2020. The authors considered the case of an external signal driving the activity bump along the attractor, in a network of binary neurons, and proceeded to calculate the storage capacity with several assumptions that allowed to model the interference of multiple maps as thermal noise. Interestingly, their main result is broadly compatible with what we show here: in the highly diluted regime the velocity of the external signal has a mild – detrimental – effect on the capacity. This hints that out of equilibrium effects could show some form of universality across different network models and implementations of the shift mechanism. Moreover, a high capacity for dynamical sequences has shown to be achievable also in the case of discrete items (Gillett et al., 2020). Together these results suggest that the introduction of a temporal structure is compatible with the functioning of autoassociative memory in recurrent networks, and they open the way to the use of attractor models for the quantitative analysis of complex memory phenomena, such as hippocampal replay and memory schemata.

The model we propose suggests that the tendency of the activity to move in the neural population is a natural feature of networks with asymmetric connectivity, when the asymmetry is organized along a direction in a low dimensional manifold, and that static memories could be the exception rather than the rule. Indeed, Mehta et al., 2000 have shown that place fields can become more asymmetric in the course of spatial learning, demonstrating that the idea that symmetry emerges from an averaging of trajectory-dependent effects (Sharp, 1991) does not always hold true. The structural role of the asymmetry has important implications for the analysis methods used to describe the activity of large populations of neurons, which often rely on the assumption of symmetry in the interactions (e.g. in the analysis of pairwise correlations) or equilibrium of the neural activity (e.g. the standard inverse Ising inference).

In most of the two- and three-dimensional cases analysed here, the asymmetry is constant along a single direction in each attractor. This can describe the situation in which the temporal evolution of the memory is structured along a certain dimension, and free to diffuse, without energy costs, in the remaining ones. The description of several one-dimensional trajectories embedded in a two dimensional or three dimensional space requires a position-dependent asymmetric component. A systematic analysis of this situation is left for future analysis. However, the simple case of two intersecting trajectories embedded in a 2D map, analysed here, provides a proof of concept that several intersecting trajectories can be correctly retrieved, provided that the activity bump is sufficiently elongated in the direction of the trajectory. A progressive elongation of the place fields in the running direction has been observed in rats running on a linear track (Mehta et al., 1997), and our analysis predicts that an analogous effect would be observed also in open-field environments, when restricting the analysis to trajectories in the same running direction.

Another challenge is posed by the evidence that replayed sequences can be organized both forward and backward in time (Foster and Wilson, 2006). The model in its current formulation can produce the retrieval of a given sequence either forward or backward, but cannot alternate between the two. This suggests that, if replay relies on asymmetric connections, the hippocampus would have to use different representations for the forward and the backward component. The fact that a change in reward uniquely modulates backward replay (Ambrose et al., 2016) provides some evidence in this direction, but this question remains open to experimental investigation.

The dynamical retrieval of the model generalizes, in the framework of attractor networks, the idea of cognitive maps, incorporating a temporal organization in the low-dimensional manifold encoding the structure of the memory. This feature is reminiscent of the idea of memory schemata – constructs that can guide and constrain our mental activity when we reminisce about the past, imagine future or fictional scenarios or let our minds free to wander (Ciaramelli and Treves, 2019). The use of the present model to describe memory schemata will require further steps, such as an account of the interaction between hippocampus and neocortex, and the expansion of the mechanism describing the transition between different dynamical memories. Nevertheless, the idea of dynamic retrieval of a continuous manifold and the integration of the model presented here with effective models of cortical memory networks (Boboeva et al., 2018) open promising perspectives.

Finally, the full analytical description of a densely connected asymmetric attractor network is a challenge that remains open, and can yield valuable insights on the workings of the neural circuits underlying memory.

Appendix 1

A Numerical simulations

Numerical simulations are performed with python code, available at: https://github.com/davidespalla/CADM (copy archived at swh:1:rev:65a5bdfa291840cf5bf10e1da48aadb0b316a445), Spalla, 2021.

In the single map case, to each of the N units (N=1000 in 1D, N=1600 in 2D) is assigned a preferential firing location xi on a regular grid spanning the environment with linear dimension L. From this preferred firing locations the interaction matrix Jij is constructed, with the formula:

(34) Jij=KS(|xi-xj|)+γKA(|xi-xj|)

The precise shape of the symmetric and anti-symmetric parts of the kernel are chosen differently in different simulations, according to the feature the analysis focused on, as specified in the main text. Once the network is assembled, the dynamics is initialized either with a random assignment of activity values to each unit in the range [0,1], or with a gaussian bump centered in the middle of the environment (note that, due to the periodic boundary conditions and the translational invariance of the connectivity, the choice of the starting point does not influence the outcome). The dynamics is then evolved in discrete time steps, with the iteration of the following operations:

  • Calculation of the local fields hi(t)=jJijVj(t-1)

  • Calculation of the activity values Vi(t)=g(hi(t)-h0)θ(hi(t)-h0)

  • Dynamic adjustment of the threshold h0 such that only the fN most active neurons remain active: h0=Vf, where j,Vj>Vf=Nf

  • Recalculation of the activity Vi(t) with the adjusted threshold

  • Dynamic adjustment of the gain g such that the mean activity {Vi(t)} is fixed to 1: g=g/{Vi(t)}

  • Recalculation of the activity Vi(t) with the adjusted gain

The adjustment of the parameters of the transfer function is enforced to constrain the network to operate at fixed sparsity f and fixed mean, set to one without loss of generality. The dynamics is iterated for a given number of steps (usually 200), large enough to assure the convergence to the attractive manifold (reached usually in < five steps) and the observation of the dynamical evolution on the manifold.

In the case of multiple maps, the implemented dynamical evolution is the same, but the interaction matrix is constructed with multiple assignments of the preferred firing locations xi, one for each of the p stored maps:

(35) Jij=μ=1p(KS(|xiμ-xjμ|)+γKA(|xiμ-xjμ|))

The multiple assignments of the preferred firing locations are performed by a random shuffling of the labels of the units before the assignment to the position on the regular grid spanning each map.

B Simulation of electrophysiological recordings

To simulate the recording from a subset of the network during dynamical retrieval, we constructed a network of 1000 neurons with three different manifolds encoded in its connectivity matrix. We simulated a dynamic of the network from an initial condition correlated with the first manifold, for T=400 time steps. Since we used circular manifolds, we then selected a chunk of the dynamics corresponding to a single lap on the circle, to have an easier scenario to compare with experimental work. We then simulated an experimental recording by selecting a random subset of 15 observed cells. We used the activity values of each cell during the dynamics as the instantaneous firing rate of a Poisson random process to generate the spiking activity, using a conversion of 1/25 to convert the arbitrary value of activity to a plausible range of firing rates, in the order of some Hertz. Spikes produced by this process are our simulated activity recording, and can be plotted according to the preferred firing position of each of the recorded cell in each of the three manifolds. The preferred firing position is supposed to be extracted, in an experimental setting, from the average rate maps of the recorded cells over many observations of the dynamics. In this way we obtain the plots reported in Figure 6(c).

C Analytical solution of the single map model in one dimension

To solve the integral Equation 9 in the case of a 1D manifold and the exponential kernel

(36) K(x-x)=e-|x-x|+γsign(x-x)e-|x-x|

we start by rewriting it as

(37) V(x+Δx)={g-RR𝑑xK(x-x)V(x)-h0,if xΩ0otherwise

where [-R,R],R>0 is a compact domain for which there exist a solution of Equation 9 that is zero at the boundary. This allows to exploit the fact that our threshold-linear system is, indeed, linear in the region in which V(x)>0.

We then differentiate twice to obtain the differential equation

(38) V′′(x+Δx)=V(x+Δx)+2gV(x)+2gγV(x)+gθ

This is a second order linear ODE, with constant coefficients. The presence of the shift term Δx inside the unknown function makes the equation non-trivial to solve. To solve it, we proceed in the following way: first, we look for a particular solution, that is easily found in the constant function

(39) Vc=gθ1-2g

Then, we consider the associated homogeneous equation, and look for a solution in the form V(x)=ekx, where k is a solution of the characteristic equation C(k)=0, with

(40) C(k)=k2ekΔx+2gγk+2g-ekΔx.

This trascendental equation has to be solved graphically in the complex domain, as shown in Appendix 1—figure 1.

Appendix 1—figure 1
Analytic solution of Equation (38).

The top row shows the graphical procedure to find the complex zeros of the characteristic C(k) given in (40), for three different values of γ. Black and red lines show the zeros of the real and imaginary part of C(k), respectively. Their intersections are the complex solutions to C(k)=0. The blue line represents the sparsity constraint ki=π/2R. The bottom row shows the corresponding solution shapes.

For each value of γ and Δx, the equation shows a pair of complex conjugate solutions

(41) k1,2*(γ,Δx)=kr(γ,Δx)±iki(γ,Δx)

The general solution of the equation will therefore have the form

(42) V(x)={Cekrxcos(kix)+gθ1-2gif -RxR0if -R>x or x>R

From Equation 42 we can see that the absolute value of ki is related to the width of the bump, and therefore to the sparsity of the solution, by the relation

(43) R=π2ki.

R does in turn depend only on the free parameter g, through the relation that can be derived in the symmetric case (γ=0, Δx=0)

(44) tan(2g-1R)=-2g-1

We can look for a solution with given sparsity f=2R/L (where L is the length of the manifold) by setting R=fL/2. Requiring the the sparsity to be fixed (i.e. setting R constant) while varying γ constrains the zeros of 40 to lie in the subspace ki=π/2R. This imposes a relation between γ and both the speed s=Δx/τ (related to the speed of the shift) and kr (related to the asymmetry of the shape of the solution). Varying R we can study the dependence of the speed on both γ and f.

D Storing multiple manifolds with different retrieval speeds

To investigate the possibility to store and dynamically retrieve manifolds at different speeds, we have performed numerical simulations of a network with recurrent connectivity given by the formula

(45) Jij=1Nμ=1p11+γμ[KS(xiμ-xjμ)+γμKA(xiμ-xjμ)]

Each manifold μ is encoded with a different asymmetry strength γμ. The normalization factor 1/(1+γμ) is added to ensure that each manifold contributes equally to the synaptic efficacies, and does not affect the ratio of strenghts of the symmetric and asymmetric components.

E Linking multiple manifolds together

In order to model the interaction between different stored manifolds, we add to the connectivity matrix an ”heteroassociative’ term, whose strength JHijμν is proportional to the distance between the preferred firing location xiμ of neuron i in manifold μ and the one of neuron j in manifold ν, shifted by the length L of the first manifold, which we denote by x~ν. This shift enforces the fact that the interaction happens between the end of the first manifold and the beginning of the second. Then, the connectivity matrix will be given by

(46) Jij=1Nμ=1pK(xiμ-xjμ)+μνGμνK(xiμ-x~jν)

With

(47) x~ν=xν+L

and Gμν=1 if there is a transition between μ and ν, and zero otherwise.

F Analytical calculation of αc in the highly diluted limit

To calculate the maximum capacity, we first need to solve Equation 27 numerically for v(x), for given g and w. The procedure illustrated here focuses, for the sake of analytical simplicity, on the case of a one-dimensional, exponential kernel

(48) K(x-x)=e-|x-x|-γsign(x-x)e-|x-x|

We start from equation 27

(49) v(x+Δx)=g𝒩(𝑑xK(x-x)v(x)+w)

First, following Battaglia and Treves, 1998 we rewrite it with the transformation 

(50) u(x)=𝒩-1(v(x)g)

obtaining 

(51) u(x+Δx)=g𝑑xK(x-x)𝒩(u(x))+w.

We then transform this integral equation in a differential one, by differentiating twice. We obtain

(52) u′′(x+Δx)+2gγΦ(u(x))u(x)+2g𝒩(u(x))-u(x+Δx)+w=0

where we have used the fact that 𝒩(x)=Φ(x). Equation 52 is a second order, nonlinear delayed differential equation. To solve it, it is not sufficient to impose an initial condition on a single point for the solution and the first derivative (i.e. something like u(x0)=u0,u(x0)=u0): we have to specify the value of the function and its derivative in an interval [x0,x0+Δx].

To do so, we reason that, if we want a bump solution, u(x) has to be finite for x± and cannot diverge. We then require the function to be constant (u(x)=u0, u(x)=0) before a certain value x0, whose value can be set arbitrarily without loss of generality.

The value u0, at γ=0 and Δx=0 determines the shape of u(x), as shown by the numerical solution presented in Appendix 1—figure 2. For u0<u* the solution will diverge at x, while for u0>u* it will oscillate. We are then left with a single value u0(g,w)=u*(g,w) for which the solution has the required form.

Appendix 1—figure 2
Solutions to Equation 52 for g=1, w=-1.8, γ=0, Δx=0.

Then, keeping u0 fixed, we can repeat a similar procedure to find Δx for different values of γ. Also in this case, the solution either diverges or oscillates, apart from a single value Δx*, for which the solution has the desired shape (see Appendix 1—figure 3). This eliminates the arbitrariness in the choice of Δx since it imposes, for given g and w, a relation Δx=Δx*(γ).

Appendix 1—figure 3
Solutions to Equation 52 for g=1, w=-1.8, γ=0.2.

We can then find the shape of the bump u(x) for given values of g, +w and γ, from which we can obtain the profile v(x)=g𝒩(u(x)) that we need for the calculation of the storage capacity. Some examples of the obtained profiles, for different values of γ, are shown in Appendix 1—figure 4.

Appendix 1—figure 4
Activity profile v(x), obtained for the same g=0.7 and w=-1.3, at different values of γ.

Plugging the obtained form of v(x) into Equation 29, we can calculate the capacity. The dependence of the capacity on γ is shown, for L=60, in Appendix 1—figure 5.

Appendix 1—figure 5
Dependence of the storage capacity on γ, for L=60.

The crosses show the full solution of Equations 27 and 29. The dashed line is obtained by taking the value of the capacity α(0) obtained with full solution at γ=0, and multiplying it by the scaling of the kernel variance (1+γ2). Full dots show the value of capacity obtained with the full solution and the contribution of the kernel variance factored out.

We can see from the full dots in the figure that the contribution of the integral in Equation 29 is remarkably constant in γ. This is due to the fact that the distortions of the bump shape induced by the presence of the asymmetry have a negligible effect on the average square activity y, whose value is dominated by the dependence on γ of the spatial variance of the kernel (Equation 19).

This allows us to approximate the value of the integral in Equation 29 with its value in the γ=0 case. We can then calculate the capacity as a function of γ and L by solving the symmetric case for different Ls, and then incorporating the dependence on γ given by the kernel variance:

(53) αc(L,γ)αc(L,0)/(1+γ2)

This approximation yields the results reported in the main text and in Figure 8

Data availability

The work did not generate any experimental dataset. The code used for numerical simulations is publicly available on Github (https://github.com/davidespalla/CADM (copy archived at https://archive.softwareheritage.org/swh:1:rev:65a5bdfa291840cf5bf10e1da48aadb0b316a445)).

References

  1. Book
    1. Eichenbaum H
    2. Cohen NJ
    (2004)
    From Conditioning to Conscious Recollection: Memory Systems of the Brain
    Oxford University Press.
    1. Sharp PE
    (1991)
    Computer simulation of hippocampal place cells
    Psychobiology 19:103–115.

Decision letter

  1. Adrien Peyrache
    Reviewing Editor; McGill University, Canada
  2. Laura L Colgin
    Senior Editor; University of Texas at Austin, United States
  3. Ulisse Ferrari
    Reviewer; Sorbonne Université, France

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

The study presents a new and elegant theoretical framework that generalizes memory storage and retrieval in neural networks to the dynamical case. Specifically, the authors show that imposing asymmetric interaction between units is sufficient to learn and retrieve sequential activation of these units, resembling the dynamics of hippocampal place cells during memory encoding and replay.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting your work entitled "Continuous attractors for dynamic memories" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Ulisse Ferrari (Reviewer #3).

Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that this manuscript will not be considered further for publication in eLife. However, as pointed out by the reviewers, your work is of potential interest for the field. You may thus consider to submit a new version if you think you can significantly improve the manuscript. In this case, please include the original manuscript reference number as well as the names of the senior (Laura L Colgin) and reviewer editor (Adrien Peyrache) in your cover letter.

The authors present a computational model that allows for storing and reactivating dynamical memory patterns on low dimensional manifolds. The model presented in this manuscript is simple and elegant. Synaptic couplings are decomposed into a symmetric component, which makes low dimensional manifolds attractive, and an asymmetric component, which produces dynamic retrieval. This is of particular interest as sequential activation of neurons (i.e. trajectories on a manifold) is ubiquitous in the brain, especially in the hippocampus. This work is thus an important contribution to the field, bridging the gap between theoretical and experimental studies. However, the reviewers have raised several major concerns regarding the details of the model and its presentation. Furthermore, it seems necessary to demonstrate that the present model works well beyond the one-dimensional case and/or to test predictions of the model with actual experimental data.

Reviewer #2:

I really like the simplicity of the learning rule. However, the article lack clarity in many fundamental sections, and it took me really long to understand important details (if I understood them). When the authors introduce the model, they introduce the inhibition term, but then they remove it as they claim it can be reabsorbed in h0. From the equations this claim is not correct, assuming that h0 does not depend on the values of Vs. It's only by reading the methods that I understood that they actually choose dynamically both g and h0 to enforce a certain level of sparseness. I would say that already in the main text without introducing the term multiplying b at all, it is only confusing.

Even more confusing is the introduction of x. They define x as "preferential firing location in the stimulus space". They never spoke about stimuli, and I think they don't need to. I believe I finally understood what they mean, but I had to read the rest of the article (a couple of times). The description of the phenomenology is also unclear. In Figure 2 they never explain that they use x (y and z) to identify a neuron. So basically V is plotted as a function of the neuron number, parametrized by x. I guess that they discretized x and then assigned each x to a neuron. They should be more explicit about this, especially considering the broad audience of eLife. Moreover, they claim that the dynamics occurs in a low dimensional manifold, but in all the examples I see, their trajectory is always 1D, even in the cases in which they have multiple variables (e.g. figure 2c). Also in the dynamic kernel section they determine the kernel only in the case of one variable x, and obviously a 1D trajectory. The kernel they get is a for a rigid shift, under the assumption that the kernel is exponential. What would happen in a more general case? What would be the phenomenology if the dynamics really moves along a 2D or a 3D manifold? They only briefly mention in the Discussion that they consider only a single direction (they should say it from the very beginning). Also, what kind of asymmetric kernels do they consider? Is it only a rigid shift, or it could be a curvilinear trajectory when the underlying manifold has 2 or more dimensions? What is the maximal dimensionality of the trajectory? I guess it depends on the specific form of the asymmetric kernel. I have the impression that the authors have an interesting system to study, but they report only very few aspects of its dynamics (and in a rather confusing way).

Finally, they mention a few predictions in the Discussion, but they're rather generic. For a journal like eLife, in the absence of real data, quantitative predictions are important. The introduction, which is nicely written, contains several examples of phenomena that could be described by the proposed model.

There are several misprints and references missing.

Reviewer #3:

Spalla, Cornacchia and Treves presents a novel class of neural network models capable of storing and retrieving dynamical memory patterns. This result is a major advance in the field because it opens for many possible developments and importantly, it reduces the gap between theoretical and experimental works on memory. Previous models already showed the capacity of a neuronal network to store multiple memories, where each of them is represented by a pattern of neural activations. However, in these models when a particular memory is retrieved, neural activity aligns with the corresponding pattern, but it is static and does not flow along it. This issue prevents a clear connection between experimental and theoretical analysis of memory reactivations which are dynamical in nature, as for example in the case of the replay effect. Here instead, once neural activity aligns with a stored memory pattern, it evolves along it as a dynamical bump of activity. Therefore, here memories are dynamical objects, something closer to what has been observed in experiments.

Overall the paper presents some very relevant science and the claims are well supported by the presented analyses.

Strengths:

Having a simple but powerful model that allows for temporal evolution of reactivated memories represents a major step forward towards the construction neural network models with realistic phenomenology

Despite the added complexity of the present 'dynamical memory' model with respect to 'static memory' ones, storage capacity is not much affected, and can be even larger in certain regimes.

The authors' approach generalises previous static models, which are nicely recovered as limiting cases. This allows for encompassing previous results into a more flexible and powerful framework.

The behavior of the proposed model is robust against the fine details of its construction, meaning that a large class of models will present a similar phenomenology. This is important because it opens for considering more biologically plausible models that will still present the same phenomenological behavior.

Weaknesses:

Although the temporal evolution of the reactivated memory is the major advance of this work, their dynamics is very simple: once reactivated they evolve along a given (rigid) direction. It is not clear, for example, if the dynamics can be time-reversed, if cycles are allowed and if different memories can interact dynamically.

This work reduces the gap between computational and experimental studies of memory storing and retrieval. However it remains still very theoretical, and misses some development towards real neuronal networks. For example, it is suggested, but not shown that a biologically plausible mechanism of plasticity can result in systems that display the same phenomenology of the presented model. Additionally, the paper misses a clear discussion of how the model can empower the analysis of experimental results on memory storage and reactivation. Although it is fair to imagine that reaching these ambitious goals is a long term program that encompasses the current paper, the manuscript would benefit from a deep discussion of what are the main obstacles to be faced. As such, this weakness should be seen as suggestions for further developments, and not as limitation of the impact of the current work.

I find the dynamics of the retrieved memory either very simple, or not well explained and/or showcased. At first, it is not clear to me if the manifolds are compact and have endpoints. If it is the case, what happens when the dynamics reaches this point? Also, the velocity of the reactivation looks like a rigid property of the stored memory, but then the authors show that actually it might vary depending on the property of the systems (Figure 4, for example). Is there a way for actively tuning the velocity? This may link with the time-compressed reactivation observed in the replay? Additionally, it is not explained if the model can support more complex dynamical effects. For example, considering the effect of reverse replay (Foster and Wilson 2006), I wonder if the dynamics can be time-reversed, so that a dynamical memory can be reactivated in both directions. Also, is it possible to have cycles, so that the dynamics will run in a circle over and over? More generically, which kind of dynamics are supported by the current model? How to extend the model to increase the resulting phenomenology?

Additionally, I found the model's limitations discussed in LL 363->370 not clearly explained, and they should anyhow be expanded discussing what the model dynamics can or cannot support.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Continuous attractors for dynamic memories" for further consideration by eLife. Your revised article has been evaluated by Laura Colgin (Senior Editor) and a Reviewing Editor.

The manuscript has been greatly improved since the first submission. We believe that this manuscript is potentially of great interest for the community, however we think that the manuscript would benefit from one minor (but key) revision: to add a "real-life" example that would illustrate the potential of this theoretical work to a broad audience. Basically, this could be as simple as a toy model with place cells spiking along a linear/circular track. This would then entail to simulate the network spontaneous activity and measure the firing rate in time of a sub-sample of the cells (recorded cells), generate some spikes from the firing rate, show that the spiking activity of the network resembles that of replay events (apart from the reverse replay), and present the results in a way that an experimental reader will immediately understand the importance of the study.

We have also made some comments regarding the general presentation of the figures and equations, detailed below in the reviewer's specific comments. The authors can of course decide not to follow these suggestions as they are not critical for the main message of the paper.

Reviewer #2:

Spalla, Cornacchia and Treves have largely improved the manuscript and fully addressed all my main criticisms, and in my opinion also those of the other referee. I already found the paper interesting during the first review round, and now it has also improved in clarity and completeness. I particularly appreciated:

– The results of the paper are novel and very interesting. Even if the model is very simple and easy to understand, it presents an impressive phenomenology. Also, by just adding one more feature, it nicely extends previous models (the asymmetric component of the kernel), which are recovered when γ->0.

– The efforts that the authors have made to better characterise and describe the phenomenology of the dynamical retrieval: analyses as those of Figure 7 are very intriguing and in my opinion will open for further developments of the research field.

– The improvement of the exposure, which now better fits the scope and wide audience of eLife.

For all these reasons I recommend the publication of this paper.

Reviewer #3:

The study by Spalla et al. presents a new and elegant theoretical framework that generalizes memory storage and retrieval in auto associative networks to the dynamical case. Specifically, while purely symmetrical synapses (as in canonical attractor networks) lead to the retrieval of static patterns, asymmetric synapses lead to sequential activation of units in the network, as observed for example during replay of hippocampal place cells. In addition, the study demonstrates that, as previous models had shown in the static case, such network can store multiple maps.

The main concerns with this study, in its present form, is the general presentation of the model and of its general properties, which could be hard to follow for a non-expert reader. The study would also benefit from concrete examples, such as simulation of replay in a minimal model of hippocampal place cells.

The study by Spalla et al. presents a new and elegant theoretical framework that generalizes memory storage and retrieval in auto associative networks to the dynamical case. Specifically, asymmetric synapses lead to sequential activation of units in the network, instead of the classical static patterns emerging from purely symmetrical synapses (e.g. attractors in which connectivity kernels depends only on the "representational distance" between units in the feature space). The study also demonstrates that, as previous models had shown in the static case, such network can also store multiple maps.

While the authors have already addressed many comments made during a previous round of review, the general presentation of the manuscript should be greatly improved for publication in a life science journal targeting a broad audience. Importantly, the number of main figures should be drastically reduced, and figures should focus only on the most important claims of the study.

One major concern though is that the study would benefit from a concrete example. For example, displaying simulated "replayed" trajectories in 2D maps (as a neurophysiologist would present it in an experimental study) in different cases influencing the speed (and general behaviour) of replay, such as different levels of asymmetry and sparsity.

The remarks below are related to the specific presentation of the manuscript.

Top panels of Figure 3 and associated text should be put in supplementary info, as this section can be fairly confusing for the non-expert reader and does not contribute much to the understanding of the phenomenon.

Figure 4 conveys an important message: the speed of the bump is related to the asymmetry of the connectivity kernel. The figure legend should be clarified: x-axis should read "kernel asymmetry (\γ)", with arrows point to either left ("more symmetrical") to right ("more asymmetrical"). |dx| should be replace by "bump speed (a.u.)". The message of Figure 4b is unclear and should certainly be discarded.

While Figure 5 shows that the solution can be generalized to a broad class of asymmetric kernels, Figure 6 presents again the link between network parameters and bump features (speed, etc.) in the case of an exponential kernel. One idea would be to merge the bottom panels of Figure 3, Figure 4a and Figure 6 to convey one message about the exponential case. Then show (current) Figure 5 to show that it is valid in other cases.

Figure 7 presents an interesting simulation, illustrating that retrieval can be specific to a particular map. Here again, the presentation should be drastically improved: x_1,2,3 should be replaced by more explicit terms like "feature space (map 1)" or something similar. Y-axis should be labels "activity level (V)" or something similar, etc. Against, it is important to present figure that do not require the reader to rely too much on abbreviations and variable names.

The link between kernel asymmetry and storage capacity is important but should be made clearer. Most of the equations (17-29) should be put in appendix and be limited to equation 30 and 31. Figure 8 and 9 present overlapping data, they should be merged (or better: only one of the two should be shown) and the same remarks as for Figure 4 apply for the presentation of this figure.

The presentation of the non-monotonic dependence of storage capacity w.r.t \γ in the fully connected case is another aspect that should be improved. Figure 10 would benefit from an additional graph showing P=50% chance retrieval as a function of γ to illustrate the non-monotonic relationship. This figure should be merge in a multi-panel figure with Figure 11 so that the reader can immediately grasp the main messages regarding the fully connected case.

https://doi.org/10.7554/eLife.69499.sa1

Author response

[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]

Reviewer #2:

I really like the simplicity of the learning rule. However, the article lack clarity in many fundamental sections, and it took me really long to understand important details (if I understood them). When the authors introduce the model, they introduce the inhibition term, but then they remove it as they claim it can be reabsorbed in h0. From the equations this claim is not correct, assuming that h0 does not depend on the values of Vs. It's only by reading the methods that I understood that they actually choose dynamically both g and h0 to enforce a certain level of sparseness. I would say that already in the main text without introducing the term multiplying b at all, it is only confusing.

We have clarified in the main text (LL 120-126) how the inhibition mechanism works, and simplified the notation, avoiding the complication of the introduction of the b(x) term.

Even more confusing is the introduction of x. They define x as "preferential firing location in the stimulus space". They never spoke about stimuli, and I think they don't need to. I believe I finally understood what they mean, but I had to read the rest of the article (a couple of times).

We have expanded the section “A mechanistic model for dynamic retrieval” (LL 127-139), clarifying the meaning of the parameter x and avoiding references to stimuli, and striving for a clearer and more accessible explanation of the model.

The description of the phenomenology is also unclear. In Figure 2 they never explain that they use x (y and z) to identify a neuron. So basically V is plotted as a function of the neuron number, parametrized by x. I guess that they discretized x and then assigned each x to a neuron. They should be more explicit about this, especially considering the broad audience of eLife.

We have clarified the phenomenology presented in Figure 2, by rewriting and expanding the description of the plots in the main text (LL 174-183)

Moreover, they claim that the dynamics occurs in a low dimensional manifold, but in all the examples I see, their trajectory is always 1D, even in the cases in which they have multiple variables (e.g. figure 2c). Also in the dynamic kernel section they determine the kernel only in the case of one variable x, and obviously a 1D trajectory.

We clarify this aspect in LL 207-212 and in the discussion (LL 500-506), underlying how the dynamics in 2D and 3D is different from a simple 1D trajectory.

We have also added a new analysis of the case of two intersecting 1D trajectories embedded in a 2D manifold (Figure 2(e) and accompanying text), which explores the case of position-dependent asymmetry, and provides a proof of concept of the possibility to expand the model in this direction.

The kernel they get is a for a rigid shift, under the assumption that the kernel is exponential.

This problem is addressed in the section “dynamical retrieval is robust” (LL 266-283). There we investigate the effect of the kernel shape and its parameters on the behaviour of the model. We find this behaviour to be largely independent from the specific kernel shape, and able to produce dynamic retrieval in a remarkably broad range of parameters (see Figure 4 and Figure 5).

What would happen in a more general case? What would be the phenomenology if the dynamics really moves along a 2D or a 3D manifold? They only briefly mention in the Discussion that they consider only a single direction (they should say it from the very beginning). Also, what kind of asymmetric kernels do they consider? Is it only a rigid shift, or it could be a curvilinear trajectory when the underlying manifold has 2 or more dimensions? What is the maximal dimensionality of the trajectory? I guess it depends on the specific form of the asymmetric kernel. I have the impression that the authors have an interesting system to study, but they report only very few aspects of its dynamics (and in a rather confusing way).

We have added clarifications on these aspects as stated in the previous points.

In the case with a constant asymmetric direction, the presence of the asymmetry imposes no additional bound on the maximal dimensionality of the manifold.

As it is the case in continuous attractor networks of any kind, a trade-off is to be expected between the resolution and the dimensionality of the representation, if the size of the network is kept fixed.

Finally, they mention a few predictions in the Discussion, but they're rather generic. For a journal like eLife, in the absence of real data, quantitative predictions are important. The introduction, which is nicely written, contains several examples of phenomena that could be described by the proposed model.

We have added two key quantitative predictions by the model: the predicted elongation of place fields in the direction of the running trajectory (LL 224-227, and LL 506-513) and an interaction between the sparsity of the activity of the network and the retrieval speed (LL 191- 196 and LL 452-459).

These predictions complement the main result of the paper: that auto associative memory networks can encode continuous dynamic memory with large capacity, and that the asymmetry of connections, often considered as an afterthought in mechanistic models, can be a crucial ingredient of memory systems.

We believe these points are of interest to the broad eLife audience, beyond the technical apparatus developed to obtain them.

There are several misprints and references missing.

We have made an effort to correct all misprints and missing references, and we have added several more references to better put the work in the context of the existing literature.

Reviewer #3:

Spalla, Cornacchia and Treves presents a novel class of neural network models capable of storing and retrieving dynamical memory patterns. This result is a major advance in the field because it opens for many possible developments and importantly, it reduces the gap between theoretical and experimental works on memory. Previous models already showed the capacity of a neuronal network to store multiple memories, where each of them is represented by a pattern of neural activations. However, in these models when a particular memory is retrieved, neural activity aligns with the corresponding pattern, but it is static and does not flow along it. This issue prevents a clear connection between experimental and theoretical analysis of memory reactivations which are dynamical in nature, as for example in the case of the replay effect. Here instead, once neural activity aligns with a stored memory pattern, it evolves along it as a dynamical bump of activity. Therefore, here memories are dynamical objects, something closer to what has been observed in experiments.

Overall the paper presents some very relevant science and the claims are well supported by the presented analyses.

Strengths:

Having a simple but powerful model that allows for temporal evolution of reactivated memories represents a major step forward towards the construction neural network models with realistic phenomenology

Despite the added complexity of the present 'dynamical memory' model with respect to 'static memory' ones, storage capacity is not much affected, and can be even larger in certain regimes.

The authors' approach generalises previous static models, which are nicely recovered as limiting cases. This allows for encompassing previous results into a more flexible and powerful framework.

The behavior of the proposed model is robust against the fine details of its construction, meaning that a large class of models will present a similar phenomenology. This is important because it opens for considering more biologically plausible models that will still present the same phenomenological behavior.

Weaknesses:

Although the temporal evolution of the reactivated memory is the major advance of this work, their dynamics is very simple: once reactivated they evolve along a given (rigid) direction. It is not clear, for example, if the dynamics can be time-reversed, if cycles are allowed and if different memories can interact dynamically.

This work reduces the gap between computational and experimental studies of memory storing and retrieval. However it remains still very theoretical, and misses some development towards real neuronal networks. For example, it is suggested, but not shown that a biologically plausible mechanism of plasticity can result in systems that display the same phenomenology of the presented model. Additionally, the paper misses a clear discussion of how the model can empower the analysis of experimental results on memory storage and reactivation. Although it is fair to imagine that reaching these ambitious goals is a long term program that encompasses the current paper, the manuscript would benefit from a deep discussion of what are the main obstacles to be faced. As such, this weakness should be seen as suggestions for further developments, and not as limitation of the impact of the current work.

I find the dynamics of the retrieved memory either very simple, or not well explained and/or showcased. At first, it is not clear to me if the manifolds are compact and have endpoints. If it is the case, what happens when the dynamics reaches this point?

We have clarified in the presentation of the model (LL 158-165) that we consider manifolds with periodic boundary conditions, in which the dynamic retrieval produces periodic cycles. We highlight that this simplifying assumption is not a necessary feature of the model, and we study the alternative case in which this is replaced by a link between the endpoint of a manifold and the beginning of the next one (Figure 7 (c), (d) and accompanying text.). This analysis not only shows that dynamic retrieval does not rely on the specific boundary conditions, but also demonstrates that a chain of multiple continuous manifolds can be retrieved sequentially by the network.

Also, the velocity of the reactivation looks like a rigid property of the stored memory, but then the authors show that actually it might vary depending on the property of the systems (Figure 4, for example). Is there a way for actively tuning the velocity? This may link with the time-compressed reactivation observed in the replay?

We have further expanded the study of the dependence of the retrieval speed on the parameters of the network by adding an analysis of the effect of sparsity on the reactivation velocity (Figure 2 (d) and Figure 3 (b), (c)).

We find that the sparsity of the representation (a feature of the network activity, that can be regulated instantaneously during the dynamics) interacts with the asymmetry parameter \γ in determining the retrieval speed.

The possibility of instantaneously tuning the velocity by acting on the sparsity of the activity also yields a key new prediction: in a network with fixed connectivity, retrieval speed can be modulated by the activity level of the population.

Additionally, it is not explained if the model can support more complex dynamical effects. For example, considering the effect of reverse replay (Foster and Wilson 2006), I wonder if the dynamics can be time-reversed, so that a dynamical memory can be reactivated in both directions. Also, is it possible to have cycles, so that the dynamics will run in a circle over and over? More generically, which kind of dynamics are supported by the current model? How to extend the model to increase the resulting phenomenology?

We have added the clarification that the current model cannot handle backward and forward replay of the same trajectory (LL 196-198 and LL 514-521 in the discussion). The dynamics is naturally organized in cycles under the assumption of periodic boundary conditions, but this is not the only possibility the model can describe.

To clarify this point, we have added new analyses that demonstrate complex dynamical effects. In particular the linking of dynamical memories and the possibility to simultaneously memorize manifolds with different values of the asymmetric component, leading to different retrieval speeds (Figure 7 and accompanying text).

Additionally, I found the model's limitations discussed in LL 363->370 not clearly explained, and they should anyhow be expanded discussing what the model dynamics can or cannot support.

As stated in response to reviewer 2: we clarify this aspect in LL 207-212 and in the discussion (LL 500-506), underlying how the dynamics in 2D and 3D is different from a simple 1D trajectory.

We have also added a new analysis of the case of two intersection 1D trajectories embedded in a 2D manifold (Figure 2(e) and accompanying text), which explores the case of position-dependent asymmetry, and provides a proof of concept of the possibility to expand the model in this direction.

[Editors’ note: what follows is the authors’ response to the second round of review.]

Reviewer #3:

The study by Spalla et al. presents a new and elegant theoretical framework that generalizes memory storage and retrieval in auto associative networks to the dynamical case. Specifically, while purely symmetrical synapses (as in canonical attractor networks) lead to the retrieval of static patterns, asymmetric synapses lead to sequential activation of units in the network, as observed for example during replay of hippocampal place cells. In addition, the study demonstrates that, as previous models had shown in the static case, such network can store multiple maps.

The main concerns with this study, in its present form, is the general presentation of the model and of its general properties, which could be hard to follow for a non-expert reader. The study would also benefit from concrete examples, such as simulation of replay in a minimal model of hippocampal place cells.

The study by Spalla et al. presents a new and elegant theoretical framework that generalizes memory storage and retrieval in auto associative networks to the dynamical case. Specifically, asymmetric synapses lead to sequential activation of units in the network, instead of the classical static patterns emerging from purely symmetrical synapses (e.g. attractors in which connectivity kernels depends only on the "representational distance" between units in the feature space). The study also demonstrates that, as previous models had shown in the static case, such network can also store multiple maps.

While the authors have already addressed many comments made during a previous round of review, the general presentation of the manuscript should be greatly improved for publication in a life science journal targeting a broad audience. Importantly, the number of main figures should be drastically reduced, and figures should focus only on the most important claims of the study.

We thank the reviewer for the suggestion. We have reduced the number of main figures by merging the last three figures in what is now Figure 8, and have strived to improve graphical quality and figure accessibility by a general standardization of figure design and a relabeling of figure axes throughout the manuscript, with the aim to make them more consistent and more readily interpretable.

One major concern though is that the study would benefit from a concrete example. For example, displaying simulated "replayed" trajectories in 2D maps (as a neurophysiologist would present it in an experimental study) in different cases influencing the speed (and general behaviour) of replay, such as different levels of asymmetry and sparsity.

We agree with the reviewer that a concrete example is beneficial for the interpretation of the study. To this end, we have added the analysis of a simulated electrophysiological experiment probing the population activity during dynamic retrieval (see also response to the editors).

This simulation is illustrated in Figure 6(c), constructed so as to be familiar to experimental readers in its design and interpretation.

The figure and the accompanying text (LL 314-328, plus appendix B for details) serve the double purpose to connect the behaviour of the model to experimentally observable quantities, and to further illustrate the concept of dynamic auto associative memory, that Figure 6 was originally built to exemplify.

The remarks below are related to the specific presentation of the manuscript.

Top panels of Figure 3 and associated text should be put in supplementary info, as this section can be fairly confusing for the non-expert reader and does not contribute much to the understanding of the phenomenon.

We believe that the analytical solvability of the model is a key feature of our work that can yield quantitative insights and a deep understanding of the mechanisms behind dynamic retrieval. Therefore, we chose to maintain Figure 3 and the accompanying paragraph in the main text. However, during the first revision we moved all technical details to Appendix C, including the previous Figure 3 illustrating the analytical solution procedure, leaving only results and interpretation in the main text.

We have tried to improve the graphical design and axis labeling to make the figure more accessible to a non-technical audience.

Figure 4 conveys an important message: the speed of the bump is related to the asymmetry of the connectivity kernel. The figure legend should be clarified: x-axis should read "kernel asymmetry (\γ)", with arrows point to either left ("more symmetrical") to right ("more asymmetrical"). |dx| should be replace by "bump speed (a.u.)". The message of Figure 4b is unclear and should certainly be discarded.

We think the reviewer is referring to the version of Figure 4 that was in the original manuscript. We indeed changed this figure during our first revision, by getting rid of Figure 4b, as the reviewer suggested, and incorporating Figure 4a, with an extended analysis on the effect of sparsity, in what is now Figure 3. We have also changed axis label names to improve clarity.

While Figure 5 shows that the solution can be generalized to a broad class of asymmetric kernels, Figure 6 presents again the link between network parameters and bump features (speed, etc.) in the case of an exponential kernel. One idea would be to merge the bottom panels of Figure 3, Figure 4a and Figure 6 to convey one message about the exponential case. Then show (current) Figure 5 to show that it is valid in other cases.

We agree with the reviewer and thank them for the suggestion. We swapped the two figures (they are now Figure 4, on the effects of the parameters in the exponential case and Figure 5, on different kernel shapes) and restructured the section on robustness of the asymmetric interactions by examining in depth the exponential kernel case first, and then extending the analysis to other kernel shapes.

Figure 7 presents an interesting simulation, illustrating that retrieval can be specific to a particular map. Here again, the presentation should be drastically improved: x_1,2,3 should be replaced by more explicit terms like "feature space (map 1)" or something similar. Y-axis should be labels "activity level (V)" or something similar, etc. Against, it is important to present figure that do not require the reader to rely too much on abbreviations and variable names.

We have changed the labels in this figure, as well as in the others throughout the manuscript, following this advice for which we are thankful to the reviewer. We have also made some changes in coloring and label sizes to improve readability, as well as adding here as a new panel the results of the new simulation of experimental recordings (see main points).

The link between kernel asymmetry and storage capacity is important but should be made clearer. Most of the equations (17-29) should be put in appendix and be limited to equation 30 and 31. Figure 8 and 9 present overlapping data, they should be merged (or better: only one of the two should be shown) and the same remarks as for Figure 4 apply for the presentation of this figure.

We believe that the reviewer is here referring to the first version of the manuscript.

During our first revision we heavily restructured this section putting most of the mathematical details to Appendix F. Also Figure 8 was moved to the appendix, since we agree with the reviewer that the data is overlapping with Figure 9 (now Figure 8a), and the point of the figure (to show the goodness of our approximation ansatz) is beyond the scope of the main text. Following the previous point on the number of main figures, we have aggregated all figures of this section in what is now Figure 8, that conveys in a single place the main messages on the storage capacity of the network.

The presentation of the non-monotonic dependence of storage capacity w.r.t \γ in the fully connected case is another aspect that should be improved. Figure 10 would benefit from an additional graph showing P=50% chance retrieval as a function of γ to illustrate the non-monotonic relationship. This figure should be merge in a multi-panel figure with Figure 11 so that the reader can immediately grasp the main messages regarding the fully connected case.

We have merged the figures in what is now Figure 8. We did not insert an additional graph, with the rationale not to show overlapping data and not to multiply further the number of figures. However, we have changed labels and other graphical aspects to improve understandability and readability. We believe that the non-monotonicity of the storage capacity is appreciable by the reader from both Figure 8b, where the transition point is shown to move back and forth with γ, and Figure 8c, in which the joint dependence of the capacity on γ and 1/L shows a clear peak in the bulk of the parameter ranges.

https://doi.org/10.7554/eLife.69499.sa2

Article and author information

Author details

  1. Davide Spalla

    SISSA – Cognitive Neuroscience, Via Bonomea, Trieste, Italy
    Contribution
    Conceptualization, Software, Formal analysis, Investigation, Visualization, Methodology, Writing - original draft
    For correspondence
    dspalla@sissa.it
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-0328-6476
  2. Isabel Maria Cornacchia

    1. SISSA – Cognitive Neuroscience, Via Bonomea, Trieste, Italy
    2. University of Turin – Physics Department, Torino, Italy
    Contribution
    Software, Validation, Investigation, Visualization, Methodology
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-0704-7480
  3. Alessandro Treves

    SISSA – Cognitive Neuroscience, Via Bonomea, Trieste, Italy
    Contribution
    Conceptualization, Supervision, Funding acquisition, Validation, Investigation, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-7246-5673

Funding

Human Frontier Science Program (RGP0057/2016)

  • Alessandro Treves

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

Work supported by the Human Frontier Science Program RGP0057/2016 collaboration. We are grateful for inspiring exchanges with Remi Monasson and others in the collaboration, and thank Silvia Girardi for her help with Figure 1.

Senior Editor

  1. Laura L Colgin, University of Texas at Austin, United States

Reviewing Editor

  1. Adrien Peyrache, McGill University, Canada

Reviewer

  1. Ulisse Ferrari, Sorbonne Université, France

Publication history

  1. Preprint posted: November 8, 2020 (view preprint)
  2. Received: April 18, 2021
  3. Accepted: August 12, 2021
  4. Version of Record published: September 14, 2021 (version 1)

Copyright

© 2021, Spalla et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,031
    Page views
  • 100
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Computational and Systems Biology
    Michael S Lauer, Deepshikha Roychowdhury
    Research Article Updated

    Previous reports have described worsening inequalities of National Institutes of Health (NIH) funding. We analyzed Research Project Grant data through the end of Fiscal Year 2020, confirming worsening inequalities beginning at the time of the NIH budget doubling (1998–2003), while finding that trends in recent years have reversed for both investigators and institutions, but only to a modest degree. We also find that career-stage trends have stabilized, with equivalent proportions of early-, mid-, and late-career investigators funded from 2017 to 2020. The fraction of women among funded PIs continues to increase, but they are still not at parity. Analyses of funding inequalities show that inequalities for investigators, and to a lesser degree for institutions, have consistently been greater within groups (i.e. within groups by career stage, gender, race, and degree) than between groups.

    1. Computational and Systems Biology
    2. Epidemiology and Global Health
    Hannah R Meredith et al.
    Research Article

    Human mobility is a core component of human behavior and its quantification is critical for understanding its impact on infectious disease transmission, traffic forecasting, access to resources and care, intervention strategies, and migratory flows. When mobility data are limited, spatial interaction models have been widely used to estimate human travel, but have not been extensively validated in low- and middle-income settings. Geographic, sociodemographic, and infrastructure differences may impact the ability for models to capture these patterns, particularly in rural settings. Here, we analyzed mobility patterns inferred from mobile phone data in four Sub-Saharan African countries to investigate the ability for variants on gravity and radiation models to estimate travel. Adjusting the gravity model such that parameters were fit to different trip types, including travel between more or less populated areas and/or different regions, improved model fit in all four countries. This suggests that alternative models may be more useful in these settings and better able to capture the range of mobility patterns observed.