Continuous attractors for dynamic memories
Abstract
Episodic memory has a dynamic nature: when we recall past episodes, we retrieve not only their content, but also their temporal structure. The phenomenon of replay, in the hippocampus of mammals, offers a remarkable example of this temporal dynamics. However, most quantitative models of memory treat memories as static configurations, neglecting the temporal unfolding of the retrieval process. Here, we introduce a continuous attractor network model with a memorydependent asymmetric component in the synaptic connectivity, which spontaneously breaks the equilibrium of the memory configurations and produces dynamic retrieval. The detailed analysis of the model with analytical calculations and numerical simulations shows that it can robustly retrieve multiple dynamical memories, and that this feature is largely independent of the details of its implementation. By calculating the storage capacity, we show that the dynamic component does not impair memory capacity, and can even enhance it in certain regimes.
eLife digest
When we recall a past experience, accessing what is known as an ‘episodic memory’, it usually does not appear as a still image or a snapshot of what occurred. Instead, our memories tend to be dynamic: we remember how a sequence of events unfolded, and when we do this, we often reexperience at least part of that same sequence. If the memory includes physical movement, the sequence combines space and time to remember a trajectory. For example, a mouse might remember how it went down a hole and found cheese there.
However, mathematical models of how past experiences are stored in our brains and retrieved when we remember them have so far focused on snapshot memories. ‘Attractor network models’ are one type of mathematical model that neuroscientists use to represent how neurons communicate with each other to store memories. These models can provide insights into how circuits of neurons, for example those in the hippocampus (a part of the brain crucial for memory), may have evolved to remember the past, but so far they have only focused on how single moments, rather than sequences of events, are represented by populations of neurons.
Spalla et al. found a way to extend these models, so they could analyse how networks of neurons can store and retrieve dynamic memories. These memories are represented in the brain as ‘continuous attractors’, which can be thought of as arrows that attract mental trajectories first to the arrow itself, and once on the arrow, to the arrowhead. Each recalled event elicits the next one on the arrow, as the mental trajectory advances towards the arrowhead. Spalla et al. determined that memory networks in the hippocampus of mammals can store large numbers of these ‘arrows’, up to the same amount of ‘snapshot’ memories predicted to be stored with similar models.
Spalla et al.’s results may allow researchers to better understand memory storage and recall, since they allow for the modelling of complex and realistic aspects of episodic memories. This could provide insights into processes such as why our minds wander, as well as having implications for the study of how neurons physically interact with each other to transmit information.
Introduction
The temporal unfolding of an event is an essential component of episodic memory. When we recall past events, or we imagine future ones, we do not produce static images but temporally structured movies, a phenomenon that has been referred to as 'mental time travel' (Eichenbaum and Cohen, 2004; Tulving, 2002).
The study of the neural activity of the hippocampus, known for its firsthand involvement in episodic memory, has provided many insights on the neural basis of memory retrieval and its temporal dynamics. An interesting example is the phenomenon of hippocampal replay, that is the reactivation, on a compressed time scale, of sequences of cells active in previous behavioral sessions. Replay takes place during sharp wave ripples, fast oscillations of the hippocampal local field potential that are particularly abundant during sleep and restful wakefulness (Buzsáki et al., 1983; Buzsáki et al., 1992). Indeed, replay has been observed during sleep (Skaggs and McNaughton, 1996; Nádasdy et al., 1999), intertrial rest periods (Foster and Wilson, 2006; Jackson et al., 2006), and during still periods in navigational tasks (Dupret et al., 2010; Pfeiffer and Foster, 2013). Replay activity has been hypothesized to be crucial for memory consolidation (O'Neill et al., 2010) and retrieval (Karlsson and Frank, 2009), as well as for route planning (Pfeiffer and Foster, 2013; Ólafsdóttir et al., 2018).
A temporally structured activation takes place also before the exposure to an environment (Dragoi and Tonegawa, 2011), a phenomenon known as preplay, and a recent study showed that this dynamical feature emerges very early during development, preceding the appearance of theta rhythm (Farooq and Dragoi, 2019) in the hippocampus. The fact that hippocampal sequences are present before the exposure to the environment suggests that their dynamical nature is not specific to a role in spatial cognition, but is inherent to hippocampal operation in general. Moreover, in a recent study Stella et al., 2019 have shown that the retrieved sequences of positions during slow wave sleep are not always replaying experienced trajectories, but are compatible with a random walk on the low dimensional manifold that represents the previously explored environment. This suggests that what is essential are not the sequences themselves, but the tendency to produce them: neural activity tends to move, constrained to abstract lowdimensional manifolds which can then be recycled to represent spatial environments, and possibly nonspatial ones as well. This dynamic nature extends to multiple timescales, as suggested by the observation that the neural map of the same environment progressively changes its component cells over time (Ziv et al., 2013).
Lowdimensional, dynamic activity is not constrained to a single subspace: replay in sleep can reflect multiple environments (Dragoi and Tonegawa, 2013; Gridchyn et al., 2020), the content of awake replay reflects both the current and previous environments (Karlsson and Frank, 2009), and during behavior fast hippocampal sequences appear to switch between possible future trajectories (Kay et al., 2020). Further evidence comes from a recent study with human participants learning novel word pair associations (Vaz et al., 2020). The study shows that the same pairdependent neural sequences are played during the encoding and the retrieval phase.
A similar phenomenon – a dynamic activity on low dimensional manifolds – is present in memory schemata, cognitive frameworks that constrain and organize our mental activity (Ghosh and Gilboa, 2014), and have been shown to have a representation in the medial temporal lobe (Baraduc et al., 2019). Yet another example of dynamical, continuous memories is offered by motor programs, which have been described as lowdimensional, temporally structured neural trajectories (Shenoy et al., 2013; Gallego et al., 2017; Oztop and Arbib, 2002), or as dynamical flows on manifolds (Huys et al., 2014; Pillai and Jirsa, 2017).
We refer to these objects as dynamical continuous attractors, since they involve a continuous subspace that constrains and attracts the neural activity, and a dynamical evolution in this subspace. Figure 1 schematically illustrates the concept of dynamical continuous attractors and their possible role in some of the neural processes described above.
In most cases, computational analyses of lowdimensional neural dynamics are not concerned with memory, and focus on the description of the features of single attractors, more than on their possible coexistence. On the other hand, mechanistic models of memory usually neglect dynamical aspects, treating memories as static objects, either discrete (Amit et al., 1985) or continuous (Battaglia and Treves, 1998; Monasson and Rosay, 2013; Samsonovich and McNaughton, 1997; Spalla et al., 2019). The production of sequences of discrete memories can be implemented with a heteroassociative component (Sompolinsky and Kanter, 1986), usually dependent on the time integral of the instantaneous activity, that brings the network out of equilibrium and to the next step in the sequence. A similar effect can be obtained with an adaptation mechanism in a coarse grained model of cortical networks (Kropff and Treves, 2005), with the difference that in this case the transitions are not imposed, but driven by the correlations between the memories in socalled latching dynamics (Russo et al., 2008; Kang et al., 2017). Moreover, adaptationbased mechanisms have been used to model the production of random sequences on continuous manifolds (Azizi et al., 2013), and shown to be crucial in determining the balance between retrieval and prediction in a network describing CA3CA1 interactions (Treves, 2004). In the case of continuous attractor networks, movement can be induced also by mechanisms that integrate an external velocity input and make use of asymmetric synaptic strengths. Models of this kind have been used for the description of head direction cells (Zhang, 1996), spatial view cells (Stringer et al., 2005), and grid cells (Fuhs and Touretzky, 2006; Burak and Fiete, 2009), and can represent simultaneously the positions of multiple features and their temporal evolution (Stringer et al., 2004). In the simplest instantiation, these systems do not necessarily reflect longterm memory storage: the activity is constrained on a single attractive manifold, which could well be experience independent.
Here, we propose a network model able to store and retrieve multiple independent dynamic continuous attractors. The model relies on a mapdependent asymmetric component in the connectivity that produces a robust shift of the activity on the retrieved attractive manifold. This connectivity profile is conceived to be the result of a learning phase in which the mechanism of spike timing dependent plasticity (STDP) (Markram et al., 1997) produces the asymmetry. Crucially, the asymmetry is not treated here as a 'pathological' feature, assumed to level out in the limit of long learning, but as a defining trait of the stored memories. The balance between two components – one symmetric and trajectoryaveraged, the other asymmetric and trajectorydependent – is explicit in the formulation of the model, and allows to study their effects on memory storage.
In what follows we develop an analytical framework that allows to derive the dependence of important features of the dynamics, such as the replay speed and the asymmetry of the activity cluster, as a function of the relevant parameters of the model. We show with numerical simulations that the behavior of the model is robust with respect to its details, and depends weakly on the shape of the interactions. Finally, we estimate the storage capacity for dynamical memories and we find it to be of the same order of the capacity for static continuous attractors, and even higher in some regimes.
Modeling framework
A mechanistic model for dynamic retrieval
The model we consider is a continuous attractor neural network, with an additional antisymmetric component in the connectivity strength. We consider a population of $N$ neurons, with recurrent connectivity described by an interaction matrix ${J}_{ij}$, whose entries represent the strength of the interaction between neuron $i$ and $j$. The activity of each neuron is described by a positive real number ${V}_{i}\in {\mathbb{R}}^{+}$ representing its instantaneous firing rate. The dynamic evolution of the network is regulated by the equations:
where ${[\mathrm{\dots}]}^{+}$ is the threshold linear activation function
with the gain $g$ modulating the slope and the Heaviside step function $\theta (\mathrm{\dots})$ setting to zero subthreshold inputs. The first term on the right hand side of Equation 1 represent the excitatory inputs provided to neuron $i$ from the rest of the network through recurrent connections. The threshold h_{0} and the gain $g$ are global parameters that regulate the average activity and the sparsity of the activity pattern (Treves, 1990).
In numerical simulations, these parameters are dynamically adjusted at each time step to constrain the network to operate at a certain average activity (usually fixed to one without loss of generality) and at a certain sparsity $f$, defined as the fraction of active neuron at each time (see appendix A). The connectivity matrix $J$ of the network encodes a map of a continuous parameter $\overrightarrow{x}$ spanning a lowdimensional manifold, e.g. the position in an environment. To do so, each neuron is assigned a preferential firing location ${\overrightarrow{x}}_{i}$ in the manifold to encode, and the strength of the interaction between pairs of neuron is given by a decreasing, symmetric function of the distance between their preferred firing locations
This shape of the interactions is a typical one in the framework of continuous attractor neural networks (Samsonovich and McNaughton, 1997; Battaglia and Treves, 1998; Tsodyks, 1999) and is thought to come from a timeaveraged Hebbian plasticity rule: neurons with nearby firing fields will fire concurrently and strengthen their connections, while firing fields that are far apart will produce weak interactions. The symmetry of the function ${K}_{S}$, usually called interaction kernel, ensures that the network reaches a static equilibrium, where the activity of the neurons represents a certain position in the manifold and, if not pushed, remains still.
The shift mechanism
The assumption of symmetric interactions neglects any temporal structure in the learning phase. In case of learning a spatial map, for example, the order in which recruited neurons fire along a trajectory may produce an asymmetry in the interactions as a consequence of Spike Timing Dependent Plasticity (Markram et al., 1997), that requires the postsynaptic neuron to fire after the presynaptic one in order to strengthen the synapse. This phenomenon can be accounted for in the definition of the interaction kernel. Any asymmetric kernel can be decomposed in two contributions:
where ${K}_{S}$ is the usual symmetric component and ${K}_{A}$ is an antisymmetric function (${K}_{A}({x}_{i}{x}_{j})={K}_{A}({x}_{j}{x}_{i})$). The parameter γ regulates the relative strength of the two components. The presence of ${K}_{A}$ will generate a flow of activity along the direction of asymmetry: neuron $i$ activates neuron $j$ that, instead of reciprocating, will activate neurons downstream in the asymmetric direction. Mechanisms of this kind have been shown to produce a rigid shift of the encoded position along the manifold, without loss of coherence (Zhang, 1996; Burak and Fiete, 2009; Fuhs and Touretzky, 2006). In the quantitative analysis that follows we will concentrate, when not stated otherwise, on a kernel $K$ with the exponential form
where $\overrightarrow{n}$ is a unit vector pointing in the – constant – direction along which the asymmetry is enforced, and ξ is the spatial scale of the asymmetric component, which is fixed to one where not explicitly stated otherwise. Moreover, we make the simplifying assumption of periodic boundary conditions on the manifold spanned by $\overrightarrow{x}$, such that the dynamics follows a periodic cycle. Both the kernel form and the periodic boundary conditions simplify the analytical description of the model, but are not a required feature of the model. In fact, all the results presented hold for a large class of interaction kernels, and the boundary conditions can be modified (e.g. with the introduction of interactions between different memories) without compromising the functionality of the network. Both points will be addressed in the analyses of the model in the next sections.
Results
Asymmetric recurrent connections produce dynamic retrieval
The spontaneous dynamics produced by the network is constrained to the low dimensional manifold codified in the connectivity matrix and spanned by the parameter $\overrightarrow{x}$. The shortrange interactions and the uniform inhibition enforced by the firing threshold h_{0} produce a localized 'bump' of activity in the manifold. The presence of an asymmetry in the connection strengths prevents the system from remaining in a stationary equilibrium. Instead, it generates a steady flow of activity in the direction of the asymmetry.
This flow is illustrated in Figure 2 (a),(b) and (c), obtained with numerical simulation of a network encoding a one, two, or threedimensional manifold, respectively. In the simulation, each neuron is assigned to a preferential firing location $\overrightarrow{{x}_{i}}$ in the manifold to encode, and the plots show the activity of the network organized according to this disposition. The activity of the population clusters in a bump around a certain position at each time point (t_{1}, t_{2}, and t_{3}), and the bump shifts by effect of the asymmetric component of the interactions. In this way, the neural population collectively encodes an evolving coordinate on the manifold spanned by $\overrightarrow{x}$. The coherence of the representation is not affected by the presence of the asymmetric term: the movement of the activity bump happens without dissipation.
The speed of movement of the bump is modulated by the value of the parameter γ and by the sparsity level of the representation $f$, that is the fraction of neurons active at each given time during the dynamics. The dependence of the speed on these two parameters is illustrated in 2(d). Stronger asymmetry (high γ) produces a faster shift. Interestingly, the sparsity value $f$ acts as a modulator of the influence of γ: sparser representations move more slowly than dense ones.
While γ describes a feature of the synaptic interactions, determined during the learning phase and relatively fixed at the short timescales of retrieval, $f$ can be instantaneously modulated during retrieval dynamics. A change in the gain or the excitability of the population can be used to produce dynamic retrieval at different speeds. Thus, the model predicts an interaction between the sparsity and the speed of the reactivation of a continuous memory sequence, with increased activity leading to faster replay. It is worth noting, however, that the direction of the dynamics is fixed with $J$: the model is able to retrieve either forward or backward sequences, but not alternate between them.
The dependence of the retrieval speed $s$ on γ and $f$ is well described by the approximate functional form
This dependence is shown in 2(b), where the dots are the values obtained with numerical simulations and the full curves the fitted relationship. The full understanding of the nature of this functional form remains an open challenge for future analysis. As we will see in the next section, the analytical solution of the model yields the same form to describe the dependence of speed on γ and $f$, but a closedform solution is still lacking.
In the model presented, the asymmetry in the interactions is enforced uniformly along a single direction also for two and threedimensional manifolds, representing the case in which neural dynamics follows a forced trajectory along one dimension, but is free to move without energy costs along the others. However, the same mechanism can be used to produce onedimensional trajectories embedded in lowdimensional manifolds, with the introduction of a positional dependence in the direction of the asymmetry (Blum and Abbott, 1996). In this case, an interesting problem is posed by the intersection of two trajectories embedded in the same manifold: is the network, during the retrieval of one trajectory, able to cross these intersections, or do they hinder dynamical retrieval? The investigation of the full phenomenology of positiondependent asymmetric kernels with intersecting trajectories is beyond the scope of the present work, but we present in Figure 2(e) a numerical study of a minimal version of this problem, with two orthogonal trajectories (Figure 2(e), black arrows) embedded in a 2D manifold and memorized simultaneously in the network. Notice that in this case the two trajectories are parallel to the main axis of the square environment, but they do not need to be: any pair of orthogonal trajectories will behave in the same manner. When the network is cued to retrieve the horizontal trajectory, the behavior at the intersection depends on the strength γ and scale ξ of the asymmetric component. At low γ, the dynamics spontaneously switch trajectory at the intersection (Figure 2(e), blue curve), while for γ sufficiently large the retrieval of the horizontal trajectory is successful (Figure 2(e), orange curve). The value ${\gamma}^{*}$ required for a successful crossing depends on the spatial scale ξ: larger ξ allow for crossing with lower values of γ, as shown in the topright inset of Figure 2(e), in which ${\gamma}^{*}(\xi )$ is plotted. Intuitively, the ability of the network to retrieve crossing trajectories depends on the shape of the activity bump, which needs to be sufficiently elongated in the direction of retrieval for the successful crossing of the intersection. The blue and orange insets of Figure 2(e) show the difference in shape of the bump in the case of a trajectory switch (blue) and successful crossing (orange).
Analytical solution for the single manifold case
The simplicity of continuous attractor models often allows to extract important computational principles from their analytical solution (Wu et al., 2008; Fung et al., 2010). In our case, the dynamic behavior of the system and its features can be fully described analytically with a generalization of the framework developed by Battaglia and Treves, 1998. For this purpose, it is easier to formulate the problem in the continuum, and describe the population activity $\{{V}_{i}\}$ by its profile $V(\overrightarrow{x})$ on the attractive manifold parametrized by the coordinate $\overrightarrow{x}$, and the dynamical evolution as a discrete step map, equivalent to Equation 1.
The requirement of a rigid shift of population activity is then imposed by setting the activity at time $t+1$ to be equal at the activity at time $t$, but translated by an amount $\mathrm{\Delta}\overrightarrow{x}=\tau s\overrightarrow{n}$, proportional to the speed $s$ of the shift and in the direction $\overrightarrow{n}$ of the asymmetry in the connections. The timescale τ sets the time unit in which the duration of the evolution is measured and does not have an impact on the behavior of the system.
The activity profile $V(\overrightarrow{x})$ is then found as the selfconsistent solution to the integral equation
Equation 9 is valid in general. We will focus here, for the explicit derivation (reported in appendix C), on the case of a one dimensional manifold with an exponential interaction kernel
In this case, the activity bump will take the form:
The parameters ${k}_{1}={k}_{1}(\gamma ,s)$ and ${k}_{2}={k}_{2}(\gamma ,s)$ determine the properties of the solution and they depend on the values of γ and speed $s=\mathrm{\Delta}x/\tau $.
k_{2} is related to the bump width by the relation
where $R$ is the point at which $V(x)=0$. k_{1} is related to the asymmetry of the bump: in the limit case $\gamma =0$, $s=0$ (Figure 3(a), first column) ${k}_{1}=0$, and we recover the cosine solution of the symmetric kernel case studied in Battaglia and Treves, 1998. Larger k_{1} values result in more and more asymmetric shapes (Figure 3(a), second and third columns).
From this analytical solution, we can determine the dependence of the speed $s$ on the asymmetry strength γ and on the sparsity $f=2R/L$ (note that in the continuum case the fraction of active neurons is given by the ratio between the bump size $2R$ and the manifold size $L$). The sparsity $f$ is modulated by the value of the gain $g$, as shown in Figure 3(b): a larger gain in the transfer function corresponds to a sparser activity. The exact relation $s(\gamma ,f)$ can be obtained from the numerical solution of a transcendental equation (see appendix C), and can be approximated with a functional shape analogous to the one used for the simulated network:
The full transcendental solution and the fitted curves are reported in Figure 3(c).
Dynamic retrieval is robust
The analytical solution of the model shows that the network performs dynamical retrieval for all values of the asymmetry strength γ, and that this parameter influences the retrieval speed and the shape of the activity bump. To further investigate the robustness of dynamical retrieval to parameter changes, we investigate with numerical simulations the behavior of the model with respect to another important parameter: the scale ξ of the antisymmetric component. We run several dynamics of a network with interaction kernel given by
varying the parameters γ and ξ and measuring the retrieval speed, the peak value of the activity bump and its skeweness. The joint effects of γ and ξ are shown in Figure 4. In the whole range of parameters analyzed, spanning four orders of magnitude for both parameters, the network was able to produce dynamic retrieval. γ and ξ affect the speed of the shift, the peak values of the activity distribution and the skewness of the activity bump, without hindering network functionality. Moreover, all these feature vary gradually and mildly with the parameters values, producing dynamical behavior qualitatively similar in the full parameter range. This analysis shows that dynamical retrieval does not require any fine tuning of network parameters, but relies on the assumption of an exponential shape for the interaction kernel. How robust is the behavior of the network to the details of the kernel shape?
We addressed this question by simulating the network dynamics with alternative kernel choices. We kept fixed the symmetric component, and explored three different antisimmetric shapes: a gaussianderivative shape (Figure 5a), a sinusoidal shape (Figure 5b) and a double step function (Figure 5c). Each of these simulations produced the same retrieval dynamics (a stable bump shifting at constant speed), the only effect of the kernel shape being on the details of the shape of the activity bump (Figure 5, bottom row). This shows that the dynamic retrieval mechanism, much like standard continuous attractors, is robust with respect to the precise shape of the interactions. Importantly, no particular relationship is required between the shape of the symmetric and the antisymmetric components of the kernel.
A dynamical memory: storing multiple manifolds
We described in detail the behavior of a neural network with asymmetric connectivity in the case of a single manifold encoded in the synaptic connectivity. For the network to behave as an autoassociative memory, however, it needs to be able to store and dynamically retrieve multiple manifolds. This is possible if we construct the interaction matrix ${J}_{ij}$ as the sum of the contributions from $p$ different, independently encoded manifolds:
Here, each ${x}_{i}^{\mu}$ represent the preferred firing location of neuron $i$ in the manifold μ, and $K$ is the same interaction kernel as in Equation 4, containing a symmetric and antisymmetric component.
The resulting dynamics show multiple continuous attractors, corresponding to the stored manifolds. Given an initial configuration, the networks rapidly converges to the nearest (i.e. most correlated) attractor, forming a coherent bump that then moves along the manifold as a consequence of the asymmetric component of the connectivity. The same dynamics, if projected on the other unretrieved manifolds, appear as random noise. This is illustrated in Figure 6 obtained with numerical simulations of a network encoding three different manifolds (of dimension one in (a), dimension two in (b)), and dynamically retrieving the first one. How does this shifting activity bump relate to the activity of single cells? To clarify this aspect we simulated an electrophysiological recording from the dynamical retrieval of Figure 6(a). We selected a random subset of 15 of the 1000 cells of the network, and generated spike trains using a poisson point process with an instantaneous firing rate proportional to the activity level of each cell yielded by the retrieval dynamics at each time step, plus a small noise. With this procedure we obtain the spike trains of each cell, that we can visualize with a rasterplot (Figure 6c). We can sort the cells according to their firing field position on each of the three manifolds (if these were, e.g., linear tracks, this would correspond to sorting according to place field positions in each of the tracks). When ordered as in the retrieved manifold, the recorded cells show a structured pattern that is considered the hallmark of sequential activity in experimental studies (Figure 6c, first column). If the same spikes are ordered according to the unretrieved manifolds, the pattern is lost (Figure 6c, second and third column), indicating that the population activity is retrieving the dynamical structure of the first manifold specifically.
Multiple dynamic manifolds can be memorized and retrieved by the network, with different speeds. Figure 7 (b) shows the result of the numerical simulation of a network with five different onedimensional manifold stored in its connectivity matrix, each encoded with a different value of γ (see appendix D). These manifold are dynamically retrieved by the network at different speeds, depending on the corresponding γ. This allows the model to simultaneously store memories without the constraint of a fixed dynamical timescale, an important feature for the description of biological circuits that need to be able to operate at different temporal scales.
Different memories stored in the same neural population can interact with each other, building more retrieval schemes in which, for example, the retrieval of a memory cues the retrieval of another one. To investigate this possibility, we have incorporated in the model a mechanism for interaction between memories, in which the endpoint of a dynamical, onedimensional manifold elicits the activation of the start point of a different one (see Appendix E). This results in the sequential retrieval of multiple memories, one after the other, as illustrated in Figure 7 (d). The top row shows the evolution in time of the overlaps ${m}_{\mu}$:
These order parameters quantify the coherence of the population activity $V(t)$ with each of the manifolds. Localized activity in manifold μ results in a large ${m}_{\mu}$, while a low ${m}_{\mu}$ corresponds to an incoherent scattering of the activity. The network retrieves the manifolds in sequence, one at a time, following the instructed transitions encoded in its connectivity. The allornothing behavior of the coherence parameters segments the continuous dynamics of the network into a sequence of discrete states.
The bottom row shows the evolution of the retrieved position, given in each manifold by the center of mass:
The dynamic runs across the retrieved manifold, from its beginning to its end, then jumps to next one and repeats the process. Note that the position in each of the unretrieved manifold fluctuates around $L/2$, as a consequence of the incoherence of the activity. Within each of the retrieved manifolds, the dynamic retains its continuous nature in the representation of the evolving position.
This sequential dynamic goes beyond the simple cued retrieval of independent memories that is the focus of most autoassociative memory models, and provides an example of a hybrid computational system, encoding both continuous and discrete features.
The interaction mechanism introduced here provides the opportunity to investigate the effect of more complex interactions than the simple memory chain presented here. We present here this first example as a proof of principle of the possibility of storing interacting dynamical memories, and will proceed to the investigation of more complex structures (e.g. interaction networks, probabilistic interactions, etc.) in future studies.
Storage capacity
The number of maps that can be stored and retrieved by an attractor network of this kind is typically proportional to the number of inputs per neuron $C$ (Treves and Rolls, 1991). The memory load $\alpha =p/C$ crucially determines the behavior of the system: when α is increased above a certain threshold value ${\alpha}_{c}$, the network is not able to retrieve any of the stored memories, falling instead into a disordered state. Therefore it is the magnitude of ${\alpha}_{c}$, that is the storage capacity of the system, that determines how effectively it can operate as a memory. To estimate the storage capacity of dynamic continuous attractors, and to investigate how it is impacted by the presence of asymmetric connections, we proceed along two complementary paths.
In the case a fully connected network, where the analytical tools developed for equilibrium systems are not applicable, we take advantage of the fact that numerical simulations can be effective for the estimation of the capacity, since the number of connections per neuron $C$ (the relevant parameters in the definition of the storage capacity ${\alpha}_{c}=p/C$) coincides with the number of neurons, minus one. For a highly diluted system, on the other hand, the number of neurons is much larger than $C$, making the simulation of the system very difficult in practice. We then resort to an analytical formulation based on a signaltonoise analysis (Battaglia and Treves, 1998), that exploits the vanishing correlations between inputs of different neurons in a highly diluted network, and does not require symmetry in the connectivity (Derrida et al., 1987). The quantification of the effect of loops in the dense connectivity regime, developed in Shiino and Fukai, 1992 and Roudi and Treves, 2004 for the case of static, discrete attractors, is beyond the scope of the present work and remains an interesting open direction.
In both the fully connected and the highly diluted case we study the dependence of the capacity on two important parameters: the map sparsity, that is the ratio between the width of the connectivity kernel (fixed to one without loss of generality) and the size $L$ of the stored manifolds, and the asymmetry strength γ. Note that the map sparsity $1/L$ is different from the activity sparsity $f$: the former is a feature of the stored memories, that we will treat as a control parameter in the following analysis; the latter is a feature of the network dynamics, and its value will be fixed by an optimization procedure in the calculation of the maximal capacity.
Analytical calculation of the capacity in the highly diluted limit
The signaltonoise approach we follow, illustrated in details in Battaglia and Treves, 1998, involves writing the local field h_{i} as the sum of two contributions: a signal term, due to the retrieved – ‘condensed’ – map, and a noise term consisting of the sum of the contributions of all the other, ‘uncondensed’ maps. In the diluted regime ($C/N\to 0$), these contributions are independent and can be summarized by a Gaussian term $\rho z$, where $z$ is a random variable with zero mean and unit variance. In the continuous limit, assuming without loss of generality that map $\mu =1$ is retrieved, we can write:
The noise will have variance:
where $L$ is the size of the map, $\u27e8\u27e8{K}^{2}(x{x}^{\prime})\u27e9\u27e9$ is the spatial variance of the kernel and
is the average square activity.
We can write the fixed point equation for the average activity profile ${m}^{1}(x)$, incorporating the dynamic shift with an argument similar to the one made for the single map case:
where $Dz=({e}^{{z}^{2}/2}/\sqrt{2\pi})dz$ and ${\int}^{+}f(x)\mathit{d}x=\int f(x)\theta (x)\mathit{d}x$. The average square activity $y$, entering the noise term, reads
Introducing the rescaled variables
And the functions
where $\mathrm{\Phi}(x)$ and $\sigma (x)$ are the Gaussian cumulative and the Gaussian probability mass function respectively, we can rewrite the fixedpoint equation as
Substituting Equation 28 in the expression for the noise variance 19 we obtain
If we are able to solve Equation 27 for the rescaled activity profile $v(x)$, we can use Equation 29 to calculate α. We can then maximize α with respect to $g$ and $w$: this yields the maximal value ${\alpha}_{c}$ for which retrieval solutions can be found.
These equations are valid in general and have to be solved numerically. Here we present the results for the case of onedimensional manifolds and interactions given by the exponential kernel of Equation 36. In this case, we have
where ${K}_{S}(x{x}^{\prime})={e}^{x{x}^{\prime}}$ is the symmetric component of the kernel. A simple approximation, illustrated in appendix F along with the detailed solution procedure, allows to decouple the dependence of ${\alpha}_{c}$ on γ and $L$, with the former given by the spatial variance given by Equation 30 and the latter by the solution of Equations 27 and 29 in the $\gamma =0$ case. We therefore have:
The storage capacity is plotted in Figure 8(a) as a function of γ and $L$.
For sparse maps and small values of the asymmetry, the capacity scales as
The scaling with $1/L$ is the same found by Battaglia and Treves, 1998 in the analysis of the symmetric case, as expected: for $\gamma =0$ the two models are equivalent.
The presence of asymmetry decreases the capacity, but does not have a catastrophic effect: the decrease is continuous and scales with a power of γ. There is therefore a wide range of values of asymmetry and map sparsity in which a large number of dynamic manifolds can be stored and retrieved.
Numerical estimation of the capacity for a fully connected network
To estimate the storage capacity for a fully connected network, we proceed with numerical simulations. For a network of fixed size $N$, and for given γ, $L$ and number of maps $p$, we run a number of simulations ${N}_{S}$, letting the network evolve from a random initial configuration. We consider a simulation to have performed a successful retrieval if the global overlap
that quantifies the coherence of the activity with map μ, is large for one map ${\mu}^{*}$ (at least 95% of the overlap value obtained in the case of a single map) and low in all others maps $\mu \ne \mu *$. We then define the retrieval probability as ${p}_{r}={N}_{R}/{N}_{S}$, where ${N}_{R}$ is the number of observed retrievals.
We repeat the process varying the storage load, that is the number of stored manifolds $p$. As $p$ is increased, the system reaches a transition point, at which the retrieval probability rapidly goes to zero. This transition is illustrated, for various values of γ, in Figure 8(b).
The number of maps p_{c} at which the probability reaches zero defines the storage capacity ${\alpha}_{c}(\gamma ,L)={p}_{c}(\gamma ,L)/N$. Repeating this procedure for a range of values of γ and $L$, we obtain the plots shown in Figure 8(c), for networks encoding one dimensional and two dimensional dynamical memories.
The first thing that can be noticed is that the network can store a large number of maps in the fully connected case as well, for a wide range of γ and $L$. A network with size in the order of ten thousand neurons could store from tens up to hundreds of dynamical memories.
The capacity for one dimensional attractors is higher than the one for their two dimensional counterparts. This is in line with what was found for symmetric networks (Battaglia and Treves, 1998).
Finally, we see that the peak of the capacity is found not only for intermediate values of map sparsity – again in line with what is known from the symmetric case – but also for intermediate values of the coefficient γ. This shows that moderate values of asymmetry can be beneficial for the storage of multiple continuous attractors, a nontrivial phenomenon that may be crucial for the memory capacity of biological networks. In particular this suggests that the natural tendency of the neural activity to show a rich spontaneous dynamics not only does not hinder the possibility for multiple memories to coexist in the same population, but can be a crucial ingredient for the correct functioning of memory mechanisms.
Discussion
The results presented show how a continuous attractor neural network with memorydependent asymmetric components in the connectivity can function as a dynamic memory. Our model is simple enough to be treated analytically, robustly produces dynamic retrieval for a large range of the relevant parameters and shows a storage capacity that is comparable to – and in some cases higher than – the capacity for static continuous attractors.
The analytical solution of the single manifold case shows that the interaction between the strength of the asymmetry and the velocity of the shift can be modulated by global features of the network activity such as its sparsity. This makes the network able to retrieve at different velocities in different regimes, without necessarily requiring short term synaptic modifications. The dependence of the retrieval speed from the sparsity of the activity yields a testable prediction in the context of hippocampal replay: faster reactivations are to be expected in association with an increase in the excitability of the population.
The insensitivity of the general features of the dynamics to the fine details of the shape of the interactions suggests that this mechanism could robustly emerge from learning or self organization processes in the presence of noise. The quantitative analysis of the learning process needed to effectively memorize lowdimensional dynamic manifolds is an interesting open direction, which goes beyond the scope of this work. However, the asymmetric Hebbian plasticity rule used here provides a simple and biologically realistic starting point.
Our analysis shows that the simple introduction of an aysmmetric Hebbian plasticity rule is sufficient to describe a dynamic memory able to store and retrieve multiple manifolds with different speeds, and it can incorporate interactions between them, producing the chained retrieval of a sequence of continuous memories.
The central result of the paper is the quantification of the storage capacity for dynamic continuous attractors, that we find to be large in magnitude, only mildly impacted by the asymmetry in diluted networks, and even higher than the capacity for static attractors in fully connected networks with moderate degrees of asymmetry. The storage capacity of out of equilibrium continuous attractors has been calculated, in a different scenario, by Zhong et al., 2020. The authors considered the case of an external signal driving the activity bump along the attractor, in a network of binary neurons, and proceeded to calculate the storage capacity with several assumptions that allowed to model the interference of multiple maps as thermal noise. Interestingly, their main result is broadly compatible with what we show here: in the highly diluted regime the velocity of the external signal has a mild – detrimental – effect on the capacity. This hints that out of equilibrium effects could show some form of universality across different network models and implementations of the shift mechanism. Moreover, a high capacity for dynamical sequences has shown to be achievable also in the case of discrete items (Gillett et al., 2020). Together these results suggest that the introduction of a temporal structure is compatible with the functioning of autoassociative memory in recurrent networks, and they open the way to the use of attractor models for the quantitative analysis of complex memory phenomena, such as hippocampal replay and memory schemata.
The model we propose suggests that the tendency of the activity to move in the neural population is a natural feature of networks with asymmetric connectivity, when the asymmetry is organized along a direction in a low dimensional manifold, and that static memories could be the exception rather than the rule. Indeed, Mehta et al., 2000 have shown that place fields can become more asymmetric in the course of spatial learning, demonstrating that the idea that symmetry emerges from an averaging of trajectorydependent effects (Sharp, 1991) does not always hold true. The structural role of the asymmetry has important implications for the analysis methods used to describe the activity of large populations of neurons, which often rely on the assumption of symmetry in the interactions (e.g. in the analysis of pairwise correlations) or equilibrium of the neural activity (e.g. the standard inverse Ising inference).
In most of the two and threedimensional cases analysed here, the asymmetry is constant along a single direction in each attractor. This can describe the situation in which the temporal evolution of the memory is structured along a certain dimension, and free to diffuse, without energy costs, in the remaining ones. The description of several onedimensional trajectories embedded in a two dimensional or three dimensional space requires a positiondependent asymmetric component. A systematic analysis of this situation is left for future analysis. However, the simple case of two intersecting trajectories embedded in a 2D map, analysed here, provides a proof of concept that several intersecting trajectories can be correctly retrieved, provided that the activity bump is sufficiently elongated in the direction of the trajectory. A progressive elongation of the place fields in the running direction has been observed in rats running on a linear track (Mehta et al., 1997), and our analysis predicts that an analogous effect would be observed also in openfield environments, when restricting the analysis to trajectories in the same running direction.
Another challenge is posed by the evidence that replayed sequences can be organized both forward and backward in time (Foster and Wilson, 2006). The model in its current formulation can produce the retrieval of a given sequence either forward or backward, but cannot alternate between the two. This suggests that, if replay relies on asymmetric connections, the hippocampus would have to use different representations for the forward and the backward component. The fact that a change in reward uniquely modulates backward replay (Ambrose et al., 2016) provides some evidence in this direction, but this question remains open to experimental investigation.
The dynamical retrieval of the model generalizes, in the framework of attractor networks, the idea of cognitive maps, incorporating a temporal organization in the lowdimensional manifold encoding the structure of the memory. This feature is reminiscent of the idea of memory schemata – constructs that can guide and constrain our mental activity when we reminisce about the past, imagine future or fictional scenarios or let our minds free to wander (Ciaramelli and Treves, 2019). The use of the present model to describe memory schemata will require further steps, such as an account of the interaction between hippocampus and neocortex, and the expansion of the mechanism describing the transition between different dynamical memories. Nevertheless, the idea of dynamic retrieval of a continuous manifold and the integration of the model presented here with effective models of cortical memory networks (Boboeva et al., 2018) open promising perspectives.
Finally, the full analytical description of a densely connected asymmetric attractor network is a challenge that remains open, and can yield valuable insights on the workings of the neural circuits underlying memory.
Appendix 1
A Numerical simulations
Numerical simulations are performed with python code, available at: https://github.com/davidespalla/CADM (copy archived at swh:1:rev:65a5bdfa291840cf5bf10e1da48aadb0b316a445), Spalla, 2021.
In the single map case, to each of the $N$ units ($N=1000$ in 1D, $N=1600$ in 2D) is assigned a preferential firing location x_{i} on a regular grid spanning the environment with linear dimension $L$. From this preferred firing locations the interaction matrix ${J}_{ij}$ is constructed, with the formula:
The precise shape of the symmetric and antisymmetric parts of the kernel are chosen differently in different simulations, according to the feature the analysis focused on, as specified in the main text. Once the network is assembled, the dynamics is initialized either with a random assignment of activity values to each unit in the range $[0,1]$, or with a gaussian bump centered in the middle of the environment (note that, due to the periodic boundary conditions and the translational invariance of the connectivity, the choice of the starting point does not influence the outcome). The dynamics is then evolved in discrete time steps, with the iteration of the following operations:
Calculation of the local fields ${h}_{i}(t)={\sum}_{j}{J}_{ij}{V}_{j}(t1)$
Calculation of the activity values ${V}_{i}(t)=g({h}_{i}(t){h}_{0})\theta ({h}_{i}(t){h}_{0})$
Dynamic adjustment of the threshold h_{0} such that only the $fN$ most active neurons remain active: ${h}_{0}={V}_{f}$, where ${\sum}_{j,{V}_{j}>{V}_{f}}=Nf$
Recalculation of the activity ${V}_{i}(t)$ with the adjusted threshold
Dynamic adjustment of the gain $g$ such that the mean activity $\u27e8\{{V}_{i}(t)\}\u27e9$ is fixed to 1: $g=g/\u27e8\{{V}_{i}(t)\}\u27e9$
Recalculation of the activity ${V}_{i}(t)$ with the adjusted gain
The adjustment of the parameters of the transfer function is enforced to constrain the network to operate at fixed sparsity $f$ and fixed mean, set to one without loss of generality. The dynamics is iterated for a given number of steps (usually 200), large enough to assure the convergence to the attractive manifold (reached usually in < five steps) and the observation of the dynamical evolution on the manifold.
In the case of multiple maps, the implemented dynamical evolution is the same, but the interaction matrix is constructed with multiple assignments of the preferred firing locations x_{i}, one for each of the $p$ stored maps:
The multiple assignments of the preferred firing locations are performed by a random shuffling of the labels of the units before the assignment to the position on the regular grid spanning each map.
B Simulation of electrophysiological recordings
To simulate the recording from a subset of the network during dynamical retrieval, we constructed a network of 1000 neurons with three different manifolds encoded in its connectivity matrix. We simulated a dynamic of the network from an initial condition correlated with the first manifold, for $T=400$ time steps. Since we used circular manifolds, we then selected a chunk of the dynamics corresponding to a single lap on the circle, to have an easier scenario to compare with experimental work. We then simulated an experimental recording by selecting a random subset of 15 observed cells. We used the activity values of each cell during the dynamics as the instantaneous firing rate of a Poisson random process to generate the spiking activity, using a conversion of $1/25$ to convert the arbitrary value of activity to a plausible range of firing rates, in the order of some Hertz. Spikes produced by this process are our simulated activity recording, and can be plotted according to the preferred firing position of each of the recorded cell in each of the three manifolds. The preferred firing position is supposed to be extracted, in an experimental setting, from the average rate maps of the recorded cells over many observations of the dynamics. In this way we obtain the plots reported in Figure 6(c).
C Analytical solution of the single map model in one dimension
To solve the integral Equation 9 in the case of a 1D manifold and the exponential kernel
we start by rewriting it as
where $[R,R],R>0$ is a compact domain for which there exist a solution of Equation 9 that is zero at the boundary. This allows to exploit the fact that our thresholdlinear system is, indeed, linear in the region in which $V(x)>0$.
We then differentiate twice to obtain the differential equation
This is a second order linear ODE, with constant coefficients. The presence of the shift term $\mathrm{\Delta}x$ inside the unknown function makes the equation nontrivial to solve. To solve it, we proceed in the following way: first, we look for a particular solution, that is easily found in the constant function
Then, we consider the associated homogeneous equation, and look for a solution in the form $V(x)={e}^{kx}$, where $k$ is a solution of the characteristic equation $C(k)=0$, with
This trascendental equation has to be solved graphically in the complex domain, as shown in Appendix 1—figure 1.
For each value of γ and $\mathrm{\Delta}x$, the equation shows a pair of complex conjugate solutions
The general solution of the equation will therefore have the form
From Equation 42 we can see that the absolute value of k_{i} is related to the width of the bump, and therefore to the sparsity of the solution, by the relation
$R$ does in turn depend only on the free parameter $g$, through the relation that can be derived in the symmetric case ($\gamma =0$, $\mathrm{\Delta}x=0$)
We can look for a solution with given sparsity $f=2R/L$ (where $L$ is the length of the manifold) by setting $R=fL/2$. Requiring the the sparsity to be fixed (i.e. setting $R$ constant) while varying γ constrains the zeros of 40 to lie in the subspace ${k}_{i}=\pi /2R$. This imposes a relation between γ and both the speed $s=\mathrm{\Delta}x/\tau $ (related to the speed of the shift) and k_{r} (related to the asymmetry of the shape of the solution). Varying $R$ we can study the dependence of the speed on both γ and $f$.
D Storing multiple manifolds with different retrieval speeds
To investigate the possibility to store and dynamically retrieve manifolds at different speeds, we have performed numerical simulations of a network with recurrent connectivity given by the formula
Each manifold μ is encoded with a different asymmetry strength ${\gamma}_{\mu}$. The normalization factor $1/(1+{\gamma}_{\mu})$ is added to ensure that each manifold contributes equally to the synaptic efficacies, and does not affect the ratio of strenghts of the symmetric and asymmetric components.
E Linking multiple manifolds together
In order to model the interaction between different stored manifolds, we add to the connectivity matrix an ”heteroassociative’ term, whose strength $J_{H}{}_{ij}{}^{\mu \nu}$ is proportional to the distance between the preferred firing location ${x}_{i}^{\mu}$ of neuron $i$ in manifold μ and the one of neuron $j$ in manifold ν, shifted by the length $L$ of the first manifold, which we denote by ${\stackrel{~}{x}}^{\nu}$. This shift enforces the fact that the interaction happens between the end of the first manifold and the beginning of the second. Then, the connectivity matrix will be given by
With
and ${G}_{\mu \nu}=1$ if there is a transition between μ and ν, and zero otherwise.
F Analytical calculation of ${\alpha}_{c}$ in the highly diluted limit
To calculate the maximum capacity, we first need to solve Equation 27 numerically for $v(x)$, for given $g$ and $w$. The procedure illustrated here focuses, for the sake of analytical simplicity, on the case of a onedimensional, exponential kernel
We start from equation 27:
First, following Battaglia and Treves, 1998 we rewrite it with the transformation
obtaining
We then transform this integral equation in a differential one, by differentiating twice. We obtain
where we have used the fact that ${\mathcal{N}}^{\prime}(x)=\mathrm{\Phi}(x)$. Equation 52 is a second order, nonlinear delayed differential equation. To solve it, it is not sufficient to impose an initial condition on a single point for the solution and the first derivative (i.e. something like $u({x}_{0})={u}_{0},{u}^{\prime}({x}_{0})={u}_{0}^{\prime}$): we have to specify the value of the function and its derivative in an interval $[{x}_{0},{x}_{0}+\mathrm{\Delta}x]$.
To do so, we reason that, if we want a bump solution, $u(x)$ has to be finite for $x\to \pm \mathrm{\infty}$ and cannot diverge. We then require the function to be constant ($u(x)={u}_{0}$, ${u}^{\prime}(x)=0$) before a certain value x_{0}, whose value can be set arbitrarily without loss of generality.
The value u_{0}, at $\gamma =0$ and $\mathrm{\Delta}x=0$ determines the shape of $u(x)$, as shown by the numerical solution presented in Appendix 1—figure 2. For ${u}_{0}<{u}^{*}$ the solution will diverge at $x\to \mathrm{\infty}$, while for ${u}_{0}>{u}^{*}$ it will oscillate. We are then left with a single value ${u}_{0}(g,w)={u}^{*}(g,w)$ for which the solution has the required form.
We can see from the full dots in the figure that the contribution of the integral in Equation 29 is remarkably constant in γ. This is due to the fact that the distortions of the bump shape induced by the presence of the asymmetry have a negligible effect on the average square activity $y$, whose value is dominated by the dependence on γ of the spatial variance of the kernel (Equation 19).
This allows us to approximate the value of the integral in Equation 29 with its value in the $\gamma =0$ case. We can then calculate the capacity as a function of γ and $L$ by solving the symmetric case for different Ls, and then incorporating the dependence on γ given by the kernel variance:
This approximation yields the results reported in the main text and in Figure 8
Data availability
The work did not generate any experimental dataset. The code used for numerical simulations is publicly available on Github (https://github.com/davidespalla/CADM (copy archived at https://archive.softwareheritage.org/swh:1:rev:65a5bdfa291840cf5bf10e1da48aadb0b316a445)).
References

Storing infinite numbers of patterns in a spinglass model of neural networksPhysical Review Letters 55:1530–1533.https://doi.org/10.1103/PhysRevLett.55.1530

A computational model for preplay in the HippocampusFrontiers in Computational Neuroscience 7:161.https://doi.org/10.3389/fncom.2013.00161

A model of spatial map formation in the Hippocampus of the ratNeural Computation 8:85–93.https://doi.org/10.1162/neco.1996.8.1.85

Accurate path integration in continuous attractor network models of grid cellsPLOS Computational Biology 5:e1000291.https://doi.org/10.1371/journal.pcbi.1000291

Cellular bases of hippocampal EEG in the behaving ratBrain Research Reviews 6:139–171.https://doi.org/10.1016/01650173(83)900371

An exactly solvable asymmetric neural network modelEurophysics Letters 4:167–173.https://doi.org/10.1209/02955075/4/2/007

The reorganization and reactivation of hippocampal maps predict spatial memory performanceNature Neuroscience 13:995–1002.https://doi.org/10.1038/nn.2599

BookFrom Conditioning to Conscious Recollection: Memory Systems of the BrainOxford University Press.

A spin glass model of path integration in rat medial entorhinal cortexJournal of Neuroscience 26:4266–4276.https://doi.org/10.1523/JNEUROSCI.435305.2006

Hippocampal Sharp Waves and Reactivation during Awake States Depend on Repeated Sequential ExperienceJournal of Neuroscience 26:12415–12426.https://doi.org/10.1523/JNEUROSCI.411806.2006

Awake replay of remote experiences in the HippocampusNature Neuroscience 12:913–918.https://doi.org/10.1038/nn.2344

The storage capacity of potts models for semantic memory retrievalJournal of Statistical Mechanics: Theory and Experiment 2005:P08010.https://doi.org/10.1088/17425468/2005/08/P08010

Replay and time compression of recurring spike sequences in the HippocampusThe Journal of Neuroscience 19:9497–9507.https://doi.org/10.1523/JNEUROSCI.192109497.1999

Play it again: reactivation of waking experience and memoryTrends in Neurosciences 33:220–229.https://doi.org/10.1016/j.tins.2010.01.006

The Role of Hippocampal Replay in Memory and PlanningCurrent Biology 28:R37–R50.https://doi.org/10.1016/j.cub.2017.10.073

Schema design and implementation of the grasprelated mirror neuron systemBiological Cybernetics 87:116–140.https://doi.org/10.1007/s0042200203181

An associative network with spatially organized connectivityJournal of Statistical Mechanics: Theory and Experiment 2004:P07010.https://doi.org/10.1088/17425468/2004/07/P07010

Free association transitions in models of cortical latching dynamicsNew Journal of Physics 10:015008.https://doi.org/10.1088/13672630/10/1/015008

Path integration and cognitive mapping in a continuous attractor neural network modelThe Journal of Neuroscience 17:5900–5920.https://doi.org/10.1523/JNEUROSCI.171505900.1997

Cortical control of arm movements: a dynamical systems perspectiveAnnual Review of Neuroscience 36:337–359.https://doi.org/10.1146/annurevneuro062111150509

Selfconsistent signaltonoise analysis and its application to analogue neural networks with asymmetric connectionsJournal of Physics A: Mathematical and General 25:L375–L381.https://doi.org/10.1088/03054470/25/7/017

Temporal association in asymmetric neural networksPhysical Review Letters 57:2861–2864.https://doi.org/10.1103/PhysRevLett.57.2861

Can grid cell ensembles represent multiple spaces?Neural Computation 31:2324–2347.https://doi.org/10.1162/neco_a_01237

Selforganizing continuous attractor network models of hippocampal spatial view cellsNeurobiology of Learning and Memory 83:79–92.https://doi.org/10.1016/j.nlm.2004.08.003

Thresholdlinear formal neurons in autoassociative netsJournal of Physics A: Mathematical and General 23:2631–2650.https://doi.org/10.1088/03054470/23/12/037

What determines the capacity of autoassociative memories in the brain?Network: Computation in Neural Systems 2:371–397.https://doi.org/10.1088/0954898X_2_4_004

Episodic Memory: From Mind to BrainAnnual Review of Psychology 53:1–25.https://doi.org/10.1146/annurev.psych.53.100901.135114

Dynamics and computation of continuous attractorsNeural Computation 20:994–1025.https://doi.org/10.1162/neco.2008.1006378

Representation of spatial orientation by the intrinsic dynamics of the headdirection cell ensemble: a theoryThe Journal of Neuroscience 16:2112–2126.https://doi.org/10.1523/JNEUROSCI.160602112.1996

Nonequilibrium statistical mechanics of continuous attractorsNeural Computation 32:1033–1068.https://doi.org/10.1162/neco_a_01280

Longterm dynamics of CA1 hippocampal place codesNature Neuroscience 16:264–266.https://doi.org/10.1038/nn.3329
Decision letter

Adrien PeyracheReviewing Editor; McGill University, Canada

Laura L ColginSenior Editor; University of Texas at Austin, United States

Ulisse FerrariReviewer; Sorbonne Université, France
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Acceptance summary:
The study presents a new and elegant theoretical framework that generalizes memory storage and retrieval in neural networks to the dynamical case. Specifically, the authors show that imposing asymmetric interaction between units is sufficient to learn and retrieve sequential activation of these units, resembling the dynamics of hippocampal place cells during memory encoding and replay.
Decision letter after peer review:
[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]
Thank you for submitting your work entitled "Continuous attractors for dynamic memories" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Ulisse Ferrari (Reviewer #3).
Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that this manuscript will not be considered further for publication in eLife. However, as pointed out by the reviewers, your work is of potential interest for the field. You may thus consider to submit a new version if you think you can significantly improve the manuscript. In this case, please include the original manuscript reference number as well as the names of the senior (Laura L Colgin) and reviewer editor (Adrien Peyrache) in your cover letter.
The authors present a computational model that allows for storing and reactivating dynamical memory patterns on low dimensional manifolds. The model presented in this manuscript is simple and elegant. Synaptic couplings are decomposed into a symmetric component, which makes low dimensional manifolds attractive, and an asymmetric component, which produces dynamic retrieval. This is of particular interest as sequential activation of neurons (i.e. trajectories on a manifold) is ubiquitous in the brain, especially in the hippocampus. This work is thus an important contribution to the field, bridging the gap between theoretical and experimental studies. However, the reviewers have raised several major concerns regarding the details of the model and its presentation. Furthermore, it seems necessary to demonstrate that the present model works well beyond the onedimensional case and/or to test predictions of the model with actual experimental data.
Reviewer #2:
I really like the simplicity of the learning rule. However, the article lack clarity in many fundamental sections, and it took me really long to understand important details (if I understood them). When the authors introduce the model, they introduce the inhibition term, but then they remove it as they claim it can be reabsorbed in h_{0}. From the equations this claim is not correct, assuming that h_{0} does not depend on the values of Vs. It's only by reading the methods that I understood that they actually choose dynamically both g and h_{0} to enforce a certain level of sparseness. I would say that already in the main text without introducing the term multiplying b at all, it is only confusing.
Even more confusing is the introduction of x. They define x as "preferential firing location in the stimulus space". They never spoke about stimuli, and I think they don't need to. I believe I finally understood what they mean, but I had to read the rest of the article (a couple of times). The description of the phenomenology is also unclear. In Figure 2 they never explain that they use x (y and z) to identify a neuron. So basically V is plotted as a function of the neuron number, parametrized by x. I guess that they discretized x and then assigned each x to a neuron. They should be more explicit about this, especially considering the broad audience of eLife. Moreover, they claim that the dynamics occurs in a low dimensional manifold, but in all the examples I see, their trajectory is always 1D, even in the cases in which they have multiple variables (e.g. figure 2c). Also in the dynamic kernel section they determine the kernel only in the case of one variable x, and obviously a 1D trajectory. The kernel they get is a for a rigid shift, under the assumption that the kernel is exponential. What would happen in a more general case? What would be the phenomenology if the dynamics really moves along a 2D or a 3D manifold? They only briefly mention in the Discussion that they consider only a single direction (they should say it from the very beginning). Also, what kind of asymmetric kernels do they consider? Is it only a rigid shift, or it could be a curvilinear trajectory when the underlying manifold has 2 or more dimensions? What is the maximal dimensionality of the trajectory? I guess it depends on the specific form of the asymmetric kernel. I have the impression that the authors have an interesting system to study, but they report only very few aspects of its dynamics (and in a rather confusing way).
Finally, they mention a few predictions in the Discussion, but they're rather generic. For a journal like eLife, in the absence of real data, quantitative predictions are important. The introduction, which is nicely written, contains several examples of phenomena that could be described by the proposed model.
There are several misprints and references missing.
Reviewer #3:
Spalla, Cornacchia and Treves presents a novel class of neural network models capable of storing and retrieving dynamical memory patterns. This result is a major advance in the field because it opens for many possible developments and importantly, it reduces the gap between theoretical and experimental works on memory. Previous models already showed the capacity of a neuronal network to store multiple memories, where each of them is represented by a pattern of neural activations. However, in these models when a particular memory is retrieved, neural activity aligns with the corresponding pattern, but it is static and does not flow along it. This issue prevents a clear connection between experimental and theoretical analysis of memory reactivations which are dynamical in nature, as for example in the case of the replay effect. Here instead, once neural activity aligns with a stored memory pattern, it evolves along it as a dynamical bump of activity. Therefore, here memories are dynamical objects, something closer to what has been observed in experiments.
Overall the paper presents some very relevant science and the claims are well supported by the presented analyses.
Strengths:
Having a simple but powerful model that allows for temporal evolution of reactivated memories represents a major step forward towards the construction neural network models with realistic phenomenology
Despite the added complexity of the present 'dynamical memory' model with respect to 'static memory' ones, storage capacity is not much affected, and can be even larger in certain regimes.
The authors' approach generalises previous static models, which are nicely recovered as limiting cases. This allows for encompassing previous results into a more flexible and powerful framework.
The behavior of the proposed model is robust against the fine details of its construction, meaning that a large class of models will present a similar phenomenology. This is important because it opens for considering more biologically plausible models that will still present the same phenomenological behavior.
Weaknesses:
Although the temporal evolution of the reactivated memory is the major advance of this work, their dynamics is very simple: once reactivated they evolve along a given (rigid) direction. It is not clear, for example, if the dynamics can be timereversed, if cycles are allowed and if different memories can interact dynamically.
This work reduces the gap between computational and experimental studies of memory storing and retrieval. However it remains still very theoretical, and misses some development towards real neuronal networks. For example, it is suggested, but not shown that a biologically plausible mechanism of plasticity can result in systems that display the same phenomenology of the presented model. Additionally, the paper misses a clear discussion of how the model can empower the analysis of experimental results on memory storage and reactivation. Although it is fair to imagine that reaching these ambitious goals is a long term program that encompasses the current paper, the manuscript would benefit from a deep discussion of what are the main obstacles to be faced. As such, this weakness should be seen as suggestions for further developments, and not as limitation of the impact of the current work.
I find the dynamics of the retrieved memory either very simple, or not well explained and/or showcased. At first, it is not clear to me if the manifolds are compact and have endpoints. If it is the case, what happens when the dynamics reaches this point? Also, the velocity of the reactivation looks like a rigid property of the stored memory, but then the authors show that actually it might vary depending on the property of the systems (Figure 4, for example). Is there a way for actively tuning the velocity? This may link with the timecompressed reactivation observed in the replay? Additionally, it is not explained if the model can support more complex dynamical effects. For example, considering the effect of reverse replay (Foster and Wilson 2006), I wonder if the dynamics can be timereversed, so that a dynamical memory can be reactivated in both directions. Also, is it possible to have cycles, so that the dynamics will run in a circle over and over? More generically, which kind of dynamics are supported by the current model? How to extend the model to increase the resulting phenomenology?
Additionally, I found the model's limitations discussed in LL 363>370 not clearly explained, and they should anyhow be expanded discussing what the model dynamics can or cannot support.
[Editors’ note: further revisions were suggested prior to acceptance, as described below.]
Thank you for resubmitting your work entitled "Continuous attractors for dynamic memories" for further consideration by eLife. Your revised article has been evaluated by Laura Colgin (Senior Editor) and a Reviewing Editor.
The manuscript has been greatly improved since the first submission. We believe that this manuscript is potentially of great interest for the community, however we think that the manuscript would benefit from one minor (but key) revision: to add a "reallife" example that would illustrate the potential of this theoretical work to a broad audience. Basically, this could be as simple as a toy model with place cells spiking along a linear/circular track. This would then entail to simulate the network spontaneous activity and measure the firing rate in time of a subsample of the cells (recorded cells), generate some spikes from the firing rate, show that the spiking activity of the network resembles that of replay events (apart from the reverse replay), and present the results in a way that an experimental reader will immediately understand the importance of the study.
We have also made some comments regarding the general presentation of the figures and equations, detailed below in the reviewer's specific comments. The authors can of course decide not to follow these suggestions as they are not critical for the main message of the paper.
Reviewer #2:
Spalla, Cornacchia and Treves have largely improved the manuscript and fully addressed all my main criticisms, and in my opinion also those of the other referee. I already found the paper interesting during the first review round, and now it has also improved in clarity and completeness. I particularly appreciated:
– The results of the paper are novel and very interesting. Even if the model is very simple and easy to understand, it presents an impressive phenomenology. Also, by just adding one more feature, it nicely extends previous models (the asymmetric component of the kernel), which are recovered when γ>0.
– The efforts that the authors have made to better characterise and describe the phenomenology of the dynamical retrieval: analyses as those of Figure 7 are very intriguing and in my opinion will open for further developments of the research field.
– The improvement of the exposure, which now better fits the scope and wide audience of eLife.
For all these reasons I recommend the publication of this paper.
Reviewer #3:
The study by Spalla et al. presents a new and elegant theoretical framework that generalizes memory storage and retrieval in auto associative networks to the dynamical case. Specifically, while purely symmetrical synapses (as in canonical attractor networks) lead to the retrieval of static patterns, asymmetric synapses lead to sequential activation of units in the network, as observed for example during replay of hippocampal place cells. In addition, the study demonstrates that, as previous models had shown in the static case, such network can store multiple maps.
The main concerns with this study, in its present form, is the general presentation of the model and of its general properties, which could be hard to follow for a nonexpert reader. The study would also benefit from concrete examples, such as simulation of replay in a minimal model of hippocampal place cells.
The study by Spalla et al. presents a new and elegant theoretical framework that generalizes memory storage and retrieval in auto associative networks to the dynamical case. Specifically, asymmetric synapses lead to sequential activation of units in the network, instead of the classical static patterns emerging from purely symmetrical synapses (e.g. attractors in which connectivity kernels depends only on the "representational distance" between units in the feature space). The study also demonstrates that, as previous models had shown in the static case, such network can also store multiple maps.
While the authors have already addressed many comments made during a previous round of review, the general presentation of the manuscript should be greatly improved for publication in a life science journal targeting a broad audience. Importantly, the number of main figures should be drastically reduced, and figures should focus only on the most important claims of the study.
One major concern though is that the study would benefit from a concrete example. For example, displaying simulated "replayed" trajectories in 2D maps (as a neurophysiologist would present it in an experimental study) in different cases influencing the speed (and general behaviour) of replay, such as different levels of asymmetry and sparsity.
The remarks below are related to the specific presentation of the manuscript.
Top panels of Figure 3 and associated text should be put in supplementary info, as this section can be fairly confusing for the nonexpert reader and does not contribute much to the understanding of the phenomenon.
Figure 4 conveys an important message: the speed of the bump is related to the asymmetry of the connectivity kernel. The figure legend should be clarified: xaxis should read "kernel asymmetry (\γ)", with arrows point to either left ("more symmetrical") to right ("more asymmetrical"). dx should be replace by "bump speed (a.u.)". The message of Figure 4b is unclear and should certainly be discarded.
While Figure 5 shows that the solution can be generalized to a broad class of asymmetric kernels, Figure 6 presents again the link between network parameters and bump features (speed, etc.) in the case of an exponential kernel. One idea would be to merge the bottom panels of Figure 3, Figure 4a and Figure 6 to convey one message about the exponential case. Then show (current) Figure 5 to show that it is valid in other cases.
Figure 7 presents an interesting simulation, illustrating that retrieval can be specific to a particular map. Here again, the presentation should be drastically improved: x_1,2,3 should be replaced by more explicit terms like "feature space (map 1)" or something similar. Yaxis should be labels "activity level (V)" or something similar, etc. Against, it is important to present figure that do not require the reader to rely too much on abbreviations and variable names.
The link between kernel asymmetry and storage capacity is important but should be made clearer. Most of the equations (1729) should be put in appendix and be limited to equation 30 and 31. Figure 8 and 9 present overlapping data, they should be merged (or better: only one of the two should be shown) and the same remarks as for Figure 4 apply for the presentation of this figure.
The presentation of the nonmonotonic dependence of storage capacity w.r.t \γ in the fully connected case is another aspect that should be improved. Figure 10 would benefit from an additional graph showing P=50% chance retrieval as a function of γ to illustrate the nonmonotonic relationship. This figure should be merge in a multipanel figure with Figure 11 so that the reader can immediately grasp the main messages regarding the fully connected case.
https://doi.org/10.7554/eLife.69499.sa1Author response
[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]
Reviewer #2:
I really like the simplicity of the learning rule. However, the article lack clarity in many fundamental sections, and it took me really long to understand important details (if I understood them). When the authors introduce the model, they introduce the inhibition term, but then they remove it as they claim it can be reabsorbed in h_{0}. From the equations this claim is not correct, assuming that h_{0} does not depend on the values of Vs. It's only by reading the methods that I understood that they actually choose dynamically both g and h_{0} to enforce a certain level of sparseness. I would say that already in the main text without introducing the term multiplying b at all, it is only confusing.
We have clarified in the main text (LL 120126) how the inhibition mechanism works, and simplified the notation, avoiding the complication of the introduction of the b(x) term.
Even more confusing is the introduction of x. They define x as "preferential firing location in the stimulus space". They never spoke about stimuli, and I think they don't need to. I believe I finally understood what they mean, but I had to read the rest of the article (a couple of times).
We have expanded the section “A mechanistic model for dynamic retrieval” (LL 127139), clarifying the meaning of the parameter x and avoiding references to stimuli, and striving for a clearer and more accessible explanation of the model.
The description of the phenomenology is also unclear. In Figure 2 they never explain that they use x (y and z) to identify a neuron. So basically V is plotted as a function of the neuron number, parametrized by x. I guess that they discretized x and then assigned each x to a neuron. They should be more explicit about this, especially considering the broad audience of eLife.
We have clarified the phenomenology presented in Figure 2, by rewriting and expanding the description of the plots in the main text (LL 174183)
Moreover, they claim that the dynamics occurs in a low dimensional manifold, but in all the examples I see, their trajectory is always 1D, even in the cases in which they have multiple variables (e.g. figure 2c). Also in the dynamic kernel section they determine the kernel only in the case of one variable x, and obviously a 1D trajectory.
We clarify this aspect in LL 207212 and in the discussion (LL 500506), underlying how the dynamics in 2D and 3D is different from a simple 1D trajectory.
We have also added a new analysis of the case of two intersecting 1D trajectories embedded in a 2D manifold (Figure 2(e) and accompanying text), which explores the case of positiondependent asymmetry, and provides a proof of concept of the possibility to expand the model in this direction.
The kernel they get is a for a rigid shift, under the assumption that the kernel is exponential.
This problem is addressed in the section “dynamical retrieval is robust” (LL 266283). There we investigate the effect of the kernel shape and its parameters on the behaviour of the model. We find this behaviour to be largely independent from the specific kernel shape, and able to produce dynamic retrieval in a remarkably broad range of parameters (see Figure 4 and Figure 5).
What would happen in a more general case? What would be the phenomenology if the dynamics really moves along a 2D or a 3D manifold? They only briefly mention in the Discussion that they consider only a single direction (they should say it from the very beginning). Also, what kind of asymmetric kernels do they consider? Is it only a rigid shift, or it could be a curvilinear trajectory when the underlying manifold has 2 or more dimensions? What is the maximal dimensionality of the trajectory? I guess it depends on the specific form of the asymmetric kernel. I have the impression that the authors have an interesting system to study, but they report only very few aspects of its dynamics (and in a rather confusing way).
We have added clarifications on these aspects as stated in the previous points.
In the case with a constant asymmetric direction, the presence of the asymmetry imposes no additional bound on the maximal dimensionality of the manifold.
As it is the case in continuous attractor networks of any kind, a tradeoff is to be expected between the resolution and the dimensionality of the representation, if the size of the network is kept fixed.
Finally, they mention a few predictions in the Discussion, but they're rather generic. For a journal like eLife, in the absence of real data, quantitative predictions are important. The introduction, which is nicely written, contains several examples of phenomena that could be described by the proposed model.
We have added two key quantitative predictions by the model: the predicted elongation of place fields in the direction of the running trajectory (LL 224227, and LL 506513) and an interaction between the sparsity of the activity of the network and the retrieval speed (LL 191 196 and LL 452459).
These predictions complement the main result of the paper: that auto associative memory networks can encode continuous dynamic memory with large capacity, and that the asymmetry of connections, often considered as an afterthought in mechanistic models, can be a crucial ingredient of memory systems.
We believe these points are of interest to the broad eLife audience, beyond the technical apparatus developed to obtain them.
There are several misprints and references missing.
We have made an effort to correct all misprints and missing references, and we have added several more references to better put the work in the context of the existing literature.
Reviewer #3:
Spalla, Cornacchia and Treves presents a novel class of neural network models capable of storing and retrieving dynamical memory patterns. This result is a major advance in the field because it opens for many possible developments and importantly, it reduces the gap between theoretical and experimental works on memory. Previous models already showed the capacity of a neuronal network to store multiple memories, where each of them is represented by a pattern of neural activations. However, in these models when a particular memory is retrieved, neural activity aligns with the corresponding pattern, but it is static and does not flow along it. This issue prevents a clear connection between experimental and theoretical analysis of memory reactivations which are dynamical in nature, as for example in the case of the replay effect. Here instead, once neural activity aligns with a stored memory pattern, it evolves along it as a dynamical bump of activity. Therefore, here memories are dynamical objects, something closer to what has been observed in experiments.
Overall the paper presents some very relevant science and the claims are well supported by the presented analyses.
Strengths:
Having a simple but powerful model that allows for temporal evolution of reactivated memories represents a major step forward towards the construction neural network models with realistic phenomenology
Despite the added complexity of the present 'dynamical memory' model with respect to 'static memory' ones, storage capacity is not much affected, and can be even larger in certain regimes.
The authors' approach generalises previous static models, which are nicely recovered as limiting cases. This allows for encompassing previous results into a more flexible and powerful framework.
The behavior of the proposed model is robust against the fine details of its construction, meaning that a large class of models will present a similar phenomenology. This is important because it opens for considering more biologically plausible models that will still present the same phenomenological behavior.
Weaknesses:
Although the temporal evolution of the reactivated memory is the major advance of this work, their dynamics is very simple: once reactivated they evolve along a given (rigid) direction. It is not clear, for example, if the dynamics can be timereversed, if cycles are allowed and if different memories can interact dynamically.
This work reduces the gap between computational and experimental studies of memory storing and retrieval. However it remains still very theoretical, and misses some development towards real neuronal networks. For example, it is suggested, but not shown that a biologically plausible mechanism of plasticity can result in systems that display the same phenomenology of the presented model. Additionally, the paper misses a clear discussion of how the model can empower the analysis of experimental results on memory storage and reactivation. Although it is fair to imagine that reaching these ambitious goals is a long term program that encompasses the current paper, the manuscript would benefit from a deep discussion of what are the main obstacles to be faced. As such, this weakness should be seen as suggestions for further developments, and not as limitation of the impact of the current work.
I find the dynamics of the retrieved memory either very simple, or not well explained and/or showcased. At first, it is not clear to me if the manifolds are compact and have endpoints. If it is the case, what happens when the dynamics reaches this point?
We have clarified in the presentation of the model (LL 158165) that we consider manifolds with periodic boundary conditions, in which the dynamic retrieval produces periodic cycles. We highlight that this simplifying assumption is not a necessary feature of the model, and we study the alternative case in which this is replaced by a link between the endpoint of a manifold and the beginning of the next one (Figure 7 (c), (d) and accompanying text.). This analysis not only shows that dynamic retrieval does not rely on the specific boundary conditions, but also demonstrates that a chain of multiple continuous manifolds can be retrieved sequentially by the network.
Also, the velocity of the reactivation looks like a rigid property of the stored memory, but then the authors show that actually it might vary depending on the property of the systems (Figure 4, for example). Is there a way for actively tuning the velocity? This may link with the timecompressed reactivation observed in the replay?
We have further expanded the study of the dependence of the retrieval speed on the parameters of the network by adding an analysis of the effect of sparsity on the reactivation velocity (Figure 2 (d) and Figure 3 (b), (c)).
We find that the sparsity of the representation (a feature of the network activity, that can be regulated instantaneously during the dynamics) interacts with the asymmetry parameter \γ in determining the retrieval speed.
The possibility of instantaneously tuning the velocity by acting on the sparsity of the activity also yields a key new prediction: in a network with fixed connectivity, retrieval speed can be modulated by the activity level of the population.
Additionally, it is not explained if the model can support more complex dynamical effects. For example, considering the effect of reverse replay (Foster and Wilson 2006), I wonder if the dynamics can be timereversed, so that a dynamical memory can be reactivated in both directions. Also, is it possible to have cycles, so that the dynamics will run in a circle over and over? More generically, which kind of dynamics are supported by the current model? How to extend the model to increase the resulting phenomenology?
We have added the clarification that the current model cannot handle backward and forward replay of the same trajectory (LL 196198 and LL 514521 in the discussion). The dynamics is naturally organized in cycles under the assumption of periodic boundary conditions, but this is not the only possibility the model can describe.
To clarify this point, we have added new analyses that demonstrate complex dynamical effects. In particular the linking of dynamical memories and the possibility to simultaneously memorize manifolds with different values of the asymmetric component, leading to different retrieval speeds (Figure 7 and accompanying text).
Additionally, I found the model's limitations discussed in LL 363>370 not clearly explained, and they should anyhow be expanded discussing what the model dynamics can or cannot support.
As stated in response to reviewer 2: we clarify this aspect in LL 207212 and in the discussion (LL 500506), underlying how the dynamics in 2D and 3D is different from a simple 1D trajectory.
We have also added a new analysis of the case of two intersection 1D trajectories embedded in a 2D manifold (Figure 2(e) and accompanying text), which explores the case of positiondependent asymmetry, and provides a proof of concept of the possibility to expand the model in this direction.
[Editors’ note: what follows is the authors’ response to the second round of review.]
Reviewer #3:
The study by Spalla et al. presents a new and elegant theoretical framework that generalizes memory storage and retrieval in auto associative networks to the dynamical case. Specifically, while purely symmetrical synapses (as in canonical attractor networks) lead to the retrieval of static patterns, asymmetric synapses lead to sequential activation of units in the network, as observed for example during replay of hippocampal place cells. In addition, the study demonstrates that, as previous models had shown in the static case, such network can store multiple maps.
The main concerns with this study, in its present form, is the general presentation of the model and of its general properties, which could be hard to follow for a nonexpert reader. The study would also benefit from concrete examples, such as simulation of replay in a minimal model of hippocampal place cells.
The study by Spalla et al. presents a new and elegant theoretical framework that generalizes memory storage and retrieval in auto associative networks to the dynamical case. Specifically, asymmetric synapses lead to sequential activation of units in the network, instead of the classical static patterns emerging from purely symmetrical synapses (e.g. attractors in which connectivity kernels depends only on the "representational distance" between units in the feature space). The study also demonstrates that, as previous models had shown in the static case, such network can also store multiple maps.
While the authors have already addressed many comments made during a previous round of review, the general presentation of the manuscript should be greatly improved for publication in a life science journal targeting a broad audience. Importantly, the number of main figures should be drastically reduced, and figures should focus only on the most important claims of the study.
We thank the reviewer for the suggestion. We have reduced the number of main figures by merging the last three figures in what is now Figure 8, and have strived to improve graphical quality and figure accessibility by a general standardization of figure design and a relabeling of figure axes throughout the manuscript, with the aim to make them more consistent and more readily interpretable.
One major concern though is that the study would benefit from a concrete example. For example, displaying simulated "replayed" trajectories in 2D maps (as a neurophysiologist would present it in an experimental study) in different cases influencing the speed (and general behaviour) of replay, such as different levels of asymmetry and sparsity.
We agree with the reviewer that a concrete example is beneficial for the interpretation of the study. To this end, we have added the analysis of a simulated electrophysiological experiment probing the population activity during dynamic retrieval (see also response to the editors).
This simulation is illustrated in Figure 6(c), constructed so as to be familiar to experimental readers in its design and interpretation.
The figure and the accompanying text (LL 314328, plus appendix B for details) serve the double purpose to connect the behaviour of the model to experimentally observable quantities, and to further illustrate the concept of dynamic auto associative memory, that Figure 6 was originally built to exemplify.
The remarks below are related to the specific presentation of the manuscript.
Top panels of Figure 3 and associated text should be put in supplementary info, as this section can be fairly confusing for the nonexpert reader and does not contribute much to the understanding of the phenomenon.
We believe that the analytical solvability of the model is a key feature of our work that can yield quantitative insights and a deep understanding of the mechanisms behind dynamic retrieval. Therefore, we chose to maintain Figure 3 and the accompanying paragraph in the main text. However, during the first revision we moved all technical details to Appendix C, including the previous Figure 3 illustrating the analytical solution procedure, leaving only results and interpretation in the main text.
We have tried to improve the graphical design and axis labeling to make the figure more accessible to a nontechnical audience.
Figure 4 conveys an important message: the speed of the bump is related to the asymmetry of the connectivity kernel. The figure legend should be clarified: xaxis should read "kernel asymmetry (\γ)", with arrows point to either left ("more symmetrical") to right ("more asymmetrical"). dx should be replace by "bump speed (a.u.)". The message of Figure 4b is unclear and should certainly be discarded.
We think the reviewer is referring to the version of Figure 4 that was in the original manuscript. We indeed changed this figure during our first revision, by getting rid of Figure 4b, as the reviewer suggested, and incorporating Figure 4a, with an extended analysis on the effect of sparsity, in what is now Figure 3. We have also changed axis label names to improve clarity.
While Figure 5 shows that the solution can be generalized to a broad class of asymmetric kernels, Figure 6 presents again the link between network parameters and bump features (speed, etc.) in the case of an exponential kernel. One idea would be to merge the bottom panels of Figure 3, Figure 4a and Figure 6 to convey one message about the exponential case. Then show (current) Figure 5 to show that it is valid in other cases.
We agree with the reviewer and thank them for the suggestion. We swapped the two figures (they are now Figure 4, on the effects of the parameters in the exponential case and Figure 5, on different kernel shapes) and restructured the section on robustness of the asymmetric interactions by examining in depth the exponential kernel case first, and then extending the analysis to other kernel shapes.
Figure 7 presents an interesting simulation, illustrating that retrieval can be specific to a particular map. Here again, the presentation should be drastically improved: x_1,2,3 should be replaced by more explicit terms like "feature space (map 1)" or something similar. Yaxis should be labels "activity level (V)" or something similar, etc. Against, it is important to present figure that do not require the reader to rely too much on abbreviations and variable names.
We have changed the labels in this figure, as well as in the others throughout the manuscript, following this advice for which we are thankful to the reviewer. We have also made some changes in coloring and label sizes to improve readability, as well as adding here as a new panel the results of the new simulation of experimental recordings (see main points).
The link between kernel asymmetry and storage capacity is important but should be made clearer. Most of the equations (1729) should be put in appendix and be limited to equation 30 and 31. Figure 8 and 9 present overlapping data, they should be merged (or better: only one of the two should be shown) and the same remarks as for Figure 4 apply for the presentation of this figure.
We believe that the reviewer is here referring to the first version of the manuscript.
During our first revision we heavily restructured this section putting most of the mathematical details to Appendix F. Also Figure 8 was moved to the appendix, since we agree with the reviewer that the data is overlapping with Figure 9 (now Figure 8a), and the point of the figure (to show the goodness of our approximation ansatz) is beyond the scope of the main text. Following the previous point on the number of main figures, we have aggregated all figures of this section in what is now Figure 8, that conveys in a single place the main messages on the storage capacity of the network.
The presentation of the nonmonotonic dependence of storage capacity w.r.t \γ in the fully connected case is another aspect that should be improved. Figure 10 would benefit from an additional graph showing P=50% chance retrieval as a function of γ to illustrate the nonmonotonic relationship. This figure should be merge in a multipanel figure with Figure 11 so that the reader can immediately grasp the main messages regarding the fully connected case.
We have merged the figures in what is now Figure 8. We did not insert an additional graph, with the rationale not to show overlapping data and not to multiply further the number of figures. However, we have changed labels and other graphical aspects to improve understandability and readability. We believe that the nonmonotonicity of the storage capacity is appreciable by the reader from both Figure 8b, where the transition point is shown to move back and forth with γ, and Figure 8c, in which the joint dependence of the capacity on γ and 1/L shows a clear peak in the bulk of the parameter ranges.
https://doi.org/10.7554/eLife.69499.sa2Article and author information
Author details
Funding
Human Frontier Science Program (RGP0057/2016)
 Alessandro Treves
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
Work supported by the Human Frontier Science Program RGP0057/2016 collaboration. We are grateful for inspiring exchanges with Remi Monasson and others in the collaboration, and thank Silvia Girardi for her help with Figure 1.
Senior Editor
 Laura L Colgin, University of Texas at Austin, United States
Reviewing Editor
 Adrien Peyrache, McGill University, Canada
Reviewer
 Ulisse Ferrari, Sorbonne Université, France
Version history
 Preprint posted: November 8, 2020 (view preprint)
 Received: April 18, 2021
 Accepted: August 12, 2021
 Version of Record published: September 14, 2021 (version 1)
Copyright
© 2021, Spalla et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 3,267
 Page views

 433
 Downloads

 11
 Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading

 Cancer Biology
 Computational and Systems Biology
Drug resistance is a challenge in anticancer therapy. In many cases, cancers can be resistant to the drug prior to exposure, i.e., possess intrinsic drug resistance. However, we lack targetindependent methods to anticipate resistance in cancer cell lines or characterize intrinsic drug resistance without a priori knowledge of its cause. We hypothesized that cell morphology could provide an unbiased readout of drug resistance. To test this hypothesis, we used HCT116 cells, a mismatch repairdeficient cancer cell line, to isolate clones that were resistant or sensitive to bortezomib, a wellcharacterized proteasome inhibitor and anticancer drug to which many cancer cells possess intrinsic resistance. We then expanded these clones and measured highdimensional singlecell morphology profiles using Cell Painting, a highcontent microscopy assay. Our imaging and computationbased profiling pipeline identified morphological features that differed between resistant and sensitive cells. We used these features to generate a morphological signature of bortezomib resistance. We then employed this morphological signature to analyze a set of HCT116 clones (five resistant and five sensitive) that had not been included in the signature training dataset, and correctly predicted sensitivity to bortezomib in seven cases, in the absence of drug treatment. This signature predicted bortezomib resistance better than resistance to other drugs targeting the ubiquitinproteasome system. Our results establish a proofofconcept framework for the unbiased analysis of drug resistance using highcontent microscopy of cancer cells, in the absence of drug treatment.

 Computational and Systems Biology
Antigen immunogenicity and the specificity of binding of Tcell receptors to antigens are key properties underlying effective immune responses. Here we propose diffRBM, an approach based on transfer learning and Restricted Boltzmann Machines, to build sequencebased predictive models of these properties. DiffRBM is designed to learn the distinctive patterns in aminoacid composition that, on the one hand, underlie the antigen’s probability of triggering a response, and on the other hand the Tcell receptor’s ability to bind to a given antigen. We show that the patterns learnt by diffRBM allow us to predict putative contact sites of the antigenreceptor complex. We also discriminate immunogenic and nonimmunogenic antigens, antigenspecific and generic receptors, reaching performances that compare favorably to existing sequencebased predictors of antigen immunogenicity and Tcell receptor specificity.