Abstract
Across the brain, circuits with continuous attractor dynamics underpin the representation and storage in memory of continuous variables for motor control, navigation, and mental computation. The represented variables have various dimensions and topologies (lines, rings, Euclidean planes); the circuits exhibit continua of fixed points to store these variables, and can use input velocity signals to update and maintain the representation of unobserved variables, effectively integrating the incoming velocity signal. Integration constitutes a general computational strategy that enables the estimation of a variable's state when direct observation of the variable is not possible, suggesting that it may play a critical role in other cognitive processes. While some neural network models for integration exist, a comprehensive theory for constructing neural circuits with a given topology and integration capabilities is lacking. Here, we present a theoretically driven design framework, Manifold Attractor Direct Engineering (MADE), to automatically, analytically, and explicitly construct biologically plausible continuous attractor neural networks with diverse user-specified topologies. We show how these attractor networks can be endowed with accurate integration functionality through biologically realistic circuit mechanisms. MADE networks closely resemble biological circuits in which the attractor mechanisms have been characterized. Additionally, MADE offers novel and minimal circuit models for uncharacterized topologies, enabling a systematic approach to developing and testing mathematical theories related to cognition and computation in the brain.
Introduction
The brains of species from insects to mammals contain circuits specialized to represent and integrate continuous variables (Figure 1A) [1, 2]: the head direction circuits in mammals [3, 4, 5], fish [6], and flies [7, 8, 9, 10], the oculomotor system of vertebrates [11, 12, 13, 14, 15], and grid cell networks in mammals [16, 17, 18] (see Figure 1B,C,D). These circuits receive velocity inputs, representing the rate of change of the represented variable, and update their internal state in proportion to the instantaneous velocity [1]. The oculomotor circuit integrates head velocity signals to counter-rotate the eyes and hold the gaze fixed during head movements [15, 11]; it also integrates saccadic velocity commands to generate stable fixations at different gaze angles between saccades [13]. In the head direction and grid cell circuits for spatial navigation, self-movement cues from turning and walking update the internal pose estimates [5, 19, 20, 21, 22]. This so-called path integration (PI) computation underpins behaviors that are core for survival [23, 24].

Figure 1: Existing continuous attractor network models and the biological systems where they are found.
A. Left, schematic representation of a spatially embedded set of neurons and their connections. The neural connectivity constrains the patterns of neural co-activation, thus determining the dimensionality and topology of neural activity in the state space. Center, schematic representation of neural activity states, in this case forming a continuous manifold in state space. Right, schematic representation of the states of a (latent) variable in the external world. B,C,D. Examples of integrator circuits. Top row, integration in the oculomotor system. Center row, head direction system. Bottom row, grid cell system. B. Schematic representation of CAN model architectures for line, ring and torus attractors. C. Schematic illustration of the continuous manifolds of fixed points predicted and found to exist in the corresponding circuits, adapted from published work [15, 14, 34, 47, 35, 3, 48, 18, 46, 44]. D. Schematic illustration of variable manifolds.
Integration may also underlie the representation and mapping of other continuous domains including auditory sound spaces, parametric image variations, and emotional/aggression states [25, 26, 27, 28, 29, 30, 31, 32], and thus support inference, reasoning, planning, and imagination in all these domains.
Neural network models of these integrator circuits fall under the category of continuous attractor networks (CANs) [33, 1, 34, 35, 36, 15, 18]. All continuous attractor models posit recurrent circuitry that generates a continuous set of states that persist in the absence of external inputs (continuous attractors). However, not all CAN models are integrators: integrators must additionally contain a mechanism for updating the internal state based on velocity inputs. CAN models generate extensive predictions about circuit connectivity, activity, population dynamics, and lesions, and have stimulated extensive experimental work across species and circuits to test their predictions. Core novel predictions of these models have subsequently been validated via physiology, imaging, and connectomics: the dynamics and connectivity of the oculomotor integrator [37, 14, 12, 38] have been shown to match the hypothesized circuit model in considerable detail. The one-dimensional ring attractor dynamics of the head direction circuit in mammals, including its fixed-point dynamics and isometric representation, matches in detail [3] the predicted population dynamics of the ring integrator models [39, 35, 34]. In insects, the connectivity and physical layout of the head direction circuit form an actual ring [7, 10] and exhibit some of the shift-like asymmetries hypothesized by a subset of the ring attractor models [39, 35]. In the grid cell system, the invariant two-dimensional population dynamics [40, 41, 42, 43] and its localization [44] to the predicted torus of fixed-point states [18, 45, 46] have been directly observed in experiments. Thus, when available, circuit models have propelled a conceptual understanding of the structure and function of the mechanisms involved in integration, memory, and control of continuous variables, and driven experiments that have confirmed their mechanistic hypotheses.
These models have been hand-crafted through intuition and insight, individually for each circuit or system in the brain. It is remarkable that the corresponding biological circuits have been found to possess a structure, in the population dynamics and, where direct physical comparisons have been possible, in the circuit architecture, that closely matches these models [37, 14, 12, 40, 41, 42, 43, 7, 10, 38, 1]. This suggests that mathematically guided and conceptually minimal models are well-matched to the biology of the brain. Yet we lack a general mathematical theory that would allow researchers to automatically construct such models for other continuous variables of a given dimension and topology, to generate predictions for future experiments and for potential use in machine learning applications involving such variables.
Recent efforts to overcome this limitation center on training networks via gradient learning to perform continuous integration tasks on the desired variable [49, 50, 51, 52]. However, the difficulty of this approach for the formation of continuous attractor networks is that it is inefficient, and the results are not usually interpretable. Specifically, training on M-dimensional manifolds requires on the order of k^M samples [53, 54], scaling exponentially with manifold dimension. In the few cases where the results become interpretable, it is only through mapping onto the original "hand designed" models. The combination of these factors and the striking match between biology and the minimal hand-crafted models suggests that a set of simple and general mathematical principles is used by biology to build such circuits and, if discovered, can be used to directly construct circuit models for integration of arbitrary continuous variables.
Here, we present such a small set of mathematical principles to directly construct minimal, interpretable continuous attractor networks that integrate variables of rich topologies and geometries. The theoretical framework converts directly into a practical recipe for constructing integrating attractor networks of desired dimension and topology. Existing integration networks known from biology appear as special cases of this framework. We name the method MADE (Manifold Attractor Direct Engineering). Thus, MADE can serve as a generally useful framework for making circuit predictions about connectivity and function in neural integrators not yet discovered, including in high-level areas that perform various cognitive tasks.
Results
Integration is the task of estimating the value of some (potentially latent) continuous variable x(t), based on an initial condition and on inputs conveying information about its rate of change.
Theory: Continuous attractor manifolds of desired dimension and topology
Here we describe the theoretical elements sufficient to construct a neural network possessing a continuous set of attractor states with desired intrinsic dimensionality d (e.g., d = 1 for a ring lattice and d = 2 for a plane) and desired topology specified by a manifold 𝒫.
Consider a set of N neurons and spatially embed them, equally spaced (in a lattice), according to the desired manifold topology 𝒫. With this embedding, each neuron has a unique d-dimensional coordinate θi. This spatial organization is used for the specification of the network connectivity, Wij = W(θi, θj); it may, but need not, mirror the actual locations of neurons in neural tissue [18]. We use rate-based neurons with standard recurrent weighted sums and a point-wise neural nonlinearity given by the function f. The activation of the neuron at θi is denoted s(θi, t); in the continuum (large-N) limit we write the population activity as s(θ, t).
We use the rectifying nonlinearity, f(x) = x if x > 0 and f(x) = 0 if x ≤ 0. The derivations that follow are conceptually and qualitatively independent of taking this continuum limit.
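For reference, the dynamics we simulate take the standard continuous attractor form; written in the continuum limit, and reconstructed here to be consistent with the discrete-time Euler update given in the Methods, they read

    τ ∂s(θ, t)/∂t = −s(θ, t) + f( ∫𝒫 W(θ, θ′) s(θ′, t) dθ′ + b ),

where b is the constant feedforward drive introduced below; in the discrete-neuron network the integral is replaced by a weighted sum over the N neurons.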
We seek interaction weights consistent with the formation, through symmetry breaking, of a single activity bump state that can be positioned anywhere on the neural manifold 𝒫. The set of such bump states will form the continuous set of attractor states of desired dimension and topology.
Let W be a kernel function, W (θ, θ′) = k(d(θ, θ′)), where d(θ, θ′) is a distance metric defined on 𝒫, and k is a continuous scalar function that is symmetric about the origin (see Figure 2A). Analogous to prior work [34, 48, 56], we set k to be locally excitatory and globally inhibitory. To avoid runaway excitability, we make it strictly inhibition-dominated (k(d) ≤ 0 for all d) as in [57, 18]; network activity can be non-zero because of a compensatory spatially- and temporally-constant excitatory feed-forward drive b > 0. Specifically, k(d) = −k0 + k1(d), where k0 > 0 is a positive number and k1(d) → 0 as d → ∞ with k1(0) = k0.
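As a concrete illustration (one admissible choice among many, given here as an example rather than as the form used in any particular figure), a Gaussian-profile kernel

    k(d) = −k0 + k0 exp(−d²/(2σ²))

satisfies these constraints: k(0) = 0, k(d) ≤ 0 for all d, and k(d) → −k0 for d ≫ σ.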

Figure 2: CAN construction and activity manifolds.
A. Left, neural lattice 𝒫 for the Plane (top) and Torus (bottom) attractor networks. Black circles indicate the location of an example neuron, shades of green represent distance from other points on the lattice. Bottom right, inhibitory connectivity strength between the example neurons and all other points on the neural lattice. Middle inset, three examples of valid connectivity kernel functions k. B. Neural manifold in state space (top, right) and activity patterns on the neural lattice 𝒫 (top, left). Bottom row shows three activity patterns with bumps at different locations corresponding to different points on the activity manifold 𝒩.
Let the kernel’s length scale be given by σ, i.e., k1(d) ≈ 0 for d ≥ σ, with σ selected to be much smaller than the distances L over which the manifold 𝒫 has curvature. Thus, within any ball Vl of radius l such that σ ≪ l ≪ L, 𝒫 is flat. Since σ is the only spatial scale being introduced in the dynamics, we qualitatively expect that a localized bump state within the ball will have a spatial scale of O(σ). The conditions for the formation of a stable single bump state are thus the same as those for a globally flat manifold.
Since W is symmetric, Eq. 1 can be described through an energy function [58], and a stable steady state must exist. If the homogeneous state (all neurons equally active) were unstable, there must exist some other stable state, with broken symmetry. If the symmetry-broken state is localized, we refer to it as a bump state. Thus, we seek conditions under which the homogeneous steady state is unstable. The homogeneous steady state s(θ) = s0 must satisfy s0 = f( s0 ∫𝒫 W(θ, θ′) dθ′ + b ).
We derive the existence and stability of the homogeneous state (Appendix 1) following the analysis in Ref. [59], to obtain two requirements for the formation of a stable bump state: first, the Fourier transform of the kernel k1(d), which we denote k̃1, must be maximized at zero frequency.
A broad sufficient condition for the first requirement is that k1(d) ≥ 0 for all d: any such kernel has a Fourier transform that is maximized at zero (proof in Appendix 1). This condition does not capture all interaction kernels k1 whose Fourier transforms are maximized at zero, but it defines a sufficiently broad class.
Thus, up to a rescaling of the strength of the interaction, an interaction W(d(θ, θ′)) will lead to the formation of a bump state if it can be rewritten as W(d(θ, θ′)) = k1(d(θ, θ′)) − k0 for: k0 ≥ 0; a kernel k1 that satisfies k1(d) ≥ 0 and k1(d) → 0 for d ≥ σ; and sufficiently small σ over which the manifold 𝒫 is approximately flat. As a result, there is a set of stable fixed points corresponding to activity profiles that are in one-to-one correspondence with points on 𝒫: every stable single-bump activity pattern is centered at some point in 𝒫, and every point in 𝒫 forms the center of some stable single-bump state (see Figure 2B). Thus, the set of stable states of the dynamics in Eq. (1) forms a continuous attractor manifold 𝒩 that has a bijection with the manifold of the neural layout 𝒫 and thus with the target manifold. Moreover, importantly for representation and integration of continuous variables, we show in Appendix 2 that 𝒫 and 𝒩 are isometric to each other, with respect to their intrinsic geodesic metrics.
Theory: Integration on manifolds
The theoretical and practical frameworks outlined above show how to construct neural networks whose activity states possess a set of attractors forming a manifold 𝒩 of desired dimension and topology. Here, given the desired manifold 𝒩, we describe how the constructed attractor networks, whose states match the topology and dimension of 𝒩, can be augmented to endow them with the ability to perform velocity integration.
Note that to perform velocity integration of an externally observed variable, the desired manifold 𝒩 may, but need not, coincide in dimension and topology with the manifold ℳ on which the observed variable states lie. This possibility is exemplified by grid cells, where the manifold of a grid module is 𝒩 = 𝕋², and it is used to integrate animal velocities as animals move about in physical 2D space (thus ℳ = ℝ²). In future work, we will consider the question of which internal manifolds 𝒩, not necessarily of the same topology or dimension as ℳ, permit accurate integration of velocities on ℳ. Here we show how to equip networks with attractor manifold 𝒩 with accurate path integration functionality for velocity inputs of matching dimensionality.
Previous models [34, 35, 18] constructed offset interactions between multiple copies of a continuous attractor network to permit external inputs to drive the state along the manifold. Here, we analytically derive the conditions required for an external input that has no knowledge of the structure and state of the continuous attractor network to generate appropriate movements along nonlinear attractor manifolds of given topology, and show that offset interactions are necessary solutions.
For simplicity, consider a one-dimensional manifold with linear transfer function f. The stable bump states are fixed points of Eq. 1:

    s0(θ) = ∫ W(θ − θ′) s0(θ′) dθ′ + b,    (Eq. 3)

where s0(θ) denotes an activity bump centered at any point in 𝒫. Consider two such activity bump states: s0(θ), centered at θ0, and s0(θ + ϵ), centered at θ0 − ϵ. For the neural state to move from s0(θ) to s0(θ + ϵ) in time Δt, the time derivative ∂s/∂t must equal [s0(θ + ϵ) − s0(θ)]/Δt. The movement speed is v = ϵ/Δt. Multiplying by τ on both sides, we have

    τ ∂s/∂t = (τ/Δt) [s0(θ + ϵ) − s0(θ)].

We can add 0 to the equation above, in the form (−s0 + ∫ W(θ − θ′) s0(θ′) dθ′ + b), which is zero because of the equality of Eq. 3, to obtain:

    τ ∂s/∂t = −s0(θ) + ∫ W(θ − θ′) s0(θ′) dθ′ + b + (τ/Δt) [s0(θ + ϵ) − s0(θ)].

Comparing this expression to Eq. 1, we see that moving the bump with velocity v can be achieved by adding a feedforward input drive equal to the last term, (τ/Δt)[s0(θ + ϵ) − s0(θ)]. Observing that s0(θ + ϵ) itself satisfies the shifted fixed-point equation, s0(θ + ϵ) = ∫ W(θ + ϵ − θ′) s0(θ′) dθ′ + b, this drive can instead be absorbed into the recurrent connectivity: to first order in the small displacement δ = τϵ/Δt, the sum of the recurrent term and the drive equals the shifted recurrent term ∫ W(θ − θ′ + δ) s0(θ′) dθ′. This expression has now "internalized" the desired input to move the bump, converting it into the weight asymmetry term W(θ − θ′ + δ).
Because we have that δ = τϵ/Δt = τv, the equation above results in a moving bump along the internal state-space manifold of fixed points 𝒩 with speed v = δ/τ, without any external velocity input or temporally varying modulation of network weights. The network corresponds to the original continuous attractor network constructed in the previous section, with the modification that the weights, instead of being symmetric, have a small offset in a particular direction δ along the neural circuit manifold 𝒫. The speed of bump movement on 𝒩 is proportional to the magnitude of the offset, |δ|, and inversely proportional to the neural time-constant.
This continuous-speed flow may form a periodic cycle on specific manifolds (e.g., Refs. [18, 35]). In these cases, the network is a limit-cycle attractor. On generic manifolds, however, this flow need not close periodically on itself; the result is quasiperiodic attractor dynamics [60]. We therefore refer to these networks as Quasiperiodic Attractor Networks (QANs). The flow of activity patterns in a QAN defines a constant vector field Ψ on 𝒩.
For several attractor manifolds 𝒩 of dimension d (in particular, 'parallelizable' manifolds such as the Euclidean spaces ℝd and the tori 𝕋d) it is possible to construct d QANs with linearly independent flows, and 2d QANs with two mutually opposing flows in each of the d dimensions (defined by weight matrices W(θ − θ′ ± δm), where δm is a displacement vector of norm |δ| along the mth manifold dimension). Each sets up a constant vector field Ψ±m on 𝒩. For these manifold topologies [34, 35, 18], opposing-pair QANs numbering 2d, where d is the manifold dimension, can generate smooth non-vanishing flows of any direction at every point and are thus sufficient to construct integrators. The combined dynamics couples the QANs through their summed activity, with each QAN also receiving its own velocity-dependent input.
Here sσm denotes the neural activity of the individual QANs, and s = Σσ,m sσm denotes their combined activity. Coupled in this way, the QANs form a network whose combined activity state moves on 𝒩 in a way controlled by the velocity inputs.
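While the displayed form of the combined dynamics is not reproduced here, a plausible reconstruction consistent with the Methods (where each QAN's recurrent input is computed from the summed activity; writing the velocity-dependent drive as an additive input is an assumption) is

    τ ∂sσm(θ, t)/∂t = −sσm(θ, t) + f( ∫𝒫 Wσm(θ, θ′) s(θ′, t) dθ′ + b + vσm(t) ),    with s = Σσ,m sσm,

where vσm(t) denotes the velocity-dependent input to the σ, mth QAN, obtained by projecting the input velocity onto that QAN's offset (Killing) field.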
On other manifolds, 2d opposing QANs for the d manifold dimensions are not sufficient for accurate integration. For instance, in the case of even-dimensional spheres, the hairy ball theorem states that every continuous tangent vector field must vanish at some point(s) [61, 62, 63]. In other words, a continuous vector field Ψ±m generated by the QAN prescription above will be zero somewhere on the sphere; at that location, the QAN will not be able to drive bump movement; thus, d QAN pairs will not suffice for good integration everywhere. Further, on non-orientable manifolds such as the Möbius band, it is not possible to define continuous vector fields that are globally orthogonal everywhere and smooth. Thus, while the approach above provides a unified way to construct integrating continuous attractor networks — including all those with a single bump state currently found in the neuroscience literature [34, 35, 47, 48, 56] — it needs to be further generalized for manifolds that do not permit non-vanishing continuous tangent vector fields everywhere.
Generalization: Killing vector fields
To enable accurate path integration over a significantly wider set of manifolds (excluding the Klein bottle), we now broaden and further generalize the concepts developed above. The approach replaces the constant weight offset vector fields Ψ±m with the more generally applicable Killing vector fields [62]: Killing fields are vector fields on a manifold whose flows preserve the structure of the manifold, i.e., they are continuous isometries on the manifold. Conceptually, if each point of an object on the manifold is displaced by the corresponding Killing vector, it will move without distortion. Killing fields form a ‘vector space’, such that linear combinations of Killing fields are also Killing fields. The manifold isometric property of Killing fields means that activity patterns are rigidly translated over 𝒫 through the flow Ψt without changes in area, a necessary condition for accurate integration [34].
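Formally, a vector field K on a Riemannian manifold with metric g is a Killing field when the Lie derivative of the metric along K vanishes,

    ℒ_K g = 0, equivalently ∇μ Kν + ∇ν Kμ = 0,

where ∇ is the Levi-Civita connection; this is the standard characterization of Killing fields [62] underlying the informal description above.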
To generate Killing fields in each QAN, the constant weight offsets are replaced by an appropriate position-dependent offset,

    Wσm(θ, θ′) = k( d(θ + σ δm(θ), θ′) ),   σ ∈ {+, −},    (Eq. 9)

where ±δm(θ) is the offset vector of the σ, mth QAN at coordinates θ on 𝒫. This allows the weight offsets to vary across locations on the manifold 𝒩, consistent with the non-constant Killing fields required on the sphere (Figure 3C). This simple change, together with allowing the number of QANs to exceed 2d, endows a much broader class of continuous attractor manifolds, including spheres and the Möbius band, with integration functionality. For a two-dimensional sphere, three basis Killing fields (dkill = 3) are required (each corresponding to rotational symmetry about one principal axis; Figure 3C). Although each field vanishes at two points on the sphere, at any point at least two fields are non-vanishing and point in independent directions along the manifold, forming an overcomplete basis such that the network can perform accurate path integration.
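As a concrete example, taken from the construction used in the Methods, the three rotational Killing fields of the unit sphere embedded in ℝ3 are, at a point with Cartesian coordinates [x, y, z],

    δ1(θ) = [0, −z, y],   δ2(θ) = [z, 0, −x],   δ3(θ) = [−y, x, 0],

each generating a rigid rotation about one coordinate axis; each vanishes only at the two points where its rotation axis pierces the sphere.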

Figure 3: Quasiperiodic Attractor Networks for Path Integration.
A. Schematic representation of a desired 2D spherical set of fixed points in state space and corresponding connectivity on 𝒫. B. Example activity bump plotted on the neural manifold 𝒫. C. Schematic illustration of Killing vector fields for the sphere manifold (left) and resulting offset connectivity weights on 𝒫 (right). D. Schematic illustration of the QAN approach to velocity integration. Left two panels: relationship between changes in the variable on ℳ and on the neural manifold 𝒩, and associated tangent vectors. Center: each QAN receives a velocity-dependent input based on the tangent vectors at left projected onto its Killing fields, and the activity of all networks is combined. Right: this results in a trajectory in the state space 𝒩, which corresponds to velocity integration of inputs from ℳ.
Finally, we generalize how an external manifold ℳ may be mapped to the internal integrating manifold 𝒩, by mapping velocity vectors in the external space to the QANs within the network. Throughout, our construction seeks to make 𝒫 and 𝒩 isometric, and indeed they are, as shown in Appendix 2. However, as noted at the start of this section, 𝒩 need not exactly match the topology of the external variable: the manifold 𝒩 = 𝕋² of a grid module represents positions on ℳ = ℝ² of the external spatial variable. Similarly, the dimensionality of 𝒩 could equal or exceed that of ℳ: a planar integrator network is capable of integrating an external one-dimensional variable if the velocity inputs are one-dimensional. For instance, grid cell responses on a linear track appear to be generated as a slice through their 2D manifold of states [41, 64].
Define π as the mapping of ℳ to 𝒩 (which can be the identity map, or an isomorphism when ℳ and 𝒩 are isomorphic, such as when head direction is represented in a ring attractor, or a many-to-one map, as when spatial position is represented in a single grid module). The Jacobian π⋆ is a map from the tangent space of ℳ to the tangent space of 𝒩: it is the operator that maps tangent vectors on ℳ (i.e., velocities ẋ(t) of the external variable) to tangent vectors on 𝒩.
In the resulting network dynamics, Wσm refers to the Killing-field weights from Eq. 9 and dkill is the minimal number of independent Killing fields; the velocity-dependent input to each QAN is obtained by projecting π⋆(ẋ) onto that QAN's Killing field.
The constant vector fields on the ring and torus manifolds described above (and effectively discovered in previous work) are Killing fields. Therefore, this approach encompasses previous work and provides a broader general framework for constructing minimal, biologically plausible continuous attractor neural networks capable of path integration on spaces of various dimensions and topologies. Next, we demonstrate how to practically construct the networks, then examine the effectiveness of the approach through extensive numerical simulations of path integration in MADE integrator networks.
Practical construction of CAN integrators with MADE
With the complete conceptual and mathematical frameworks in place, we now illustrate through numerical simulation how to apply the MADE prescription to construct various CANs and integrators of desired dimension and topology. The simulations also allow us to validate the functionality of the resulting CANs and integrators. For simplicity, here we focus our description on one and two-dimensional surfaces, allowing us to construct line, ring, plane, cylinder, torus, sphere, Möbius band and Klein bottle topologies and geometries (Figure 4A). The procedures outlined here can be straightforwardly generalized to apply to manifolds of different dimensionality.

Figure 4: Stationary states and manifold topologies of the MADE CANs.
A. Desired population activity manifold topology for CANs constructed with MADE for several manifolds (from top to bottom): line, ring, plane, cylinder, torus, sphere, Möbius band and Klein bottle. B. Distance functions over the neural lattice 𝒫 for selected example neurons. C. Low-dimensional embedding of the neural activity manifold 𝒩. D. Betti numbers and persistent homology barcode for each CAN's neural population states (in 𝒩). E. Left: activity of one example neuron over 𝒩 (low-dimensional embedding). Right: stationary population activity states form localized bumps on the neural lattice 𝒫.
We first construct a neural surface 𝒫 that is isometric to the target state-space manifold 𝒩. For the sphere attractor, we construct 𝒫 as an embedding of the two-dimensional unit sphere in ℝ3; for the Klein bottle attractor, 𝒫 is an embedding in ℝ4 of a finite cylinder with appropriate identification of the cylinder end-points to each other. For several other manifolds (including all others from Fig. 4), which admit a flat metric, we define a rectangular two-dimensional space [0, L1] × [0, L2] (Figure 4B) and provide an appropriate distance function on the rectangular space. For example, for the torus manifold, L1 = L2 = 2π, and distances are computed respecting the periodic boundary conditions that identify 0 and 2π as the same point.
Given 𝒫, we next place neurons approximately evenly on the surface. For manifolds with a flat metric, this involves placing neurons on an n × n rectangular lattice on this space, where n² is the total number of neurons. For the sphere, we space neurons at regular intervals along a Fibonacci spiral over the unit sphere (see Methods) to approximate an even placement. Thus, for each neuron we define its coordinates θi on 𝒫.
Next, we compute the connectivity of the network Wij, which depends on the (geodesic) distances d(θi, θj) between pairs of neurons with coordinates θi and θj on 𝒫. With an appropriate coordinate parametrization for the neurons, these geodesic distances can be computed via analytical expressions (for instance, as Euclidean distance with periodic boundary conditions on a torus attractor) or via a simple numerical computation (see Methods). Connectivity is then given by the n² × n² matrix with entries Wij = k(d(θi, θj)), where k is a kernel function (Figures 2A, 3A, 4B) satisfying the requirements described earlier for the formation of activity bump states (see also Appendix 1). We used a scaled Gaussian kernel such that the connectivity between pairs of neurons was strictly negative for d > 0, with Wij = 0 when d(θi, θj) = 0 (see Methods). Other choices of kernel yield similar results (data not shown). Neural activity is simulated based on these weights according to Eq. 2. We will provide Python and Julia code that implements the MADE prescription for CANs (see Methods).
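To make the recipe concrete, the following Julia sketch builds a small torus CAN and relaxes a random initial state to an activity bump. This is a minimal illustration rather than the released package, and the lattice size and the kernel parameters α, σ below are placeholder values.

    using LinearAlgebra

    # Lattice: n_side × n_side neurons on [0, 2π) × [0, 2π) with periodic boundaries (torus).
    n_side = 32
    coords = vec([(2π * i / n_side, 2π * j / n_side) for i in 0:n_side-1, j in 0:n_side-1])
    n = length(coords)

    # Geodesic (toroidal) distance: periodic Euclidean metric in each dimension.
    periodic_diff(a, b) = min(abs(a - b), 2π - abs(a - b))
    torus_dist(p, q) = sqrt(periodic_diff(p[1], q[1])^2 + periodic_diff(p[2], q[2])^2)

    # Kernel: zero at d = 0 and strictly negative elsewhere (local excitation relative
    # to a global inhibitory baseline); α, σ are placeholder values.
    α, σ = 1.0, 0.5
    k(d) = α * (exp(-d^2 / (2σ^2)) - 1)

    # Connectivity matrix W_ij = k(d(θ_i, θ_j)).
    W = [k(torus_dist(coords[i], coords[j])) for i in 1:n, j in 1:n]

    # Relax a random initial state to a bump via Euler integration of the rate dynamics.
    relu(x) = max(x, 0.0)
    τ, dt, b = 5.0, 0.5, 0.5          # time constant (ms), step (ms), constant drive
    s = rand(n)
    for t in 1:500
        s .+= (dt / τ) .* (-s .+ relu.(W * s .+ b))
    end
    # s now holds a localized bump of activity on the torus lattice.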
Validation of CAN states and dynamics
To validate the MADE CANs, we first characterize where the states of the constructed networks localize. To do so, we sample population activity data from each model by randomly initializing each network and allowing the initial state to settle to a stationary state (see Methods). This state forms one population vector sample; we repeat the process 2500 times for each network. We apply nonlinear dimensionality reduction via ISOMAP, which has proven useful for the visualization of nonlinear low-dimensional manifolds in real data [3], to the resulting point cloud of stationary population activity states. The resulting structures (Figure 4C) visually matched the desired manifolds (Figure 4A): the population responses of the MADE CANs localize to low-dimensional sets of states that appear homeomorphic to 𝒩.
To quantify the structure of the resulting population states, we use persistent homology, a Topological Data Analysis (TDA) [66, 67] technique that has been applied with success in neuroscience [68, 3, 44]. Persistent homology supplies Betti numbers that characterize the topology of the set of stationary states of each network (see Methods). Betti numbers catalog the number of "cavities" of each dimension present on a manifold; the first three Betti numbers correspond to the number of connected components, rings, and two-dimensional cavities, respectively. Betti numbers do not provide a complete or unique description of manifold structure (e.g., the ring and the cylinder share the same Betti numbers while having different dimensionality), but they provide a quantitative confirmation that the MADE CANs match their intended targets. The Betti numbers of the population states of all MADE CANs match those of their target manifolds (Figure 4D).
We next visualize the instantaneous population activity states as functions on the neural lattice. The localized kernel connectivity was expected to stabilize single activity bump states on the manifold. A stationary population activity state can be directly visualized on the neural lattice by coloring neurons according to their activity level. Indeed, we see that the stationary population states correspond to localized bumps of activation on the neural lattice 𝒫 and activity manifold 𝒩 (Figure 4E).
Next, we characterize the intrinsic dimensionality [3, 69] of the stationary states of the MADE CANs. The intrinsic dimensionality at a point on a manifold is the number of degrees of freedom of movement along the manifold at that point; it allows one to distinguish, for example, a ring (one-dimensional) from a cylinder (two-dimensional). Dimensionality is generally a difficult (and ill-posed) quantity to estimate from noisy data, and existing works use various methods [3, 70, 71, 72, 73]. For MADE CANs, which we can run in a noiseless setting, the intrinsic manifold dimension is well-defined.
We adopt an approach [73, 72] based on estimating the dimensionality of the tangent space to the manifold (see Methods) (Figure 5A, left). The tangent space Ts𝒩 at a point s ∈ 𝒩 is the best linear approximation of the manifold at that point and has the same dimensionality as the underlying manifold [74, 63, 65]. We consider the set S of points in a small neighborhood of s (see Methods) and apply PCA to determine the number of large principal components needed to describe the data. This gives us the dimensionality of the tangent space at s and thus the local intrinsic dimension of the manifold.

Figure 5: Dimensionality and attractor dynamics of the MADE CANs.
A. Left: tangent-plane approach to computing the intrinsic manifold dimension of 𝒩 (schematic). Right: estimated tangent space dimension for each manifold, confirming the low intrinsic dimensionality of the CAN networks. B. Cumulative manifold variance explained by global PCA: the slow saturation of the curves shows that the linear (embedding) dimension of the manifolds can be large. C. Numerical simulations probing attractor dynamics. Inset: activity manifold, perturbation vector (black) and on-manifold (red) and off-manifold (blue) components of the perturbation. Main plot: time-varying distance from the starting point in the off-manifold and along-manifold dimensions.
Repeating this analysis across multiple randomly selected sample points s for each MADE CAN, we confirmed that all manifolds had the expected intrinsic dimensionality given their topology: line: 1 ± 0.0 (mean ± standard deviation across repeats), ring: 1 ± 0.0, torus: 2 ± 0.0, sphere: 2 ± 0.0, Möbius band: 1.96 ± 0.16, cylinder: 2 ± 0.0 and plane: 2.05 ± 0.23 (Figure 5A). By contrast with the small intrinsic dimensionality of the constructed CAN manifolds, their extrinsic linear dimensionality, estimated as the minimum number of principal components required to represent the manifold as a whole, is large (Figure 5B).
Finally, we examined whether the stationary manifolds of the MADE CANs are neutrally stable attractor states, with rapid decay of off-manifold perturbations and no state drift along the manifold in the absence of noise and external inputs [34, 1]. First, we consider manifold stability by computing Betti numbers of the population states in networks simulated under varying noise conditions, and find that, except in the most severe noise case, we recover the same Betti numbers for the noisy dynamics, indirectly showing that the manifold is attractive and robust to noise (see Methods) (Figure S1). Second, we more directly perturb the neural population state with a randomly oriented vector of fixed magnitude (see Methods), repeating this experiment for multiple initial states and random perturbations, and observe the dynamics by which the perturbed state evolves. To quantify on- and off-manifold dynamics following the perturbation, we again used PCA to estimate the manifold's tangent space in the neighborhood of the initial state. The distance between the perturbed and initial (pre-perturbation) states along the tangent space dimensions was taken as the on-manifold perturbation component; the rest (along the remaining N − d dimensions) was the off-manifold component (see Methods). We find very limited on-manifold drift and strong decay of the off-manifold component of the perturbation, as intended (Figure 5C).
Practical construction of integrators with MADE
To generate the QANs that combine to create neural integrator circuits, we slightly modify the connectivity structure of MADE CANs. We start with the same procedure as before to construct 𝒫 and compute the distance function d. For a QAN indexed by σ, m we then simply apply a shift of the neuron coordinates by the offset field σδm before computing pairwise distances, so that the QAN connectivity is Wσm_ij = k(d(θi + σδm(θi), θj)) (cf. Eq. 9; see Methods).
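Concretely, and continuing from the torus CAN sketch above (reusing coords, torus_dist, k and n from that code), a minimal version of this shift is given below; whether the offset is added to the first or the second neuron's coordinate fixes the direction of flow, and the offset value is the one listed in the Methods.

    # Offset vector δ_m for a QAN shifted along the first torus dimension.
    δ = (0.25, 0.0)
    shift(p, v) = (mod(p[1] + v[1], 2π), mod(p[2] + v[2], 2π))

    # Shifted connectivity: apply the offset to one neuron's coordinates before
    # computing the on-manifold distance, i.e. W_ij = k(d(θ_i + δ, θ_j)).
    W_plus  = [k(torus_dist(shift(coords[i], δ), coords[j])) for i in 1:n, j in 1:n]
    W_minus = [k(torus_dist(shift(coords[i], (-δ[1], -δ[2])), coords[j])) for i in 1:n, j in 1:n]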
Validation of MADE integrators
To examine the performance of each MADE integrator in representing and tracking time-varying external variables, we provide the circuit with the velocity of a simulated random trajectory of the variable x(t) ∈ ℳ and track how the network's internal state changes. We first consider how the firing of a single cell varies with the external variable, by plotting its tuning curve, or firing response as a function of the external variable, estimated over a long velocity trajectory (Figure 6A). The existence of a localized activity bump (Figure 6A, top three panels) means that the circuit has correctly inferred the external position: the cell fires at a specific position and not at other random positions, and the network has transferred the internal bump activity pattern into a corresponding pattern as a function of location on the external manifold ℳ. In cases where the external manifold ℳ is not isomorphic to the internal manifold 𝒩, such as when a plane in ℳ is represented by a cylinder or a torus in 𝒩, a continued linear trajectory along one direction in ℳ corresponds to a periodic traversal of 𝒩, and thus one would expect repeating bumps in the tuning curve along that dimension, as we find (Figure 6A, panels 4-5). Note that, based on the details of how we periodically connect the boundaries of our rectangular neural lattice to obtain a torus, we would obtain a square-grid tuning curve (as shown) or a triangular-grid tuning curve (as previously described for grid cells in [48, 56]). Finally, the tuning curves for the sphere and Möbius band are single bumps, as expected (Figure 6A, last two panels).

Figure 6: Numerical simulations of path integration performance with MADE path integrators.
A. Tuning curves of single example neurons as a function of the external (latent) variable x. Insets show the manifold topologies of the external variable (red) and neural population states (blue): these pairings may be of identical manifolds, or, e.g., a 2D Euclidean manifold in x may be mapped to a cylinder or torus in the neural population states. B. Example input trajectory (red) and trajectory decoded from the neural population response (blue). C. Decoding error across multiple simulations for various external-neural manifold pairs, shown as a percentage of trajectory length over ℳ. Colored boxes show the interquartile range, white lines the mean, circles outliers and vertical lines the 95th percentile confidence interval. D. Same as B but for torus attractors with varying amounts of noise. E. Left: Killing and non-Killing weight offsets for the torus (top) and sphere (bottom). Right: same as C for integrators correctly constructed with Killing weight offsets and with the non-Killing weight offsets from the left. F. Same as C for Möbius-to-Möbius (left) and cylinder-to-Möbius mappings (right).
We can more directly quantify how closely the network tracks the external variable x(t) by decoding an estimate x̂(t) from the network's internal state s(t) and comparing it to the true trajectory (see Methods; Figure 6B-D).
We performed additional experiments on circuits requiring Killing vector fields to integrate. To show the necessity of Killing fields, we built torus (𝒩 = torus, ℳ = plane) and sphere (ℳ = 𝒩 = sphere) integrator networks, but varied the QAN weight offsets relative to the Killing-field prescription. For the torus, we varied the orientation of the offset vectors, while for the sphere we changed their lengths to be of constant magnitude everywhere (except at the two poles, where the magnitudes were left at 0) (Figure 6E, left; see Methods). The constant-magnitude non-Killing field on the sphere may be considered a direct extension of the constant offset vector fields used for flat manifolds and employed in all prior work on the construction of neural integrators. In both cases, we observed a dramatic deterioration in integration accuracy (Figure 6E, right). This result underscores the importance of Killing fields for integration on manifolds with a non-flat metric.
Finally, we considered integrating velocities of a cylindrical external variable in a network with Möbius band topology. Both manifolds are two-dimensional with one periodic and one non-periodic dimension. However, while a rectangle is glued without a twist to make a cylinder (which has two surfaces, inner and outer), it is glued with a twist to make a Möbius band (which has a single surface), with the consequence that there is no simple continuous mapping between the two. Proceeding naively, by simply mapping the two manifolds onto each other and ignoring the flipped boundary of the Möbius band, integration is, unsurprisingly, significantly less accurate (Figure 6F). In future work, it will be interesting to consider which pairings of external and neural manifolds provably permit accurate path integration.
Discussion
Summary
Here, we have presented MADE, a mathematical theory and recipe for constructing biologically plausible neural CANs and integrator networks of desired topologies and geometries, with single- or multi-bump tuning curves. The mathematical theory unifies existing biologically plausible continuous attractor and integrator models involving bump-like activation functions, which emerge as specific cases of the MADE theory.
The theory provides a first-principles derivation showing that, under biological constraints, multiple copies of a basic network must be coupled together for integration, in part to relieve demands for rapid synaptic modulation and in part to remove the need for velocity-estimating regions to know the full nonlinear structure and current state of the integrator network. It also predicts that manifolds without a flat metric will require an overcomplete set of network pairs, in the form of QANs, relative to the intrinsic dimensionality of the manifold: thus, integration on a two-dimensional spherical surface requires more than 2 QAN pairs.
We envision MADE being useful to distinct fields: for deep-learning models that might require accurate low-dimensional neural network attractors and integrators, and for neuroscience, where MADE provides de novo models and novel circuit-level mechanistic predictions for the structure of other possible integrators in the brain that may be uncovered in the future.
Indeed, given recent discoveries that path-integrating neural circuits generalizably represent multiple cognitive variables, it is likely that such circuits are used by the brain to perform cognitive tasks in which variables of interest are not directly observed and only information about their rates of change is available (e.g., mental object rotation) [25, 75, 27]. MADE models could then act as test beds to generate mechanistic hypotheses for the network dynamics underpinning integration computations in such cognitive tasks.
Activity bumps and tuning curves
MADE provides a basic prescription for the construction of continuous attractor and integrator networks of a desired dimension and topology. We numerically implemented a particular (Gaussian) kernel shape to illustrate the framework. The shape of the population activity bumps that result will depend on the kernel shape, which can be varied and selected as desired, according to the constraints supplied by our theory. Recent theoretical work on symmetry breaking for pattern formation also suggests that the set of potential kernels forms a large function space.
The tuning curve shapes of single cells depend both on the population activity bump shape and on the mapping from the external variable manifold to the internal neural state-space manifold. As we have seen, if the external manifold is unbounded in some dimension but the internal representation is compact and periodic, then the spatial tuning curve will be periodic in that dimension. More subtle details of the mapping can affect the geometry of the periodic tuning, as we have described above.
We have focused our illustrations on simple and non-trivial manifolds of intrinsic dimension ≤ 2 for visualization and convenience. However, the theory and recipe for continuous attractor and integrator network construction generalizes in a straightforward manner to manifolds of higher dimension and different topologies.
Related work
Computational models first described attractor networks [76, 36, 34, 35, 15, 11] and the mechanisms by which they could enable velocity integration [34, 47, 35, 15, 11, 46, 48, 18] long before experimental data verified the existence of such mechanisms. Intriguingly and surprisingly, in every case experimentally probed to date, the proposed neural circuit models closely resemble the hand-designed attractor models [40, 3, 42, 43, 44, 38, 15, 37, 14, 13]. Why is this the case? Presumably this match arises because the models were minimal in the sense that they implemented the essential elements and symmetries required to form the desired attractor, and circuits in the brain evolving under efficiency pressures arrived at similarly minimal models. MADE adopts a very similar mathematically minimal approach, recovering all of the known integrator models with bump-like tuning (except for the oculomotor integrator, which does not have bump-like responses).
An alternative approach to building models of integrating circuits in brains is to train artificial neural networks to perform tasks requiring integration [50, 49, 51]. After training, the networks’ solution is analyzed to reverse engineer the relation between network connectivity, neural dynamics and task performance [50, 77, 78, 79]. However, such approaches often fail to provide novel testable predictions or interpretable mechanisms to guide further experimental investigations, unless there was already a hand-crafted model available to which the trained network could be compared.
The network engineering approach [80, 81, 82, 83, 84, 85, 86, 87] constructs circuits starting from the detailed desired dynamics of a system (precise states, fixed points, or specific tuning curves), then directly searches or solves for a network connectivity with those dynamics. Typically, these works further constrain the problem to make it well-posed by searching for low-rank weights or for the lowest-dimensional embedding space for the dynamics that satisfies the desired properties. These methods are complementary to our approach: they permit construction of a broader set of dynamical systems, for instance trajectories ending in discrete fixed points, or combinations of stable and unstable fixed points, while our focus is specifically on biologically plausible continuous attractors that integrate. Conversely, those approaches do not provide a framework for building biologically realistic continuous attractor networks that integrate, and they lack the known matches to, and ready interpretability against, biological circuits that the hand-crafted models enjoy in the characterized cases.
In conclusion, MADE allows for the easy generation of interpretable, mechanistic models of CAN networks that can integrate. We hope that MADE will endow researchers with the tools required to generate detailed, testable hypotheses about the neural underpinnings of integration in diverse settings and in various cognitive processes, accelerating our understanding of the critical role that this class of computations plays in many aspects of brain function, and allowing for the easy incorporation of such circuits in deep learning applications.
Methods
All simulations and figures were implemented in custom Julia code available at GeneralAttractorsTheory. We will provide a minimal Python package for creating CANs and QANs using MADE: MADE-Python.
CAN construction
In MADE, CAN engineering depends on computation of the pairwise on-manifold distances between neurons in a lattice 𝒫. Thus, we begin by specifying a set of n equally spaced points on 𝒫. For the Line attractor, n = 256 and 𝒫 was taken to be the interval [−6, 6]. For the Ring attractor, n = 256 and 𝒫 was taken to be the interval [0, 2π] with the two ends identified (i.e., we ensured not to have a neuron at both θi = 0 and θi = 2π, since these correspond to the same point). For all remaining networks, n = 482 was used. The following rectangular intervals were used: plane attractor, 𝒫 = [−10, 10] × [−10, 10]; cylinder, 𝒫 = [−5, 5] × [0, 2π]; torus, 𝒫 = [0, 2π] × [0, 2π]; Möbius band, 𝒫 = [−2, 2] × [0, 2π]; and Klein bottle, 𝒫 = [0, 2π] × [0, 2π]. For the sphere attractor, the n points were chosen to lie on a Fibonacci spiral on the unit sphere embedded in ℝ3.
Next, to implement custom manifold-specific distance metrics d we used the Julia package Distances.jl. The standard Euclidean metric was used for the line and plane attractors; for the ring, a one-dimensional periodic Euclidean metric (period 2π) was used; for the torus, a two-dimensional periodic Euclidean metric (period 2π in each direction); and for the cylinder, a mixed metric, periodic along the periodic dimension and standard Euclidean along the non-periodic one. For the sphere, the great-circle (arc) distance for points on the unit sphere (implemented in the Manifolds.jl package [88]) was used. For the Möbius band, a custom metric function was used to account for the non-orientable nature of the manifold.
For the Klein Bottle, a different approach was used. First, we defined an embedding of the Klein Bottle in ℝ4 mapping each lattice point θ = (u, v) to a point q ∈ ℝ4:
Next, we computed the pairwise Euclidean distances in ℝ4 between the embedded points and selected the 8 nearest neighbors of each point. We then constructed a graph in which each node was a lattice point and two nodes were connected if one belonged to the neighborhood of the other. Each edge was assigned a weight equal to the Euclidean distance between the two points. The graph structure was thus taken to represent the local topological structure (connectivity) of the Klein bottle. Given two points θi and θj, their on-manifold distance was then computed by summing the edge weights (local distances) along the shortest path on the graph from the node corresponding to θi to the node corresponding to θj, numerically approximating the geodesic distance between them.
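A sketch of this graph-based geodesic approximation is given below; the use of Graphs.jl and the function name are assumptions of this sketch, and Q stands for the matrix of embedded lattice points (one row per point).

    using Graphs, LinearAlgebra

    # Approximate on-manifold (geodesic) distances from a point cloud Q (n × 4 matrix):
    # connect each point to its nearest neighbors in ℝ⁴, weight edges by Euclidean
    # distance, and sum weights along shortest graph paths.
    function graph_geodesics(Q::AbstractMatrix; n_neighbors::Int = 8)
        n = size(Q, 1)
        D = [norm(Q[i, :] .- Q[j, :]) for i in 1:n, j in 1:n]   # Euclidean distances in ℝ⁴
        g = SimpleGraph(n)
        for i in 1:n
            for j in partialsortperm(D[i, :], 2:n_neighbors+1)  # skip the point itself
                add_edge!(g, i, j)
            end
        end
        G = zeros(n, n)
        for i in 1:n
            G[i, :] .= dijkstra_shortest_paths(g, i, D).dists   # weighted shortest paths
        end
        return G
    end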
Following the computation of pairwise distances, the connection weight between two neurons was computed by applying a kernel function k to their distance, yielding strictly non-positive values for the connection strength. This gave a connectivity pattern characterized by global, long-range inhibition, with no, or reduced, inhibition locally, such that a localized pattern of activation on the neural lattice 𝒫 would remain localized and not result in activation of all neurons in the network. The parameters α, σ of the kernel were varied based on the CAN topology and are indicated in Table 1.

Table 1: Kernel function parameters
CAN simulation
Network dynamics were approximated in discrete time using forward Euler integration with a time step Δt = 0.5 ms. Here s(t) is a vector representing the activity of each neuron in the network at time t, and τ = 5 ms was used as the time constant. The constant input b = 0.5 was used throughout. The term η(t), a vector of length n, was used to simulate Poisson noise in the recurrent dynamics.
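One consistent reconstruction of this update (the exact placement of the noise term is an assumption) is

    s(t + Δt) = s(t) + (Δt/τ) [ −s(t) + f( W s(t) + b ) + η(t) ].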
For each CAN, 2500 simulations of 25ms in duration were performed to generate data for the analysis of the activity manifold topology. We chose 25ms since we observed this to be sufficient for the network to settle into a steady state (i.e. one in which the network’s activity does not change between simulation steps).
For the first 15 ms of each simulation, the activity of neurons at a distance d > 0.5 from a selected neuron θ0 (randomly chosen for each simulation) was artificially set to 0, inducing the formation of a stable bump of activity around θ0 and promoting uniform coverage of the entire manifold across simulations. The final activation vector s(T) of each simulation was then stored for subsequent analysis. For the torus attractor network, additional simulations were performed varying the noise parameter to assess the effect of noise on the attractor dynamics.
Attractor manifold analysis
The final activation vectors of the 2500 CAN simulations for each manifold were collected into a matrix of dimensionality n × 2500, with n being the number of neurons in the network. For networks other than the line and ring attractors, in which n > 400, a first dimensionality-reduction step using PCA was performed to reduce the data to a 400 × 2500 matrix. Further reduction to three dimensions for visualization (Figure 4) was then achieved using Isomap [89]. To reduce computation, Isomap was fitted to a randomly selected 10% of the data points and then used to embed the entire dataset for visualization. For subsequent Topological Data Analysis (TDA), the point-cloud data were subjected to PCA dimensionality reduction to generate a 200 × 2500 data matrix, and Isomap was then used to further reduce the dimensionality to 10 (Isomap fitted to 10% of the data) [3].
Topological data analysis
To perform the persistent homology analysis, the Julia packages Ripserer.jl and PersistenceDiagrams.jl [90] were used. To reduce computation, the TDA filtration was computed using a subset of randomly selected data points (20% of the entire dataset) to obtain the persistence diagrams shown in Figure 4D. Only intervals with a lifespan > 7 were kept, to remove features due to noise, and the number of persistent intervals of each dimension (up to two-dimensional cavities) was counted to obtain Betti numbers, which were then compared with those expected for manifolds of the given topology.
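A minimal sketch of this analysis, assuming the Ripserer.jl interface, is shown below; points stands for the dimensionality-reduced state vectors (e.g., as tuples or static vectors), and the lifespan threshold mirrors the value stated above.

    using Ripserer

    # Persistent homology up to two-dimensional cavities.
    diagrams = ripserer(points; dim_max = 2)

    # Betti numbers: count intervals whose lifespan exceeds the noise threshold.
    betti = [count(interval -> persistence(interval) > 7, dgm) for dgm in diagrams]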
Visualizing neural tuning curves
To visualize neural tuning curves over 𝒩 in Figure 4, we used PCA and Isomap to reduce the dimensionality of neural activity to three dimensions. We thus obtained 2500 low-dimensional points, which we colored according to the activity of one selected neuron in the corresponding neural state. To visualize activity over the neural lattice 𝒫, we started by selecting one random neural state from the 2500 simulations. Then, we uniformly sampled 𝒫 and for each location θi ∈ 𝒫 we identified the closest neuron in the CAN (by coordinates). We then colored each point in 𝒫 according to the activation of the closest neuron.
Intrinsic manifold dimensionality analysis
To estimate the manifold's intrinsic dimensionality, all data points in the n-dimensional state space were used. Pairwise Euclidean distances between data points were computed to obtain each data point's k nearest neighbors (using the NearestNeighbors.jl package). While Euclidean distance in state space does not in general match the on-manifold geodesic distance on 𝒩, on a sufficiently small scale the manifold is locally Euclidean, making this approximation acceptable. Next, 250 random data points (10% of the total) were selected for estimation of the local dimensionality in their neighborhoods. For each, the k closest points were selected and PCA was fitted to the data. The number d of principal components required to explain at least 75% of the variance was used as the estimate of local dimensionality, and the manifold's intrinsic dimensionality was taken to be the average across repeats. Thus, the dimensionality estimation procedure depends on two hyperparameters: k and the percentage of variance explained. Preliminary tests on artificially generated data with known dimensionality and variable Gaussian noise were used to select the parameters used here, and we found the estimated intrinsic dimensionality to be robust across a wide range of parameter values (data not shown). For the analyses shown here we used k = 500 throughout. Our preliminary tests showed that much smaller values of k resulted in noisy estimates (especially in the presence of noise) and very large values of k led to an overestimation of the manifold's intrinsic dimensionality (likely due to the higher global embedding dimensionality).
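A sketch of this estimator is shown below (a minimal version: neighbors are found by brute force rather than with NearestNeighbors.jl, and the PCA call assumes the MultivariateStats.jl interface); k and the variance threshold are the values stated above.

    using MultivariateStats, Statistics, LinearAlgebra

    # X: matrix of population states, one column per sampled state (n_neurons × n_samples).
    function intrinsic_dimension(X::AbstractMatrix; k::Int = 500, var_explained = 0.75, n_seeds::Int = 250)
        nsamples = size(X, 2)
        dims = Int[]
        for _ in 1:n_seeds
            seed = X[:, rand(1:nsamples)]
            d = [norm(X[:, j] .- seed) for j in 1:nsamples]     # distances to the seed state
            neighborhood = X[:, partialsortperm(d, 1:k)]        # k nearest states
            pca = fit(PCA, neighborhood; pratio = 1.0)          # local PCA on the neighborhood
            ratios = principalvars(pca) ./ tvar(pca)            # variance explained per component
            push!(dims, findfirst(cumsum(ratios) .>= var_explained))
        end
        return mean(dims), std(dims)                            # estimate across random seeds
    end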
Attractor dynamics analysis
To explicitly quantify attractor dynamics, a torus network was constructed as described above and simulated without external stimuli for 250 ms (for 100 random initializations). Next, the network's state was perturbed by adding a random vector v of the same dimensionality as the network activity. For each of the 100 simulations, the vector was chosen to have a random orientation but fixed magnitude, computed as 50% of the average distance between states on the torus manifold and the origin. Following the perturbation, the simulation was continued for a further 750 ms, for a total simulation time of 1000 ms. Data from each simulation were collected for the analysis of on- versus off-manifold dynamics. For each repeat, the state just prior to the perturbation was used as the seed for local PCA using the k nearest points from the point-cloud data used previously to estimate the manifold topology (i.e., the steady states from the previous simulations without inputs were used to estimate the tangent plane to the manifold). The top two principal components were retained as an approximation of the manifold's tangent plane. The neural trajectory was then decomposed into on-manifold and off-manifold components by projection onto the tangent plane and onto the remaining N − 2 dimensions, respectively. The Euclidean distance from the initial condition over time was computed in each subspace to assess drift following the perturbation, and averaged across repeats.
QAN construction
To construct quasi-periodic attractor networks for integration, a choice of variable (ℳ) and neural lattice (𝒫) manifolds was first made, ensuring that an identity or periodic mapping existed between the two (unless explicitly stated otherwise). Next, a map π : ℳ → 𝒫 and its inverse π−1 were defined (e.g., mapping each point on the plane, ℳ, to a corresponding point on the torus, 𝒩). To compute connection weights in each QAN, points on 𝒫 were selected as before and the same distance metrics and kernel functions were used, applying an offset to the neurons' coordinates during the distance computation. For the Line and Ring attractors, two QANs were constructed using the offset vectors δ±m = ±0.15. For the Plane, Cylinder, Torus and Möbius attractors, four QANs were constructed using δ±1 = [±0.25, 0] and δ±2 = [0, ±0.25] as offset vectors. For the sphere attractor, six QANs were constructed using as offset vectors δ1± = ±[0, −z, y], δ2± = ±[z, 0, −x] and δ3± = ±[−y, x, 0], where [x, y, z] represents coordinates on the unit sphere embedded in three-dimensional Euclidean space. For the sphere, therefore, the offset vector magnitude varied as a function of position on the sphere, ensuring that Killing vector fields were used (the offsets are constant for the other manifolds used). The same vectors were used to compute the velocity-dependent stimulus to each QAN.
QAN dynamics simulation
Similarly to CANs, network dynamics were simulated using forward Euler integration. For each QAN in a network performing integration, the discrete-time dynamics were:
where s(t) = ∑±m s±m(t) and
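Since the display equations above are not reproduced here, the following is only a hedged sketch of one forward-Euler update. It assumes each offset subnetwork obeys the same leaky dynamics as Eq. 1, with its feedforward input scaled by the velocity component along its offset direction (the heuristic of Supplementary section 3); all names and parameter values are illustrative rather than the paper's.

```julia
# Hedged sketch of a single forward-Euler step for a set of coupled QAN subnetworks.
relu(x) = max(x, 0.0)

function euler_step!(S::Dict, W::Dict, v::Dict; b=1.0, τ=0.01, dt=0.001)
    stot = sum(values(S))                               # s(t) = Σ_{±m} s_{±m}(t)
    for m in keys(S)
        drive = W[m] * stot .+ b * (1 + v[m]) / 2       # recurrent drive + velocity-scaled input
        S[m] .+= (dt / τ) .* (-S[m] .+ relu.(drive))    # leaky rate dynamics (cf. Eq. 1)
    end
    return S
end
```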
To assess integration accuracy, we generated 50 random trajectories (each corresponding to 1 second of simulated time) and simulated integration with the QANs. For each simulation, a trajectory
where d is a metric function for ℳ as described above and Lγ is the trajectory length of γ on ℳ. More precisely, for decoding we used
To generate the trajectories, we first defined a set of 2d vector fields Φi over ℳ, each corresponding to a vector field Ψi over 𝒫. Then, we generated smoothly varying vectors Ai such that at each time t the velocity vector
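One way to realize this trajectory generation is sketched below for ℳ = ℝ² with constant coordinate vector fields; the actual fields Φi depend on the manifold, and the low-pass filtering of the coefficients is our assumption about what "smoothly varying" means here.

```julia
# Hedged sketch: smooth random coefficients weight a fixed set of vector fields on ℳ,
# and the resulting velocity is integrated to obtain the trajectory γ(t).
using Random

function random_trajectory(; T=1000, dt=0.001, τ_smooth=0.1, speed=1.0)
    Φ = ([1.0, 0.0], [0.0, 1.0])                  # coordinate vector fields on the plane
    A = zeros(length(Φ))                          # smoothly varying coefficients A_i(t)
    x = zeros(2)
    traj, vels = [copy(x)], Vector{Float64}[]
    for _ in 1:T
        A .+= (dt / τ_smooth) .* (-A .+ randn(length(Φ)))    # low-pass filtered noise
        v = speed .* sum(A[i] .* Φ[i] for i in eachindex(Φ)) # velocity at time t
        x .+= dt .* v
        push!(traj, copy(x)); push!(vels, v)
    end
    return traj, vels
end
```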
In some conditions we artificially injected Poisson noise in the QAN neural dynamics as described previously to assess the effect of noise on path integration.
Neural tuning curves on ℳ
To visualize neural tuning curves with respect to ℳ in Figure 6, we generated a single trajectory densely sampling ℳ (5–10 seconds of simulated time). After simulating path integration, we selected one random neuron and visualized its tuning curve. The visualization method depended on the manifold topology. For one-dimensional manifolds we simply plotted the decoded value, x, against the neuron’s activity. For most two-dimensional manifolds, with the exception of the sphere, we generated a heatmap by binning x and computing the neuron’s average activity over the samples in each bin; a small amount of noise was added to x before binning to improve visualization. For the sphere, we first sampled 2000 points uniformly distributed on ℳ; for each point we then found the closest decoded value and colored the point according to the neural activity at the corresponding sample.
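The binning step for the two-dimensional case could look like the following sketch; `xs`, `act`, the bin edges, and the bin count are illustrative names and defaults, not the values used for the figures.

```julia
# Sketch of a 2-D tuning-curve heatmap: bin the decoded variable and average one
# neuron's activity per bin. `xs` is a vector of decoded 2-D points, `act` the
# corresponding activity of the chosen neuron.
using Statistics

function tuning_heatmap(xs::Vector{<:AbstractVector}, act::Vector{<:Real}; nbins=25, lo=0.0, hi=1.0)
    sums, counts = zeros(nbins, nbins), zeros(Int, nbins, nbins)
    edges = range(lo, hi; length=nbins + 1)
    for (x, a) in zip(xs, act)
        i = clamp(searchsortedlast(edges, x[1]), 1, nbins)   # bin index along dimension 1
        j = clamp(searchsortedlast(edges, x[2]), 1, nbins)   # bin index along dimension 2
        sums[i, j] += a
        counts[i, j] += 1
    end
    return sums ./ max.(counts, 1)    # average activity per bin (empty bins map to zero)
end
```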
Non-Killing fields and non-periodic manifold mapping
To demonstrate that path integration depended on the weight-offset vector fields Ψ being Killing fields, we generated two variants of the torus and sphere QANs. For the torus, we kept the magnitude and relative orientation of the offset vector fields constant, but gradually rotated their orientation by an angle cos(θ1) (i.e., varying only as a function of position along one manifold dimension). This ensured that the vector fields remained identical at the periodic boundaries, as required. For the sphere, we started from the Killing vector fields described above and simply normalized each vector so that all vectors had constant length. We then ran 50 simulations using random trajectories, as described previously.
To assess path integration when no trivial or periodic mapping between ℳ and 𝒩 existed, we performed path integration simulations with ℳ a cylinder and 𝒩 a sphere. We used the same procedure described above to generate 50 random trajectories over the cylinder and to compute the corresponding velocity vectors over 𝒫.
Supplementary Information

Torus CAN activity manifold (top) and persistence diagram (bottom) for varying noise intensity levels (columns).
1 Kernels constructed through distance metrics produce single bump states
Here we estimate the conditions on the interactions that lead to the formation of bump states on the lattice of neurons, 𝒫.
As earlier, consider an interaction weight matrix W(θ, θ′) = k(d(θ, θ′)). We rewrite the kernel k as k(d) = −k0 + k1(d), where k1(d) → 0 as d → ∞ and k1(0) = k0 > 0; and correspondingly write W(d(θ, θ′)) = −W0 + W1(d(θ, θ′)). We assume that the kernel k has a length scale σ, i.e., k1(d) ≈ 0 for d ≥ σ.
Since σ is the only spatial scale introduced into the dynamics, we qualitatively expect the localized bump states to have a scale of O(σ). If σ is much smaller than the distances over which the manifold 𝒫 has curvature, 𝒫 will be approximately flat within a ball Vσ centered on any x ∈ 𝒫. In this approximation, the conditions for the formation of a stable bump state are the same as those for the formation of a bump state on a globally flat manifold.
To examine the conditions for the existence of a bump state, we first calculate the homogeneous steady state supported by Eq. 1. Next, we note that since W is symmetric in this case, Eq. 1 can be described through an energy function [58], and thus a stable steady state must exist. If the homogeneous state is unstable, there must then exist a stable symmetry-broken state of the system. If this symmetry-broken state is localized, we refer to it as the bump state.
The homogeneous steady state s(x) = s0 must satisfy
where V is the volume of the manifold, ∫dθ, and
Since the kernel k1 is supported on a small volume of the entire manifold,
To examine the stability of this homogeneous state, consider adding a small perturbation of the form exp(α(ω)t + iω · θ) to the homogeneous state s0 in Eq. 1. Following the analysis in Ref. [59], we obtain
where ℱ[W] is the Fourier transform of the interaction W.
where δ(ω) is the Dirac delta function, obtained from the Fourier transform of a constant. Thus, the homogeneous steady state will be unstable if ℱ[W](ω) > 1/τ for some ω. Since α(ω) denotes the rate of exponential growth, the maxima of Eq. 12 determine the dominant growing mode. If ℱ[W] were maximized at some |ω| > 0, the growing perturbation would have a periodic component and would thus likely not form a localized mode. If instead ℱ[W1](ω) is maximized at ω = 0, then ℱ[W](ω) is maximized as ω → 0 (ℱ[W](ω) cannot be maximized strictly at ω = 0 itself because of the −2πW0δ(ω) contribution to ℱ[W](ω)). In this case, the growing perturbation is unimodal, likely leading to the formation of a localized state.
Thus, for the formation of a stable bump state on a general manifold, we obtain two requirements: first, the Fourier transform of the kernel k1(d) must be maximized at ω = 0; and second, this maximum must be larger than 1/τ. Since we are solely interested in interaction shapes that lead to bump formation, we assume we are free to rescale the interactions; if a positive maximum is attained at ω = 0, a rescaling can always make it larger than 1/τ. We therefore focus primarily on the first requirement.
While we do not provide an exhaustive classification of interaction kernels k1 whose Fourier transforms are maximized at zero, we provide a broad sufficient condition: if k1(d) ≥ 0 for all d, then its Fourier transform is maximized at zero. This can be proved as follows:
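The omitted display presumably follows the standard bound (our reconstruction, assuming k1 is real and non-negative):

$$\bigl|\mathcal{F}[k_1](\omega)\bigr| \;=\; \Bigl|\int k_1\bigl(d(\theta,0)\bigr)\, e^{-i\,\omega\cdot\theta}\, d\theta\Bigr| \;\le\; \int k_1\bigl(d(\theta,0)\bigr)\, d\theta \;=\; \mathcal{F}[k_1](0),$$

so the (real and even) transform attains its maximum at ω = 0.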
Thus, we conclude that, up to a rescaling of the interaction strength, an interaction W(d(θ, θ′)) will lead to the formation of a bump state if it can be rewritten as W(d(θ, θ′)) = k1(d(θ, θ′)) − k0 for: k0 ≥ 0; a kernel k1 that satisfies k1(d) ≥ 0 and k1(d) → 0 for d ≥ σ; and a length scale σ sufficiently small that the manifold 𝒫 is approximately flat over distances of order σ.
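As an illustration of this condition, the sketch below builds such a kernel on a ring and checks that activity concentrates into a single bump. It is a minimal sketch, assuming Eq. 1 takes the standard leaky form τ ds/dt = −s + f(Ws + b) with a rectifying f; the parameter values are illustrative and not those used in the paper.

```julia
# Minimal bump-formation check on a ring for a kernel of the form W = k1(d) - k0.
using LinearAlgebra, Random

n, τ, dt, b = 256, 0.01, 0.001, 1.0
θ = range(0, 2π; length=n+1)[1:end-1]                 # neural lattice 𝒫 = ring
d(a, c) = min(abs(a - c), 2π - abs(a - c))            # geodesic distance on the ring
σ, k0 = 0.5, 0.3
W = [exp(-d(a, c)^2 / (2σ^2)) - k0 for a in θ, c in θ] .* (2π / n)   # discretized kernel

s = 0.1 .* rand(n)                                    # small random initial state
for _ in 1:5000
    s .+= (dt / τ) .* (-s .+ max.(W * s .+ b, 0.0))   # forward Euler on the assumed Eq. 1
end
# `s` should now be a single localized bump; e.g. check how concentrated the activity is:
println("active fraction: ", count(>(0.01 * maximum(s)), s) / n)
```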
2 Manifold of single bump states 𝒩 is isometric to manifold of neural lattice 𝒫
Here we will show that the manifold 𝒩 of neural activity, formed through single bump states at each point of the neural lattice 𝒫, is isometric to 𝒫. Specifically, we provide a distance metric dN on the manifold 𝒩 such that (𝒩, dN) is isometric to (𝒫, dP), where dP is the geodesic distance used as the distance metric in the MADE construction described in the main text.
While we will not prove this in complete generality for arbitrary 𝒫, we assume that when 𝒫 has a sufficiently large separation of length scales (as in the previous section), it suffices to show the result for 𝒫 taken to be the flat Euclidean manifold ℝn (and, correspondingly, dP taken to be the usual L2 metric).
To prove the existence of an isometry, we first argue that 𝒩 and 𝒫 are diffeomorphic. In Apx. 1, we argued that the prescribed connectivity kernel leads to the formation of activity bump states centered at any x ∈ 𝒫. Define the function f from 𝒫 to 𝒩 to characterize the shape of the activity bump, i.e., for any x0 ∈ 𝒫, we let
Note that a direct L2 norm on 𝒩 does not suffice, since for sufficiently distant x0 and x1,
Next, we examine candidate metrics on 𝒩 that may lead to an isometry with (𝒫, L2).
Instead, we construct a metric of intrinsic length induced by the Riemannian metric on the tangent spaces of 𝒩. For any two vectors u(x) and v(x) in Ts𝒩, the tangent space of 𝒩 at s, define the Riemannian metric as g(u, v) = ⟨u, v⟩ = ∫ u(x) v(x) dx. Then, for any path γ(t) ∈ 𝒩, we can define the length of the path L[γ(t)] as
where the norm of a tangent vector γ′ is defined as
Assume that 𝒩 is an n-dimensional manifold. Let (x1, …, xn) be a coordinate chart in the neighborhood of a bump state
We can now compute the metric tensor gij = g(∂/∂xi, ∂/∂xj)
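The omitted displays (referenced below as Eqs. 13 and 14) presumably take the following form, writing the bump state centered at x0 as F(|x − x0|) (our hedged reconstruction):

$$g_{ij} \;=\; \int \frac{\partial F(|x-x_0|)}{\partial x_0^{\,i}}\, \frac{\partial F(|x-x_0|)}{\partial x_0^{\,j}}\, dx \;=\; \int \frac{\partial F(|y|)}{\partial y^{\,i}}\, \frac{\partial F(|y|)}{\partial y^{\,j}}\, dy \;=\; \int F'(|y|)^2\, \frac{y_i\, y_j}{|y|^2}\, dy .$$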
where Eq. 13 is obtained by performing the change of variables y = x − x0. From Eq. 14 we can make two crucial observations: first, since the integrand is odd in yi and yj, gij = 0 for i ≠ j; second, gii is independent of x0 and, by symmetry, is also independent of i: it is entirely determined by the shape of the bump state F(|x|). Thus, the metric tensor gij has constant entries on the diagonal and zeros off the diagonal, i.e., g is proportional to the identity matrix. We denote this proportionality constant as α.
The length of an infinitesimal line element ds is then given as
Thus, under the approximation that 𝒫 can be treated as a flat space, without curvature at scales smaller than σ, the metric space (𝒩, dN) is isometric to the metric space (𝒫, dP).
3 External velocities ignorant about network structure and state require shifted-kernel networks to control bump flow
In this section, for analytical simplicity, we will ignore the neural transfer function nonlinearity f.
The fixed points resulting from symmetric kernels in Eq. 1 satisfy:
where s(θ) denotes an activity bump centered at any point in 𝒫. Consider two such activity bump states: s0(θ) centered at θ0, and a nearby state sϵ centered at θ0−ϵ, i.e., sϵ(θ) = s0(θ+ϵ). For a neural state s(θ) to move from s0 to sϵ in time Δt, the time derivative ∂s/∂t must equal
We can use Eq. 15 to evaluate this space derivative as
where Wδ represents a kernel with a small offset δ, i.e., Wδ = W(θ − θ′ − δ). We can insert this into Eq. 16 to obtain
Comparing the above equation with Eq. 1, we find that the neural time constant τ = δΔt/ϵ. Since the speed of the activity bump is ϵ/Δt, we obtain a speed of
Thus, a network built with a kernel with an offset δ in a particular direction produces activity flow along that direction. Coupling multiple copies of such networks with opposing directions of kernel offsets leads to an equilibrium, with the bump state at a fixed position. This can be seen intuitively by noting that Wδs + W−δs ≈ (W + δ∂θW)s + (W − δ∂θW)s = 2Ws, so opposing offset kernels acting on the same state are equivalent to the state being acted on by a kernel with no offset.
To control the flow of the bump in arbitrary directions, we next demonstrate that the magnitude of the feed-forward input b to a particular subnetwork can bias the motion of the bump. To see this, we first consider Eq. 1 scaled by a factor α,
Thus, scaling b by a factor α (i.e., b → αb) results in an equivalent solution of the dynamical equation with the states s also scaled by the same factor α (i.e., s(θ) → αs(θ)).
Consider two such coupled networks with opposing offsets, with feedforward inputs scaled by α1 = (1 + α)/2 and α2 = (1 − α)/2. As noted above, the neural firing rates can be assumed to be scaled by the same factors. Heuristically, we assume that the firing rates of the coupled network can be approximated by the individually scaled firing rates of independent offset networks. This leads to the effective interaction through the offset kernels as
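The omitted display presumably combines the two offset kernels at first order in δ (our reconstruction, using the same expansion as above):

$$\alpha_1 W_{\delta}\, s + \alpha_2 W_{-\delta}\, s \;\approx\; \frac{1+\alpha}{2}\,\bigl(W + \delta\,\partial_\theta W\bigr)\, s + \frac{1-\alpha}{2}\,\bigl(W - \delta\,\partial_\theta W\bigr)\, s \;=\; \bigl(W + \alpha\,\delta\,\partial_\theta W\bigr)\, s \;\approx\; W_{\alpha\delta}\, s .$$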
Thus, the effective interaction is similar to that obtained by a kernel with an offset of δα, leading to a bump speed of δα/τ.
Finally, we note that while the above argument has been constructed for offsets along a single dimension, it readily generalizes to higher dimensions: For continuous and differentiable W, a directional derivative can be written as a linear combination of partial derivatives along coordinate axes, i.e.,
Thus, subnetworks with differently scaled feedforward inputs have correspondingly scaled firing rates, which yields an interaction kernel with an effective offset along the vector direction determined by the scaling coefficients. This effective offset causes the activity bump to flow along the manifold in that direction, producing controllable flow of the activity bump through differential feed-forward inputs to the coupled network.
References
- [1] Attractor and integrator networks in the brain. Nat. Rev. Neurosci 23:744–766
- [2] Attractor dynamics of spatially correlated neural activity in the limbic system. Annu. Rev. Neurosci 35:267–285
- [3] The intrinsic attractor manifold and population dynamics of a canonical cognitive circuit across waking and sleep. Nat. Neurosci 22:1512–1520
- [4] Head-direction cells recorded from the postsubiculum in freely moving rats. I. Description and quantitative analysis. J. Neurosci 10:420–435
- [5] The head direction signal: origins and sensory-motor integration. Annu. Rev. Neurosci 30:181–207
- [6] Neural dynamics and architecture of the heading direction circuit in zebrafish. Nat. Neurosci 26:765–773
- [7] The neuroanatomical ultrastructure and function of a biological ring attractor. Neuron 108:145–163
- [8] A connectome of the Drosophila central complex reveals network motifs suitable for flexible navigation and context-dependent action selection. eLife 10:e66039. https://doi.org/10.7554/eLife.66039
- [9] A neural circuit architecture for angular integration in Drosophila. Nature 546:101–106
- [10] Ring attractor dynamics in the Drosophila central brain. Science 356:849–853
- [11] An improved neural-network model for the neural integrator of the oculomotor system: more realistic neuron behavior. Biol. Cybern 53:93–108
- [12] The hypothesis of the uniqueness of the oculomotor neural integrator: direct experimental evidence in the cat. J. Physiol 492:517–527
- [13] Discharge patterns and recruitment order of identified motoneurons and internuclear neurons in the monkey abducens nucleus. J. Neurophysiol 60:1874–1895
- [14] Anatomy and discharge properties of pre-motor neurons in the goldfish medulla that have eye-position signals during fixations. J. Neurophysiol 84:1035–1049
- [15] Continuous attractors and oculomotor control. Neural Netw 11:1253–1258
- [16] Microstructure of a spatial map in the entorhinal cortex. Nature 436:801–806
- [17] The entorhinal grid map is discretized. Nature 492:72–78
- [18] Accurate path integration in continuous attractor network models of grid cells. PLoS Comput. Biol 5:e1000291
- [19] Principles of insect path integration. Curr. Biol 28:R1043–R1058
- [20] Distance-tuned neurons drive specialized path integration calculations in medial entorhinal cortex. Cell Rep 36
- [21] The anatomical and computational basis of the rat head-direction cell signal. Trends Neurosci 24:289–294
- [22] Angular velocity and head direction signals recorded from the dorsal tegmental nucleus of Gudden in the rat: implications for path integration in the head direction cell circuit. Behav. Neurosci 115:571–588
- [23] Place cells, grid cells, and the brain’s spatial representation system. Annu. Rev. Neurosci 31:69–89
- [24] A computational theory of the hippocampal cognitive map. Prog. Brain Res
- [25] Engagement of neural circuits underlying 2D spatial navigation in a rodent virtual reality system. Neuron 84:442–456
- [26] A map of visual space in the primate entorhinal cortex. Nature 491:761–764
- [27] Organizing conceptual knowledge in humans with a gridlike code. Science 352:1464–1468
- [28] How to build a cognitive map. Nat. Neurosci 25:1257–1272
- [29] Efficient inference in structured spaces. Cell 183:1147–1148
- [30] Grid cells in cognition: mechanisms and function. Annu. Rev. Neurosci 47:345–368
- [31] Flexible mapping of abstract domains by grid cells via self-supervised extraction and projection of generalized velocity signals. In: The Thirty-eighth Annual Conference on Neural Information Processing Systems
- [32] Encoding of female mating dynamics by a hypothalamic line attractor. Nature: 1–9
- [33] Grid cells: the position code, neural network models of activity, and the problem of learning. Hippocampus 18:1283–1300
- [34] Representation of spatial orientation by the intrinsic dynamics of the head-direction cell ensemble: a theory. J. Neurosci 16:2112–2126
- [35] Double-ring network model of the head-direction system. Phys. Rev. E Stat. Nonlin. Soft Matter Phys 66:041902
- [36] Theory of orientation tuning in visual cortex. Proc. Natl. Acad. Sci. U. S. A 92:3844–3848
- [37] Stability of the memory of eye position in a recurrent network of conductance-based model neurons. Neuron 26:259–271
- [38] Electron microscopic reconstruction of functionally identified cells in a neural integrator. Curr. Biol 27:2137–2147
- [39] A model of the neural basis of the rat’s sense of direction. Adv. Neural Inf. Process. Syst 7:173–180
- [40] Specific evidence of low-dimensional continuous attractor dynamics in grid cells. Nat. Neurosci 16:1077–1084
- [41] Grid cell responses in 1D environments assessed as slices through a 2D lattice. Neuron 89:1086–1099
- [42] Grid cell co-activity patterns during sleep reflect spatial overlap of grid fields during active behaviors. Nat. Neurosci 22:609–617
- [43] Correlation structure of grid cells is preserved during sleep. Nat. Neurosci
- [44] Toroidal topology of population activity in grid cells. Nature 602:123–128
- [45] A spin glass model of path integration in rat medial entorhinal cortex. Journal of Neuroscience 26:4266–4276
- [46] Path integration and the neural basis of the ‘cognitive map’. Nat. Rev. Neurosci 7:663–678
- [47] A coupled attractor model of the rodent head direction system. Network: Computation in Neural Systems 7:671
- [48] Do we understand the emergent dynamics of grid cell activity? J. Neurosci 26:9352–9354
- [49] Vector-based navigation using grid-like representations in artificial agents. Nature 557:429–433
- [50] Emergence of grid-like representations by training recurrent neural networks to perform spatial localization. arXiv
- [51] A unified theory for the computational and mechanistic origins of grid cells. Neuron
- [52] Explaining heterogeneity in medial entorhinal cortex with task-driven neural networks. In: Advances in Neural Information Processing Systems
- [53] Breaking neural network scaling laws with modularity. arXiv
- [54] How diffusion models learn to factorize and compose. arXiv
- [55] Dynamics and computation of continuous attractors. Neural Comput 20:994–1025
- [56] A model of grid cells based on a twisted torus topology. Int. J. Neural Syst 17:231–240
- [57] Synaptic mechanisms and network dynamics underlying spatial working memory in a cortical network model. Cerebral Cortex 10:910–923
- [58] Neurons with graded response have collective computational properties like those of two-state neurons. Proceedings of the National Academy of Sciences 81:3088–3092
- [59] Spontaneous emergence of topologically robust grid cell modules: a multiscale instability theory. bioRxiv
- [60] Quasiperiodicity. Cambridge University Press, pp. 212–245
- [61] On local and global existence of Killing vector fields. Annals of Mathematics: 105–120
- [62] On Killing vector fields on Riemannian manifolds. Mathematics 9:259
- [63] Smooth manifolds. Springer
- [64] One-shot entorhinal maps enable flexible navigation in novel environments. Nature 635:943–950
- [65] Introduction to Riemannian manifolds. Springer
- [66] State space discovery in spatial representation circuits with persistent cohomology. bioRxiv
- [67] Using persistent homology to reveal hidden information in neural data. arXiv
- [68] Barcodes: the persistent topology of data. Bull. Am. Math. Soc 45:61–76
- [69] Interpreting neural computations by examining intrinsic and embedding dimensionality of neural activity. arXiv
- [70] Nonlinear manifolds underlie neural population activity during behaviour. bioRxiv
- [71] Estimating the dimensionality of the manifold underlying multi-electrode neural recordings. PLoS Comput. Biol 17:e1008591
- [72] Multiscale geometric methods for estimating intrinsic dimension. MIT. http://hdl.handle.net/1721.1/72597
- [73] An algorithm for finding intrinsic dimensionality of data. IEEE Trans. Comput C-20:176–183
- [74] Differential geometry of curves and surfaces. CRC Press
- [75] Neurons in primate entorhinal cortex represent gaze position in multiple spatial reference frames. J. Neurosci 38:2430–2441
- [76] Dynamics of pattern formation in lateral-inhibition type neural fields. Biological Cybernetics 27:77–87
- [77] Opening the black box: low-dimensional dynamics in high-dimensional recurrent neural networks. Neural Comput 25:626–649
- [78] Universality and individuality in neural dynamics across large populations of recurrent networks. Adv. Neural Inf. Process. Syst 2019:15629–15641
- [79] Reverse-engineering recurrent neural network solutions to a hierarchical inference task for mice. bioRxiv
- [80] Linking connectivity, dynamics, and computations in low-rank recurrent neural networks. Neuron 99:609–623
- [81] Differential geometry methods for constructing manifold-targeted recurrent neural networks. Neural Comput 34:1790–1811
- [82] A unified approach to building and controlling spiking attractor networks. Neural Comput 17:1276–1314
- [83] Learning to represent continuous variables in heterogeneous neural networks. Cell Rep 39:110612
- [84] Neural Engineering: Computation, Representation, and Dynamics in Neurobiological Systems. MIT Press
- [85] Mapping low-dimensional dynamics to high-dimensional neural activity: a derivation of the ring model from the neural engineering framework. Neural Comput 33:827–852
- [86] Engineering recurrent neural networks from task-relevant manifolds and dynamics. PLoS Comput. Biol 16:e1008128
- [87] Shaping dynamics with multiple populations in low-rank recurrent networks. Neural Comput 33:1572–1615
- [88] Manifolds.jl: an extensible Julia framework for data analysis on manifolds. arXiv
- [89] A global geometric framework for nonlinear dimensionality reduction. Science 290:2319–2323
- [90] Ripserer.jl: flexible and efficient persistent homology computation in Julia. Journal of Open Source Software 5:2614
Article and author information
Cite all versions
You can cite all versions using the DOI https://doi.org/10.7554/eLife.107224. This DOI represents all versions, and will always resolve to the latest one.
Copyright
© 2025, Claudi et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.