Theoretical principles explain the structure of the insect head direction circuit
Abstract
To navigate their environment, insects need to keep track of their orientation. Previous work has shown that insects encode their head direction as a sinusoidal activity pattern around a ring of neurons arranged in an eightcolumn structure. However, it is unclear whether this sinusoidal encoding of head direction is just an evolutionary coincidence or if it offers a particular functional advantage. To address this question, we establish the basic mathematical requirements for direction encoding and show that it can be performed by many circuits, all with different activity patterns. Among these activity patterns, we prove that the sinusoidal one is the most noiseresilient, but only when coupled with a sinusoidal connectivity pattern between the encoding neurons. We compare this predicted optimal connectivity pattern with anatomical data from the head direction circuits of the locust and the fruit fly, finding that our theory agrees with experimental evidence. Furthermore, we demonstrate that our predicted circuit can emerge using Hebbian plasticity, implying that the neural connectivity does not need to be explicitly encoded in the genetic program of the insect but rather can emerge during development. Finally, we illustrate that in our theory, the consistent presence of the eightcolumn organisation of head direction circuits across multiple insect species is not a chance artefact but instead can be explained by basic evolutionary principles.
Editor's evaluation
This important work suggests that the observed cosinelike activity in the head direction circuit of insects not only subserves vector addition but also minimizes noise in the representation. The authors provide solid evidence using the locust and fruit fly connectomes. The work raises important theoretical questions about the organization of the navigation system and will be of interest to theoretical and experimental researchers studying navigation.
https://doi.org/10.7554/eLife.91533.sa0eLife digest
Insects, including fruit flies and locusts, move throughout their environment to find food, interact with each other or escape danger. To navigate their surroundings, insects need to be able to keep track of their orientation. This tracking is achieved through visual cues and integrating information about their movements whilst flying so they know which direction their head is facing.
The set of neurons responsible for relaying information about the direction of the head (also known as heading) are connected together in a ring made up of eight columns of cells. Previous studies showed that the level of activity across this ring of neurons resembles a sinusoid shape: a smooth curve with one peak which encodes the animal’s heading. Neurons downstream from this eightcolumn ring, which relay velocity information, also display this sinusoidal pattern of activation.
Aceituno, Dall’Osto and Pisokas wanted to understand whether this sinusoidal pattern was an evolutionary coincidence, or whether it offers a particular advantage to insects. To answer this question, they established the mathematical criteria required for neurons in the eightcolumn ring to encode information about the heading of the animal. This revealed that these conditions can be satisfied by many different patterns of activation, not just the sinusoidal shape.
However, Aceituno, Dall’Osto and Pisokas show that the sinusoidal shape is the most resilient to variations in neuronal activity which may impact the encoded information. Further experiments revealed that this resilience only occurred if neurons in the circuit were connected together in a certain pattern.
Aceituno, Dall’Osto and Pisokas then compared this circuit with experimental data from locusts and fruit flies and found that both insects exhibit the predicted connection pattern. They also discovered that animals do not have to be born with this neuronal connection pattern, but can develop it during their lifetime.
These findings provide fresh insights into how insects relay information about the direction of their head as they fly. They suggest that the structure of the neuronal circuit responsible for encoding head direction was not formed by chance but instead arose due to the evolutionary benefits it provided.
Introduction
Insects exhibit an impressive ability to navigate the world, travelling long distances to migrate, find food or reach places of interest before returning to their nests (Müller and Wehner, 1988; Menzel et al., 1996; Heinze et al., 2013; Collett, 2019), a feat that requires them to keep track of their orientation across long journeys (Mappes and Homberg, 2004; Merlin et al., 2012; Warren et al., 2019; Collett, 2019; Beetz et al., 2022). This orientation tracking is achieved by the use of visual cues as well as integrating angular velocity signals over time to maintain a heading estimate relative to a starting angle (Seelig and Jayaraman, 2015; Taube, 2007), known as heading integration.
Electrophysiological and calcium imaging studies have shown that the neural population encoding the head direction in insects has a sinusoidshaped activation pattern (Labhart, 1988; Labhart, 2000; Loesel and Homberg, 2001; Pfeiffer et al., 2005; Pfeiffer and Homberg, 2007; Kinoshita et al., 2007; Heinze et al., 2009; Homberg et al., 2011; El Jundi et al., 2014; El Jundi et al., 2019). Furthermore, downstream neural populations also encode velocity signals as sinusoidal activations (Lyu et al., 2022).
Theoretical work has speculated that this recurring motif of sinusoidal activity patterns might be so prevalent because it enables easy elementwise vector addition, where vectors encoded as sinusoidal activity waveforms can be added together to give a sinusoidal waveform encoding the sum of the vectors (Touretzky et al., 1993; Wittmann and Schwegler, 1995; Vickerstaff and Di Paolo, 2005). This allows the encoded heading to be easily used by downstream circuitry to track the insect’s position (Mittelstaedt, 1985; Wittmann and Schwegler, 1995; Vickerstaff and Di Paolo, 2005; Haferlach et al., 2007; Wessnitzer et al., 2008; Sakura et al., 2008; Stone et al., 2017). Further studies have shown that models closely aligned with biological data can indeed implement heading integration using sinusoidal activity patterns (Pisokas et al., 2020; TurnerEvans et al., 2020), and that such circuits can be learned (Vafidis et al., 2022).
Here, we show that enabling easy vector addition cannot be the unique driving factor for the presence of sinusoidal heading encodings, as many other circuits with different activity patterns can perform vector addition in the same way. This finding led us to question whether the sinusoidal activation patterns seen in insect navigation circuits are a coincidence, or if they might offer a particular functional advantage that was selected for during evolution.
To address this question, we consider the basic principles necessary for a circuit encoding direction. Of all the circuits fulfilling these requirements, the sinusoidal activity pattern offers the best resilience to noise for the encoded information. However, obtaining this activity requires a circuit with a specific connectivity pattern between neurons. Thus, our theory predicts that the heading integration circuit will have a sinusoidal connectivity pattern. We compare our predicted circuit with connectivity data for the desert locust (Schistocerca gregaria) and fruit fly (Drosophila melanogaster) using network analysis tools, showing a strong agreement. We then ask how an insect brain might develop such a circuit, finding that a simple Hebbian learning rule is sufficient. Finally, we combine ideas from replication dynamics with our theory, which leads us to the conclusion that the eightcolumn structure is a consequence of basic theoretical principles, rather than an evolutionary coincidence.
Results
A theoretical circuit for heading integration
We consider a population of $N$ ‘compass neurons’ with an activity that encodes the direction of the insect as an angular variable θ. We represent the activity of this population by a vector where each element corresponds to the activity of one neuron, $\mathbf{a}(\theta )=[{\mathbf{a}}_{1}(\theta ),{\mathbf{a}}_{2}(\theta ),...,{\mathbf{a}}_{N}(\theta )]$. We take $N=8$ neurons, consistent with data from many insect species which possess an eightcolumn organisation, with each column encoding a different direction (Honkanen et al., 2019; Stone et al., 2017; Pisokas et al., 2020).
Each neuron’s activity is updated depending on its current firing rate and the inputs it receives, both from other neurons in the circuit and externally. We formulate this update rule as follows:
where $\mathbf{W}$ is the circuit’s weight matrix, representing the connections between neurons; $\varphi$ is the neural activation function, that converts the total neural input into an output firing rate; and $\mathbf{u}(t)$ is the external input that encodes the insect’s angular velocity, via the PEN population of neurons (Green et al., 2017; TurnerEvans et al., 2017; Sayre et al., 2021; Pisokas et al., 2020; TurnerEvans et al., 2020; Hulse et al., 2021).
To simplify our derivations we allow the neural activity values to be both positive and negative, interpreting these values as being relative to a baseline neural firing rate. Similarly, we allow the weights to be both positive and negative, a common simplification in computational models (Li et al., 2023; Cornford et al., 2020; Kriegeskorte and Golan, 2019). This simplification will be addressed in section ‘Comparing the predicted circuit with biological data’ where we compare our model with experimental data.
Mathematical principles for neural heading integration
A circuit capable of performing heading integration must fulfil the requirements outlined in Table 1. The first two requirements allow us to establish the family of possible path integration circuits. We then use the principle of noise minimisation to determine which circuits perform best.
Constraints on the neural activity
The neural activity should encode the insect’s head direction with a matching topology. Because the heading is a single angular variable, the topology of the activity space should be a 1D circle.
Furthermore, the symmetry requirement implies that rotating the heading of the insect should rotate the neural activity vector without changing its shape. Concretely, whether the insect is facing north or east, the activity of the neural population as a whole should be the same, but with the identity of the neuron with each activity value being different.
We can formalise the symmetry requirement by considering a head direction, θ, and a rotation by an integer multiple, $k$, of the angular spacing between neurons, $\mathrm{\Delta}\theta =\frac{2\pi k}{N}$. In this case, individual neuron activities follow the relation
which enforces that the neural activity vector is circularly rotated as the head direction changes. This relation can be expressed in the Fourier domain where, by the shift property, the circular rotation becomes a multiplication by a complex exponential:
where $i=\sqrt{1}$ and $f\in [0,...,N1]$ is the index of the spatial frequency, also called the harmonic. Activity with spatial frequency $f$ has $f$ ‘bumps’ around the ring.
Since the complex exponential has unit norm, the magnitude of the Fourier components remains the same for any rotation, $\Vert \mathcal{F}[\mathbf{a}(\theta +\mathrm{\Delta}\theta )]\Vert =\Vert \mathcal{F}[\mathbf{a}(\theta )]\Vert \phantom{\rule{mediummathspace}{0ex}}\mathrm{\forall}\mathrm{\Delta}\theta$. Therefore, the shape of the activity pattern, $\mathbf{a}\equiv \mathbf{a}(0)$, can also be fully specified by its Fourier domain representation, $\mathcal{F}[\mathbf{a}]$, and the phase of this activity profile around the network encodes the heading of the insect:
Taking the inverse Fourier transform with the constraint that the neural activities must be real, we get the following form for the neural activity,
It should be noted that the phase offset, $\theta f$, of each cosine waveform scales with the spatial frequency. This is because higher frequency waveforms have shorter wavelengths (in terms of number of neurons), so need their phases to rotate more quickly to move at the same speed around the network.
To explain this in more detail, we consider the case where this scaling is not used and the phase offset is the same for all harmonics, shown in Figure 1 top row. For $f=1$ (red line in Figure 1), the waveform has one bump and a wavelength of 8 neurons, so a 180° rotation corresponds to the waveform moving 4 neurons. But for $f=2$ (blue line in Figure 1), the waveform has two bumps and a wavelength of 4 neurons, so a 180° rotation corresponds to moving only 2 neurons, while a 4 neuron shift would correspond to a full rotation. The fact that the waveforms for different frequencies move at different speeds through the network implies that, if multiple spatial frequencies are used to independently encode the heading, their positions would not be aligned and the combined waveform would change shape as different angles are encoded (purple line in Figure 1). This violates the rotational symmetry principle, as the activity would look significantly different when the insect is facing north compared to south.
To solve this problem of misaligned harmonics, we return to the case specified in Equation 5 where the phase offset is scaled linearly with the spatial frequency so that all waveforms move at the same speed and rotational symmetry is ensured, which is shown in Figure 1 bottom row. For a network with $N=8$ neurons, a 180° rotation shifts the activity waveform by 4 neurons. This 4 neuron shift corresponds to a $\frac{4}{8}$ or 180° phase offset for the $f=1$ waveform, which has a wavelength of 8 neurons, but a $\frac{4}{4}$ or 360° phase offset for the $f=2$ waveform, which has a wavelength of 4 neurons. While this maintains rotational symmetry, it implies that the $f=2$ waveform is the same whether the heading angle 0° or 180° is encoded. As a consequence, a higher harmonic waveform (having $f>1$) cannot on its own specify the encoded angle. The activity in all harmonics must be considered simultaneously to decode the angle, but even then this decoding might not be unique in the presence of noise (see section ‘Ambiguities in multiple harmonics decoding with drift’). The difficulties associated with encodings utilising multiple harmonics will be further addressed in the following sections.
Constraints on heading integration circuits
The basic assumptions outlined earlier also constrain the possible heading integration circuits – such circuits should allow for neural activity with the required topology and rotational symmetry to stably exist and propagate. Together, these two principles require that the activity in the circuit should have a constant total magnitude. We consider that this constraint is enforced by the nonlinear neural activation function, ϕ, in Equation 1, as detailed in section ‘Path integration dynamics for heading and position’.
If the network activity is at the desired level, and the external input $\mathbf{u}(t)$ is projected onto the ring attractor such that it does not alter the total network activity, then we can consider the network dynamics to be linear at this operating point. Therefore, while within the space of possible activities that corresponds to the ring attractor, our circuit dynamics are effectively described as,
The rotational symmetry principle also applies to the network. For the same shaped activity waveform to be able to stably exist at any position around the network, the network connectivity should also be rotationally symmetric. For example, the connection strength between the neurons encoding the north and northeast directions should be the same as between those encoding south and southwest. Mathematically, this imposes that the weight matrix, $\mathbf{W}$, is circulant, specifically that $\mathbf{W}}_{n,m}={\mathbf{W}}_{n+k,m+k$. This matrix is fully specified by its first row, called the connectivity profile and denoted $\omega$, meaning that we can express the product of the matrix with the neural activity as
which can be simplified in terms of the convolution operation,
Considering the case where the insect is not moving, $\mathbf{u}(t)=0$, and the network activity is stable, $\dot{\mathbf{a}}(\theta )=0$, we can combine Equation 6 and Equation 8 to get a relation for the stable network activity
In the Fourier domain, this simplifies into
As for the activity waveform analysis, we note that the Fourier transform is taken on the neural indices, not on the temporal domain.
Here, we have $N$ equations that have to be satisfied, since the Fourier transform of an $N$dimensional vector is also $N$dimensional. For each harmonic frequency $f$, Equation 10 has two solutions:
${\mathcal{F}}_{f}\left[\omega \right]=1$, which implies that activity with this spatial frequency is stable in the network, and therefore that it can encode the insect’s heading.
${\mathcal{F}}_{f}[\mathbf{a}(\theta )]=0$, meaning that the activity in this harmonic is zero, and thus nothing can be encoded in this frequency.
Minimising noise propagation in the circuit
The only constraint on the connectivity weights given by Equation 10 is that, if a frequency is used for encoding then ${\mathcal{F}}_{f}\left[\omega \right]=1$. There is no restriction on the weights for the inactive harmonics – they remain free parameters.
However, nonencoding channels can still propagate noise, which would be prevented by setting ${\mathcal{F}}_{f}\left[\omega \right]=0$. To illustrate this, we consider white noise, denoted by $\u03f5$, that is added to the neural activity. When this noisy activity evolves in accordance with the dynamics from Equation 8, we have
where the term $\omega \ast \u03f5$ corresponds to noise and should therefore be dampened. We want to minimise the strength of that noise, which is quantified by its variance
Hence the magnitude of the noise that passes from one time interval to the next is modulated by the magnitude of the weight vector. By Parseval’s theorem,
which implies that to minimise noise propagation in the network we should impose ${\mathcal{F}}_{f}\left[\omega \right]=0$ for all harmonics, $f$, where ${\mathcal{F}}_{f}[\mathbf{a}(\theta )]=0$. We show this in simulation in Figure 2, where only the first harmonic encodes information and we vary the weights of the other harmonics. The noise is indeed minimised if the weights for all nonencoding harmonics are set to zero.
This result establishes that all nonencoding harmonics in the network should be set to 0 to minimise noise propagation, and allows us to recover the circuit connectivity
where $\mathbf{F}$ is the set of harmonics used to encode the head direction, $\mathbf{F}=\{f\in [\mathrm{1...}N1]:\Vert {\mathcal{F}}_{f}[\mathbf{a}]\Vert \ne 0\}$. The choice of $\mathbf{F}$ therefore determines both the harmonics used for encoding the angle in the activity and the connectivity of the circuit that supports this activity.
We leverage the same logic to prove that the number of encoding channels does not affect the signaltonoise ratio in the network. If we have $c$ encoding channels, each with the same activity, the total activity will grow linearly with $c$. The noise will also grow with $\Vert \omega {\Vert}^{2}=c$, meaning that the signaltonoise ratio will remain constant.
We now return to the question of whether using one or multiple harmonics is better. As mentioned, the signaltonoise ratio in the network remains constant as additional encoding channels, $c$, are used. But additional channels imply additional harmonics, which by Parseval’s theorem require a higher total neural activity. If the total activity in the network is limited, the signaltonoise ratio is in fact decreased when using more than one harmonic.
Additionally, as detailed in sections ‘Constraints on the neural activity’ and ‘Ambiguities in multiple harmonics decoding with drift’, circuits using multiple harmonics perform worse in high noise environments because they are not guaranteed to provide an unambiguous orientation encoding. Finally, as we will discuss in sections ‘Learning rules and development’ and ‘Convergence of Oja’s rule with multiple harmonics’, the use of multiple harmonics also complicates the circuit’s development. We therefore only select circuits that use a single harmonic for further analysis, reducing the number of possibilities to $N$.
Determining the optimal circuit
As we consider networks with $N=8$ neurons, consistent with multiple insect species (Honkanen et al., 2019; Stone et al., 2017; Pisokas et al., 2020), the possible single harmonic circuits are $f=\{1,2,3,4,5,6,7\}$, where the zeroth harmonic is discarded because it only represents the baseline neural activity. This gives the following activities and weights, derived from Equation 5 and Equation 14
The circuits for the first four harmonics are plotted in Figure 3, but not all of these circuits are valid. In particular, circuits with even spatial frequencies have multiple neurons with identical activity values because they share a common divisor with $N=8$, as explained in detail in section ‘Degenerate circuits’.
For $f=4$ there are only two unique activity values, ${\mathbf{a}}_{n}=\pm \mathrm{cos}(\theta )\text{}\mathrm{\forall}n$, so this circuit can only encode one dimension, not a circular topology (see Figure 3D). For $f=2$ and $f=6$ (not plotted), there are four unique activity values, allowing the angle to be properly encoded. However, these circuits are degenerate because they are composed of two independent subcircuits, each encoding one direction (see Figure 3B). Because the weights connecting the two subcircuits are all zero, the activity in each subcircuit is independent of the activity in the other, and so the overall activity cannot be constrained to have the required circular topology.
All circuits with odd harmonic frequencies are equivalent. For example, as shown in Figure 3, the connections in the circuit for $f=3$ are the same as those for the $f=1$ circuit after the neuron identities are permuted. As detailed in section ‘Equivalence under permutation’, the frequencies $f=\{1,3,5,7\}$ always give the same circuit because the odd frequencies are coprime with the number of neurons $N=8$.
Since the activities and weights of all the nondegenerate circuits $f=\{1,3,5,7\}$ are the same as the base harmonic $f=1$ up to a permutation, we choose the lowest harmonic $f=1$, which gives us the following activity and weights:
Comparing the predicted circuit with biological data
Our theory proposes an optimally noiseresistant circuit for heading integration, and its corresponding activity. The prediction that heading should be encoded as a sinusoidal activity bump is consistent with previous theoretical models (Touretzky et al., 1993; Wittmann and Schwegler, 1995; Hartmann and Wehner, 1995; Zhang, 1996; Vickerstaff and Di Paolo, 2005; Haferlach et al., 2007; Stone et al., 2017), as well as experimental evidence in both the locust and fruit fly (Heinze et al., 2009; TurnerEvans et al., 2017). We note, however, that data from the fruit fly shows a more concentrated activity bump than what would be expected from a perfect sinusoidal profile (Seelig and Jayaraman, 2015; TurnerEvans et al., 2017), and that calcium imaging (which was used to measure the activity) can introduce biases in the activity measurements (Siegle et al., 2021; Huang et al., 2021). Thus the sinusoidal activity we model is an approximation of the true biological process rather than a perfect description.
Importantly, our theory proposes that the optimally noiseresilient heading integration circuit should have synaptic weights that follow a sinusoidal pattern, even though such weights are not necessary for producing sinusoidal activity as discussed in section ‘Minimising noise propagation in the circuit’. This is the main prediction of our theory and we sought to validate it using connectivity data from the locust and fruit fly.
However, before we can directly compare our model with biological circuits, we must address a number of modelling simplifications:
In the model we considered a population of 8 neurons, but in insects there are eight neural columns, each with several neuron types (EPG, PEG, PEN, Delta7). Of these we model only the EPG (known as the compass neurons) which encode the integrated head direction.
The synaptic connections between neurons in our model can be both positive and negative, while biological neurons follow Dale’s law, meaning that a single neuron can either have only positive or only negative synapses.
The neural activity in the model was centred around zero and could have both positive and negative activity (firing rates), while real neurons only have positive firing rates.
Therefore, we simplified the biological connectivity to produce an equivalent circuit that could be directly compared with our model prediction. The neural population in our theoretical model corresponds to the biological EPG neurons, as these encode the integrated heading. We considered the other 3 neuron types that are part of the compass circuit (PEG, PEN, and Delta7 – see Kakaria and de Bivort, 2017; Pisokas et al., 2020) as just implementing connections between EPG neurons in accordance with biological constraints. We counted the number of different paths between EPG neurons, accounting for the sign of the connections (whether the path passed through an inhibitory Delta7 neuron), and used the net path count as a proxy for connectivity strength. This process is explained in detail in section ‘Path counting’.
We then computed the average connectivity profile, i.e., how each neuron connected to its neighbours around the ring, and compared this profile to the closest fitting sinusoid. Because the neuron gains, absolute synaptic strength and biophysical properties of the neurons are unknown, the units of the net path count are not necessarily equivalent to our abstract connection strength. We therefore fit an arbitrarily scaled and shifted sinusoid to the connection counts:
where $mn$ is the circular distance between 2 neurons while β and γ are constants that are fit to minimise the precisionweighted mean squared error compared with the experimental connectivity profile (see section ‘Fitting weights’).
Analysing the data from Pisokas et al., 2020, we consider the shortest excitatory and inhibitory paths between EPG (also known as CL1a) neurons in the locust, which have lengths of 2 and 3, respectively. There are no direct connections of length 1, and all the paths of length ≥4 must pass through the same neurontype multiple times. This path counting analysis for the locust is shown in Figure 4A, and the procedure is detailed in section ‘Path counting’. We find that the connectivity profile between neurons for the locust is very close to sinusoidal in shape, supporting our theoretical prediction.
For the fruit fly, we used data from Scheffer et al., 2020, which provides synapse counts between pairs of neurons. These synapse counts were used as a proxy for connectivity strength, a view that has been validated in previous experiments (Liu et al., 2022; Barnes et al., 2022). After identifying the neurons and connections of interest, we grouped the neurons in eight columns following the logic presented in Pisokas et al., 2020, with detailed methodology explained in section ‘Data preprocessing’. We repeated the path counting analysis for the fruit fly with synapse count data (Figure 4B), and found that while the data are noisy, the connectivity profile fits a single sinusoid pattern reasonably well. However, the high variability in the synapse counts makes our hypothesis difficult to differentiate from alternative shapes (see section ‘Fitting weights’).
Taken together, this analysis shows that our theory is consistent with experimental data – using binary connectivity data for the desert locust and synapse count data for the fruit fly.
Learning rules and development
Having validated the connectivity of our theoretical circuit by comparing it to experimental data, we ask whether our circuit lends itself to biological development. Specifically, we show that even though our circuit requires precise connection strengths, this connectivity can be developed naturally by a Hebbian learning rule. Our approach follows from previous research which has shown that simple Hebbian learning rules can lead to the emergence of circular line attractors in large neural populations (Stringer et al., 2002), and that a head direction circuit can emerge from a predictive rule (Vafidis et al., 2022). In contrast to this work, we focus only on the selfsustaining nature of the heading integration circuit in insects and show that our proposed sinusoidal connectivity profile can emerge naturally.
Because the weight matrix is circulant, its eigenvalues are equal to the Fourier spectrum of its first row, which we derived to only have one nonzero value, for $f=1$. The network therefore only has a single eigenvalue and so projects activity into a single dimension. This operation is similar to classical dimensionality reduction methods such as principal component analysis, which can be implemented by Hebbianlike learning rules (Dayan and Abbott, 2001). We thus analyse the effects of incorporating a modified Oja’s rule into our model, a classical variant of Hebbian learning where the synaptic strength between 2 neurons grows when both neurons are active simultaneously, and the total synaptic strength is regularised to prevent exploding weight growth,
where $\mathbf{W}}_{n,m$ is the synaptic connection strength from neuron $m$ to neuron $n$, $a}_{m$ and $a}_{n$ are the pre and postsynaptic activities, respectively, and $\eta {\displaystyle \frac{d\theta}{dt}}$ is an adaptive learning rate where the plasticity is proportional to the rotational speed, whereas in the classical Oja’s rule it would be constant.
For our analysis we assume that the insect faces all possible directions, and therefore that the neural activity goes around the full circle. We then integrate the weight updates over some long period of time,
where θ is the integration space. Applying the activity from Equation 17, we can find the fixed point of this update rule, when $\mathrm{\Delta}{\mathbf{W}}_{n,m}=0$:
Combined with Equation 7 and Equation 17, this result means that if there is sinusoidal activity in the network, the weights will naturally converge to the optimal sinusoidal values by way of our variant of Oja’s rule. This leads to the emergence of sinusoidal activity and weights from selfconsistency: initial noisy sinusoidal activity will enforce weights that are close to a sinusoid, and those weights will filter out noise to make the activity even closer to a sinusoid, which will in turn make the weights more sinusoidal. This process will iteratively make the activity and the weights converge to the solution from Equation 17.
An important point in the use of Oja’s rule is that it would tend to concentrate the activity in a single harmonic. In a linear network, the harmonics would compete during learning, leading to one single harmonic emerging and all others being suppressed, as shown in section ‘Linear neurons’. For neurons with a nonlinear activation function, secondary harmonics would emerge, but would remain small under mild assumptions, as shown in section ‘Neurons with nonlinear activations’. Oja’s rule will still cause the weights to converge to approximately sinusoidal connectivity.
The finding that the weights will converge to a sinusoidal connectivity with learning has two interesting consequences from a biological standpoint: First, our circuit can emerge even when starting with only very coarse initial weights, without the need for high precision initial connectivity. Second, this simple plasticity rule allows the system to repair or recover from perturbations in its synapses as shown by simulations in Figure 5. However, this learning rule only applies to the weights that ensure stable, selfsustaining activity in the network. The network connectivity responsible for correctly integrating angular velocity inputs (given by the PEN to EPG connections in the fly) might require more elements than a purely Hebbian rule (Stringer et al., 2002), such as the addition of a predictive component (Vafidis et al., 2022).
Evolution of the eightcolumn circuit
Having derived and experimentally validated the theoretical circuit, we now address another question: whether there might be a reason that insect head direction circuits have an eightcolumn architecture (Honkanen et al., 2019; Stone et al., 2017; Pisokas et al., 2020). The derivations leading to Equation 17 are valid for other values of $N$, so there is no a priori reason to expect $N=8$.
Our reasoning follows recent studies in genetics (Johnston et al., 2022; Dingle et al., 2018), which argue that an observed organism is more likely to have resulted from a simple than a more complex genome: evolution favours simplicity. We note that powers of two are easier to generate with replication dynamics than other numbers, because they just require each cell to divide a set number of times. Other numbers require that, at some point, two cells resulting from a division must behave differently, necessitating more complex signalling mechanisms and rendering this possibility less likely to have been developed by evolution without some other driving factor. We therefore expect $N$ to be a power of two unless required to be otherwise.
As we show in section ‘Circuits with different neuron counts’, not all numbers of neurons enable a working circuit. The circuits for $N=2$ and $N=4$ are degenerate – either producing a single dimensional encoding, or two disconnected circuits that do not enforce the required circular topology. $N=8$ is the smallest power of two that could result in a nondegenerate circuit. This hints at the possibility that the eightcolumn architecture is not a chance evolutionary artefact, but rather that it is the genetically simplest circuit capable of performing heading integration.
Discussion
In this work we derived an optimal noiseminimising circuit for encoding heading and verified that this circuit matches experimental data from insects. Furthermore, we showed that such a circuit can be developed and maintained by a biological learning rule, and proposed a mathematical argument for the eightcolumn structure found in insect compass circuits. In this section, we discuss the implications and limitations of these contributions, and outline potential future work.
Heading integration circuits in insects have been extensively studied, with models ranging in complexity from simplified conceptual networks (Wittmann and Schwegler, 1995; Cope et al., 2017) to sophisticated models constrained by biological data and featuring multiple neuron types (Kakaria and de Bivort, 2017; Su et al., 2017; Kim et al., 2017; Pisokas et al., 2020; Lyu et al., 2022). Previous theoretical work has argued that a sinusoidal activity encoding is such a common motif in insect navigation because it facilitates elementwise vector addition (Wittmann and Schwegler, 1995). However, this cannot be the only reason because as we show there is a whole family of circuits with different encoding patterns that enable easy vector addition. By showing that sinusoidal activity emerges from the theoretically most noiseresilient heading integration circuit, and verifying that the corresponding circuit matches experimental data, we close this explanatory gap.
We also show that our proposed circuit can be developed by a simple Hebbianbased learning rule, and that the presence of the eightcolumn structure can be explained from the perspective of replication dynamics. Both results align with the idea that evolution should be biased towards structures that are easier to encode in the genome (Johnston et al., 2022; Alon, 2007) and learn (Zador, 2019). To the best of our knowledge, this is the first time that such arguments have been put forward in the context of a specific circuit with a specific function.
Our work still has some unaddressed limitations, in particular regarding the topology of the activity. The use of a circular topology to encode the head direction of the insect is valid only in 2D environments. But many species, including the fruit fly, can navigate in a 3D environment. We argue that even if these insects live in a 3D environment, the third dimension (updown) is different from the other two due to gravity and the existence of a hard boundary (the ground). Further studies would be required to investigate the full effects of 3D motion.
We could also investigate circuits that integrate position, not only heading, which would require the activity to have a 2D plane topology instead of a circle. This would be particularly relevant for foraging insects such as bees or ants, whose ability to remember their position with respect to their nest has been the subject of many experimental and computational studies (Collett, 2019; Wehner and Srinivasan, 2003). The position integrating neurons in these insects are also predicted to have sinusoidal activation patterns (Wittmann and Schwegler, 1995; Vickerstaff and Di Paolo, 2005; Stone et al., 2017).
Finally, another interesting avenue for future work is to compare the encoding of direction in insects with that of mammals, which encode heading in a fundamentally different way that uses many more neurons. This raises a critical question: why would the circuit and encoding be different if navigation follows similar principles across species? We speculate that the difference might lie in the type of navigation that the two classes of animals use. Insects often rely on a reference system that is globally anchored to a certain point or phenomenon, whether it is their nests for ants and bees, the sunlight polarisation pattern for locusts, or the milky way for dung beetles. On the other hand, mammals such as rodents typically do not use global cues but rely on local landmarks that are contextdependent and only occur in specific locations. Therefore, mammals must employ a flexible encoding that can be updated as different environments are explored. Further investigation of this possibility would require a different set of principles than those selected here.
Materials and methods
Path integration dynamics for heading and position
Request a detailed protocolHere, we make more precise statements about the dynamics of the circuit. The topology of the activity is a circular line attractor, so that any perturbation falls back into a circle and the position of the activity around the circle represents an angle. In our circuit with the dynamics from Equation 1, this circular line attractor is achieved by setting
where $r$ is the radius of the attractor. Given that the dynamics are effectively linear we can apply a Fourier transform and write the original dynamics around the line attractor directly in the spatial Fourier basis,
which we can expand to obtain
which only applies to the circular activity where the dynamics are linear. To incorporate the dynamics that force the activity to return to the cycle, we add a new term into the network dynamics which fulfils Equation 22 – increasing $\Vert {\mathcal{F}}_{f}[\mathbf{a}]\Vert$ if $\Vert \mathbf{a}\Vert <r$ and decreasing it if $\Vert \mathbf{a}\Vert >r$. This increase or decrease can be incorporated as a simple scaling factor on the activity decay term
where $\alpha (r\Vert \mathbf{a}\Vert )$ is a nonlinear smooth function with a single minimum and $\alpha (0)=1$, which forces the activity to have the appropriate magnitude, $\Vert \mathbf{a}\Vert =r$.
Note that this doesn’t require the neurons to have access to a global activity magnitude signal because each neuron receives a sufficient number of inputs to locally compute the total activity in the network. See the $f=1$ circuit in Figure 3: each neuron receives inputs from all others except those tuned to an orthogonal direction, but the activity of these orthogonal neurons can be computed from the correlated neurons next to them. For example, neuron 1 receives input from all neurons except 3 and 7, but the activities of these neurons can be computed from the activities of neurons 2 and 4 or 6 and 8, respectively. Note also that this is not the case for degenerate circuits such as $f=2$ in Figure 3, which would require an activation function with access to global information to constrain the total activity.
As the activity always falls back to the circular line attractor, the heading integration is linear around the circle. This implies that any small movement of the animal or the perception of a sensory cue is first projected onto the circle, then linearly integrated. We therefore only consider how network inputs cause the activity to rotate around the network,
where $\mathcal{F}}_{f}[\mathbf{u}(t){]}_{\perp \mathbf{a}$ is the projection of the input signal perpendicular to the current activity pattern.
In more intuitive terms, the neurons have a saturating nonlinear activation function where they modulate their gain based on the total activity in the network. If the activity in the network is above the desired level, $r$, the gain is reduced and the activity decreases, and when the activity of the network is less than desired level, both the gain and the activity increase. Note that in this scenario transient deviations from the line attractor, which would induce nonlinear behaviour in the circuit dynamics, are tolerable. External inputs, $\mathbf{u}(t)$, could transiently modify the shape of the activity, producing activity shapes deviating from what the linear model can accommodate. For example, the shape of the bump attractor could be modified through nonlinearities while the insect attains high angular velocity (TurnerEvans et al., 2017).
Such nonlinear dynamics do not conflict with the theory developed here, which only requires linearity when the activity is projected onto the circular line attractor. In our framework, the linearity of integration at the circular line attractor is not a computational assumption, but rather it emerges from the principle of symmetry.
Ambiguities in multiple harmonics decoding with drift
Request a detailed protocolWe consider the case where multiple harmonics are used, and their phases have drifted from each other. We only focus on angular drift, rather than noise in the full activity, because as noted in section ‘Path integration dynamics for heading and position’, any deviations in the overall activity level in the network will dissipate.
For example, we consider the harmonics $f}_{1$ and $f}_{2$. The activity is given by Equation 5
If the phase of the second harmonic drifts by $\delta \theta$,
We can calculate the alignment between the activity with drift and the activity without drift as a dot product of the activity vectors,
However, there are other angles where the alignment is better. For example, we can consider an estimate angle, $\hat{\theta}$, with its corresponding activity, $\mathbf{a}(\hat{\theta})$, which gives
For small $\delta \theta$, this is maximised when $\hat{\theta}=\theta +\delta \theta /2$. However, since θ is circular, if the drift is $\pi$ then there are two possible positions given by $\hat{\theta}=\theta \pm \pi /2$.
This implies that an encoding using multiple harmonics does not necessarily offer a unique decoding in the presence of noise.
Equivalent circuits and degeneracies
Equivalence under permutation
Request a detailed protocolThe activity of neurons given by Equation 5 implies that the preferred angle of neuron $n$ is given by
where in our case $N=8$. Table 2 shows $nf\phantom{\rule{1em}{0ex}}\mathrm{mod}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}(N)$ evaluated for all neurons in the network using different harmonics. We notice that for $f=\{1,3,5,7\}$ all the numbers from zero to seven appear, while for $f=\{2,6\}$ we only get the even numbers, for $f=4$ we get only zero and four, and for $f=8$ there is only zero.
The explanation is based on number theory. If $f$ and $N$ have the greatest common divisor $gcd(N,f)=d$, then $nf\phantom{\rule{1em}{0ex}}\mathrm{mod}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}N=0$ for $n=N/d$. This implies that the preferred angle of neuron $n=0$ is the same as that of $n=N/d$. When $N,f$ are coprime $d=1$, $n$ goes from 0 to $N1$ without repeating any value. However, when $d>1$, the neuron $n=N/d$ has the same tuning as the neuron $n=0$, the neuron $n=N/d+1$ has the same tuning as the neuron $n=1$ and so on. In other words, the angular tuning of the neurons has a period of $gcd(N,f)$.
This divides the possible circuits into the following groups:
$f=\{1,3,5,7\}$ which contain $N=8$ angular tunings with values $[0,\pi /4,\pi /2,3\pi /4,\pi ,5\pi /4,3\pi /2,7\pi /4]$ because every odd number is coprime with 8 and thus the $gcd(N,f)=1$.
$f=\{2,6\}$ which cycle through four possible directions $[0,\pi /2,\pi ,3\pi /2]$ because $\frac{N}{gcd(N,f)}=4$.
$f=4$ which can only represent two possible directions $[0,\pi ]$.
$f=8$ which can only represent the direction 0.
Degenerate circuits
Request a detailed protocolGiven the groupings presented in the previous subsection, we notice that not all of them allow the encoding of a full range of angles, (0°, 360°). Notably, for $f=8$ the neurons only encode one angle, and for $f=4$ only two complementary angles are encoded, 0 and $\pi$. This means that we cannot represent a full circle, because we need two different dimensions to do so, but in both of these cases we can only encode one.
For $f=2,6$ we obtain two groups of neurons, and in each group there are 2 neurons that encode two opposing directions, thus it is possible to cover a full circle. However, the circuit is degenerate, as shown in Figure 3. In this case there are no connections between even and odd neurons. Thus, we have two groups of neurons that are disconnected, and each group encodes only one direction (either northsouth or eastwest). While the encoding could work in principle, the circuit is decoupled, meaning that there is nothing in this circuit that prevents the activity from having activities outside the circular topology: as the two circuits are disconnected, a given value of activity in one group does not restrict the activity in the other.
Circuits with different neuron counts
Request a detailed protocolWe evaluate the viability of circuits with other values of $N$. From findings in the the previous section, we note that choosing $N$ to be a prime number implies that all frequencies will be coprime with $N$, and thus that all the neurons will have a different tuning.
For $N=2$, the two angles are $0,\pi$, in which case the circuit can only encode one dimension and thus cannot encode a full circle, as was the case for $N=8,f=4$ in section ‘Degenerate circuits’.
For $N=4$, if $f=2$ the circuit can encode only the angles $0,\pi$ and we have the same case as for $N=2$. But if $f=1$ or $f=3$, the neurons are tuned to the angles $0,\pi /2,\pi ,3\pi /2$ which can encode a circle. However, by looking at the connectivity matrix from Equation 16, we find that neurons 1, 3 are connected to one another, but not to 2, 4, as was the case for $N=8,f=2$ in section ‘Degenerate circuits’. This implies that the circuit does not enforce a circular topology, and thus it does not work.
Notice that we could think of another structure where each neuron encodes a different direction. Intuitively, we would have a northsouth neuron and an eastwest neuron, as we would have in Cartesian coordinates. However, this would have the same problem as $N=4$, where the 2 neurons would not interact and thus the topology would be wrong. Additionally, when the insect heads north, the northsouth neuron would be very active, while when it heads south it would be very inactive. This means that the circuit as a whole would have a different firing rate depending on the direction, breaking the symmetry assumption.
Path counting
Request a detailed protocolWe counted the number of different paths between EPG neurons, accounting for the sign of the connections (whether the path passed through an inhibitory 7 neuron), and used the net path count as a proxy for connectivity strength. The results of this analysis are shown in section ‘Comparing the predicted circuit with biological data’, but we will detail the process here using the locust network from Pisokas et al., 2020, as an example, shown in Figure 6.
We consider the shortest excitatory and inhibitory pathways between EPG neurons, which have lengths of 2 and 3 respectively. There are no direct paths between $\mathrm{EPG}$ neurons in the locust circuit. In this case there are two paths of length 2 that implement selfexcitatory connections for $\mathrm{EPG}}_{1$:
$\mathrm{EPG}}_{1}\to {\mathrm{PEG}}_{1}\to {\mathrm{EPG}}_{1$
$\mathrm{EPG}}_{1}\to {\mathrm{PEN}}_{1}\to {\mathrm{EPG}}_{1$
And there is one connection of path length 2 that connects $\mathrm{EPG}}_{1$ to each of its nearest neighbours:
$\mathrm{EPG}}_{1}\to {\mathrm{PEN}}_{1}\to {\mathrm{EPG}}_{2$
$\mathrm{EPG}}_{1}\to {\mathrm{PEN}}_{1}\to {\mathrm{EPG}}_{8$
For inhibitory connections there are four paths of length 3 that connect $\mathrm{EPG}}_{1$ to the neuron on the opposite side of the ring, $\mathrm{EPG}}_{5$. These are:
$\mathrm{EPG}}_{1}\to {\mathrm{Delta7}}_{5}\to {\mathrm{PEN}}_{5}\to {\mathrm{EPG}}_{5$
$\mathrm{EPG}}_{1}\to {\mathrm{Delta7}}_{5}\to {\mathrm{PEG}}_{5}\to {\mathrm{EPG}}_{5$
$\mathrm{EPG}}_{1}\to {\mathrm{Delta7}}_{4}\to {\mathrm{PEN}}_{4}\to {\mathrm{EPG}}_{5$
$\mathrm{EPG}}_{1}\to {\mathrm{Delta7}}_{6}\to {\mathrm{PEN}}_{6}\to {\mathrm{EPG}}_{5$
There are three connections of path length 3 connecting $\mathrm{EPG}}_{1$ to $\mathrm{EPG}}_{4$:
$\mathrm{EPG}}_{1}\to {\mathrm{Delta7}}_{4}\to {\mathrm{PEN}}_{4}\to {\mathrm{EPG}}_{4$
$\mathrm{EPG}}_{1}\to {\mathrm{Delta7}}_{4}\to {\mathrm{PEG}}_{4}\to {\mathrm{EPG}}_{4$
$\mathrm{EPG}}_{1}\to {\mathrm{Delta7}}_{5}\to {\mathrm{PEN}}_{5}\to {\mathrm{EPG}}_{4$
Finally, there is one path of length 3 connecting $\mathrm{EPG}}_{1$ to the neuron perpendicular to it around the ring, $\mathrm{EPG}}_{3$:
$\mathrm{EPG}}_{1}\to {\mathrm{Delta7}}_{4}\to {\mathrm{PEN}}_{4}\to {\mathrm{EPG}}_{3$
We then add these connections together to obtain the net path count profile for the network. Because the network is rotationally symmetric, the connections from $\mathrm{EPG}}_{1$ are the same as the connections from $\mathrm{EPG}}_{2$ but with the neurons indices incremented by 1. The generalised profile is shown in Table 3.
Data preprocessing
Request a detailed protocolThe following section details our data preprocessing to produce a central complex circuit for the fruit fly similar to that in Pisokas et al., 2020, using synaptic counts from the fruit fly connectome dataset (Scheffer et al., 2020). Full details are shown in our available code.
We first identified the 6 neuron types which corresponded to the neurons of interest in the central complex (Table 4).
We grouped the two PEN and two EPG populations together for further analysis.
From the connectome data we created a connectivity matrix (Figure 7) containing the number of synapses between all pairs of neurons of the abovementioned types. This contained 129,473 synapses between 152 neurons in total.
Next, neurons in the same glomerulus were grouped together. The EPG neurons were divided between 18 glomeruli, L1L9 and R1R9. The PEN neurons were also divided between 16 glomeruli, L2L9 and R2R9. The PEG neurons were divided into 18 glomeruli with only 1 neuron in each, L1L9 and R1R9. The Delta7 neurons had 10 unique subtypes (Table 5) which were grouped into eight glomeruli based on their left glomerulus index – i.e., L4R5_R and L4R6_R were grouped in L4.
After this grouping we had a connectivity matrix of 60 neurons in total (Figure 8).
For the EPG, PEN, and PEG neurons we then grouped glomeruli mirrored in the two hemispheres together – i.e., L1 and R1 were grouped into glomerulus 1. This resulted in a connectivity matrix of 34 neurons (Figure 9).
Finally, as in Pisokas et al., 2020, we grouped the PEG9 and PEG1 neurons and EPG9 and EPG1 neurons together. The former because they both output to the same ellipsoid body segment, and the latter because they both receive common input. This resulted in a connectivity matrix with 32 neurons (Figure 10) which we used for the analysis in Figure 4 as this network had the eightfold rotational symmetry compatible with our theoretical model.
Fitting weights
Request a detailed protocolHaving calculated the paths or synaptic strengths between $\mathrm{EPG}$ neurons in the network, we next computed the average connectivity profile for the network – how each $\mathrm{EPG}$ neuron connected to its neighbours a certain distance around the ring. We then compared this connectivity profile to the closest fitting sinusoid to quantify to what degree our prediction of sinusoidal weights was consistent with biological data.
We use the vector
where $d=((mn)+4\phantom{\rule{1em}{0ex}}\mathrm{mod}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}8)4\in [\mathrm{4...3}]$ is the signed circular distance from neuron $m$ to neuron $n$.
In the networks from Pisokas et al., 2020, the weights are rotationally symmetric, so we minimise the mean squared error between the sinusoidal profile and the biological connectivity profile,
by using least squares.
The network derived from the synaptic count data in Scheffer et al., 2020, is not rotationally symmetric so each value in the average connectivity profile, $\omega}_{d$, has a corresponding standard deviation, $\sigma}_{d$. In this case we minimise the precisionweighted mean squared error, which emphasises fitting connectivity profile values that are more consistently seen in the network:
The result of fitting sinusoidal profiles to the data is shown in section ‘Comparing the predicted circuit with biological data’. Since the fit for the fruit fly is not as clear as for the locust, we also compared the observed weights in the fruit fly with Gaussian and von Mises curves. Since both the Gaussian and von Mises distributions also have a parameter specifying their width, this requires fitting an extra parameter. For the Gaussian we have
where $\beta}_{g},{\sigma}_{g},{\gamma}_{g$ are parameters to fit. For the von Mises distribution we have
where $\beta}_{v},{\kappa}_{v},{\gamma}_{v$ are parameters to fit. For both the Gaussian and von Mises, we find a good agreement with the weights in Figure 11, and a lower root mean square error (RMSE) than with any combination of harmonics. However, we note that our sinusoidal model has two parameters instead of three, so to make a fair comparison we use the corrected Akaike information criterion (AICc) (Burnham and Anderson, 2004), which is given by
where $p$ is the number of distribution parameters, $N$ the number of samples, and $\mathrm{ln}\left(L\right)$ is the loglikelihood, which in this case is the sum of squared errors, $\mathrm{ln}\left(L\right)=N{\text{RMSE}}^{2}$. We find that the lowest AICc (corresponding to the best model) corresponds to the harmonic $f=1$ (Table 6).
As a complementary approach to evaluate the shape of the distribution, we first fit the Gaussian and von Mises distributions to the best fit $f=1$ curve. We then freeze the width parameters of the distributions ($\sigma}_{g$ for the Gaussian and $\kappa}_{v$ for the von Mises) and only optimise the amplitude and vertical offset parameters (β and γ) to fit the data. This approach limits the number of free parameters for the Gaussian and von Mises distributions to two, to match the sinusoid. The results are shown in Figure 11 and Table 6. Both the fixedwidth Gaussian and von Mises distributions are a slightly better fit to the data than the sinusoid, but the differences between the three curves are very small.
In simplifying the fruit fly connectome data, we assumed all synapses of different types were of equal weight, as no data to the contrary were available. Different synapse types having different strengths could introduce nonlinear distortions between our net synaptic path count and the true synaptic strength, which could in turn make the data a better or worse fit for a sinusoidal compared to a Gaussian profile. As such, we don’t consider the 2% relative difference in RMSE between the $f=1$ sinusoid and fixedwidth Gaussian and von Mises distributions to be conclusive.
Overall, we find that the cosine weights that emerge from our derivations are a very close match for the locust, but less precise for the fly, where other functions fit slightly better. Given the limitations in using the currently available data to provide an exact estimate of synaptic strength (for the locust), and due to the high variability of the synaptic count (for the fruit fly), we consider that our theory is compatible with the observed data.
Convergence of Oja’s rule with multiple harmonics
Linear neurons
Request a detailed protocolIntegrating our modified Oja’s learning rule updates (Equation 19) over all angles in the position space gives us the following:
which we can transform into the Fourier domain:
The steadystate solution when the weight update is 0 is:
By Parseval’s theorem, $\Vert \mathbf{a}{\Vert}^{2}=\sum _{f}\Vert {\mathcal{F}}_{f}[\mathbf{a}]{\Vert}^{2}$ so the Fourier spectrum of the steadystate weights is just the normalised Fourier spectrum of the input:
From this we can see that the stable $L}_{1$ norm of $\mathcal{F}[\omega ]$ is 1,
If we combine this result with Equation 10, we find that in the case of a single encoding frequency, ${\mathcal{F}}_{{f}^{\ast}}[\mathbf{a}]>0$, Oja’s rule will result in a single stable harmonic with ${\mathcal{F}}_{{f}^{\ast}}[\omega ]=1$ because $\sum _{f}{\mathcal{F}}_{f}[\mathbf{a}{]}^{2}={\mathcal{F}}_{{f}^{\ast}}[\mathbf{a}{]}^{2}$.
If Oja’s rule has a normalising factor added to account for the number of encoding harmonics used, $\left\mathbf{F}\right$,
then the steadystate solution for the weights becomes
where the $L}_{1$ norm of $\omega =\left\mathbf{F}\right$. In this case multiple harmonics could develop stable values of ${\mathcal{F}}_{f}[\omega ]=1$, but only if the activity magnitudes for these harmonics are identical. If any perturbation affects the activities, the harmonic which happens to have the larger activity will begin to dominate the other by the dynamics of Equation 43, until only one remains.
Therefore, the only case where our learning rule results in a stable solution robust to perturbations is when only one harmonic is used.
Neurons with nonlinear activations
Request a detailed protocolAn analysis similar to the previous case applies when the firing rate of the neurons has a nonlinear activation. For a given activity profile, Equation 41 still applies. Thus, for an activity profile $\mathbf{a}$, it will generate the weight profile
where we set $\Vert \omega \Vert =1$ for convenience of notation. Note that the squaring process will induce a weight distribution in which the larger harmonic will be enhanced more than the rest.
Having obtained the weights, the equation for the activity profile can be solved through the selfconsistency setting in which $u(t)=0$ and the activity is maintained,
where ϕ is the nonlinear activation function. Given the rotational symmetry from our assumptions, it is enough to solve for ${\mathbf{a}}_{n}={\mathbf{a}}_{n}(0)$, which gives us a set of $N/2$ equations which can then be combined with Equation 45, which gives us the following equation to solve
Since $\mathbf{a}}_{n}={\mathbf{a}}_{n\phantom{\rule{0.667em}{0ex}}\mathrm{mod}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}N$, we obtain $N/2$ equations for $N/2$ values of $\mathbf{a}}_{n$.
Note that there is no closed form solution in general, but some generic properties can be inferred. For example, as long as we have a single bump of activity, $\displaystyle {\mathcal{F}}_{1}\left[\mathbf{a}\right]>\left{\mathcal{F}}_{f}\left[\mathbf{a}\right]\right,\text{}\mathrm{\forall}f1$, therefore ${\mathcal{F}}_{1}\left[\omega \right]>{\mathcal{F}}_{f}\left[\omega \right],\text{}\mathrm{\forall}f1$. More specifically,
Figure 12 shows the results of the same simulation as section ‘Learning rules and development’, but using neurons with a nonlinear ($\mathrm{tanh}$) activation function, and distorting the network input to be a more square wavelike sinusoid. As predicted theoretically, our rule still reinforces the dominant spatial frequency in the network, causing the weights to converge to the sinusoidal profile our theory predicts.
Data availability
The current manuscript is a computational study, so no data have been generated for this manuscript. All modelling and analysis code is available at GitHub (copy archived at Dall'Osto, 2024).

Janelia Research CampusData for A Connectome of the Adult Drosophila Central Brain v1.0.https://doi.org/10.25378/janelia.11676099
References

BookA practical informationtheoretic approachIn: Burnham KP, editors. Model Selection and Multimodel Inference. Springer. pp. 1–488.https://doi.org/10.1007/b97636

Path integration: how details of the honeybee waggle dance and the foraging strategies of desert ants might help in understanding its mechanismsThe Journal of Experimental Biology 222:jeb205187.https://doi.org/10.1242/jeb.205187

SoftwareInsectnavigationsinusoidaloptimality, version swh:1:rev:0890b606d28f33bfbeffbc0d2aa594a91733b63dSoftware Heritage.

BookTheoretical Neuroscience: Computational and Mathematical Modeling of Neural SystemsMassachusetts Institute of Technology Press.

Inputoutput maps are strongly biased towards simple outputsNature Communications 9:761.https://doi.org/10.1038/s41467018031016

Integration of polarization and chromatic cues in the insect sky compassJournal of Comparative Physiology. A, Neuroethology, Sensory, Neural, and Behavioral Physiology 200:575–589.https://doi.org/10.1007/s0035901408906

The brain behind straightline orientation in dung beetlesThe Journal of Experimental Biology 222:jeb192450.https://doi.org/10.1242/jeb.192450

Evolving a neural model of insect path integrationAdaptive Behavior 15:273–287.https://doi.org/10.1177/1059712307082080

The ant's path integration system: a neural architectureBiological Cybernetics 73:483–497.https://doi.org/10.1007/s004220050204

Transformation of polarized light information in the central complex of the locustThe Journal of Neuroscience 29:11783–11793.https://doi.org/10.1523/JNEUROSCI.187009.2009

Anatomical basis of sun compass navigation II: the neuronal composition of the central complex of the monarch butterflyThe Journal of Comparative Neurology 521:267–298.https://doi.org/10.1002/cne.23214

Central neural coding of sky polarization in insectsPhilosophical Transactions of the Royal Society of London. Series B, Biological Sciences 366:680–687.https://doi.org/10.1098/rstb.2010.0199

The insect central complex and the neural basis of navigational strategiesThe Journal of Experimental Biology 222:jeb188854.https://doi.org/10.1242/jeb.188854

Ring attractor dynamics emerge from a spiking model of the entire protocerebral bridgeFrontiers in Behavioral Neuroscience 11:8.https://doi.org/10.3389/fnbeh.2017.00008

Spectral properties of identified polarizedlight sensitive interneurons in the brain of the desert locust Schistocerca gregariaThe Journal of Experimental Biology 210:1350–1361.https://doi.org/10.1242/jeb.02744

Neural network models and deep learningCurrent Biology 29:R231–R236.https://doi.org/10.1016/j.cub.2019.02.034

Polarizationsensitive interneurons in the optic lobe of the desert ant Cataglyphis bicolorDie Naturwissenschaften 87:133–136.https://doi.org/10.1007/s001140050691

Anatomy and physiology of neurons with processes in the accessory medulla of the cockroach Leucophaea maderaeThe Journal of Comparative Neurology 439:193–207.https://doi.org/10.1002/cne.1342

Behavioral analysis of polarization vision in tethered flying locustsJournal of Comparative Physiology. A, Neuroethology, Sensory, Neural, and Behavioral Physiology 190:61–68.https://doi.org/10.1007/s0035900304734

The knowledge base of bee navigationJournal of Experimental Biology 199:141–146.https://doi.org/10.1242/jeb.199.1.141

Unraveling navigational strategies in migratory insectsCurrent Opinion in Neurobiology 22:353–361.https://doi.org/10.1016/j.conb.2011.11.009

BookAnalytical cybernetics of spider navigationIn: Barth FG, editors. Neurobiology of Arachnids. Springer. pp. 298–316.https://doi.org/10.1007/9783642703485_15

An anatomically constrained model for path integration in the bee brainCurrent Biology 27:3069–3085.https://doi.org/10.1016/j.cub.2017.08.052

The head direction signal: origins and sensorymotor integrationAnnual Review of Neuroscience 30:181–207.https://doi.org/10.1146/annurev.neuro.29.051605.112854

Neural representation of space using sinusoidal arraysNeural Computation 5:869–884.https://doi.org/10.1162/neco.1993.5.6.869

Evolving neural models of path integrationThe Journal of Experimental Biology 208:3349–3366.https://doi.org/10.1242/jeb.01772

Celestial navigation in DrosophilaJournal of Experimental Biology 222:jeb186148.https://doi.org/10.1242/jeb.186148

BookPath integration in insectsIn: Jeffery K, editors. The Neurobiology of Spatial Behaviour. Oxford University Press. pp. 9–30.https://doi.org/10.1093/acprof:oso/9780198515241.003.0001

Path integration — a network modelBiological Cybernetics 73:569–575.https://doi.org/10.1007/BF00199549

Representation of spatial orientation by the intrinsic dynamics of the headdirection cell ensemble: a theoryThe Journal of Neuroscience 16:2112–2126.https://doi.org/10.1523/JNEUROSCI.160602112.1996
Article and author information
Author details
Funding
Royal Society of Edinburgh (Saltire Early Career Fellowship)
 Ioannis Pisokas
Swiss National Science Foundation (Project Grant 182539)
 Pau Vilimelis Aceituno
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We would like to thank Benjamin F Grewe for his support and helpful comments.
Copyright
© 2024, Vilimelis Aceituno, Dall'Osto et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 214
 views

 34
 downloads

 0
 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.