Signal denoising through topographic modularity of neural circuits
Abstract
Information from the sensory periphery is conveyed to the cortex via structured projection pathways that spatially segregate stimulus features, providing a robust and efficient encoding strategy. Beyond sensory encoding, this prominent anatomical feature extends throughout the neocortex. However, the extent to which it influences cortical processing is unclear. In this study, we combine cortical circuit modeling with network theory to demonstrate that the sharpness of topographic projections acts as a bifurcation parameter, controlling the macroscopic dynamics and representational precision across a modular network. By shifting the balance of excitation and inhibition, topographic modularity gradually increases task performance and improves the signaltonoise ratio across the system. We demonstrate that in biologically constrained networks, such a denoising behavior is contingent on recurrent inhibition. We show that this is a robust and generic structural feature that enables a broad range of behaviorally relevant operating regimes, and provide an indepth theoretical analysis unraveling the dynamical principles underlying the mechanism.
Editor's evaluation
This manuscript puts forward a new idea that topography in neural networks helps to remove noise from inputs. The authors show that there is a critical level of topography that is needed for network to denoise inputs.
https://doi.org/10.7554/eLife.77009.sa0Introduction
Sensory inputs are often ambiguous, noisy, and imprecise. Due to volatility in the environment and inaccurate peripheral representations, the sensory signals that arrive at the neocortical circuitry are often incomplete or corrupt (Faisal et al., 2008; Renart and Machens, 2014). However, from these noisy input streams, the system is able to acquire reliable internal representations and extract relevant computable features at various degrees of abstraction (Friston, 2005; Okada et al., 2010; DiCarlo et al., 2012). Sensory perception in the mammalian neocortex thus relies on efficiently detecting the relevant input signals while minimizing the impact of noise.
Making sense of the environment also requires the estimation of features not explicitly represented by lowlevel sensory inputs. These inferential processes (Młynarski and Hermundstad, 2018; Parr et al., 2019) rely on the propagation of internal signals such as expectations and predictions, the accuracy of which must be evaluated against the ground truth, that is the sensory input stream. In a highly dynamic environment, this translates to a continuous process whose precision hinges on the fidelity with which external stimuli are encoded in the neural substrate. Additionally, as the system is modular and hierarchical (strikingly so in the sensory and motor components; Meunier et al., 2010; Park and Friston, 2013), it is critical that the external signal permeates the different processing modules despite the increasing distance from the sensory periphery (the input source) and the various transformations it is exposed to along the way, which degrade the signal via the interference of taskirrelevant and intrinsic, ongoing activity.
Accurate signal propagation can be achieved in a number of ways. One obvious solution is the direct routing and distribution of the signal, such that direct sensory input can be fed to different processing modules, which may be partially achieved through thalamocortical projections (Sherman and Guillery, 2002; Nakajima and Halassa, 2017). Another possibility, which we explore in this study, is to propagate the input signal through tailored pathways that route the information throughout the system, allowing different processing stages to retrieve it without incurring much representational loss. Throughout the mammalian neocortex, the existence and characteristics of structured projections (topographic maps) present a possible substrate for such signal routing. By preserving the relative organization of tuned neuronal populations, such maps imprint spatiotemporal features of (noisy) sensory inputs onto the cortex (Kaas, 1997; Bednar and Wilson, 2016; Wandell and Winawer, 2011). In a previous study (Zajzon et al., 2019), we discovered that structured projections can create featurespecific pathways that allow the external inputs to be faithfully represented and propagated throughout the system, but it remains unclear which connectivity properties are critical and what the underlying mechanism is. Moreover, beyond mere sensory representation, there is evidence that such structurepreserving mappings are also involved in more complex cognitive processes in associative and frontal areas (Hagler and Sereno, 2006; Silver and Kastner, 2009; Patel et al., 2014), suggesting that topographic maps are a prominent structural feature of cortical organization.
In this study, we hypothesize that structured projection pathways allow sensory stimuli to be accurately reconstructed as they permeate multiple processing modules. We demonstrate that, by modulating effective connectivity and regional E/I balance, topographic projections additionally serve a denoising function, not merely allowing the faithful propagation of input signals, but systematically improving the system’s internal representations and increasing signaltonoise ratio. We identify a critical threshold in the degree of modularity in topographic projections, beyond which the system behaves effectively as a denoising autoencoder (note that the parallel is established here on conceptual, not formal, grounds as the system is capable of retrieving the original, uncorrupted input from a noisy source, but bears no formal similarity to denoising autoencoder algorithms). Additionally, we demonstrate that this phenomenon is robust, with the qualitative behavior persisting across very different models. Theoretical considerations and network simulations show that it hinges solely on the modularity of topographic projections and the presence of recurrent inhibition, with the external input and singleneuron properties influencing where/when, but not if, denoising occurs. Our results suggest that modular structure in feedforward projection pathways can have a significant effect on the system’s qualitative behavior, enabling a wide range of behaviorally relevant and empirically supported dynamic regimes. This allows the system to: (1) maintain stable representations of multiple stimulus features (Andersen et al., 2008); (2) amplify features of interest while suppressing others through winnertakesall (WTA) mechanisms (Douglas and Martin, 2004; Carandini and Heeger, 2011); and (3) dynamically represent different stimulus features as stable and metastable states and stochastically switch among active representations through a winnerless competition (WLC) effect (McCormick, 2005; Rabinovich et al., 2008; Rost et al., 2018).
Our key finding, that the modulation of information processing dynamics and the fidelity of stimulus/feature representations results from the structure of topographic feedforward projections, provides new meaning and functional relevance to the pervasiveness of these projection maps throughout the mammalian neocortex. Beyond routing featurespecific information from sensory transducers through brainstem, thalamus, and into primary sensory cortices (notably tonotopic, retinotopic, and somatotopic maps), their maintenance within the neocortex (Patel et al., 2014) ensures that even cortical regions that are not directly engaged with the sensory input (higherorder cortex), can receive faithful representations of it, and that these internal signals, emanating from lowerorder cortical areas, can dramatically skew and modulate the circuit’s E/I balance and local functional connectivity, resulting in fundamental differences in the systems’ responsiveness.
Results
To investigate the role of structured pathways between processing modules in modulating the fidelity of stimulus representations, we study a network comprising up to six sequentially connected subnetworks (SSNs, see Materials and methods and Figure 1a). Each SSN is a balanced random network (see e.g. Brunel, 2000) of 10,000, sparsely and randomly coupled leaky integrateandfire (LIF) neurons (80% excitatory and 20% inhibitory). In each SSN, neurons are assigned to subpopulations associated with a particular stimulus. Excitatory neurons belonging to such stimulusspecific subpopulations then project to the subsequent SSN with a varying degree of specificity. We refer to a set of stimulusspecific subpopulations across the network and the structured feedforward projections among them as a topographic map. The specificity of the map is determined by the degree of modularity of the corresponding projections matrices (see e.g. Figure 1a). Modularity is thus defined as the relative density of connections within a stimulusspecific pathway (i.e., connecting subpopulations associated to the same stimulus; see Materials and methods and Figure 1a). In the following, we study the role of topographic specificity in modulating the system’s functional and representational dynamics and its ability to cope with noisecorrupted input signals.
Sequential denoising through structured projections
By systematically varying the degree of modular specialization in the feedforward projections (modularity parameter, $m$, see Materials and methods and Figure 1), we can control the segregation of stimulusspecific pathways across the network and investigate how it influences the characteristics of neural representations as the signal propagates. If the feedforward projections are unstructured or moderately structured ($m\lesssim 0.8$), information about the input fails to permeate the network, resulting in a chancelevel reconstruction accuracy in the last subnetwork, SSN_{5}, even in the absence of noise (see Figure 1b, c). However, as $m$ approaches a switching value ${m}_{\mathrm{switch}}\approx 0.83$, there is a qualitative transition in the system’s behavior, leading to a consistently higher reconstruction accuracy across the subnetworks (Figure 1b–e), regardless of the amount of noise added to the signal (Figure 1f, g).
Beyond this transition point, reconstruction accuracy improves with depth, that is the signal is more accurately represented in SSN_{5} than in the initial subnetwork, SSN_{0}, with an effective accuracy gain of over 40% (Figure 1d, g). While the addition of noise does impair the absolute reconstruction accuracy in all cases (see Figure 1—figure supplement 1), the denoising effect persists even if the input is severely corrupted (${\sigma}_{\xi}=3$, see Figure 1f, g). This is a counterintuitive result, suggesting that topographic modularity is not only necessary for reliable communication across multiple populations (see Zajzon et al., 2019), but also supports an effective denoising effect, whereby representational precision increases with depth, even if the signal is profoundly distorted by noise.
Noise suppression and response amplification
The sequential denoising effect observed beyond the transition point ${m}_{\mathrm{switch}}\approx 0.83$ results in an increasingly accurate input encoding through progressively more precise internal representations. In general, such a phenomenon could be achieved either through noise suppression, stimulusspecific response amplification or both. In this section, we examine these possibilities by analyzing and comparing the inputdriven dynamics of the different subnetworks. The strict segregation of stimulusspecific subpopulations in SSN_{0} is only fully preserved across the system if $m=1$, in which case signal encoding and transmission primarily rely on this spatial segregation. Spiking activity across the different SSNs (Figure 2a) demonstrates that the system gradually sharpens the segregation of stimulusspecific subpopulations; indeed, in systems with fully modular feedforward projections, activity in the last subnetwork is concentrated predominantly in the stimulated subpopulations. This effect can be observed in both excitatory (E) and inhibitory (I) populations, as both are equally targeted by the feedforward excitatory projections. The sharpening effect consists of both noise suppression and response amplification (Figure 2b), measured as the relative firing rates of the nonstimulated ${\nu}_{5}^{\mathrm{NS}}/{\nu}_{0}^{\mathrm{NS}}$ and stimulated subpopulations ${\nu}_{5}^{\mathrm{S}}/{\nu}_{0}^{\mathrm{S}}$, respectively. For ,$m<{m}_{\mathrm{s}\mathrm{w}\mathrm{i}\mathrm{t}\mathrm{c}\mathrm{h}}$. noise suppression is only marginal and responses within the stimulated pathways are not amplified (${\nu}_{5}^{\mathrm{S}}/{\nu}_{0}^{\mathrm{S}}<1$).
Meanfield analysis of the stationary network activity (see Materials and methods and Appendix B) predicts that the firing rates of the stimulusspecific subpopulations increase systematically with modularity, whereas the untuned neurons are gradually silenced (Figure 2c, left). At the transition point ${m}_{\mathrm{switch}}\approx 0.83$, mean firing rates across the different subnetworks converge, which translates into a globally uniform signal encoding capacity, corresponding to the zerogain convergence point in Figure 1d, g. As the degree of modularity increases beyond this point, the selfconsistent state is lost again as the functional dynamics across the network shifts toward a gradual response sharpening, whereby the activity of stimulustuned neurons become increasingly dominant (Figure 2a–c). The effect is more pronounced for the deeper subnetworks. Note that the analytical results match well with those obtained by numerical simulation (Figure 2c, right).
In the limit of very deep networks (up to 50 SSNs, Figure 2d) the system becomes bistable, with rates converging to either a highactivity state associated with signal amplification or a lowactivity state driven by the background input. The transition point is observed at a modularity value of $m=0.83$, matching the results reported so far. Below this value, elevated activity in the stimulated subpopulations can be maintained across the initial subnetworks (<10), but eventually dies out; the rate of all neurons decays and information about the input cannot reach the deeper populations. Importantly, for $m=0.83$, the transition toward the highactivity state is slower. This allows the input signal to faithfully propagate across a large number of subnetworks ($\approx 15$), without being driven into implausible activity states.
E/I balance and asymmetric effective couplings
The departure from the balanced activity in the initial subnetworks can be better understood by zooming in at the synaptic level and analyzing how topography influences the synaptic input currents. The segregation of feedforward projections into stimulusspecific pathways breaks the symmetry between excitation and inhibition (see Figure 3a) that characterizes the balanced state (Haider et al., 2006; Shadlen and Newsome, 1994), for which the first two subnetworks were tuned (see Materials and methods). E/I balance is thus systematically shifted toward excitation in the stimulated populations and inhibition in the nonstimulated ones. Neurons belonging to subpopulations associated with the active stimulus receive significantly more net overall excitation, whereas the other neurons become gradually more inhibited. This disparity grows not only with modularity but also with network depth. Overall, across the whole system, increasing modularity results in an increasingly inhibitiondominated dynamical regime (inset in Figure 3a), whereby stronger effective inhibition silences nonstimulated populations, thus sharpening stimulus/feature representations by concentrating activity in the stimulusdriven subpopulations.
To gain an intuitive understanding of these effects from a dynamical systems perspective, we linearize the network dynamics around the stationary working points of the individual populations (Tetzlaff et al., 2012) in order to obtain the effective connectivity $W$ of the system (see Materials and methods and Appendix B). The effective impact of a single spike from a presynaptic neuron $j$ on the firing rate of a postsynaptic neuron $i$ (the effective weight ${w}_{ij}\in W$) is determined not only by the synaptic efficacies ${J}_{ij}$, but also by the statistics of the synaptic input fluctuations to the target cell $i$ that determine its excitability (see Materials and methods, Equation 6). This analysis reveals that there is an increase in the effective synaptic input onto neurons in the stimulated subpopulations as a function of modularity (Figure 3b). Conversely, nonstimulated neurons effectively receive weaker excitatory (and stronger inhibitory) drive and become increasingly less responsive (see Figure 3a, b). The role of topographic modularity in denoising can thus be understood as a transient, stimulusspecific change in effective connectivity.
For low and moderate topographic precision ($m\lesssim 0.83$), denoising does not occur as the effective weights are sufficiently similar to maintain a stable E/I balance across all populations and subnetworks (Figure 3a, b), resulting in a relatively uniform global dynamical state (indicated in Figure 3c by a constant spectral radius for $m\lesssim 0.83$, see also Materials and methods) and stable linearized dynamics ($\rho (W)<1$).
However, as the feedforward projections become more structured, the system undergoes qualitative changes: after a weak transient ($0.83\lesssim m\lesssim 0.85$) the spectral radius $\rho $ in the deep SSNs expands due to the increased effective coupling to the stimulated subpopulation (Figure 3b); the spectral radius eventually ($m\gtrsim 0.85$) contracts with increasing modularity (Figure 3c, d). Given that $\rho $ is determined by the variance of $W$, that is heterogeneity across connections (Rajan and Abbott, 2006), this behavior is expected: most weights are in the nonstimulated pathways, which decrease with larger $m$ and network depth (Figure 3b). Strong inhibitory currents (Figure 3a) suppress the majority of neurons, thereby reducing noise, as demonstrated by the collapse of the bulk of the eigenvalues toward the center for larger $m$ (Figure 3d). Indicative of a more constrained state space, this contractive effect suggests that population activity becomes gradually entrained by the spatially encoded input along the stimulated pathway, whereas the responses of the nonstimulated neurons have a diminishing influence on the overall behavior.
By biasing the effective connectivity of the system, precise topography can thus modulate the balance of excitation and inhibition in the different subnetworks, concentrating the activity along specific pathways. This results in both a systematic amplification of stimulusspecific responses and a systematic suppression of noise (Figure 2b). The sharpness/precision of topographic specificity along these pathways thus acts as a critical control parameter that largely determines the qualitative behavior of the system and can dramatically alter its responsiveness to external inputs.
Modulating inhibition
How can the system generate and maintain the elevated inhibition underlying such a noisesuppressing regime? On the one hand, feedforward excitatory input may increase the activity of certain excitatory neurons in ${\mathrm{E}}_{\mathrm{i}}$ of subnetwork ${\mathrm{SSN}}_{\mathrm{i}}$, which, in turn, can lead to increased mean inhibition through local recurrent connections. On the other hand, denoising could depend strongly on the concerted topographic projections onto ${\mathrm{I}}_{\mathrm{i}}$. Such structured feedforward inhibition is known to play important functional roles in, for example, sharpening the spatial contrast of somatosensory stimuli (Mountcastle and Powell, 1959) or enhancing coding precision throughout the ascending auditory pathways (Roberts et al., 2013).
To investigate whether recurrent activity alone can generate sufficiently strong inhibition for signal transmission and denoising, we maintained the modular structure between the excitatory populations and randomized the feedforward projections onto the inhibitory ones ($m=0$ for ${\mathrm{E}}_{\mathrm{i}}\to {\mathrm{I}}_{\mathrm{i}+1}$, compare top panels of Figure 4a, b). This leads to unstable firing patterns in the downstream subnetworks, characterized by significant accumulation of synchrony and increased firing rates (see bottom panels of Figure 4a, b and Figure 4—figure supplement 1a, b). These effects, known to result from shared presynaptic excitatory inputs (see e.g. Shadlen and Newsome, 1998; Tetzlaff et al., 2003; Kumar et al., 2008a), are more pronounced for larger $m$ and network depth (see Figure 4—figure supplement 1). Compared with the baseline network, whose activity shows clear spatially encoded stimuli (sequential activation of stimulusspecific subpopulations [Figure 4a, bottom]), removing structure from the projections onto inhibitory neurons abolishes the effect and prevents accurate signal transmission.
These effects of unstructured inhibitory projections are so marked that they can be observed even if a single set of projections is modified: this can be seen in Figure 4c, where only the ${\mathrm{E}}_{4}\to {\mathrm{I}}_{5}$ connections are randomized. It is worth noting, however, that the excessive synchronization that results from unstructured inhibitory projections (Figure 4c, bottom left, no additional input condition) can be easily counteracted by driving ${\mathrm{I}}_{5}$ (the inhibitory population that receives only unstructured projections) with additional uncorrelated external input. If strong enough (${\nu}_{\mathrm{X}}^{+}\approx 10\mathrm{s}\mathrm{p}\mathrm{k}/\mathrm{sec}$), this additional external drive pushes the inhibitory population into an asynchronous regime that restores the sharp, stimulusspecific responses in the excitatory population of the corresponding subnetwork (see Figure 4c, bottom right, and Figure 4—figure supplement 1c).
These results emphasize the control of inhibitory neurons’ responsiveness as the main causal mechanism behind the effects reported. Elevated local inhibition is strictly required, but whether this is achieved by tailored, stimulusspecific activation of inhibitory subpopulations, or by uncorrelated excitatory drive onto all inhibitory neurons appears to be irrelevant and both conditions result in sharp, stimulustuned responses in the excitatory populations.
A generalizable structural effect
We have demonstrated that, by controlling the different subnetworks’ operating point, the sharpness of feedforward projections allows the architecture to systematically improve the quality of internal representations and retrieve the input structure, even if profoundly corrupted by noise. In this section, we investigate the robustness of the phenomenon in order to determine whether it can be entirely ascribed to the topographic projections (a structural/architectural feature) or if the particular choices of models and model parameters for neuronal and synaptic dynamics contribute to the effect.
To do so, we study two alternative model systems on the signal denoising task. These are structured similar to the baseline system explored so far, comprising separate sequential subnetworks with modular feedforward projections among them (see Figure 1 and Materials and methods), but vary in total size, neuronal and synaptic dynamics. In the first test case, only the models of synaptic transmission and corresponding parameters are altered. To increase biological verisimilitude and following Zajzon et al., 2019, synaptic transmission is modeled as a conductancebased process, with different kinetics for excitatory and inhibitory transmission, corresponding to the responses of $\mathrm{AMPA}$ and ${\mathrm{GABA}}_{\mathrm{a}}$ receptors, respectively, see Materials and methods and Supplementary file 3 for details. The results, illustrated in Figure 5a, demonstrate that task performance and population activity across the network follow a similar trend to the baseline model (Figures 1 and 2a, b). Despite severe noise corruption, the system is able to generate a clear, discernible representation of the input as early as SSN_{2} and can accurately reconstruct the signal. Importantly, the relative improvement with increasing modularity and network depth is retained. In comparison to the baseline model, the transition occurs for a slightly different topographic configuration, ${m}_{\mathrm{switch}}\approx 0.85$, at which point the network dynamics converges toward a lowrate, stable asynchronous irregular regime across all populations, facilitating a linear firing rate propagation along the topographic maps (Figure 5—figure supplement 1).
The second test case is a smaller and simpler network of nonlinear rate neuron models (see Figure 5b and Materials and methods) which interact via continuous signals (rates) rather than discontinuities (spikes). Despite these profound differences in the neuronal and synaptic dynamics, the same behavior is observed, demonstrating that sequential denoising is a structural effect, dependent on the population firing rates and thus less sensitive to fluctuations in the precise spike times. Moreover, the robustness with respect to the network size suggests that denoising could also be performed in smaller, localized circuits, possibly operating in parallel on different features of the input stimuli.
Variable map sizes
Despite their ubiquity throughout the neocortex, the characteristics of structured projection pathways is far from uniform (Bednar and Wilson, 2016), exhibiting marked differences in spatial precision and specificity, aligned with macroscopic gradients of cortical organization. This nonuniformity may play an important functional role supporting feature aggregation (Hagler and Sereno, 2006) and the development of mixed representations (Patel et al., 2014) in higher (more anterior) cortical areas. Here, we consider two scenarios in the baseline (currentbased) model to examine the robustness of our findings to more complex topographic configurations.
First, we varied the size of stimulustuned subpopulations (parametrized by ${d}_{\mathrm{i}}$, see Materials and methods) but kept them fixed across the network. For small subpopulations and intermediate degrees of topographic modularity, the activity along the stimulated pathway decays with network depth, suggesting that input information does not reach the deeper SSNs (see Figure 6a and Figure 6—figure supplement 1). These results place a lower bound on the size of stimulustuned subpopulations below which no signal propagation can occur, as reflected by the negative gain in performance for $d=0.01$ (Figure 6b). Whereas denoising is robust to variation around the baseline value of $d=0.1$ that yielded perfect partitioning of the feedforward projections (see Supplementary Materials), an upper bound may emerge due to increasing overlap between the maps ($d=0.2$ in Figure 6b). In this case, the activity may ‘spill over’ to other pathways than the stimulated one, corrupting the input representations and hindering accurate transmission and decoding. This can be alleviated by reduced or no overlap (as in Figure 6a), in which case signal propagation and denoising is successful for larger map sizes (${\nu}_{5}^{\mathrm{S}}/{\nu}_{0}^{\mathrm{S}}>1$ also for $d>0.1$). We thus observe a tradeoff between map size, overlap and the degree of topographic precision that is required to accurately propagate stimulus representations (see Discussion).
Second, we took into account the fact that these structural features are known to vary with hierarchical depth resulting in increasingly larger subpopulations and, consequently, increasingly overlapping stimulus selectivity (Smith et al., 2001; Patel et al., 2014; Bednar and Wilson, 2016). To capture this effect, we introduce a linear scaling of map size with depth (${d}_{i+1}=\delta +{d}_{i}$ for $i\ge 1$, see Materials and methods). The ability of the circuit to gradually clean the signal’s representation is fully preserved, as illustrated in Figure 6c. In fact, for intermediate modularity ($m<0.9$) broadening the projections can further sharpen the reconstruction precision (compare curves for $\delta =0.02$ and $\delta =0$).
Taken together, these observations demonstrate that a gradual denoising of stimulus inputs can occur entirely as a consequence of the modular wiring between the subsequent processing circuits. Importantly, this effect generalizes well across diverse neuron and synapse models, as well as key system properties, making modular topography a potentially universal circuit feature for handling noisy data streams.
Modularity as a bifurcation parameter
The results so far indicate that the modular topographic projections, more so than the individual characteristics of neurons and synapses, lead to a sequential denoising effect through a joint process of signal amplification and noise suppression. To better understand how the system transitions to such an operating regime, it is helpful to examine its macroscopic dynamics in the limit of many subnetworks (Toyoizumi, 2012; CaycoGajic and SheaBrown, 2013; Kadmon and Sompolinsky, 2016). We apply standard meanfield techniques (Fourcaud and Brunel, 2002; Helias et al., 2013; Schuecker et al., 2015) to find the asymptotic firing rates (fixed points across subnetworks) of the stimulated and nonstimulated subpopulations as a function of topography (Figure 2d). For this, we can approximate the input μ to a group of neurons as a linear function of its firing rate $\nu $ with a slope $\kappa $ that is determined by the coupling within the group and an offset given by inputs from other groups of neurons (orange line in Figure 7a). With an approximately sigmoidal rate transfer function, the selfconsistent solutions are at the intersections marked in Figure 7a.
Formally, all neurons in the deep subnetworks of one topographic map form such a group as they share the same firing rate (asymptotic value). The coupling $\kappa $ within this group comprises not only recurrent connections of one subnetwork but also modular feedforward projections across subnetworks. For small modularity, the group is in an inhibitiondominated regime ($\kappa <0$) and we obtain only one fixed point at low activity (Figure 7a, left). Importantly, the firing rate of this fixed point is the same for stimulated and nonstimulated topographic maps. Any influence of input signals applied to SSN_{0} therefore vanishes in the deeper subnetworks and the signal cannot be reconstructed (fading regime). As topographic projections become more concentrated (larger $m$), $\kappa $ changes sign and gradually leads to two additional fixed points (as conceptually illustrated in Figure 7a and quantified in Figure 7b by numerically solving the selfconsistent meanfield equations, see also Appendix B): an unstable one (red) that eventually vanishes with increasing $m$ and a stable highactivity fixed point (black). The bistability opens the possibility to distinguish between stimulated and nonstimulated topographic maps and thereby reconstruct the signal in deep subnetworks: in the active regime beyond the critical modularity threshold (here $m\ge {m}_{\mathrm{crit}}=0.76$), a sufficiently strong input signal can drive the activity along the stimulated map to the highactivity fixed point, such that it can permeate the system, while the nonstimulated subpopulations still converge to the lowactivity fixed point. Note that this critical modularity represents the minimum modularity value for which bistability emerges. It typically differs from the actual switching point $m}_{switch$, which additionally depends on the input intensity.
In the potential energy landscape $U$ (see Materials and methods), where stable fixed points correspond to minima, the bistability that emerges for more structured topography $m\ge {m}_{\mathrm{crit}}=0.76$ can be understood as a transition from a single minimum at low rates (Figure 7c, inset) to a second minimum associated with the highactivity state (Figure 7c). Even though the full dynamics of the spiking network away from the fixed point cannot be entirely understood in this simplified potential picture (see Appendix B), qualitatively, more strongly modular networks cause deeper potential wells, corresponding to more attractive dynamical states and higher firing rates (see Figure 9—figure supplement 2).
Because the intensity of the input signal dictates the rate of different populations in the initial subnetwork SSN_{0} (Figure 7d), it also determines, for any given modularity, whether the rate of the stimulated subpopulation is in the basin of attraction of the highactivity (see Figure 7e, solid markers and arrows) or lowactivity (dashed, blue marker and arrow) fixed point. Denoising, and therefore increasing signal reconstruction, is thus achieved by successively (across subnetworks) pushing the population states toward the selfconsistent firing rates.
As reported above, for the baseline network and (standard) input ($\lambda =0.05$) used in Figures 1 and 2, the switching point between low and high activity is at $m=0.83$ (blue markers in Figure 7d, f). Stronger input signals move the switching point toward the minimal modularity $m=0.76$ of the active regime (black markers in Figure 7d, f), while weaker inputs only induce a switch at larger modularities (gray markers in Figure 7d, f).
Noise in the input simply shifts the transition point to the highactivity state in a similar manner, with more modular connectivity required to compensate for stronger jitter (Figure 7g). However, as long as the mean firing rate of the stimulated subpopulation in SSN_{0} is slightly higher than that of the nonstimulated ones (up to 0.5 spks/sec), it is sufficient to position the system in the attracting basin of the highrate fixed point and the system is able to clean the signal representation. This indicates a remarkably robust denoising mechanism.
Critical modularity for denoising
In addition to properties of the input, the critical modularity marking the onset of the active regime is also influenced by neuronal and connectivity features. To build some intuition, it is helpful to consider the sigmoidal activation function of spiking neurons (Figure 8a). The nonlinearity of this function prohibits us from obtaining quantitative, closedform analytical expressions for the critical modularity and requires a numerical solution of the selfconsistency equations (Figure 7b). However, since the continuous rate model shows a qualitatively similar behavior to the spiking baseline model (see Section ‘A generalizable structural effect’), we can study a fully analytically tractable model with piecewise linear activation function (Figure 8a, b) to expose the dependence of the critical modularity on both neuron and network properties (see detailed derivations in Appendix B).
In this simple model, the output is zero for inputs below ${\mu}_{\mathrm{min}}=15$ and at maximum rate ${\nu}_{\mathrm{max}}=150$ for inputs above ${\mu}_{\mathrm{max}}=400$. In between these two bounds, the output is linearly interpolated $\nu (\mu )={\nu}_{\mathrm{max}}(\mu {\mu}_{\mathrm{min}})/({\mu}_{\mathrm{max}}{\mu}_{\mathrm{min}})$. As discussed before, successful denoising is achieved if the nonstimulated subpopulations are silent, ${\nu}^{\mathrm{NS}}=0$, and the stimulated subpopulations are active, ${\nu}^{\mathrm{S}}>0$. Note that in the following we focus on this ideal scenario representing perfect denoising, but, in principle, intermediate solutions with ${\nu}^{\mathrm{S}}\gg {\nu}^{\mathrm{N}\mathrm{S}}>0$ may also occur and could still be considered as successful denoising. Analyzing for which neuron, network and input properties this scenario is achieved, we obtain multiple conditions for the modularity that need to be fulfilled.
The first condition illustrates the dependence of the critical modularity on the neuron model (Figure 8c, purple horizontal line)
where ${N}_{\mathrm{C}}$ is the number of stimulusspecific subpopulations and $\alpha \le 1$ (typically with a value of 0.25) represents the (reduced) noise ratio in the deeper subnetworks, with $\alpha $ scaling the noise and $1\alpha $ scaling the feedforward connections (see Materials and methods). This is necessary to ensure that the total excitatory input to each neuron is consistent across the network. In particular, the critical modularity depends on the dynamic range of input ${\mu}_{\mathrm{max}}{\mu}_{\mathrm{min}}$ and output ${\nu}_{\mathrm{max}}$. The condition represents a lower bound on the modularity required for denoising. Importantly, while it depends on the effective coupling strength $\mathcal{J}$, the noise ratio $\alpha $ and the number of maps ${N}_{\mathrm{C}}$ (see Materials and methods), it does not depend on the nature of the recurrent interactions (E/I ratio) and the strength of the external background input. In addition, we find two additional critical values of the modularity (cyan and green curves in Figure 8c–e), both of which do depend on the strength of the external background input ${\nu}_{\mathrm{X}}$ and the recurrent connectivity (E/I ratio $\gamma g$):
Depending on the external input strength ${\nu}_{\mathrm{X}}$, these are either upper or lower bounds. In the denominator of these expressions, the total input (recurrent and external) is compared to the limits of the dynamic range of the neuron model. The cancellation between recurrent and external inputs in the inhibitiondominated baseline model typically yields a total input within the dynamic range of the neuron, such that modularity in feedforward connections can decrease the input of the nonstimulated subpopulations to silence them, and increase the input of the stimulated subpopulations to support their activity. The competition between the excitatory and inhibitory contributions ensures that the total input does not lead to a saturating output activity. Thus, for inhibitory recurrence, denoising can be achieved at a moderate level of modularity over a large range of external background inputs (shaded black and hatched regions in Figure 8c), which demonstrates a robust denoising mechanism even in the presence of changes in the input environment.
In contrast, if recurrent connections are absent, strong inhibitory external background input is required to counteract the excitatory feedforward input and achieve a denoising scenario (Figure 8d). Fixed points at nonsaturated activity ${\nu}^{\mathrm{S}}>0$ are also present for low excitatory external input, but unstable due to the positive recurrent feedback. This is because in networks without recurrence, there is no competition between the recurrent input and the external and feedforward inputs. As a result, the input to both the stimulated and nonstimulated subpopulations is typically high, such that modulation of the feedforward input via topography cannot lead to a strong distinction between the pathways as required for denoising. In these networks, one typically observes high activity in all populations. A similar behavior can be observed in excitationdominated networks (Figure 8e), where the inhibitory external background input must be even stronger to compensate the excitatory feedforward and recurrent connectivity and reach a stable denoising regime.
Note that inhibitory external input is not in line with the excitatory nature of external inputs to local circuits in the brain and is therefore biologically implausible. One way to achieve denoising in excitationdominated networks for excitatory background inputs would be to shift the dynamic range of the activation function (see Figure 8—figure supplement 1), which is, however, not consistent with the biophysical properties of real neurons (distance between threshold and rest as compared to typical strengths of postsynaptic potentials). In summary, we find that recurrent inhibition is crucial to achieve denoising in biologically plausible settings.
These results on the role of recurrence and external input can be transferred to the behavior of the spiking model. While details of the fixed point behavior depend on the specific choice of the activation function, Figure 8f, h shows that there is also no denoising regime for the spiking model in case of no or excitationdominated recurrence and a biologically plausible level of external input. Instead, one finds high activity in both stimulated and nonstimulated subpopulations, as confirmed by network simulations (Figure 8g, i). Figure 8—figure supplement 2 further confirms that even reducing the external input to zero does not avoid this highactivity state in both stimulated and nonstimulated subpopulations for $m<1$.
Input integration and multistability
The analysis considered in the sections above is restricted to a system driven with a single external stimulus. However, to adequately understand the system’s dynamics, we need to account for the fact that it can be concurrently driven by multiple input streams. If two simultaneously active stimuli drive the system (see illustration in Figure 9a), the qualitative behavior where the responses along the stimulated (nonstimulated) maps are enhanced (silenced) is retained if the strength of the two input channels is sufficiently different (Figure 9b, top panel). In this case, the weaker stimulus is not strong enough to drive the subpopulation it stimulates toward the basin of attraction of the highactivity fixed point. Consequently, the subpopulation driven by this second stimulus behaves as a nonstimulated subpopulation and the system remains responsive to only one of the two inputs, acting as a WTA circuit. If, however, the ratio of stimulus intensities varies, two active subpopulations may coexist (Figure 9b, center) and/or compete (bottom panel), depending also on the degree of topographic modularity.
To quantify these variations in macroscopic behavior, we focus on the dynamics of SSN_{5} and measure the similarity (correlation coefficient) between the firing rates of the two stimulusspecific subpopulations as a function of modularity and ratio of input intensities ${\lambda}_{2}/{\lambda}_{1}$ (see Materials and methods and Figure 9c). In the case that both inputs have similar intensities but the feedforward projections are not sufficiently modular, both subpopulations are activated simultaneously (CoEx, red area in Figure 9c). This is the dynamical regime that dominates the earlier subnetworks. However, this is a transient state, and the CoEx region gradually shrinks with network depth until it vanishes completely after approximately 9–10 SSNs (see Figure 9d).
For low modularity, the system settles in the single stable state associated with nearzero firing rates, as illustrated schematically in the energy landscape in Figure 9e, (1) (see Materials and methods, Appendix B, and Supplementary Materials for derivations and numerical simulations). Above the critical modularity value, the system enters one of two different regimes. For $m>0.84$ and an input ratio below 0.7 (Figure 9c, gray area), one stimulus dominates (WTA) and the responses in the two populations are uncorrelated (Figure 9b, top panel). Although the potential landscape contains two minima corresponding to either population being active, the system always settles in the highactivity attractor state corresponding to the dominating input (Figure 9e, (2)).
If, however, the two inputs have comparable intensities and the topographic projections are sharp enough ($m>0.84$), the system transitions into a different dynamical state where neither stimulusspecific subpopulation can maintain an elevated firing rate for extended periods of time. In the extreme case of nearly identical intensities (${\lambda}_{2}/{\lambda}_{1}\ge 0.9)$ and high modularity, the responses become anticorrelated (Figure 9b, bottom panel), that is the activation of the two stimulusspecific subpopulations switches, as they engage in a dynamic behavior reminiscent of WLC between multiple neuronal groups (Lagzi and Rotter, 2015; Rost et al., 2018). The switching between the two states is driven by stochastic fluctuations (Figure 9e, (3)). The depth of the wells and width of barrier (distance between fixed points) increase with modularity (see Figure 9e, (4) and Figure 9—figure supplement 2), suggesting a greater difficulty in moving between the two attractors and consequently fewer state changes. Numerical simulations confirm this slowdown in switching (Figure 9f).
We wish to emphasize that the different dynamical states arise primarily from the feedforward connectivity profile. Nevertheless, even though the synaptic weights are not directly modified, varying the topographic modularity does translate to a modification of the effective connectivity weights (Figure 3b). The ratio of stimulus intensities also plays a role in determining the dynamics, but there is a (narrow) range (approximately between 0.75 and 0.8) for which all 3 regions can be reached through sole modification of the modularity. Together, these results demonstrate that topography can not only lead to spatial denoising but also enable various, functionally important network operating points.
Reconstruction and denoising of dynamical inputs
Until now, we have considered continuous but piecewise constant, step signals, with each step lasting for a relatively long and fixed period of $200\mathrm{ms}$. This may give the impression that the denoising effects we report only works for static or slowly changing inputs, whereas naturalistic stimuli are continuously varying. Nevertheless, sensory perception across modalities relies on varying degrees of temporal and spatial discretization (VanRullen and Koch, 2003), with individual (sub)features of the input encoded by specific (sub)populations of neurons in the early stages of the sensory hierarchy. In this section, we will demonstrate that denoising is robust to the temporal properties of the input and its encoding, as we relax many of the assumptions made in previous sections.
We consider a sinusoidal input signal, which we discretize and map onto the network according to the depiction in Figure 10a. This approach is similar to previous works, for instance it can mimic the movement of a light spot across the retina (Klos et al., 2018). By varying the sampling interval $dt$ and number of channels $k$, we can change the coarseness of the discretization from steplike signals to more continuous approximations of the input. If we choose a high sampling rate ($dt=1\mathrm{ms}$) and sufficient channels ($k=40$), we can accurately encode even fast changing signals (Figure 10b). Given that each inputdriven SSN is inhibitiondominated and therefore close to the balanced state, the network exhibits a fast tracking property (van Vreeswijk and Sompolinsky, 1996) and can accurately represent and denoise the underlying continuous signal in the spiking activity (Figure 10c, top). This is also captured by the readout, with the tracking precision increasing with network depth (Figure 10c, bottom). In this condition, there is a performance gain of up to 50% in the noiseless case (Figure 10d, top) and similar values for varying levels of noise (Figure 10d, bottom).
Note that due to the increased number of input channels (40 compared to 10) projecting to the same number of neurons in SSN_{0} as before $(800)$, for the same ${\sigma}_{\xi}$ the effective amount of noise each neuron receives is, on average, four times larger than in the baseline network. Moreover, the task was made more difficult by the significant overlap between the maps (${N}_{\mathrm{C}}=20$) as well as the resulting decrease in neuronal input selectivity. Nevertheless, similar results were obtained for slower and more coarsely sampled signals (Figure 10e–g).
We found comparable denoising dynamics for a large range of parameter combinations involving the map size, number of maps, number of channels, and signal complexity. Although there are limits with respect to the frequencies (and noise intensity) the network can track (see Figure 10—figure supplement 1), these findings indicate a very robust and flexible phenomenon for denoising spatially encoded sensory stimuli.
Discussion
The presence of stimulus or featuretuned subpopulations of neurons in primary sensory cortices (as well as in downstream areas) provides an efficient spatial encoding strategy (Pouget et al., 1999; Seriès et al., 2004; Tkacik et al., 2010) that ensures the relevant computable features are accurately represented. Here, we propose that beyond primary sensory areas, modular topographic projections play a key role in preserving accurate representations of sensory inputs across many processing modules. Acting as a structural scaffold for a sequential denoising mechanism, we show how they simultaneously enhance relevant stimulus features and remove noisy interference. We demonstrate this phenomenon in a variety of network models and provide a theoretical analysis that indicates its robustness and generality.
When reconstructing a spatially encoded input signal corrupted by noise in a network of sequentially connected populations, we find that a convergent structure in the feedforward projections is not only critical for successfully solving the task, but that the performance increases significantly with network depth beyond a certain modularity (Figure 1). Through this mechanism, the response selectivity of the stimulated subpopulations is sharpened within each subsequent subnetwork, while others are silenced (Figure 2). Such wiring may support efficient and robust information transmission from the thalamus to deeper cortical centers, retaining faithful representations even in the presence of strong noise. We demonstrate that this holds for a variety of signals, from approximately static (stepwise) to smoothly and rapidly changing dynamic inputs (Figure 10). Thanks to the balance of excitation and inhibition, the network is able to track spatially encoded signals on very short timescales, and is flexible with respect to the level of spatial and temporal discretization. Accurate tracking and denoising requires that the encoding is locally static/semistationary for only a few tens of milliseconds, which is roughly in line with psychophysics studies on the limits of sensory perception (Borghuis et al., 2019).
More generally, topographic modularity, in conjunction with other topdown processes (Kok et al., 2012), could provide the anatomical substrate for the implementation of a number of behaviorally relevant processes. For example, feedforward topographic projections on the visual pathway could contribute, together with various attentional control processes, to the widely observed popout effect in the later stages of the visual hierarchy (BrefczynskiLewis et al., 2009; Itti et al., 1998). The popout effect, at its core, assumes that in a given context some neurons exhibit sharper selectivity to their preferred stimulus feature than the neighboring regions, which can be achieved through a winnertakeall (WTA) mechanism (see Figure 9 and Himberger et al., 2018).
The WTA behavior underlying the denoising is caused by a reshaping of the E/I balance across the network (see Figure 3). As the excitatory feedforward projections become more focused, they modulate the system’s effective connectivity and thereby the gain on the stimulusspecific pathways, gating or allowing (and even enhancing) signal propagation. This change renders the stimulated pathway excitatory in the active regime (see Figure 7), leading to multiple fixed points such as those observed in networks with local recurrent excitation (Renart et al., 2007; LitwinKumar and Doiron, 2012). While the highactivity fixed point of such clustered networks is reached over time, in our model it unfolds progressively in space, across multiple populations. Importantly, in the range of biologically plausible numbers of cortical areas relevant for signal transmission (up to 10 for some visual stimuli, see Felleman and Van Essen, 1991; Hegdé and Felleman, 2007) and intermediate modularity, the firing rates remain within experimentally observed limits and do not saturate. The basic principle is similar to other approaches that alter the gain on specific pathways to facilitate stimulus propagation, for example through stronger synaptic weights (Vogels and Abbott, 2005), stronger nonlinearity (Toyoizumi, 2012), tuning of connectivity strength, and neuronal thresholds (CaycoGajic and SheaBrown, 2013), via detailed balance of local excitation and inhibition (amplitude gating; Vogels and Abbott, 2009) or with additional subcortical structures (Cortes and van Vreeswijk, 2015). Additionally, our model also displays some activity characteristics reported previously, such as the response sharpening observed for synfire chains (Diesmann et al., 1999) or (almost) linear firing rate propagation (Kumar et al., 2010) (for intermediate modularity).
However, due to the reliance on increasing inhibitory activity at every stage, we speculate that denoising, as studied here, would not occur in such a system containing a single, shared inhibitory pool with homogeneous connectivity. In this case, inhibition would affect all excitatory populations uniformly, with stronger activity potentially preventing accurate stimulus transmission from the initial subnetworks. Nevertheless, this problem could be alleviated using a more realistic, localized spatial connectivity profile as in Kumar et al., 2008a, or by adding shadow pools (groups of inhibitory neurons) for each layer of the network, carefully wired in a recurrent or feedforward manner (Aviel et al., 2003; Aviel et al., 2005; Vogels and Abbott, 2009). In such networks with nonrandom or spatially dependent connectivity, structured (modular) topographic projections onto the inhibitory populations will likely be necessary to maintain stable dynamics and attain the appropriate inhibitiondominated regimes (Figure 3). Alternatively, these could be achieved through additional, targeted inputs from other areas (Figure 4), with feedforward inhibition known to provide a possible mechanism for contextdependent gating or selective enhancement of certain stimulus features (Ferrante et al., 2009; Roberts et al., 2013).
While our findings build on the above results, we here show that the experimentally observed topographic maps may serve as a structural denoising mechanism for sensory stimuli. In contrast to most works on signal propagation where noise mainly serves to stabilize the dynamics and is typically avoided in the input, here the system is driven by a continuous signal severely corrupted by noise. Taking a more functional approach, this input is reconstructed using linear combinations of the full network responses, rather than evaluating the correlation structure of the activity or relying on precise firing rates. Focusing on the modularity of such maps in recurrent spiking networks, our model also differs from previous studies exploring optimal connectivity profiles for minimizing information loss in purely feedforward networks (Renart and van Rossum, 2012; Zylberberg et al., 2017), also in the context of sequential denoising autoencoders (Kadmon and Sompolinsky, 2016) and stimulus classification (Babadi and Sompolinsky, 2014), which used simplified neuron models or shallow networks, made no distinction between excitatory and inhibitory connections, or relied on specific, trained connection patterns (e.g., chosen by the pseudoinverse model). Although the bistability underlying denoising can, in principle, also be achieved in such feedforward or networks without inhibition, our theoretical predictions and network simulations indicate that for biologically constrained circuits (i.e., where the background and longrange feedforward input is excitatory), inhibitory recurrence is indispensable for the spatial denoising studied here (see Section ‘Critical modularity for denoising’). Recurrent inhibition compensates for the feedforward and external excitation, generating competition between the topographic pathways and allowing the populations to rapidly track their input.
Moreover, our findings provide an explanation for how lowintensity stimuli (1–2 spks/sec above background activity, see Figure 2 and Supplementary Materials) could be amplified across the cortex despite significant noise corruption, and relies on a generic principle that persists across different network models (Figure 5) while also being robust to variations in the map size (Figure 6). We demonstrated both the existence of a lower and upper (due to increased overlap) bound on their spatial extent for signal transmission, as well as an optimal region for which denoising was most pronounced. These results indicate a tradeoff between modularity and map size, with larger maps sustaining stimulus propagation at lower modularity values, whereas smaller maps must compensate through increased topographic density (see Figure 6a and Supplementary Materials). In the case of smaller maps, progressively enlarging the receptive fields enhanced the denoising effect and improved task performance (Figure 6c), suggesting a functional benefit for the anatomically observed decrease in topographic specificity with hierarchical depth (Bednar and Wilson, 2016; Smith et al., 2001). One advantage of such a wiring could be spatial efficiency in the initial stages of the sensory hierarchy due to anatomical constraints, for instance the retina or the lateral geniculate nucleus. While we get a good qualitative description of how the spatial variation of topographic maps influences the system’s computational properties, the numerical values in general are not necessarily representative. Cortical maps are highly dynamic and exhibit more complex patterning, making (currently scarce) precise anatomical data a prerequisite for more detailed investigations. For instance, despite abundant information on the size of receptive fields (Smith et al., 2001; Liu et al., 2016; Keliris et al., 2019), there is relatively little data on the connectivity between neurons tuned to related or different stimulus features across distinct cortical circuits. Should such experiments become feasible in the future, our model provides a testable prediction: the projections must be denser (or stronger) between smaller maps to allow robust communication whereas for larger maps fewer connections may be sufficient.
Finally, our model relates topographic connectivity to competitionbased network dynamics. For two input signals of comparable intensities, moderately structured projections allow both representations to coexist in a decodable manner up to a certain network depth, whereas strongly modular connections elicit WLC like behavior characterized by stochastic switching between the two stimuli (see Figure 9). Computation by switching is a functionally relevant principle (McCormick, 2005; Schittler Neves and Timme, 2012), which relies on fluctuation or inputdriven competition between different metastable (unstable) or stable attractor states. In the model studied here, modular topography induced multistability (uncertainty) in representations, alternating between two stable fixed points corresponding to the two input signals. Structured projections may thus partially explain the experimentally observed competition between multiple stimulus representations across the visual pathway (Li et al., 2016), and is conceptually similar to an attractorbased model of perceptual bistability (MorenoBote et al., 2007). Moreover, this multistability across subnetworks can be ‘exploited’ at any stage by control signals, that is additional modulation (inihibitory) could suppress one and amplify (bias) another.
Importantly, all these different dynamical regimes emerge progressively through the hierarchy and are not discernible in the initial modules. Previous studies reporting on similar dynamical states have usually considered either the synaptic weights as the main control parameter (Lagzi and Rotter, 2015; Lagzi et al., 2019; Vogels and Abbott, 2005) or studied specific architectures with clustered connectivity (Schaub et al., 2015; LitwinKumar and Doiron, 2012; Rost et al., 2018). Our findings suggest that in a hierarchical circuit a similar palette of behaviors can be also obtained given appropriate effective connectivity patterns modulated exclusively through modular topography. Although we used fixed projections throughout this study, these could also be learned and shaped continuously through various forms of synaptic plasticity (see e.g. Tomasello et al., 2018). To achieve such a variety of dynamics, cortical circuits most likely rely on a combination of all these mechanisms, that is, prewired modular connections (within and between distant modules) and heterogeneous gain adaptation through plasticity, along with more complex processes such as targeted inhibitory gating.
Overall, our results highlight a novel functional role for topographically structured projection pathways in constructing reliable representations from noisy sensory signals, and accurately routing them across the cortical circuitry despite the plethora of noise sources along each processing stage.
Materials and methods
Network architecture
Request a detailed protocolWe consider a feedforward network architecture where each subnetwork (SSN) is a balanced random network (Brunel, 2000) composed of $N=10000$ homogeneous LIF neurons, grouped into a population of ${N}^{\mathrm{E}}=0.8N$ excitatory and ${N}^{\mathrm{I}}=0.2N$ inhibitory units. Within each subnetwork, neurons are connected randomly and sparsely, with a fixed number of ${K}_{\mathrm{E}}=\u03f5{N}^{\mathrm{E}}$ local excitatory and ${K}_{\mathrm{I}}=\u03f5{N}^{\mathrm{I}}$ local inhibitory inputs per neuron. The subnetworks are arranged sequentially, that is the excitatory neurons ${\mathrm{E}}_{\mathrm{i}}$ in ${\mathrm{SSN}}_{\mathrm{i}}$ project to both ${\mathrm{E}}_{\mathrm{i}+1}$ and ${\mathrm{I}}_{\mathrm{i}+1}$ populations in the subsequent subnetwork ${\mathrm{SSN}}_{\mathrm{i}+1}$ (for an illustrative example, see Figure 1a). There are no inhibitory feedforward projections. Although projections between subnetworks have a specific, nonuniform structure (see next section), each neuron in ${\mathrm{SSN}}_{\mathrm{i}+1}$ receives the same total number of synapses from the previous SSN, ${K}_{\mathrm{FF}}$.
In addition, all neurons receive ${K}_{\mathrm{X}}$ inputs from an external source representing stochastic background noise. For the first subnetwork, we set ${K}_{\mathrm{X}}={K}_{\mathrm{E}}$, as it is commonly assumed that the number of background input synapses modeling local and distant cortical input is in the same range as the number of recurrent excitatory connections (see e.g. Brunel, 2000; Kumar et al., 2008b; Duarte and Morrison, 2014). To ensure that the total excitatory input to each neuron is consistent across the network, we scale ${K}_{\mathrm{X}}$ by a factor of $\alpha =0.25$ for the deeper SSNs and set ${K}_{\mathrm{FF}}=\left(1\alpha \right){K}_{\mathrm{E}}$, resulting in a ratio of 3:1 between the number of feedforward and background synapses.
Modular feedforward projections
Within each SSN, each neuron is assigned to one or more of ${N}_{\mathrm{C}}$ subpopulations SP associated with a specific stimulus (${N}_{\mathrm{C}}=10$ unless otherwise stated). This is illustrated in Figure 1a for ${N}_{\mathrm{C}}=2$. We choose these subpopulations so as to minimize their overlap within each ${\mathrm{SSN}}_{\mathrm{i}}$, and control their effective size ${C}_{\mathrm{i}}^{\beta}={d}_{\mathrm{i}}{N}^{\beta},\beta \in [\mathrm{E},\mathrm{I}]$, through the scaling parameter ${d}_{\mathrm{i}}\in [0,1]$. Depending on the size and number of subpopulations, it is possible that some neurons are not part of any or that some neurons belong to multiple such subpopulations (overlap).
Map size
Request a detailed protocolIn what follows, a topographic map refers to the sequence of subpopulations in the different subnetworks associated with the same stimulus. To enable a flexible manipulation of the map sizes, we constrain the scaling factor ${d}_{\mathrm{i}}$ by introducing a stepwise linear increment $\delta $, such that ${d}_{\mathrm{i}}={d}_{0}+i\delta ,i\ge 1$. Unless otherwise stated, we set ${d}_{0}=0.1$ and $\delta =0$. Note that all SPs within a given SSN have the same size. In this study, we will only explore values in the range $0\le \delta \le 0.02$ to ensure consistent map sizes across the system, that is, $0\le {d}_{\mathrm{i}}\le 1$ for all ${\mathrm{SSN}}_{\mathrm{i}}$ (see constraints in Appendix A).
Modularity
Request a detailed protocolTo systematically modify the degree of modular segregation in the topographic projections, we define a modularity parameter that determines the relative probability for feedforward connections from a given SP in ${\mathrm{SSN}}_{\mathrm{i}}$ to target the corresponding SP in ${\mathrm{SSN}}_{\mathrm{i}+1}$. Specifically, we follow (Newman, 2009; Pradhan et al., 2011) and define $m=1\frac{{p}_{0}}{{p}_{\mathrm{c}}}\in [0,1]$ as the ratio of the feedforward projection probabilities between neurons belonging to different SPs $\left({p}_{0}\right)$ and between neurons on the same topographic map $\left({p}_{\mathrm{c}}\right)$. According to the above definition, the feedforward connectivity matrix is random and homogeneous (ErdősRényi graph) if $m=0$ or ${d}_{\mathrm{i}}=1$ (see Figure 1a). For $m=1$ it is a blockdiagonal matrix, where the individual SPs overlap only when $d}_{\mathrm{i}}>1/{N}_{\mathrm{C}$. In order to isolate the effects on the network dynamics and computational performance attributable exclusively to the topographic structure, the overall density of the feedforward connectivity matrix is kept constant at $\left(1\alpha \right)*\u03f5=0.075$ (see also previous section). We note that, while providing the flexibility to implement the variations studied in this manuscript, this formalism has limitations (see Appendix A).
Neuron and synapse model
Request a detailed protocolWe study networks composed of LIF neurons with fixed voltage threshold and static synapses with exponentially decaying postsynaptic currents or conductances. The subthreshold membrane potential dynamics of such a neuron evolves according to:
where ${\tau}_{\mathrm{m}}$ is the membrane time constant, and $R{I}^{\beta}$ is the total synaptic input from population $\beta \in [E,I]$. The background input ${I}^{\mathrm{X}}$ is assumed to be excitatory and stochastic, modeled as a homogeneous Poisson process with constant rate ${\nu}_{\mathrm{X}}$. Synaptic weights ${J}_{\mathrm{ij}}$, representing the efficacy of interaction from presynaptic neuron $j$ to postsynaptic neuron $i$, are equal for all realized connections of a given type, that is, ${J}_{\mathrm{EE}}={J}_{\mathrm{IE}}=J$ for excitatory and ${J}_{\mathrm{EI}}={J}_{\mathrm{II}}=gJ$ for inhibitory synapses. All synaptic delays and time constants are equal in this setup. For a complete, tabular description of the models and model parameters used throughout this study, see Supplementary files 1–5.
Following previous works (Zajzon et al., 2019; Duarte and Morrison, 2014), we choose the intensity of the stochastic input ${\nu}_{\mathrm{X}}$ and the E–I ratio $g$ such that the first two subnetworks operate in a balanced, asynchronous irregular regime when driven solely by background input. This is achieved with ${\nu}_{\mathrm{X}}=12\mathrm{spikes}/\mathrm{s}$ and $g=12$, resulting in average firing rates of $\sim 3\mathrm{spikes}/\mathrm{s}$, coefficient of variation ($C{V}_{\mathrm{ISI}}$) in the interval $[1.0,1.5]$ and Pearson crosscorrelation (CC) ≤0.01 in ${\mathrm{SSN}}_{0}$ and ${\mathrm{SSN}}_{1}$.
In Section ‘A generalizable structural effect’ we consider two additional systems, a network of LIF neurons with conductancebased synapses and a continuous firing rate model. The LIF network is described in detail in Zajzon et al., 2019. Spiketriggered synaptic conductances are modeled as exponential functions, with fixed and equal conduction delays for all synapses. Key differences to the currentbased model include, in addition to the biologically more plausible synapse model, longer synaptic time constants and stronger input (see also Zajzon et al., 2019 and Supplementary file 3 for the numerical values of all parameters).
The continuous rate model contains $N=3000$ nonlinear units, the dynamics of which are governed by:
where $\mathit{x}$ represents the activation and $\mathit{r}$ the output of all units, commonly interpreted as the synaptic current variable and the firing rate estimate, respectively. The rates ${r}_{\mathrm{i}}$ are obtained by applying the nonlinear transfer function $\mathrm{tanh}\left({x}_{\mathrm{i}}\right)$, modified here to constrain the rates to the interval $[0,1]$ is the neuronal time constant, ${\mathit{b}}^{\mathrm{rec}}$ is a vector of individual neuronal bias terms (i.e., a baseline activation), and $J$ and ${J}^{\mathrm{in}}$ are the recurrent (including feedforward) and input weight matrices, respectively. These are constructed in the same manner as for the spiking networks, such that the overall connectivity, including the input mapping onto ${\mathrm{SSN}}_{0}$, is identical for all three models. Input weights are drawn from a uniform distribution, while the rest follow a normal distribution. Finally, $\mathit{\xi}$ is a vector of $N$ independent realizations of Gaussian white noise with zero mean and variance scaled by ${\sigma}_{\mathrm{X}}$. The differential equations are integrated numerically, using the Euler–Maruyama method with step $\delta t=1\mathrm{ms}$, with specific parameter values given in Supplementary file 5.
Signal reconstruction task
Request a detailed protocolWe evaluate the system’s ability to recover a simple, continuous step signal from a noisy variation using linear combinations of the population responses in the different SSNs (Maass et al., 2002). This is equivalent to probing the network’s ability to function as a denoising autoencoder (Bengio et al., 2013).
To generate the ${N}_{\mathrm{C}}$dimensional input signal $u\left(t\right)$, we randomly draw stimuli from a predefined set $S=\{{S}_{1},{S}_{2},\mathrm{\dots},{S}_{{N}_{\mathrm{C}}}\}$ and set the corresponding channel to active for a fixed duration of 200 ms (Figure 1a, left). This binary step signal $u\left(t\right)$ is also the target signal to be reconstructed. The effective input is obtained by adding a Gaussian white noise process with zero mean and variance ${\sigma}_{\xi}^{2}$ to $u\left(t\right)$, and scaling the sum with the input rate ${\nu}_{\mathrm{in}}$. Rectifying the resulting signal leads to the final form of the continuous input signal $z\left(t\right)={\left[{\nu}_{\mathrm{in}}\left(u\left(t\right)+\xi \left(t\right)\right)\right]}_{+}$. This allows us to control the amount of noise in the input, and thus the task difficulty, through a single parameter ${\sigma}_{\xi}$.
To deliver the input to the circuit, the analog signal $z\left(t\right)$ is converted into spike trains, with its amplitude serving as the rate of an inhomogeneous Poisson process generating independent spike trains. We set the scaling amplitude to ${\nu}_{\mathrm{in}}={K}_{E}\lambda {\nu}_{\mathrm{X}}$, modeling stochastic input with fixed rate $\lambda {\nu}_{\mathrm{X}}$ from ${K}_{E}=800$ neurons. If not otherwise specified, $\lambda =0.05$ holds, resulting in a mean firing rate below 8 spks/sec in ${\mathrm{SSN}}_{0}$ (see Figure 2c).
Each input channel $k$ is mapped onto one of the ${N}_{\mathrm{C}}$ stimulusspecific subpopulations of excitatory and inhibitory neurons in the first (input) subnetwork ${\mathrm{SSN}}_{0}$, chosen according to the procedure described above (see also Figure 1a). This way, each stimulus ${S}_{\mathrm{k}}$ is mapped onto a specific set of subpopulations in the different subnetworks, that is, the topographic map associated with ${S}_{\mathrm{k}}$.
For each stimulus in the sequence, we sample the responses of the excitatory population in each ${\mathrm{SSN}}_{\mathrm{i}}$ at fixed time points (once every ms) relative to stimulus onset. We record from the membrane potentials ${V}_{m}$ as they represent a parameterfree and direct measure of the population state (Duarte et al., 2018; Uhlmann et al., 2017). The activity vectors are then gathered in a state matrix $X}_{{\mathrm{S}\mathrm{S}\mathrm{N}}_{\mathrm{i}}}\in {\mathbb{R}}^{{N}^{\mathrm{E}}\times T$, which is then used to train a linear readout to approximate the target output of the task (Lukoševičius and Jaeger, 2009). We divide the input data, containing a total of 100 stimulus presentations (yielding $T=20,000$ samples), into a training and a testing set (80/20%), and perform the training using ridge regression (L2 regularization), with the regularization parameter chosen by leaveoneout crossvalidation on the training dataset.
Reconstruction performance is measured using the normalized root mean squared error (NRMSE). For this particular task, the effective delay in the buildup of optimal stimulus representations varies greatly across the subnetworks. In order to close in on the optimal delay for each ${\mathrm{SSN}}_{\mathrm{i}}$, we train the state matrix ${X}_{{\mathrm{SSN}}_{\mathrm{i}}}$ on a larger interval of delays and choose the one that minimizes the error, averaged across multiple trials.
In Section ‘Reconstruction and denoising of dynamical inputs’, we generalize the input to a sinusoidal signal $x\left(t\right)=\mathrm{sin}\left(a\cdot t\right)+\mathrm{cos}\left(b\cdot t\right)$, with parameters $a$ and $b$. From this, we obtain $u\left(t\right)$ through the sampling and discretization process described in the respective section, and compute the final input $z\left(t\right)={\left[{\nu}_{\mathrm{in}}\left(u\left(t\right)+\xi \left(t\right)\right)\right]}_{+}$ as above.
Effective connectivity and stability analysis
Request a detailed protocolTo better understand the role of structural variations on the network’s dynamics, we determine the network’s effective connectivity matrix $W$ analytically by linear stability analysis around the system’s stationary working points (see Appendix B for the complete derivations). The elements ${w}_{ij}\in W$ represent the integrated linear response of a target neuron $i$, with stationary rate $\nu}_{i$, to a small perturbation in the input rate $\nu}_{j$ caused by a spike from presynaptic neuron $j$. In other words, ${w}_{ij}$ measures the average number of additional spikes emitted by a target neuron $i$ in response to a spike from the presynaptic neuron $j$, and its relation to the synaptic weights is defined by Tetzlaff et al., 2012; Helias et al., 2013:
Note that in Figure 3 we ignore the contribution $\stackrel{~}{\beta}$ resulting from the modulation in the input variance ${\sigma}_{\mathrm{j}}^{2}$ which is significantly smaller due to the additional factor $1/{\sigma}_{i}\sim \mathcal{O}(1/\sqrt{N})$. Importantly, the effective connectivity matrix $W$ allows us to gain insights into the stability of the system by eigenvalue decomposition. For large random coupling matrices, the effective weight matrix has a spectral radius $\rho ={\mathrm{max}}_{k}\left(\mathrm{Re}\left\{{\lambda}_{\mathrm{k}}\right\}\right)$ which is determined by the variances of $W$ (Rajan and Abbott, 2006). For inhibitiondominated systems, such as those we consider, there is a single negative outlier representing the mean effective weight, given the eigenvalue ${\lambda}_{\mathrm{k}}^{*}$ associated with the unit vector. The stability of the system is thus uniquely determined by the spectral radius $\rho $: values smaller than unity indicate stable dynamics, whereas $\rho >1$ lead to unstable linearized dynamics.
Fixed point analysis
Request a detailed protocolFor the meanfield analysis, the ${N}_{\mathrm{C}}$ subpopulations in each subnetwork can be reduced to only two groups of neurons, the first one comprising all neurons of the stimulated SPs and the second one comprising all neurons in all nonstimulated SPs. This is possible because (1) the firing rates of the excitatory and inhibitory neurons within one SP are identical, owing to homogeneous neuron parameters and matching incoming connection statistics, and (2) all neurons in nonstimulated SPs have the same rate ${\nu}^{\mathrm{NS}}$ that is in general different from the rate of the stimulated SP ${\nu}^{\mathrm{S}}$. Here we only sketch the main steps, with a detailed derivation given in Appendix B.
The mean inputs to the first subnetwork can be obtained via
where $\gamma ={K}_{\mathrm{I}}/{K}_{\mathrm{E}}$ and $\mathcal{J}={\tau}_{\mathrm{m}}{K}_{\mathrm{E}}J$. Both equations are of the form
where $\kappa $ is the effective selfcoupling of a group of neurons with rate $\nu $ and input μ, and $I$ denotes the external inputs from other groups. Equation 8 describes a linear relationship between the rate $\nu $ and the input μ. To find a selfconsistent solution for the rates ${\nu}^{\mathrm{S}}$ and ${\nu}^{\mathrm{NS}}$, the above equations need to be solved numerically, taking into account in addition the f–I curve $\nu \left(\mu \right)$ of the neurons that in the case of LIF model neurons also depends on the variance ${\sigma}^{2}$ of inputs. The latter can be obtained analogous to the mean input μ (see Appendix B). Note that for general nonlinearity $\nu \left(\mu \right)$ there is no analytical closedform solution for the fixed points.
Starting from ${\mathrm{SSN}}_{1}$, networks are connected in a fixed pattern such that the rate ${\nu}_{i}$ in ${\mathrm{SSN}}_{i}$ also depends on the excitatory input from the previous subnetwork ${\mathrm{SSN}}_{i1}$ with rate ${\nu}_{i1}$. For a fixed point, we have ${\nu}_{i}={\nu}_{i1}$ (Toyoizumi, 2012). In this case, we can effectively group together stimulated/nonstimulated neurons in successive subnetworks and regroup equations for the mean input in the limit of many subnetworks, obtaining the simplified description (details see Appendix B)
The scaling terms of the firing rates incorporate the recurrent and feedforward contributions from the stimulated and nonstimulated groups of neurons. They depend solely on some fixed parameters of the system, including modularity $m$ (see Appendix B). Importantly, Equations 9 and 10 and have the same linear form as (Equation 8) Equation 8 and can be solved numerically as described above. Again, for general nonlinear $\nu \left(\mu \right)$ there is no closedform analytical solution, but see below for a piecewise linear activation function $\nu \left(\mu \right)$. The numerical solutions for fixed points are obtained using the root finding algorithm root of the scipy.optimize package (Virtanen et al., 2020). The stability of the fixed points is obtained by inserting the corresponding firing rates into the effective connectivity Equation 6. On the level of stimulated and nonstimulated subpopulations, the effective connectivity matrix reads
from which we obtain the maximum eigenvalue $\rho $, which for stable fixed points must be smaller than 1.
The structure of fixed points for the stimulated subpopulation (see discussion in ‘Modularity as a bifurcation parameter’) can furthermore be intuitively understood by studying the potential landscape of the system. The potential $U$ is thereby defined via the conservative force $F=\frac{dU}{d{\nu}^{\mathrm{S}}}={\nu}^{\mathrm{S}}+\nu (\mu ,{\sigma}^{2})$ that drives the system toward its fixed points via the equation of motion $\frac{d{\nu}^{\mathrm{S}}}{dt}=F$ (Wong and Wang, 2006; LitwinKumar and Doiron, 2012; Schuecker et al., 2017). Note that μ and ${\sigma}^{2}$ are again functions of ${\nu}^{\mathrm{S}}$ and ${\nu}^{\mathrm{NS}}$, where the latter is the selfconsistent rate of the nonstimulated subpopulations for given rate ${\nu}^{S}$ of the stimulated subpopulation, ${\nu}^{\mathrm{NS}}={\nu}^{\mathrm{NS}}\left({\nu}^{\mathrm{S}}\right)$ (details see Appendix B).
Multiple inputs and correlationbased similarity score
Request a detailed protocolIn Figure 9, we consider two stimuli ${S}_{1}$ and ${S}_{2}$ to be active simultaneously for 10 s. Let $S{P}_{1}$ and $S{P}_{2}$ be the two corresponding SPs in each subnetwork. The firing rate of each SP is estimated from spike counts in time bins of 10 ms and smoothed with a SavitzkyGolay filter (length 21 and polynomial order 4). We compute a similarity score based on the correlation between these rates, scaled by the ratio of the input intensities ${\lambda}_{2}/{\lambda}_{1}$ (with ${\lambda}_{1}$ fixed). This scaling is meant to introduce a gradient in the similarity score based on the firing rate differences, ensuring that high (absolute) scores require comparable activity levels in addition to strong correlations. To ensure that both stimuli are decodable where appropriate, we set the score to 0 when the difference between the rate of $S{P}_{2}$ and the nonstimulated SPs was <1 spks/sec ($S{P}_{1}$ had significantly higher rates). The curves in Figure 9c mark the regime boundaries: coexisting (CoEx) where score is >0.1 (red curve); WLC where score is <−0.1 (blue); WTA (gray) and where the score is in the interval (−0.1, 0.1), and either ${\lambda}_{2}/{\lambda}_{1}<0.5$ holds or the score is 0. While the CoEx region is a dynamical regime that only occurs in the initial subnetworks (Figure 9d), the WTA and WLC regimes persist and can be understood again with the help of a potential $U$, which is in this case a function of the rates of the two SPs (details see Appendix B).
Numerical simulations and analysis
Request a detailed protocolAll numerical simulations were conducted using the Neural Microcircuit Simulation and Analysis Toolkit (NMSAT) v0.2 (Duarte et al., 2017), a highlevel Python framework for creating, simulating and evaluating complex, spiking neural microcircuits in a modular fashion. It builds on the PyNEST interface for NEST (Gewaltig and Diesmann, 2007), which provides the core simulation engine. To ensure the reproduction of all the numerical experiments and figures presented in this study, and abide by the recommendations proposed in Pauli et al., 2018, we provide a complete code package that implements projectspecific functionality within NMSAT (see Data availability) using NEST 2.18.0 (Jordan et al., 2019).
Competing interests
Request a detailed protocolThe authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Appendix 1
Constraints on feedforward connectivity
This section expands on the limitations arising from the definitions of topographic modularity and map sizes used in this study. By imposing a fixed connection density on the feedforward connection matrices, the projection probabilities between neurons tuned to the same $\left({p}_{\mathrm{c}}\right)$ and different $\left({p}_{0}\right)$ stimuli are uniquely determined by the modularity $m$ and the parameter ${d}_{0}$ and $\delta $, which control the size of stimulusspecific subpopulations (see Materials and methods). For notational simplicity, here we consider the merged excitatory and inhibitory subpopulations tuned to a particular stimulus in a given subnetwork $\mathrm{S}\mathrm{S}\mathrm{N}}_{\mathrm{i}$, with a total size ${C}_{\mathrm{i}}={C}_{\mathrm{i}}^{\mathrm{E}}+{C}_{\mathrm{i}}^{\mathrm{I}}$.
Under the constraints applied in this work, the total density of a feedforward adjacency matrix between ${\mathrm{SSN}}_{\mathrm{i}}$ and ${\mathrm{SSN}}_{\mathrm{i}+1}$ can be computed as:
where ${U}_{0}^{\mathrm{i}}$ and ${U}_{\mathrm{c}}^{\mathrm{i}}$ are the number of realizable connections between similarly and differently tuned subpopulations, respectively. Since ${U}_{\mathrm{c}}^{\mathrm{i}}={N}^{2}{U}_{0}^{\mathrm{i}}$, we can simplify the notation and focus only on ${U}_{0}^{\mathrm{i}}$. We distinguish between the cases of nonoverlapping and overlapping stimulusspecific subpopulations:
where each potential synapse is counted only once, regardless of whether the involved neurons belong to any or multiple overlapping subpopulations. This ensures consistency with the definitions of the probabilities ${p}_{\mathrm{c}}$ and ${p}_{0}$. Alternatively, we can express ${U}_{0}^{\mathrm{i}}$ as:
For the case with no overlap, we can derive an additional constraint on the minimum subpopulations size ${C}_{\mathrm{i}}$ for the required density ${\sigma}_{\mathrm{i}}$ to be satisfied, which we define in relation to the total number of subpopulations $N}_{C$:
The equality holds in the case of $m=1$ and alltoall feedforward connectivity between similarly tuned subpopulations, that is, ${p}_{\mathrm{c}}=1$.
Appendix 2
Meanfield analysis of network dynamics
For an analytical investigation of the role of topographic modularity on the network dynamics, we used meanfield theory (Fourcaud and Brunel, 2002; Helias et al., 2013; Schuecker et al., 2015). Under the assumptions that each neuron receives a large number of small amplitude inputs at every time step, the synaptic time constants ${\tau}_{\mathrm{s}}$ are small compared to the membrane time constant ${\tau}_{\mathrm{m}}$, and that the network activity is sufficiently asynchronous and irregular, we can make use of theoretical results obtained from the diffusion approximation of the LIF neuron model to determine the stationary population dynamics. The equations in this section were partially solved using a modified version of the LIF Meanfield Tools library (Layer et al., 2020).
Stationary firing rates and fixed points
In the circumstances described above, the total synaptic input to each neuron can be replaced by a Gaussian white noise process (independent across neurons) with mean $\mu \left(t\right)$ and variance ${\sigma}^{2}\left(t\right)$. In the stationary state, these quantities, along with the firing rates of each afferent, can be well approximated by their constant time average. The stationary firing rate of the LIF neuron in response to such input is:
where erf is the error function and the integration limits are defined as ${y}_{\mathrm{r}}=\left({V}_{\mathrm{reset}}\mu \right)/\sigma +\frac{q}{2}\sqrt{{\tau}_{\mathrm{s}}/{\tau}_{\mathrm{eff}}}$ and ${y}_{\theta}=\left(\theta \mu \right)/\sigma +\frac{q}{2}\sqrt{{\tau}_{\mathrm{s}}/{\tau}_{\mathrm{eff}}}$, with $q=\sqrt{2}\left\zeta \left(1/2\right)\right$ and Riemann zeta function $\zeta $ (see Fourcaud and Brunel, 2002, Eq. 4.33). As we will see below, the mean μ and variance ${\sigma}^{2}$ of the input also depend on the stationary firing rate $\nu $, rendering Equation 14 an implicit equation that needs to be solved selfconsistently using fixedpoint iteration.
For simplicity, throughout the meanfield analyses we consider perfectly partitioned networks where each neuron belongs to exactly one topographic map, that is, to one of the ${N}_{\mathrm{C}}$ stimulusspecific, identically sized subpopulations SP (no overlap condition). We denote the firing rate of a neuron in the currently stimulated SP (receiving stimulus input in ${\mathrm{SSN}}_{0}$) in subnetwork ${\mathrm{SSN}}_{\mathrm{i}}$ by ${\nu}_{\mathrm{i}}^{\mathrm{S}}$, and by ${\nu}_{\mathrm{i}}^{\mathrm{NS}}$ that of neurons not associated with the stimulated pathway. Since the firing rates of excitatory and inhibitory neurons are equal (due to identical synaptic time constants and input statistics), we can write the constant mean synaptic input to neurons in the input subnetwork as
The variances ${\left({\sigma}_{0}^{\mathrm{S}}\right)}^{2}$ and ${\left({\sigma}_{0}^{\mathrm{NS}}\right)}^{2}$ can be obtained by squaring each weight $J$ in the above equation. To derive these equations for the deeper subnetworks $\mathrm{S}\mathrm{S}\mathrm{N}}_{\mathrm{i}>0$, it is helpful to include auxiliary variables ${K}_{\mathrm{S}}$ and ${K}_{\mathrm{NS}}$, representing the number of feedforward inputs to a neuron in ${\mathrm{SSN}}_{\mathrm{i}}$ from its own SP in ${\mathrm{SSN}}_{\mathrm{i}1}$, and from one different SP (there are ${N}_{\mathrm{C}}1$ such subpopulations), respectively. Both ${K}_{\mathrm{S}}$ and ${K}_{\mathrm{NS}}$ are uniquely defined by the modularity $m$ and projection density $d$, and ${K}_{\mathrm{NS}}=\left(1m\right){K}_{\mathrm{S}}=\left(1m\right)\left(1\alpha \right){K}_{\mathrm{E}}$ holds as well. The mean synaptic inputs to the neurons in the deeper subnetworks can thus be written as:
Again, one can obtain the variances by squaring each weight $J$. The stationary firing rates for the stimulated and nonstimulated subpopulations in all subnetworks are then found by first solving Equations 14 and 15 for the first subnetwork and then (Equation 16) Equations 14 and 16 successively for deeper subnetworks.
For very deep networks, one can ask the question, whether firing rates approach fixed points across subnetworks. If there are multiple fixed points, the initial condition, that is the externally stimulated activity of subpopulations in the first subnetwork, decides in which of the fixed points the rates evolve, in a similar spirit as in recurrent networks after a startup transient. For a fixed point, we have ${\nu}_{i1}={\nu}_{i}$. In effect, we can regroup terms in Equation 16 that have the same rates such that formally we obtain an effective new group of neurons from the excitatory and inhibitory SPs of the current subnetwork and the corresponding excitatory SPs of the previous subnetwork, as indicated by the square brackets in the following formulas:
with $\beta ={K}_{\mathrm{X}}/{K}_{\mathrm{E}}$, $\gamma ={K}_{\mathrm{I}}/{K}_{\mathrm{E}}$, and $\mathcal{J}=\tau {K}_{\mathrm{E}}J$.
For the parameters $g$ and $\gamma $ chosen here, ${\kappa}_{\mathrm{S},\mathrm{NS}}$, ${\kappa}_{\mathrm{NS},\mathrm{S}}$, and ${\kappa}_{\mathrm{NS},\mathrm{NS}}$ in Equations 17 and 18 are always negative for any modularity $m$ due to the large recurrent inhibition. Therefore, for the nonstimulated group, $\kappa <0$ in Equation 8 (see main text), such that one always finds a single fixed point, which, as desired, is at a low rate. Interestingly, the excitatory feedforward connections can switch the sign of ${\kappa}_{\mathrm{S},\mathrm{S}}$ from negative to positive for large values of $m$, thereby rendering the active group effectively excitatory, leading to a saddlenode bifurcation and the emergence of a stable highactivity fixed point (see Figure 7b in the main text).
The structure of fixed points can also be understood by studying the potential landscape of the system: Equation 14 can be regarded as the fixedpoint solution of the following evolution equations for the stimulated and nonstimulated subpopulations (Wong and Wang, 2006; Schuecker et al., 2017)
where ${\mathrm{\Phi}}_{\mathrm{S}}$ and ${\mathrm{\Phi}}_{\mathrm{NS}}$ are defined via the righthand side of Equation 14 with ${\mu}^{\mathrm{S}}$ and ${\mu}^{\mathrm{NS}}$ inserted as defined in Equations 17 and 18 (and likewise for ${\sigma}^{\mathrm{S}}$ and ${\sigma}^{\mathrm{NS}}$). Due to the asymmetry in connections between stimulated and nonstimulated subpopulations, the righthand side of Equations 19 and 20 cannot be interpreted as a conservative force. Following the idea of effective response functions (Mascaro and Amit, 1999), a potential $U\left({\nu}^{\mathrm{S}}\right)$ for the stimulated subpopulation alone can, however, be defined by inserting the solution ${\nu}^{\mathrm{NS}}=f\left({\nu}^{\mathrm{S}}\right)$ of Equation 20 into Equation 19
and interpreting the righthand side as a conservative force $F=\frac{dU}{d{\nu}^{\mathrm{S}}}$ (LitwinKumar and Doiron, 2012). The potential then follows from integration as
where $U\left(0\right)$ is an inconsequential constant. We solved the latter integral numerically using the scipy.integrate.trapz function of SciPy (Virtanen et al., 2020). The minima and maxima of the resulting potential correspond to locally stable and unstable fixed points, respectively. Note that while this singlepopulation potential is useful to study the structure of fixed points, the full dynamics of all populations and global stability cannot be straightforwardly infered from this reduced picture (Mascaro and Amit, 1999; Rost et al., 2018), here for two reasons: (1) For spiking networks, Equation 19 and Equation 20 do not describe the real dynamics of the mean activity. Their righthand side only defines the stationary state solution. (2) The global stability of fixed points also depends on the time constants of all subpopulations’ mean activities (here ${\tau}_{\mathrm{S}}$ and ${\tau}_{\mathrm{NS}}$), but the temporal dynamics of the nonstimulated subpopulations is neglected here.
Meanfield analysis for two input streams
In the case of two simultaneously active stimuli (see Section ‘Input integration and multistability’), if the stimulated group 1 is in the highactivity state with rate ${\nu}^{\mathrm{S1}}$, the second stimulated group 2 will receive an additional nonvanishing input of the form
which is negative for all values of $m$ and can therefore lead to the silencing of group 2. If the stimuli are similarly strong, network fluctuations can dynamically switch the roles of the stimulated groups 1 and 2.
The dynamics and fixedpoint structure in deep subnetworks can be studied using a twodimensional potential landscape that is defined via the following evolution equation
where $f({\nu}^{\mathrm{S1}},{\nu}^{\mathrm{S2}})={\nu}^{\mathrm{NS}}$ is the fixed point of the nonstimulated subpopulations for given rates ${\nu}^{\mathrm{S1}},{\nu}^{\mathrm{S2}}$ of the two stimulated subpopulations, respectively. The functions ${\mathrm{\Phi}}_{\mathrm{S1}}$ and ${\mathrm{\Phi}}_{\mathrm{S2}}$ are again defined via the righthand side of Equation 14 with inserted ${\mu}^{\mathrm{S1}}$, ${\mu}^{\mathrm{S2}}$ and ${\mu}^{\mathrm{NS}}$ that are defined as follows (derivation analogous to the singleinput case):
Due to the symmetry between the two stimulated subpopulations, the righthand side of Equations 24 and 25 can be viewed as a conservative force $\mathit{F}$ of the potential $U({\nu}^{\mathrm{S}1},{\nu}^{\mathrm{S}2})={\int}_{\mathcal{C}}\mathit{F}\phantom{\rule{thinmathspace}{0ex}}ds$, where we parameterized the line integral along the path $\nu :[0,1]\to \mathcal{C},t\mapsto t\cdot ({\nu}^{\mathrm{S}1},{\nu}^{\mathrm{S}2})$, which yields
The numerical evaluation of this twodimensional potential is shown in Figure 9—figure supplement 2, whereas sketches in Figure 9e show a onedimensional section (gray lines in Figure 9—figure supplement 2) that goes antidiagonal through the two minima corresponding to one population being in the highactivity state and the other one being in the lowactivity state.
Critical modularity for piecewise linear activation function
To obtain a closedform analytic solution for the critical modularity, in the following we consider a neuron model with piecewise linear activation function
for $\mu \in [{\mu}_{\mathrm{min}},{\mu}_{\mathrm{max}}]$, $\nu \left(\mu \right)=0$ for $\mu <{\mu}_{\mathrm{m}\mathrm{i}\mathrm{n}}$ and $\nu \left(\mu \right)={\nu}_{\mathrm{max}}$ for $\mu >{\mu}_{\mathrm{m}\mathrm{a}\mathrm{x}}$ (Figure 8a). Successful denoising requires the nonstimulated subpopulations to be silent, ${\nu}^{NS}=0$, and the stimulated subpopulations to be active, ${\nu}^{S}>0$. We first study solutions where $0<{\nu}^{S}<{\nu}_{\mathrm{m}\mathrm{a}\mathrm{x}}$ and afterwards the case where ${\nu}^{S}={\nu}_{\mathrm{max}}$. Inserting Equation 31 into Equations 9 and 10, we obtain
The first equation can be solved for ${\mu}^{\mathrm{S}}$
which holds for
Requirement (Equation 33) is equivalent to an inequality for $m$
that, depending on the dynamic range of the neuron, the strength of the external background input and the recurrence, yields
as an upper or lower bound for the modularity (Figure 8). Requirement (Equation 34) with the solution (Equation 32) for ${\mu}^{\mathrm{S}}$ inserted yields a further lower bound
for the modularity that is required for denoising. This criterion is independent of the external background input and the recurrence of the SSN.
Now we turn to the saturated scenario ${\nu}^{S}={\nu}_{\mathrm{max}}$ and ${\nu}^{NS}=0$ and obtain
with the criteria
The first criterion (Equation 37) yields the same critical value (Equation 35) that for ${\mu}_{\mathrm{m}\mathrm{a}\mathrm{x}}\alpha \mathcal{J}{\nu}_{\mathrm{X}}\frac{\mathcal{J}}{{N}_{\mathrm{C}}}\left(1+\gamma g\right){\nu}_{\mathrm{m}\mathrm{a}\mathrm{x}}\ge 0$ is a lower bound and otherwise an upper bound. The second criterion (Equation 38) yields an additional lower bound for $\mathcal{J}(1\alpha ){\nu}_{\mathrm{m}\mathrm{a}\mathrm{x}}({N}_{\mathrm{C}}1)\left({\mu}_{\mathrm{m}\mathrm{i}\mathrm{n}}\alpha \mathcal{J}{\nu}_{\mathrm{X}}\frac{\mathcal{J}}{{N}_{\mathrm{C}}}\left(1+\gamma g\right){\nu}_{\mathrm{m}\mathrm{a}\mathrm{x}}\right)\ge 0$ (Figure 8):
The above criteria yield necessary conditions for the existence of a fixed point with ${\nu}^{S}>0$ and ${\nu}^{NS}=0$. Next we study the stability of such solutions. This works analogous to the stability in the spiking models discussed in Section ‘Effective connectivity and stability analysis’ by studying the spectrum of the effective connectivity matrix. For the model Equation 31, the effective connectivity is given by
with ${\nu}^{\prime}\left(\mu \right)=\frac{d\nu}{d\mu}\left(\mu \right)$ and $\mathcal{J}}_{ij}={\tau}_{x}{J}_{ij$. On the level of stimulated and nonstimulated subpopulations across layers, the effective connectivity becomes
with eigenvalues
The saturated fixed point ${\nu}^{S}={\nu}_{\mathrm{max}}$ and ${\nu}^{NS}=0$ has ${\nu}^{\prime}\left({\mu}^{\mathrm{S}}\right)={\nu}^{\prime}\left({\mu}^{\mathrm{NS}}\right)=0$, leading to ${\lambda}_{\pm}=0$. This fixed point is always stable. The nonsaturated fixed point also has ${\nu}^{\prime}\left({\mu}^{\mathrm{NS}}\right)=0$. Consequently, Equation 42 simplifies to ${\lambda}_{}=0$ and
For $\lambda >1$ fluctuations in the stimulated subpopulation are being amplified. These fluctuations also drive fluctuations of the nonstimulated subpopulation via the recurrent coupling. The fixed point thus becomes unstable and the necessary distinction between the stimulated and nonstimulated subpopulations vanishes. For inhibitiondominated recurrence, ${\kappa}_{\mathrm{S},\mathrm{S}}\left(m\right)$ is small enough to obtain stable fixed points at nonsaturated rates (Figure 8c). In the case of no recurrence or excitationdominated recurrence, ${\kappa}_{\mathrm{S},\mathrm{S}}\left(m\right)$ is much larger, typically driving ${\lambda}_{+}$ across the line of instability and preventing nonsaturated fixed points to be stable. In such networks, only the saturated fixed point at ${\nu}^{S}={\nu}_{\mathrm{max}}$ is stable and reachable (Figure 8d and e).
Data availability
The current manuscript is a computational study, so no data have been generated for this manuscript. Modelling code can be found at https://doi.org/10.5281/zenodo.6326496 (see also Supplementary Files). Source data and code files are also attached as zip folders to the individual main figures of this manuscript.
References

On embedding synfire chains in a balanced networkNeural Computation 15:1321–1340.https://doi.org/10.1162/089976603321780290

Memory capacity of balanced networksNeural Computation 17:691–713.https://doi.org/10.1162/0899766053019962

Representation learning: a review and new perspectivesIEEE Transactions on Pattern Analysis and Machine Intelligence 35:1798–1828.https://doi.org/10.1109/TPAMI.2013.50

The topography of visuospatial attention as revealed by a novel visual field mapping techniqueJournal of Cognitive Neuroscience 21:1447–1460.https://doi.org/10.1162/jocn.2009.21005

Dynamics of networks of randomly connected excitatory and inhibitory spiking neuronsJournal of Physiology, Paris 94:445–463.https://doi.org/10.1016/s09284257(00)010846

Normalization as a canonical neural computationNature Reviews. Neuroscience 13:51–62.https://doi.org/10.1038/nrn3136

Neutral stability, rate propagation, and critical branching in feedforward networksNeural Computation 25:1768–1806.https://doi.org/10.1162/NECO_a_00461

Pulvinar thalamic nucleus allows for asynchronous spike propagation through the cortexFrontiers in Computational Neuroscience 9:60.https://doi.org/10.3389/fncom.2015.00060

Neuronal circuits of the neocortexAnnual Review of Neuroscience 27:419–451.https://doi.org/10.1146/annurev.neuro.27.070203.144152

Dynamic stability of sequential stimulus representations in adapting neuronal networksFrontiers in Computational Neuroscience 8:124.https://doi.org/10.3389/fncom.2014.00124

ConferenceEncoding symbolic sequences with spiking neural reservoirs2018 International Joint Conference on Neural Networks (IJCNN.https://doi.org/10.1109/IJCNN.2018.8489114

Dynamics of the firing probability of noisy integrateandfire neuronsNeural Computation 14:2057–2110.https://doi.org/10.1162/089976602320264015

A theory of cortical responsesPhilosophical Transactions of the Royal Society B 360:815–836.https://doi.org/10.1098/rstb.2005.1622

Neocortical network activity in vivo is generated through a dynamic balance of excitation and inhibitionThe Journal of Neuroscience 26:4535–4545.https://doi.org/10.1523/JNEUROSCI.529705.2006

Echoes in correlated neural systemsNew Journal of Physics 15:023002.https://doi.org/10.1088/13672630/15/2/023002

A model of saliencybased visual attention for rapid scene analysisIEEE Transactions on Pattern Analysis and Machine Intelligence 20:1254–1259.https://doi.org/10.1109/34.730558

Topographic maps are fundamental to sensory processingBrain Research Bulletin 44:107–112.https://doi.org/10.1016/s03619230(97)000944

Advances in Neural Information Processing SystemsOptimal architectures in a solvable model of deep networks, Advances in Neural Information Processing Systems, Curran Associates, Inc.

Bridging structure and function: a model of sequence learning and prediction in primary visual cortexPLOS Computational Biology 14:e1006187.https://doi.org/10.1371/journal.pcbi.1006187

Conditions for propagating synchronous spiking and asynchronous firing rates in a cortical network modelThe Journal of Neuroscience 28:5268–5280.https://doi.org/10.1523/JNEUROSCI.254207.2008

The highconductance state of cortical networksNeural Computation 20:1–43.https://doi.org/10.1162/neco.2008.20.1.1

Spiking activity propagation in neuronal networks: reconciling different perspectives on neural codingNature Reviews. Neuroscience 11:615–627.https://doi.org/10.1038/nrn2886

Neurons in primate visual cortex alternate between responses to multiple stimuli in their receptive fieldFrontiers in Computational Neuroscience 10:141.https://doi.org/10.3389/fncom.2016.00141

Slow dynamics and high variability in balanced cortical networks with clustered connectionsNature Neuroscience 15:1498–1505.https://doi.org/10.1038/nn.3220

Reservoir computing approaches to recurrent neural network trainingComputer Science Review 3:127–149.https://doi.org/10.1016/j.cosrev.2009.03.005

Effective neural response function for collective population statesNetwork 10:351–373.

Neuronal networks: flipflops in the brainCurrent Biology 15:R294–R296.https://doi.org/10.1016/j.cub.2005.04.009

Modular and hierarchically modular organization of brain networksFrontiers in Neuroscience 4:200.https://doi.org/10.3389/fnins.2010.00200

NoiseInduced alternations in an attractor network model of perceptual bistabilityJournal of Neurophysiology 98:1125–1139.https://doi.org/10.1152/jn.00116.2007

Neural mechanisms subserving cutaneous sensibility, with special reference to the role of afferent inhibition in sensory perception and discriminationBulletin of the Johns Hopkins Hospital 105:201–232.

Thalamic control of functional cortical connectivityCurrent Opinion in Neurobiology 44:127–131.https://doi.org/10.1016/j.conb.2017.04.001

Random graphs with clusteringPhysical Review Letters 103:058701.https://doi.org/10.1103/PhysRevLett.103.058701

Perceptual awareness and active inferenceNeuroscience of Consciousness 2019:09.https://doi.org/10.1093/nc/niz012

Topographic organization in the brain: searching for general principlesTrends in Cognitive Sciences 18:351–363.https://doi.org/10.1016/j.tics.2014.03.008

Reproducing polychronization: a guide to maximizing the reproducibility of spiking network modelsFrontiers in Neuroinformatics 12:46.https://doi.org/10.3389/fninf.2018.00046

Transient cognitive dynamics, metastability, and decision makingPLOS Computational Biology 4:e1000072.https://doi.org/10.1371/journal.pcbi.1000072

Eigenvalue spectra of random matrices for neural networksPhysical Review Letters 97:188104.https://doi.org/10.1103/PhysRevLett.97.188104

Transmission of populationcoded informationNeural Computation 24:391–407.https://doi.org/10.1162/NECO_a_00227

Variability in neural activity and behaviorCurrent Opinion in Neurobiology 25:211–220.https://doi.org/10.1016/j.conb.2014.02.013

Emergence of slowswitching assemblies in structured neuronal networksPLOS Computational Biology 11:e1004196.https://doi.org/10.1371/journal.pcbi.1004196

Computation by switching in complex networks of statesPhysical Review Letters 109:018701.https://doi.org/10.1103/PhysRevLett.109.018701

Modulated escape from a metastable state driven by colored noisePhysical Review. E, Statistical, Nonlinear, and Soft Matter Physics 92:052119.https://doi.org/10.1103/PhysRevE.92.052119

Fundamental activity constraints lead to specific interpretations of the connectomePLOS Computational Biology 13:e1005179.https://doi.org/10.1371/journal.pcbi.1005179

Noise, neural codes and cortical organizationCurrent Opinion in Neurobiology 4:569–579.https://doi.org/10.1016/09594388(94)900590

The variable discharge of cortical neurons: implications for connectivity, computation, and information codingThe Journal of Neuroscience 18:3870–3896.https://doi.org/10.1523/JNEUROSCI.181003870.1998

The role of the thalamus in the flow of information to the cortexPhilosophical Transactions of the Royal Society of London. Series B, Biological Sciences 357:1695–1708.https://doi.org/10.1098/rstb.2002.1161

Topographic maps in human frontal and parietal cortexTrends in Cognitive Sciences 13:488–495.https://doi.org/10.1016/j.tics.2009.08.005

The spread of rate and correlation in stationary cortical networksNeurocomputing 52–54:949–954.https://doi.org/10.1016/S09252312(02)008548

Decorrelation of neuralnetwork activity by inhibitory feedbackPLOS Computational Biology 8:e1002596.https://doi.org/10.1371/journal.pcbi.1002596

A neurobiologically constrained cortex model of semantic grounding with spiking neurons and brainlike connectivityFrontiers in Computational Neuroscience 12:88.https://doi.org/10.3389/fncom.2018.00088

Nearly extensive sequential memory lifetime achieved by coupled nonlinear neuronsNeural Computation 24:2678–2699.https://doi.org/10.1162/NECO_a_00324

ConferenceThe Best Spike Filter Kernel Is a NeuronConference on Cognitive Computational Neuroscience.

Is perception discrete or continuous?Trends in Cognitive Sciences 7:207–213.https://doi.org/10.1016/s13646613(03)000950

Signal propagation and logic gating in networks of integrateandfire neuronsThe Journal of Neuroscience 25:10786–10795.https://doi.org/10.1523/JNEUROSCI.350805.2005

Imaging retinotopic maps in the human brainVision Research 51:718–737.https://doi.org/10.1016/j.visres.2010.08.004

A recurrent network mechanism of time integration in perceptual decisionsThe Journal of Neuroscience 26:1314–1328.https://doi.org/10.1523/JNEUROSCI.373305.2006

Passing the message: representation transfer in modular balanced networksFrontiers in Computational Neuroscience 13:79.https://doi.org/10.3389/fncom.2019.00079

Robust information propagation through noisy neural circuitsPLOS Computational Biology 13:e1005497.https://doi.org/10.1371/journal.pcbi.1005497
Article and author information
Author details
Funding
Initiative and Networking Fund of the Helmholtz Association
 Barna Zajzon
 Abigail Morrison
 Renato Duarte
 David Dahmen
Helmholtz Portfolio theme Supercomputing and Modeling for the Human Brain
 Barna Zajzon
 Abigail Morrison
 Renato Duarte
Excellence Initiative of the German federal and state governments (G:(DE82)EXSSFneuroIC002)
 Barna Zajzon
 Abigail Morrison
 Renato Duarte
Helmholtz Association (VHNG1028)
 David Dahmen
European Commission HBP (945539)
 David Dahmen
The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.
Acknowledgements
The authors gratefully acknowledge the computing time granted by the JARAHPC Vergabegremium on the supercomputer JURECA at Forschungszentrum Jülich.
Version history
 Preprint posted: January 12, 2022 (view preprint)
 Received: January 12, 2022
 Accepted: January 25, 2023
 Accepted Manuscript published: January 26, 2023 (version 1)
 Version of Record published: March 2, 2023 (version 2)
Copyright
© 2023, Zajzon et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 735
 views

 149
 downloads

 0
 citations
Views, downloads and citations are aggregated across all versions of this paper published by eLife.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading

 Neuroscience
Cortical folding is an important feature of primate brains that plays a crucial role in various cognitive and behavioral processes. Extensive research has revealed both similarities and differences in folding morphology and brain function among primates including macaque and human. The folding morphology is the basis of brain function, making crossspecies studies on folding morphology important for understanding brain function and species evolution. However, prior studies on crossspecies folding morphology mainly focused on partial regions of the cortex instead of the entire brain. Previously, our research defined a wholebrain landmark based on folding morphology: the gyral peak. It was found to exist stably across individuals and ages in both human and macaque brains. Shared and unique gyral peaks in human and macaque are identified in this study, and their similarities and differences in spatial distribution, anatomical morphology, and functional connectivity were also dicussed.

 Neuroscience
Complex skills like speech and dance are composed of ordered sequences of simpler elements, but the neuronal basis for the syntactic ordering of actions is poorly understood. Birdsong is a learned vocal behavior composed of syntactically ordered syllables, controlled in part by the songbird premotor nucleus HVC (proper name). Here, we test whether one of HVC’s recurrent inputs, mMAN (medial magnocellular nucleus of the anterior nidopallium), contributes to sequencing in adult male Bengalese finches (Lonchura striata domestica). Bengalese finch song includes several patterns: (1) chunks, comprising stereotyped syllable sequences; (2) branch points, where a given syllable can be followed probabilistically by multiple syllables; and (3) repeat phrases, where individual syllables are repeated variable numbers of times. We found that following bilateral lesions of mMAN, acoustic structure of syllables remained largely intact, but sequencing became more variable, as evidenced by ‘breaks’ in previously stereotyped chunks, increased uncertainty at branch points, and increased variability in repeat numbers. Our results show that mMAN contributes to the variable sequencing of vocal elements in Bengalese finch song and demonstrate the influence of recurrent projections to HVC. Furthermore, they highlight the utility of species with complex syntax in investigating neuronal control of ordered sequences.