A reservoir of timescales emerges in recurrent circuits with heterogeneous neural assemblies
Abstract
The temporal activity of many physical and biological systems, from complex networks to neural circuits, exhibits fluctuations simultaneously varying over a large range of timescales. Long-tailed distributions of intrinsic timescales have been observed across neurons simultaneously recorded within the same cortical circuit. The mechanisms leading to this striking temporal heterogeneity remain unknown. Here, we show that neural circuits, endowed with heterogeneous neural assemblies of different sizes, naturally generate multiple timescales of activity spanning several orders of magnitude. We develop an analytical theory using rate networks, supported by simulations of spiking networks with cell-type specific connectivity, to explain how neural timescales depend on assembly size and show that our model can naturally explain the long-tailed timescale distribution observed in the awake primate cortex. When driving recurrent networks of heterogeneous neural assemblies by a time-dependent broadband input, we find that large and small assemblies preferentially entrain slow and fast spectral components of the input, respectively. Our results suggest that heterogeneous assemblies can provide a biologically plausible mechanism for neural circuits to demix complex temporal input signals by transforming temporal into spatial neural codes via frequency-selective neural assemblies.
Editor's evaluation
This fundamental work uses computational network models to suggest a possible origin of the wide range of time scales observed in cortical activity. This claim is supported by convincing evidence based on comparisons between mathematical theory, simulations of spiking network models, and analysis of recordings from the orbitofrontal cortex. This manuscript will be of interest to the broad community of systems and computational neuroscience.
https://doi.org/10.7554/eLife.86552.sa0

Introduction
Experimental evidence shows that the temporal activity of many physical and biological systems exhibits fluctuations simultaneously varying over a large range of timescales. In condensed matter physics, spin glasses typically exhibit aging and relaxation effects whose timescales span several orders of magnitude (Bouchaud, 1992). In biological systems, metabolic networks of E. coli generate fluxes with a power-law distribution of rates (Almaas et al., 2004; Emmerling et al., 2002). Gas release in yeast cultures exhibits frequency distributions spanning many orders of magnitude (Roussel and Lloyd, 2007), endowing them with robust and flexible responses to the environment (Aon et al., 2008). In the mammalian brain, a hierarchy of timescales in the activity of single neurons is observed across different cortical areas from occipital to frontal regions (Murray et al., 2014; Siegle et al., 2019; Gao et al., 2020). Moreover, neurons within the same local circuit exhibit a large range of timescales, from milliseconds to minutes (Bernacchia et al., 2011; Cavanagh et al., 2016; Miri et al., 2011). This heterogeneity of neuronal timescales was observed in awake animals during periods of ongoing activity, in the absence of external stimuli or behavioral tasks, suggesting that long-tailed distributions of intrinsic timescales may be an inherent property of recurrent cortical circuits. Recent studies highlighted the benefits of leveraging computations on multiple timescales when performing complex tasks in primates (Iigaya et al., 2019) as well as in artificial neural networks (Perez-Nieves et al., 2021). However, the neural mechanisms underlying the emergence of multiple timescales are not yet understood.
Here, we present a simple and robust neural mechanism generating heterogeneous timescales of activity in recurrent circuits. The central feature of our model is a heterogeneous distribution of cell assemblies, a common ingredient observed in cortical architecture (Perin et al., 2011; Marshel et al., 2019; Figure 1a). We first demonstrate that rate networks, whose rate units represent cell-assemblies, can generate long-tailed distributions of timescales when endowed with heterogeneous assemblies (Figure 1b). We then show that the heterogeneity of timescales, observed in electrophysiological recordings from awake primate cortex (Cavanagh et al., 2016), can be explained by the presence of heterogeneous cell assemblies (Figure 1b). Using methods from statistical physics, we develop an analytical framework explaining how an assembly’s intrinsic timescale depends on size, revealing the emergence of a new chaotic regime where activity is bistable. We show that our theory applies to biologically plausible models of cortical circuits based on spiking networks with cell-type specific clustered architectures.
We then study the stimulus-response properties of networks with heterogeneous assemblies. In networks with homogeneous assemblies, chaotic activity is suppressed at a single resonant frequency (Rajan et al., 2010). However, when we drive heterogeneous networks with a time-dependent broadband input featuring a superposition of multiple frequencies, we find that the chaotic activity is suppressed at multiple frequencies, which depend on each assembly's own size. Large and small assemblies are preferentially entrained by the low- and high-frequency components of the input, respectively (Figure 1c). This spectral specificity suggests that a reservoir of timescales may be a natural mechanism for cortical circuits to flexibly demix different spectral features of complex time-varying inputs. This mechanism may endow neural circuits with the ability to transform temporal neural codes into spatial neural codes via frequency-selective neural assemblies.
Results
To develop a theory of heterogeneous timescales, we first focus on random neuronal networks whose rate units are recurrently connected, with couplings that are chosen randomly. In this model, we are able to leverage analytical methods from statistical field theory (Sompolinsky et al., 1988; Buice and Chow, 2013; Helias and Dahmen, 2020) to analytically link model parameters to circuit dynamics. In our rate network model, each network unit represents a functional assembly of cortical neurons with similar response properties. We interpret the unit's self-coupling as the size of the corresponding neural assembly (if recurrent couplings across the population vary significantly, we also interpret the self-coupling as representing the average coupling strength within an assembly). In the case where the self-couplings are zero or weak (order $1/\sqrt{N}$, with $N$ being the size of the network), random networks are known to undergo a phase transition from silence to chaos when the variance of the random couplings exceeds a critical value (Sompolinsky et al., 1988). When the self-couplings are strong (order 1) and are all equal, a third phase appears, featuring multiple stable fixed points accompanied by long transient activity (Stern et al., 2014). In all these cases, all network units exhibit the same intrinsic timescale, estimated from their autocorrelation function. Here, we demonstrate a novel class of recurrent networks, capable of generating temporally heterogeneous activity whose multiple timescales span several orders of magnitude. We show that when the self-couplings are heterogeneous, a reservoir of multiple timescales emerges, where each unit's intrinsic timescale depends both on its own self-coupling and on the network's self-coupling distribution.
Random networks with heterogeneous self-couplings
We start by considering a recurrent network of $N$ rate units obeying the dynamical equations

$$\frac{dx_i}{dt} = -x_i + s_i\,\phi(x_i) + g \sum_{j \neq i} J_{ij}\,\phi(x_j), \qquad (1)$$

where the random couplings $J_{ij}$ from unit $j$ to unit $i$ are drawn independently from a Gaussian distribution with mean 0 and variance $1/N$; $g$ represents the network gain and we chose a transfer function $\phi(x) = \tanh(x)$. The self-couplings $s_i$ are drawn from a distribution $P(s)$. The special case of equal self-couplings ($s_i = s$ for all $i$) was studied by Stern et al., 2014, and a summary of the results can be found in Appendix 1 for convenience. Here, we study the network properties in relation to both discrete and continuous distributions $P(s)$.
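For illustration, these dynamics can be simulated directly; the following minimal Python sketch (our own illustrative code with arbitrary parameter choices, not those used for the figures) integrates Equation 1 with an Euler scheme and a lognormal distribution of self-couplings:

```python
import numpy as np

rng = np.random.default_rng(0)
N, g, dt, T = 500, 2.0, 0.01, 100.0             # size, gain, time step, duration
s = rng.lognormal(mean=0.5, sigma=0.5, size=N)  # heterogeneous self-couplings s_i
J = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))  # couplings with variance 1/N
np.fill_diagonal(J, 0.0)                        # self-interaction carried by s_i

x = rng.normal(0.0, 1.0, size=N)                # random initial condition
n_steps = int(T / dt)
traj = np.empty((n_steps, N))
for t in range(n_steps):
    phi = np.tanh(x)
    x += dt * (-x + s * phi + g * (J @ phi))    # Euler step of Equation 1
    traj[t] = x
```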
Using standard methods of statistical field theory (Buice and Chow, 2013; Helias and Dahmen, 2020; see Methods: 'Dynamic mean-field theory with multiple self-couplings' for details), in the limit of large $N$ we can average over realizations of the disordered couplings to derive a set of self-consistent dynamic mean-field equations, one for each sub-population of units with self-coupling strength $s_a$:

$$\frac{dx_a}{dt} = -x_a + s_a\,\phi(x_a) + \eta(t). \qquad (2)$$
In our notation, $\{s_a\}$ denotes the set of $A$ different values of self-couplings, indexed by $a = 1, \ldots, A$; we denote by $N_a$ the number of units with the same self-coupling $s_a$, and accordingly by $n_a = N_a / N$ their fraction. The mean-field $\eta(t)$ is the same Gaussian process for all units and has zero mean and autocorrelation

$$\langle \eta(t)\,\eta(t') \rangle = g^2\, C(t, t'), \qquad C(t, t') = \sum_a n_a \big\langle \phi\big(x_a(t)\big)\,\phi\big(x_a(t')\big) \big\rangle, \qquad (3)$$

where $\langle \cdot \rangle$ denotes an average over the mean-field.
We found that networks with heterogeneous self-couplings exhibit a complex landscape of fixed points $x^*$, obtained as the self-consistent solutions to the static version of Equation 2 and Equation 3, subject to stability conditions (see Methods: 'Fixed points and transition to chaos' and 'Stability conditions'). For fixed values of the network gain $g$, these fixed points can be destabilized by varying the self-couplings of different assemblies, inducing a transition to time-varying chaotic activity (Figure 2). The fixed-point landscape exhibits remarkable features inherited directly from the single-value self-coupling case, studied extensively in Stern et al., 2014. Here, we focus on the dynamical properties of the time-varying chaotic activity, which constitute new features resulting from the heterogeneity of the self-couplings. We illustrate the network's dynamical features in the case of a network with two sub-populations, comprising fractions $n_1$ and $n_2$ of the units with self-couplings $s_1$ and $s_2$, respectively. In the $s_1$-$s_2$ plane, this model gives rise to a phase diagram with a single chaotic region separating four disconnected stable fixed-point regions (Figure 2a). In the case of a Gaussian distribution of self-couplings in the stable fixed-point regime, a complex landscape of stable fixed points emerges. The unit values at the stable fixed points continuously interpolate between a uni-modal distribution around zero (for units with $s_i < 1$) and a bi-modal distribution (for units with $s_i > 1$) within the same network (Figure 3a).
A reservoir of heterogeneous timescales explains cortical recordings
In the chaotic phase, we can estimate the intrinsic timescale $\tau_i$ of a unit from its autocorrelation function, as the half-width of the autocorrelation at half maximum (Figure 2a-iv). The chaotic phase in the network, Equation 1, is characterized by a large range of timescales that can be simultaneously realized across the units with different self-couplings. In a network with two self-couplings $s_1 < s_2$ in the chaotic regime, we found that the ratio of the timescales $\tau_2 / \tau_1$ increases as we increase the self-couplings ratio $s_2 / s_1$ (Figure 2b). The separation of timescales depends on the relative fractions $n_1$ and $n_2$ of the fast and slow populations, respectively. When the fraction $n_2$ of the slow population approaches zero (down to a single slow unit), the log of the timescale ratio exhibits a supralinear dependence on the self-couplings ratio, as described analytically in Methods ('Universal colored-noise approximation to the Fokker-Planck theory'), with a simplified estimation given in Equation 4, leading to a vast separation of timescales. For other choices of the fractions, the timescale separation approaches this supralinear form as the fraction $n_1$ of the fast population increases. We note that all uses of 'log', for evaluating the timescale growth and otherwise, assume base $e$.
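This timescale estimate can be computed non-parametrically from a simulated trace; a minimal sketch (reusing the array traj and step dt from the simulation sketch above) is:

```python
import numpy as np

def intrinsic_timescale(x, dt):
    """Half-width at half maximum of the autocorrelation of a single trace."""
    x = x - x.mean()
    c = np.correlate(x, x, mode="full")[x.size - 1:]  # non-negative lags
    c /= c[0]                                         # normalize so C(0) = 1
    below = np.flatnonzero(c < 0.5)
    return below[0] * dt if below.size else np.inf    # first crossing of 1/2

# e.g., tau_i = intrinsic_timescale(traj[:, i], dt) for each unit i
```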
In the case of a lognormal distribution of self-couplings, in the chaotic regime the network generates a reservoir of multiple timescales $\tau_i$ of chaotic activity across network units, spanning several orders of magnitude (Figure 3b). For long-tailed distributions such as the lognormal, mean-field theory can generate predictions for rare units with large self-couplings from the tail of the distribution, by solving Equation 2 and the continuous version of Equation 3 (see Methods: 'Dynamic mean-field theory with multiple self-couplings', Equation 13). The solution highlights the exponential relation between a unit's self-coupling and its autocorrelation decay time (Figure 3aii, purple dashed line).
During periods of ongoing activity, the distribution of single-cell autocorrelation timescales in primate cortex was found to be right-skewed and approximately lognormal, ranging from 10 ms to a few seconds (Cavanagh et al., 2016; Figure 3bi). Can the reservoir of timescales generated in our heterogeneous network model explain the distribution of timescales observed in primate cortex? We found that a model with a lognormal distribution of self-couplings can generate a long-tailed distribution of timescales which fits the distribution observed in awake primate orbitofrontal cortex (Figure 3bi). This result shows that neural circuits with heterogeneous assemblies can naturally generate the heterogeneity in intrinsic timescales observed in cortical circuits from awake primates.
Separation of timescales in the bistable chaotic regime
To gain an analytical understanding of the parametric separation of timescales in networks with heterogeneous self-couplings, we consider the special case of a network with two self-couplings, where a large sub-population of $N - 1$ units with self-coupling $s_1$ comprises all but one slow probe unit with large self-coupling $s$ (see Methods: 'Universal colored-noise approximation to the Fokker-Planck theory' for details). In the large-$N$ limit, we can neglect the backreaction of the probe unit on the mean-field and approximate the latter as an external Gaussian colored noise with autocorrelation $g^2 C(t - t')$, independent of $s$. The noise then represents the effect on the probe unit of all other units in the network and can be parameterized by the noise strength $\sigma^2$ and its timescale (color) $\tau_\eta$. For large $s$, the dynamics of the probe unit leads to the emergence of a bi-stable chaotic phase whereby its activity is localized around the critical points $x^* \approx \pm s$ (Figure 4a-i) and switches between them at random times. In the regime of colored noise (as we have here, with $\tau_\eta \gg 1$), the stationary probability distribution $p(x)$ (Figure 4a-ii and b) satisfies the unified colored noise approximation to the Fokker-Planck equation (see Methods: 'Universal colored-noise approximation to the Fokker-Planck theory'; Hänggi and Jung, 1995; Jung and Hänggi, 1987), based on an analytical expression for its effective potential $U_{\mathrm{eff}}(x)$ as a function of the self-coupling $s$ and the noise color $\tau_\eta$. The distribution is concentrated around the two minima of $U_{\mathrm{eff}}$. The main effect of the strong color is to sharply decrease the variance of the distribution around the minima, compared to the white-noise case ($\tau_\eta \to 0$). This is evident from comparing the colored noise with white noise (Figure 4a-iv,v,vi).
In our network with colored noise, the probe unit's temporal dynamics are captured by the mean first passage time $\bar{T}$ for the escape out of the potential well defined by the effective potential $U_{\mathrm{eff}}$, yielding good agreement with simulations at increasing noise color $\tau_\eta$, as expected on theoretical grounds (Hänggi and Jung, 1995; Jung and Hänggi, 1987; Figure 4c). The asymptotic scaling of the mean first passage time for large $s$ is

$$\log \bar{T}(s) \propto s^2. \qquad (4)$$
In this slow-probe regime, we thus achieved a parametric separation of timescales between the population with self-coupling $s_1$, with its intrinsic timescale $\tau_1$, and the probe unit, whose activity fluctuations exhibit two separate timescales: the slow timescale $\bar{T}$ of the dwelling in each of the bistable states and the fast timescale of the fluctuations around the metastable states (obtained by expanding the dynamical equation around the metastable values $x^* \approx \pm s$). One can generalize this metastable regime to a network with $N - p$ units belonging to a sub-population with self-coupling $s_1$ and $p$ slow probe units with large self-couplings $s_i \gg 1$, for $i = 1, \ldots, p$. The slow dynamics of each probe unit is captured by its own mean first passage time between the bistable states (Equation 22), and all slow units are driven by a shared external colored noise with timescale $\tau_\eta$. In summary, in our model multiple timescales can be robustly generated with specific values, varying over several orders of magnitude.
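This bistable slow-probe regime is easy to reproduce numerically; in the following sketch (illustrative parameters; an Ornstein-Uhlenbeck process stands in for the self-consistent colored mean-field), the dwell times between the two states provide the slow timescale:

```python
import numpy as np

rng = np.random.default_rng(1)
s, sigma, tau_eta = 2.0, 1.5, 4.0     # probe self-coupling, noise strength, color
dt, n_steps = 0.01, 200_000
x, eta = 2.0, 0.0                     # start in the right well
signs = np.empty(n_steps)
for t in range(n_steps):
    # Ornstein-Uhlenbeck noise with std sigma and correlation time tau_eta
    eta += dt * (-eta / tau_eta) + np.sqrt(2 * sigma**2 * dt / tau_eta) * rng.normal()
    x += dt * (-x + s * np.tanh(x) + eta)
    signs[t] = np.sign(x)

switches = np.flatnonzero(np.diff(signs))       # time bins where x changes sign
dwell_times = np.diff(switches) * dt            # dwell times in each state
if dwell_times.size:
    print("mean dwell time:", dwell_times.mean())
```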
Is the relationship between the unit’s self-coupling and its timescale relying on single-unit properties, or does it rely on network effects? To answer this question, we compare the dynamics of a unit when driven by a white noise input vs. the self-consistent input generated by the rest of the recurrent network (i.e. the mean-field). If the neural mechanism underlying the timescale separation was a property of the single-cell itself, we would observe the same effect regardless of the details of the input noise. We found that the increase in the unit’s timescale as a function of is absent when driving the unit with white noise, and it only emerges when the unit is driven by the self-consistent mean-field (Figure 4c). We thus concluded that this neural mechanism is not an intrinsic property of a single unit but requires the unit to be part of a recurrently connected network.
A reservoir of timescales in E-I spiking networks
We next investigated whether the neural mechanism endowing the rate network (Equation 1) with a reservoir of timescales could be implemented in a biologically plausible model based on spiking activity and excitatory/inhibitory cell-type specific connectivity. To this end, we modeled the local cortical circuit as a recurrent network of excitatory (E) and inhibitory (I) current-based leaky integrate-and-fire neurons (see Methods: 'Spiking network model' for details), where both E and I populations were arranged in neural assemblies (Figure 5a; Amit and Brunel, 1997; Litwin-Kumar and Doiron, 2012; Wyrick and Mazzucato, 2021). Synaptic couplings between neurons in the same assembly were potentiated compared to those between neurons in different assemblies. Using mean-field theory, we found that the recurrent interactions between cell-type specific neurons belonging to the same assembly can be interpreted as a self-coupling, expressed in terms of the underlying network parameters as $s = N_{\mathrm{in}}\,\langle J \rangle$, where $N_{\mathrm{in}}$ is the assembly size and $\langle J \rangle$ is the average synaptic coupling between E neurons within the assembly (see Methods: 'Spiking network model' for details). The spiking network's time-varying activity unfolds through sequences of metastable attractors (Litwin-Kumar and Doiron, 2012; Wyrick and Mazzucato, 2021), characterized by the co-activation of different subsets of neural assemblies (Figure 5a). These dynamics rely on the bistable activity of each assembly, switching between high and low firing-rate states. The dwell time of an assembly in a high-activity state increases with larger size and with stronger intra-assembly coupling strength (Figure 5b). This metastable regime in spiking networks is similar to the bi-stable, heterogeneous-timescale activity observed in the random neural networks endowed with heterogeneous self-couplings. We further examined the features of the metastable regime in spiking networks in order to compare the mechanisms underlying the heterogeneous timescale distributions in the rate and spiking models. The two models exhibit clear differences in their 'building blocks'. In the rate network, the transfer function is odd ($\phi(-x) = -\phi(x)$), leading to bistable states with values localized around $\pm s$. In the spiking model, the single-neuron current-to-rate transfer function is strictly positive, so that the bistable states can have both high and low firing-rate values. As a result, in the spiking network, unlike in the rate network, a variety of firing-rate levels can be attained during the high-activity epochs, depending on both the assembly size and the number of other simultaneously active assemblies. In other words, the rate level depends on the specific network attractor visited within the realization of the complex attractor landscape (Mazzucato et al., 2015; Wyrick and Mazzucato, 2021).
Despite these differences, we found crucial similarities between the time-varying activity in the two models related to the underlying neural mechanism. The characteristic timescale of an assembly's metastable dynamics can be estimated from its average activation time $\tau$ (Figure 5a and b). We tested whether heterogeneity in the assembly self-coupling distribution could lead to a heterogeneous distribution of timescales. We endowed the network with a heterogeneous distribution of assembly sizes (an additional source of heterogeneity originates from the Erdos-Renyi connectivity), yielding a heterogeneous distribution of self-couplings (Figure 5b). We found that the assembly activation timescales grew as the assembly's self-coupling increased, spanning overall a large range of timescales from 20 ms to 100 s, i.e., the whole range of our simulation epochs (Figure 5b). In particular, the functional dependence of $\log \tau$ on the self-coupling was best fit by a quadratic polynomial (Figure 5b inset; see Methods: 'Spiking network model' for details), in agreement with the functional dependence obtained from the analytical calculation in the rate model (Equation 4). We thus concluded that a reservoir of timescales can naturally emerge in biologically plausible spiking models of cortical circuits from a heterogeneous distribution of assembly sizes. Both the range of timescales (20 ms to 100 s) (Cavanagh et al., 2016) and the distribution of assembly sizes (50-100 neurons) (Perin et al., 2011; Marshel et al., 2019) are consistent with experimental observations.
What is the relationship between the distribution of timescales in the E/I spiking model and the chaotic rate network? We can obtain crucial insights by combining analysis of the synaptic weight matrix with a mean-field approach in a linear approximation, following the approach of Murphy and Miller, 2009; Schaub et al., 2015. In the spiking network, the non-normal weight matrix exhibits the typical E/I structure with four submatrices representing E/I cell-type specific connectivity; within each of the four E/I submatrices, diagonal blocks highlight the paired E/I clustered architecture (the heterogeneous distribution of cluster sizes is manifest in the increasing size of the diagonal blocks, Appendix 2—figure 1a). To interpret the dynamical features emerging from this weight matrix, we examined a mean-field reduction of the $N$-dimensional network to a smaller set of mean-field variables, representing the E and I clusters plus the two unclustered background E and I populations (see Appendix 2). The eigenvalues of the mean-field-reduced weight matrix comprise a subset of those of the full weight matrix, capturing the salient features of the spiking network dynamics in a linear approximation (Figure 5c; see Appendix 2 and Murphy and Miller, 2009; Schaub et al., 2015). The reduced weight matrix exhibits a spectral gap, beyond which eigenvalues with real parts larger than one correspond to slow dynamical modes. To identify these slow modes, we examined the Schur eigenvectors of the reduced matrix, which represent independent dynamical modes in the linearized theory (i.e., an orthonormal basis; see Appendix 2 and Appendix 2—figure 2).
We found that the Schur eigenvectors associated with those large positive eigenvalues can be approximately mapped onto E/I cluster pairs. More specifically, increasingly larger eigenvalues correspond to assemblies of increasingly larger sizes (Figure 5c), which, in turn, are associated with slower timescales (Figure 5b). We conclude that the slow switching dynamics in the spiking network is linked to large positive eigenvalues of the synaptic weight matrix, and the different timescales emerge from a heterogeneous distribution of these eigenvalues. For comparison, in the chaotic rate network, the eigenvalue distribution of the weight matrix exhibits a set of eigenvalues with large positive real parts as well (Appendix 2—figure 3a). The relation between eigenvalue magnitude and slow dynamics holds in the rate networks as well: increasingly larger eigenvalues correspond to increasingly larger self-couplings (Appendix 2—figure 3), which are associated with slower dynamics (Figure 3aii). Therefore, the dynamical structure in the rate networks qualitatively matches the structure uncovered in the mean-field-reduced spiking network weight matrix (Figure 5c). In summary, our analysis shows that in both spiking and rate networks the reservoir of timescales is associated with the emergence of a heterogeneous distribution of large positive eigenvalues in the weight matrix. This analysis suggests that the correspondence between rate and spiking networks should be drawn at the level of the dynamical modes associated with these large positive eigenvalues in the Schur basis, where rate units in the rate network can be related to E/I cluster pairs (Schur eigenvectors) in the spiking network. A potential difference between the two models may lie in the nature of the transitions between bistable states in the rate network vs. the transitions between low- and high-activity states in the spiking network. In the rate network, transitions are driven by the self-consistent colored noise, the hallmark of the chaotic activity arising from the disorder in the synaptic couplings. In the spiking network, although the disorder in the inter-assembly effective couplings may contribute to the state transitions, finite-size effects may also be at play, due to the small assembly size.
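As a toy illustration of this analysis (not the paper's reduction pipeline; all sizes and weights below are arbitrary), one can build a matrix with clusters of increasing size and inspect its spectrum and Schur basis:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
n = 60
W = rng.normal(0.0, 0.05, (n, n))               # disordered background
for start, size in [(0, 10), (10, 20), (30, 30)]:
    W[start:start + size, start:start + size] += 0.05  # intra-cluster boost

T_schur, Z = schur(W, output="real")            # W = Z @ T_schur @ Z.T, Z orthonormal
eig = np.linalg.eigvals(W)
print(np.sort(eig.real)[-3:])                   # leading eigenvalues grow with cluster size
# Schur vectors (columns of Z) tied to the leading diagonal entries of T_schur
# approximate the slow modes, analogous to E/I cluster pairs in the spiking network
```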
Spatial de-mixing of time-varying broadband input
What are the computational benefits of having multiple timescales simultaneously operating in the same circuit? Previous work in random networks with no self-couplings ($s_i = 0$ in Equation 1) showed that stimulus-driven suppression of chaos is enhanced at a particular input frequency, related to the network's intrinsic timescale (Rajan et al., 2010). The phenomenon was preserved when a single rate of adaptation was added to all units (Muscinelli et al., 2019). We investigated whether, in our network with two different self-couplings $s_1 < s_2$ (in the chaotic regime), the stimulus-dependent suppression of chaos exhibited different features across the two sub-populations, depending on their different intrinsic timescales. We drove each network unit with an external broadband stimulus consisting of the superposition of sinusoidal inputs at different frequencies $f$ spanning a broad range (in Hz), with equal amplitudes and random phases. We found that the sub-population with a slow, or fast, intrinsic timescale preferentially entrained its activity with slower, or faster, spectral components of the broadband stimulus, respectively (Figure 6a). We quantified this effect using a spectral modulation index $M(f)$, defined from the power-spectrum peaks of the two sub-populations at each input frequency $f$ (Figure 6b). A positive, or negative, value of $M(f)$ reveals a stronger, or weaker, respectively, entrainment at frequency $f$ in the slow sub-population compared to the fast one. $M(f)$ exhibited a crossover behavior whereby the low-frequency components of the input predominantly entrained the slow population, while the fast components of the input predominantly entrained the fast population. When fixing $s_1$ and varying $s_2$, we found that the dependence of the crossover frequency on $s_2$ was strong at low input amplitudes and was progressively tamed at larger input amplitudes (Figure 6c). This is consistent with the fact that the input amplitude completely suppresses chaos beyond a certain critical value, as previously reported in networks with no self-couplings (Rajan et al., 2010) and with adaptation (Muscinelli et al., 2019).
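A sketch of how such an index can be computed from simulated activity follows (the normalized-difference form below is our assumed definition, chosen to match the qualitative description of the power-spectrum peak comparison):

```python
import numpy as np

def population_power(traj, dt):
    """Average power spectrum across the units of one sub-population."""
    freqs = np.fft.rfftfreq(traj.shape[0], d=dt)
    power = (np.abs(np.fft.rfft(traj, axis=0)) ** 2).mean(axis=1)
    return freqs, power

def modulation_index(traj_slow, traj_fast, dt, input_freqs):
    """Assumed form: normalized difference of the two populations' power
    at each input frequency; positive values favor the slow population."""
    freqs, P_s = population_power(traj_slow, dt)
    _, P_f = population_power(traj_fast, dt)
    idx = [int(np.argmin(np.abs(freqs - f))) for f in input_freqs]
    return (P_s[idx] - P_f[idx]) / (P_s[idx] + P_f[idx])
```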
Discussion
We demonstrated a new robust and biologically plausible network mechanism whereby multiple timescales emerge across units with heterogeneous self-couplings. In our model, units are interpreted as neural assemblies consistent with experimental evidence from cortical circuits (Perin et al., 2011; Lee et al., 2016; Kiani et al., 2015; Miller et al., 2014; Marshel et al., 2019), and previous theoretical modeling (Litwin-Kumar and Doiron, 2012; Wyrick and Mazzucato, 2021). We found that the neural mechanism underlying the large range of timescales is the heterogeneity in the distribution of self-couplings (representing neural assembly size). We showed that this mechanism can be naturally implemented in a biologically plausible model of a neural circuit based on spiking neurons with excitatory/inhibitory cell-type specific connectivity. This spiking network represents a microscopic realization of our mechanism where neurons are arranged in assemblies, and an assembly’s self-coupling represents the strength of the recurrent interactions between neurons belonging to that assembly, proportional to its size. A heterogeneous distribution of assembly sizes, in turn, generates a reservoir of timescales. Crucially, our model captured the distribution of intrinsic timescales observed across neurons recorded within the same area in primate cortex (Cavanagh et al., 2016).
Several experimental studies uncovered heterogeneity of timescales of neural activity across brain areas and species. Comparison of the population-averaged autocorrelations across cortical areas revealed a hierarchical structure, varying from 50 ms to 350 ms along the occipital-to-frontal axis (Murray et al., 2014). Neurons within the same area exhibit a wide distribution of timescales as well. A heterogeneous distribution of timescales (from 0.5 s to 50 s) was found across neurons in the oculomotor system of the fish (Miri et al., 2011) and primate brainstem (Joshua et al., 2013), suggesting that timescale heterogeneity is conserved across phylogeny. During periods of ongoing activity, the distribution of single-cell autocorrelation timescales in primates was found to be right-skewed and approximately lognormal, ranging from 10 ms to a few seconds (Cavanagh et al., 2016 and Figure 3). Single neuron activity was found to encode long reward memory traces in primate frontal areas over a wide range of timescales up to 10 consecutive trials (Bernacchia et al., 2011). In these studies, autocorrelation timescales were estimated using parametric fits, which may be affected by statistical biases, although Bayesian generative approaches might overcome this issue (Zeraati et al., 2020). In our model, we estimated timescales nonparametrically as the half-width at half-maximum of the autocorrelation function. In our biologically plausible model based on a spiking network with cell-type specific connectivity, the distribution of timescales was in the range between 20 ms and 100 s, similar to the range of timescales observed in experiments (Miri et al., 2011; Joshua et al., 2013; Cavanagh et al., 2016). Moreover, the distribution of assembly sizes in our model is within the range of 50–100 neurons, consistent with the size of functional assemblies experimentally observed in cortical circuits (Perin et al., 2011; Marshel et al., 2019). A fundamental new prediction of our model, to be tested in future experiments, is the direct relationship between assembly size and its timescale.
Previous neural mechanisms for generating multiple timescales of neural activity relied on single-cell bio-physical properties, such as membrane or synaptic time constants (Gjorgjieva et al., 2016). In feedforward networks, developmental changes in single-cell conductance can modulate the timescale of information transmission, explaining the transition from slow waves to rapid fluctuations observed in the developing cortex (Gjorgjieva et al., 2014). However, the extent to which this single-cell mechanism might persist in the presence of strong recurrent dynamics was not assessed. To elucidate this issue, we examined whether a heterogeneous distribution of single-unit integration time constants could lead to a separation of timescales in a random neural network (see Appendix 3 for details). In this model, half of the units were endowed with a fixed fast time constant and the other half with a slow time constant, whose value varied across networks. We found that, although the average network timescale increased proportionally to the value of the slower time constants, the difference in autocorrelation time between the two populations remained negligible. These results suggest that, although the heterogeneity in single-cell time constants may affect the dynamics of single neurons in isolation or within feedforward circuits (Gjorgjieva et al., 2014), the presence of strong recurrent dynamics fundamentally alters these single-cell properties in a counterintuitive way. Our results suggest that a heterogeneity in single-cell time constants may not lead to a diversity of timescales in the presence of recurrent dynamics.
Our results further clarified that the relationship between an assembly’s self-coupling and its timescale relies on the strong recurrent dynamics. This relationship is absent when driving an isolated assembly with white noise external input (Figure 4). Indeed, the mechanism linking the self-coupling to the timescale only emerged when driving the unit with a mean-field whose color was self-consistently obtained from an underlying recurrent network of self-coupled units.
Previous models showed that a heterogeneity of timescales may emerge from circuit dynamics through a combination of structural heterogeneities and heterogeneous long-range connections arranged along a spatial feedforward gradient (Chaudhuri et al., 2014; Chaudhuri et al., 2015). These networks can reproduce the population-averaged hierarchy of timescales observed across the cortex in the range of 50–350 ms (Murray et al., 2014; Chaudhuri et al., 2015). A similar network architecture can also reproduce the heterogeneous relaxation time after a saccade, found in the brainstem oculomotor circuit (Miri et al., 2011; Joshua et al., 2013), in the 10–50 s range (Inagaki et al., 2019; Recanatesi et al., 2022). This class of models can explain a timescale separation within a factor of 10, but it is not known whether it can be extended to several orders of magnitude, as observed between neurons in the same cortical area (Cavanagh et al., 2016). Moreover, while the feedforward spatial structure underlying these two models is a known feature of the cortical hierarchy and of the brainstem circuit, respectively, it is not known whether such a feedforward structure is present within a local cortical circuit. Our model, on the other hand, relies on strong recurrent connectivity and local functional assemblies, two hallmarks of the architecture of local cortical circuits (Perin et al., 2011; Lee et al., 2016; Kiani et al., 2015; Miller et al., 2014; Marshel et al., 2019). Other network models generating multiple timescales of activity fluctuations were proposed based on self-tuned criticality with anti-Hebbian plasticity (Magnasco et al., 2009), or multiple block-structured connectivity (Aljadeff et al., 2015).
In our model, the dynamics of units with large self-couplings, exhibiting slow switching between bistable states, can be captured analytically using the universal colored noise approximation (UCNA) to the Fokker-Planck equation (Hänggi and Jung, 1995; Jung and Hänggi, 1987). This is a classic tool from the theory of stochastic processes, which we applied here to investigate neural network dynamics for the first time. This slow-switching regime may underlie the emergence of metastable activity, ubiquitously observed in the population spiking activity of behaving mammals (Abeles et al., 1995; Jones et al., 2007; Mazzucato et al., 2015; Mazzucato et al., 2019; Recanatesi et al., 2022; Engel et al., 2016; Kadmon Harpaz et al., 2019). In these spiking networks, it is not known how to estimate the timescales of metastable activity from network parameters, and we anticipate that the UCNA may provide a powerful new tool for investigating network dynamics in these biologically plausible models.
What is the functional relevance of neural circuits exhibiting a reservoir of multiple timescales? First, the presence of long timescales deep in the chaotic regime is a new feature of our model, which may be beneficial for memory capacity away from the edge of chaos (Toyoizumi and Abbott, 2011). Second, we found that, in our model, time-dependent broadband inputs suppress chaos in a population-specific way, whereby populations of large (small) assemblies preferentially entrain the slow (fast) spectral components of the input. Previously studied spiking models suggested that preferential entrainment to inputs is possible via cellular mechanisms (Lindner, 2016) or finite-size fluctuations in a feedforward network structure (Deger et al., 2014). Here, we presented a recurrent network mechanism for population-specific chaos suppression, independent of the network size. This mechanism may thus grant recurrent networks a natural and robust way to spatially demix complex temporal inputs (Perez-Nieves et al., 2021), as observed in visual cortex (Mazzoni et al., 2008). Third, the presence of multiple timescales may be beneficial for performing flexible computations involving simultaneously fast and slow timescales, such as in role-switching tasks (Iigaya et al., 2019), or as observed in time cells in the hippocampus (Kraus et al., 2013; Howard et al., 2014). A promising direction for future investigation is the exploration of the computational properties of our model in the context of reservoir computing (Sussillo and Abbott, 2009) or of recurrent networks trained to perform complex cognitive tasks (Yang et al., 2019).
Methods
Dynamic mean-field theory with multiple self-couplings
We derive the dynamic mean-field theory in the limit $N \to \infty$ using the moment generating functional (Sompolinsky and Zippelius, 1982; Crisanti and Sompolinsky, 1987). For the derivation we follow the Martin-Siggia-Rose-De Dominicis-Janssen path-integral formalism (Martin et al., 1973), as presented extensively in Helias and Dahmen, 2020, whose notation we also borrow.
The model includes two sets of random variables: the connectivity couplings $J_{ij}$, for $i \neq j$, drawn independently from a Gaussian distribution with mean 0 and variance $1/N$; and the self-couplings $s_i$, for $i = 1, \ldots, N$, whose values are of order 1. When we examine the dynamics governing each unit in Equation 1, the sum over the random couplings contributes $N - 1$ terms which, in the limit $N \to \infty$, ensure that the net contribution (mean and variance) from this sum remains of order 1. Hence, in our model, as in previous models, $J_{ij}$ is the quenched disorder, whose sum gives rise to the mean-field. The self-couplings (one for each unit) contribute an additional term to the moment generating functional. Each unit's activity strongly depends on the value of its own self-coupling, which hence cannot be averaged over when we study a unit's dynamics. After averaging over $J_{ij}$, we can study all units with the same self-coupling together, as they obey the same mean-field equation, Equation 2. Moreover, we show that all units, regardless of their self-coupling, obey a single mean-field due to the structure of $J$. We note that the results of this Methods section, including Equation 5 and Equation 6, are not affected if the diagonal elements of $J$ are not zero but rather drawn from the same distribution as the off-diagonal elements (as in the main text), since the overall contribution of such elements, of order $1/\sqrt{N}$, is negligible. To maintain the clarity of the text, and since the results are not affected by it, we leave out the distinction between including and excluding such diagonal elements of $J$ in the main text.
For our model, Equation 1, the moment generating functional is, therefore, given by:
where $x_i$ are the unit activities and $\tilde{x}_i$ the corresponding response fields (Helias and Dahmen, 2020). To start, we calculate the average of the generating functional over the quenched disorder. We take advantage of the self-averaging nature of our model, particularly by averaging over the quenched disorder $J_{ij}$. The couplings $J_{ij}$ are i.i.d. variables drawn from a normal distribution and appear only in the last term in (Equation 5). We hence focus the current calculation step on that term, and we derive the result to the leading order in $N$, yielding:
The result above suggests that all the units in our network are coupled to one another equivalently (by being coupled only to sums that depend on all units’ activity). To further decouple the network, we define the quantity
We enforce this definition by multiplying the disorder-averaged moment generating functional with the appropriate Dirac delta function in its integral form:
where the integral over the conjugate auxiliary field runs along the imaginary axis (including its $2\pi i$ factor). We can now rewrite the disorder-averaged moment generating functional, using (Equation 6) to replace its last term, together with the definition of the auxiliary field and the multiplication of the functional by the delta function above. Altogether we get:
with $n_a$ the fraction of units with self-coupling $s_a$ across the population, for $a = 1, \ldots, A$. In the expression above we made use of the fact that the auxiliary fields couple only to sums over the single-unit fields, and hence the generating functional can be factorized into identical single-unit factors. Note that in our network, due to the dependence on the self-couplings, units are statistically equivalent only if their self-couplings are equal. Hence, the factorization yields one factor $Z_a$ for all units $i$ with $s_i = s_a$. Each factor includes the functional integrals over $x$ and $\tilde{x}$ for a single unit with self-coupling $s_a$.
In the large-$N$ limit we evaluate the auxiliary fields in (Equation 7) by the saddle-point approach. We denote a variable valued at the saddle point by a bar, obtaining:
and yielding the saddle-point values:
where $C(t, t') = \sum_a n_a \langle \phi(x_a(t))\,\phi(x_a(t')) \rangle$ represents the average autocorrelation function of the network (as defined in the main text, Equation 3). The second saddle point vanishes, as can be immediately extended from Helias and Dahmen, 2020; Sompolinsky and Zippelius, 1982. The action at the saddle point reduces to the sum of actions for individual, non-interacting units, each with its own self-coupling $s_a$. All units are coupled to a common external field $\eta(t)$. Inserting the saddle-point values back into Equation 7, we obtain a factorized generating functional, where
Thus, in the large-$N$ limit, the network dynamics are reduced to those of $A$ representative units, each representing the sub-population with self-coupling $s_a$ and following dynamics governed by

$$\frac{dx_a}{dt} = -x_a + s_a\,\phi(x_a) + \eta(t) \qquad (11)$$

for all $a = 1, \ldots, A$, where $\eta(t)$ is a Gaussian mean-field with autocorrelation

$$\langle \eta(t)\,\eta(t') \rangle = g^2 \sum_a n_a \big\langle \phi\big(x_a(t)\big)\,\phi\big(x_a(t')\big) \big\rangle. \qquad (12)$$
The results above can be immediately extended to the continuous case of a self-coupling distribution, yielding:

$$\langle \eta(t)\,\eta(t') \rangle = g^2 \int ds\; P(s)\, \big\langle \phi\big(x_s(t)\big)\,\phi\big(x_s(t')\big) \big\rangle, \qquad (13)$$

with $P(s)$ the density function of the self-coupling distribution in the network, and the unit dynamics dependent on their respective self-couplings:

$$\frac{dx_s}{dt} = -x_s + s\,\phi(x_s) + \eta(t). \qquad (14)$$
An iterative solution
We use an iterative approach to solve the mean-field equations, Equation 11 and Equation 12 for a discrete distribution of self-couplings, or Equation 13 and Equation 14 for a continuous distribution of self-couplings. The approach is adopted from Stern et al., 2014 and adapted to allow for multiple self-couplings. We briefly describe it in what follows. We start by making an initial guess for the mean-field autocorrelation function $C$, as defined in Equation 3. In its Fourier space, we multiply each component by a random phase and transform it back to generate an instance of the mean-field (see Stern et al., 2014 for more details). We create additional instances by repeating the procedure described above. At least one instance is created for each value $s_a$ of a discrete distribution of self-couplings, or for each value drawn from $P(s)$ in the case of a continuous distribution. We then solve Equation 11 (or equivalently Equation 14 in the case of a continuous distribution) to obtain solutions for $x(t)$, one solution for each value of the self-coupling. The set of solutions allows us to calculate the set of single-unit autocorrelations. For a discrete distribution, we then multiply each autocorrelation by its relative weight $n_a$ to compute $C$, Equation 12. For a continuous distribution, we average the autocorrelations of all sampled units to estimate $C$, Equation 13 (since the values were drawn from $P(s)$, each captures an approximately equal share of the distribution). We use these sampled mean-field autocorrelations instead of our initial guess to repeat the entire procedure, which leads to an updated estimate of $C$. We iterate until the average of $C$ across iterations converges. We note that for the continuous-distribution case, we increase the number of drawn values as the iterations progress (starting from very few and ending with many). This allows us to maintain a rapid iterative process and yet obtain an accurate solution, thanks to the refinement of the process with each iteration.
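As an illustration, a condensed Python sketch of this iterative scheme for a discrete distribution with two self-couplings follows (our own illustrative implementation; the mean-field instances are generated by Fourier phase randomization, whose normalization here is approximate, and all parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
g, dt, n_steps = 2.0, 0.05, 4096
s_vals = np.array([0.5, 4.0])          # discrete self-couplings s_a
fracs  = np.array([0.9, 0.1])          # fractions n_a

def mean_field_instance(C):
    """One mean-field realization with autocorrelation ~ g^2 C (circular)."""
    S = np.abs(np.fft.fft(g**2 * C))                   # power spectrum >= 0
    phases = np.exp(2j * np.pi * rng.random(n_steps))  # random phases
    eta = np.fft.ifft(np.sqrt(S * n_steps) * phases).real
    return eta * np.sqrt(2.0)          # taking the real part halves the power

def unit_autocorr(s_a, eta):
    """Integrate Equation 11 for one representative unit; return <phi,phi>."""
    x, phi = 0.0, np.empty(n_steps)
    for t in range(n_steps):
        x += dt * (-x + s_a * np.tanh(x) + eta[t])
        phi[t] = np.tanh(x)
    return np.correlate(phi, phi, mode="full")[n_steps - 1:] / n_steps

C = np.exp(-np.arange(n_steps) * dt)   # initial guess for C(tau)
for _ in range(20):                    # iterate Equations 11-12
    C_new = sum(n_a * unit_autocorr(s_a, mean_field_instance(C))
                for s_a, n_a in zip(s_vals, fracs))
    C = 0.5 * C + 0.5 * C_new          # damped update for stable convergence
```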
Fixed points and transition to chaos
Networks with heterogeneous self-couplings exhibit a complex landscape of fixed points $x_i^*$, obtained as the self-consistent solutions to the static version of Equation 2 and Equation 3, namely

$$x_i^* = s_i\,\phi(x_i^*) + \eta_i, \qquad (15)$$

where the mean-field $\eta_i$ is a Gaussian random variable with zero mean and variance given by

$$\langle \eta^2 \rangle = g^2 \sum_a n_a\, \big\langle \phi(x_a^*)^2 \big\rangle. \qquad (16)$$
The solution $x^*$ for each unit depends on its respective self-coupling $s_i$ (Appendix 1—figure 1). If $s_i < 1$, a single interval of solutions around zero is available. For $s_i > 1$ and a range of values of the mean-field, $x^*$ can take values in one of three possible intervals. Let us consider a network with two sub-populations comprising fractions $n_1$ and $n_2$ of the units with self-couplings $s_1$ and $s_2$, respectively. In the $s_1$-$s_2$ plane, this model gives rise to a phase diagram with a single chaotic region separating four disconnected stable fixed-point regions (Figure 2a). We will first discuss the stable fixed points, which present qualitatively different structures depending on the values of the self-couplings. Within the region where both self-couplings are less than one, the only possibility for a stable fixed point is the trivial solution, with all $x_i^* = 0$ (Figure 2a), where the network activity quickly decays to zero. When at least one self-coupling is greater than one, there are three stable fixed-point regions (Figure 2a); in these three regions, the network activity starting from random initial conditions unfolds via a long-lived transient period, and then it eventually settles into a stable fixed point. This transient activity with late fixed points is a generalization of the network phase found in Stern et al., 2014. When both self-couplings are greater than one ($s_1, s_2 > 1$), the fixed-point distribution in each sub-population is bi-modal (Figure 2a–ii). When one self-coupling is greater than one and the other is less than one, the solutions for the respective sub-populations are localized around bi-modal fixed points and around zero, respectively (Figure 2a–i).
However, the available solutions in the latter case are further restricted by stability conditions. In the next Methods section we derive the stability condition by expanding the dynamical Equation 1 around the fixed point and requiring that all eigenvalues of the corresponding stability matrix be negative. Briefly, the units with $s_i > 1$ at a stable fixed point are restricted to have support on two disjoint intervals. We refer to this regime as multi-modal, a direct generalization of the stable fixed-point regime found in Stern et al., 2014 for a single self-coupling $s > 1$, characterized by transient dynamics leading to an exponentially large number of stable fixed points. For the portion of units with $s_i < 1$, the stable fixed point is supported by a single interval around zero.
Stability conditions
To determine the onset of instability, we look for conditions such that at least one eigenvalue develops a positive real part. An eigenvalue $\lambda$ of the stability matrix exists at a point in the complex plane if (Stern et al., 2014; Ahmadian et al., 2015)

$$1 = g^2 \sum_a n_a \left\langle \frac{\phi'(x_a^*)^2}{\big(\lambda + 1 - s_a\,\phi'(x_a^*)\big)^2} \right\rangle. \qquad (17)$$
The denominator inside the average above is the square of $\lambda$ plus the slope of the curve in Appendix 1—figure 1a-i and a-ii. Hence, a solution $x^*$ whose value gives a negative slope (available when $s_i > 1$) leads to a vanishing denominator at some positive $\lambda$, and therefore to a positive eigenvalue and instability. Therefore, the units with $s_i > 1$ at a stable fixed point are restricted to have support on two disjoint intervals with positive slope. We refer to this regime as multi-modal, a direct generalization of the stable fixed-point regime found in Stern et al., 2014 for a single self-coupling $s > 1$, characterized by transient dynamics leading to an exponentially large number of stable fixed points. For the portion of units with $s_i < 1$, the stable fixed point is supported by a single interval around zero.
A fixed-point solution becomes unstable as soon as an eigenvalue occurs at $\lambda = 0$, obtaining from Equation 17 the stability condition

$$g^2 \sum_a n_a \left\langle \left(\frac{\phi'(x_a^*)}{1 - s_a\,\phi'(x_a^*)}\right)^{2} \right\rangle \le 1, \qquad (18)$$

where $\langle \cdot \rangle$ denotes the average over units within each sub-population. The two possible consistent solutions to (Equation 15) that may result in a stable fixed point (from the two disjoint intervals in Appendix 1—figure 1a-i) contribute differently to Equation 18: a larger $|x^*|$ decreases $\phi'(x^*)$, and hence its contribution (Appendix 1—figure 1b–i), thus improving stability. Choices for the distribution of solutions along the two intervals become more restricted as $g$ increases or the self-couplings decrease, since both render higher values for the stability condition, Equation 18, forcing more solutions $x^*$ to decrease it. This restricts a larger fraction of the $x^*$ at the fixed points to the solution with the higher absolute value. At the transition to chaos, a single last and most stable solution exists, with all values chosen with their higher absolute value (Appendix 1—figure 1b–i, light green segments). For units with $s_i < 1$, only one solution is available, obtained from the distribution of the mean-field through the consistency condition (Equation 15) at the fixed point. In this configuration, the most stable solution transitions from stability to instability exactly where (Equation 18) reaches unity. Hence, the transition from stable fixed points to chaos occurs for a choice of $g$ and self-couplings such that solving (Equation 15) and (Equation 16) self-consistently leads to saturation of the stability condition (Equation 18) at one.
Universal colored-noise approximation to the Fokker-Planck theory
We consider the special case of a network with two self-couplings, where a large sub-population of $N - 1$ units with self-coupling $s_1$ comprises all but one slow probe unit with large self-coupling $s$. The probe unit obeys the dynamical equation $\dot{x} = -x + s\,\phi(x) + \eta(t)$, with $\phi(x) = \tanh(x)$. In the large-$N$ limit, we can neglect the backreaction of the probe unit on the mean-field and approximate the latter as an external Gaussian colored noise $\eta(t)$ with autocorrelation $g^2 C(t - t')$, independent of $s$. The noise can be parameterized by its strength $\sigma^2$, defined from the autocorrelation $g^2 C$, and by its timescale (color) $\tau_\eta$. For large $s$, the dynamics of the probe unit can be captured by a bi-stable chaotic phase whereby its activity is localized around the critical points $x^* \approx \pm s$ (Figure 4a–i) and switches between them at random times. In the regime of colored noise (as we have here, with $\tau_\eta \gg 1$), the stationary probability distribution $p(x)$ (Figure 4a–ii and b) satisfies the unified colored noise approximation to the Fokker-Planck equation (Hänggi and Jung, 1995; Jung and Hänggi, 1987):

$$p(x) = \mathcal{N}^{-1}\,\big|1 - \tau_\eta f'(x)\big|\; \exp\!\left(-\frac{U_{\mathrm{eff}}(x)}{\sigma^2}\right), \qquad (19)$$

where $\mathcal{N}$ is a normalization constant, $f(x) = -x + s\,\phi(x)$, and the effective potential is therefore:

$$U_{\mathrm{eff}}(x) = \frac{x^2}{2} - s \log \cosh(x) + \frac{\tau_\eta}{2}\,\big(x - s \tanh(x)\big)^2. \qquad (20)$$

The distribution has support in the region where $1 - \tau_\eta f'(x) > 0$, comprising two disjoint intervals (Figure 4b). Moreover, the probability distribution is concentrated around the two minima of $U_{\mathrm{eff}}$. The new UCNA-based term, quadratic in $f$ and proportional to $\tau_\eta$, dominates the effective potential. The main effect of the strong color is to sharply decrease the variance of the distribution around the minima. This is evident from comparing the colored noise with white noise, when the latter drives the same bi-stable probe, $\dot{x} = -x + s\,\phi(x) + \xi(t)$, where $\xi$ is a white noise of strength equivalent to the colored noise (Figure 4a-iv-vi). The naive potential for the white-noise case is obtained from Equation 19 by sending $\tau_\eta \to 0$ in the prefactor and in the potential $U_{\mathrm{eff}}$. This results in a wider activity distribution compared to our network-generated colored noise, in agreement with the simulations (Figure 4b).
In our colored-noise network, the probe unit's temporal dynamics are captured by the mean first passage time $\bar{T}$ for the escape out of the potential well:

$$\bar{T} = \frac{1}{\sigma^2} \int dy\; e^{U_{\mathrm{eff}}(y)/\sigma^2} \int_y dz\; e^{-U_{\mathrm{eff}}(z)/\sigma^2}, \qquad (21)$$

where the upper limit in the outer integral is the edge of the support of the stationary distribution. In the small-noise approximation, we can evaluate the integrals by steepest descent. The inner integrand is peaked at the minimum $x_{\min}$ of the effective potential, yielding

$$\int_y dz\; e^{-U_{\mathrm{eff}}(z)/\sigma^2} \approx \sqrt{\frac{2\pi\sigma^2}{U''_{\mathrm{eff}}(x_{\min})}}\; e^{-U_{\mathrm{eff}}(x_{\min})/\sigma^2}.$$

The outer integrand $e^{U_{\mathrm{eff}}(y)/\sigma^2}$ peaks at the edge of the support, where the effective potential attains its maximum along the escape path. The mean first passage time can thus be estimated as

$$\bar{T} \propto \exp\!\left(\frac{\Delta U_{\mathrm{eff}}}{\sigma^2}\right), \qquad (22)$$

where $\Delta U_{\mathrm{eff}}$ is the effective-potential barrier between the well and the edge of the support, which grows quadratically with $s$ for large $s$, leading to Equation 4. We validated the UCNA approach to calculating the mean first passage time by estimating the distribution of escape points from one well to the other well, which was found to lie predominantly within the support of the stationary probability distribution $p(x)$. Only a fraction $(1.8 \pm 0.4) \times 10^{-3}$ of the activity in the simulations (mean ± SD over 10 probe units run with parameters as in Figure 4b) entered the forbidden region (see Figure 4—figure supplement 1 for details).
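For completeness, the mean first passage time can also be evaluated numerically from the double integral (Equation 21); the sketch below uses the UCNA effective potential in the form assumed above, $U_{\mathrm{eff}}(x) = -\int^x f(y)\,dy + (\tau_\eta/2) f(x)^2$ with $f(x) = -x + s \tanh(x)$, together with a standard absorbing boundary at the barrier and a reflecting outer boundary:

```python
import numpy as np

def mfpt_ucna(s, sigma2, tau):
    """Numerical MFPT from the right well to the barrier at x = 0
    (absorbing at 0, reflecting at x_max); assumed UCNA potential."""
    f = lambda z: -z + s * np.tanh(z)
    x = np.linspace(0.0, s + 5.0, 20001)
    dx = x[1] - x[0]
    U = np.cumsum(-f(x)) * dx + 0.5 * tau * f(x) ** 2       # U_eff on the grid
    inner = np.cumsum(np.exp(-U / sigma2)[::-1])[::-1] * dx  # int_y^xmax e^{-U}
    i0 = int(np.argmin(U))                                   # index of well minimum
    outer = np.exp(U[: i0 + 1] / sigma2) * inner[: i0 + 1]
    return outer.sum() * dx / sigma2

print(mfpt_ucna(s=2.0, sigma2=2.25, tau=4.0))  # parameters as in the dwell-time sketch
```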
A comparison with white noise
To test the impact of the input generated by the network (or, equivalently, as mimicked by the colored noise), we replaced this input (the right-most term in Equation 1) with white noise. The probe unit in the white-noise case therefore follows the dynamical equation:

$$\frac{dx}{dt} = -x + s\,\phi(x) + g\,\sigma\,\xi(t),$$

with $\xi(t)$ drawn from a normal distribution, and where $s$, $g$, and $\sigma$ are constants receiving their values according to the network-driven probe-unit case; specifically, these constants are the probe unit's self-coupling strength, the original network gain, and the noise strength (the integral under the autocorrelation function of the network input to the probe). Simulation results of the probe dynamics are in Figure 4aii, along with its distribution in Figure 4b (light green area; parameter values for $s$, $g$, and $\sigma$ are specified in the caption). To estimate the probability (Equation 19) and the potential (Equation 20) in this case, since here the input is white noise, we substitute $\tau_\eta = 0$, as no correlation in the input exists. Similarly, we calculate the probe's approximated mean first passage time when driven by white noise (Equation 4). The result (Figure 4c, light green dashed line) estimates the simulations well (Figure 4c, green line). Note that since the effective potential depends on $\tau_\eta$ linearly, the mean first passage time, in whose exponent it appears, depends on $\tau_\eta$ exponentially. Hence the importance of the 'color' (correlations) in the network input in generating long timescales, and the failure of these long timescales to materialize when the color is removed (as in this white-noise-driven probe, which replicates the assembly-endowed network model except for its correlated input drive).
Spiking network model
Network architecture
We simulated a recurrent network of $N$ excitatory (E) and inhibitory (I) spiking neurons with relative fractions $n_E$ and $n_I$ and cell-type specific connection probabilities (Figure 5). Non-zero synaptic weights from pre-synaptic neuron $j$ to post-synaptic neuron $i$ were set to $J_{ij}$, whose values only depended on the types (E or I) of the two neurons. Neurons were arranged in cell-type specific assemblies. E assemblies had heterogeneous sizes drawn from a uniform distribution (mean size in Table 1, with 30% standard deviation). The number of assemblies was determined by the fraction of background neurons in each population, i.e., those not belonging to any assembly. I assemblies were paired with E assemblies, and the size of each I assembly was matched to that of its corresponding E assembly by a fixed proportionality factor. Neurons belonging to the same assembly had intra-assembly weights potentiated by a factor $J_+ > 1$, while neurons belonging to different assemblies had inter-assembly weights depressed by a factor $J_- < 1$, which depends on $J_+$ and on the fraction $f$ of E neurons belonging to assemblies. Parameter values are in Table 1.
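For illustration, the following sketch builds the E-to-E block of such a clustered synaptic matrix with heterogeneous assembly sizes (all parameter values and the symbols J_plus, J_minus are illustrative, not those of Table 1):

```python
import numpy as np

rng = np.random.default_rng(4)
NE, p_EE, J_EE = 800, 0.2, 0.05          # illustrative sizes and baseline weight
J_plus, J_minus = 8.0, 0.9               # intra-/inter-assembly factors
sizes = rng.integers(40, 80, size=8)     # heterogeneous assembly sizes

labels = np.repeat(np.arange(sizes.size), sizes)   # assembly label per neuron
labels = np.concatenate([labels, -np.ones(NE - labels.size, dtype=int)])  # background

mask = rng.random((NE, NE)) < p_EE       # Erdos-Renyi connectivity
same = (labels[:, None] == labels[None, :]) & (labels[:, None] >= 0)
W = np.where(same, J_plus * J_EE, J_minus * J_EE) * mask
np.fill_diagonal(W, 0.0)
# effective self-coupling of assembly l: s_l ~ sizes[l] * p_EE * J_plus * J_EE
```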
Single neuron dynamics
We simulated current-based leaky integrate-and-fire (LIF) neurons, whose membrane potential $V_i$ obeys the dynamical equation

$$\frac{dV_i}{dt} = -\frac{V_i}{\tau_m} + I_i^{\mathrm{rec}} + I_i^{\mathrm{ext}},$$

where $\tau_m$ is the membrane time constant. Input currents included a contribution $I_i^{\mathrm{rec}}$ from the other recurrently connected neurons and a constant external current $I_i^{\mathrm{ext}}$ (units of mV s$^{-1}$), for $i = 1, \ldots, N$, representing afferent inputs from other brain areas. When the membrane potential hits the threshold $V^{\mathrm{thr}}$ (cell-type specific), a spike is emitted and the membrane potential is held at the reset value $V^{\mathrm{reset}}$ for a refractory period $\tau_{\mathrm{ref}}$. We chose the thresholds so that the homogeneous network (i.e., the network where all potentiation and depression factors equal one) was in a balanced state with average spiking activity at the target rates (in spks/s) reported in Table 1. The post-synaptic currents evolved according to

$$\tau_s \frac{dI_i^{\mathrm{rec}}}{dt} = -I_i^{\mathrm{rec}} + \sum_j J_{ij} \sum_k \delta\big(t - t_k^{(j)}\big),$$
where $\tau_s$ is the synaptic time constant, $J_{ij}$ are the recurrent couplings, and $t_k^{(j)}$ is the time of the $k$-th spike from the $j$-th presynaptic neuron. Parameter values are in Table 1.
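A minimal sketch of the single-neuron update implied by these equations follows (Euler discretization; the parameter values and dictionary keys are illustrative, not those of Table 1, and refractoriness is left to the caller):

```python
import numpy as np

def lif_step(V, I_syn, spikes_in, J_row, p, dt):
    """One Euler step of a current-based LIF neuron with exponential synapses.
    spikes_in: 0/1 vector of presynaptic spikes in the current time bin."""
    I_syn += dt * (-I_syn / p["tau_s"]) + (J_row @ spikes_in) / p["tau_s"]
    V += dt * (-V / p["tau_m"] + I_syn + p["I_ext"])
    spiked = V >= p["V_thr"]
    if spiked:
        V = p["V_reset"]              # clamping during tau_ref handled by caller
    return V, I_syn, spiked

# toy usage: one neuron, no recurrent input, constant external drive
p = {"tau_m": 0.02, "tau_s": 0.005, "V_thr": 1.0, "V_reset": 0.0, "I_ext": 60.0}
V, I_syn, n_spikes = 0.0, 0.0, 0
for _ in range(10_000):               # 1 s of activity at dt = 0.1 ms
    V, I_syn, spk = lif_step(V, I_syn, np.zeros(5), np.zeros(5), p, dt=1e-4)
    n_spikes += spk
print("rate:", n_spikes, "spks/s")
```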
Self-couplings from mean-field theory
We can estimate the E-assembly self-couplings in this model using mean-field methods (Amit and Brunel, 1997; Wyrick and Mazzucato, 2021). This method allows obtaining, self-consistently, the fixed-point values of the firing rates $r_l$ in the $l$-th assembly via the equation

$$r_l = \Phi_l\big(\mu_l, \sigma_l\big),$$
where $\Phi_l$ is the leaky integrate-and-fire current-to-rate transfer function for each population:

$$\Phi\big(\mu, \sigma\big) = \left( \tau_{\mathrm{ref}} + \tau_m \sqrt{\pi} \int_{\frac{V^{\mathrm{reset}} - \mu}{\sigma} + a k}^{\frac{V^{\mathrm{thr}} - \mu}{\sigma} + a k} e^{u^2}\big(1 + \operatorname{erf}(u)\big)\, du \right)^{-1},$$

where $k = \sqrt{\tau_s / \tau_m}$ and $a = |\zeta(1/2)| / \sqrt{2}$ are terms accounting for the synaptic dynamics (Fourcaud and Brunel, 2002). The infinitesimal means $\mu_l$ and variances $\sigma_l^2$ of the network populations comprising all E and I assemblies are themselves functions of the firing rates, thus leading to self-consistent equations for the fixed points (for more details see Wyrick and Mazzucato, 2021). The infinitesimal mean of the postsynaptic input to a neuron in a representative E assembly in focus is

$$\mu_{\mathrm{in}} = N_{\mathrm{in}}\,\langle J \rangle\, r_{\mathrm{in}} + \cdots, \qquad (26)$$
$$\begin{aligned} \mu^{in} &= \tau_m J_+ \bar{J}_{EE} N^{in}\, r^{in} \\ &\quad + \tau_m \Big[ \bar{J}_{EI} N^{I,in} r^{I,in} + \sum_{l \neq in} \left( J^{EE}_- \bar{J}_{EE} N^E_l\, r^E_l + \bar{J}_{EI} N^I_l\, r^I_l \right) + \bar{J}_{EE} N^{bg}_E r^{bg}_E + \bar{J}_{EI} N^{bg}_I r^{bg}_I \Big], \end{aligned} \tag{26}$$
where $r^{in}$ is the firing rate of the E assembly in focus and $r^{I,in}$ is the firing rate of its paired I assembly; $r^E_l, r^I_l$, for $l \neq in$, are the firing rates of the other E and I assemblies; $r^{bg}_E, r^{bg}_I$ are the firing rates of the background (unclustered) populations. The assembly sizes $N^E_l, N^I_l$ reflect the fractions of E and I neurons in each assembly, which are drawn from a uniform distribution (see above). The first line in Equation 26 represents the contribution to the input current coming from neurons within the same E assembly, or, in other words, the self-coupling of the assembly in focus. We can thus recast the first term in the input current as $s^{in} r^{in}$, where $s^{in} = \tau_m J_+ \bar{J}_{EE} N^{in}$. The number of neurons in the assembly is given by $N^{in} = f\, n_E N$, and the average E-to-E synaptic coupling is $\bar{J}_{EE} = p_{EE} J_{EE}$, from which we obtain $s^{in} = \tau_m J_+ p_{EE} J_{EE} f\, n_E N$, which is the expression we used in Figure 5. We can thus recast Equation 26 as
$$\begin{aligned} \mu^{in} &= s^{in}\, r^{in} + \sum_{l \neq in} s^E_l\, r^E_l \\ &\quad + s^{I,in} r^{I,in} + \sum_{l \neq in} s^I_l\, r^I_l + s^{bg}_E r^{bg}_E + s^{bg}_I r^{bg}_I, \end{aligned} \tag{27}$$
where $s^{I,in}$, $s^E_l$, $s^I_l$, and $s^{bg}_{E,I}$ represent effective synaptic couplings which depend on the underlying spiking network parameters in Equation 26. In the spiking model, the self-couplings have units of [mV]. The first line in Equation 27 exhibits the same functional form as the rate model in Equation 1, if we identify each rate unit as a functional assembly with a corresponding self-coupling. A crucial simplification occurring in the rate model Equation 1 is the absence of cell-type specific connectivity and the corresponding difference in the statistics of the distribution of the effective couplings, whose mean is zero in Equation 1 but non-zero in Equation 27. If we interpret the current $x_i$ in the rate model as a membrane potential with units of [mV] (see Miller and Fumarola, 2012), and the current-to-rate function $\phi$ as a normalized (min-maxed) firing rate fluctuation around baseline (see Ahmadian et al., 2015), then the self-coupling $s_i$ in the rate model exhibits units of [mV] as in the spiking model. However, direct numerical comparison of the self-couplings between the two models is hampered by the fact that the spiking model is a balanced network, where E and I contributions to the total input current are large and cancel to leading order (Wyrick and Mazzucato, 2021), whereas the rate network does not operate in the balanced regime.
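A minimal numerical sketch of these mean-field ingredients is given below, assuming the standard Ricciardi form of the LIF transfer function with the synaptic boundary shift and using hypothetical parameter values throughout.

```python
# Sketch of the LIF current-to-rate transfer function with the synaptic
# (Fourcaud-Brunel) boundary shift, plus the self-coupling expression derived
# above. All parameter values are illustrative assumptions.
import numpy as np
from scipy.integrate import quad
from scipy.special import erfcx  # erfcx(x) = exp(x^2) * erfc(x)

A = 1.0327  # a = |zeta(1/2)| / sqrt(2)

def lif_rate(mu, sigma, tau_m=0.02, tau_s=0.005, tau_refr=0.005,
             V_thr=1.0, V_reset=0.0):
    """Stationary rate of a noise-driven LIF neuron; the integration bounds
    are shifted by a*sqrt(tau_s/tau_m) to account for synaptic filtering."""
    shift = A * np.sqrt(tau_s / tau_m)
    lo = (V_reset - mu) / sigma + shift
    hi = (V_thr - mu) / sigma + shift
    integral, _ = quad(lambda u: erfcx(-u), lo, hi)  # e^{u^2}(1+erf(u)) = erfcx(-u)
    return 1.0 / (tau_refr + tau_m * np.sqrt(np.pi) * integral)

# Self-coupling of an assembly, s_in = tau_m * J_plus * p_EE * J_EE * f * n_E * N,
# evaluated with hypothetical numbers
tau_m, J_plus, p_EE, J_EE, f, n_E, N = 0.02, 5.0, 0.2, 0.3, 0.05, 0.8, 2000
s_in = tau_m * J_plus * p_EE * J_EE * f * n_E * N
print(f"rate at (mu=0.9, sigma=0.2): {lif_rate(0.9, 0.2):.1f} spks/s")
print(f"example assembly self-coupling s_in = {s_in:.2f}")
```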
Model selection for timescale fit
The degree of the polynomial that best fits the dependence of the logarithm of the assembly timescales on the assembly self-couplings was estimated using cross-validation (inset in Figure 5b), according to the following steps. (1) The dataset was split into a training set and a test set of the same size. (2) Polynomial functions of increasing degree were fit to the training set. (3) The mean-squared error on the test set was obtained for each fit. (4) A minimum of the test error was achieved for a polynomial of degree 2. All logarithms, abbreviated as 'log' in the main text, are in the natural base $e$.
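In code, the procedure amounts to a few lines; the sketch below uses synthetic data in place of the fitted assembly self-couplings and timescales.

```python
# Sketch of the cross-validation procedure with synthetic stand-ins for the
# (self-coupling, log-timescale) pairs; degrees and sizes are illustrative.
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(1.5, 6.0, 200)                   # assembly self-couplings (synthetic)
y = 0.5 * x**2 - x + rng.normal(0.0, 0.3, 200)   # log assembly timescales (synthetic)

idx = rng.permutation(len(x))
train, test = idx[:100], idx[100:]               # equal-size train/test split
for degree in range(1, 6):
    coeffs = np.polyfit(x[train], y[train], degree)
    mse = np.mean((np.polyval(coeffs, x[test]) - y[test]) ** 2)
    print(f"degree {degree}: test MSE = {mse:.3f}")
```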
Appendix 1
Dynamical regions of networks with identical self-couplings
It is instructive to briefly survey the results of Stern et al., 2014, who studied the special case of a single self-coupling value shared by all assemblies in the network, $s_i = s$. In this case, the dynamics of all units in the network follow:
$$\tau \frac{dx_i}{dt} = -x_i + s \tanh(x_i) + g \sum_{j=1}^{N} J_{ij} \tanh(x_j). \tag{28}$$
Two variables determine the network dynamics: the network gain $g$ and the self-coupling value $s$. The network gain defines the strength of the network's impact on each of its units. Without self-coupling ($s = 0$), it brings the network into chaotic activity for values $g > 1$ (Sompolinsky et al., 1988). The self-coupling generates bi-stability. Without network impact ($g = 0$), the dynamical Equation 28 for each unit becomes
$$\tau \frac{dx}{dt} = -x + s \tanh(x),$$
which for $s > 1$ has two stable solutions (Appendix 1—figure 2a), both at $x = \pm x^*$ away from the origin. For $s < 1$ (Appendix 1—figure 2b), a single stable solution exists at $x = 0$.
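This bifurcation at $s = 1$ is easy to verify numerically; the sketch below locates the non-trivial fixed points of the isolated-unit equation.

```python
# Fixed points of the isolated unit dx/dt = -x + s*tanh(x): for s <= 1 only
# x = 0 is stable; for s > 1 a symmetric pair of stable fixed points appears.
import numpy as np
from scipy.optimize import brentq

def positive_fixed_point(s):
    """Positive root of x = s*tanh(x), which exists only for s > 1."""
    return brentq(lambda x: s * np.tanh(x) - x, 1e-9, 10 * s) if s > 1 else 0.0

for s in (0.5, 1.0, 1.5, 2.0, 3.0):
    x_star = positive_fixed_point(s)
    desc = f"x = +/-{x_star:.3f}" if x_star > 0 else "x = 0 only"
    print(f"s = {s}: stable fixed point(s) at {desc}")
```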
When small values of the network gain $g$ are introduced into the network dynamics, Equation 28, with identical bi-stable units ($s > 1$), each unit's solution jitters around one of its two possible fixed points. After a period of irregular activity, the network settles into a stable fixed point. This generates a region of transient irregular activity with stable fixed points (Appendix 1—figure 2c). As $g$ increases and $s$ decreases, the different possible fixed-point configurations lose their stability (as a result, the typical time spent in the transient activity increases). When the critical line is crossed, no fixed point remains stable and the network activity becomes chaotic (Stern et al., 2014). The 'last' stable fixed point at the transition line has a unique configuration, with all unit values located farthest from zero (Appendix 1—figure 2a, light green lines). A further decrease of $s$ and $g$ leads to a region where any initial activity of the network decays and the trivial solution ($x_i = 0$ for all $i$) is stable (Appendix 1—figure 2c).
Appendix 2
Metastable dynamics in the spiking network model
The spiking network model with E/I clusters is based on a weight matrix with four E/I submatrices, each exhibiting diagonal blocks representing potentiated intra-cluster synaptic weights (see Appendix 2—figure 1a, b). We can approximate the full weight matrix with a mean-field reduction to a set of mean-field variables $\mathbf{v}$, representing the E and I clusters and the two unclustered background E and I populations, coupled through a mean-field connectivity matrix $M$. In order to gain an intuitive understanding of the population dynamics encoded by $M$, we follow the approach in Murphy and Miller, 2009; Schaub et al., 2015 and consider an auxiliary linear system
$$\dot{\mathbf{v}}(t) = \left(M - \mathbb{1}\right)\mathbf{v}(t),$$
which was shown to capture the transition from asynchronous irregular activity to metastable attractor dynamics in Schaub et al., 2015. A Schur decomposition of $M$ gives an upper triangular matrix $T = U^\top M U$, whose Schur vectors (the columns of $U$) represent independent dynamical modes (i.e., an orthonormal basis; Appendix 2—figure 1b). The Schur eigenvalues (the diagonal values of the Schur matrix $T$) correspond to the real parts of the eigenvalues of the original matrix $M$. The Schur eigenvectors associated with the large positive eigenvalues approximately correspond to E/I cluster pairs; larger eigenvalues correspond to clusters of increasingly larger size (Appendix 2—figure 1b).
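The sketch below illustrates the Schur analysis on a small placeholder matrix; the actual mean-field matrix $M$ is derived from the spiking network parameters, and the strongly coupled E/I pair inserted here is purely illustrative.

```python
# Schur decomposition of a toy mean-field matrix: the columns of U form an
# orthonormal basis of dynamical modes, and the diagonal of T carries the
# (real parts of the) eigenvalues. M below is a random placeholder with one
# strongly coupled E/I pair, not the matrix derived from the spiking network.
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
M = rng.normal(0.0, 0.1, (10, 10))
M[:2, :2] += np.array([[2.0, -1.5],
                       [1.2, -0.9]])   # an E/I pair with strong mutual coupling

T, U = schur(M, output='real')          # M = U @ T @ U.T, with orthonormal U
print("diagonal of the Schur matrix:", np.round(np.diag(T), 2))
k = int(np.argmax(np.diag(T)))          # mode with the largest Schur value
print("associated Schur basis vector:", np.round(U[:, k], 2))
```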
Clustered networks generate metastable attractor dynamics where coupled E/I cluster pairs switch between periods of low and high firing-rate activity, yielding a bimodal firing-rate distribution (Appendix 2—figure 2a, b; see Litwin-Kumar and Doiron, 2012; Deco and Hugues, 2012; Mazzucato et al., 2015; Mazzucato et al., 2019; Mazzucato et al., 2016; Wyrick and Mazzucato, 2021; Schaub et al., 2015). A more detailed analysis revealed that the network activity explores metastable attractors with a varying number of simultaneously active clusters (from one to four in the representative network in Appendix 2—figure 2c), yielding a complex attractor landscape (Mazzucato et al., 2015; Wyrick and Mazzucato, 2021). The firing rate of neurons in active clusters is inversely proportional to the number of simultaneously active clusters. Therefore, the activity of single neurons is not simply bistable, but rather multistable, with several levels of firing rates attainable depending on which attractor the network is expressing at any given time (one inactive and four active levels in this representative case). This single-neuron multistability is a biologically plausible effect observed in cortical activity (Mazzucato et al., 2015; Recanatesi et al., 2022); however, it is not present in the rate network, where units are simply bistable (Figure 4). In the spiking model, clusters are uncorrelated (0.01±0.12, mean±S.D. across 20 networks; Appendix 2—figure 2c), similarly to units in the rate network.
Appendix 3
RNN with heterogeneous time constants
Our recurrent neural network model in Equation 1 assumes that all units share the same time constant $\tau$, which measures the rate of change of a neuron's membrane potential. We examined whether a network of units with heterogeneous time constants could give rise to multiple timescales of dynamics. We simulated the model from Equation 1 with no self-coupling term ($s_i = 0$) and with neuron-specific time constants $\tau_i$:
$$\tau_i \frac{dx_i}{dt} = -x_i + g \sum_{j=1}^{N} J_{ij} \tanh(x_j).$$
Following the same strategy as in Figure 2, we considered the scenario in which our network contains two equal-size populations of $N/2$ neurons each, with different time constants $\tau_1$ and $\tau_2$. We quantified each unit's timescale $\tau^{ac}_i$ as the width of its autocorrelation function at half height. When keeping $\tau_1$ fixed and increasing $\tau_2$, we found that both populations increased their timescales (Appendix 3—figure 1a(i–v)), and the ratio between the timescales of the two populations did not appreciably change over a large range of time-constant ratios $\tau_2/\tau_1$ (Appendix 3—figure 1b). Hence, we conclude that heterogeneity in single-cell time constants does not lead to a large separation of timescales in networks with strong recurrent dynamics.
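A miniature version of this test is sketched below; network size, gain, time constants, and the half-height estimator parameters are illustrative assumptions.

```python
# Miniature version of the Appendix 3 test: two populations with different
# time constants, no self-coupling. All parameter values are illustrative.
import numpy as np

rng = np.random.default_rng(4)
N, g, dt, steps, transient = 200, 2.0, 1e-3, 20_000, 5_000
J = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))
tau = np.where(np.arange(N) < N // 2, 0.01, 0.05)   # tau2 / tau1 = 5

x = rng.normal(0.0, 1.0, N)
traj = np.empty((steps, N))
for t in range(steps):
    x += dt / tau * (-x + g * (J @ np.tanh(x)))
    traj[t] = x
traj = traj[transient:]                             # discard the transient

def ac_halfwidth(sig, dt):
    """Lag at which the autocorrelation first drops to half its zero-lag value."""
    sig = sig - sig.mean()
    f = np.fft.rfft(sig, 2 * len(sig))
    ac = np.fft.irfft(f * np.conj(f))[: len(sig)]
    below = np.flatnonzero(ac < 0.5 * ac[0])
    return below[0] * dt if below.size else np.inf

tau_ac = np.array([ac_halfwidth(traj[:, i], dt) for i in range(N)])
print("fast population timescale:", tau_ac[: N // 2].mean())
print("slow population timescale:", tau_ac[N // 2 :].mean())
```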
Data availability
https://github.com/nistrate/multipleTimescalesRNN (copy archived at Istrate, 2022).
- Dryad Digital Repository. Data from: Autocorrelation structure at rest predicts value correlates of single neurons during reward-guided choice. https://doi.org/10.5061/dryad.5b331
References
- Properties of networks with partially structured and partially random connectivity. Physical Review E 91:012820. https://doi.org/10.1103/PhysRevE.91.012820
- Transition to chaos in random networks with cell-type-specific connectivity. Physical Review Letters 114:088101. https://doi.org/10.1103/PhysRevLett.114.088101
- A reservoir of time constants for memory traces in cortical neurons. Nature Neuroscience 14:366–372. https://doi.org/10.1038/nn.2752
- Weak ergodicity breaking and aging in disordered systems. Journal de Physique I 2:1705–1713. https://doi.org/10.1051/jp1:1992238
- Beyond mean field theory: statistical field theory for neural networks. Journal of Statistical Mechanics 2013:P03003. https://doi.org/10.1088/1742-5468/2013/03/P03003
- Neural network mechanisms underlying stimulus driven variability reduction. PLOS Computational Biology 8:e1002395. https://doi.org/10.1371/journal.pcbi.1002395
- Fluctuations and information filtering in coupled populations of spiking neurons with adaptation. Physical Review E 90:062704. https://doi.org/10.1103/PhysRevE.90.062704
- Metabolic flux responses to pyruvate kinase knockout in Escherichia coli. Journal of Bacteriology 184:152–164. https://doi.org/10.1128/JB.184.1.152-164.2002
- Dynamics of the firing probability of noisy integrate-and-fire neurons. Neural Computation 14:2057–2110. https://doi.org/10.1162/089976602320264015
- Intrinsic neuronal properties switch the mode of information transmission in networks. PLOS Computational Biology 10:e1003962. https://doi.org/10.1371/journal.pcbi.1003962
- Colored noise in dynamical systems. Advances in Chemical Physics 89:239–326. https://doi.org/10.1002/9780470141489
- A unified mathematical framework for coding time, space, and sequences in the hippocampal region. The Journal of Neuroscience 34:4692–4707. https://doi.org/10.1523/JNEUROSCI.5808-12.2014
- Software: multipleTimescalesRNN, version swh:1:rev:bb4e9879442c1f3e3efa72420d75f453488bd0c4. Software Heritage.
- Dynamical systems: a unified colored-noise approximation. Physical Review A 35:4464–4466. https://doi.org/10.1103/physreva.35.4464
- Movement decomposition in the primary motor cortex. Cerebral Cortex 29:1619–1633. https://doi.org/10.1093/cercor/bhy060
- Mechanisms of information filtering in neural systems. IEEE Transactions on Molecular, Biological and Multi-Scale Communications 2:5–15. https://doi.org/10.1109/TMBMC.2016.2618863
- Slow dynamics and high variability in balanced cortical networks with clustered connections. Nature Neuroscience 15:1498–1505. https://doi.org/10.1038/nn.3220
- Self-tuned critical anti-Hebbian networks. Physical Review Letters 102:258102. https://doi.org/10.1103/PhysRevLett.102.258102
- Statistical dynamics of classical systems. Physical Review A 8:423–437. https://doi.org/10.1103/PhysRevA.8.423
- Dynamics of multistable states during ongoing and evoked cortical activity. The Journal of Neuroscience 35:8214–8231. https://doi.org/10.1523/JNEUROSCI.4819-14.2015
- Stimuli reduce the dimensionality of cortical activity. Frontiers in Systems Neuroscience 10:11. https://doi.org/10.3389/fnsys.2016.00011
- Spatial gradients and multidimensional dynamics in a neural integrator circuit. Nature Neuroscience 14:1150–1159. https://doi.org/10.1038/nn.2888
- A hierarchy of intrinsic timescales across primate cortex. Nature Neuroscience 17:1661–1663. https://doi.org/10.1038/nn.3862
- How single neuron properties shape chaotic dynamics and signal transmission in random neural networks. PLOS Computational Biology 15:e1007122. https://doi.org/10.1371/journal.pcbi.1007122
- Neural heterogeneity promotes robust learning. Nature Communications 12:5791. https://doi.org/10.1038/s41467-021-26022-3
- Stimulus-dependent suppression of chaos in recurrent neural networks. Physical Review E 82:011903. https://doi.org/10.1103/PhysRevE.82.011903
- Emergence of slow-switching assemblies in structured neuronal networks. PLOS Computational Biology 11:e1004196. https://doi.org/10.1371/journal.pcbi.1004196
- Chaos in random neural networks. Physical Review Letters 61:259–262. https://doi.org/10.1103/PhysRevLett.61.259
- Dynamics of random neural networks with bistable units. Physical Review E 90:062710. https://doi.org/10.1103/PhysRevE.90.062710
- State-dependent regulation of cortical processing speed via gain modulation. The Journal of Neuroscience 41:3988–4005. https://doi.org/10.1523/JNEUROSCI.1895-20.2021
- Task representations in neural networks trained to perform many cognitive tasks. Nature Neuroscience 22:297–306. https://doi.org/10.1038/s41593-018-0310-2
Article and author information
Funding
National Institute of Neurological Disorders and Stroke (R01-NS118461)
- Luca Mazzucato
National Institute on Drug Abuse (R01-DA055439)
- Luca Mazzucato
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We would like to thank J Wallis and S Cavanagh for sharing their previously published data (Cavanagh et al., 2016), and G Mongillo and G La Camera for discussions. LM was supported by National Institute of Neurological Disorders and Stroke grant R01-NS118461 and by National Institute on Drug Abuse grant R01-DA055439 (CRCNS). MS was supported by The Hebrew University of Jerusalem 'Emergency response to COVID-19' grant and was hosted by the Theoretical Sciences Visiting Program at the Okinawa Institute of Science and Technology while completing this study.
Copyright
© 2023, Stern, Istrate et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.