A reservoir of timescales emerges in recurrent circuits with heterogeneous neural assemblies

  1. Merav Stern
  2. Nicolae Istrate
  3. Luca Mazzucato (corresponding author)
  1. Institute of Neuroscience, University of Oregon, United States
  2. Faculty of Medicine, The Hebrew University of Jerusalem, Israel
  3. Departments of Physics, Mathematics and Biology, University of Oregon, United States

Abstract

The temporal activity of many physical and biological systems, from complex networks to neural circuits, exhibits fluctuations simultaneously varying over a large range of timescales. Long-tailed distributions of intrinsic timescales have been observed across neurons simultaneously recorded within the same cortical circuit. The mechanisms leading to this striking temporal heterogeneity are yet unknown. Here, we show that neural circuits, endowed with heterogeneous neural assemblies of different sizes, naturally generate multiple timescales of activity spanning several orders of magnitude. We develop an analytical theory using rate networks, supported by simulations of spiking networks with cell-type specific connectivity, to explain how neural timescales depend on assembly size and show that our model can naturally explain the long-tailed timescale distribution observed in the awake primate cortex. When driving recurrent networks of heterogeneous neural assemblies by a time-dependent broadband input, we found that large and small assemblies preferentially entrain slow and fast spectral components of the input, respectively. Our results suggest that heterogeneous assemblies can provide a biologically plausible mechanism for neural circuits to demix complex temporal input signals by transforming temporal into spatial neural codes via frequency-selective neural assemblies.

Editor's evaluation

This fundamental work uses computational network models to suggest a possible origin of the wide range of time scales observed in cortical activity. This claim is supported by convincing evidence based on comparisons between mathematical theory, simulations of spiking network models, and analysis of recordings from the orbitofrontal cortex. This manuscript will be of interest to the broad community of systems and computational neuroscience.

https://doi.org/10.7554/eLife.86552.sa0

Introduction

Experimental evidence shows that the temporal activity of many physical and biological systems exhibits fluctuations simultaneously varying over a large range of timescales. In condensed matter physics, spin glasses typically exhibit aging and relaxation effects whose timescales span several orders of magnitude (Bouchaud, 1992). In biological systems, metabolic networks of E. coli generate fluxes with a power-law distribution of rates (Almaas et al., 2004; Emmerling et al., 2002). Gas release in yeast cultures exhibits frequency distributions spanning many orders of magnitude (Roussel and Lloyd, 2007), endowing them with robust and flexible responses to the environment (Aon et al., 2008). In the mammalian brain, a hierarchy of timescales in the activity of single neurons is observed across different cortical areas from occipital to frontal regions (Murray et al., 2014; Siegle et al., 2019; Gao et al., 2020). Moreover, neurons within the same local circuit exhibit a large range of timescales from milliseconds to minutes (Bernacchia et al., 2011; Cavanagh et al., 2016; Miri et al., 2011). This heterogeneity of neuronal timescales was observed in awake animals during periods of ongoing activity in the absence of external stimuli or behavioral tasks, suggesting that a long-tailed distribution of timescales may be an inherent property of recurrent cortical circuits. Recent studies highlighted the benefits of leveraging computations on multiple timescales when performing complex tasks in primates (Iigaya et al., 2019) as well as in artificial neural networks (Perez-Nieves et al., 2021). However, the neural mechanisms underlying the emergence of multiple timescales are not yet understood.

Here, we present a simple and robust neural mechanism generating heterogeneous timescales of activity in recurrent circuits. The central feature of our model is a heterogeneous distribution of cell assemblies, a common ingredient observed in cortical architecture (Perin et al., 2011; Marshel et al., 2019; Figure 1a). We first demonstrate that rate networks, whose rate units represent cell-assemblies, can generate long-tailed distributions of timescales when endowed with heterogeneous assemblies (Figure 1b). We then show that the heterogeneity of timescales, observed in electrophysiological recordings from awake primate cortex (Cavanagh et al., 2016), can be explained by the presence of heterogeneous cell assemblies (Figure 1b). Using methods from statistical physics, we develop an analytical framework explaining how an assembly’s intrinsic timescale depends on size, revealing the emergence of a new chaotic regime where activity is bistable. We show that our theory applies to biologically plausible models of cortical circuits based on spiking networks with cell-type specific clustered architectures.

Summary of the main results.

(a) Left: Microscopic model based on a recurrent network of spiking neurons with excitatory and inhibitory cell types, arranged in neural assemblies of heterogeneous sizes. Right: Phenomenological model based on a recurrent network of rate units. Each unit corresponds to an E/I neural assembly, whose size is represented by the unit’s self-coupling si. (b) A lognormal distribution of self-couplings (representing assemblies of different sizes) generates time-varying activity whose heterogeneous distribution of timescales fits population activity recorded from awake monkey orbitofrontal cortex (data from Cavanagh et al., 2016). (c) When driving our heterogeneous network with broadband time-varying input, comprising a superposition of sine waves of different frequencies, large and small assemblies preferentially entrain with low and high spectral components of the input, respectively, thus demixing frequencies into responses of different populations.

We then study the stimulus-response properties of networks with heterogeneous assemblies. In networks with homogeneous assemblies, chaotic activity is suppressed at a single resonant frequency (Rajan et al., 2010). However, when we drive heterogeneous networks with a time-dependent broadband input featuring a superposition of multiple frequencies, we find that the chaotic activity is suppressed across multiple frequencies, which depend on each assembly’s own size. Large and small assemblies are preferentially entrained by the low and high-frequency components of the input, respectively (Figure 1c). This spectral specificity suggests that a reservoir of timescales may be a natural mechanism for cortical circuits to flexibly demix different spectral features of complex time-varying inputs. This mechanism may endow neural circuits with the ability to transform temporal neural codes into spatial neural codes via frequency-selective neural assemblies.

Results

To develop a theory of heterogeneous timescales, we first focus on recurrent networks of rate units with randomly chosen couplings. In this model, we will be able to leverage analytical methods from statistical field theory (Sompolinsky et al., 1988; Buice and Chow, 2013; Helias and Dahmen, 2020) to link model parameters to circuit dynamics. In our rate network model, each network unit represents a functional assembly of cortical neurons with similar response properties. We interpret the unit’s self-coupling as the size of the corresponding neural assembly (if recurrent couplings across the population vary significantly, we also interpret the self-coupling as representing the average coupling strength within an assembly). In the case where the self-couplings are zero or weak (order 1/N, with N being the size of the network), random networks are known to undergo a phase transition from silence to chaos when the variance of the random couplings exceeds a critical value (Sompolinsky et al., 1988). When the self-couplings are strong (order 1) and all equal, a third phase appears, featuring multiple stable fixed points accompanied by long transient activity (Stern et al., 2014). In all these cases, all network units exhibit the same intrinsic timescale, estimated from their autocorrelation function. Here, we demonstrate a novel class of recurrent networks, capable of generating temporally heterogeneous activity whose multiple timescales span several orders of magnitude. We show that when the self-couplings are heterogeneous, a reservoir of multiple timescales emerges, where each unit’s intrinsic timescale depends both on its own self-coupling and on the network’s self-coupling distribution.

Random networks with heterogeneous self-couplings

We start by considering a recurrent network of N rate units obeying the dynamical equations

(1) \frac{dx_i}{dt} = -x_i + s_i\,\phi(x_i) + g\sum_{j=1}^{N} J_{ij}\,\phi(x_j),

where the random couplings Jij from unit j to unit i are drawn independently from a Gaussian distribution with mean 0 and variance 1/N; g represents the network gain, and we choose the transfer function ϕ(x) = tanh(x). The self-couplings si are drawn from a distribution P(s). The special case of equal self-couplings (si = s) was studied by Stern et al., 2014, and a summary of those results can be found in Appendix 1 for convenience. Here, we study the network properties in relation to both discrete and continuous distributions P(s).
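To make the model concrete, here is a minimal simulation sketch of Equation 1 (not the authors' code): an Euler integration of N rate units with i.i.d. Gaussian couplings of variance 1/N and lognormal self-couplings. The network size, gain, integration step, and lognormal parameters are illustrative choices.

```python
import numpy as np

def simulate_network(N=1000, g=2.5, T=200.0, dt=0.05, seed=0):
    rng = np.random.default_rng(seed)
    J = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))  # couplings: mean 0, variance 1/N
    np.fill_diagonal(J, 0.0)                            # self-interactions live in s_i instead
    s = rng.lognormal(mean=0.2, sigma=1.0, size=N)      # heterogeneous self-couplings P(s)
    x = rng.normal(0.0, 1.0, size=N)                    # random initial condition
    n_steps = int(T / dt)
    traj = np.empty((n_steps, N))
    for t in range(n_steps):
        phi = np.tanh(x)
        # Euler step of Equation 1: dx/dt = -x + s*phi(x) + g * J @ phi(x)
        x = x + dt * (-x + s * phi + g * (J @ phi))
        traj[t] = x
    return s, traj

s, traj = simulate_network()
```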

Using standard methods of statistical field theory (Buice and Chow, 2013; Helias and Dahmen, 2020; see Methods: 'Dynamic mean-field theory with multiple self-couplings' for details), in the limit of large N we can average over realizations of the disordered couplings Jij to derive a set of self-consistent dynamic mean-field equations for each population of units xα with self-coupling strength sα ∈ S:

(2) \frac{dx_\alpha}{dt} = -x_\alpha + s_\alpha\tanh(x_\alpha) + \eta(t).

In our notation, S denotes the set of different values of the self-couplings sα, indexed by α ∈ A; we denote by Nα the number of units with the same self-coupling sα, and accordingly by nα = Nα/N their fraction. The mean-field η(t) is the same Gaussian process for all units, with zero mean, ⟨η(t)⟩ = 0, and autocorrelation

(3) \langle\eta(t)\,\eta(t+\tau)\rangle = g^2 C(\tau), \qquad C(\tau) = \sum_{\alpha\in A} n_\alpha\,\langle\phi[x_\alpha(t)]\,\phi[x_\alpha(t+\tau)]\rangle,

where ⟨·⟩ denotes an average over the mean-field.

We found that networks with heterogeneous self-couplings exhibit a complex landscape of fixed points xα*, obtained as the self-consistent solutions to the static versions of Equation 2 and Equation 3, subject to stability conditions (see Methods: 'Fixed points and transition to chaos' and 'Stability conditions'). For fixed values of the network gain g, these fixed points can be destabilized by varying the self-couplings of different assemblies, inducing a transition to time-varying chaotic activity (Figure 2). The fixed-point landscape exhibits remarkable features inherited directly from the single-value self-coupling case, extensively studied in Stern et al., 2014. Here, we focus on the dynamical properties of the time-varying chaotic activity, which constitute new features resulting from the heterogeneity of the self-couplings. We illustrate the network’s dynamical features in the case of a network with two sub-populations comprising fractions n1 and n2 = 1 − n1 of the units, with self-couplings s1 and s2, respectively. In the (s1, s2) plane, this model gives rise to a phase diagram with a single chaotic region separating four disconnected stable fixed-point regions (Figure 2a). In the case of a Gaussian distribution of self-couplings in the stable fixed point regime, a complex landscape of stable fixed points emerges: the unit values at the stable fixed points continuously interpolate between values concentrated around zero (for units with si < 1) and a bimodal distribution (for units with si > 1) within the same network (Figure 3a).

Dynamical and fixed point properties of networks with two self-couplings.

(a) Ratio of autocorrelation timescales τ2/τ1 of units with self-couplings s2 and s1, respectively (τi is estimated as the half width at half max of a unit’s autocorrelation function, see panels iii, iv), in a network with n1 = n2 = 0.5 and g = 2 and varying s1, s2. A central chaotic phase separates four different stable fixed point regions with or without transient activity. Black curves represent the transition from chaotic to stable fixed point regimes, which can be found by consistently solving Equation 15, Equation 16, and Equation 18 (setting the expression in the latter equal to 1); see Methods ('Fixed points and transition to chaos' and 'Stability conditions') for details. (i, ii) Activity across time during the initial transient epoch (left) and distributions of unit values at their stable fixed points (right), for networks with N = 1000 and (i) s1 = 3.2, s2 = 1.5, (ii) s1 = 3.2, s2 = 1.2. (iii, iv) Activity across time (left) and normalized autocorrelation functions C(τ)/C(0) (right) of units with (iii) s1 = 0.8, s2 = 1.5, (iv) s1 = 0.8, s2 = 3.2. (b) Timescales τ2, τ1 (left) and their ratio τ2/τ1 (right) for fixed s1 = 1 and varying s2, as a function of the relative sizes of the two populations n1 = N1/N, n2 = N2/N (at g = 2, N = 2000; average over 20 network realizations). All points in (b) were verified to be within the chaotic region using Equation 18.

Continuous distributions of self-couplings.

(a–i) In a network with a Gaussian distribution of self-couplings (mean μ = 1 and variance σ² = 9) and g = 2.5, the stable fixed point regime exhibits a distribution of fixed point values interpolating between the zero fixed point (for units with si ≲ 1) and the multi-modal case (for units with si > 1). The purple curve represents solutions to x = s·tanh(x). (a–ii, b–ii) In networks with a lognormal distribution of self-couplings (parameters μ = 0.2, 0.5 and σ² = 1, 0.62 for (a) and (b), respectively, and g = 2.5; autocorrelation timescales τi in units of ms) in the chaotic phase, the timescales span several orders of magnitude as functions of the units’ self-couplings si. (a–ii) Mean-field predictions for the autocorrelation functions and their timescales (purple curve) were generated from Equation 13 and Equation 14 via an iterative procedure, see Methods: 'Dynamic mean-field theory with multiple self-couplings', 'An iterative solution'. (b) Populations of neurons recorded from the orbitofrontal cortex of awake monkeys exhibit a lognormal distribution of intrinsic timescales (data from Cavanagh et al., 2016) (panel b–i, red), consistent with neural activity generated by a rate network with a lognormal distribution of self-couplings (panel b–i, blue; panel b–ii). We note that Cavanagh et al., 2016 use fitted exponential decay time constants of the autocorrelation functions as neurons’ timescales, while we use the half widths at half max of the autocorrelation functions as the timescales. To bridge these two definitions, we multiplied the Cavanagh et al., 2016 data by a factor of ln(2) before comparing it with our model and presenting it in this figure. The model membrane time constant was assumed to be 3 ms in this example.

A reservoir of heterogeneous timescales explains cortical recordings

In the chaotic phase, we can estimate the intrinsic timescale τi of a unit xi from its autocorrelation function C(τ) = ⟨ϕ[xi(t)] ϕ[xi(t+τ)]⟩_t, as the half width at half maximum of C(τ) (Figure 2a–iv, τ1 and τ2). The chaotic phase of the network in Equation 1 is characterized by a large range of timescales that can be simultaneously realized across units with different self-couplings. In a network with two self-couplings s1 and s2 in the chaotic regime, we found that the ratio of the timescales τ2/τ1 increases as we increase the self-couplings ratio s2/s1 (Figure 2b). The separation of timescales depends on the relative fractions n1 and n2 = 1 − n1 of the fast and slow populations, respectively. When the fraction n2 approaches zero (with n1 → 1), the log of the timescale ratio exhibits a supralinear dependence on the self-couplings ratio, as described analytically in Methods ('Universal colored-noise approximation to the Fokker-Planck theory'), with a simplified estimate given in Equation 4, leading to a vast separation of timescales. For other fractions, the timescale ratio approaches this supralinear separation as n1 increases. We note that 'log' denotes the natural logarithm (base e) throughout.
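The following is a minimal sketch of this nonparametric timescale estimate: it computes the autocorrelation of ϕ[x(t)] for a single unit's trajectory and returns the first lag at which it drops below half its zero-lag value. Mean subtraction and the sampling step dt are implementation choices.

```python
import numpy as np

def timescale_hwhm(x, dt):
    """Half width at half maximum of C(tau) = <phi[x(t)] phi[x(t+tau)]>_t."""
    phi = np.tanh(x) - np.tanh(x).mean()               # center before correlating
    c = np.correlate(phi, phi, mode='full')[phi.size - 1:]
    c = c / c[0]                                       # normalize so C(0) = 1
    below = np.flatnonzero(c < 0.5)                    # first lag where C drops below half max
    return below[0] * dt if below.size else np.nan     # nan if C never decays below 0.5

# Example on a unit's trajectory from the simulation sketch above:
# tau_i = timescale_hwhm(traj[:, i], dt=0.05)
```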

In the case of a lognormal distribution of self-couplings, in the chaotic regime the network generates a reservoir of multiple timescales τi of chaotic activity across network units, spanning several orders of magnitude (Figure 3b). For long-tailed distributions such as the lognormal, mean-field theory can generate predictions for rare units with large self-couplings from the tail of the distribution by solving Equation 2 and the continuous version of Equation 3 (see Methods: 'Dynamic mean-field theory with multiple self-couplings', Equation 13). The solution highlights the exponential relation between a unit’s self-coupling and its autocorrelation decay time (Figure 3a–ii, purple dashed line).

During periods of ongoing activity, the distribution of single-cell autocorrelation timescales in primate cortex was found to be right-skewed and approximately lognormal, ranging from 10 ms to a few seconds (Cavanagh et al., 2016; Figure 3bi). Can the reservoir of timescales generated in our heterogeneous network model explain the distribution of timescales observed in primate cortex? We found that a model with a lognormal distribution of self-couplings can generate a long-tailed distribution of timescales which fits the distribution observed in awake primate orbitofrontal cortex (Figure 3bi). This result shows that neural circuits with heterogeneous assemblies can naturally generate the heterogeneity in intrinsic timescales observed in cortical circuits from awake primates.

Separation of timescales in the bistable chaotic regime

To gain an analytical understanding of the parametric separation of timescales in networks with heterogeneous self-couplings, we consider the special case of a network with two self-couplings where a large sub-population (N1 = N − 1) with s1 = 1 comprises all but one slow probe unit, x2, with large self-coupling s2 ≫ s1 (see Methods: 'Universal colored-noise approximation to the Fokker-Planck theory' for details). In the large N limit, we can neglect the backreaction of the probe unit on the mean-field and approximate the latter as an external Gaussian colored noise η(t) with autocorrelation g²C(τ) = g²⟨ϕ[x1(t)] ϕ[x1(t+τ)]⟩, independent of x2. The noise η(t) then represents the effect on the probe unit x2 of all other units in the network and can be parameterized by the noise strength D and its timescale (color) τ1. For large s2, the dynamics of the probe unit x2 lead to the emergence of a bistable chaotic phase whereby its activity is localized around the critical points x± ≈ ±s2 (Figure 4a–i) and switches between them at random times. In the regime of colored noise (as we have here, with τ1 ≈ 7.91), the stationary probability distribution p(x2) (Figure 4a–ii and b) satisfies the unified colored noise approximation to the Fokker-Planck equation (see Methods: 'Universal colored-noise approximation to the Fokker-Planck theory'; Hänggi and Jung, 1995; Jung and Hänggi, 1987), based on an analytical expression for its effective potential Ueff(x) as a function of the self-coupling s2 and the noise color τ1. The distribution p(x2) is concentrated around the two minima x± ≈ ±s2 of Ueff. The main effect of the strong color τ1 ≫ 1 is to sharply decrease the variance of the distribution around the minima x±, compared to the white noise case (τ1 = 0). This is evident when comparing the colored noise with white noise (Figure 4a–iv, v, vi).

Separation of timescales and metastable regime.

(a) Examples of bistable activity. (i, iv) time courses; (ii, v) histograms of the unit’s value across time; (iii, vi) histograms of dwell times. (a–i, ii, iii) An example of a probe unit x2 with s2 = 5, embedded in a network with N = 1000 units, N1 = N − 1 units with s1 = 1, and g = 1.5. (a–iv, v, vi) An example of a probe unit driven by white noise. Note the differences in the x-axis scalings of the time courses (a–i vs. a–iv and a–iii vs. a–vi). (b) The unified colored noise approximation stationary probability distribution (dark blue curve, Equation 19; its support excludes the shaded gray area), derived from the effective potential Ueff (dashed blue curve), captures well the activity histogram (light blue area; same as (a–ii)); whereas the white noise distribution (dark green curve, obtained from the naive potential U, dashed green curve) captures the probe unit’s activity (light green area; same as (a–v)) when driven by white noise, and deviates significantly from the activity distribution when the probe is embedded in our network. (c) Average dwell times, ⟨T⟩, in the bistable states. Simulation results: mean and 95% CI (blue curve and light blue background, respectively; an example of the full distribution of dwell times is given in (a–iii)). Mean-field predictions (purple curve) were generated by calculating the average dwell times from a trace of x2, produced by solving Equation 2 simultaneously and consistently with Equation 3 with n1 = 1 and n2 = 0. The mean first passage time from the unified colored noise approximation (Equation 22, black curve), and a simplified estimate thereof (Equation 4, gray dashed line), capture ⟨T⟩ well. When driven by white noise (green curve and light green curve: simulation results and simplified estimate using Equation 4, respectively), the probe’s average dwell times are orders of magnitude shorter than with colored noise, reflecting the substantial support of the probe distribution in the region where crossings between wells happen (allowing frequent crossings, (a–iv), green line at x = 0) and, equivalently, the low value of the potential around its maximum ((b), green dashed line at x = 0). The comparison of white and colored noise demonstrates the central role of the self-consistent colored noise in achieving long dwell times.

In our network with colored noise, the probe unit’s temporal dynamics are captured by the mean first passage time T for the escape out of the potential well defined by the effective potential Ueff, yielding good agreement with simulations at increasing s2, as expected on theoretical grounds (Hänggi and Jung, 1995; Jung and Hänggi, 1987; Figure 4c). The asymptotic scaling of the mean first passage time for large s2 is

(4) \log\langle T\rangle \sim \frac{\tau_1 + 1}{2D}\left[s_2^2 - s_2 - \log(s_2)\right].

In this slow probe regime, we thus achieved a parametric separation of timescales between the population x1, with its intrinsic timescale τ1, and the probe unit x2, whose activity fluctuations exhibit two separate timescales: the slow timescale ⟨T⟩ of the dwelling in each of the bistable states and the fast timescale τ1 of the fluctuations around the metastable states (obtained by expanding the dynamical equation around the metastable values x± = ±s2). One can generalize this metastable regime to a network with N − p units which belong to a group with s1 = 1 and p ≪ N slow probe units xα, for α = 2, …, p+1, with large self-couplings sα. The slow dynamics of each probe unit xα are captured by its own mean first passage time (between the bistable states) ⟨T⟩α in Equation 22, and all slow units are driven by a shared external colored noise η(t) with timescale τ1. In summary, in our model multiple timescales can be robustly generated with specific values, varying over several orders of magnitude.
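As a numerical illustration of the asymptotic scaling in Equation 4, the sketch below evaluates log⟨T⟩ for a few probe self-couplings. The noise strength D = 1 is a placeholder; τ1 = 7.91 echoes the value quoted above, and both would in practice be measured self-consistently from the mean-field.

```python
import numpy as np

def log_mean_dwell_time(s2, D, tau1):
    # log<T> ~ (tau1 + 1)/(2 D) * (s2**2 - s2 - log(s2))   (Equation 4)
    return (tau1 + 1.0) / (2.0 * D) * (s2**2 - s2 - np.log(s2))

for s2 in (3.0, 4.0, 5.0):
    # D = 1.0 is an illustrative placeholder, not a value from the paper
    print(f"s2 = {s2}: log<T> ~ {log_mean_dwell_time(s2, D=1.0, tau1=7.91):.1f}")
```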

Does the relationship between a unit’s self-coupling and its timescale rely on single-unit properties, or on network effects? To answer this question, we compared the dynamics of a unit driven by a white noise input vs. the self-consistent input generated by the rest of the recurrent network (i.e. the mean-field). If the neural mechanism underlying the timescale separation were a property of the single cell itself, we would observe the same effect regardless of the details of the input noise. We found that the increase in the unit’s timescale as a function of s2 is absent when driving the unit with white noise, and it only emerges when the unit is driven by the self-consistent mean-field (Figure 4c). We thus concluded that this neural mechanism is not an intrinsic property of a single unit but requires the unit to be part of a recurrently connected network.

A reservoir of timescales in E-I spiking networks

We next investigated whether the neural mechanism endowing the rate network (Equation 1) with a reservoir of timescales could be implemented in a biologically plausible model based on spiking activity and excitatory/inhibitory cell-type specific connectivity. To this end, we modeled the local cortical circuit as a recurrent network of excitatory (E) and inhibitory (I) current-based leaky integrate-and-fire neurons (see Methods: 'Spiking network model' for details), where both E and I populations were arranged in neural assemblies (Figure 5a; Amit and Brunel, 1997; Litwin-Kumar and Doiron, 2012; Wyrick and Mazzucato, 2021). Synaptic couplings between neurons in the same assembly were potentiated compared to those between neurons in different assemblies. Using mean-field theory, we found that the recurrent interactions between cell-type specific neurons belonging to the same assembly can be interpreted as a self-coupling, expressed in terms of the underlying network parameters as s_i^E = \bar{J}_{EE}^{(in)} C_i^E, where C_i^E is the assembly size and \bar{J}_{EE}^{(in)} is the average synaptic coupling between E neurons within the assembly (see Methods: 'Spiking network model' for details). The spiking network’s time-varying activity unfolds through sequences of metastable attractors (Litwin-Kumar and Doiron, 2012; Wyrick and Mazzucato, 2021), characterized by the co-activation of different subsets of neural assemblies (Figure 5a). These dynamics rely on the bistable activity of each assembly, which switches between high and low firing rate states. The dwell time of an assembly in a high-activity state increases with larger size and stronger intra-assembly coupling strength (Figure 5b). This metastable regime in spiking networks is similar to the bistable, heterogeneous-timescale activity observed in the random neural networks endowed with heterogeneous self-couplings. We further examined the features of the metastable regime in spiking networks in order to compare the mechanisms underlying the heterogeneous timescale distributions in the rate and spiking models. The two models exhibit clear differences in their “building blocks”. In the rate network, the transfer function is odd (tanh), leading to bistable states with values localized around x± ≈ ±s. In the spiking model, the single-neuron current-to-rate transfer function is strictly positive, so that the bistable states can have both high and low firing rate values. As a result, in the spiking network, unlike in the rate network, a variety of firing rate levels can be attained during the high-activity epochs, depending on both the assembly size and the number of other simultaneously active assemblies. In other words, the rate level depends on the specific network attractor visited within the realization of the complex attractor landscape (Mazzucato et al., 2015; Wyrick and Mazzucato, 2021).

Heterogeneity of timescales in E-I spiking networks.

(a) Top: Schematic of a spiking network with excitatory (black) and inhibitory (red) populations arranged in assemblies with a heterogeneous distribution of sizes. Bottom: In a representative trial, neural assemblies activate and deactivate at random times generating metastable activity (one representative E neuron per assembly; larger assemblies on top; representative network of N = 10,000 neurons), where larger assemblies tend to activate for longer intervals. (b) Average activation times ⟨T⟩ of individual assemblies (blue dots; the average was calculated across a 100 s simulation and across all neurons within the same assembly, for all assemblies in 20 different network realizations; self-coupling units are in [mV], see Methods). Fit of log⟨T⟩ = a2 sE² + a1 sE + a0 with a2 = 0.14, a1 = 1.97, a0 = 5.51 (pink curve). Inset: cross-validated model selection for the polynomial fit. As the assembly strength (i.e. the product of its size and average recurrent coupling) increases, ⟨T⟩ increases, leading to a large distribution of timescales ranging from 20 ms to 100 s. (c) Eigenvalue distribution of the full weight matrix J (brown) and the mean-field-reduced weight matrix JMF (pink). (d) The Schur eigenvectors of the weight matrix JMF show that the slow (gapped) Schur eigenvalues (top) are associated with eigenvectors corresponding to E/I cluster pairs (bottom). See Appendix ('Spiking network model') for more details and for the scaling to larger networks.

Despite these differences, we found crucial similarities between the time-varying activity in the two models related to the underlying neural mechanism. The characteristic timescale ⟨T⟩ of an assembly’s metastable dynamics can be estimated from its average activation time (Figure 5a and b). We tested whether the heterogeneity in the assembly self-coupling distribution could lead to a heterogeneous distribution of timescales. We endowed the network with a heterogeneous distribution of assembly sizes (an additional source of heterogeneity originates from the Erdos-Renyi connectivity), yielding a heterogeneous distribution of self-couplings (Figure 5b). We found that the assembly activation timescales ⟨T⟩ grew as the assembly’s self-coupling increased, spanning overall a large range of timescales from 20 ms to 100 s, i.e., the whole range of our simulation epochs (Figure 5b). In particular, the functional dependence of log⟨T⟩ on the self-coupling sE was best fit by a quadratic polynomial (Figure 5b inset, see Methods: 'Spiking network model' for details), in agreement with the functional dependence obtained from the analytical calculation in the rate model (Equation 4). We thus concluded that a reservoir of timescales can naturally emerge in biologically plausible spiking models of cortical circuits from a heterogeneous distribution of assembly sizes. Both the range of timescales (20 ms–100 s) (Cavanagh et al., 2016) and the distribution of assembly sizes (50–100 neurons) (Perin et al., 2011; Marshel et al., 2019) are consistent with experimental observations.
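A minimal sketch of such a fit (illustrative, not the authors' pipeline): given arrays of assembly self-couplings and measured activation times, it selects the polynomial degree for log⟨T⟩ by cross-validated mean squared error, as in the Figure 5b inset.

```python
import numpy as np

def fit_log_dwell(s_E, dwell_times, max_degree=4, n_folds=5, seed=0):
    """Fit log<T> vs. assembly self-coupling s_E (NumPy arrays) with a
    polynomial, choosing the degree by cross-validated mean squared error."""
    rng = np.random.default_rng(seed)
    logT = np.log(dwell_times)
    folds = rng.permutation(len(s_E)) % n_folds       # balanced random fold labels
    best = None
    for deg in range(1, max_degree + 1):
        mse = 0.0
        for k in range(n_folds):
            train, test = folds != k, folds == k
            coeffs = np.polyfit(s_E[train], logT[train], deg)
            mse += np.mean((np.polyval(coeffs, s_E[test]) - logT[test]) ** 2)
        if best is None or mse < best[0]:
            best = (mse, deg, np.polyfit(s_E, logT, deg))
    return best[1], best[2]                           # selected degree, coefficients

# Example with synthetic data mimicking the trend of Figure 5b (values illustrative):
s_E = np.linspace(1.0, 8.0, 50)
T = np.exp(0.14 * s_E**2 + 0.5 * s_E - 3.0 + 0.1 * np.random.default_rng(1).normal(size=50))
print(fit_log_dwell(s_E, T))
```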

What is the relationship between the distribution of timescales in the E/I spiking model and the chaotic rate network? We can obtain crucial insights by combining an analysis of the synaptic weight matrix with a mean-field approach in a linear approximation, following Murphy and Miller, 2009; Schaub et al., 2015. In the spiking network, the non-normal weight matrix J exhibits the typical E/I structure with four submatrices representing E/I cell-type specific connectivity; within each of the four E/I submatrices, diagonal blocks highlight the paired E/I clustered architecture (the heterogeneous distribution of cluster sizes is manifest in the increasing size of the diagonal blocks, Appendix 2—figure 1a). To interpret the dynamical features emerging from this weight matrix, we examined a mean-field reduction of the N-dimensional network to a set of 2p+2 mean-field variables, representing the 2p E and I clusters plus the two unclustered background E and I populations (see Appendix 2). The 2p+2 eigenvalues of the mean-field-reduced weight matrix JMF comprise a subset of the eigenvalues of the full weight matrix J, capturing the salient features of the spiking network dynamics in a linear approximation (Figure 5c; see Appendix 2 and Murphy and Miller, 2009; Schaub et al., 2015). The weight matrix JMF exhibits a spectral gap, beyond which a distribution of p − 1 eigenvalues with real parts larger than one corresponds to slow dynamical modes. To identify these slow modes, we examined the Schur eigenvectors of JMF, which represent independent dynamical modes in the linearized theory (i.e. an orthonormal basis; see Appendix 2 and Appendix 2—figure 2).

We found that the Schur eigenvectors associated with these large positive eigenvalues can be approximately mapped onto E/I cluster pairs. More specifically, eigenvalues with increasingly larger values correspond to assemblies of increasingly larger sizes (Figure 5c), which, in turn, are associated with slower timescales (Figure 5b). We conclude that the slow switching dynamics in the spiking network is linked to large positive eigenvalues of the synaptic weight matrix, and that the different timescales emerge from a heterogeneous distribution of these eigenvalues. For comparison, in the chaotic rate network, the eigenvalue distribution of the weight matrix exhibits a set of eigenvalues with large positive real parts as well (Appendix 2—figure 3a). The relation between an eigenvalue’s magnitude and slow dynamics holds in the rate networks as well: increasingly larger eigenvalues correspond to increasingly larger cluster self-couplings (Appendix 2—figure 3), which are associated with slower dynamics (Figure 3a–ii). Therefore, the dynamical structure in the rate networks qualitatively matches the structure uncovered in the mean-field-reduced spiking network weight matrix JMF (Figure 5c). In summary, our analysis shows that in both spiking and rate networks the reservoir of timescales is associated with the emergence of a heterogeneous distribution of large positive eigenvalues in the weight matrix. This analysis suggests that the correspondence between rate networks and spiking networks should be performed at the level of the dynamical modes associated with these large positive eigenvalues in the Schur basis, where rate units in the rate network can be related to E/I cluster pairs (Schur eigenvectors) in the spiking network. A potential difference between the two models may be related to the nature of the transitions between bistable states in the rate network vs. the transitions between low and high activity states in the spiking network. In the rate network, transitions are driven by the self-consistent colored noise, the hallmark of the chaotic activity arising from the disorder in the synaptic couplings. In the spiking network, although the disorder in the inter-assembly effective couplings may contribute to the state transitions, there might also be finite-size effects at play, due to the small assembly size.
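A minimal sketch of this Schur analysis, assuming a mean-field-reduced weight matrix JMF is available as a NumPy array: the real Schur decomposition provides an orthonormal basis of dynamical modes, from which the gapped, slow modes with real part above one can be read off.

```python
import numpy as np
from scipy.linalg import schur

def slow_schur_modes(J_MF, threshold=1.0):
    """Real Schur decomposition J_MF = Z T Z^T; the columns of Z form an
    orthonormal basis of dynamical modes in the linearized theory."""
    T, Z = schur(J_MF, output='real')
    lam = np.diag(T)                        # real eigenvalues sit on the diagonal;
                                            # complex pairs occupy 2x2 blocks
    slow = np.flatnonzero(lam > threshold)  # gapped eigenvalues with real part > 1
    return lam[slow], Z[:, slow]            # slow eigenvalues and their Schur vectors
```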

Spatial de-mixing of time-varying broadband input

What are the computational benefits of having multiple timescales simultaneously operating in the same circuit? Previous work in random networks with no self-couplings (si = 0 in Equation 1) showed that stimulus-driven suppression of chaos is enhanced at a particular input frequency, related to the network’s intrinsic timescale (Rajan et al., 2010). The phenomenon was preserved when a single rate of adaptation was added to all units (Muscinelli et al., 2019). We investigated whether, in our network with two different self-couplings s1 < s2 (in the chaotic regime), the stimulus-dependent suppression of chaos exhibited different features across the two sub-populations, depending on their different intrinsic timescales. We drove each network unit xi with an external broadband stimulus I_i(t) = A \sum_{l=1}^{L} \sin(2\pi f_l t + \theta_i), consisting of the superposition of L sinusoidal inputs with different frequencies fl in the range 1–200 Hz, equal amplitudes A = 0.5, and random phases θi. We found that the sub-population with a slow, or fast, intrinsic timescale preferentially entrained its activity with slower, or faster, spectral components of the broadband stimulus, respectively (Figure 6a). We quantified this effect using a spectral modulation index m(f) = (P2(f) − P1(f))/(P2(f) + P1(f)), where Pα(f) is the power-spectrum peak of sub-population α at the frequency f (Figure 6b). A positive, or negative, value of m(f) reveals a stronger, or weaker, entrainment at frequency f in the sub-population s2 compared to s1, respectively. The modulation index exhibited a crossover behavior whereby the low-frequency components of the input predominantly entrained the slow population s2, while the fast components predominantly entrained the fast population s1. When fixing s1 = 1 and varying s2, we found that the dependence of the crossover frequency fc on s2 was strong at low input amplitudes and was progressively tamed at larger input amplitudes (Figure 6c). This is consistent with the fact that the input amplitude completely suppresses chaos beyond a certain critical value, as previously reported in networks with no self-couplings (Rajan et al., 2010) and with adaptation (Muscinelli et al., 2019).
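A minimal sketch of the broadband drive and the spectral modulation index defined above (frequencies, duration, and sampling rate are illustrative choices; P1 and P2 stand for the measured power-spectrum peaks of the two sub-populations):

```python
import numpy as np

def broadband_input(t, freqs, A=0.5, rng=None):
    """Superposition of L sine waves with equal amplitude and a random phase
    (one phase theta_i per unit, shared across frequencies, as in the text)."""
    rng = rng or np.random.default_rng()
    theta = rng.uniform(0.0, 2.0 * np.pi)
    return A * np.sum(np.sin(2.0 * np.pi * np.asarray(freqs)[:, None] * t + theta), axis=0)

def modulation_index(P1, P2):
    """m(f) = (P2 - P1)/(P2 + P1) from the power-spectrum peaks of each population."""
    return (P2 - P1) / (P2 + P1)

t = np.arange(0.0, 1.0, 1e-4)                          # 1 s at 10 kHz resolution
I = broadband_input(t, freqs=np.linspace(1, 200, 11))  # 11 frequencies in 1-200 Hz
```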

Network response to broadband input.

(a) Power spectrum density of a network driven by a time-dependent input comprising a superposition of 11 sinusoidal frequencies (see main text for details). Maroon and navy curves represent the average power spectrum density in the s1 and s2 populations, respectively; circles indicate the peaks of the power spectrum density at each frequency; amplitude A = 0.5, g = 3.0, s1 = 1, and s2 = 4. (b) Modulation index of the power spectrum density amplitudes as a function of frequency in networks with s1 = 1 and various s2. The blue circles mark the cutoff frequency fc where the modulation index changes sign. (c) Cutoff period, 2π/ωc, as a function of the self-coupling s2 for different input amplitudes. The cutoff period is inversely proportional to the amplitude of the broadband signal.

Discussion

We demonstrated a new robust and biologically plausible network mechanism whereby multiple timescales emerge across units with heterogeneous self-couplings. In our model, units are interpreted as neural assemblies consistent with experimental evidence from cortical circuits (Perin et al., 2011; Lee et al., 2016; Kiani et al., 2015; Miller et al., 2014; Marshel et al., 2019), and previous theoretical modeling (Litwin-Kumar and Doiron, 2012; Wyrick and Mazzucato, 2021). We found that the neural mechanism underlying the large range of timescales is the heterogeneity in the distribution of self-couplings (representing neural assembly size). We showed that this mechanism can be naturally implemented in a biologically plausible model of a neural circuit based on spiking neurons with excitatory/inhibitory cell-type specific connectivity. This spiking network represents a microscopic realization of our mechanism where neurons are arranged in assemblies, and an assembly’s self-coupling represents the strength of the recurrent interactions between neurons belonging to that assembly, proportional to its size. A heterogeneous distribution of assembly sizes, in turn, generates a reservoir of timescales. Crucially, our model captured the distribution of intrinsic timescales observed across neurons recorded within the same area in primate cortex (Cavanagh et al., 2016).

Several experimental studies uncovered heterogeneity of timescales of neural activity across brain areas and species. Comparison of the population-averaged autocorrelations across cortical areas revealed a hierarchical structure, varying from 50 ms to 350 ms along the occipital-to-frontal axis (Murray et al., 2014). Neurons within the same area exhibit a wide distribution of timescales as well. A heterogeneous distribution of timescales (from 0.5 s to 50 s) was found across neurons in the oculomotor system of the fish (Miri et al., 2011) and primate brainstem (Joshua et al., 2013), suggesting that timescale heterogeneity is conserved across phylogeny. During periods of ongoing activity, the distribution of single-cell autocorrelation timescales in primates was found to be right-skewed and approximately lognormal, ranging from 10 ms to a few seconds (Cavanagh et al., 2016 and Figure 3). Single neuron activity was found to encode long reward memory traces in primate frontal areas over a wide range of timescales up to 10 consecutive trials (Bernacchia et al., 2011). In these studies, autocorrelation timescales were estimated using parametric fits, which may be affected by statistical biases, although Bayesian generative approaches might overcome this issue (Zeraati et al., 2020). In our model, we estimated timescales nonparametrically as the half-width at half-maximum of the autocorrelation function. In our biologically plausible model based on a spiking network with cell-type specific connectivity, the distribution of timescales was in the range between 20 ms and 100 s, similar to the range of timescales observed in experiments (Miri et al., 2011; Joshua et al., 2013; Cavanagh et al., 2016). Moreover, the distribution of assembly sizes in our model is within the range of 50–100 neurons, consistent with the size of functional assemblies experimentally observed in cortical circuits (Perin et al., 2011; Marshel et al., 2019). A fundamental new prediction of our model, to be tested in future experiments, is the direct relationship between assembly size and its timescale.

Previous neural mechanisms for generating multiple timescales of neural activity relied on single-cell bio-physical properties, such as membrane or synaptic time constants (Gjorgjieva et al., 2016). In feedforward networks, developmental changes in single-cell conductance can modulate the timescale of information transmission, explaining the transition from slow waves to rapid fluctuations observed in the developing cortex (Gjorgjieva et al., 2014). However, the extent to which this single-cell mechanism might persist in the presence of strong recurrent dynamics was not assessed. To elucidate this issue, we examined whether a heterogeneous distribution of single-unit integration time constants could lead to a separation of timescales in a random neural network (see Appendix 3 for details). In this model, half of the units were endowed with a fixed fast time constant and the other half with a slow time constant, whose value varied across networks. We found that, although the average network timescale increased proportionally to the value of the slower time constants, the difference in autocorrelation time between the two populations remained negligible. These results suggest that, although the heterogeneity in single-cell time constants may affect the dynamics of single neurons in isolation or within feedforward circuits (Gjorgjieva et al., 2014), the presence of strong recurrent dynamics fundamentally alters these single-cell properties in a counterintuitive way. Our results suggest that a heterogeneity in single-cell time constants may not lead to a diversity of timescales in the presence of recurrent dynamics.

Our results further clarified that the relationship between an assembly’s self-coupling and its timescale relies on the strong recurrent dynamics. This relationship is absent when driving an isolated assembly with white noise external input (Figure 4). Indeed, the mechanism linking the self-coupling to the timescale only emerged when driving the unit with a mean-field whose color was self-consistently obtained from an underlying recurrent network of self-coupled units.

Previous models showed that a heterogeneity of timescales may emerge from circuit dynamics through a combination of structural heterogeneities and heterogeneous long-range connections arranged along a spatial feedforward gradient (Chaudhuri et al., 2014; Chaudhuri et al., 2015). These networks can reproduce the population-averaged hierarchy of timescales observed across the cortex in the range of 50–350 ms (Murray et al., 2014; Chaudhuri et al., 2015). A similar network architecture can also reproduce the heterogeneous relaxation time after a saccade, found in the brainstem oculomotor circuit (Miri et al., 2011; Joshua et al., 2013), in a range of 10–50 s (Inagaki et al., 2019; Recanatesi et al., 2022). This class of models can explain a timescale separation within a factor of 10, but it is not known whether they can be extended to several orders of magnitude, as observed between neurons in the same cortical area (Cavanagh et al., 2016). Moreover, while the feedforward spatial structure underlying these two models is a known feature of the cortical hierarchy and of the brainstem circuit, respectively, it is not known whether such a feedforward structure is present within a local cortical circuit. Our model, on the other hand, relies on strong recurrent connectivity and local functional assemblies, two hallmarks of the architecture of local cortical circuits (Perin et al., 2011; Lee et al., 2016; Kiani et al., 2015; Miller et al., 2014; Marshel et al., 2019). Other network models generating multiple timescales of activity fluctuations were proposed based on self-tuned criticality with anti-Hebbian plasticity (Magnasco et al., 2009), or multiple block-structured connectivity (Aljadeff et al., 2015).

In our model, the dynamics of units with large self-couplings, exhibiting slow switching between bistable states, can be captured analytically using the universal colored noise approximation (UCNA) to the Fokker-Planck equation (Hänggi and Jung, 1995; Jung and Hänggi, 1987). This is a classic tool from the theory of stochastic processes, which we successfully applied to investigate neural network dynamics for the first time. This slow-switching regime may underlie the emergence of metastable activity, ubiquitously observed in the population spiking activity of behaving mammals (Abeles et al., 1995; Jones et al., 2007; Mazzucato et al., 2015; Mazzucato et al., 2019; Recanatesi et al., 2022; Engel et al., 2016; Kadmon Harpaz et al., 2019). In these spiking networks, it is not known how to estimate the timescales of metastable activity from network parameters, and we anticipate that our UCNA may provide a powerful new tool for investigating network dynamics in these biologically plausible models.

What is the functional relevance of neural circuits exhibiting a reservoir of multiple timescales? The presence of long timescales deep in the chaotic regime is a new feature of our model, which may be beneficial for memory capacity away from the edge of chaos (Toyoizumi and Abbott, 2011). Moreover, we found that, in our model, time-dependent broadband inputs suppress chaos in a population-specific way, whereby populations of large (small) assemblies preferentially entrain slow (fast) spectral components of the input. Previously studied spiking models suggested that preferential entrainment of input is possible by cellular mechanisms (Lindner, 2016) or finite-size fluctuations in a feedforward network structure (Deger et al., 2014). Here, we presented a recurrent network mechanism for population-specific chaos suppression, independent of the network size. This mechanism may thus grant recurrent networks a natural and robust tool to spatially demix complex temporal inputs (Perez-Nieves et al., 2021), as observed in visual cortex (Mazzoni et al., 2008). Third, the presence of multiple timescales may be beneficial for performing flexible computations involving simultaneously fast and slow timescales, such as in role-switching tasks (Iigaya et al., 2019), or as observed in time cells in the hippocampus (Kraus et al., 2013; Howard et al., 2014). A promising direction for future investigation is the exploration of the computational properties of our model in the context of reservoir computing (Sussillo and Abbott, 2009) or recurrent networks trained to perform complex cognitive tasks (Yang et al., 2019).

Methods

Dynamic mean-field theory with multiple self-couplings

We derive the dynamic mean-field theory in the limit N → ∞ by using the moment generating functional (Sompolinsky and Zippelius, 1982; Crisanti and Sompolinsky, 1987). For the derivation we follow the Martin-Siggia-Rose-De Dominicis-Janssen path-integral formalism (Martin et al., 1973), as presented extensively in Helias and Dahmen, 2020; we borrow their notation as well.

The model includes two sets of random variables: the connectivity couplings Jij, for 1 ≤ i, j ≤ N, i ≠ j, drawn independently from a Gaussian distribution with mean 0 and variance 1/N; and the self-couplings si, for 1 ≤ i ≤ N, whose values are of order 1. When we examine the dynamics governing each unit in Equation 1, the sum over the random couplings Jij contributes N terms which, in the limit N → ∞, ensure that the net contribution (mean and variance) from this sum remains of order 1. Hence, in our model, as in previous models, J is the quenched disorder parameter, whose sum gives rise to the mean-field. The self-couplings (one for each unit) contribute an additional term to the moment generating functional. Each unit’s activity strongly depends on the value of its own self-coupling, and hence cannot be averaged over when we study the unit’s dynamics. After averaging over J, we can study all units with the same self-coupling together, as they obey the same mean-field equation, Equation 2. Moreover, we show that all units, regardless of their self-coupling, obey a single mean-field due to the structure of J. We note that the results of this Methods section, including Equation 5 and Equation 6, are not affected by diagonal elements of J which are not zero but rather drawn from the same distribution as the off-diagonal elements (as in the main text), since the contribution of such non-zero elements is negligible overall. To maintain the clarity of the text, and since the results are not affected by it, we leave out the distinction between including and excluding diagonal elements of J of order 1/√N in the main text.

For our model, Equation 1, the moment generating functional is, therefore, given by:

(5) Z = \int \mathcal{D}\tilde{x}\,\mathcal{D}x\,\exp\left[\int dt\left(\sum_{i=1}^{N}\tilde{x}_i(t)\big[(\partial_t + 1)\,x_i(t) - s_i\,\phi(x_i(t))\big] + \sum_{i=1}^{N}\lambda_i(t)\,x_i(t) - \sum_{j\neq i}\tilde{x}_i(t)\,J_{ij}\,\phi(x_j(t))\right)\right],

where \mathcal{D}x = \prod_i \mathcal{D}x_i and \mathcal{D}\tilde{x} = \prod_i \mathcal{D}\tilde{x}_i/(2\pi i). To start, we calculate ⟨Z(J)⟩_J. We take advantage of the self-averaging nature of our model, particularly by averaging over the quenched disorder J. The couplings Jij are i.i.d. variables extracted from a normal distribution and appear only in the last term of Equation 5. We hence focus this calculation step on that term and derive the result to the leading order in N, yielding:

(6) \prod_{i\neq j}\int dJ_{ij}\,\sqrt{\frac{N}{2\pi g^2}}\,\exp\left[-\frac{N J_{ij}^2}{2 g^2}\right]\exp\left[-\int dt\,\tilde{x}_i(t)\,J_{ij}\,\phi(x_j(t))\right] = \exp\left[\frac{1}{2}\int dt\,dt'\left(\sum_i \tilde{x}_i(t)\,\tilde{x}_i(t')\right)\left(\frac{g^2}{N}\sum_j \phi(x_j(t))\,\phi(x_j(t'))\right)\right].

The result above suggests that all the units in our network are coupled to one another equivalently (by being coupled only to sums that depend on all units’ activity). To further decouple the network, we define the quantity

Q_1(t,t') \equiv \frac{g^2}{N}\sum_j \phi(x_j(t))\,\phi(x_j(t')).

We enforce this definition by multiplying the disorder-averaged moment generating functional by the appropriate Dirac delta function, δ, in its integral form:

1 = \frac{g^2}{N}\int dQ_1\,\delta\left[-\frac{N}{g^2}Q_1 + \sum_j \phi(x_j(t))\,\phi(x_j(t'))\right] = \frac{g^2}{N}\int dQ_1\,dQ_2\,\exp\left\{Q_2\left[-\frac{N}{g^2}Q_1 + \sum_j \phi(x_j(t))\,\phi(x_j(t'))\right]\right\},

where the integral over dQ2 runs along the imaginary axis (and includes its 1/(2πi) factor). We can now rewrite the disorder-averaged moment generating functional, using Equation 6 to replace its last term, applying the definition of Q1, and multiplying the functional by the δ function above. Altogether we get:

(7) \langle Z(J)\rangle_J = \frac{g^2}{N}\int dQ_1\,dQ_2\,\exp\left[-\frac{N}{g^2}\int dt\,dt'\,Q_1 Q_2 + N\sum_{\alpha\in A} n_\alpha \ln Z_\alpha\right],
Z_\alpha = \int \mathcal{D}\tilde{x}_\alpha\,\mathcal{D}x_\alpha\,\exp\left[\int dt\,\tilde{x}_\alpha(t)\big((\partial_t + 1)\,x_\alpha(t) - s_\alpha\,\phi(x_\alpha(t))\big) + \frac{1}{2}\int dt\,dt'\,\tilde{x}_\alpha(t)\,Q_1(t,t')\,\tilde{x}_\alpha(t') + \int dt\,dt'\,\phi(x_\alpha(t))\,Q_2(t,t')\,\phi(x_\alpha(t'))\right],

with nα = Nα/N the fraction of units with self-coupling sα across the population, for α ∈ A. In the expression above we made use of the fact that Q1 and Q2, now in the role of auxiliary fields, couple to sums over units of \tilde{x}_i(t)\tilde{x}_i(t') and \phi(x_i(t))\phi(x_i(t')), respectively, and hence the generating functional for xi and x̃i can be factorized into identical factors Zα. Note that in our network, due to the dependence on si, the xi are statistically equivalent as long as their si are equal. Hence, the factorization is into one factor Zα for all xi with si = sα. Each factor Zα includes the functional integrals \mathcal{D}x_\alpha and \mathcal{D}\tilde{x}_\alpha for a single unit with self-coupling sα.

In the large N limit we evaluate the auxiliary fields in Equation 7 by the saddle-point approach. We denote a variable evaluated at the saddle point by an asterisk (*), obtaining:

0 = \frac{\delta}{\delta Q_{1,2}}\left[-\frac{1}{g^2}\int dt\,dt'\,Q_1 Q_2 + \sum_{\alpha\in A} n_\alpha \ln Z_\alpha\right],

and yielding the saddle point values (Q_1^*, Q_2^*):

(8) 0 = -\frac{1}{g^2}Q_1^*(t,t') + \sum_{\alpha\in A}\frac{n_\alpha}{Z_\alpha}\,\frac{\delta Z_\alpha}{\delta Q_2(t,t')}\bigg|_{Q^*} \;\Rightarrow\; Q_1^*(t,t') = g^2\sum_{\alpha\in A} n_\alpha\left\langle\phi(x_\alpha(t))\,\phi(x_\alpha(t'))\right\rangle \equiv g^2 C(\tau),
(9) 0 = -\frac{1}{g^2}Q_2^*(t,t') + \sum_{\alpha\in A}\frac{n_\alpha}{Z_\alpha}\,\frac{\delta Z_\alpha}{\delta Q_1(t,t')}\bigg|_{Q^*} \;\Rightarrow\; Q_2^*(t,t') = \frac{g^2}{2}\sum_{\alpha\in A} n_\alpha\left\langle\tilde{x}_\alpha(t)\,\tilde{x}_\alpha(t')\right\rangle = 0,

where C(τ), with τ = t' − t, represents the average autocorrelation function of the network (as defined in the main text, Equation 3). The second saddle point, Q_2^* = 0, vanishes because ⟨x̃α(t)x̃α(t')⟩ = 0, as can be immediately extended from Helias and Dahmen, 2020; Sompolinsky and Zippelius, 1982. The action at the saddle point reduces to a sum of actions for individual, non-interacting units with self-coupling sα, all coupled to a common external field Q_1^*. Inserting the saddle point values back into Equation 7, we obtain Z^* = \prod_\alpha (Z_\alpha^*)^{N_\alpha}, where

(10) Z_\alpha^* \equiv \int \mathcal{D}\tilde{x}_\alpha\,\mathcal{D}x_\alpha\,\exp\left(\int dt\,\tilde{x}_\alpha(t)\big((\partial_t + 1)\,x_\alpha(t) - s_\alpha\,\phi(x_\alpha(t))\big) + \frac{g^2}{2}\int dt\,dt'\,\tilde{x}_\alpha(t)\,C(\tau)\,\tilde{x}_\alpha(t')\right).

Thus, in the large N limit, the network dynamics reduce to those of |A| representative units xα(t), each representing the sub-population with self-coupling sα and obeying dynamics governed by

(11) \frac{d}{dt}x_\alpha(t) = -x_\alpha(t) + s_\alpha\,\phi[x_\alpha(t)] + \eta(t)

for all α ∈ A, where η(t) is a Gaussian mean-field with autocorrelation

(12) \langle\eta(t)\,\eta(t')\rangle = g^2\sum_{\alpha\in A} n_\alpha\left\langle\phi(x_\alpha(t))\,\phi(x_\alpha(t'))\right\rangle.

The results above extend immediately to the case of a continuous distribution of self-couplings, yielding:

(13) \langle\eta(t)\,\eta(t')\rangle = g^2\int p(s)\,\left\langle\phi(x(s,t))\,\phi(x(s,t'))\right\rangle\,ds,

with p(s) the probability density of the self-coupling distribution in the network, and with the units' dynamics dependent on their respective self-couplings:

(14) \frac{d}{dt}x(s,t) = -x(s,t) + s\,\phi[x(s,t)] + \eta(t).

An iterative solution

We use an iterative approach to solving the mean-field equations: Equation 11 and Equation 12 for a discrete distribution of self-couplings, or Equation 13 and Equation 14 for a continuous distribution. The approach is adopted from Stern et al., 2014 and adapted to accommodate multiple self-couplings; we briefly describe it in what follows. We start by making an initial guess for the mean-field autocorrelation function C(τ), as defined in Equation 3. In Fourier space, we multiply it by a random phase and by g, and transform it back to generate an instance of the mean-field η(t) (see Stern et al., 2014 for more details). We create additional η(t) instances by repeating this procedure. At least one instance is created for each value sα of a discrete distribution P(s) with support set S, or for each value sα drawn from P(s) in the case of a continuous distribution. We then solve Equation 11 (or equivalently Equation 14 in the continuous case) to obtain one solution xα for each value of sα. The set of solutions allows us to calculate the set cα(t,t') = ϕ(xα(t))ϕ(xα(t')). For a discrete distribution, we then multiply each cα by its relative weight nα to compute C(τ), Equation 12. For a continuous distribution, we sum all cα, each multiplied by 1/n, where n is their number, to estimate C(τ), Equation 13 (since the sα values were drawn from P(s), each cα captures approximately a 1/n share of the distribution). We use the sampled mean-field autocorrelation C(τ) in place of our initial guess and repeat the entire procedure, obtaining an updated C(τ). We iterate until the average of C(τ) across iterations converges. We note that for the continuous case, we increase the number of drawn sα values as the iterations progress (starting from very few and ending with many). This keeps the iterative process rapid while still yielding an accurate solution, thanks to the refinement at each iteration.
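The sketch below illustrates this iterative procedure for a discrete distribution of self-couplings (one representative unit per value sα). The damping factor, initial guess, trace length, and the normalization used when assigning random phases in Fourier space are implementation choices standing in for the conventions of Stern et al., 2014, not the authors' code.

```python
import numpy as np

def draw_mean_field(C, g, rng):
    """One realization of eta(t) with autocorrelation ~ g^2 C(tau): random
    phases applied to the spectrum of C (normalization is a convention here)."""
    S = np.abs(np.fft.rfft(C))                        # spectral density of C (>= 0)
    phases = np.exp(2j * np.pi * rng.random(S.size))
    return np.fft.irfft(g * np.sqrt(S * C.size) * phases, n=C.size)

def iterate_mean_field(s_values, weights, g, n=4096, dt=0.1, n_iter=40, seed=0):
    rng = np.random.default_rng(seed)
    C = np.exp(-np.arange(n) * dt)                    # initial guess for C(tau)
    for _ in range(n_iter):
        C_new = np.zeros(n)
        for s, w in zip(s_values, weights):           # one representative unit per s_alpha
            eta = draw_mean_field(C, g, rng)
            x = np.zeros(n)
            for t in range(n - 1):                    # Euler step of Equation 11 / 14
                x[t + 1] = x[t] + dt * (-x[t] + s * np.tanh(x[t]) + eta[t])
            phi = np.tanh(x)
            c = np.correlate(phi, phi, 'full')[n - 1:] / np.arange(n, 0, -1)
            C_new += w * c                            # weight by n_alpha (Equation 12)
        C = 0.5 * C + 0.5 * C_new                     # damped update until convergence
    return C

# Example: two equal sub-populations with s1 = 0.8, s2 = 3.2 at g = 2
# C = iterate_mean_field([0.8, 3.2], [0.5, 0.5], g=2.0)
```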

Table 1
Model parameters for the clustered network used in the simulations.

| Parameter | Description | Value |
| --- | --- | --- |
| J_EE | E-to-E synaptic weights | 0.9/N [mV] |
| J_IE | E-to-I synaptic weights | 0.9/N [mV] |
| J_EI | I-to-E synaptic weights | 2.7/N [mV] |
| J_II | I-to-I synaptic weights | 5.4/N [mV] |
| J_E0 | External-to-E synaptic weights | 3.7/N [mV] |
| J_I0 | External-to-I synaptic weights | 3.3/N [mV] |
| J_EE+ | Potentiated intra-assembly E-to-E weight factor | 14N/2000 |
| J_II+ | Potentiated intra-assembly I-to-I weight factor | 5N/2000 |
| g_EI | Potentiation parameter for intra-assembly I-to-E weights | 10N/2000 |
| g_IE | Potentiation parameter for intra-assembly E-to-I weights | 8N/2000 |
| r_ext | Average baseline afferent rate to E and I neurons | 5 [spks/s] |
| V_E^thr | E-neuron threshold potential | 1.43 [mV] |
| V_I^thr | I-neuron threshold potential | 0.74 [mV] |
| V^reset | E- and I-neuron reset potential | 0 [mV] |
| τ_m | E- and I-neuron membrane time constant | 20 [ms] |
| τ_refr | E- and I-neuron absolute refractory period | 5 [ms] |
| τ_s | E- and I-neuron synaptic time constant | 5 [ms] |

Fixed points and transition to chaos

Networks with heterogeneous self-couplings exhibit a complex landscape of fixed points xα*, obtained as the self-consistent solutions to the static versions of Equation 2 and Equation 3, namely

(15) $x_\alpha - s_\alpha \tanh(x_\alpha) = \eta\,,$

where the mean field η is a Gaussian random variable with zero mean and variance given by

(16) $\langle \eta^2 \rangle = g^2 C\,, \qquad C = \sum_{\alpha \in A} n_\alpha \langle \phi[x_\alpha]^2 \rangle\,.$

The solution for each unit depends on its respective sα (Appendix 1—figure 1). If sα < 1, only a single interval around zero is available. For sα > 1, over a range of values of η, xα can take values in one of three possible intervals. Let us consider a network with two sub-populations comprising fractions n₁ and n₂ = 1 − n₁ of the units, with self-couplings s₁ and s₂, respectively. In the (s₁, s₂) plane, this model gives rise to a phase diagram with a single chaotic region separating four disconnected stable fixed-point regions (Figure 2a). We first discuss the stable fixed points, which present qualitatively different structures depending on the values of the self-couplings. In the region where both self-couplings s₁, s₂ < 1, the only stable fixed point is the trivial solution with all xᵢ = 0 (Figure 2a), where the network activity quickly decays to zero. When at least one self-coupling is greater than one, there are three stable fixed-point regions (Figure 2a); in these three regions, the network activity starting from random initial conditions unfolds via a long-lived transient period and eventually settles into a stable fixed point. This transient activity with late fixed points is a generalization of the network phase found in Stern et al., 2014. When both self-couplings are greater than one (s₁, s₂ > 1), the fixed-point distribution in each sub-population is bi-modal (Figure 2a–ii). When s₁ > 1 and s₂ < 1, the solutions for the respective sub-populations are localized around bi-modal fixed points and around zero, respectively (Figure 2a–i).

However, the available solutions in the latter case are further restricted by stability conditions. In the next Methods section we derive the stability condition by expanding the dynamical Equation 1 around the fixed point and requiring that all eigenvalues of the corresponding stability matrix are negative. Briefly, the fraction nα of units with sα > 1 at a stable fixed point are restricted to have support on two disjoint intervals, xα < xα⁻(sα) or xα > xα⁺(sα). We refer to this regime as multi-modal, a direct generalization of the stable fixed-point regime found in Stern et al., 2014 for a single self-coupling s > 1, characterized by transient dynamics leading to an exponentially large number of stable fixed points. For the fraction nα of units with sα < 1, the stable fixed point is supported by a single interval around zero.

Stability conditions

To determine the onset of instability, we look for conditions such that at least one eigenvalue develops a positive real part. An eigenvalue of the stability matrix exists at a point z in the complex plane if (Stern et al., 2014; Ahmadian et al., 2015)

(17) $g^2 \sum_{\alpha \in A} n_\alpha \left\langle \frac{[1-\tanh^2(x_\alpha)]^2}{[z + 1 - s_\alpha(1-\tanh^2(x_\alpha))]^2} \right\rangle > 1\,.$

The denominator of the expression above is z plus the slope of the curves in Appendix 1—figure 1a–i and a–ii. Hence, a solution whose value xα gives a negative slope (available when sα > 1) leads to a vanishing denominator at some positive z, and therefore to a positive eigenvalue and instability. This is why, as stated above, the fraction nα of units with sα > 1 at a stable fixed point are restricted to the two disjoint intervals xα < xα⁻(sα) or xα > xα⁺(sα), while the fraction nα of units with sα < 1 are supported by a single interval around zero.

A fixed-point solution becomes unstable as soon as an eigenvalue occurs at z = 0, which from Equation 17 yields the stability condition

(18) $g^2 \sum_{\alpha \in A} n_\alpha \langle q_\alpha^{-1} \rangle \le 1\,,$

where qα = [sα − cosh²(xα)]². For sα > 1, the two consistent solutions of Equation 15 that may result in a stable fixed point (from the two disjoint intervals in Appendix 1—figure 1a–i) contribute differently to qα: a larger |xα| decreases qα⁻¹ (Appendix 1—figure 1b–i), thus improving stability. The possible distributions of xα across the two intervals become more restricted as g increases or sα decreases, since both raise the left-hand side of the stability condition, Equation 18, forcing more of the xᵢ solutions toward lower qα⁻¹. This restricts a growing fraction of the xα at the fixed points to the solution with the higher absolute value. At the transition to chaos, a single last and most stable solution exists, in which all xᵢ take the solution with the higher absolute value (Appendix 1—figure 1b–i, light green segments). For units with sα < 1, only one solution is available, obtained from the distribution of η through the consistency condition, Equation 15, at the fixed point. In this configuration, the most stable solution transitions from stability to instability exactly where Equation 18 reaches unity. Hence, the transition from stable fixed points to chaos occurs for a choice of g and P(s) such that solving Equation 15 and Equation 16 self-consistently leads to saturation of the stability condition, Equation 18, at one.
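The saturation of Equation 18 can be checked numerically. The sketch below does this for a hypothetical two-population network: it solves Equations 15–16 self-consistently by Monte Carlo sampling of η on the most stable branch and then evaluates the left-hand side of Equation 18 (the couplings and sample sizes are illustrative, not values used in the paper):

    import numpy as np
    from scipy.optimize import brentq

    rng = np.random.default_rng(1)

    def most_stable_root(s, eta):
        """Root of x - s*tanh(x) = eta (Equation 15) on the outer branch, i.e.
        the solution with the largest |x|, which is the last to lose stability."""
        sgn = 1.0 if eta >= 0 else -1.0
        e = abs(eta)
        lo = np.arccosh(np.sqrt(s)) if s > 1 else 0.0   # turning point of the curve
        return sgn * brentq(lambda x: x - s * np.tanh(x) - e, lo, e + s + 1.0)

    def stability_lhs(g, s_vals, n_vals, n_samp=2000, iters=15):
        """Solve Equations 15-16 self-consistently on the most stable fixed-point
        configuration and return the left-hand side of Equation 18."""
        C = 1.0
        for _ in range(iters):
            eta = g * np.sqrt(C) * rng.standard_normal(n_samp)  # eta ~ N(0, g^2 C)
            C = sum(n * np.mean(np.tanh([most_stable_root(s, e) for e in eta]) ** 2)
                    for s, n in zip(s_vals, n_vals))            # update Equation 16
        lhs = sum(n * np.mean((s - np.cosh([most_stable_root(s, e) for e in eta]) ** 2)
                              ** (-2.0))
                  for s, n in zip(s_vals, n_vals))              # <1/q_a>, Equation 18
        return g ** 2 * lhs   # the fixed point loses stability when this exceeds 1

    for g in (0.5, 1.0, 1.5):
        print(g, stability_lhs(g, s_vals=[1.5, 3.0], n_vals=[0.5, 0.5]))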

Unified colored-noise approximation to the Fokker-Planck theory

We consider the special case of a network with two self-couplings, where a large sub-population (N₁ = N − 1) with s₁ = 1 comprises all but one slow probe unit x₂ with large self-coupling s₂ ≫ s₁. The probe unit obeys the dynamical equation dx₂/dt = f(x₂) + η(t), with f(x) = −x + s₂φ(x). In the large-N limit, we can neglect the backreaction of the probe unit on the mean field and approximate the latter as an external Gaussian colored noise η(t) with autocorrelation g²C(τ) = g²⟨φ[x₁(t)]φ[x₁(t+τ)]⟩, independent of x₂. The noise η(t) can be parameterized by its strength, defined as D = ∫₀^∞ C(τ) dτ, and its timescale (color) τ₁. For large s₂, the dynamics of the probe unit x₂ are captured by a bi-stable chaotic phase whereby its activity is localized around the critical points x₂ = x_± ≈ ±s₂ (Figure 4a–i) and switches between them at random times. In the regime of colored noise (as we have here, with τ₁ ≈ 7.91), the stationary probability distribution p(x) (for x ≡ x₂; Figure 4a–ii and b) satisfies the unified colored-noise approximation (UCNA) to the Fokker-Planck equation (Hänggi and Jung, 1995; Jung and Hänggi, 1987):

(19) $p(x) = Z^{-1}\, |h(x)|\, \exp[-U_{\mathrm{eff}}(x)/D]\,,$

where Z is a normalization constant, h(x) ≡ 1 − τ₁f′(x), and the effective potential U_eff(x) = −∫^x f(y)h(y) dy is therefore:

(20) $U_{\mathrm{eff}}(x) = \frac{x^2}{2} - s_2 \log\cosh(x) + \frac{\tau_1}{2} f(x)^2 - U_{\min}\,.$

The distribution p(x) has support in the region h(x) > 0, comprising the two disjoint intervals |x| > x_c, where tanh²(x_c) = 1 − (1 + τ₁)/(τ₁s₂) (Figure 4b). Moreover, the probability distribution is concentrated around the two minima x_± ≈ ±s₂ of U_eff. The new UCNA-based term (τ₁/2)f(x)² dominates the effective potential. The main effect of the strong color τ₁ ≫ 1 is to sharply decrease the variance of the distribution around the minima x_±. This is evident when comparing the colored noise with white noise driving the same bi-stable probe, dx₂/dt = −x₂ + s₂φ(x₂) + ξ(t), where ξ(t) is a white noise of strength equivalent to the colored noise (Figure 4a, iv–vi). The naive potential for the white-noise case, U = x²/2 − s₂ log cosh(x), is obtained from Equation 19 by sending τ₁ → 0 both in the prefactor h and in the potential U_eff. It results in a wider activity distribution than the one generated by our network's colored noise, in agreement with the simulations (Figure 4b).

In our colored-noise network, the temporal dynamics of the probe unit are captured by the mean first passage time T for the escape out of a potential well:

(21) $T = \int_{-s_2}^{-x_c} \frac{h(x)^2}{D\,p(x)}\, dx \int_{-\infty}^{x} p(y)\, dy\,,$

where the upper limit −x_c of the outer integral is the edge of the support of p(x). In the small-D approximation, we can evaluate the integrals by steepest descent. The inner integrand p(y) is peaked at the minimum x* = −s₂ of the effective potential, yielding

$\int_{-\infty}^{x} p(y)\, dy = Z^{-1} \sqrt{\frac{2\pi D}{U''_{\mathrm{eff}}(x^*)}}\, \exp\!\left(-U_{\mathrm{eff}}(x^*)/D\right)\,.$

The outer integrand can be rewritten as ψ(x) = exp[ρ(x)/D], where ρ(x) = U_eff(x) + D ln|h(x)| peaks at x_f, with tanh²(x_f) ≈ 1 − 1/(2s₂). The mean first passage time can thus be estimated as

(22) $T \simeq \frac{2\pi}{\sqrt{U''_{\mathrm{eff}}(x^*)\,|\rho''(x_f)|}}\, \exp\!\left(\frac{\Delta}{D}\right)\,,$

where Δ = ρ(x_f) − U_eff(x*); its asymptotic scaling for large s₂ leads to Equation 4. We validated the UCNA approach to calculating the mean first passage time by estimating the distribution of escape points x_esc from one well to the other, which was found to lie predominantly within the support |x| > x_c of the stationary probability distribution p(x). Only a fraction (1.8 ± 0.4) × 10⁻³ of the activity in the simulations (mean ± SD over 10 probe units run with parameters as in Figure 4b) entered the forbidden region (see Figure 4—figure supplement 1 for details).
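As a numerical illustration of Equations 19–21, the sketch below tabulates the UCNA density on one support interval and evaluates the mean first passage time by direct quadrature, exploiting the symmetry of the two wells; the values of s₂, τ₁, and D are placeholders rather than the parameters fitted in Figure 4:

    import numpy as np

    s2, tau1, D = 6.0, 7.91, 5.0            # illustrative probe parameters

    f  = lambda x: -x + s2 * np.tanh(x)
    fp = lambda x: -1 + s2 / np.cosh(x) ** 2
    h  = lambda x: 1 - tau1 * fp(x)          # UCNA prefactor; h > 0 on the support

    def U_eff(x):                            # Equation 20, up to the constant U_min
        return x ** 2 / 2 - s2 * np.log(np.cosh(x)) + tau1 * f(x) ** 2 / 2

    def trapz(y, x):                         # simple trapezoid rule
        return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

    x_c = np.arctanh(np.sqrt(1 - (1 + tau1) / (tau1 * s2)))   # edge of the support

    # stationary density on the right support interval (Equation 19); by the
    # symmetry of the two wells, each interval carries half of the total mass
    x = np.linspace(x_c + 1e-6, 2 * s2, 40000)
    p = np.abs(h(x)) * np.exp(-(U_eff(x) - U_eff(x).min()) / D)
    p /= 2 * trapz(p, x)

    # Equation 21 mapped onto the right interval by symmetry:
    # T = int_{x_c}^{s2} dx h(x)^2 / (D p(x)) * int_x^inf p(y) dy
    seg = 0.5 * (p[1:] + p[:-1]) * np.diff(x)
    tail = np.concatenate((np.cumsum(seg[::-1])[::-1], [0.0]))  # int_x^inf p dy
    mask = x <= s2
    T = trapz((h(x) ** 2 / (D * p) * tail)[mask], x[mask])
    print(f"x_c = {x_c:.2f}, mean first passage time T ~ {T:.3g}")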

A comparison with white noise

To test the impact of the input generated by the network (or, equivalently, as mimicked by the colored noise), we replaced this input (the last term on the right-hand side of Equation 1) with white noise. The probe unit x in the white-noise case therefore follows the dynamical equation:

(23) $\frac{dx}{dt} = -x + s\,\phi(x) + g\sqrt{D}\,\eta(t)$

with η(t) drawn from a normal distribution and φ = tanh; the constants s, g, and D take their values from the network-driven probe case: the probe unit's self-coupling strength, the original network gain, and the noise strength (the integral of the autocorrelation function of the network input to the probe), respectively. Simulation results for the probe dynamics are shown in Figure 4a–ii, along with its distribution in Figure 4b (light green area; the values of s, g, and D are specified in the caption). To estimate the probability (Equation 19) and the potential (Equation 20) in this case, we substitute τ₁ = 0, since the white-noise input carries no temporal correlations. Similarly, we calculate the probe's approximate mean first passage time when driven by white noise (Equation 4). The result (Figure 4c, light green dashed line) estimates the simulations well (Figure 4c, green line). Note that since log⟨T⟩ depends linearly on τ₁, the mean first passage time ⟨T⟩ itself depends exponentially on τ₁. This underscores the importance of the 'color' (temporal correlations) of the network input in generating long timescales, and explains why long timescales fail to materialize when the color is removed (as in this white-noise-driven probe, which replicates the assembly-endowed network model except for its correlated input drive).
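A minimal simulation makes the contrast explicit. Below, the same bistable probe is driven either by white noise, as in Equation 23, or by an Ornstein-Uhlenbeck surrogate with color τ₁ and matched integral strength (an assumption standing in for the actual network-generated input); parameter values are illustrative, and with strong color the switches can become too rare to observe in a short run, which is precisely the point:

    import numpy as np

    rng = np.random.default_rng(2)
    s, tau1 = 6.0, 7.91          # probe self-coupling and noise color (illustrative)
    g_sqrtD = np.sqrt(5.0)       # the lumped factor g*sqrt(D) of Equation 23
    dt, n_steps = 2e-2, 300_000

    def mean_residence_time(colored):
        """Euler-Maruyama simulation of the probe; returns the mean time between
        well switches (inf if no switch occurred during the run)."""
        x, eta = s, 0.0
        t_last, sign, waits = 0, 1.0, []
        for t in range(n_steps):
            if colored:  # OU surrogate: timescale tau1, matched integral strength
                eta += dt * (-eta / tau1) \
                       + g_sqrtD * np.sqrt(2 * dt) / tau1 * rng.standard_normal()
            else:        # white noise of equivalent strength, as in Equation 23
                eta = g_sqrtD * np.sqrt(2 / dt) * rng.standard_normal()
            x += dt * (-x + s * np.tanh(x) + eta)
            if x * sign < 0:                # zero crossing: the probe switched well
                waits.append((t - t_last) * dt)
                t_last, sign = t, -sign
        return np.mean(waits) if waits else np.inf

    print("white-noise probe :", mean_residence_time(False))
    print("colored-noise probe:", mean_residence_time(True))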

Spiking network model

Network architecture

We simulated a recurrent network of N excitatory (E) and inhibitory (I) spiking neurons (N = 2000, 5000, 10000), with relative fractions n_E = 80% and n_I = 20% and connection probabilities p_EE = 0.2 and p_EI = p_IE = p_II = 0.5 (Figure 5). Non-zero synaptic weights from pre-synaptic neuron j to post-synaptic neuron i were set to J_ij, whose values depended only on the pre- and post-synaptic neuron types α, β ∈ {E, I}. Neurons were arranged in p cell-type specific assemblies. E assemblies had heterogeneous sizes drawn from a uniform distribution with mean N_E^clust = 60 + N/100 E-neurons and a 30% standard deviation. The number of assemblies was p = round(n_E N(1 − n_bgr)/N_E^clust), where n_bgr = 0.1 is the fraction of background neurons in each population, that is, neurons not belonging to any assembly. I assemblies were paired with E assemblies, and the size of each I assembly was matched to its corresponding E assembly with a proportionality factor n_I/n_E = 1/4. Neurons belonging to the same assembly had intra-assembly weights potentiated by a factor J_αβ⁺, while neurons belonging to different assemblies had inter-assembly weights depressed by a factor J_αβ⁻, where J_EI⁺ = p/(1 + (p−1)/g_EI), J_IE⁺ = p/(1 + (p−1)/g_IE), J_EI⁻ = J_EI⁺/g_EI, J_IE⁻ = J_IE⁺/g_IE, and J_αα⁻ = 1 − γ(J_αα⁺ − 1) for α = E, I, with γ = f[2 − f(p+1)]⁻¹ and f = (1 − n_bgr)/p the fraction of E neurons in each assembly. Parameter values are given in Table 1.
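A sketch of this construction is given below. It implements only the block potentiation/depression logic described above, assumes the 1/√N base-weight scaling of Table 1 evaluated at N = 2000 (where the cluster factors reduce to 14, 5, 10, and 8), and treats weights involving background neurons as unpotentiated, which is our reading rather than a specification from the paper:

    import numpy as np

    rng = np.random.default_rng(3)
    N, nbgr = 2000, 0.1
    NE, NI = int(0.8 * N), int(0.2 * N)
    pconn = {'EE': 0.2, 'EI': 0.5, 'IE': 0.5, 'II': 0.5}
    base = {'EE': 0.9, 'EI': 2.7, 'IE': 0.9, 'II': 5.4}   # Table 1, units of mV

    NE_clust = 60 + N // 100                              # mean E-assembly size
    p = round(0.8 * N * (1 - nbgr) / NE_clust)            # number of assemblies
    f = (1 - nbgr) / p                                    # E fraction per assembly

    # potentiation / depression factors (cluster factors evaluated at N = 2000)
    JEEp, JIIp, gEI, gIE = 14.0, 5.0, 10.0, 8.0
    JEIp = p / (1 + (p - 1) / gEI); JEIm = JEIp / gEI
    JIEp = p / (1 + (p - 1) / gIE); JIEm = JIEp / gIE
    gamma = f / (2 - f * (p + 1))
    JEEm = 1 - gamma * (JEEp - 1); JIIm = 1 - gamma * (JIIp - 1)

    def labels(n_units, mean_size):
        """Assembly label per neuron (-1 = background), ~30% size heterogeneity."""
        lab = -np.ones(n_units, dtype=int)
        sizes = (rng.uniform(0.48, 1.52, p) * mean_size).astype(int)
        start = 0
        for k, sz in enumerate(sizes):
            lab[start:start + sz] = k
            start += sz
        return lab

    labE = labels(NE, NE_clust)
    labI = labels(NI, NE_clust // 4)      # paired I assemblies, 1/4 the size

    def block(post_lab, pre_lab, key, Jp, Jm):
        """One submatrix: Bernoulli connectivity times the assembly factor
        (background pairs are left unpotentiated, factor 1)."""
        conn = rng.random((post_lab.size, pre_lab.size)) < pconn[key]
        same = (post_lab[:, None] == pre_lab[None, :]) & (post_lab[:, None] >= 0)
        both = (post_lab[:, None] >= 0) & (pre_lab[None, :] >= 0)
        factor = np.where(same, Jp, np.where(both, Jm, 1.0))
        return conn * base[key] / np.sqrt(N) * factor

    J = np.block([[block(labE, labE, 'EE', JEEp, JEEm), -block(labE, labI, 'EI', JEIp, JEIm)],
                  [block(labI, labE, 'IE', JIEp, JIEm), -block(labI, labI, 'II', JIIp, JIIm)]])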

Single neuron dynamics

We simulated current-based leaky integrate-and-fire (LIF) neurons, with membrane potential V obeying the dynamical equation

$\frac{dV}{dt} = -\frac{V}{\tau_m} + I_{\mathrm{rec}} + I_{\mathrm{ext}}\,,$

where τ_m is the membrane time constant. Input currents included a contribution I_rec from the other recurrently connected neurons and a constant external current I_ext = N_ext J_{α0} r_ext (units of mV s⁻¹), for α = E, I, representing afferent inputs from other brain areas, with N_ext = n_E N p_EE. When the membrane potential V hits the threshold V_thr^α (for α = E, I), a spike is emitted and V is held at the reset value V_reset for a refractory period τ_refr. We chose the thresholds so that the homogeneous network (i.e., with all J_αβ± = 1) was in a balanced state with average spiking activity at rates (r_E, r_I) = (2, 5) spks/s. The post-synaptic currents evolved according to

$\tau_s \frac{dI_{\mathrm{rec}}}{dt} = -I_{\mathrm{rec}} + \sum_{j=1}^{N} J_{ij} \sum_k \delta(t - t_k)\,,$

where τs is the synaptic time constant, Jij are the recurrent couplings and tk is the time of the k-th spike from the j-th presynaptic neuron. Parameter values are in Table 1.
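A minimal Euler integration of these LIF equations could look as follows; the weight matrix here is a random placeholder (in practice one would plug in the clustered J built in the previous sketch), and the external-current line assumes the same 1/√N scaling as above:

    import numpy as np

    rng = np.random.default_rng(4)
    N, NE, NI = 2000, 1600, 400
    # random placeholder weights; substitute the clustered J from above
    J = rng.normal(0.0, 0.02, (N, N)) * (rng.random((N, N)) < 0.25)
    J[:, NE:] = -np.abs(J[:, NE:])                # inhibitory columns are negative

    dt, T_sim = 1e-4, 2.0                         # 0.1 ms Euler step, 2 s run
    tau_m, tau_s, tau_refr = 20e-3, 5e-3, 5e-3    # Table 1 time constants [s]
    V_thr = np.r_[np.full(NE, 1.43), np.full(NI, 0.74)]              # [mV]
    J_ext = np.r_[np.full(NE, 3.7), np.full(NI, 3.3)] / np.sqrt(N)   # assumed scaling
    I_ext = int(0.8 * N * 0.2) * J_ext * 5.0      # N_ext * J_a0 * r_ext  [mV/s]

    V = rng.random(N) * V_thr                     # random initial potentials
    I_rec = np.zeros(N)
    refr = np.zeros(N)                            # remaining refractory time
    spike_count = np.zeros(N)

    for step in range(int(T_sim / dt)):
        spiked = (V >= V_thr) & (refr <= 0)
        spike_count += spiked
        V[spiked] = 0.0                           # reset potential V_reset = 0 mV
        refr[spiked] = tau_refr
        # tau_s dI/dt = -I + sum_j J_ij sum_k delta(t - t_k)
        I_rec += dt * (-I_rec / tau_s) + J[:, spiked].sum(axis=1) / tau_s
        live = refr <= 0
        V[live] += dt * (-V[live] / tau_m + I_rec[live] + I_ext[live])
        refr[~live] -= dt

    print("mean rate E, I [spks/s]:",
          spike_count[:NE].mean() / T_sim, spike_count[NE:].mean() / T_sim)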

Self-couplings from mean-field theory

We can estimate the E-assembly self-couplings in this model using mean-field methods (Amit and Brunel, 1997; Wyrick and Mazzucato, 2021). This method allows us to obtain, self-consistently, the fixed-point values of the firing rates r_l^E, r_l^I in the l-th assembly (l = 1, …, p) via the equation

(24) $r_l^\alpha = F^\alpha[\mu_l^\alpha(\mathbf{r}), \sigma_l^\alpha(\mathbf{r})]\,,$

where r = (r₁^E, …, r_p^E, r₁^I, …, r_p^I) is the vector of population rates and F^α is the leaky integrate-and-fire current-to-rate transfer function for each population α = E, I:

(25) $F^\alpha(\mu_l^\alpha, \sigma_l^\alpha) = \left( \tau_{\mathrm{refr}} + \tau_m^\alpha \sqrt{\pi} \int_{H_l^\alpha}^{\Theta_l^\alpha} du\, e^{u^2}\, [1 + \mathrm{erf}(u)] \right)^{-1}\,,$

where H_l^α = (V_reset − μ_l^α)/σ_l^α + a k and Θ_l^α = (V_thr^α − μ_l^α)/σ_l^α + a k, with k = √(τ_s/τ_m) and a = |ζ(1/2)|/√2, are terms accounting for the synaptic dynamics (Fourcaud and Brunel, 2002). The infinitesimal means μ_l^E, μ_l^I and variances (σ_l^E)², (σ_l^I)² of the network populations comprising E and I assemblies (for l = 1, …, p assemblies) are themselves functions of the firing rates, thus leading to self-consistent equations for the fixed points (for more details see Wyrick and Mazzucato, 2021). The infinitesimal mean μ₁^E of the postsynaptic input to a neuron in a representative E assembly in focus is

(26) $\tau_E^{-1}\mu_1^E = N n_E p_{EE} J_{EE} \left[ J_{EE}^+ f_1^E r_1^E + J_{EE}^- \sum_{l=2}^{p} f_l^E r_l^E + n_{\mathrm{bg}}\, r_{\mathrm{bg}}^E \right] + N_{\mathrm{ext}} J_{E0}\, r_{\mathrm{ext}} - N n_I p_{EI} J_{EI} \left[ f_1^I J_{EI}^+ r_1^I + J_{EI}^- \sum_{l=2}^{p} f_l^I r_l^I + n_{\mathrm{bg}}\, r_{\mathrm{bg}}^I \right]\,,$

where r₁^E is the firing rate of the E assembly in focus and r₁^I is the firing rate of its paired I assembly; r_l^E, r_l^I for l = 2, …, p are the firing rates of the other E and I assemblies; and r_bg^E, r_bg^I are the firing rates of the background (unclustered) populations. f_l^E, f_l^I represent the fractions of E and I neurons in each assembly, which are drawn from a uniform distribution (see above). The first term in the first bracket of Equation 26 represents the contribution to the input current coming from neurons within the same E assembly, in other words, the self-coupling of the assembly in focus. We can thus recast this first term in the input current as s₁r₁^E, where s₁ = N n_E p_EE J_EE J_EE⁺ f₁^E. The number of neurons in the assembly is N₁ = N n_E f₁^E, and the average intra-assembly E-to-E synaptic coupling is J̄_EE^(in) = p_EE J_EE J_EE⁺, from which we obtain s₁ = N₁ J̄_EE^(in), which is the expression we used in Figure 5. We can thus recast Equation 26 as

(27) $\tau_E^{-1}\mu_1^E = s_1^E r_1^E - s_1^I r_1^I + \sum_{l=2}^{p} \left( \hat{J}_{1l}^{EE}\, r_l^E - \hat{J}_{1l}^{EI}\, r_l^I \right) + \hat{J}_{\mathrm{bg},E}\, r_{\mathrm{bg}}^E - \hat{J}_{\mathrm{bg},I}\, r_{\mathrm{bg}}^I + \hat{J}_{\mathrm{ext}}\, r_{\mathrm{ext}}\,,$

where the Ĵ represent effective synaptic couplings that depend on the underlying spiking network parameters in Equation 26. In the spiking model, the self-couplings have units of [mV]. The first line of Equation 27 exhibits the same functional form as the rate model in Equation 1, if we identify each rate unit with a functional assembly and its corresponding self-coupling. A crucial simplification occurring in the rate model, Equation 1, is the absence of cell-type specific connectivity and the corresponding difference in the statistics of the effective couplings Ĵ, whose mean is zero in Equation 1 but non-zero in Equation 27. If we interpret the current x in the rate model as a membrane potential with units of [mV] (see Miller and Fumarola, 2012), and the current-to-rate function φ(x) = tanh(x) as a normalized (min-maxed) firing-rate fluctuation around baseline (see Ahmadian et al., 2015), then the self-coupling in the rate model exhibits units of [mV] as in the spiking model. However, a direct numerical comparison of the self-couplings between the two models is hampered by the fact that the spiking model is a balanced network, where E and I contributions to the total input current are large and cancel to leading order (Wyrick and Mazzucato, 2021), whereas the rate network does not operate in the balanced regime.
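Numerically, the identification s₁ = N₁J̄_EE^(in) amounts to a one-line estimate; the sketch below uses illustrative numbers (an assembly of 80 neurons and the Table 1 weights evaluated at N = 2000):

    import numpy as np

    N = 2000
    N1 = 80                               # E-assembly size (illustrative)
    pEE = 0.2
    jEE = 0.9 / np.sqrt(N)                # base E-to-E weight, assumed 1/sqrt(N)
    JEE_plus = 14.0                       # potentiation factor at N = 2000
    Jbar_in = pEE * jEE * JEE_plus        # average intra-assembly E-to-E coupling
    s1 = N1 * Jbar_in                     # s1 = N1 * Jbar_EE^(in), as in Figure 5
    print(f"s1 ~ {s1:.2f} [mV]")          # ~4.5 mV for these illustrative numbers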

Model selection for timescale fit

The degree of the polynomial that best fits the dependence of the logarithm of the assembly timescales on the assembly self-couplings was estimated using cross-validation (inset in Figure 5b), according to the following steps. (1) The dataset was split into a training set and a test set of the same size. (2) Polynomial functions of increasing degree were fit to the training set. (3) The mean-squared error on the test set was obtained for each fit. (4) A minimum was achieved for a polynomial of degree 2. All logarithms, abbreviated as 'log' in the main text, are in the natural base e.
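The procedure corresponds to the following sketch, with synthetic placeholder data standing in for the measured self-couplings and timescales:

    import numpy as np

    rng = np.random.default_rng(5)
    s_E = rng.uniform(0.5, 3.0, 200)                  # placeholder self-couplings
    log_T = 0.4 * s_E**2 + 2.0 * s_E + 1.0 + rng.normal(0.0, 0.5, s_E.size)

    idx = rng.permutation(s_E.size)
    train, test = idx[:s_E.size // 2], idx[s_E.size // 2:]  # equal-size split
    mse = {}
    for deg in range(1, 6):
        coef = np.polyfit(s_E[train], log_T[train], deg)    # fit on training set
        mse[deg] = np.mean((np.polyval(coef, s_E[test]) - log_T[test]) ** 2)
    best = min(mse, key=mse.get)    # a quadratic minimizes the test error here
    print(mse, "-> best degree:", best)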

Appendix 1

Dynamical regions of networks with identical self-couplings

Appendix 1—figure 1
Transition to chaos with multiple self-couplings: Fixed point solutions and stability.

(a–i) The fixed-point curve xα − sα tanh(xα), from Equation 15, for sα > 1. Stable solutions are allowed within the dark green region. (b–i) The shape of a unit's contribution to stability, q⁻¹ = [sα − cosh²(xα)]⁻², from Equation 18. Stable solutions of xα − sα tanh(xα) = η, filled blue circles in (a–i), with different |x| values contribute differently to stability. At the edge of chaos, only the fixed-point configuration in which all units contribute most to stability (minimal q⁻¹) is stable, light green region in (a–i). (a–ii) The curve xα − sα tanh(xα) for sα < 1. (a–iii) A possible distribution of the Gaussian mean field η. A representative fixed-point solution is illustrated by the dashed blue line: for sα < 1, a single solution exists for all values of η (filled blue circle in a–ii); for sα > 1, multiple solutions exist (a–i) for some values of η, some of which lead to instability (empty blue circle in a–i). The other two solutions may lead to stability (filled blue circles in a–i), although only one of them remains stable at the edge of chaos (encircled with a green line in a–i).

It is instructive to quickly survey the results of Stern et al., 2014, who studied the special case of a single self-coupling value s shared by all units in the network, P(sᵢ) = δ_{s,sᵢ}. In this case, the dynamics of all units in the network follow:

(28) $\frac{dx_i}{dt} = -x_i + s \tanh(x_i) + g \sum_{j=1}^{N} J_{ij}\, \phi(x_j)\,.$

Two parameters determine the network dynamics: the network gain g and the self-coupling value s. The network gain g sets the strength of the network's impact on its units; without self-coupling (s = 0), it brings the network into chaotic activity for values g > 1 (Sompolinsky et al., 1988). The self-coupling s generates bi-stability. Without network impact (g = 0), the dynamical Equation 28 for each unit becomes

(29) $\frac{dx_i}{dt} = -x_i + s \tanh(x_i)\,,$

which has two stable solutions for s > 1 (Appendix 1—figure 2a), both with x ≠ 0. For s < 1 (Appendix 1—figure 2b), a single stable solution exists, at x = 0.

When small values of the network gain g are introduced into the network dynamics, Equation 28, with identical bi-stable units (s > 1), each unit's solution jitters around one of its two possible fixed points. After a period of irregular activity, the network settles into a stable fixed point. This generates a region of transient irregular activity with stable fixed points (Appendix 1—figure 2c). As g increases and s decreases, different possible fixed-point configurations lose their stability (as a result, the typical time spent in the transient activity increases). When the critical line s_c ≈ 1 + 0.157 ln(0.443g + 1) is crossed, no fixed point remains stable and the network activity becomes chaotic (Stern et al., 2014). The 'last' stable fixed point at the transition line has a unique configuration, with all unit values located farthest from x = 0 (Appendix 1—figure 2a, light green lines). A further decrease of s and g leads to a region where any initial activity of the network decays and the trivial solution (xᵢ = 0 for all i) is stable (Appendix 1—figure 2c).
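The resulting phase diagram can be summarized in a small classifier; the critical line is quoted from Stern et al., 2014, while the boundary of the trivial region, g = 1 − s, is our reading obtained by linearizing Equation 28 around x = 0:

    import numpy as np

    def region(s, g):
        """Classify a point of the (s, g) phase diagram for Equation 28."""
        if s + g < 1:                # trivial fixed point stable (linearization)
            return "activity decays to the trivial fixed point x = 0"
        s_c = 1 + 0.157 * np.log(0.443 * g + 1)   # critical line (Stern et al., 2014)
        if s > s_c:
            return "transient irregular activity, then a stable nonzero fixed point"
        return "chaotic activity"

    for s, g in [(0.3, 0.5), (0.5, 1.5), (1.6, 1.5)]:
        print(f"s = {s}, g = {g}: {region(s, g)}")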

Appendix 1—figure 2
Network dynamics with identical self-couplings, adapted from Stern et al., 2014.

(a, b) Graphical solutions to Equation 29. (a) For s > 1, there are two stable non-zero solutions (full black circles) and an unstable solution at zero (open black circle). The green background over the x axis denotes the regions of allowed activity values at the stable fixed point on the transition line to chaos (solid red curve in (c)). (b) For s < 1, there is a single stable solution (full black circle) at zero. (c) Regions of the network dynamics over a range of s and g values. Below the long-dashed blue line, any initial activity in the network decays to zero. Above the solid red curve, the network exhibits transient irregular activity that eventually settles into one out of a number of possible non-zero stable fixed points. In the region between these two curves, the network activity is chaotic. Colored circles denote, according to their locations on the phase diagram and their colors, the values of s (starting at 1.6 and decreasing in steps of 0.2) and g = 1.5 used for the autocorrelation functions C(τ) in (d; corrected version of Figure 4a in Stern et al., 2014). (e) Widths at half peak (the values of τ in the main-text notation) of the autocorrelation functions in (d).

Appendix 2

Metastable dynamics in the spiking network model

The spiking network model with E/I clusters is based on a weight matrix J with four E/I submatrices, each one exhibiting diagonal blocks representing potentiated intra-cluster synaptic weights (see Appendix 2—figure 1a, b). We can approximate the full N×N weight matrix with a mean-field reduction to a set of 2(p+1) mean-field variables rᵢ^MF, representing the p E and I clusters and the two unclustered background E and I populations, coupled through a reduced weight matrix J^MF. In order to gain an intuitive understanding of the population dynamics encoded by J^MF, we follow the approach in Murphy and Miller, 2009; Schaub et al., 2015 and consider an auxiliary linear system

(30) $\tau_m \frac{dr_i^{MF}}{dt} = -r_i^{MF} + \sum_{j=1}^{2p+2} J_{ij}^{MF}\, r_j^{MF}\,,$

which was shown to capture the transition from asynchronous irregular activity to metastable attractor dynamics in Schaub et al., 2015. A Schur decomposition J^MF = V T Vᵀ gives an upper triangular matrix T, whose Schur eigenvectors (the columns of V) represent independent dynamical modes (i.e., an orthonormal basis; Appendix 2—figure 1b). The Schur eigenvalues (the diagonal values of the Schur matrix T) correspond to the real parts of the eigenvalues of the original matrix J^MF. The Schur eigenvectors associated with large positive eigenvalues approximately correspond to E/I cluster pairs; larger eigenvalues correspond to clusters of increasingly larger size (Appendix 2—figure 1b).
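The decomposition itself is a few lines of SciPy; the sketch below applies it to a random placeholder matrix of the appropriate dimension rather than to the paper's actual J^MF:

    import numpy as np
    from scipy.linalg import schur

    p = 10
    dim = 2 * (p + 1)                   # p E clusters + bg E, p I clusters + bg I
    J_MF = np.random.default_rng(7).normal(0.0, 0.3, (dim, dim))  # placeholder

    T, V = schur(J_MF, output='real')   # J_MF = V T V^T with orthonormal V
    schur_eigs = np.diag(T)             # ~ real parts of the eigenvalues of J_MF
    order = np.argsort(schur_eigs)[::-1]
    dominant_loadings = V[:, order[0]]  # population loadings of the top mode
    print("largest Schur eigenvalue:", schur_eigs[order[0]])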

Appendix 2—figure 1
Synaptic weight matrices of clustered spiking networks.

(a) Left: Synaptic weight matrix J of a clustered spiking network with N = 2000 neurons, exhibiting the four E/I submatrices, where diagonal blocks within each submatrix reveal the E/I clustered structure (larger to smaller clusters, top to bottom in each submatrix). Right: 2(p+1)-dimensional mean-field-reduced synaptic weight matrix J^MF corresponding to the full matrix on the left. Populations are ordered as follows: p excitatory clusters (larger to smaller, top to bottom), background excitatory population, p inhibitory clusters, background inhibitory population. For each pair of populations, the value in J^MF is obtained from the corresponding block in J by summing its column values (presynaptic inputs) and averaging over the rows (postsynaptic neurons belonging to the population). (b) Right: Schur decomposition J^MF = V T Vᵀ into an upper triangular matrix T. Left: Schur eigenvectors (columns of V, bottom) sorted from larger to smaller Schur eigenvalue (top). Larger Schur eigenvalues are associated with eigenvectors whose loadings are on larger E/I cluster pairs.

Clustered networks generate metastable attractor dynamics in which coupled E/I cluster pairs switch between periods of low and high firing-rate activity, yielding a bimodal firing-rate distribution (Appendix 2—figure 2a, b; see Litwin-Kumar and Doiron, 2012; Deco and Hugues, 2012; Mazzucato et al., 2015; Mazzucato et al., 2019; Mazzucato et al., 2016; Wyrick and Mazzucato, 2021; Schaub et al., 2015). A more detailed analysis revealed that the network activity explores metastable attractors with a varying number of simultaneously active clusters (from one to four in the representative network with N = 2000 neurons in Appendix 2—figure 2c), yielding a complex attractor landscape (Mazzucato et al., 2015; Wyrick and Mazzucato, 2021). The firing rate of neurons in active clusters is inversely proportional to the number of simultaneously active clusters. Therefore, the activity of single neurons is not simply bistable but rather multistable, with several attainable firing-rate levels depending on which attractor the network is expressing at any given time (one inactive and four active levels in this representative case). This single-neuron multistability is a biologically plausible effect observed in cortical activity (Mazzucato et al., 2015; Recanatesi et al., 2022); however, it is not present in the rate network, where units are simply bistable (Figure 4). In the spiking model, clusters are uncorrelated (0.01 ± 0.12, mean ± SD across 20 networks; Appendix 2—figure 2c), similar to the units in the rate network.

Appendix 2—figure 2
Metastable attractor dynamics in clustered spiking networks.

(a) Metastable activity from all clustered neurons in a representative trial of a network with N = 2000 neurons (larger to smaller clusters, top to bottom; action potentials from excitatory and inhibitory neurons in black and red, respectively). (b) Firing-rate distributions of four representative neurons from the network in (a) exhibit bimodal distributions (colors represent clusters with different self-couplings; spike counts estimated in 100 ms bins over 20 trials of 200 s duration). (c) Left: Metastable attractor dynamics unfold through different network configurations with a range of simultaneously active clusters from one to four (average occurrence of each configuration across 20 networks as a fraction of total simulation time). Inset: Metastable dynamics are uncorrelated across clusters (distribution of pairwise Pearson correlations between clusters' firing rates, 0.01 ± 0.12, mean ± SD; 20 networks). Right: The firing rate of neurons in active clusters depends on how many clusters are simultaneously active, with higher firing rates in configurations with fewer co-active clusters (mean and SD across 20 networks).

Appendix 2—figure 3
Metastable activity and timescale distribution in networks of N=2000,5000 neurons (top and bottom rows, respectively).

Panels (a) and (b) use the same notation as Figure 5b and Figure 5d, respectively. The fits of the average activation time T versus self-coupling, log(T) = a₂s_E² + a₁s_E + a₀, yielded a₂ = 0.44, a₁ = 2.06, a₀ = 1.04 for N = 2000, and a₂ = 0.093, a₁ = 0.88, a₀ = 0.45 for N = 5000.

Appendix 2—figure 4
Chaotic rate network of N=200 units (self-couplings drawn from a lognormal distribution with parameters μ=0.2,σ=2).

(a) Eigenvalue distribution (brown) and Schur eigenvalues (pink) of the synaptic weight matrix J. (b) The Schur eigenvectors of J corresponding to large positive Schur eigenvalues have loadings localized on units with larger self-couplings, with a structure similar to the Schur eigenvectors of the spiking networks in Appendix 2—figure 1.

Appendix 3

RNN with heterogeneous time constants

Our recurrent neural network model in Equation 1 assumes that all units share the same time constant, θ = 1 ms, which sets the rate of change of a neuron's membrane potential. We examined whether a network of units with heterogeneous time constants could give rise to multiple timescales of dynamics. We simulated the model from Equation 1 with no self-coupling term (sᵢ = 0) and with neuron-specific time constants θᵢ:

(31) $\theta_i \frac{d}{dt} x_i(t) = -x_i(t) + g \sum_j J_{ij}\, \phi[x_j(t)]\,.$

Following the same strategy as in Figure 2, we consider the scenario in which our network contains two equal-size populations of neurons (N₁ = N₂) with different time constants θ₁ ≠ θ₂. We quantified each unit's timescale τᵢ as the width of its autocorrelation function at mid-height. When keeping θ₁ fixed and increasing θ₂, we found that both populations increased their timescales (Appendix 3—figure 1a, i–v), while the ratio between the timescales of the two populations, τ₂/τ₁, did not appreciably change over a large range of time-constant ratios θ₂/θ₁ (Appendix 3—figure 1b). Hence, we conclude that heterogeneity in single-cell time constants does not lead to a large separation of timescales in networks with strong recurrent dynamics.
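A reduced version of this experiment (smaller network and coarser sampling than Appendix 3—figure 1) can be sketched as follows; the half-height width estimator mirrors the timescale definition used above:

    import numpy as np

    rng = np.random.default_rng(6)
    N, g = 200, 2.5
    theta = np.r_[np.full(N // 2, 0.1), np.full(N // 2, 10.0)]  # theta1, theta2 [ms]
    J = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))

    dt, n_steps, burn = 0.01, 40_000, 5_000
    x = rng.normal(0.0, 1.0, N)
    xs = np.empty((n_steps - burn, N))
    for t in range(n_steps):
        x += dt / theta * (-x + g * J @ np.tanh(x))   # Euler step of Equation 31
        if t >= burn:
            xs[t - burn] = x

    def half_width(sig):
        """Lag at which the autocorrelation decays to half its zero-lag value."""
        sig = sig - sig.mean()
        S = np.abs(np.fft.rfft(sig, n=2 * sig.size)) ** 2   # Wiener-Khinchin
        ac = np.fft.irfft(S)[:sig.size]
        below = np.flatnonzero(ac < 0.5 * ac[0])
        return below[0] * dt if below.size else np.inf

    tau1 = np.mean([half_width(xs[:, i]) for i in range(0, N // 2, 10)])
    tau2 = np.mean([half_width(xs[:, i]) for i in range(N // 2, N, 10)])
    print(f"tau2 / tau1 = {tau2 / tau1:.2f}")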

Appendix 3—figure 1
Timescale analysis for an RNN with two time constants θᵢ, Equation 31, governing equal-size populations of neurons (N₁ = N₂ = 1000), with gain g = 2.5.

(a) Average autocorrelation function for each population. The inset shows the dynamics of individual neurons from each population: blue for neurons with time constant θ₁ and green for neurons with time constant θ₂. In the networks considered here, θ₁ = 0.1 ms is kept constant while θ₂ = 0.1 ms (i), θ₂ = 1.0 ms (ii), θ₂ = 10.0 ms (iii), θ₂ = 100.0 ms (iv), or θ₂ = 1000.0 ms (v). (b) Population timescale ratio τ₂/τ₁ for fixed time constant θ₁ = 0.1 ms and varying θ₂.

Data availability

https://github.com/nistrate/multipleTimescalesRNN (copy archived at Istrate, 2022).

The following previously published data sets were used
    1. Cavanagh S
    2. Wallis J
    3. Kennerley S
    4. Hunt L
    (2020) Dryad Digital Repository
    Data from: Autocorrelation structure at rest predicts value correlates of single neurons during reward-guided choice.
    https://doi.org/10.5061/dryad.5b331


Article and author information

Author details

  1. Merav Stern

    1. Institute of Neuroscience, University of Oregon, Eugene, United States
    2. Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
    Contribution
    Conceptualization, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Contributed equally with
    Nicolae Istrate
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0001-7744-4855
  2. Nicolae Istrate

    1. Institute of Neuroscience, University of Oregon, Eugene, United States
    2. Departments of Physics, University of Oregon, Eugene, United States
    Contribution
    Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Contributed equally with
    Merav Stern
    Competing interests
    No competing interests declared
  3. Luca Mazzucato

    1. Institute of Neuroscience, University of Oregon, Eugene, United States
    2. Departments of Physics, University of Oregon, Eugene, United States
    3. Mathematics and Biology, University of Oregon, Eugene, United States
    Contribution
    Conceptualization, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    lmazzuca@uoregon.edu
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-8525-7539

Funding

National Institute of Neurological Disorders and Stroke (R01-NS118461)

  • Luca Mazzucato

National Institute on Drug Abuse (R01-DA055439)

  • Luca Mazzucato

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We would like to thank J Wallis and S Cavanagh for sharing their previously published data Cavanagh et al., 2016; and G Mongillo and G La Camera for discussions. LM was supported by National Institute of Neurological Disorders and Stroke grant R01-NS118461 and by the National Institute on Drug Abuse grant R01-DA055439 (CRCNS). MS was supported by The Hebrew University of Jerusalem 'Emergency response to COVID-19’ grant and hosted by the Theoretical Sciences Visiting Program at the Okinawa Institute of Science and Technology while completing this study.

Version history

  1. Preprint posted: October 12, 2021 (view preprint)
  2. Received: January 31, 2023
  3. Accepted: December 7, 2023
  4. Accepted Manuscript published: December 12, 2023 (version 1)
  5. Version of Record published: January 25, 2024 (version 2)

Copyright

© 2023, Stern, Istrate et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
