The geometry and dimensionality of brain-wide activity

  1. Zezhen Wang
  2. Weihao Mai
  3. Yuming Chai
  4. Kexin Qi
  5. Hongtai Ren
  6. Chen Shen
  7. Shiwu Zhang
  8. Guodong Tan
  9. Yu Hu  Is a corresponding author
  10. Quan Wen  Is a corresponding author
  1. School of Data Science, University of Science and Technology of China, China
  2. Division of Life Science, The Hong Kong University of Science and Technology, China
  3. Hefei National Laboratory for Physical Sciences at the Microscale, Center for Integrative Imaging, University of Science and Technology of China, China
  4. Division of Life Sciences and Medicine, University of Science and Technology of China, China
  5. Department of Precision Machinery and Precision Instrumentation, University of Science and Technology of China, China
  6. Department of Mathematics, The Hong Kong University of Science and Technology, China

eLife Assessment

This important study shows a surprising scale-invariance of the covariance spectrum of large-scale recordings in the zebrafish brain in vivo. A convincing analysis demonstrates that a Euclidean random matrix model of the covariance matrix recapitulates these properties. The results provide several new and insightful approaches for probing large-scale neural recordings.

https://doi.org/10.7554/eLife.100666.3.sa0

Abstract

Understanding neural activity organization is vital for deciphering brain function. By recording whole-brain calcium activity in larval zebrafish during hunting and spontaneous behaviors, we find that the shape of the neural activity space, described by the neural covariance spectrum, is scale-invariant: a smaller, randomly sampled cell assembly resembles the entire brain. This phenomenon can be explained by Euclidean Random Matrix theory, where neurons are reorganized from anatomical to functional positions based on their correlations. Three factors contribute to the observed scale invariance: slow neural correlation decay, higher functional space dimension, and neural activity heterogeneity. In addition to matching data from zebrafish and mice, our theory and analysis demonstrate how the geometry of neural activity space evolves with population sizes and sampling methods, thus revealing an organizing principle of brain-wide activity.

Introduction

Geometric analysis of neuronal population activity has revealed the fundamental structures of neural representations and brain dynamics (Churchland et al., 2012; Zhang et al., 2023; Kriegeskorte and Wei, 2021; Chung and Abbott, 2021). Dimensionality reduction methods, which identify collective or latent variables in neural populations, simplify our view of high-dimensional neural data (Cunningham and Yu, 2014). Their applications to optical and multi-electrode recordings have begun to reveal important mechanisms by which neural cell assemblies process sensory information (Stringer et al., 2019a; Si et al., 2019), make decisions (Mante et al., 2013; Yang et al., 2022), maintain working memory (Xie et al., 2022), and generate motor behaviors (Churchland et al., 2012; Nguyen et al., 2016; Lindén et al., 2022; Urai et al., 2022).

In the past decade, the number of neurons that can be simultaneously recorded in vivo has grown exponentially (Buzsáki, 2004; Ahrens et al., 2012; Jun et al., 2017; Stevenson and Kording, 2011; Nguyen et al., 2016; Sofroniew et al., 2016; Lin et al., 2022; Meshulam et al., 2019; Demas et al., 2021). This increase spans various brain regions (Musall et al., 2019; Stringer et al., 2019a; Jun et al., 2017) and the entire mammalian brain (Stringer et al., 2019b; Kleinfeld et al., 2019). As more neurons are recorded, the multidimensional neural activity space, with each axis representing a neuron's activity level (Figure 1A), becomes more complex. The changing size of observed cell assemblies raises a number of basic questions. How does the geometry of this space evolve, and which structures remain invariant, as the number of recorded neurons increases?

The relationship between the geometric properties of the neural activity space and the size of neural assemblies.

(A) Illustration of how the dimensionality of neural activity (DPR) changes with the number of recorded neurons. (B) The eigenvalues of the neural covariance matrix dictate the geometrical configuration of the neural activity space, with λi being the distribution width along a principal axis. (C) Examples of two neural populations with identical dimensionality (DPR = 25/11 ≈ 2.27) but different spatial configurations, as revealed by the eigenvalue spectrum (green: {λi} = {7, 7, 1}; blue: {λi} = {9, 3, 3}).

A key measure, the effective dimension or participation ratio (denoted as DPR, Figure 1B), captures a major part of variability in neural activity (Recanatesi et al., 2019; Litwin-Kumar et al., 2017; Gao et al., 2017; Clark et al., 2023; Dahmen et al., 2020). How does DPR vary with the number of sampled neurons (Figure 1A)? Two scenarios are possible: either DPR grows continuously as more neurons are sampled, or DPR saturates as the sample size increases. Which scenario fits the brain? Furthermore, even if two cell assemblies have the same DPR, they can have different shapes (the geometric configuration of the neural activity space, as dictated by the eigenspectrum of the covariance matrix, Figure 1C). How does the shape vary with the number of neurons sampled? Lastly, will we observe the same picture of the neural activity space when using different recording methods, such as two-photon microscopy, which records all neurons in a brain region, versus Neuropixels (Jun et al., 2017), which conducts a broad random sampling of neurons?

Here, we aim to address these questions by analyzing brain-wide Ca2+ activity in larval zebrafish during hunting or spontaneous behavior (Figure 2A) recorded by Fourier light-field microscopy (Cong et al., 2017). The small size of this vertebrate brain, together with the volumetric imaging method, enables us to capture a significant amount of neural activity across the entire brain simultaneously. To characterize the geometry of neural activity beyond its dimensionality DPR, we examine the eigenvalues or spectrum of neural covariance (Hu and Sompolinsky, 2022; Figure 1C). The covariance spectrum has been instrumental in offering mechanistic insights into neural circuit structure and function, such as the effective strength of local recurrent interactions and the depiction of network motifs (Hu and Sompolinsky, 2022; Morales et al., 2023; Dahmen et al., 2020). Intriguingly, we find that both the dimensionality and covariance spectrum remain invariant for cell assemblies that are randomly selected from various regions of the zebrafish brain. We also verify this observation in datasets recorded by different experimental methods, including light-sheet imaging of larval zebrafish (Chen et al., 2018), two-photon imaging of mouse visual cortex (Stringer et al., 2019b), and multi-area Neuropixels recording in the mouse (Stringer et al., 2019b). To explain the observed phenomenon, we model the covariance matrix of brain-wide activity by generalizing the Euclidean Random Matrix (ERM) (Mézard et al., 1999) such that neurons correspond to points distributed in a d-dimensional functional or feature space, with pairwise correlation decaying with distance. The ERM theory, studied in theoretical physics (Mézard et al., 1999; Goetschy and Skipetrov, 2013), provides extensive analytical tools for a deep understanding of the neural covariance matrix model, allowing us to unequivocally identify three crucial factors for the observed scale invariance.

Figure 2 with 4 supplements
Whole-brain calcium imaging of zebrafish neural activity and the phenomenon of its scale-invariant covariance eigenspectrum.

(A) Rapid light-field Ca2+ imaging system for whole-brain neural activity in larval zebrafish. (B) Inferred firing rate activity from the brain-wide calcium imaging. The ROIs are sorted by their weights in the first principal component (Stringer et al., 2019b). (C) Procedure of calculating the covariance spectrum on the full and sampled neural activity matrices. (D) Dimensionality (circles, average across eight samplings (dots)), as a function of the sampling fraction. The curve is the predicted dimensionality using Equation 5. (E) Iteratively sampled covariance matrices. Neurons are sorted in each matrix to maximize values near the diagonal. (F) The covariance spectra, that is, eigenvalue versus rank/N, for randomly sampled neurons of different sizes (colors). The gray dots represent the sorted variances Cii of all neurons. (G–I) Same as F but from three models of covariance (see details in Methods): (G) a Wishart random matrix calculated from a random activity matrix of the same size as the experimental data; (H) replacing the eigenvectors by a random orthogonal set; (I) covariance generated from a randomly connected recurrent network. The collapse index (CI), which quantifies the level of scale invariance in the eigenspectrum (see Methods), is: (G) CI = 0.214; (H) CI = 0.222; (I) CI = 0.139.

Building upon our theoretical results, we further explore the connection between the spatial arrangement of neurons and their locations in functional space, which allows us to distinguish among three sampling approaches: random sampling, anatomical sampling (akin to optical recording of all neurons within a specific region of the brain) and functional sampling (Meshulam et al., 2019). Our ERM theory makes distinct predictions regarding the scaling relationship between dimensionality and the size of cell assembly, as well as the shape of covariance eigenspectrum under various sampling methods. Taken together, our results offer a new perspective for interpreting brain-wide activity and unambiguously show its organizing principles, with unexplored consequences for neural computation.

Results

Geometry of neural activity across random cell assemblies in zebrafish brain

We recorded brain-wide Ca2+ activity at a volume rate of 10 Hz in head-fixed larval zebrafish (Figure 2A) during hunting attempts (Methods) and spontaneous behavior using Fourier light-field microscopy (Cong et al., 2017). Approximately 2000 ROIs (1977.3 ± 677.1, mean ± SD) with a diameter of 16.84 ± 8.51 µm were analyzed per fish based on voxel activity (Methods, Figure 2—figure supplement 1). These ROIs likely correspond to multiple nearby neurons with correlated activity. Henceforth, we refer to the ROIs as ‘neurons’ for simplicity.

We first investigate the dimensionality of neural activity DPR (Figure 1B) in a randomly chosen cell assembly in zebrafish, similar to multi-area Neuropixels recording in a mammalian brain. We focus on how DPR changes with a large sample size N. We find that if the mean squared covariance remains finite instead of vanishing with N, the dimensionality DPR (Figure 1B) becomes sample size independent and depends only on the variance σi2 and the covariance Cij between neurons i and j:

(1) $\lim_{N\to\infty} D_{\mathrm{PR}} = \dfrac{\left[E(\sigma_i^2)\right]^2}{E_{i\neq j}\!\left(C_{ij}^2\right)},$

where E(⋅) denotes the average across neurons (Methods and Dahmen et al., 2020). The condition of finite mean squared covariance is supported by the observation that the neural activity covariance Cij is positively biased and widely distributed with a long tail (Figure 2—figure supplement 2A). As predicted, the data dimensionality grows with sample size and reaches the maximum value specified by Equation 1 (Figure 2D).

Next, we investigate the shape of the neural activity space described by the eigenspectrum of the covariance matrix derived from the activity of N randomly selected neurons (Figure 2C). When the eigenvalues are arranged in descending order and plotted against the normalized rank r/N, where r = 1, …, N (we refer to this as the rank plot), the curve follows an approximate power law spanning over a tenfold range. Interestingly, as the size of the covariance matrices decreases (N decreases), the eigenspectrum curves nearly collapse over a wide range of eigenvalues. This pattern holds across diverse datasets and experimental techniques (Figure 2F, Figure 2—figure supplement 2E–L). The similarity of the covariance matrices of randomly sampled neural populations can be intuitively visualized (Figure 2E), after properly sorting the neurons (Methods).
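
The rank-plot construction under random sampling can be sketched as follows (an illustrative Julia sketch, not the original analysis code; the activity matrix A and the plotting step are assumed to be supplied by the user):

```julia
using LinearAlgebra, Statistics, Random

# Covariance eigenspectrum of a random cell assembly of N_sub neurons,
# returned as (normalized rank r/N_sub, eigenvalues sorted in descending order).
function sampled_spectrum(C::AbstractMatrix, N_sub::Integer; rng = Random.default_rng())
    idx = randperm(rng, size(C, 1))[1:N_sub]          # random sampling (RSap)
    λ = sort(eigvals(Symmetric(C[idx, idx])), rev = true)
    return collect(1:N_sub) ./ N_sub, λ
end

# Usage sketch: A is a neurons × time activity matrix (assumed available).
# C = cov(A')                                         # neuron-by-neuron covariance
# for frac in (1.0, 0.5, 0.25, 0.125)
#     q, λ = sampled_spectrum(C, round(Int, frac * size(C, 1)))
#     # plot log10.(q) against log10.(λ); for the data these curves nearly collapse
# end
```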

The scale invariance in the neural covariance matrix – the collapse of the covariance eigenspectrum under random sampling – is non-trivial. The spectrum is not scale invariant in a common covariance matrix model based on independent noise (Figure 2G). It is absent when replacing the neural covariance matrix eigenvectors with random ones, keeping the eigenvalues identical (Figure 2H). A recurrent neural network with random connectivity (Hu and Sompolinsky, 2022) does not yield a scale-invariant covariance spectrum (Figure 2I). A recently developed latent variable model (Morrell et al., 2024; Appendix 1—figure 6), which is able to reproduce avalanche criticality, also fails to generate the scale-invariant covariance spectrum. Thus, a new model is needed for the covariance matrix of neural activity.

Modeling covariance by organizing neurons in functional space

Dimension reduction methods simplify and visualize complex neuron interactions by embedding them into a low-dimensional map, within which nearby neurons have similar activities. Inspired by these ideas, we use the ERM (Mézard et al., 1999) to model neural covariance. Imagine neurons sprinkled uniformly over a d-dimensional functional space of size L (Figure 3A), where the distance between neurons i and j determines their correlation. Let xi represent the functional coordinate of neuron i. The distance–correlation dependency is described by a kernel function f(‖xi − xj‖) > 0 with f(0) = 1, indicating that closer neurons have stronger correlations, which decrease as the distance ‖xi − xj‖ increases (Figure 3A and Methods). To model the covariance, we extend the ERM by incorporating heterogeneity of neuronal activity levels (shown as the size of each neuron in the functional space in Figure 3A).

(2) $C_{ij} = \sigma_i \sigma_j \, f\!\left(\lVert x_i - x_j \rVert\right), \quad i, j = 1, 2, \ldots, N.$

The variance of neural activity, σi², is drawn i.i.d. from a given distribution and is independent of the neuron's position.

Figure 3 with 2 supplements
Euclidean Random Matrix (ERM) model of covariance and its eigenspectrum.

(A) Schematic of the ERM model, which reorganizes neurons (circles) from the anatomical space to the functional space (here d=2 is a two-dimensional box). The correlation between a pair of neurons decreases with their distance in the functional space according to a kernel function f(x). This correlation is then scaled by neurons’ variance σi² (circle size) to obtain the covariance Cij. (B) An example ERM correlation matrix (i.e., when σi² ≡ 1). (C) Spectrum (same as Figure 2F) for the ERM correlation matrix in (B). The gray dots represent the sorted variances Cii of all neurons (same as in Figure 2F). (D) Visualizing the distribution of the same ERM eigenvalues in C by plotting the probability density function (pdf).

This multidimensional functional space may represent attributes to which neurons are tuned, such as sensory features (e.g., visual orientation Hubel and Wiesel, 1959, auditory frequency) and movement characteristics (e.g., direction, speed Stefanini et al., 2020; Kropff et al., 2015). In sensory systems, it represents stimuli as neural activity patterns, with proximity indicating similarity in features. For motor control, it encodes movement parameters and trajectories. In the hippocampus, it represents the place field of a place cell, acting as a cognitive map of physical space (O’Keefe, 1976; Moser et al., 2008; Tingley and Buzsáki, 2018).

We first explore the ERM with various forms of f(x) and find that fast-decaying functions, such as Gaussian and exponential kernels, produce neither eigenspectra similar to the data nor scale invariance over random sampling (Figure 3—figure supplement 1A–H and Appendix 2). Thus, we turn to slow-decaying functions, including the power law, which produce spectra similar to the data (Figure 3C, D; see also Figure 3—figure supplement 2). We adopt a particular kernel function because of its closed form and analytical properties: f(x) = ϵ^μ (ϵ² + x²)^(−μ/2) (Methods). For large distances x ≫ ϵ, it approximates a power law f(x) ≈ ϵ^μ x^(−μ), and it smoothly transitions at small distances to satisfy the correlation requirement f(0) = 1 (Appendix 1—figure 3I, J).
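
A minimal sketch of generating an ERM covariance matrix (Equation 2) with this kernel is given below; the parameter values are illustrative assumptions rather than quantities fitted to data:

```julia
using LinearAlgebra, Random

# Build an ERM covariance matrix (Equation 2): neurons at random positions in a
# d-dimensional box of size L, correlations decaying with functional distance via
# f(x) = ϵ^μ (ϵ² + x²)^(−μ/2), scaled by heterogeneous variances σ2.
function erm_covariance(N, d, L, μ, ϵ; σ2 = ones(N), rng = Random.default_rng())
    X = L .* rand(rng, N, d)                              # functional coordinates
    f(x) = ϵ^μ / (ϵ^2 + x^2)^(μ / 2)                       # f(0) = 1; f(x) ≈ (ϵ/x)^μ for x ≫ ϵ
    C = [sqrt(σ2[i] * σ2[j]) * f(norm(X[i, :] - X[j, :])) for i in 1:N, j in 1:N]
    return Symmetric(C)
end

# Illustrative parameters in the slow-decay regime discussed in the text.
C = erm_covariance(1024, 2, 10.0, 0.5, 0.3)
λ = sort(eigvals(C), rev = true)                           # spectrum for the rank plot
```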

Analytical theory on the conditions of scale invariance in ERM

To determine the conditions for scale invariance in the ERM, we analytically calculate the eigenspectrum of the covariance matrix C (Equation 2) for large N and L using the replica method (Mézard et al., 1999). A key order parameter emerging from this calculation is the neuron density ρ := N/L^d. In the high-density regime ρϵ^d ≫ 1, the covariance spectrum can be approximated in closed form (Methods). For the slow-decaying kernel function f(x) defined above, the spectrum for large eigenvalues follows a power law (Appendix 2):

(3) $\lambda \propto (r/N)^{-1+\mu/d}\,\rho^{\mu/d}, \quad \text{and equivalently} \quad p(\lambda) \propto \rho^{\frac{\mu}{d-\mu}}\,\lambda^{-\frac{2d-\mu}{d-\mu}},$

where r is the rank of the eigenvalues in descending order and p(λ) is their probability density function. Equation 3 intuitively explains the scale invariance over random sampling. Sampling in the ERM reduces the neuron density ρ. The eigenspectrum is ρ-independent whenever μ/d → 0. This indicates two factors contributing to the scale invariance of the eigenspectrum. First, a small exponent μ in the kernel function f(x) means that pairwise correlations slowly decay with functional distance and can be significantly positive across various functional modules and throughout the brain. Second, for a given μ, an increase in dimension d improves the scale invariance. The dimension d could represent the number of independent features or latent variables describing neural activity or cognitive states.

We verify our theoretical predictions by comparing sampled eigenspectra in finite-size simulated ERMs across different μ and d (Figure 4A). We first consider the case of homogeneous neurons (σi² ≡ 1 in Equation 2; revisited later) in these simulations (Figures 3C, D, 4A), making C’s entries correlation coefficients. To quantitatively assess the level of scale invariance, we introduce a collapse index (CI, see Methods for a detailed definition). Motivated by Equation 3, the CI measures the shift of the eigenspectrum when the number of sampled neurons changes. Smaller CI values indicate higher scale invariance. Intuitively, it is defined as the area between spectrum curves from different sample sizes (Figure 4A, upper right). In the log–log rank plot, Equation 3 shows that the spectrum shifts vertically with ρ.

Figure 4 with 2 supplements
Three factors contributing to scale invariance.

(A) Impact of μ and d (see text) on the scale invariance of the Euclidean Random Matrix (ERM) spectrum (same plots as Figure 3C) with f(x) = ϵ^μ (ϵ² + x²)^(−μ/2). The degree of scale invariance is quantified by the collapse index (CI), which essentially measures the area between different spectrum curves (upper right inset). For comparison, we fix the same coordinate range across panels, hence some plots are cropped. The gray dots represent the sorted variances Cii of all neurons (same as in Figure 2F). (B) Top: sampled correlation matrix spectrum in an example animal (fish 1). Bottom: same as top but for the covariance matrix that incorporates heterogeneous variances. The gray dots represent the sorted variances Cii of all neurons (same as in Figure 2F). (C) The CI of the correlation matrix (filled squares) is found to be larger than that for the covariance matrix (open squares) across different datasets: f1 to f6: six light-field zebrafish data (10 Hz per volume, this paper); fl: light-sheet zebrafish data (2 Hz per volume, Chen et al., 2018); mn: mouse Neuropixels data (downsampled to 10 Hz per volume); mp: mouse two-photon data (3 Hz per volume, Stringer et al., 2019b).

Thus, we define CI as this average displacement (Figure 4A, upper right, Methods), and a smaller CI indicates greater scale invariance. Using CI, Figure 4A shows that scale invariance improves with slower correlation decay as μ decreases and the functional dimension d increases. Conversely, with large μ and small d, the covariance eigenspectrum varies significantly with scale (Figure 4A).
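
The exact CI definition is given in Methods (Equation 13); the sketch below only conveys the idea of averaging the vertical displacement between two log–log spectrum curves, with an illustrative (not the paper's) choice of rank range:

```julia
using Statistics

# Rough collapse-index-style measure: mean vertical displacement between two
# log–log rank plots, evaluated on a shared grid of normalized ranks q = r/N.
# λ_full and λ_sub must be eigenvalues sorted in descending order; only the
# large-eigenvalue part of the spectrum is used here.
function collapse_displacement(λ_full::Vector, λ_sub::Vector; nq = 50)
    q_grid = exp10.(range(-2.0, -0.5; length = nq))    # illustrative range of q
    at(λ, q) = log10(λ[clamp(round(Int, q * length(λ)), 1, length(λ))])
    return mean(abs(at(λ_full, q) - at(λ_sub, q)) for q in q_grid)
end
```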

Next, we consider the general case of unequal neural activity levels σi² and check for differences between the correlation (equivalent to setting σi² ≡ 1) and covariance matrix spectra. Using the collapse index (CI), we compare the scale invariance of the two spectra in the experimental data. Intriguingly, the CI of the covariance matrix is consistently smaller (more scale-invariant) across all datasets (Figure 4C, Figure 4—figure supplement 2C, open vs. closed squares), indicating that the heterogeneity of neuronal activity variances significantly affects the eigenspectrum and the geometry of neural activity space (Tian et al., 2024). By extending our spectrum calculation to the intermediate-density regime ρϵ^d ∼ 1 (Methods), we show that the ERM model can quantitatively explain the improved scale invariance in the covariance matrix compared to the correlation matrix (Figure 4—figure supplement 2B; Table 1).

Table 1
Table of notations.
Notation: Description
C: Covariance matrix, Equation 2
Cij: Pairwise covariance between neurons i and j; entries of C
DPR: Participation ratio dimension, Equation 5
DPR^ASap: Anatomical sampling dimension, Equation 4
λ: Eigenvalue of a covariance matrix C
p(λ): Probability density function of covariance eigenvalues, Equation 8
r: Rank of an eigenvalue in descending order, Equation 3
q: Fraction of eigenvalues up to λ, q = r/N; Equation 13
f(x) = f(‖xi − xj‖): Kernel function or distance–correlation function, Equation 11
f~(k): Fourier transform of f(x), f~(k) = ∫_{R^d} f(x) e^(−i k·x) d^d x
μ: Power-law exponent in f(x), Equation 11
ε: Resolution parameter in f(x) that smooths the singularity near 0, Equation 11
N: Number of neurons
N0: The total number of neurons prior to sampling
k: N/N0, the fraction of sampled neurons
L: Linear box size of the functional space
ρ: Density of neurons in the functional space, Equation 3
d: Dimension of the functional space, Equation 3
ai(t): Neural activity of neuron i at time t
σi²: Temporal variance of neural activity, Equation 2
CI: Collapse index for measuring scale invariance, Equation 13
α: Power-law coefficient of the eigenspectrum in the rank plot, see Discussion
xi, yi: Neuron i's coordinates in the functional and anatomical space, respectively
vfunc, vanat: The first canonical directions in the functional and anatomical space, respectively
RCCA: The first canonical correlation
RASap: Correlation between anatomical and functional coordinates along the ASap direction, Equation 4

Lastly, we examine factors that turn out to have minimal impact on the scale invariance of the covariance spectrum. First, the shape of the kernel function f(x) at small distances (i.e., the behavior of f(x) near x = 0 in the functional space, Appendix 1—figure 3) does not affect the distribution of large eigenvalues (Appendix 1—figure 3, Table 3, Appendix 1—figure 2, Appendix 1—figure 1A). This supports our use of a specific f(x) to represent a class of slow-decaying kernels. Second, altering the spatial distribution of neurons in the functional space, whether using a Gaussian, uniform, or clustered distribution, does not affect large covariance eigenvalues, except possibly the leading ones (Appendix 1—figure 1B, Appendix 1). Third, different geometries of the functional space, such as a flat square, a sphere, or a hemisphere, result in eigenspectra similar to the original ERM model (Appendix 1—figure 1C). These findings indicate that our theory for the covariance spectrum’s scale invariance is robust to various modeling details.

Connection among random sampling, functional sampling, and anatomical sampling

So far, we have focused on random sampling of neurons, but how does the neural activity space change with different sampling methods? To this end, we consider three methods (Figure 5A1): random sampling (RSap), anatomical sampling (ASap) where neurons in a brain region are captured by optical imaging (Grewe and Helmchen, 2009; Gauthier and Tank, 2018; Stringer et al., 2019a), and functional sampling (FSap) where neurons are selected based on activity similarity (Meshulam et al., 2019). In ASap or FSap, sampling involves expanding regions of interest in anatomical space or functional space while measuring all neural activity within those regions (Appendix 1). The difference among sampling methods depends on the neuron organization throughout the brain. If anatomically localized neurons also cluster functionally (Figure 5A4), ASap ≈ FSap; if they are spread in the functional space (Figure 5A2), ASap ≈ RSap. Generally, the anatomical–functional relationship is in-between and can be quantified using the Canonical Correlation Analysis (CCA). This technique finds axes (CCA vectors vanat and vfunc) in anatomical and functional spaces such that the neurons’ projection along these axes has the maximum correlation, RCCA. The extreme scenarios described above correspond to RCCA=1 and RCCA=0.
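
As an illustration, the first canonical correlation between functional and anatomical coordinates can be estimated from the whitened cross-covariance; the following is a from-scratch sketch (a packaged CCA routine could equally be used):

```julia
using LinearAlgebra, Statistics

# First canonical correlation R_CCA between functional coordinates X (N × d)
# and anatomical coordinates Y (N × 3); rows are neurons.
function first_canonical_correlation(X::AbstractMatrix, Y::AbstractMatrix; reg = 1e-8)
    Xc = X .- mean(X, dims = 1)
    Yc = Y .- mean(Y, dims = 1)
    n = size(X, 1)
    Σxx = Symmetric(Xc' * Xc / (n - 1) + reg * I)      # small ridge for numerical stability
    Σyy = Symmetric(Yc' * Yc / (n - 1) + reg * I)
    Σxy = Xc' * Yc / (n - 1)
    M = inv(sqrt(Σxx)) * Σxy * inv(sqrt(Σyy))          # whitened cross-covariance
    return svdvals(M)[1]                               # largest singular value = R_CCA
end
```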

Figure 5 with 7 supplements
The relationship between the functional and anatomical space and theoretical predictions.

(A) Three sampling methods (A1) and RCCA (see text). When RCCA ≈ 0 (A2), anatomical sampling (ASap) resembles random sampling (RSap), while when RCCA ≈ 1 (A4), ASap is similar to functional sampling (FSap). (B) Distribution of neurons in the functional space inferred by MDS. Each neuron is color-coded by its projection along the first canonical direction vanat in the anatomical space (see text). Data based on fish 6, same for (C–E). (C) Similar to (B) but plotting neurons in the anatomical space with color based on their projection along vfunc in the functional space (see text). (D) Dimensionality (DPR) across sampling methods: average DPR under RSap (circles), average and individual brain region DPR under ASap (squares and dots), and DPR under FSap for the most correlated neuron cluster (triangles; Methods). Dashed and solid lines are theoretical predictions for DPR under RSap and FSap, respectively (Methods). (E) The CI of correlation matrices under three sampling methods in six animals (colors). **p < 0.01; ***p < 0.001; one-sided paired t tests: RSap versus ASap, p = 0.0010; RSap versus FSap, p = 0.0004; ASap versus FSap, p = 0.0014.

To determine the anatomical–functional relationship in neural data, we infer the functional coordinates xi of each neuron by fitting the ERM using multidimensional scaling (MDS) (Cox and Cox, 2000) (Methods). For simplicity and better visualization, we use a low-dimensional functional space where d = 2. The fitted functional coordinates confirm the slow-decaying kernel function of the ERM except at small distances (Figure 5—figure supplement 3). The ERM with the inferred coordinates xi also reproduces the experimental covariance matrix, including cluster structures (Figure 5—figure supplement 2) and its sampling eigenspectra (Figure 5—figure supplement 1).

Equipped with the functional and anatomical coordinates, we next use CCA to determine which scenarios illustrated in Figure 5A align better with the neural data. Figure 5B, C shows a representative fish with a significant RCCA=0.327 (p-value = 0.0042, Anderson–Darling test). Notably, the CCA vector in the anatomical space, vanat, aligns with the rostrocaudal axis. Coloring each neuron in the functional space by its projection along vanat shows a correspondence between clustering and anatomical coordinates (Figure 5B). Similarly, coloring neurons in the anatomical space (Figure 5C) by their projection along vfunc reveals distinct localizations in regions like the forebrain and optic tectum. Across animals, functionally clustered neurons show anatomical segregation (Chen et al., 2018), with an average RCCA of 0.335 ± 0.054 (mean ± SD).

Next, we investigate the effects of different sampling methods (Figure 5A1) on the geometry of the neural activity space when there is a significant but moderate anatomical–functional correlation, as in the experimental data. Interestingly, the dimensionality DPR^ASap of the data under anatomical sampling consistently falls between the random-sampling and functional-sampling values (Figure 5D). This phenomenon can be intuitively explained by the ERM theory. Recall that for large N, the key term in Equation 1 is Ei≠j(Cij²). For a fixed number of sampled neurons, this average squared covariance is maximized when neurons are selected close together in the functional space (FSap) and minimized when distributed randomly (RSap). Thus, the RSap and FSap dimensionalities set the upper and lower bounds, with ASap expected to fall in between. This reasoning can be precisely formulated to obtain quantitative predictions of the bounds (Methods). We predict the ASap dimension at large N as

(4) $D_{\mathrm{PR}}^{\mathrm{ASap}} \approx \left(1 - R_{\mathrm{ASap}}^2 + k^2 R_{\mathrm{ASap}}^2\right)^{\mu/d} D_{\mathrm{PR}}.$

Here, DPR is the dimensionality under RSap (Equation 1), and k represents the fraction of sampled neurons. RASap is the correlation between anatomical and functional coordinates along the direction in which the anatomical subregions are divided (Methods), and it is bounded by the canonical correlation, RASap ≤ RCCA. When RASap = 0, we get the upper bound DPR^ASap = DPR (Figure 5D, dashed line). The lower bound is reached when RASap = RCCA = 1 (Figure 5A4), where Equation 4 gives the scaling relationship DPR^ASap = DPR^FSap ≈ k^(2μ/d) DPR, which depends on the sampling fraction k (Figure 5D, solid line). This contrasts with the k-independent dimensionality of RSap in Equation 1. Furthermore, if RASap and its upper bound are not close to 1 (precisely, RASap < 0.84 for the ERM model in Figure 5D), DPR^ASap aligns closer to the RSap upper bound. This prediction agrees well with our observations in data across animals (Figure 5D, Figure 5—figure supplement 6, and Figure 5—figure supplement 7).
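
For intuition, Equation 4 can be evaluated directly; the parameter values below are illustrative assumptions (a sampling fraction k = 0.25, μ = 0.5, d = 2, and RASap values spanning the two limits), not fitted quantities:

```julia
# Predicted ratio DPR^ASap / DPR from Equation 4.
dpr_asap_ratio(R_ASap, k, μ, d) = (1 - R_ASap^2 + k^2 * R_ASap^2)^(μ / d)

# Illustrative values: k = 0.25, μ = 0.5, d = 2.
dpr_asap_ratio(0.33, 0.25, 0.5, 2)   # ≈ 0.97: close to the RSap upper bound of 1
dpr_asap_ratio(1.0,  0.25, 0.5, 2)   # = k^(2μ/d) = 0.5: the FSap-like lower bound
```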

Beyond dimensionality, our theory predicts the difference in the covariance spectrum between sampling methods based on the neuronal density ρ in the functional space (Equation 3). This density ρ remains constant during FSap (Figure 5A1) and decreases under RSap; the average density across anatomical regions under ASap lies between those of FSap and RSap. Analogous to Equation 4, this ordering of ρ orders the spectra: ASap’s spectrum lies between those of FSap and RSap (Methods). This further implies that the level of scale invariance under ASap should fall between that of RSap and FSap, which is confirmed by our experimental data (Figure 5E).

Discussion

Impact of hunting behavior on scale invariance and functional space organization

How does task-related neural activity shape the covariance spectrum and brain-wide functional organization? We examine the hunting behavior in larval zebrafish, marked by eye convergence (both eyes move inward to focus on the central visual field) (Bianco et al., 2011). We find that scale invariance of the eigenspectra persists and is enhanced even after removing the hunting frames from the Ca2+ imaging data (Figure 4C, Appendix 1—figure 7A, B, Appendix 1). This is consistent with the scale-invariant spectrum found in other datasets during spontaneous behaviors (Figure 5—figure supplement 1F, Figure 2—figure supplement 2G, H), suggesting scale invariance is a general phenomenon.

Interestingly, in the inferred functional space, we observe a reorganization of neurons after removing hunting behavior (Appendix 1—figure 7C, D). Neurons in one cluster disperse from their center of mass (Appendix 1—figure 7D), decreasing the local neuronal density ρ (Appendix 1 and Appendix 1—figure 7E). The neurons in this dispersed cluster have a consistent anatomical distribution from the midbrain to the hindbrain in four out of five fish (Appendix 1—figure 9). During hunting, the cluster shows robust activations that are widespread in the anatomical space but localized in the functional space (Appendix 1, Appendix 1—Video 1).

Our findings suggest that the functional space could be defined by latent variables that represent cognitive factors such as decision-making, memory, and attention. These variables set the space’s dimensions, with neural activity patterns reflecting cognitive state dynamics. Functionally related neurons – through sensory tuning, movement parameters, internal conditions, or cognitive factors – become closer in this space, leading to stronger activity correlations.

Criticality and power law

What drives brain dynamics with a slow-decaying distance–correlation function f(x) in functional space? Long-range connections and a slow decline in projection strength over distance (Kunst et al., 2019) may cause extensive correlations, enhancing global activity patterns. This behavior is also reminiscent of phase transitions in statistical mechanics (Kardar, 2007), where local interactions lead to expansive correlated behaviors. Studies suggest that critical brains optimize information processing (Beggs and Plenz, 2003; Dahmen et al., 2019). The link between neural correlation structures and neuronal connectivity topology is an exciting area for future exploration.

In the high-density regime of the ERM model, the rank plot (Equation 3) for large eigenvalues (λ > 1) follows a power law λ ∝ r^(−α), with α = 1 − μ/d < 1. The scale-invariant spectrum occurs when α is close to 1. Experimental data, however, align more closely with the model in the intermediate-density regime, where the power-law spectrum is an approximation and the decay is slower (for the ERM model, Figure 4—figure supplement 1B, C; for the data, α = 0.47 ± 0.08, mean ± SD, n = 6 fish). Stringer et al., 2019a found an α ≈ 1 decay in the stimulus trial-averaged covariance spectrum of the mouse visual cortex, and they argued that this decay optimizes the efficiency and smoothness of the visual code. Our study differs in two fundamental ways. First, we recorded brain-wide activity during spontaneous or hunting behavior, calculating neural covariance from single-trial activity. Much of the neural activity was not driven by sensory stimuli and was unrelated to specific tasks, requiring a different interpretation of the neural covariance spectrum. Second, without loss of generality, we normalized the mean variance of neural activity E(σ²) by scaling the covariance matrix so that its eigenvalues sum to N. This normalization imposes a constraint on the spectrum. In particular, large and small eigenvalues may behave differently and need not obey a single power law λ ∝ r^(−α) across all N eigenvalues (Pospisil and Pillow, 2024) (Methods). Stringer et al., 2019a did not take this possibility into account, making their theory less applicable to our analysis.

We draw inspiration from the renormalization group (RG) approach to navigate neural covariance across scales, which has also been explored in the recent literature. Following Kadanoff’s block spin transformation (Kardar, 2007), Meshulam et al., 2019 formed size-dependent neuron clusters and their covariance matrices by iteratively pairing the most correlated neurons and placing them adjacent on a lattice. The groups expanded until the largest reached the system size. The RG process, akin to spatial sampling in functional space (FSap), maintains constant neuron density ρ. Thus, for any kernel function f(x), including the power law and exponential, the covariance eigenspectrum remains invariant across scales (Appendix 1—figure 5A, B, D, E).
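
One coarse-graining step of this kind can be sketched as follows (a simplified greedy reimplementation for illustration; the original procedure iteratively merges the globally most correlated pairs):

```julia
using Statistics, LinearAlgebra

# One coarse-graining step: pair each neuron with its most correlated, not-yet-paired
# partner and sum the two activity traces (rows of the neurons × time matrix A).
function rg_step(A::AbstractMatrix)
    R = cor(A')                                    # neuron-by-neuron correlation matrix
    N = size(A, 1)
    paired = falses(N)
    merged = Vector{Vector{eltype(A)}}()
    order = sortperm(vec(maximum(R - I, dims = 2)), rev = true)
    for i in order
        paired[i] && continue
        scores = [(paired[j] || j == i) ? -Inf : R[i, j] for j in 1:N]
        j = argmax(scores)
        (j == i || paired[j]) && continue          # no unpaired partner left (odd leftover)
        paired[i] = paired[j] = true
        push!(merged, A[i, :] + A[j, :])           # coarse-grained variable
    end
    return reduce(hcat, merged)'                   # (≈ N/2) × T matrix of merged traces
end
```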

Morrell et al., 2021 and Morrell et al., 2024 proposed a simple model in which a few time-varying latent factors impact the whole neural population. We evaluated whether this model could account for the scale invariance seen in our data. Simulations showed that the resulting eigenspectra differed considerably from our findings (Appendix 1—figure 6). Although the Morrell model demonstrated a degree of scale invariance under functional sampling (or RG), it did not align with the scale-invariant features under random sampling, suggesting that this simple model might not capture all crucial features in our observations.

We emphasize that a power-law covariance spectrum is distinct from the scale invariance we define in this study, namely the collapse of spectrum curves under random neuron sampling. The random RNN model in Figure 2I shows power-law behavior but lacks true scale invariance, as spectrum curves for different sizes do not collapse. When the connection strength g approaches 1, the system exhibits a power-law spectrum λ ∝ (r/N)^(−3/2). Subsampling shifts the spectrum to λ ∝ k^(−1/2) (r/N)^(−3/2), where k = Ns/N is the sampling fraction (derived from Equation 24 in Hu and Sompolinsky, 2022).

Bounded dimensionality under random sampling

The saturation of the dimensionality DPR at large sample sizes indicates a limit to neural assembly complexity, evidenced by the finite mean square covariance. This is in contrast with neural dynamics models such as the balanced excitatory–inhibitory (E–I) neural network (Renart et al., 2010), where Ei≠j(Cij²) ∝ 1/N, resulting in an unbounded dimensionality (see Appendix 2). Our results suggest that the brain encodes experiences, sensations, and thoughts using a finite set of dimensions instead of an infinitely complex neural activity space.

We found that the relationship between dimensionality and the number of recorded neurons depends on the sampling method. For functional sampling, the dimensionality scales with the sampling fraction k: DPR^FSap ≈ k^(2μ/d) DPR. This suggests that if anatomically sampled neurons are functionally clustered, as with cortical neurons forming functional maps, the increase in dimensionality with neuron number may seem unbounded. This offers new insights for interpreting large-scale neural activity data recorded with various techniques.

Manley et al., 2024 found that, unlike in our study, neural activity dimensionality in head-fixed, spontaneously behaving mice did not saturate. They used shared variance component analysis (SVCA) and noted that PCA-based estimates often show dimensionality saturation, which is consistent with our findings. We intentionally chose PCA in our study for several reasons. First, PCA is a trusted and widely used method in neuroscience, proven to uncover meaningful patterns in neural data. Second, its mathematical properties are well understood, making it particularly suitable for our theoretical analysis. Although newer methods such as SVCA might offer valuable insights, we believe PCA remains the most appropriate method for our research questions.

It is important to note that the scale invariance of dimensionality and covariance spectrum are distinct phenomena with different underlying requirements. Dimensionality invariance relies on finite mean square covariance, causing saturation at large sample sizes. In contrast, spectral invariance requires a slow-decaying correlation kernel (small μ) and/or a high-dimensional functional space (large d). Although both features appear in our data, they result from distinct mechanisms. A neural system could show saturating dimensionality without spectral invariance if it has finite mean square covariance but rapidly decaying correlations with functional distance. Understanding these requirements clarifies how neural organization affects different scale-invariant properties.

Computational benefits of a scale-invariant covariance spectrum

Our findings are validated across multiple datasets obtained through various recording techniques and animal models, ranging from single-neuron calcium imaging in larval zebrafish to single-neuron multi-electrode recordings in the mouse brain (see Figure 2—figure supplement 2). The conclusion remains robust when the multi-electrode recording data are reanalyzed under different sampling rates (6–24 Hz, Figure 2—figure supplement 4). We also confirm that substituting a few negative covariances with zero retains the spectrum of the data covariance matrix (Figure 2—figure supplement 3 and Methods).

The scale invariance of neural activity across different neuron assembly sizes could support efficient multiscale information encoding and processing. This indicates that the neural code is robust and requires minimal adjustments despite changes in population size. One recent study shows that randomly sampled and coarse-grained macrovoxels can predict population neural activity (Hoffmann et al., 2023), reinforcing that a random neuron subset may capture overall activity patterns. This enables downstream circuits to read out and process information through random projections (Gao et al., 2017). A recent study demonstrates that a scale-invariant noise covariance spectrum with a specific slope α < 1 enables neurons to convey unlimited stimulus information as the population size increases (Moosavi et al., 2024). The linear Fisher information, in this context, grows at least as N^(1−α).

Understanding how dimensionality and spectrum change with sample size also suggests the possibility of extrapolating from small samples to overcome experimental limitations. This is particularly feasible when μ/d → 0, where the dimensionality and spectrum under anatomical, random, and functional sampling coincide (Equations 3 and 4). Developing extrapolation methods and exploring the benefits of a scale-invariant neural code are promising future research directions.

Materials and methods

Key resources table
Reagent type (species) or resource: Designation; Source or reference; Identifiers; Additional information
Strain, strain background (Danio rerio): Tg(elavl3:H2B-GCaMP6f); https://doi.org/10.7554/eLife.12741; Jiu-Lin Du, Institute of Neuroscience, Chinese Academy of Sciences, Shanghai
Software, algorithm: Julia 1.7; https://julialang.org/
Software, algorithm: MATLAB; https://ww2.mathworks.cn/
Software, algorithm: Mathematica; https://www.wolfram.com/mathematica/

Experimental methods

The handling and care of the zebrafish complied with the guidelines and regulations of the Animal Resources Center of the University of Science and Technology of China (USTC). All larval zebrafish (huc:h2b-GCaMP6f; Cong et al., 2017) were raised in E2 embryo medium (comprising 7.5 mM NaCl, 0.25 mM KCl, 0.5 mM MgSO4, 0.075 mM KH2PO4, 0.025 mM Na2HPO4, 0.5 mM CaCl2, and 0.35 mM NaHCO3; containing 0.5 mg/l methylene blue) at 28.5°C with a 14-hr light and 10-hr dark cycle.

To induce hunting behavior (composed of motor sequences like eye convergence and J turn) in larval zebrafish, we fed them a large amount of paramecia over a period of 4–5 days post-fertilization (dpf). The animals were then subjected to a 24-hr starvation period, after which they were transferred to a specialized experimental chamber. The experimental chamber was 20 mm in diameter and 1 mm in depth, and the head of each zebrafish was immobilized by applying 2% low melting point agarose. The careful removal of the agarose from the eyes and tail of the fish ensured that these body regions remained free to move during hunting behavior. Thus, characteristic behavioral features such as J-turns and eye convergence could be observed and analyzed. Subsequently, the zebrafish were transferred to an incubator and stayed overnight. At 7 dpf, several paramecia were introduced in front of the previously immobilized animals, each of which was monitored by a stereomicroscope. Those displaying binocular convergence were selected for subsequent Ca2+ imaging experiments.

We developed a novel optomagnetic system that allows (1) precise control of the trajectory of the paramecium and (2) imaging brain-wide Ca2+ activity during the hunting behavior of zebrafish. To control the movement of the paramecium, we treated these microorganisms with a suspension of ferric tetroxide for 30 min and selected those that responded to its magnetic attraction. A magnetic paramecium was then placed in front of a selected larva, and its movement was controlled by changing the magnetic field generated by Helmholtz coils that were integrated into the imaging system. The real-time position of the paramecium, captured by an infrared camera, was identified by online image processing. The positional vector relative to a predetermined target position was calculated. The magnitude and direction of the current in the Helmholtz coils were adjusted accordingly, allowing for precise control of the magnetic field and hence the movement of the paramecium. Multiple target positions could be set to drive the paramecium back and forth between multiple locations.

The experimental setup consisted of head-fixed larval zebrafish undergoing two different types of behavior: induced hunting behavior by a moving paramecium in front of a fish (fish 1–5), and spontaneous behavior without any visual stimulus as a control (fish 6). Experiments were carried out at ambient temperature (ranging from 23 to 25°C). The behavior of the zebrafish was monitored by a high-speed infrared camera (Basler acA2000-165umNIR, 0.66×) behind a 4F optical system and recorded at 50 Hz. Brain-wide Ca2+ imaging was achieved using XLFM. Light-field images were acquired at 10 Hz, using customized LabVIEW software (National Instruments, USA) or Solis software (Oxford Instruments, UK), with the help of a high-speed data acquisition card (PCIe-6321, National Instruments, USA) to synchronize the fluorescence with behavioral imaging.

Behavior analysis

The background of each behavior video was removed using the clone stamp tool in Adobe Photoshop CS6. Individual images were then processed by an adaptive thresholding algorithm, and fish head and yolk were selected manually to determine the head orientation. The entire body centerline, extending from head to tail, was divided into 20 segments. The amplitude of a bending segment was defined as the angle between the segment and the head orientation. To identify the paramecium in a noisy environment, we subtracted a background image, averaged over a time window of 100 s, from all the frames. The major axis of the left or right eye was identified using DeepLabCut (Mathis et al., 2018). The eye orientation was defined as the angle between the rostrocaudal axis and the major axis of an eye. The convergence angle was defined as the angle between the major axes of the left and right eyes. An eye-convergence event was defined as a period of time where the angle between the long axis of the eyes stayed above 50° (Bianco et al., 2011).

Imaging data acquisition and processing

We used a fast eXtended light-field microscope (XLFM, with a volume rate of 10 Hz) to record Ca2+ activity throughout the brain of head-fixed larval zebrafish. Fish were ordered by the dates of experiments. As previously described (Cong et al., 2017), we adopted the Richardson–Lucy deconvolution method to iteratively reconstruct 3D fluorescence stacks (600 × 600 × 250) from the acquired 2D images (2048 × 2048). This algorithm requires an experimentally measured point spread function of the XLFM system. The entire recording for each fish is 15.3 ± 4.3 min (mean ± SD).

To perform image registration and segmentation, we first cropped and resized the original image stack to 400 × 308 × 210, which corresponded to the size of a standard zebrafish brain (zbb) atlas (Tabor et al., 2019). This step aimed to reduce substantial memory requirements and computational costs in subsequent operations. Next, we picked a typical volume frame and aligned it with the zbb atlas using a basic 3D affine transformation. This transformed frame was used as a template. We aligned each volume with the template using rigid 3D intensity-based registration (Studholme et al., 1997) and non-rigid pairwise registration (Rueckert et al., 1999) in the Computational Morphometry Toolkit (CMTK) (https://www.nitrc.org/projects/cmtk/). After voxel registration, we computed the pairwise correlation between nearby voxel intensities and performed the watershed algorithm on the correlation map to cluster and segment voxels into consistent ROIs across all volumes. We defined the diameter of each ROI using the maximum Feret diameter (the longest distance between any two voxels within a single ROI).

Finally, we adopted the ‘OASIS’ deconvolution method to denoise and infer neural activity from the fluorescence time sequence (Friedrich et al., 2017). The deconvolved ΔF/F of each ROI was used to infer firing rates for subsequent analysis.

Other experimental datasets analyzed

To validate our findings across different recording methods and animal models, we also analyzed three additional datasets (Table 2). We include a brief description below for completeness; further details can be found in the respective references. The first dataset includes whole-brain light-sheet Ca2+ imaging of immobilized larval zebrafish in the presence of visual stimuli as well as in a spontaneous state (Chen et al., 2018). Each volume of the brain was scanned through 2.11 ± 0.21 planes per second, providing a near-simultaneous readout of neuronal Ca2+ signals. We analyzed fish 8 (69,207 neurons × 7890 frames), 9 (79,704 neurons × 7720 frames), and 11 (101,729 neurons × 8528 frames), which are the first three fish with more than 7200 recorded frames. For simplicity, we labeled them l2, l3, and l1 (fl). The second dataset consists of Neuropixels recordings from approximately ten different brain areas in mice during spontaneous behavior (Stringer et al., 2019b). Data from the three mice, Kerbs, Robbins, and Waksman, include the firing rate matrices of 1462 neurons × 39,053 frames, 2296 neurons × 66,409 frames, and 2688 neurons × 74,368 frames, respectively. The last dataset comprises two-photon Ca2+ imaging data (2–3 Hz) obtained from the visual cortex of mice during spontaneous behavior. While this dataset includes numerous animals, we focused on the first three animals that exhibited spontaneous behavior: spont_M150824_MP019_2016-04-05 (11,983 neurons × 21,055 frames), spont_M160825_MP027_2016-12-12 (11,624 neurons × 23,259 frames), and spont_M160907_MP028_2016-09-26 (9392 neurons × 10,301 frames) (Stringer et al., 2019b).

Covariance matrix, eigenspectrum, and sampling procedures

To begin, we multiplied the inferred firing rate of each neuron (see Methods) by a constant such that in the resulting activity trace ai, the mean of ai(t) over the nonzero time frames equaled one (Meshulam et al., 2019). Consistent with the literature (Meshulam et al., 2019), this step aimed to eliminate possible confounding factors in the raw activity traces, such as the heterogeneous expression level of the fluorescence protein within neurons and the nonlinear conversion of the electrical signal to Ca2+ concentration. Note that after this scaling, neurons could still have different activity levels characterized by the variance of ai(t) over time, due to differences in the sparsity of activity (proportion of nonzero frames) and the distribution of nonzero ai(t) values. Without normalization, the covariance matrix becomes nearly diagonal, causing significant underestimation of the covariance structures.

The three models of covariance in Figure 2G–I were constructed as follows. For the model in Figure 2G, the entries of the matrix G (with dimensions N × T) were sampled from an i.i.d. Gaussian distribution with zero mean and standard deviation σ = 1. In Figure 2H, we constructed the composite covariance matrix for fish 1 by keeping the eigenvalues of the fish 1 data covariance matrix and replacing the eigenvectors U with a random orthonormal basis. Lastly, the covariance matrix in Figure 2I was generated from a randomly connected recurrent network of linear rate neurons. The entries of the synaptic weight matrix are normally distributed, Jij ∼ N(0, g²/N), with a coupling strength g = 0.95 (Hu and Sompolinsky, 2022; Morales et al., 2023). For consistency, we used the same number of time frames T = 7200 when comparing CI across all the datasets (Figures 4B, C, 5D, E, Figure 4—figure supplement 2C). For other cases, we analyzed the full length of the data (number of time frames: fish 1 – 7495, fish 2 – 9774, fish 3 – 13,904, fish 4 – 7318, fish 5 – 7200, and fish 6 – 9388). Next, the covariance matrix was calculated as $C_{ij} = \frac{1}{T-1}\sum_{t=1}^{T}\left(a_i(t) - \bar{a}_i\right)\left(a_j(t) - \bar{a}_j\right)$, where $\bar{a}_i$ is the mean of ai(t) over time. Finally, to visualize covariance matrices on a common scale, we multiplied the matrix C by a constant such that the average of its diagonal entries equaled one, that is, Tr(C)/N = 1. This scaling did not alter the shape of the covariance eigenvalue distribution but set its mean to 1 (see also Equation 8).
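
A condensed sketch of the activity scaling and covariance computation described above (illustrative, not the original pipeline; thresholds and upstream deconvolution are assumed to have been applied already):

```julia
using Statistics, LinearAlgebra

# a: inferred activity traces, neurons × time. Scale each trace so that the mean
# over its nonzero frames equals one, then compute and trace-normalize the covariance.
function normalized_covariance(a::AbstractMatrix)
    A = float.(a)
    for i in 1:size(A, 1)
        nz = A[i, :] .!= 0
        any(nz) && (A[i, :] ./= mean(A[i, nz]))    # mean over nonzero frames -> 1
    end
    C = cov(A')                                    # covariance over time, denominator T - 1
    return C ./ (tr(C) / size(C, 1))               # enforce Tr(C)/N = 1
end
```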

To maintain consistency across datasets, we fixed the same initial number of neurons at N0=1,024. These N0 neurons were randomly chosen once for each zebrafish dataset and then used throughout the subsequent analyses. We adopted this setting for all analyses except in two particular instances: (1) for comparisons among the three sampling methods (RSap, ASap, and FSap), we specifically chose 1024 neurons centered along the anterior–posterior axis, mainly from the midbrain to the anterior hindbrain regions (Figure 5DE, Figure 5—figure supplement 6). (2) When investigating the impact of hunting behavior on scale invariance, we included the entire neuronal population (Appendix 1).

We used an iterative procedure to sample the covariance matrix C (calculated from data or as simulated ERMs). For RSap, in the first iteration, we randomly selected half of the neurons. The covariance matrix for these selected neurons was an N/2 × N/2 diagonal block of C. Similarly, the covariance matrix of the unselected neurons was another diagonal block of the same size. In the next iteration, we similarly created two new sampled blocks with half the number of neurons for each of the blocks we had. Repeating this process for n iterations resulted in 2^n blocks, each containing N := N0/2^n neurons. At each iteration, the eigenvalues of each block were calculated and averaged across the blocks after being sorted in descending order. Finally, the averaged eigenvalues were plotted against rank/N on a log–log scale.
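
The iterative halving under RSap can be sketched as follows (illustrative reimplementation; it assumes the initial number of neurons is divisible by 2^n):

```julia
using LinearAlgebra, Statistics, Random

# Iteratively split a covariance matrix into random diagonal blocks and return,
# for each level, the block-averaged eigenvalues (sorted in descending order).
function iterative_rsap_spectra(C::AbstractMatrix, n_iter::Integer; rng = Random.default_rng())
    blocks = [collect(1:size(C, 1))]
    spectra = Vector{Vector{Float64}}()
    for _ in 1:n_iter
        new_blocks = Vector{Vector{Int}}()
        for b in blocks
            perm = shuffle(rng, b)
            half = length(b) ÷ 2
            push!(new_blocks, perm[1:half], perm[half+1:end])
        end
        blocks = new_blocks
        λs = [sort(eigvals(Symmetric(C[b, b])), rev = true) for b in blocks]
        push!(spectra, mean(λs))                   # average over blocks, rank by rank
    end
    return spectra                                 # plot each against rank/N on log–log axes
end
```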

In the case of ASap and FSap, the process of selecting neurons was different, although the remaining procedures followed the RSap protocol. In ASap, the selection of neurons was based on a spatial criterion: neurons close to the anterior end on the anterior–posterior axis were grouped to create a diagonal block of size N/2 × N/2, with the remaining neurons forming a separate block. FSap, on the other hand, used the RG framework (Meshulam et al., 2019) to define the blocks (details in Appendix 1). In each iteration, the cluster of neurons within a block that showed the highest average correlation (Ei≠j(Cij²)) was identified and labeled as the most correlated cluster (refer to Figure 5D, Figure 5—figure supplement 6, and Figure 5—figure supplement 7).

In the ERM model, as part of implementing ASap, we generated anatomical and functional coordinates for neurons with specified CCA properties as described in Methods. Mirroring the approach taken with our data, ASap segmented neurons into groups based on the first dimension of their anatomical coordinates, akin to the anterior–posterior axis. FSap employed the same RG procedures outlined earlier (Appendix 1).

To determine the overall power-law coefficient of the eigenspectra, α, throughout sampling, we fitted a straight line in the log–log rank plot to the large eigenvalues pooled from the original covariance matrix and three iterations of sampled covariance matrices (selecting the top 10% of eigenvalues for each matrix and excluding the first four largest ones for each matrix). We averaged the estimated α over 10 repetitions of the entire sampling procedure. The R² of the power-law fit was computed in a similar way. To visualize the statistical structures of the original and sampled covariance matrices, the order of the neurons (i.e., of the columns and rows) was determined by the following algorithm. We first construct a symmetric Toeplitz matrix T, with entries T_ij = t_(i−j) and t_(i−j) ≡ t_(j−i). The vector t = [t_0, t_1, …, t_(N−1)] is set equal to the mean covariance vector of each neuron, calculated as follows. Let c_i be a row vector of the data covariance matrix; we identify t = (1/N) Σ_(i=1)^N D(c_i), where D(⋅) denotes a numerical ordering operator, namely rearranging the elements in a vector c such that c_0 ≥ c_1 ≥ ⋯ ≥ c_(N−1). The second step is to find a permutation matrix P such that ‖T − PCP^T‖_F is minimized, where ‖⋅‖_F denotes the Frobenius norm. This quadratic assignment problem is solved by simulated annealing. Note that, after sampling, the smaller matrix will appear different from the larger one; we therefore perform the above reordering algorithm for every sampled matrix so that matrices of different sizes become similar in appearance in Figure 2E.

The composite covariance matrix with substituted eigenvectors in Figure 2H was created as described in the following steps. First, we generated a random orthogonal matrix $U_r$ (based on the Haar measure) for the new eigenvectors. This was achieved by a QR decomposition $A=U_rR$ of a random matrix $A$ with i.i.d. entries $A_{ij}\sim\mathcal{N}(0,1/N)$. The composite covariance matrix $C_r$ was then defined as $C_r:=U_r\Lambda U_r^{T}$, where $\Lambda$ is a diagonal matrix that contains the eigenvalues of $C$. Note that since all the eigenvalues are real and $U_r$ is orthogonal, the resulting $C_r$ is a real and symmetric matrix. By construction, $C_r$ and $C$ have the same eigenvalues, but their sampled eigenspectra can differ.
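
A short sketch of this construction, assuming only numpy (the sign correction of the QR factor is included so that $U_r$ follows the Haar measure):

```python
# Sketch: keep the eigenvalues of C but replace the eigenvectors with a
# Haar-random orthogonal basis obtained from the QR decomposition of a
# Gaussian random matrix. Variable names are illustrative.
import numpy as np

def composite_covariance(C, rng=None):
    rng = np.random.default_rng(rng)
    N = C.shape[0]
    lam = np.linalg.eigvalsh(C)                   # eigenvalues of the original C
    A = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))
    Ur, R = np.linalg.qr(A)
    Ur = Ur * np.sign(np.diag(R))                 # sign fix makes Ur Haar-distributed
    return Ur @ np.diag(lam) @ Ur.T               # same spectrum, new eigenvectors
```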

Dimensionality


In this section, we introduce the participation ratio (DPR) as a metric for effective dimensionality of a system, based on Recanatesi et al., 2019; Litwin-Kumar et al., 2017; Gao and Ganguli, 2015; Gao et al., 2017; Clark et al., 2023; Dahmen et al., 2020. DPR is defined as:

(5) $D_{PR}(C)=\frac{\left(\sum_i\lambda_i\right)^2}{\sum_i\lambda_i^2}=\frac{(\mathrm{Tr}(C))^2}{\mathrm{Tr}(C^2)}=\frac{N^2E(\sigma^2)^2}{NE(\sigma^4)+N(N-1)E_{i\neq j}(C_{ij}^2)}$

Here, $\lambda_i$ are the eigenvalues of the covariance matrix $C$, representing variances of neural activities. $\mathrm{Tr}(\cdot)$ denotes the trace of the matrix. The term $E_{i\neq j}(C_{ij}^2)$ denotes the expected value of the squared elements that lie off the main diagonal of $C$. This represents the average squared covariance between the activities of distinct pairs of neurons.

With these definitions, we explore the asymptotic behavior of DPR as the number of neurons N approaches infinity:

$\lim_{N\to\infty}D_{PR}(C)=\frac{E(\sigma^2)^2}{E_{i\neq j}(C_{ij}^2)}$

This limit highlights the relationship between the PR dimension and the average squared covariance among different pairs of neurons. To predict how $D_{PR}$ scales with the number of neurons (Figure 2D), we first estimated these statistical quantities ($E_{i\neq j}(C_{ij}^2)$, $E(\sigma^2)$, and $E(\sigma^4)$) using all available neurons, then applied Equation 5 for different values of $N$. It is worth mentioning that a similar theoretical finding is established by Dahmen et al., 2020. The transition from increasing $D_{PR}$ with $N$ to approaching the saturation point occurs when $N$ is significantly larger than $D_{PR}$.
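
As an illustration, the sketch below computes $D_{PR}$ from a covariance matrix and evaluates the prediction of Equation 5 for several hypothetical population sizes (function names are ours, not the authors').

```python
# Sketch: participation-ratio dimension from a covariance matrix, plus the
# Equation 5 prediction using moments estimated from the full matrix.
import numpy as np

def participation_ratio(C):
    lam = np.linalg.eigvalsh(C)
    return lam.sum() ** 2 / (lam ** 2).sum()

def predicted_pr(C_full, Ns):
    sig2 = np.diag(C_full)
    E_sig2, E_sig4 = sig2.mean(), (sig2 ** 2).mean()
    off = C_full[~np.eye(C_full.shape[0], dtype=bool)]
    E_Cij2 = (off ** 2).mean()                    # mean squared off-diagonal covariance
    Ns = np.asarray(Ns, dtype=float)
    return Ns * E_sig2 ** 2 / (E_sig4 + (Ns - 1) * E_Cij2)
```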

ERM model


We consider the eigenvalue distribution or spectrum of the matrix $C$ in the limit of $N\gg1$ and $L\gg1$. This spectrum can be analytically calculated in both high- and intermediate-density scenarios using the replica method (Mézard et al., 1999). The following sketch shows our approach, and detailed derivations can be found in Appendix 2. To calculate the probability density function of the eigenvalues (or eigendensity), we first compute the resolvent or Stieltjes transform $g(z)=\frac{2}{N}\partial_z\left\langle\ln\det(zI-C)^{-1/2}\right\rangle$, $z\in\mathbb{C}$. Here, $\langle\cdot\rangle$ is the average across the realizations of $C$ (i.e., random $\vec{x}_i$'s and $\sigma_i^2$'s). The relationship between the resolvent and the eigendensity is given by the Sokhotski–Plemelj formula:

(6) $p(\lambda)=\frac{1}{\pi}\lim_{\eta\to0^+}\mathrm{Im}\,g(\lambda+i\eta),$

where $\mathrm{Im}$ denotes the imaginary part.

Here we follow the field-theoretic approach (Mézard et al., 1999), which turns the problem of calculating the resolvent into a calculation of the partition function in statistical physics by using the replica method. In the limit $N\to\infty$, $L^d\to\infty$, with $\rho$ finite, by performing a leading order expansion of the canonical partition function at large $z$ (Appendix 2), we find the resolvent is given by

(7) $g(z)=\frac{1}{\rho}\int\frac{d^d\vec{k}}{(2\pi)^d}\,\frac{1}{z-\rho E(\sigma^2)\tilde{f}(\vec{k})}$

In the high-density regime, the probability density function (pdf) of the covariance eigenvalues can be approximated and expressed from Equations 6 and 7 using the Fourier transform of the kernel function $\tilde{f}(\vec{k})$:

(8) $p(\lambda)=\frac{1}{\rho E(\sigma^2)}\int_{\mathbb{R}^d}\frac{d^d\vec{k}}{(2\pi)^d}\,\delta\!\left(\frac{\lambda}{E(\sigma^2)}-\rho\tilde{f}(\vec{k})\right),$

where $\delta(x)$ is the Dirac delta function and $E(\sigma^2)$ is the expected value of the variances of neural activity. Intuitively, Equation 8 means that $\lambda/\rho$ is distributed with a density proportional to the area of $\tilde{f}(\vec{k})$'s level sets (i.e., isosurfaces).

In Results, we found that the covariance matrix consistently shows greater scale invariance compared to the correlation matrix across all datasets. This suggests that the variability in neuronal activity significantly influences the eigenspectrum. This finding, however, cannot be explained by the high-density theory, which predicts that the eigenspectrum of the covariance matrix is simply a rescaling of the correlation eigenspectrum by $E(\sigma_i^2)$, the expected value of the variances of neural activity. Without loss of generality, we can always standardize the fluctuation level of neural activity by setting $E(\sigma^2)=1$. This is equivalent to multiplying the covariance matrix $C$ by a constant such that $\mathrm{Tr}(C)/N=1$, which in turn scales all the eigenvalues of $C$ by the same factor. Consequently, the heterogeneity of $\sigma_i^2$ has no effect on the scale invariance of the eigenspectrum (see Equation 8). This theoretical prediction is indeed correct and is confirmed by direct numerical simulations quantifying the scale invariance using the CI (Figure 4—figure supplement 2A).

Fortunately, the inconsistency between theory and experimental results can be resolved by considering the ERM in the intermediate-density regime $\rho\epsilon^d\lesssim1$, where neurons are positioned at a moderate distance from each other. As mentioned above, we set $E(\sigma^2)=1$ in our model and vary the diversity of activity fluctuations among neurons, represented by $E(\sigma^4)$. Consistent with the experimental observations, we find that the CI decreases with $E(\sigma^4)$ (see Figure 4—figure supplement 2B). This agreement indicates that the neural data are better explained by the ERM in the intermediate-density regime.

To gain a deeper understanding of this behavior, we use the Gaussian variational method (Mézard et al., 1999) to calculate the eigenspectrum. Unlike in the high-density theory, where the eigendensity has an explicit expression, in the intermediate-density regime the resolvent $g(z)$ no longer has an explicit expression and is instead given by the following equation:

(9) $g(z)=\left\langle\frac{1}{z-\sigma^2\int D\vec{k}\,\tilde{G}(\vec{k},z)}\right\rangle_\sigma,$

where $\langle\cdot\rangle_\sigma$ computes the expectation value of the term within the bracket with respect to $\sigma$, namely $\langle\cdot\rangle_\sigma\equiv\int\cdot\,p(\sigma)\,d\sigma$. Here and in the following, we denote $\int D\vec{k}\equiv\int\frac{d^d\vec{k}}{(2\pi)^d}$. The function $\tilde{G}(\vec{k},z)$ is determined by a self-consistent equation,

(10) $\frac{1}{\tilde{f}(\vec{k})}=\frac{1}{\tilde{G}(\vec{k},z)}+\rho\left\langle\frac{\sigma^2}{z-\sigma^2\int D\vec{k}\,\tilde{G}(\vec{k},z)}\right\rangle_\sigma$

We can solve for $\int D\vec{k}\,\tilde{G}(\vec{k},z)$ from Equation 10 numerically; an outline follows, and the details are explained in Appendix 2. Let us define the integral $G\equiv\int D\vec{k}\,\tilde{G}(\vec{k},z)$. First, we substitute $z\to\lambda+i\eta$ into Equation 10 and write $G=\mathrm{Re}\,G+i\,\mathrm{Im}\,G$. Equation 10 can thus be decomposed into its real and imaginary parts, yielding a set of nonlinear integral equations, each of which involves both $\mathrm{Re}\,G$ and $\mathrm{Im}\,G$. We solve these equations in the limit $\eta\to0$ using a fixed-point iteration that alternates between updating $\mathrm{Re}\,G$ and $\mathrm{Im}\,G$ until convergence.

We find that the variational approximations exhibit excellent agreement with the numerical simulation for both large and intermediate ρ where the high-density theory starts to deviate significantly (for ρ=256 and ρ=10.24, ϵ=0.03125, Figure 4—figure supplement 1). Note that the departure of the leading eigenvalues in these plots is expected, since the power-law kernel function we use is not integrable (see Methods).

To elucidate the connection between the two different methods, we estimate the condition when the result of the high-density theory (Equation 8) matches that of the variational method (Equations 9 and 10; Appendix 2). The transition between these two density regimes can also be understood (see Equation 22 and Appendix 2).

Importantly, the scale invariance of the spectrum as $\mu/d\to0$, previously derived using the high-density result (Equation 3), can be extended to the intermediate-density regime by proving the $\rho$-independence using the variational method (Appendix 2).

Finally, using the variational method and the integration limit estimated by simulation (see Methods), we show that the heterogeneity of the variance of neural activity, quantified by E(σ4), indeed improves the collapse of the eigenspectra for intermediate ρ (Appendix 2). Our theoretical results agree excellently with the ERM simulation (Figure 4—figure supplement 2A, B).

Kernel function


Throughout the paper, we have mainly considered a particular approximate power-law kernel function inspired by the Student’s t distribution

(11) $f(x)=\epsilon^{\mu}\left(\epsilon^2+x^2\right)^{-\mu/2}.$

To understand how to choose ϵ and μ, see Methods. Variations of Equation 11 near x=0 have also been explored; see a summary in Table 3.

Table 3
Modifications of the shape of $f(x)$ near $x=0$ used in Appendix 1—figures 1–3.

Flat: when $x<\epsilon$, $f(x)=1$. Tangent: when $x<c\epsilon$, $f(x)$ follows a tangent line of the exact power law ($bx+1$ and $\epsilon^{\mu}x^{-\mu}$ have the same first-order derivative at $x=c\epsilon$). $b$ and $c$ are constants. Tent: when $x<c\epsilon$, $f(x)$ follows a straight line whose slope is not the same as in the tangent case. Parabola: when $x<c\epsilon$, $f(x)$ follows a quadratic function ($ax^2+1$ and $\epsilon^{\mu}x^{-\mu}$ have the same first-order derivative). t pdf: mimics the smoothing treatment of the t distribution. All the constant parameters are set such that $f(0)=1$.

Flat: $f(x)=\begin{cases}1, & x<\epsilon\\ \epsilon^{\mu}x^{-\mu}, & x\ge\epsilon\end{cases}$
Tangent: $f(x)=\begin{cases}bx+1, & x<c\epsilon,\ f'(c\epsilon)=b\\ \epsilon^{\mu}x^{-\mu}, & x\ge c\epsilon\end{cases}$
Tent: $f(x)=\begin{cases}bx+1, & x<c\epsilon,\ f'(c\epsilon)\ne b\\ \epsilon^{\mu}x^{-\mu}, & x\ge c\epsilon\end{cases}$
Parabola: $f(x)=\begin{cases}bx^2+1, & x<c\epsilon,\ f'(c\epsilon)=2bc\epsilon\\ \epsilon^{\mu}x^{-\mu}, & x\ge c\epsilon\end{cases}$
t pdf: $f(x)=\epsilon^{\mu}\left(\epsilon^2+x^2\right)^{-\mu/2}$

It is worth mentioning that a power law is not the only slow-decaying function that can produce a scale-invariant covariance spectrum (Figure 3—figure supplement 2). We choose it for its analytical tractability in calculating the eigenspectrum. Importantly, we find numerically that the two contributing factors to scale invariance – namely, slow spatial decay and a higher-dimensional functional space – generalize to other non-power-law functions. An example is the stretched exponential function $f(x)=e^{-x^{\eta}}$ with $0<\eta<1$. When $\eta$ is small and $d$ is large, the covariance eigenspectra also display a similar collapse upon random sampling (Figure 3—figure supplement 2).

This approximate power-law f(x) has the advantage of having an analytical expression for its Fourier transform, which is crucial for the high-density theory (Equation 8),

(12) $\tilde{f}(\vec{k})=\frac{2^{\frac{d-\mu+2}{2}}\,\pi^{\frac{d}{2}}\,k^{\frac{\mu-d}{2}}\,\epsilon^{\frac{\mu+d}{2}}\,K_{(d-\mu)/2}(k\epsilon)}{\Gamma(\mu/2)},\quad k=\|\vec{k}\|$

Here, $K_\alpha(x)$ is the modified Bessel function of the second kind, and $\Gamma(x)$ is the Gamma function. We calculated the above formula analytically for $d=1,2,3$ with the assistance of Mathematica and conjectured the case for general dimension $d$, which we confirmed numerically for $d\le10$.

We want to explain two technical points relevant to the interpretation of our numerical results and the choice of $f(x)$. Unlike the case in the usual ERM, here we allow $f(x)$ to be non-integrable (over $\mathbb{R}^d$), which is crucial to allow a power-law $f(x)$. The non-integrability violates a condition in the classical convergence results of the ERM spectrum (Bordenave, 2008) as $N\to\infty$. We believe that this is exactly the reason for the departure of the first few eigenvalues from our theoretical spectrum (e.g., in Figure 3). Our hypothesis is also supported by ERM simulations with integrable $f(x)$ (Figure 3—figure supplement 1), where the numerical eigenspectrum matches closely with our theoretical one, including the leading eigenvalues. For ERM to be a legitimate model for covariance matrices, we need to ensure that the resulting matrix $C$ is positive semidefinite. According to the Bochner theorem (Rudin, 1990), this is equivalent to the Fourier transform (FT) of the kernel function $\tilde{f}(\vec{k})$ being nonnegative for all frequencies. For example, in 1D, a rectangle function $\mathrm{rect}(x)=\begin{cases}1, & \text{if }|x|\le\frac{1}{2}\\ 0, & \text{otherwise}\end{cases}$ does not meet the condition (its FT is $\mathrm{sinc}(x)=\frac{\sin(x)}{x}$), but a tent function $\mathrm{tent}(x)=\begin{cases}1-|x|, & \text{if }|x|\le1\\ 0, & \text{otherwise}\end{cases}$ does (its FT is $\mathrm{sinc}^2(x)$). For the particular kernel function $f(x)$ in Equation 11, this condition can be easily verified using the analytical expression of its Fourier transform (Equation 12). The integral representation $K_\alpha(x)=\int_0^\infty e^{-x\cosh t}\cosh(\alpha t)\,dt$ shows that $K_\alpha(x)$ is positive for all $x>0$. Likewise, the Gamma function $\Gamma(x)>0$ for $x>0$. Therefore, the Fourier transform of Equation 11 is positive and the resulting matrix $C$ (of any size and any values of $\vec{x}_i$) is guaranteed to be positive definite.

Building upon the theory outlined above, numerical simulations further validated the empirical robustness of our ERM model, as showcased in Figures 3B–D and 4A. In Figure 3B–D, the ERM was characterized by the parameters $N=1024$, $d=2$, $L=10$, $\rho=10.24$, and $\mu=0.5$, $\epsilon=0.03125$ for $f(x)$. To numerically compute the eigenvalue probability density function, we generated the ERM 100 times, each sampled using the method described in Methods. The pdf was computed by calculating the pdf of each ERM realization and averaging these across the instances. The curves in Figure 3D show the average over 100 ERM simulations. The shaded area (most of which is smaller than the marker size) represents the SEM. For Figure 4A, the columns from left to right correspond to $\mu=0.5,0.9,1.3$, and the rows from top to bottom correspond to $d=1,2,3$. Other ERM simulation parameters: $N=4096$, $\rho=256$, $L=(N/\rho)^{1/d}$, $\epsilon=0.03125$, and $\sigma_i^2=1$. It should be noted that for Figure 4A, the presented data pertain to a single ERM realization.
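
For reference, a minimal sketch of one ERM realization with the kernel of Equation 11 is given below (parameter values mirror Figure 3B–D; this is an illustration, not the original simulation code).

```python
# Sketch: one ERM realization with the approximate power-law kernel (Eq. 11).
import numpy as np

def simulate_erm(N=1024, d=2, rho=10.24, mu=0.5, eps=0.03125, sigma2=None, rng=None):
    rng = np.random.default_rng(rng)
    L = (N / rho) ** (1.0 / d)                         # box size from the density
    x = rng.uniform(0.0, L, size=(N, d))               # functional coordinates
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    f = eps ** mu * (eps ** 2 + dist ** 2) ** (-mu / 2.0)   # Equation 11, f(0) = 1
    sigma = np.ones(N) if sigma2 is None else np.sqrt(sigma2)
    C = np.outer(sigma, sigma) * f                     # C_ij = sigma_i sigma_j f(|x_i - x_j|)
    return np.sort(np.linalg.eigvalsh(C))[::-1]        # eigenvalues, descending

# Averaging the sorted eigenvalues (or their histogram) over ~100 realizations
# gives the spectra plotted on log-log rank axes.
```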

Collapse index

We quantify the extent of scale invariance using CI defined as the area between two spectrum curves (Figure 4A, upper right), providing an intuitive measure of the shift of the eigenspectrum when varying the number of sampled neurons. We chose the CI over other measures of distance between distributions for several reasons. First, it directly quantifies the shift of the eigenspectrum, providing a clear and interpretable measure of scale invariance. Second, unlike methods that rely on estimating the full distribution, the CI avoids potential inaccuracies in estimating the probability of the top leading eigenvalues. Finally, the use of CI is motivated by theoretical considerations, namely the ERM in the high-density regime, which provides an analytical expression for the covariance spectrum (Equation 3) valid for large eigenvalues.

(13) $\mathrm{CI}:=\frac{1}{\log(q_0/q_1)}\int_{\log q_1}^{\log q_0}\left|\frac{\partial\log\lambda(q)}{\partial\log\rho}\right|d\log q,$

we set q1 such that λ(q1)=1, which is the mean of the eigenvalues of a normalized covariance matrix. The other integration limit q0 is set to 0.01 such that λ(q0) is the 1% largest eigenvalue.

Here, we provide numerical details on calculating CI for the ERM simulations and experimental data.

A calculation of CI for experimental datasets/ERM model


To calculate the CI for a covariance matrix $C_0$ of size $N_0$, we first computed its eigenvalues $\lambda_i^0$ and those of the sampled block $C_s$ of size $N_s=N_0/2$, denoted as $\lambda_i^s$ (averaged over 20 samplings for the ERM simulation and 2000 in experimental data). Next, we estimated $\log\lambda(q)$ using the eigenvalues of $C_0$ and $C_s$ at $q=i/N_s$, $i=1,2,\ldots,N_s$. For the sampled $C_s$, we simply had $\log\lambda(q=i/N_s)=\log\lambda_i^s$, its $i$th largest eigenvalue. For the original $C_0$, $\log\lambda(q=i/N_s)$ was estimated by a linear interpolation, on the $\log\lambda$–$\log q$ scale, using the values of $\log\lambda(q)$ at the nearest neighboring $q=i/N_0$'s (which again are simply $\log\lambda_i^0$). Finally, the integral (Equation 13) was computed using the trapezoidal rule, discretized at the $q=i/N_s$'s, using the finite difference $\frac{\partial\log\lambda(q)}{\partial\log\rho}\approx\frac{1}{\log(N_0/N_s)}|\Delta\log\lambda(q)|$, where $\Delta$ denotes the difference between the original eigenvalues of $C_0$ and those of the sampled $C_s$.
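
A compact sketch of this CI estimate for a single random half-sized block is given below (the averaging over repeated samplings is omitted; thresholds and names are illustrative).

```python
# Sketch: collapse index (CI) from one covariance matrix and one random half block.
import numpy as np

def collapse_index(C0, q0=0.01, rng=None):
    rng = np.random.default_rng(rng)
    N0 = C0.shape[0]
    Ns = N0 // 2
    idx = rng.choice(N0, size=Ns, replace=False)
    lam0 = np.sort(np.linalg.eigvalsh(C0))[::-1]                   # original spectrum
    lams = np.sort(np.linalg.eigvalsh(C0[np.ix_(idx, idx)]))[::-1]  # sampled block
    q_s = np.arange(1, Ns + 1) / Ns
    q_full = np.arange(1, N0 + 1) / N0
    # interpolate log lambda0(q) at the sampled q's on the log-log scale
    log_lam0_at_qs = np.interp(np.log(q_s), np.log(q_full), np.log(lam0))
    dlog = np.abs(np.log(lams) - log_lam0_at_qs) / np.log(N0 / Ns)
    # integrate over log q between q0 and q1 (where the sampled eigenvalue ~ 1)
    j1 = min(np.searchsorted(-lams, -1.0), Ns - 1)
    q1 = q_s[j1]
    mask = (q_s >= q0) & (q_s <= q1)
    return np.trapz(dlog[mask], np.log(q_s[mask])) / np.log(q1 / q0)
```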

Estimating CI using the variational method


In the definition of the CI (Equation 13), calculating $\lambda(q)$ and $\frac{\partial\log\lambda(q)}{\partial\log\rho}$ directly using the variational method is difficult, but we can make use of an implicit differentiation

(14) $\frac{\partial\log\lambda(q,\rho)}{\partial\log\rho}=\frac{\rho}{\lambda}\frac{\partial\lambda(q,\rho)}{\partial\rho}=-\frac{\rho}{\lambda}\,\frac{\partial q(\rho,\lambda)/\partial\rho}{\partial q(\rho,\lambda)/\partial\lambda},$

where $q(\lambda):=\int_\lambda^{\infty}p(\lambda')\,d\lambda'$ is the complementary cdf (the inverse function of $\lambda(q)$ in Methods). Using this, the integral in the CI (Equation 13) can be rewritten as

(15) $\int_{\log q_1}^{\log q_0}\left|\frac{\partial\log\lambda(q,\rho)}{\partial\log\rho}\right|d\log q=\int_{q_1}^{q_0}\left|\frac{\rho}{q\lambda}\,\frac{\partial q/\partial\rho}{\partial q/\partial\lambda}\right|dq=\int_{\lambda(q_1)}^{\lambda(q_0)}\left|\frac{\rho}{q\lambda}\,\frac{\partial q/\partial\rho}{\partial q/\partial\lambda}\right|\frac{\partial q}{\partial\lambda}\,d\lambda=\int_{\lambda(q_0)}^{\lambda(q_1)}\left|\frac{1}{\lambda}\frac{\partial\log q}{\partial\log\rho}\right|d\lambda.$

Since $\frac{\partial q}{\partial\lambda}=-p(\lambda)<0$, we switch the order of the integration interval in the final expression of Equation 15.

First, we explain how to compute the complementary cdf q(λ) numerically using the variational method. The key is to integrate the probability density function p(λ) from λ to a finite λ(qs) rather than to infinity,

(16) $q(\lambda)=\int_\lambda^{\infty}p(\lambda')\,d\lambda'=\int_{\lambda(q_s)}^{\infty}p(\lambda')\,d\lambda'+\int_{\lambda}^{\lambda(q_s)}p(\lambda')\,d\lambda'=q_s+\int_{\lambda}^{\lambda(q_s)}p(\lambda')\,d\lambda'.$

The integration limit $\lambda(q_s)$ cannot be calculated directly using the variational method. We thus used the value of $\lambda_s(q_s\approx q_0)$ (Methods) from simulations of the ERM with a large $N=1024$ as an approximation. Furthermore, we employed a smoothing technique to reduce bias in the estimation of $\lambda_s(q_s)$ due to the leading zigzag eigenvalues (i.e., the largest eigenvalues) of the eigenspectrum. Specifically, we determined the nearest rank $j<Nq_0$ and then smoothed the eigenvalue $\log\lambda_s(q_s)$ on the log–log scale using the formula $\log\lambda_s(q_s)=\frac{1}{3}\sum_{i=0}^{2}\log\lambda_s\!\left(\frac{j+i}{N}\right)$ and $\log q_s=\frac{1}{3}\sum_{i=0}^{2}\log\frac{j+i}{N}$, averaging over 100 ERM simulations.

Note that we can alternatively use the high-density theory (Appendix 2) to compute the integration limit $\lambda(q_s=1/N)$ instead of resorting to simulations. However, since the true value deviates from the $\lambda_h(q_s=1/N)$ derived from the high-density theory, this approach introduces a constant bias (Figure 4—figure supplement 2) when computing the integral in Equation 16. Therefore we used the simulation value $\lambda_s(q_s\approx q_0)$ when producing Figure 4—figure supplement 2A, B.

Next, we describe how each term within the integral of Equation 15 was numerically estimated. First, we calculated $\frac{\partial\log q}{\partial\log\rho}$ with a method similar to that described in Methods. Briefly, we calculated $q_0(\lambda)$ for density $\rho_0=\frac{N_0}{L^d}$ and $q_s(\lambda)$ for density $\rho_s=\frac{N_s}{L^d}$, and then used the finite difference $\frac{1}{\log(\rho_0/\rho_s)}|\Delta\log q(\lambda)|$. Second, $\frac{\partial\log q(\lambda)}{\partial\log\rho}$ was evaluated at $\lambda=\lambda(q_1)+i\,\frac{\lambda(q_0)-\lambda(q_1)}{k-1}$, where $i=0,1,2,\ldots,k-1$, and we used $k=20$. Finally, we performed a cubic spline interpolation of the term $\frac{\partial\log q}{\partial\log\rho}$ and obtained the theoretical CI by an integration of Equation 15. Figure 4—figure supplement 2A, B shows a comparison between the theoretical CI and that obtained by numerical simulations of the ERM (Methods).

Fitting ERM to data

Estimating the ERM parameters


Our ERM model has four parameters: $\mu$ and $\epsilon$ dictate the kernel function $f(x)$, whereas the box size $L$ and the embedding dimension $d$ determine the neuronal density $\rho$. In the following, we describe an approximate method to estimate these parameters from pairwise correlations measured experimentally, $R_{ij}=\frac{C_{ij}}{\sigma_i\sigma_j}$. We proceed by deriving a relationship between the correlation probability density distribution $h(R)$ and the pairwise distance probability density distribution $g(u):=g(\|\vec{x}_1-\vec{x}_2\|)$ in the functional space, from which the parameters of the ERM can be estimated.

Consider a distribution of neurons in the functional space with a coordinate distribution p(x). The pairwise distance density function g(u) is related to the spatial point density by the following formula:

(17) $g(u)=\int_{[0,L]^d}p(\vec{x}_1)\,p(\vec{x}_2)\,\delta\!\left(\|\vec{x}_1-\vec{x}_2\|-u\right)d\vec{x}_1d\vec{x}_2$

For ease of notation, we subsequently omit the region of integration, which is the same as here. In the case of a uniform distribution, p(x1)=p(x2)=1/V=1/Ld. For other spatial distributions, Equation 17 cannot be explicitly evaluated. We therefore make a similar approximation by focusing on a small pairwise distance (i.e., large correlation):

(18) $p(\vec{x}_1)\,p(\vec{x}_2)\approx p^2\!\left(\frac{\vec{x}_1+\vec{x}_2}{2}\right)$

By a change of variables:

$\vec{X}=\frac{\vec{x}_1+\vec{x}_2}{2},\qquad \vec{u}=\vec{x}_1-\vec{x}_2,$

Equation 17 can be rewritten as

(19) $g(u)\approx\int p^2(\vec{X})\,\delta(\|\vec{u}\|-u)\,d\vec{X}\,d\vec{u}=S_{d-1}(u)\int p^2(\vec{X})\,d\vec{X}$

where $S_{d-1}(u)$ is the surface area of the $(d-1)$-sphere with radius $u$. Note that the approximation of $g(u)$ is not normalized to 1, as Equation 19 provides an approximation valid only for small pairwise distances (i.e., large correlations). Therefore, we believe this does not pose an issue.

With the approximate power-law kernel function $R=f(u)\approx\left(\frac{\epsilon}{u}\right)^{\mu}$, the probability density function of pairwise correlation $h(R)$ is given by:

(20) $h(R)=g(u)\left|\frac{du}{dR}\right|=\frac{2\pi^{d/2}\epsilon^d}{\Gamma\!\left(\frac{d}{2}\right)\mu}\,R^{-(\mu+d)/\mu}\int p^2(\vec{X})\,d\vec{X}$

Taking the logarithm on both sides

(21) $\log h(R)=\log\!\left(\epsilon^d\int p^2(\vec{X})\,d\vec{X}\right)+\log\frac{2\pi^{d/2}}{\Gamma\!\left(\frac{d}{2}\right)\mu}-\frac{\mu+d}{\mu}\log R$

Equation 21 is the key formula for ERM parameter estimation. In the case of a uniform spatial distribution, $\epsilon^d\int p^2(\vec{X})\,d\vec{X}=\epsilon^d/V=(\epsilon/L)^d$. For a given dimension $d$, we can therefore estimate $\mu$ and $(\epsilon/L)^d$ separately by fitting $h(R)$ on the log–log scale using linear least squares. Lastly, we fit the distribution of $\sigma^2$ (the diagonal entries of the covariance matrix $C$) to a log-normal distribution by maximum likelihood.
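
The following sketch illustrates this least-squares fit on a histogram of pairwise correlations, assuming a uniform spatial distribution; the binning range and other choices here are illustrative rather than those used in the paper.

```python
# Sketch: estimate mu and (eps/L)^d from the pairwise-correlation histogram
# via the log-log linear fit of Equation 21.
import math
import numpy as np

def fit_erm_parameters(R, d=2, r_min=0.05, r_max=0.5, nbins=50):
    """R: 1D array of off-diagonal pairwise correlations."""
    R = np.abs(R)
    R = R[(R > r_min) & (R < r_max)]
    hist, edges = np.histogram(R, bins=np.geomspace(r_min, r_max, nbins), density=True)
    centers = np.sqrt(edges[:-1] * edges[1:])              # geometric bin centers
    keep = hist > 0
    slope, intercept = np.polyfit(np.log(centers[keep]), np.log(hist[keep]), 1)
    mu = -d / (slope + 1.0)                                 # from slope = -(mu + d)/mu
    log_prefactor = math.log(2 * math.pi ** (d / 2) / (math.gamma(d / 2) * mu))
    eps_over_L_d = math.exp(intercept - log_prefactor)      # (eps/L)^d for uniform p(x)
    return mu, eps_over_L_d
```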

There is a redundancy between the unit of the functional space (using a rescaled $\epsilon_\delta\equiv\epsilon/\delta$) and the unit of $f(x)$ (using a rescaled $f_\delta(x)\equiv f(x/\delta)$); thus $\epsilon$ and $L$ are a pair of redundant parameters: once $\epsilon$ is given, $L$ is also determined. We set $\epsilon=0.03125$ throughout the article. In summary, for a given dimension $d$ and $\epsilon$, the exponent $\mu$ of $f(x)$ (Equation 11), the distribution of $\sigma^2$, and $\rho$ (or equivalently $L$) can be fitted by comparing the distribution of pairwise correlations in experimental data and ERM. Furthermore, knowing $(\epsilon/L)^d$ enables us to determine a fundamental dimensionless parameter

(22) $\rho\epsilon^d:=N(\epsilon/L)^d,$

which tells us whether the experimental data are better described by the high-density theory or the Gaussian variational method (Appendix 2). Indeed, the fitted $\rho\epsilon^d\approx10^{-3}$–$10^{0}$ is much smaller than 1, consistent with our earlier conclusion that neural data are better described by an ERM model in the intermediate-density regime.

Notably, we found that a smaller embedding dimension $d\lesssim5$ gave a better fit to the overall pairwise correlation distribution. The following is an empirical explanation. As $d$ grows, to best fit the slope of $\log h(R)$–$\log R$, $\mu$ will also grow. However, for very high dimensions $d$, the y-intercept would become very negative, or equivalently, the fitted correlation would become extremely small. This can be verified by examining the leading-order $\log R$-independent term in Equation 21, which can be approximated as $d\log\frac{\epsilon}{L}+\frac{d}{2}\left(\log\pi+1-\log\frac{d}{2}\right)$. It becomes very negative for large $d$ since $\epsilon\ll L$ by construction. Throughout this article, we use $d=2$ when fitting the experimental data with our ERM model.

The above calculation can be extended to cases where the coordinate distribution $p(\vec{x})$ depends on other parameters. To estimate the parameters of coordinate distributions that generate ERMs with a similar pairwise correlation distribution (Appendix 1—figure 1), we fixed the integral value $\int p^2(\vec{x})\,d\vec{x}$. Consider, for example, a transformation of the uniform coordinate distribution to the normal distribution $\mathcal{N}(\mu_p=0,\sigma_p^2I)$ in $\mathbb{R}^2$. We imposed $\int p^2(\vec{x})\,d\vec{x}=1/(4\pi\sigma_p^2)=1/L^2$. For the log-normal distribution, a similar calculation led to $L\exp(\sigma_p^2/4-\mu_p)=2\sqrt{\pi}\sigma_p$. The numerical values for these parameters are shown in Appendix 1. However, note that due to the approximation we used (Equation 18), our estimate of the ERM parameters becomes less accurate if the density function $p(\vec{x})$ changes rapidly over a short distance in the functional space. More sophisticated methods, such as grid search, may be needed to tackle such a scenario.

After determining the parameters of the ERM, we first examine the spectrum of the ERM with uniformly distributed random functional coordinates xi[0,L]d (Figure 5—figure supplement 1M–R). Second, we use f(x) to translate experimental pairwise correlations into pairwise distances for all neurons in the functional space (Figure 5—figure supplement 2, Figure 5—figure supplement 1G–L). The embedding coordinates xi in the functional space can then be solved through multidimensional scaling (MDS) by minimizing the Sammon error (Methods). The similarity between the spectra of the uniformly distributed coordinates (Figure 5—figure supplement 1M–R) and those of the embedding coordinates (Figure 5—figure supplement 1G–L) is also consistent with the notion that specific coordinate distributions in the functional space have little impact on the shape of the eigenspectrum (Appendix 1—figure 1).

Nonnegativity of data covariance


To use the ERM to model the covariance matrix, the pairwise correlation is given by a non-negative kernel function $f(x)$ that monotonically decreases with the distance between neurons in the functional space. This nonnegativity raises a potential issue when applied to experimental data, where, in fact, a small fraction of pairwise correlations/covariances are negative. We have verified that the spectrum of the data covariance matrix remains virtually unchanged when replacing these negative covariances with zero (Figure 2—figure supplement 3). This confirms that the ERM remains a good model when the neural dynamics is in a regime where pairwise covariances are mostly positive (Dahmen et al., 2019; see also Figure 2—figure supplement 2B–D).
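
This check can be reproduced with a few lines (sketch):

```python
# Sketch: compare the eigenspectrum of C with that of a copy in which all
# negative covariances are set to zero.
import numpy as np

def compare_spectra_nonneg(C):
    C_nonneg = np.where(C < 0, 0.0, C)             # zero out negative entries
    lam = np.sort(np.linalg.eigvalsh(C))[::-1]
    lam_nn = np.sort(np.linalg.eigvalsh(C_nonneg))[::-1]
    return lam, lam_nn                             # overlay on a log-log rank plot
```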

Multidimensional scaling


With the estimated ERM parameters ($\mu$ in $f(x)$ and the box size $L$ for given $\epsilon$ and $d$, see Methods), we performed MDS to infer neuronal coordinates $\vec{x}_i$ in functional space. First, we computed the pairwise correlation $R_{ij}=\frac{C_{ij}}{\sigma_i\sigma_j}$ from the data covariances. Next, we calculated the pairwise distance, denoted by $u_{ij}$, by computing the inverse function of $f(x)$ with respect to the absolute value of $R_{ij}$, $u_{ij}=f^{-1}(|R_{ij}|)$. We used the absolute value $|R_{ij}|$ instead of $R_{ij}$ because a small percentage of $R_{ij}$ are negative (Figure 2—figure supplement 2A–D), for which the distance is undefined. This substitution by the absolute value serves as a simple workaround for the issue and is only used here in the analysis to infer the neuronal coordinates by MDS. Finally, we estimated the embedding coordinates $\vec{x}_i$ for each neuron by the SMACOF algorithm (Scaling by MAjorizing a COmplicated Function), which minimizes the Sammon error

(23) $E=\frac{1}{\sum_{i<j}u_{ij}}\sum_{i<j}\frac{\left(u_{ij}-\hat{u}_{ij}\right)^2}{u_{ij}}$

where $\hat{u}_{ij}=\|\vec{x}_i-\vec{x}_j\|$ is the pairwise distance in the embedding space and $u_{ij}$ is the target distance calculated above. To reduce errors at large distances (i.e., small correlations with $R_{ij}<f(L)$, where $L$ is the estimated box size), we performed a soft cut-off at large distances:

(24) $u_{ij}=\begin{cases}f^{-1}(|R_{ij}|), & R_{ij}\ge f(L)\\ L\log\!\left(f^{-1}(|R_{ij}|)/L\right)+L, & R_{ij}<f(L)\end{cases}$

During the optimization process, we started from the embedding coordinates estimated by classical MDS (Cox and Cox, 2000), with an initial sum-of-squares distance error that can be calculated directly, and terminated when the error or its gradient became smaller than $10^{-4}$.
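
A minimal sketch of the embedding step is shown below. It uses scikit-learn's SMACOF-based metric MDS as a stand-in for the Sammon-error objective described above, so it is an approximation of the procedure rather than a reimplementation.

```python
# Sketch: correlations -> target distances (with the Eq. 24 soft cut-off),
# then a SMACOF-based MDS embedding into d functional dimensions.
import numpy as np
from sklearn.manifold import MDS

def embed_functional_coordinates(R, mu=0.5, eps=0.03125, L=10.0, d=2, seed=0):
    R = np.clip(np.abs(R), 1e-6, 1.0)                         # use |R|; avoid R = 0
    u = eps * np.sqrt(R ** (-2.0 / mu) - 1.0)                 # u = f^{-1}(|R|) for Eq. 11
    far = R < eps ** mu * (eps ** 2 + L ** 2) ** (-mu / 2)    # i.e., R < f(L)
    u[far] = L * np.log(u[far] / L) + L                       # soft cut-off (Eq. 24)
    np.fill_diagonal(u, 0.0)
    mds = MDS(n_components=d, dissimilarity="precomputed", random_state=seed)
    return mds.fit_transform(u)                               # functional coordinates x_i
```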

The fitted ERM with the embedding coordinates xi reproduced the experimental covariance matrix including the cluster structures (Figure 5—figure supplement 2) and its sampling eigenspectra (Figure 5—figure supplement 1).

Canonical correlation analysis


Here we briefly explain the CCA method (Knapp, 1978) for completeness. The basis vectors $\vec{v}_{\mathrm{func}}$ and $\vec{v}_{\mathrm{anat}}$, in functional and anatomical space, respectively, were found by maximizing the correlation $R_{\mathrm{CCA}}=\mathrm{corr}\!\left(\{\vec{v}_{\mathrm{func}}\cdot\vec{x}_i\},\{\vec{v}_{\mathrm{anat}}\cdot\vec{y}_i\}\right)$. These basis vectors satisfy the condition that the projections of the neuron coordinates along them, $\{\vec{x}_i\cdot\vec{v}_{\mathrm{func}}\}$ and $\{\vec{y}_i\cdot\vec{v}_{\mathrm{anat}}\}$, are maximally correlated among all possible choices of $\vec{v}_{\mathrm{func}}$ and $\vec{v}_{\mathrm{anat}}$. Here, $\{\vec{x}_i\}$ and $\{\vec{y}_i\}$ represent the coordinates in functional and anatomical spaces, respectively. The resulting maximum correlation is $R_{\mathrm{CCA}}$. To check the significance of the canonical correlation, we shuffled the functional space coordinates $\{\vec{x}_i\}$ across neurons' identity and re-calculated the canonical correlation with the anatomical coordinates, as shown in Figure 5—figure supplement 4.

To study the effect of the functional–anatomical relation described by $R_{\mathrm{CCA}}$ in the ERM model, we generated three-dimensional anatomical coordinates $\{\vec{y}_i\}$ and two-dimensional functional coordinates $\{\vec{x}_i\}$ for each neuron, which are jointly five-dimensional zero-mean multivariate Gaussian random variables. The coordinates are independent of each other, except for the first dimension $\{x_{i1}\}$ of the functional coordinates and the first dimension $\{y_{i1}\}$ of the anatomical coordinates, which are assigned a correlation coefficient equal to $R_{\mathrm{CCA}}$. The variances of the coordinates are $\sigma_{y_1}^2=1$, $\sigma_{y_2}^2=1$, $\sigma_{y_3}^2=1$, and $\sigma_{x_1}^2=2$, $\sigma_{x_2}^2=1$ for the numerics in Figure 5—figure supplement 7. Under this construction, the first canonical correlation between the anatomical and functional coordinates equals $R_{\mathrm{CCA}}$, the first canonical direction $\vec{v}_{\mathrm{anat}}$ in the anatomical space is $(1,0,0)^T$, and the first canonical direction $\vec{v}_{\mathrm{func}}$ in the functional space is $(1,0)^T$.
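
The sketch below generates such jointly Gaussian coordinates and checks the canonical correlation with scikit-learn's CCA (the value of $R_{\mathrm{CCA}}$ and the sample size are illustrative):

```python
# Sketch: synthetic anatomical/functional coordinates with a prescribed first
# canonical correlation, verified with CCA.
import numpy as np
from sklearn.cross_decomposition import CCA

def synthetic_coordinates(N=1024, r_cca=0.4, rng=None):
    rng = np.random.default_rng(rng)
    # joint covariance of (x1, x2, y1, y2, y3): only x1 and y1 are correlated
    var_x = np.array([2.0, 1.0])
    cov = np.eye(5)
    cov[0, 0], cov[1, 1] = var_x
    cov[0, 2] = cov[2, 0] = r_cca * np.sqrt(var_x[0])   # corr(x1, y1) = r_cca
    z = rng.multivariate_normal(np.zeros(5), cov, size=N)
    return z[:, :2], z[:, 2:]                           # functional, anatomical

x_func, y_anat = synthetic_coordinates()
cca = CCA(n_components=1).fit(x_func, y_anat)
u, v = cca.transform(x_func, y_anat)
r_cca_hat = np.corrcoef(u[:, 0], v[:, 0])[0, 1]         # should be close to r_cca
```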

Spectrum of three types of sampling procedures in ERM model


In Results, we considered three types of sampling procedures: random sampling (RSap), spatial sampling in the anatomical space (ASap, e.g., recording neurons in a brain region), and spatial sampling in the functional space (FSap), that is, subdividing the functional space into smaller regions, which is equivalent to the previously reported RG-inspired process (Bradde and Bialek, 2017). Here, we consider the relationship between the spectra obtained under these three types of sampling procedures.

We assume a uniform random distribution of neurons in a d-dimensional functional space, [0,L]d. For RSap procedures, the resulting neuronal density ρR is reduced to ρR=kρ0, with k representing the sampling ratio (k=N/N0) and ρ0 being the initial density. In contrast, FSap maintains the original density, ρF=ρ0. This constancy in neuronal density under FSap ensures that the covariance eigenspectrum remains invariant across scales for any spatial correlation functions f(x), such as power law and exponential, as shown in Appendix 1—figure 5A, B, D, E. In contrast, RSap reduces ρ, thus demanding more rigorous conditions to achieve a scale-invariant covariance spectrum (e.g., compare Appendix 1—figure 5A, C).

Under ASap, sampled neurons are not spread out evenly in functional space, whereas our theoretical framework assumes a uniform distribution. To reconcile this discrepancy, we employ a uniform approximation of the neural distribution. This approach involves introducing an effective density, ρ, defined as the spatial average of the density function ρ(x). This adjustment allows our theoretical model to accommodate non-uniform distributions encountered in anatomically spatial sampling.

(25) $\rho\equiv\langle\rho(\vec{x})\rangle=\int p(\vec{x})\,\rho(\vec{x})\,d\vec{x}=kN_0\int p^2(\vec{x})\,d\vec{x},$

where p(x) is the normalized density distribution (see Methods).

Using the Cauchy–Schwarz inequality, we have

(26) $\int p^2(\vec{x})\,d\vec{x}\int d\vec{x}\ \ge\ \left(\int p(\vec{x})\,d\vec{x}\right)^2$

thus $\rho\ge k\rho_0$.

According to the condition $p(\vec{x})<\frac{1}{kV}$, we have $\rho\le\rho_0$; intuitively, sampling within a uniformly distributed neuron population does not increase the density.

So we have $\rho_0\ge\rho_A\ge k\rho_0$, that is, $\rho_F\ge\rho_A\ge\rho_R$. Thus, the eigenspectrum under ASap should lie between those under FSap and RSap.

Dimensions of three types of sampling procedures in ERM model

Scaling of dimensions through random sampling


Let us revisit the definition of the participation ratio (PR) dimension as defined in Equation 5:

(27) $D_{PR}(C)=\frac{\left(\sum_i\lambda_i\right)^2}{\sum_i\lambda_i^2}=\frac{(\mathrm{Tr}(C))^2}{\mathrm{Tr}(C^2)}=\frac{N^2E(\sigma^2)^2}{NE(\sigma^4)+N(N-1)E_{i\neq j}(C_{ij}^2)}$

During the random sampling process, the expected values E(σ2), E(σ4), and Eij(Cij2) remain constant. These constants allow for the estimation of the PR dimension across various scales using:

(28) $D_{PR}^{RSap}=\frac{kN_0E(\sigma^2)^2}{E(\sigma^4)+(kN_0-1)E_{i\neq j}(C_{ij}^2)}$

Here, $k=N/N_0$ represents the scaling factor (fraction) associated with sampling. The key question is to understand how the dimensionality changes with $k$. Under random sampling, as $k$ increases, the dimensionality quickly approaches a saturation point defined by Equation 1.

Scaling of dimensions through functional sampling


In this section, we leverage the uniform ERM model to estimate dimensions within the context of functional sampling, specifically focusing on the estimation of the squared pairwise covariance $E_{i\neq j}(C_{ij}^2)$ and dimensionality. Adopting an approximation for the power-law kernel function $f(x)\approx\epsilon^{\mu}x^{-\mu}$ allows us to express the expected value of the squared covariance $E_{i\neq j}(C_{ij}^2)$ as follows:

(29) $E_{i\neq j}(C_{ij}^2)=\int_{[0,L]^d}p(\vec{x}_1)p(\vec{x}_2)f^2(\|\vec{x}_1-\vec{x}_2\|)\,d\vec{x}_1d\vec{x}_2\approx\int_{[0,L]^d}p(\vec{x}_1)p(\vec{x}_2)\,\epsilon^{2\mu}\|\vec{x}_1-\vec{x}_2\|^{-2\mu}\,d\vec{x}_1d\vec{x}_2.$

For a set subjected to functional sampling with a sampling fraction $k$, this procedure rescales the size of the functional space in the ERM model by a factor of $k^{1/d}$. Consequently, the $E_{i\neq j}^{k}(C_{ij}^2)$ for the sampled fraction $k$ is given by:

(30) $E_{i\neq j}^{k}(C_{ij}^2)=\int_{[0,k^{1/d}L]^d}p(\vec{x}_1)p(\vec{x}_2)f^2(\|\vec{x}_1-\vec{x}_2\|)\,d\vec{x}_1d\vec{x}_2=\int_{[0,L]^d}p(\vec{x}_1)p(\vec{x}_2)f^2(k^{1/d}\|\vec{x}_1-\vec{x}_2\|)\,d\vec{x}_1d\vec{x}_2\approx\int_{[0,L]^d}p(\vec{x}_1)p(\vec{x}_2)\,\epsilon^{2\mu}k^{-2\mu/d}\|\vec{x}_1-\vec{x}_2\|^{-2\mu}\,d\vec{x}_1d\vec{x}_2\approx k^{-2\mu/d}E_{i\neq j}(C_{ij}^2),$

Here, we assume that E[σ2] and E[σ4] are constant across the sampling process. This model enables the estimation of the ratio μ/d as detailed in the Methods.

(31) $D_{PR}^{FSap}\approx\frac{kN_0E(\sigma^2)^2}{E(\sigma^4)+(kN_0-1)\,k^{-2\mu/d}E_{i\neq j}(C_{ij}^2)}$

In the large-$N$ limit, we observe distinct behaviors in the evolution of dimensionality in both theory and data: it saturates under RSap (dashed line in Figure 5D), namely $D_{PR}^{RSap}\to D_{PR}$ defined in Equation 1, whereas it follows a different scaling relationship, $D_{PR}^{FSap}\approx k^{2\mu/d}D_{PR}$, under FSap (solid line in Figure 5D).
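
A small helper (sketch) that evaluates Equations 28 and 31 side by side, given moments estimated from data or from a fitted ERM (argument names are ours):

```python
# Sketch: predicted participation-ratio dimension versus sampling fraction k
# under random sampling (Eq. 28) and functional sampling (Eq. 31).
import numpy as np

def pr_dimension_vs_fraction(k, N0, E_sig2, E_sig4, E_Cij2, mu_over_d):
    k = np.asarray(k, dtype=float)
    d_rsap = k * N0 * E_sig2 ** 2 / (E_sig4 + (k * N0 - 1) * E_Cij2)
    d_fsap = k * N0 * E_sig2 ** 2 / (E_sig4 + (k * N0 - 1) * k ** (-2 * mu_over_d) * E_Cij2)
    return d_rsap, d_fsap   # RSap saturates; FSap keeps growing roughly as k^(2*mu/d)
```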

Comparative analysis of PR dimension across sampling techniques


This section examines the behavior of the PR dimension under three sampling techniques: anatomical sampling, random sampling, and functional sampling. We show that the average PR dimension following anatomical sampling occupies a middle ground between the extremes presented by random and functional sampling.

The PR dimension, denoted DPR, reflects the sampling impact and depends on the distribution p(X) of the functional coordinates X. Defining the sampling fraction as k=1/q, the mean DPR is represented as:

(32) $\mathrm{mean}(D_{PR})=\frac{1}{q}\sum_{i=1}^{q}D_{PR}^{i}=\frac{1}{q}\sum_{i=1}^{q}J\!\left(p_i(\vec{X})\right),$

where the neuron set $\{1,2,\ldots,N\}$ is segmented into $q$ clusters $\{X_1,X_2,\ldots,X_q\}$, each comprising $N/q$ neurons, and $J(p_i(\vec{X}))$ denotes the PR dimension evaluated under the coordinate distribution $p_i(\vec{X})$ of cluster $\{X_i\}$. The probability distribution for each cluster, $p_i(\vec{X})$, emerges naturally from the sampling process.

The equivalence of the mean probability density function across the sampled clusters to the original set’s probability density function leads us to the condition:

(33) $\frac{1}{q}\sum_{i=1}^{q}p_i(\vec{X})=p(\vec{X}),$

This condition is a direct consequence of the sampling process, ensuring that the aggregated probability density function of all sampled sets mirrors the overall density distribution of the neurons.

Applying the Lagrange multiplier method to optimize the mean DPR:

(34) $\mathcal{L}(p,\lambda)=\frac{1}{q}\sum_{i=1}^{q}J\!\left(p_i(\vec{X})\right)+\int_{D}d^d\vec{X}\,\lambda(\vec{X})\left(\frac{1}{q}\sum_{i=1}^{q}p_i(\vec{X})-p(\vec{X})\right),$

Here, $\mathcal{L}(p,\lambda)$ is the Lagrangian and $\lambda(\vec{X})$ is the Lagrange multiplier. We derive the optimality condition:

(35) $\frac{\partial\mathcal{L}(p,\lambda)}{\partial p_i}=0,$

yielding:

(36) $\frac{1}{q}\frac{\partial J}{\partial p_i(\vec{X})}+\frac{\lambda(\vec{X})}{q}=0.$

At the optimal mean $D_{PR}$, each $p_i(\vec{X})$ is identical, leading to $p_i(\vec{X})=p_j(\vec{X})=p(\vec{X})$ (representative of random sampling). Hence, the mean $D_{PR}$ after random sampling sets the upper limit for the mean $D_{PR}$ after anatomical sampling.

Let us investigate the lower bound of the mean PR dimension with the ERM model. For the minimization of mean (DPR), a key requirement is the functional spatial proximity of neurons within the same cluster, in other words, the neuron set should be distinctly separated in functional space. Consequently, achieving the minimum mean PR dimension necessitates a functional sampling strategy.

Derive upper bound of dimension from spectrum


To deduce DPR from the spectrum, for simplicity, we focus on the high-density region, where we have an analytical expression for λ that is valid for large eigenvalues:

(37) $\lambda_r=\gamma\left(\frac{r}{N}\right)^{-1+\frac{\mu}{d}}\rho^{\frac{\mu}{d}}=\gamma\,r^{-1+\frac{\mu}{d}}L^{-\mu}N\quad\text{for }r\le\beta(N),$

where L is the size of the functional space, γ is the coefficient in Equation 3, which depends on d, μ, and E(σ2). Note that the eigenvalue λr decays rapidly after the threshold r=β(N). Since we did not discuss small eigenvalues in this article, we represent them here as an unknown function η(r,N,L):

(38) $\lambda_r=\eta(r,N,L)\quad\text{for }r>\beta(N)$

As discussed in Methods, without changing the properties of the spectrum, we can always impose E(σ2)=1 such that

(39) $\sum_{r=1}^{N}\lambda_r=\mathrm{Tr}(C)=N$

We emphasize that this constraint requires that large and small eigenvalues behave differently, because otherwise $\sum_{r=1}^{N}r^{-\alpha}$ with $\alpha<1$ would scale as $N^{1-\alpha}$, and $\sum_{r=1}^{N}\lambda_r$ would not be proportional to $N$.

Since all eigenvalues of a covariance matrix are nonnegative, we have an upper bound of $\sum_{r=1}^{N}\lambda_r^2$:

(40) $\sum_{r=1}^{N}\lambda_r^2\le\left(\sum_r\lambda_r\right)^2=N^2$

On the other hand, $\lambda_1^2$ is a lower bound of $\sum_{r=1}^{N}\lambda_r^2$:

(41) $\sum_{r=1}^{N}\lambda_r^2>\lambda_1^2=L^{-2\mu}N^2\gamma^2$

As a result, the dimensionality

$D_{PR}=\frac{\left(\sum_{r=1}^{N}\lambda_r\right)^2}{\sum_{r=1}^{N}\lambda_r^2},$

is bounded as

(42) $1\le D_{PR}<\frac{L^{2\mu}}{\gamma^2}$

Under random sampling, L remains fixed. Thus, we must have a bounded dimensionality that is independent of N for our ERM model. A tighter lower bound of r=1Nλr2 is

(43) $\sum_{r=1}^{N}\lambda_r^2>\gamma^2L^{-2\mu}N^2\sum_{r=1}^{\beta(N)}r^{-2+2\mu/d}$

A tighter upper bound of participation ratio DPR can be written as:

(44) $D_{PR}=\frac{\left(\sum_{r=1}^{N}\lambda_r\right)^2}{\sum_{r=1}^{N}\lambda_r^2}<\frac{L^{2\mu}\gamma^{-2}}{\sum_{r=1}^{\beta(N)}r^{-2+2\mu/d}}<\frac{L^{2\mu}}{\gamma^2}$

However, in functional sampling, enlarging the region size at constant density $\rho$ results in $L\propto N^{1/d}$. Thus, the upper bound of $D_{PR}$ should grow as $N^{2\mu/d}$, consistent with the previously derived result (Equation 31) in Methods.

Simulating CCA and anatomical sampling


In this section, we estimate the dimensionality of the anatomically sampled neuron set. For simplicity, we assume that the functional coordinates of neurons, $\vec{X}_i$, and the anatomical coordinates of neurons, $\vec{Y}_i$, both follow a multivariate Gaussian distribution. We define anatomical sampling as sampling on $\vec{Y}_i$ along an arbitrarily chosen direction, denoted $Y_A$. Subsequently, we perform sampling on $\vec{X}_i$ along the direction $X_A$ that has the highest correlation with $Y_A$ according to CCA; this process effectively mimics the scenario of functional sampling.

The key to calculating the PR dimension involves computing the expected value Eij(Cij2). In the ERM model, the distribution of Cij can be estimated by the distribution of points in the functional space. This allows for the calculation of the PR dimension across anatomical sampling by comparing the distribution of Xi after anatomical sampling with that after functional sampling. We can model the distribution of XA and YA as follows:

(45) $R_{ASap}=\mathrm{corr}(X_A,Y_A),\quad C_{ASap}=\mathrm{corr}(X_A,Y_A)\,\sigma_x\sigma_y,\quad\begin{bmatrix}X_A\\Y_A\end{bmatrix}\sim\mathcal{N}\!\left(\begin{bmatrix}0\\0\end{bmatrix},\begin{bmatrix}\sigma_x^2 & C_{ASap}\\ C_{ASap} & \sigma_y^2\end{bmatrix}\right),$

Here, we consider only the projection of the functional coordinate onto the direction XA, which exhibits the highest correlation, denoted by RASap, with YA. Specifically, when selecting the anatomical direction as the first CCA direction, the correlation between XA and YA reaches its maximum, such that RASap=RCCA. In this case, anatomical sampling results in the minimization of the dimensionality.

Now, let us perform anatomical sampling on the neurons. The Xi and Yi denote the functional and anatomical coordinates of the ith neuron cluster after anatomical sampling, respectively.

To approximate, we need to calculate the functional coordinate probability distribution $p(\vec{X}^i)=p(\vec{X}\mid q_{ik}^{y}<Y_A<q_{(i+1)k}^{y})$, which is the distribution of the $i$th neuron cluster after anatomical sampling. Here $Y_A$ represents the selected direction in anatomical space, and $q_{ik}^{y}$ denotes the $ik$th quantile of $Y_A$, where $k$ is the sampled fraction. Note the following relationships and distributions:

(46) $p(X_A\mid Y_A=y)=\frac{p(X_A,Y_A=y)}{p(Y_A=y)},\qquad p(X_A\mid Y_A=y)\sim\mathcal{N}\!\left(y\,\frac{\sigma_x}{\sigma_y}R_{ASap},\ \sigma_x^2\left(1-R_{ASap}^2\right)\right).$
(47) $p(X_A^i)=p(X_A\mid q_{ik}^{y}<Y_A<q_{(i+1)k}^{y})=\frac{1}{k}\int_{q_{ik}^{y}}^{q_{(i+1)k}^{y}}p(X_A\mid Y_A=y)\,dy$

The conditional probability distribution $p(X_A\mid q_{ik}^{y}<Y_A<q_{(i+1)k}^{y})$ is equivalent to the distribution of the sum of $Y_A^i\frac{\sigma_x}{\sigma_y}R_{ASap}$ and $X_0$, where $X_0\sim\mathcal{N}\!\left(0,\sigma_x^2\left(1-R_{ASap}^2\right)\right)$:

(48) $X_A^i=Y_A^i\,\frac{\sigma_x}{\sigma_y}R_{ASap}+X_0,$
(49) $p(Y_A^i=y)=\begin{cases}\frac{1}{k\sqrt{2\pi}\sigma_y}\exp\!\left(-\frac{y^2}{2\sigma_y^2}\right) & \text{for }q_{ik}^{y}<y<q_{(i+1)k}^{y},\\ 0 & \text{otherwise.}\end{cases}$

The computation of $X_A^i$ involves two technical challenges: (1) The distribution of $Y_A^i$ is represented by a non-elementary function (Equation 49), which complicates the direct calculation of $X_A^i$, the sum of $Y_A^i R_{ASap}\sigma_x/\sigma_y$ and $X_0$. To facilitate the approximation, we model $Y_A^i$ using a normal distribution with matched variance. (2) Calculating the variance of $Y_A^i$ is itself nontrivial, and this variance differs across neuron clusters $i$. Assuming a uniform distribution for $Y_A$ simplifies this task (this assumption is only used to calculate the variance of $Y_A^i$). Under this assumption, the variance of $Y_A^i$ can be straightforwardly calculated as $\mathrm{Var}(Y_A^i)=k^2\sigma_y^2$. Consequently, we approximate $Y_A^i$ and $X_A^i$ as follows:

(50) $Y_A^i\sim\mathcal{N}\!\left(\frac{q_{ik}^{y}+q_{(i+1)k}^{y}}{2},\ k^2\sigma_y^2\right),$
(51) $X_A^i\sim\mathcal{N}\!\left(\frac{q_{ik}^{y}+q_{(i+1)k}^{y}}{2}\,\frac{\sigma_x}{\sigma_y}R_{ASap},\ \sigma_x^2\left(1-R_{ASap}^2+k^2R_{ASap}^2\right)\right).$

Calculating the PR dimension directly from the distribution of XiA is difficult; thus, we approximate anatomical sampling with fraction k as functional sampling with fraction kf, leading to:

(52) $k_f=\sqrt{1+k^2R_{ASap}^2-R_{ASap}^2}.$

Using the relationship for functional sampling, $E_{i\neq j}^{k}(C_{ij}^2)\approx k^{-2\mu/d}E_{i\neq j}(C_{ij}^2)$ (Equation 30):

(53) $E_{i\neq j}^{k}(C_{ij}^2)\approx\left(1+k^2R_{ASap}^2-R_{ASap}^2\right)^{-\mu/d}E_{i\neq j}(C_{ij}^2).$
(54) $D_{PR}^{ASap}\approx\frac{kN_0E(\sigma^2)^2}{E(\sigma^4)+(kN_0-1)\left(1+k^2R_{ASap}^2-R_{ASap}^2\right)^{-\mu/d}E_{i\neq j}(C_{ij}^2)}$

Appendix 1

Extensions of ERM and factors not affecting the scale invariance

In Appendix 1—figure 1, we considered five additional types of spatial density distributions (coordinate distributions) in the functional space and two additional functional space geometries. We examined points distributed according to the uniform distribution ($p(\vec{x})=1/L^d$), the normal distribution ($\vec{x}\sim\mathcal{N}(\mu_p,\sigma_p^2I)$), and the log-normal distribution ($\log\vec{x}\sim\mathcal{N}(\mu_p,\sigma_p^2I)$). We used the method described in Methods to adjust the parameters of the coordinate distributions based on the uniform distribution case, so that they all generate similar pairwise correlation distributions. The relationships between these parameters are described in Methods. In Appendix 1—figure 1B, we used the following parameters: $d=2$ for the uniform distribution; $\mu_p=0$, $\sigma_p=2.82$ for the normal distribution; and $\mu_p=2$, $\sigma_p=0.39$ for the log-normal distribution.

Second, we introduced multiple clusters of neurons in the functional space, with each cluster uniformly distributed in a box. We considered three arrangements: (1) two closely situated clusters (with a box size of $L=5\sqrt{2}$ and the distance between the two cluster centers being $L_c=L$), (2) two distantly situated clusters (with a box size of $L=5\sqrt{2}$ and the distance between clusters $L_c=4L$), and (3) three clusters arranged symmetrically in an equilateral triangle (with a box size of $L=10/\sqrt{3}$ and the distance between clusters $L_c=L$).

Finally, we examined the scenario in which the points were uniformly distributed on the surface of a sphere ($4\pi l^2=L^2$, $l$ being the radius of the sphere) or a hemisphere ($2\pi l^2=L^2$) embedded in $\mathbb{R}^3$ (the pairwise distance is that in $\mathbb{R}^3$). It should be noted that both cases have the same surface area as the 2D box.

Appendix 1—figure 1
Factors that do not affect the scale invariance.

(A) Rank plot of the covariance eigenspectrum for ERMs with different f(x) (see Table 3). Diagrams show different slow-decaying kernel functions f(x) along a 1D slice. (B) Same as A but for different coordinate distributions in the functional space (see text). The diagrams on the right illustrate uniform and clustered coordinate distributions. (C) Same as A but for different geometries of the functional space (see text). Diagrams illustrate spherical and hemispherical surfaces. (D) CI of the different ERMs considered in (A–C). The range on the y-axis is identical to Figure 4C. On the x-axis, 1: uniform distribution, 2: normal distribution, 3: log-normal distribution, 4: uniform two nearby clusters, 5: uniform two faraway clusters, 6: uniform 3-cluster, 7: spherical surface in R3, 8: hemispherical surface in R3. All ERM models in (B, C) are adjusted to have a similar distribution of pairwise correlations (Appendix 1).

Appendix 1—figure 2
Comparisons of large eigenvalues across different smoothing interval sizes, ε.

Rank plot (upper row) and pdf (lower row) of the covariance eigenspectrum for ERMs with different f(x). (A) ϵ=0.06. (B) ϵ=0.12. (C) ϵ=0.3. (D) ϵ=0.6. Other ERM simulation parameters: N=4096, ρ=100, μ=0.5, d=2, L=6.4, σi2=1. The formulas for different f(x)’s are listed in Table 3 in Methods.

Appendix 1—figure 3
Modifications of f(x) near x=0.

The upper row illustrates the slow-decaying kernel function f(x) (blue solid line) and its power-law asymptote (red dashed line) along a 1D slice at various f(x). The lower row is similar to A, but on the log–log scale. The formulas for different f(x)’s are listed in Table 3 in Methods.

RG approach

Here we briefly summarize the RG approach used in Meshulam et al., 2019 and elucidate the adjustments required when applying the RG approach to ERM. The method consists of two stages: (i) iterative agglomerate clustering of neurons, and (ii) computing the spectrum of a block of the original covariance matrix corresponding to a cluster of the desired size based on the previous clustering result.

Stage (i): iterative clustering

We begin with N0 neurons, where N0 is assumed to be a power of 2. In the first iteration, we compute Pearson’s correlation coefficients for all neuron pairs. We then search greedily for the most correlated pairs and group the half pairs with the highest correlation into the first cluster; the remaining neurons form the second cluster. For each pair (a,b), we define a coarse-grained variable according to:

(S1) $x_i^{k}=Z_{ab}^{k-1}\left(x_a^{k-1}+x_b^{k-1}\right),$

where $Z_{ab}^{k-1}$ normalizes the average to ensure unit nonzero activity. This process reduces the number of neurons to $N_1=N_0/2$. In subsequent iterations, we continue grouping the most correlated pairs of the coarse-grained neurons, iteratively reducing the number of neurons by half at each step. This process continues until the desired level of coarse-graining is achieved.

When applying the RG approach to ERM, instead of combining neural activity, we merge correlation matrices to traverse different scales. During the $k$th iteration, suppose the coarse-grained variable $i$ merges the pair $(a,b)$ and the coarse-grained variable $j$ merges the pair $(c,d)$ from the previous iteration; we compute the coarse-grained covariance as:

(S2) $c_{ij}^{k}=c_{ac}^{k-1}+c_{ad}^{k-1}+c_{bc}^{k-1}+c_{bd}^{k-1}$

and the variance as:

(S3) $c_{ii}^{k}=c_{aa}^{k-1}+c_{bb}^{k-1}+2c_{ab}^{k-1}$

Following these calculations, we normalize the coarse-grained covariance matrix to ensure that all variances are equal to one. Note that these coarse-grained covariances are only used in stage (i) and not used to calculate the spectrum.

Stage (ii): eigenspectrum calculation

The calculation of eigenspectra at different scales proceeds through three sequential steps. First, for each cluster identified in stage (i), we compute the covariance matrix using the original firing rates of neurons within that cluster (not the coarse-grained activities). Second, we calculate the eigenspectrum for each cluster. Finally, we average these eigenspectra across all clusters at a given iteration level to obtain the representative eigenspectrum for that scale.

In stage (ii), we calculate the eigenspectra of the sub-covariance matrices across different cluster sizes as described in Meshulam et al., 2019. Let $N_0=2^n$ be the original number of neurons. To reduce it to size $N=N_0/2^k=2^{n-k}$, where $k$ indexes the reduction step, consider the coarse-grained neurons at step $n-k$ of stage (i). Each coarse-grained neuron is a cluster of $2^{n-k}$ original neurons. We then calculate the spectrum of the block of the original covariance matrix corresponding to the neurons of each cluster (there are $2^k$ such blocks). Lastly, an average of these $2^k$ spectra is computed.

For example, when reducing from $N_0=2^3=8$ to $N=2^{3-1}=4$ neurons ($k=1$), we would have two clusters of four neurons each. We calculate the eigenspectrum for each 4 × 4 block of the original covariance matrix, then average these two spectra together. To better understand this process through a concrete example, consider a hypothetical scenario where a set of eight neurons, labeled $1,2,3,\ldots,7,8$, is subjected to a two-step clustering procedure. In the first step, neurons are grouped based on their maximum correlation pairs, for example, resulting in the formation of four pairs: $\{1,2\},\{3,4\},\{5,6\}$, and $\{7,8\}$ (see Appendix 1—figure 4). Subsequently, the neurons are further grouped into two clusters based on the results of the RG step mentioned above. Specifically, if the correlation between the coarse-grained variables of the pair $\{1,2\}$ and the pair $\{3,4\}$ is found to be the largest among all other pairs of coarse-grained variables, the first group consists of neurons $\{1,2,3,4\}$, while the second group contains neurons $\{5,6,7,8\}$. Next, take the cluster size $N=4$ for example. The eigenspectra of the covariance matrices of the four neurons within each cluster are computed. This results in two eigenspectra, one for each cluster. The correlation matrices used to compute the eigenspectra of different sizes do not involve coarse-grained neurons; they involve the real neurons $1,2,3,\ldots,7,8$, but with expanding cluster sizes. Finally, the average of the eigenspectra of the two clusters is calculated.
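
The sketch below implements a simplified version of this two-stage procedure on a covariance matrix: greedy pairing of the most correlated coarse-grained variables using Equations S2–S3 (correlations are renormalized on the fly rather than via Equation S1), followed by averaging the spectra of the corresponding blocks of the original covariance matrix. It assumes a power-of-two number of neurons and is an illustration, not the original code.

```python
# Sketch of the RG-style clustering (stage i) and block spectra (stage ii).
import numpy as np

def rg_spectra(C, n_levels=3):
    N0 = C.shape[0]
    clusters = [[i] for i in range(N0)]   # each coarse variable = list of original neurons
    cov = C.copy()
    spectra = {}                          # cluster size -> averaged eigenvalues
    for _ in range(n_levels):
        # greedy pairing by correlation of the current coarse-grained variables
        corr = cov / np.sqrt(np.outer(np.diag(cov), np.diag(cov)))
        np.fill_diagonal(corr, -np.inf)
        order = np.dstack(np.unravel_index(np.argsort(-corr, axis=None), corr.shape))[0]
        free, pairs = set(range(len(clusters))), []
        for a, b in order:
            if a in free and b in free:
                pairs.append((a, b))
                free -= {a, b}
            if not free:
                break
        # stage (i): merge covariances (Eqs. S2-S3) and cluster memberships
        new_clusters = [clusters[a] + clusters[b] for a, b in pairs]
        M = len(pairs)
        new_cov = np.zeros((M, M))
        for i, (a, b) in enumerate(pairs):
            for j, (c, d) in enumerate(pairs):
                if i == j:
                    new_cov[i, j] = cov[a, a] + cov[b, b] + 2 * cov[a, b]
                else:
                    new_cov[i, j] = cov[a, c] + cov[a, d] + cov[b, c] + cov[b, d]
        clusters, cov = new_clusters, new_cov
        # stage (ii): spectra of the ORIGINAL covariance blocks for each cluster
        eigs = [np.sort(np.linalg.eigvalsh(C[np.ix_(cl, cl)]))[::-1] for cl in clusters]
        spectra[len(clusters[0])] = np.mean(eigs, axis=0)
    return spectra
```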

Appendix 1—figure 4
Example of renormalization group (RG) approach for a set of eight neurons.

The figure is adapted from Meshulam et al., 2019. The diagram illustrates the iterative clustering process for eight neurons. In each iteration, neurons are paired based on maximum correlation, with their activities combined through summation and normalized to maintain unit mean for nonzero values. Each neuron can only be paired once per iteration, ensuring all neurons are grouped by the iteration’s end.

Appendix 1—figure 5
Eigenspectra of renormalization group (RG)-inspired clustering, direct functional region sampling (FSap), and random sampling (RSap) in ERM.

(A, D) RG clustered eigenspectra of the ERM. The size of the cluster is denoted by N, which is the number of neurons in each cluster. We adopt the RG approach (Meshulam et al., 2018; Meshulam et al., 2019), but with a specific modification (Appendix 1). (B, E) Direct spatial sampling in the functional space (FSap) and the corresponding ERM eigenspectra. We began our analysis with a set of $N_0$ neurons distributed in the functional space. Initially, we chose $N=N_0/2$ neurons located exclusively on one side of the x-axis of this space. We then proceeded to select $N=N_0/4$ neurons from 4 quadrants. This sampling process was repeated iteratively, generating successively smaller subsets of neurons. (C, F) Randomly sampled (RSap) eigenspectra of the ERM. ERM parameters: (A–C) exponential kernel $f(x)=e^{-x/b}$ with $b=1$, $\rho=10.24$, and dimension $d=2$. (D–F) approximate power law (Equation 11) with $\mu=0.5$, $\rho=10.24$, and dimension $d=2$. Other parameters are the same as in Figure 3. The standard error of the mean (SEM) across the clusters is represented by the shaded area of each line.

Appendix 1—figure 6
Morrell et al.’s latent variable model.

(A–D) Functional sampled (FSap) eigenspectra of the Morrell et al. model. (E–H) Random sampled (RSap) eigenspectra of the same model. Briefly, in Morrell et al.’s latent variable model (Morrell et al., 2024; Morrell et al., 2021), neural activity is driven by Nf latent fields and a place field. The latent fields are modeled as Ornstein–Uhlenbeck processes with a time constant τ. The parameters ε and η control the mean and variance of individual neurons’ firing rates, respectively. The following are the parameter values used. (A, E) Using the same parameters as in Morrell et al., 2021: Nf=10, ϵ=2.67, η=6, τ=0.1. Half of the cells are also coupled to the place field. (B–D, F–H) Using parameters from Morrell et al., 2024: Nf=5, ϵ=3, η=4. There is no place field. The time constant τ=0.1,1,10 for (B, F, C, G) and (D, H), respectively.

Analyzing the effects of removing neural activity data during hunting

To identify and remove the time frames corresponding to putative hunting behaviors, the following procedure was used. The hunting interval was defined as 10 frames (1 s) preceding the onset of an eye convergence (see Methods) to 10 frames after the offset of this eye convergence. These frames were then excluded from the data before recalculating the covariance matrix (see Methods) and subsequently the sampled eigenspectra (Appendix 1—figure 7B, Appendix 1—figure 8B, D, F, H). As a control to the removal of the hunting frame, an equal number of time frames that are not within those hunting intervals were randomly selected and then removed and analyzed (Appendix 1—figure 7C, Appendix 1—figure 8A, C, E, G). The number of hunting interval frames and total recording frames for five fish exhibiting hunting behaviors are as follows: fish 1 – 268/7495, fish 2 – 565/9774, fish 3 – 2734/13,904, fish 4 – 843/7318, and fish 5 – 1066/7200. Fish 6 (number of time frames: 9388) was not exposed to a prey stimulus and, therefore, was excluded from the analysis.

To assess the impact of hunting-frame removal on the CI, we calculated the CI of the covariance matrix using all neurons recorded in each fish (without sampling down to 1024 neurons). For the control case, we repeated the removal of non-hunting frames 10 times to generate 10 covariance matrices and computed their CIs. We used a one-sample t-test to determine the level of statistical significance between the control CIs and the CI obtained after removal of the hunting frames.

Using the ERM parameters fitted to the full data, we performed MDS on the control data and the hunting-removed data to infer the functional coordinates. Note that the functional coordinates inferred by MDS are not unique: rotations and translations give equivalent solutions. For visualization purposes (not needed for analysis), we first used the Umeyama algorithm to optimally align the functional coordinates of the control and hunting-removed data.

To identify distinct clusters within the functional coordinates, we fit Gaussian Mixture Models (GMMs) using the 'GaussianMixtures' package in Julia. We chose the number of clusters $K$ that gave the smallest Bayesian information criterion score. After fitting the GMMs, a list of probabilities $p_{ik}$, $k=1,2,\ldots,K$, was given for each neuron $i$, specifying the probability of the neuron belonging to cluster $k$. The mean and covariance parameters were estimated for each Gaussian-distributed cluster. For visualization (but not for analysis), a neuron was colored according to cluster $k^*=\arg\max_{1\le k\le K}p_{ik}$.

We used the following method to measure the size of a cluster and its fold change. For a 2D (recall $d=2$ in our ERM) Gaussian-distributed cluster, let us consider an ellipse centered on its mean, with its axes aligned with the eigenvectors of its covariance matrix $C_{2\times2}$. Let the eigenvalues of $C$ be $\lambda_1,\lambda_2$. Then we set the lengths of the half-axes of the ellipse to be $c\sqrt{\lambda_1}$ and $c\sqrt{\lambda_2}$, respectively. Here, $c>0$ is a constant determined below. Note that the ellipse axes correspond to linear combinations of 2D Gaussian random variables that are independent, and the $\lambda_i$'s are the variances of these linear combinations. From this fact, it is straightforward to show that the probability that a sample from the Gaussian cluster lies in the above ellipse depends only on $c$, that is, $1-e^{-\frac{c^2}{2}}$, and not on the shape of the cluster. So, the ellipse represents a region that covers a fixed proportion of neurons for any cluster, and its area can be used as a measure of the size of the Gaussian cluster. Note that the area of the ellipse is $\pi c^2\sqrt{\lambda_1\lambda_2}=\pi c^2\sqrt{\det(C)}$. In Appendix 1—figure 9, we plot the ellipses to help visualize the clusters and their changes. We choose $c$ such that the ellipse covers 95% of the probability (i.e., the fraction of neurons belonging to the cluster).
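
A two-line helper (sketch) for this size measure, with the coverage level as a parameter:

```python
# Sketch: area of the ellipse covering a given probability mass of a 2D
# Gaussian cluster, used as the cluster-size measure described above.
import numpy as np

def gaussian_cluster_area(C2x2, coverage=0.95):
    c = np.sqrt(-2.0 * np.log(1.0 - coverage))    # from 1 - exp(-c^2/2) = coverage
    return np.pi * c ** 2 * np.sqrt(np.linalg.det(C2x2))

# Fold change in size between hunting-removed and control maps:
# np.sqrt(np.linalg.det(C_hat) / np.linalg.det(C)).
```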

In the control functional map where we fit the GMMs, we directly calculated the size measure $\pi c^2\sqrt{\det(C)}$ from the estimated covariance $C$ for each Gaussian cluster. In the hunting-removed functional map, we needed to estimate the covariance $\hat{C}$ for neurons belonging to a cluster $k$ under the new coordinates (we assume that the new distribution can still be approximated by a Gaussian distribution). We performed this estimation in a probabilistic manner to avoid issues with highly overlapping clusters, where the cluster membership could be ambiguous for some neurons. First, we estimated the center/mean of the new Gaussian distribution by

$(\bar{x},\bar{y}):=\left(\frac{\sum_{i=1}^{N}p_{ik}x_i}{\sum_{i=1}^{N}p_{ik}},\ \frac{\sum_{i=1}^{N}p_{ik}y_i}{\sum_{i=1}^{N}p_{ik}}\right).$

Here, the summation goes over all the $N$ neurons in the functional space, $p_{ik}$ is the membership probability defined above, and $(x_i,y_i)$ is the coordinate of neuron $i$ in the hunting-removed map. Similarly, we can use a weighted average to estimate the entries of the covariance matrix $\hat{C}=\begin{bmatrix}\hat{C}_{xx} & \hat{C}_{xy}\\ \hat{C}_{yx} & \hat{C}_{yy}\end{bmatrix}$. For example,

$\hat{C}_{xy}:=\frac{\sum_{i=1}^{N}p_{ik}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{N}p_{ik}}.$

Then we calculated the size of the cluster on the new map as $\pi c^2\sqrt{\det(\hat{C})}$. Finally, we computed the fold change in size as $\sqrt{\det(\hat{C})/\det(C)}$.

Appendix 1—figure 7
The effects of hunting behavior on scale invariance and functional space organization.

Sampled covariance eigenspectra of the data from fish 1 calculated from control (A) and hunting-removed (B) data. Ctrl: we randomly remove the same number of non-hunting frames. This process is repeated 10 times, and the mean ± SD of the CI is shown in the plot. Hunting removed: the time frames corresponding to the eye-converged intervals (putative hunting state) are removed when calculating the covariance (Appendix 1). The CI for the hunting-removed data appears to be statistically smaller than in the control case (p-value = 1.5 × 10$^{-9}$). (C) Functional space organization of control data. The neurons are clustered using the Gaussian Mixture Models (GMMs) and their cluster memberships are shown by the color. The color bar represents the proportion of neurons that belong to each cluster. (D) Similar to (C) but the functional coordinates are inferred from the hunting-removed data. The color code of each neuron is the same as that of the control data (C), which allows for a comparison of the changes to the clusters under the hunting-removed condition. See also Movie S1. (E) Fold change in size/area (Appendix 1) for each cluster (top; the gray dashed line represents a fold change of 1, i.e., no change in size) and the anatomical distribution of the most dispersed cluster (bottom).

Appendix 1—figure 8
Removing the time segment of hunting behavior does not obliterate the scale-invariant eigenspectra.

Rows correspond to four light-field zebrafish data: fish 2 to fish 5 (results for fish 1 have been shown in Appendix 1—figure 7). (A, C, E, G) Ctrl: we randomly remove the same number of time frames that are not the putative hunting frames. We repeat this process 10 times to generate 10 control covariance matrices and the CI is represented by mean ± SD. (B, D, F, H) Hunting removed: data obtained by removing hunting frames from the full data (Appendix 1). The CI for the hunting-removed data appears to be significantly smaller than that of the control case (one-sample t-test, $p=2.2\times10^{-10}$ in fish 2, $p=4.6\times10^{-9}$ in fish 3, $p=1.7\times10^{-9}$ in fish 4, and $p=3.4\times10^{-17}$ in fish 5).

Appendix 1—figure 9
Hunting behavior reorganizes neurons in the functional space (continued on next page).

Rows correspond to five light-field recordings of zebrafish engaged in hunting behavior: fish 1 to fish 5. (A, D, G, J, M) (top) Functional space organization of the control data inferred by fitting the ERM and MDS (Result). Neurons are clustered using the Gaussian Mixture Models (GMMs) and their cluster memberships are shown by the color. The colorbar represents the proportion of neurons belonging to each cluster. (A, D, G, J, M) (bottom) The coordinate distribution of the cluster in control data which is most dispersed (i.e., largest fold change in size, see below) after hunting-removal. The transparency of the dots (colorbar) is proportional to the probability of the neurons belonging to this cluster (Appendix 1). The cyan ellipse serves as a visual aid for the cluster size: it encloses 95% of the neurons belonging to that cluster (Appendix 1). (B, E, H, K, N) (top) Similar to (A, D, G, J, M) (top) but the functional coordinates are inferred from the hunting-removed data. The color code of each neuron is the same as that in the control data, which allows for a comparison of the changes to the clusters under the hunting-removed condition. (B, E, H, K, N) (bottom) Similar to (A, D, G, J, M) (bottom) but the functional coordinates are inferred from the hunting-removed data. The transparency of each neuron is the same as in (A, D, G, J, M). (bottom), and it represents the probability pik (Appendix 1) of neurons belonging to the most dispersed cluster k in the control data. Likewise, the cyan ellipse encloses 95% of the neurons belonging to that cluster (Appendix 1). (C, F, I, L, O) Top, size/area fold change (Appendix 1) for each cluster (the gray dashed line represents a fold change of 1, i.e., no change in size); bottom, the anatomical distribution of the neurons in the most dispersed cluster.

Appendix 1—video 1
Neural activity patterns in anatomical and functional space during hunting (click here).

Single-trial examples of fish 1 and fish 3. (A) Inferred firing rate activity in anatomical space. Scale bar, 100 µm. (B) Inferred firing rate activity in functional space. Functional space organization of the control data inferred by fitting the ERM and MDS in Result. The cyan ellipse serves as a visual aid for the cluster size: it encloses 95% of the neurons belonging to that cluster (Appendix 1). The inset illustrates the functional space organization, similar to that shown in Appendix 1—figure 7C. The colorbars in panels A and B depict the inferred activity magnitude of individual neurons. (C) Simultaneous behavior recording alongside the neural activity. Time, seconds.

Appendix 2

In this appendix, we elaborate upon the sketch introduced in the Methods and present a full derivation of the covariance eigenspectrum of our ERM model. This section is organized as follows. First, we briefly introduce the relationship between the eigenvalue probability density function and the resolvent. Second, we turn the problem of calculating the resolvent into a calculation of the partition function using a field-theoretic representation, and manipulate the partition function using the replica method. Third, we introduce two approximate methods for calculating the partition function, leading to the high-density theory and the Gaussian variational method, and discuss the implications and predictions of each. Finally, we discuss the relationship between the two methods and identify the parameter regime where the high-density theory agrees with the numerical simulation. Notation table: Appendix 2—table 1.

Resolvent

The eigenvalues λn of a Hermitian matrix C are real. Their probability density function or eigendensity is formally given by

(S4) $p(\lambda)=\frac{1}{N}\left\langle\sum_{n=1}^{N}\delta(\lambda-\lambda_n)\right\rangle,$

where $\langle\cdots\rangle$ represents an average across different realizations of $C$. The eigendensity is connected with the resolvent (Mézard et al., 1999; Goetschy and Skipetrov, 2013),

(S5) $g(z)=\frac{1}{N}\left\langle\mathrm{Tr}\,\frac{1}{z-C}\right\rangle=\frac{1}{N}\left\langle\sum_{n=1}^{N}\frac{1}{z-\lambda_n}\right\rangle,$

so we can compute the eigendensity using the standard inversion formula of the Stieltjes transform:

(S6) $p(\lambda)=-\frac{1}{\pi}\lim_{\eta\to 0^{+}}\mathrm{Im}\,g(\lambda+i\eta)$
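As a quick numerical illustration of Equations S4–S6 (a sketch, not part of the original analysis), the eigendensity of any sampled covariance matrix can be recovered from its resolvent evaluated slightly above the real axis:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy covariance matrix (Wishart); any Hermitian C would do.
N, T = 500, 2000
X = rng.standard_normal((N, T))
C = X @ X.T / T
eigvals = np.linalg.eigvalsh(C)

def eigendensity(lam, eigvals, eta=1e-2):
    """p(lambda) ~ -(1/pi) Im g(lambda + i*eta), with g the resolvent (Eqs. S5-S6)."""
    g = np.mean(1.0 / (lam + 1j * eta - eigvals))
    return -np.imag(g) / np.pi

lams = np.linspace(0.01, 4.0, 200)
density = np.array([eigendensity(l, eigvals) for l in lams])
print(np.trapz(density, lams))   # close to 1 for small eta
```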

Field representation

In this section, we discuss a field-theoretical representation of the resolvent g(z). First, we rewrite Equation S5 as

(S7) $g(z)=-\frac{2}{N}\,\partial_z\left\langle\ln\left[\left(\det(z-C)\right)^{-1/2}\right]\right\rangle$

The determinant $\left(\det(z-C)\right)^{-1/2}$ can be represented as a Gaussian integral

(S8) $\Xi(z)=\left(\det(z-C)\right)^{-1/2}=i^{-N/2}\int_{-\infty}^{+\infty}\frac{d\phi_1}{\sqrt{2\pi}}\cdots\frac{d\phi_N}{\sqrt{2\pi}}\exp\left[\frac{i}{2}\Phi^{T}(z-C)\Phi\right],$

where $\Phi=[\phi_1,\dots,\phi_N]^{T}$ and $i\equiv\sqrt{-1}$.

(S9) $\ln\Xi(z)=\ln\int_{-\infty}^{+\infty}\frac{d\phi_1}{\sqrt{2\pi}}\cdots\frac{d\phi_N}{\sqrt{2\pi}}\exp\left[\frac{i}{2}\Phi^{T}(z-C)\Phi\right]-\frac{i\pi N}{4}$

We thus establish a relationship between the resolvent and Ξ

(S10) $g(z)=-\frac{2}{N}\,\partial_z\left\langle\ln\Xi(z)\right\rangle$

Note that the constant term in Equation S9 is eliminated by the derivative $\partial_z$, and we will ignore it in what follows. Equation S10 is the central formula of this appendix. $\Xi(z)$ is also called the partition function in statistical physics. Our goal is to compute the average $\langle\ln\Xi(z)\rangle$.

Recall that in our ERM model (Result Equation S2 and Figure 3A), the covariance between neuron i and neuron j is determined by the distance kernel function and their neural activity variances:

(S11) $C_{ij}=f(\mathbf{x}_i-\mathbf{x}_j)\,\sigma_i\sigma_j,$

where the $\mathbf{x}_i$ are sampled from a uniform coordinate distribution $p(\mathbf{x}_i)=1/V$; the $\sigma_i$ are i.i.d. samples from a probability density $p(\sigma)$ and are independent of the neuron coordinates $\mathbf{x}_i$. The $\langle\cdots\rangle$ in Equation S10 is therefore an average over all possible $\mathbf{x}_i$ and $\sigma_i$. In order to compute $\langle\ln\Xi(z)\rangle$, we apply the replica method, based on a clever use of the identity

$\langle\ln x\rangle=\lim_{n\to 0}\frac{\langle x^{n}\rangle-1}{n}$

Equation S10 now becomes

(S12) $g(z)=-\frac{2}{N}\,\partial_z\left[\lim_{n\to 0}\frac{\langle\Xi^{n}(z)\rangle-1}{n}\right]=-\frac{2}{N}\,\partial_z\left[\lim_{n\to 0}\frac{1}{n}\ln\langle\Xi^{n}(z)\rangle\right]$

The idea is to compute the right-hand side for finite, integer $n$ and then perform the analytic continuation to $n\to 0$.
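The replica identity can be checked numerically on a simple random variable (a toy sketch, independent of the ERM ensemble):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.lognormal(mean=0.0, sigma=0.5, size=1_000_000)

lhs = np.mean(np.log(x))              # <ln x>  (equals 0 for this distribution)
for n in [1e-1, 1e-2, 1e-3]:
    rhs = (np.mean(x**n) - 1.0) / n   # (<x^n> - 1)/n
    print(n, rhs, "->", lhs)
# As n -> 0 the right-hand side converges to <ln x>.
```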

Now we seek to determine the value of $\langle\Xi^{n}(z)\rangle$. The quantity $\Xi^{n}(z)$ contains $n$ copies (replicas) of the original system:

(S13) $\Xi^{n}(z)=\left(\frac{1}{2\pi}\right)^{\frac{Nn}{2}}\int_{-\infty}^{+\infty}(d\phi_1^1\cdots d\phi_1^n)\cdots(d\phi_N^1\cdots d\phi_N^n)\exp\left[\frac{i}{2}\sum_{\alpha=1}^{n}\Phi_\alpha^{T}(z-C)\Phi_\alpha\right].$

Writing it down explicitly, we have

(S14) Ξn(z)=(12π)Nn2+(dϕ11...dϕ1n)...(dϕN1...dϕNn)LLddx1V...ddxNVp(σ1)dσ1...p(σN)dσNexp[zi2α=1nj=1N(ϕjα)2+i2α=1nj,k=1Nϕjαϕkαf(xjxk)σjσk]

In order to proceed further, we introduce the following auxiliary fields:

(S15) ψα(x)=j=1Nϕjαδ(xxj)

Equation S15 can be represented as a following functional integral

(S16) 1=+α=1nD[ψα]δF[ψα(x)j=1Nϕjαδ(xxj)]
(S17) δF[ψ]=+D[ψ^]exp[i+ddxψ(x)ψ^(x)]

or we can combine Equations S16 and S17 as

(S18) 1=++α=1nD[ψ^α]D[ψα]exp[i+ddx[ψα(x)j=1Nϕjαδ(xxj)]ψ^α(x)]

Using Equation S15, we can write the term 12j,k=1Nϕjαϕkαf(xjxk) in Equation S14 as

(S19) 12j,k=1Nϕjαϕkαf(xjxk)=12+dxdxf(xx)ψα(x)ψα(x)

We insert the relation Equations S18 and S19 into Equation S14,

(S20) Ξn(z)=(12π)Nn2+(dϕ11...dϕ1n)...(dϕN1...dϕNn)LLddx1V...ddxNVp(σ1)dσ1...p(σN)dσNexp[zi2α=1nj=1N(ϕjα)2+i2α=1nj,k=1Nϕjαϕkαf(xjxk)σjσk]++α=1nD[ψα]D[ψ^α]exp[i+ddx(ψα(x)j=1Nϕjαδ(xxj)σj)ψ^α(x)]=(12π)Nn2+α=1nD[ψα]D[ψ^α]+(dϕ11...dϕ1n)...(dϕN1...dϕNn)LLddx1V...ddxNVp(σ1)dσ1...p(σN)dσNexp[zi2α=1nj=1N(ϕjα)2+i2α=1nj,k=1Nϕjαϕkαf(xjxk)σjσk]exp[iα=1n+ddx(ψα(x)j=1Nϕjαδ(xxj)σj)ψ^α(x)]=(12π)Nn2+α=1nD[ψα]D[ψ^α]exp[i2α=1n+dxdxf(xx)ψα(x)ψα(x)]+(dϕ11...dϕ1n)...(dϕN1...dϕNn)LLddx1V...ddxNVp(σ1)dσ1...p(σN)dσNexp[zi2α=1nj=1N(ϕjα)2+iα=1n+ddx(ψα(x)j=1Nϕjαδ(xxj)σj)ψ^α(x)]=(12π)Nn2+α=1nD[ψα]D[ψ^α]exp[i2α=1n+dxdxf(xx)ψα(x)ψα(x)]exp[iα=1n+ddxψα(x)ψ^α(x)]+(dϕ11...dϕ1n)...(dϕN1...dϕNn)LLddx1V...ddxNVp(σ1)dσ1...p(σN)dσNexp[zi2α=1nj=1N(ϕjα)2iα=1n+ddxj=1Nϕjαδ(xxj)σjψ^α(x)]

Integrating the last term in Equation S20

(S21) +dϕi1...dϕinLLddriVdσip(σi)exp[zi2α=1n(ϕiα)2iα=1n+ddrϕiαδ(rri)σiψ^α(r)]=LLddriV+dϕi1...dϕindσip(σi)exp[zi2α=1n(ϕiα)2iα=1nϕiασiψ^α(ri)]=(2πzi)n2LLddriVdσip(σi)exp[i2zα=1nψ^α(ri)2σi2]=(2πzi)n2LLddrVdσp(σ)exp[i2zα=1nψ^α(r)2σ2]

so that Ξn(z) from Equation S14 can be written as

(S22) Ξn(z)=+α=1nD[ψα]D[ψ^α]ANeS0
(S23) whereA=LLddxV(zi)n2dσp(σ)exp[i2zα=1nψ^α(x)2σ2]
(S24) andS0=i2α=1n+dxdxf(xx)ψα(x)ψα(x)+iα=1n+ddxψα(x)ψ^α(x)

Integrating out the ψα in Ξn(z) Equations S22 and S24

(S25) +D[ψα]exp[i2+dxdxf(xx)ψα(x)ψα(x)+i+ddxψα(x)ψ^α(x)]=(2πi)N/2(detf)1/2exp[i2+dxdxf1(xx)ψ^α(x)ψ^α(x)]

Here, f1 is the inverse kernel satisfying:

(S26) $\int_{-\infty}^{+\infty}d\mathbf{x}'\,f(\mathbf{x}-\mathbf{x}')\,f^{-1}(\mathbf{x}'-\mathbf{x}'')=\delta(\mathbf{x}-\mathbf{x}'')$

so that Ξn(z) can be written as

(S27) Ξn(z)=(2πi)Nn2(detf)n/2+D[ψ^α]eS1
(S28) S1=NlnAi2α=1n+dxdxf1(xx)ψ^α(x)ψ^α(x)

The constant prefactor $(2\pi i)^{\frac{Nn}{2}}$ of $\langle\Xi^{n}(z)\rangle$ can be ignored because we only need $\partial_z\ln\langle\Xi(z)\rangle$ (Equation S10) in the end.

To ensure mathematical rigor in Equation S45, we next apply a Wick rotation to the $\hat{\psi}_\alpha$ fields (see the Wick rotation section of Appendix 2).

(S29) Ξn(z)=(detf)n/2+D[ψ^α]eS1
(S30) S1=NlnA12α=1n+dxdxf1(xx)ψ^α(x)ψ^α(x)
(S31) A=LLddxV(z)n2dσp(σ)exp[12zα=1nψ^α(x)2σ2]

High-density expansion

In this section, we directly calculate the canonical partition function $\langle\Xi^{n}(z)\rangle$ in the $z\to\infty$ limit by approximating the term $N\ln A$ (Equation S30) by a quadratic action, so that the partition function (Equation S29) becomes a Gaussian integral.

Let us first calculate $A^{N}$ in the $z\to\infty$ limit:

(S32) limzA(z)n2dσp(σ)[1+LLddxV12zα=1nψ^α(x)2σ2]=(z)n2[1+dσp(σ)σ2LLddxV12zα=1nψ^α(x)2]=(z)n2[1+E(σ2)LLddxV12zα=1nψ^α(x)2]
(S33) limzAN=limz(z)Nn2[1+NE(σ2)LLddxV12zα=1nψ^α(x)2]=limz(z)Nn2[1+NE(σ2)α=1nLLddxV12zψ^α(x)2](z)Nn2exp[E(σ2)LLddxVN2zα=1nψ^α(x)2]

Now let us calculate $\langle\Xi^{n}(z)\rangle$ (Equations S29–S31) by letting $L\to\infty$:

(S34) Ξn(z)=(detf)n/2(z)Nn2+D[ψ^α]eSh

where the high-density quadratic action

(S35) Sh=E(σ2)ddxVN2zα=1nψ^α(x)212α=1n+dxdxf1(xx)ψ^α(x)ψ^α(x)=12α=1n+dxdxG1(xx)ψ^α(x)ψ^α(x)

where $G^{-1}(\mathbf{x}-\mathbf{y})=f^{-1}(\mathbf{x}-\mathbf{y})-\frac{NE(\sigma^2)}{Vz}\delta(\mathbf{x}-\mathbf{y})$. Next, by integrating out the $\hat{\psi}$ field, we find

(S36) Ξn(z)=(detf)n/2(z)Nn2+D[ψ^α]eSh=(zNdetfdet(G1))n/2

Using Equation S12 that connects the partition function with the resolvent, we have

(S37) $g(z)=-\frac{2}{N}\,\partial_z\left[\lim_{n\to 0}\frac{1}{n}\ln\left(\left(\det(zfG^{-1})\right)^{-n/2}\right)\right]=\frac{V}{N}\,\partial_z\int_{-\infty}^{+\infty}\frac{d^dk}{(2\pi)^d}\ln\left(z-\frac{NE(\sigma^2)\tilde{f}(k)}{V}\right)=\frac{1}{\rho}\int_{-\infty}^{+\infty}\frac{d^dk}{(2\pi)^d}\frac{1}{z-\rho E(\sigma^2)\tilde{f}(k)},$

where f~(k) is the Fourier transform of f(x).

Finally, the eigendensity p(λ) (Equation S6) is given by

(S38) $p(\lambda)=-\frac{1}{\pi}\lim_{\eta\to 0^{+}}\mathrm{Im}\left(g(\lambda+i\eta)\right)=\frac{1}{\rho}\int_{-\infty}^{+\infty}\frac{d^dk}{(2\pi)^d}\,\delta\!\left(\lambda-\rho E(\sigma^2)\tilde{f}(k)\right)=\frac{1}{\rho E(\sigma^2)}\int_{-\infty}^{+\infty}\frac{d^dk}{(2\pi)^d}\,\delta\!\left(\frac{\lambda}{E(\sigma^2)}-\rho\tilde{f}(k)\right)$

Derivation of power-law eigenspectrum in high-density limit

Here we calculate the eigendensity of our model with the kernel function $f(\mathbf{x})$ (Table 3). Equation S38 (setting $E(\sigma^2)=1$ as in Results) can be written as:

(S39) $p(\lambda)=\frac{S_{d-1}}{(2\pi)^d}\frac{k_0^{d-1}}{\rho^2\left|\tilde{f}'(k_0)\right|},\qquad k_0=\tilde{f}^{-1}\!\left(\frac{\lambda}{\rho}\right),$

where $S_{d-1}$ is the surface area of the $(d-1)$-dimensional unit sphere. Here, we consider the approximation $f(\mathbf{x})\approx\epsilon^{\mu}\|\mathbf{x}\|^{-\mu}$, whose Fourier transform and its derivative are $\tilde{f}(k)=c_0k^{-(d-\mu)}$ and $\tilde{f}'(k)=-c_1k^{-(d-\mu+1)}$, with $k_0=\tilde{f}^{-1}\!\left(\frac{\lambda}{\rho}\right)=\left(\frac{\lambda}{c_0\rho}\right)^{-\frac{1}{d-\mu}}$. The constants are $c_0=2^{d-\mu}\pi^{\frac{d}{2}}\epsilon^{\mu}\frac{\Gamma\left(\frac{d-\mu}{2}\right)}{\Gamma\left(\frac{\mu}{2}\right)}=\epsilon^{\mu}c_2$, $c_1=(d-\mu)c_0$, and $c_2=2^{d-\mu}\pi^{\frac{d}{2}}\frac{\Gamma\left(\frac{d-\mu}{2}\right)}{\Gamma\left(\frac{\mu}{2}\right)}$.

(S40) $p(\lambda)=\frac{S_{d-1}}{(2\pi)^d}\frac{k_0^{d-1}}{\rho^2\left|\tilde{f}'(k_0)\right|}=\frac{S_{d-1}}{(2\pi)^d}\frac{k_0^{2d-\mu}}{\rho^2 c_1}=\frac{S_{d-1}}{(2\pi)^d}\frac{c_0^{\frac{d}{d-\mu}}}{\rho^2(d-\mu)}\left(\frac{\lambda}{\rho}\right)^{-\frac{2d-\mu}{d-\mu}}=\frac{S_{d-1}}{(2\pi)^d}\frac{c_2^{\frac{d}{d-\mu}}}{d-\mu}\,\lambda^{-\frac{2d-\mu}{d-\mu}}\left(\rho\epsilon^{d}\right)^{\frac{\mu}{d-\mu}}$
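A sketch comparing this high-density prediction with direct sampling of an ERM covariance. The short-distance flattening of the kernel (capping $f$ at 1 below the distance $\epsilon$) and all parameter values are illustrative assumptions, not the exact kernel used in the main analysis:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(2)
d, mu, eps, L = 2, 0.5, 0.03125, 10.0

def erm_covariance(N):
    """ERM correlation matrix with f(x) = eps^mu * max(|x|, eps)^(-mu), so f(0) = 1."""
    x = rng.uniform(0, L, size=(N, d))
    dist = squareform(pdist(x))
    return eps**mu * np.maximum(dist, eps) ** (-mu)

for N in (1024, 4096):                        # two sampling scales, rho = N / L^d
    lam = np.sort(np.linalg.eigvalsh(erm_covariance(N)))[::-1]
    # When mu/d is small, the eigenvalue at a fixed normalized rank should be nearly
    # the same at both N, reflecting the weak (rho*eps^d)^(mu/(d-mu)) dependence in S40.
    print(N, lam[:5])
```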

Derivation of eigenspectrum with exponential kernel function in high-density limit

Here we consider the exponential kernel function $f(\mathbf{x})=e^{-b\|\mathbf{x}\|}$, whose Fourier transform and its derivative are $\tilde{f}(k)=c_1(b^2+k^2)^{-\frac{d+1}{2}}$ and $\tilde{f}'(k)=-(d+1)k\,c_1(b^2+k^2)^{-\frac{d+3}{2}}$, with $k_0=\tilde{f}^{-1}\!\left(\frac{\lambda}{\rho}\right)=\sqrt{\left(\frac{c_1\rho}{\lambda}\right)^{\frac{2}{d+1}}-b^2}$, that is, $k_0^2+b^2=\left(\frac{c_1\rho}{\lambda}\right)^{\frac{2}{d+1}}$, where $c_1=2^{d}\pi^{\frac{d-1}{2}}b\,\Gamma\!\left(\frac{d+1}{2}\right)$.

(S41) $p(\lambda)=\frac{S_{d-1}}{(2\pi)^d}\frac{k_0^{d-1}}{\rho^2\left|\tilde{f}'(k_0)\right|}=\frac{S_{d-1}}{(2\pi)^d}\frac{(b^2+k_0^2)^{\frac{d+3}{2}}k_0^{d-1}}{\rho^2(d+1)k_0c_1}=\frac{S_{d-1}}{(2\pi)^d}\frac{\left(\frac{c_1\rho}{\lambda}\right)^{\frac{d+3}{d+1}}k_0^{d-2}}{(d+1)\rho^2c_1}=\frac{S_{d-1}}{(d+1)(2\pi)^d}\,c_1^{\frac{2}{d+1}}\rho^{-\frac{d-1}{d+1}}\lambda^{-\frac{d+3}{d+1}}k_0^{d-2}=\frac{S_{d-1}}{(d+1)(2\pi)^d}\,2^{\frac{2d}{d+1}}\pi^{\frac{d-1}{d+1}}\Gamma\!\left(\tfrac{d+1}{2}\right)^{\frac{2}{d+1}}\left(\rho b^{-d}\right)^{-\frac{d-1}{d+1}}\lambda^{-\frac{d+3}{d+1}}\left(\left(\frac{2^{d}\pi^{\frac{d-1}{2}}\Gamma\!\left(\frac{d+1}{2}\right)\rho b^{-d}}{\lambda}\right)^{\frac{2}{d+1}}-1\right)^{\frac{d-2}{2}}$

It is straightforward to see that this spectrum is not scale-invariant. For example, when $d=2$, the expression above reduces to a perfect power-law spectrum $p(\lambda)\propto\rho^{-\frac{d-1}{d+1}}\lambda^{-\frac{d+3}{d+1}}=\rho^{-1/3}\lambda^{-5/3}$, whose prefactor depends on the density $\rho$ and therefore changes under sampling.

Variational approximation

To find a general approximation for the eigenspectrum that goes beyond the high-density limit, we use Gaussian variational approximation in the field representation, namely by looking for the best quadratic action Sv,

(S42) $S_v=-\frac{1}{2}\sum_{\alpha\beta}^{n}\int_{-\infty}^{+\infty}d\mathbf{x}\,d\mathbf{x}'\,G^{-1}_{\alpha\beta}(\mathbf{x}-\mathbf{x}')\hat{\psi}_\alpha(\mathbf{x})\hat{\psi}_\beta(\mathbf{x}'),$

to approximate the action S1 in the partition function (Equations S29–S31). This enables us to represent the partition function by a Gaussian integral, which can be evaluated analytically. We find the best quadratic action Sv by minimizing the difference between S1 and Sv, which is defined as KL divergence between two distributions that are proportional to eS1 and eSv.

In this section, we proceed using the grand canonical ensemble formulation: instead of fixing the covariance matrix size $N$, the average in Equation S4 is now carried out across all possible sizes. If $N$ follows a Poisson distribution, it is easy to show (see the grand canonical ensemble section of Appendix 2) that the grand canonical partition function is given by Equation S116:

$Z=\sum_{N}\langle\Xi_N^{n}(z)\rangle\frac{a^{N}}{N!},$

where $a=\langle N\rangle$. As a result, the new action $S_1$ becomes

(S43) $S_1=NA-\frac{1}{2}\sum_{\alpha=1}^{n}\int_{-\infty}^{+\infty}d\mathbf{x}\,d\mathbf{x}'\,f^{-1}(\mathbf{x}-\mathbf{x}')\hat{\psi}_\alpha(\mathbf{x})\hat{\psi}_\alpha(\mathbf{x}').$

Here and below, N should be viewed as the average matrix size. The resolvent g(z) in Equation S12 can be similarly generalized to Equation S117,

$g(z)=-\lim_{n\to 0}\frac{2}{Nn}\,\partial_z\ln Z$

As in statistical physics, we define the free energy as

(S44) F1=lnZ=ln+D[ψ^α]eS1

We shall define the variational free energy Fv such that it would approximate the true free energy F1 by minimizing DKL(Pv||P1),

(S45) $F_v=D_{KL}(P_v\|P_1)+F_1$

where

(S46) P1=eS1+D[ψ^α]eS1
(S47) Pv=eSv+D[ψ^α]eSv

The KL divergence DKL(Pv||P1) is always nonnegative and the free energy F1 is independent of the quadratic action Sv. Therefore, we need to minimize the variational free energy Fv. Let us now examine the variational free energy Fv

(S48) Fv=DKL(Pv||P1)+F1=1Zv+D[ψ^α]eSvlnPvP1lnZ=1Zv+D[ψ^α]eSv(SvS1ln+D[ψ^α]eSv+ln+D[ψ^α]eS1)lnZ=1Zv+D[ψ^α]eSv(SvS1)lnZv

Here Zv is the normalization factor

(S49) $Z_v=\int_{-\infty}^{+\infty}D[\hat{\psi}_\alpha]\,e^{S_v}$

Since we want to minimize Fv, the constant term

(S50) 1Zv+D[ψ^α]eSvSv=const

can be ignored, and Equation S48 reduces to

(S51) Fv=1Zv+D[ψ^α]eSvS1lnZv

To simplify the formula, let us introduce S2

(S52) S2=12α=1n+dxdxf1(xx)ψ^α(x)ψ^α(x)

and rewrite Equation S51 as

(S53) Fv=1Zv+D[ψ^α]eSvS21Zv+D[ψ^α]eSvNAlnZv

Next, we will compute each term in the variational free energy Fv First, we calculate the third term lnZv in Equation S53 by Equations S42 and S49

(S54) lnZv=ln(α,β(2π)N/2(det(Gαβ1))12)=α,β12lndet(Gαβ)+n2N2ln(2π)

Second, we calculate the first term 1Zv+D[ψ^α]eSvS2 in Equation S53

(S55) 1ZvD[ψ^α]eSvS2=1Zvlimh0hD[ψ^α]eSv+hS2=1Zvlimh0hα=β[det(Gαβ1+hf1)]12αβ[det(Gαβ1)]12=limh0hα[det(I+hf1Gαα)]12=αnhlimh0(1h2Tr(f1Gαα))=αn12Tr(f1Gαα)

Third, we calculate the second term 1Zv+D[ψ^α]eSvNA in Equation S53, recall the term A (Equation S31)

A=LLddxV(z)n2dσp(σ)exp[12zα=1nψ^α(x)2σ2]
(S56) 1Zv+D[ψ^α]eSvNA=N(z)n2Zvdσp(σ)+D[ψ^α]eSvLLddxVexp[12zα=1nψ^α(x)2σ2]=N(z)n2Zvdσp(σ)LLddx0V+D[ψ^α]exp[Sv+12zα=1nψ^α(x)2σ2]=N(z)n2Zvdσp(σ)LLddx0Vα,β[det(Kαβ)]12=N(z)n2dσp(σ)LLddx0Vα,β[det(KαβGαβ1)]12

where

(S57) Sv+12zα=1nψ^α(x)2σ2=12αβn+dxdxGαβ1(xx)ψ^α(x)ψ^β(x)+12zα=1nψ^α(x)2σ2=12αβn+dxdxKαβ1(xx)ψ^α(x)ψ^β(x)
Kαβ1(x,y)=Gαβ1(x,y)σ2zδαβδ(xxo)δ(yx0)
(S58) det(Kαβ1Gαβ)=1σ2zδαβG(x0,x0)
(S59) 1Zv+D[ψ^α]eSvNA=N(z)n2dσp(σ)LLddx0Vα,β[det(Kαβ1Gαβ)]12=N(z)n2dσp(σ)LLddx0Vα[det(Kαα1Gαα)]12=N(z)n2dσp(σ)LLddx0Vα(1σ2zGαα(x0,x0))12=N(z)n2dσp(σ)α(1σ2zGαα(0))12=N(z)n2dσp(σ)exp(12Trnln(1σ2zddk(2π)dG~(k)))

In sum, the variational free energy Fv is equal to

(S60) Fv=α12Tr(f1Gαα)α,β12ln(det(Gαβ))N(z)n2dσp(σ)exp(12Trnln(1σ2zddk(2π)dG~(k)))=αV2ddk(2π)dG~(k)f~(k)V2ddk(2π)dα,βln(G~αβ(k))N(z)n2dσp(σ)exp(12Trnln(1σ2zddk(2π)dG~(k)))

Now let us find the best quadratic action Sv that minimizes the variational free energy Fv

(S61) $\frac{\delta F_v}{\delta\tilde{G}_{\alpha\beta}}=0$

The solution of Equation S61 is given by

(S62) $\tilde{G}^{-1}_{\alpha\beta}(k)=\delta_{\alpha\beta}\tilde{G}^{-1}(k)$
(S63) $\frac{1}{\tilde{f}(k)}-\int d\sigma\,p(\sigma)\frac{\rho\sigma^2}{z-\sigma^2\int D\mathbf{k}'\,\tilde{G}(\mathbf{k}')}-\frac{1}{\tilde{G}(k)}=0,$

where $\int D\mathbf{k}\equiv\int\frac{d^dk}{(2\pi)^d}$. By using Equation S117, we finally obtain

(S64) $g(z)=\lim_{n\to 0}\frac{2}{nN}\,\partial_z F_1\approx\lim_{n\to 0}\frac{2}{nN}\,\partial_z F_v=\int d\sigma\,p(\sigma)\frac{1}{z-\sigma^2\int D\mathbf{k}\,\tilde{G}(\mathbf{k})}$

Scale invariance of the covariance spectrum in the Gaussian variational Model

In Result, we point to two factors that contribute to the scale invariance of the eigenspectrum using the high-density theory. In this section, we show that the same conclusion can be drawn using the Gaussian variational method. Furthermore, we examine how the heterogeneity of neural activity influences the eigendensity calculated by the Gaussian variational model. We show that $\partial p(\lambda)/\partial\rho$, which characterizes the change of the eigendensity due to sampling in the functional space, decreases with the heterogeneity of neural activity described by higher-order moments of the activity variance, for example $E(\sigma^4)$.

Let us rewrite Equation S63 as

(S65) $G=\int D\mathbf{k}\,\tilde{G}(\mathbf{k})=\int D\mathbf{k}\,\frac{\tilde{f}(\mathbf{k})}{1-M(z)\tilde{f}(\mathbf{k})},\qquad M(z)=\int d\sigma\,p(\sigma)\frac{\rho\sigma^2}{z-\sigma^2G(z)}$

To present a formal expression for the eigendensity, let us define $\mathrm{Re}(G)\equiv g_r$ and $\mathrm{Im}(G)\equiv g_i$. From Equations S6 and S64, we find

(S66) $p(\lambda,\rho)=\frac{1}{\pi}\left\langle\frac{\sigma^2g_i}{(\lambda-\sigma^2g_r)^2+\sigma^4g_i^2}\right\rangle_\sigma,$

where $\langle\cdots\rangle_\sigma=\int\cdots\,p(\sigma)\,d\sigma$.

A direct computation of Equation S66, however, remains difficult: the complication arises from the complex function $M(z)$ in Equation S65, which in turn is an integral function of $G$. To streamline the calculation, let us further define $\mathrm{Re}(M)\equiv\rho a$ and $\mathrm{Im}(M)\equiv\rho b$. Writing these out explicitly, we have

(S67) $a=\left\langle\frac{\sigma^2(\lambda-\sigma^2g_r)}{(\lambda-\sigma^2g_r)^2+\sigma^4g_i^2}\right\rangle_\sigma$
(S68) $b=\left\langle\frac{\sigma^4g_i}{(\lambda-\sigma^2g_r)^2+\sigma^4g_i^2}\right\rangle_\sigma$

The real and imaginary part of G can now be expressed as functions of a and b. Integrating Equation S65 in the spherical coordinates, we have

(S69) $g_r(\rho)=\frac{S_{d-1}}{(2\pi)^d}\int_{\pi/L}^{\pi/\epsilon}dk\,k^{d-1}\,\frac{\tilde{f}(k)\left[1-\rho a\tilde{f}(k)\right]}{\left[1-\rho a\tilde{f}(k)\right]^2+\rho^2b^2\tilde{f}^2(k)},\qquad g_i(\rho)=\frac{S_{d-1}}{(2\pi)^d}\int_{\pi/L}^{\pi/\epsilon}dk\,k^{d-1}\,\frac{\rho b\tilde{f}^2(k)}{\left[1-\rho a\tilde{f}(k)\right]^2+\rho^2b^2\tilde{f}^2(k)},$

where, for clarity, we have slightly abused notation by writing $k=\|\mathbf{k}\|$; $S_{d-1}$ is the surface area of the unit sphere in the $d$-dimensional momentum space. In order to evaluate the integrals analytically, we introduce an ultraviolet cutoff $\pi/\epsilon$. Numerically, whether we integrate up to $\pi/\epsilon$ or beyond this bound makes little difference.

Numerical solution of the Gaussian variational method

With Equations S66–S69, we numerically calculate the eigendensity iteratively from the following steps:

  • Step 1: set the initial values of a and b as a0=1, b0=1

  • Step 2: solve for a in Equation S67 with fixed b

  • Step 3: solve for b in Equation S68 with fixed a

  • Step 4: iterate Steps 2 and 3 10 times

  • Step 5: calculate p(λ) using Equation S66

Note that we plug Equation S69 into Equations S67 and S68 in step 2–3.
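A minimal sketch of this iteration for the special case $p(\sigma)=\delta(\sigma-1)$, so that the $\sigma$-averages in Equations S67 and S68 are trivial; the power-law approximation of $\tilde{f}(k)$ and the parameter values below are assumptions for illustration only:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

# Illustrative parameters (not the fitted values for any specific dataset)
d, mu, eps, L, rho = 2, 0.5, 0.03125, 10.0, 10.24
c0 = 2**(d - mu) * np.pi**(d / 2) * eps**mu * gamma((d - mu) / 2) / gamma(mu / 2)
S_dm1 = 2 * np.pi**(d / 2) / gamma(d / 2)          # surface area of the unit (d-1)-sphere
f_t = lambda k: c0 * k**(-(d - mu))                # power-law approximation of f~(k)

def g_parts(a, b):
    """g_r and g_i from Equation S69, integrating k from pi/L to pi/eps."""
    pref = S_dm1 / (2 * np.pi)**d
    denom = lambda k: (1 - rho * a * f_t(k))**2 + (rho * b * f_t(k))**2
    gr = quad(lambda k: k**(d - 1) * f_t(k) * (1 - rho * a * f_t(k)) / denom(k),
              np.pi / L, np.pi / eps, limit=200)[0]
    gi = quad(lambda k: k**(d - 1) * rho * b * f_t(k)**2 / denom(k),
              np.pi / L, np.pi / eps, limit=200)[0]
    return pref * gr, pref * gi

def eigendensity(lam, n_iter=10):
    a, b = 1.0, 1.0                                  # Step 1: initial values
    for _ in range(n_iter):                          # Steps 2-4: alternate updates
        gr, gi = g_parts(a, b)
        a = (lam - gr) / ((lam - gr)**2 + gi**2)     # Equation S67 with sigma = 1
        gr, gi = g_parts(a, b)
        b = gi / ((lam - gr)**2 + gi**2)             # Equation S68 with sigma = 1
    gr, gi = g_parts(a, b)
    return gi / (np.pi * ((lam - gr)**2 + gi**2))    # Step 5: Equation S66

print(eigendensity(5.0))
```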

Two contributing factors on the scale invariance

We next derive an analytical expression for Equation S69 by considering the approximate power-law kernel function $f(\mathbf{x})\approx\epsilon^{\mu}\|\mathbf{x}\|^{-\mu}$, $\mu>0$, from which the high-density results on scale invariance can be extended.

By the change of variable $x=\tilde{f}(k)\sim\epsilon^{\mu}k^{-(d-\mu)}$, and letting $x_\epsilon\equiv\tilde{f}(\pi/\epsilon)$ and $x_L\equiv\tilde{f}(\pi/L)$, we have

(S70) $g_i(\rho)\sim\frac{\epsilon^{\frac{\mu d}{d-\mu}}}{d-\mu}\int_{x_\epsilon}^{x_L}dx\,\frac{\rho b\,x^{-\frac{\mu}{d-\mu}}}{\left[1-\rho ax\right]^2+\rho^2b^2x^2},$

where $\sim$ indicates that all constant numerical factors (e.g., $\pi$ and $\Gamma(d/2)$) are ignored. To compute Equation S70, we introduce a branch cut along $[0,\infty)$ and perform a contour integral in the complex plane following the path in Appendix 2—figure 1A. When $0<\beta=1-\frac{\mu}{d-\mu}<2$, the integrals on the large circle $\Gamma_R$ and the small circle $\Gamma_\epsilon$ go to zero as $x_L\to\infty$ and $x_\epsilon\to 0$, leaving only two simple poles (zeros of the function in the denominator) in the complex plane. By applying the residue theorem, we find an expression for $g_i$ in the limit $L\to\infty$, $\epsilon\to 0$:

(S71) cosθ=aa2+b2β=d2μdμgiϵμddμdμsin(β1)θsinθsinπβπbρ1β(a2+b2)β/2

The analytical expression for gr is a bit more involving.

(S72) $g_r(\rho)\sim\frac{\epsilon^{\frac{\mu d}{d-\mu}}}{d-\mu}\int_{x_\epsilon}^{x_L}dx\,\frac{x^{-\frac{d}{d-\mu}}}{\left[1-\rho ax\right]^2+\rho^2b^2x^2}-\frac{a}{b}g_i$

It has two terms; the second term is similar to Equation S70. The first term, however, diverges as $x_\epsilon\to 0$. Thus, the radius of the small circle $\Gamma_\epsilon$ in Appendix 2—figure 1 cannot shrink to zero: this is precisely the requirement of an ultraviolet cutoff in the wave vector $k$. The contour integral on the large circle $\Gamma_R$, on the other hand, goes to zero as $x_L\to\infty$. Thus, the integral on $\Gamma_\epsilon$ contributes to the final result. By considering the leading-order term in $x_\epsilon$ for finite but small $x_\epsilon$, we find

(S73) cosθ=aa2+b2γ=μdμgrϵμddμdμxϵγγϵμddμdμsin(γ1)θsinθsinπγπργ(a2+b2)γ/2abgi

Recall $x_\epsilon\sim\epsilon^{\mu}(\pi/\epsilon)^{-(d-\mu)}$; we then find that the first term in $g_r$ is proportional to $\pi^{\mu}/\mu$ and independent of $\epsilon$.

Appendix 2—figure 1
Calculate gi and gr.

(A) The path of the contour integral for $g_i$, $g_r$ (Equation S70). (B, C) Heatmaps of $g_r$ and $g_i$ as functions of $\lambda$ and $\rho$, calculated by the numerical method (Methods). The parameters are $N=1024$, $\rho=10.24$, $d=2$, $L=10$, $\mu=0.5$, $\epsilon=0.03125$; $\sigma_i^2$ is i.i.d. sampled from a log-normal distribution with zero mean and a standard deviation of 0.5 in the natural logarithm of the $\sigma_i^2$ values, and we normalize $E(\sigma_i^2)=1$.

According to Equations S71 and S73, one can immediately see that as $\mu/d\to 0$, the $\rho$-dependence vanishes for both $g_r$ and $g_i$. We therefore conclude that a slower power-law decay in the kernel function and/or a higher dimension of the functional space are two contributing factors for the scale invariance of the covariance spectrum.

Heterogeneity of neural activity across neurons enhances scale invariance

Next, we take a closer look at how the eigendensity changes with $\rho$ for finite but small $\mu/d$ and for $\lambda\gg 1$. Using Equation S66, we have

(S74) $\frac{\partial p}{\partial\rho}=\frac{1}{\pi}\left\langle\frac{\partial g_i}{\partial\rho}\,\frac{\sigma^2\left[(\lambda-\sigma^2g_r)^2+\sigma^4g_i^2\right]-2\sigma^6g_i^2}{\left[(\lambda-\sigma^2g_r)^2+\sigma^4g_i^2\right]^2}+\frac{\partial g_r}{\partial\rho}\,\frac{2\sigma^4g_i(\lambda-\sigma^2g_r)}{\left[(\lambda-\sigma^2g_r)^2+\sigma^4g_i^2\right]^2}\right\rangle_\sigma$

From numerical calculations, we find that typically $g_r\gg g_i$, so one can use the approximation

(S75) $\frac{\partial p}{\partial\rho}\approx\frac{1}{\pi}\left\langle\frac{\partial g_i}{\partial\rho}\,\frac{\sigma^2}{(\lambda-\sigma^2g_r)^2+\sigma^4g_i^2}\right\rangle_\sigma+\frac{1}{\pi}\left\langle\frac{\partial g_r}{\partial\rho}\,\frac{2\sigma^4g_i}{(\lambda-\sigma^2g_r)^3}\right\rangle_\sigma$

Recall Equation S66

(S76) $p(\lambda,\rho)=\frac{1}{\pi}\left\langle\frac{\sigma^2g_i}{(\lambda-\sigma^2g_r)^2+\sigma^4g_i^2}\right\rangle_\sigma.$

Since p(λ,ρ) is very small for large λ, a more appropriate measure is to examine

(S77) $\frac{\partial\log p}{\partial\rho}\equiv\frac{1}{p}\frac{\partial p}{\partial\rho}\approx\frac{\partial g_i}{\partial\rho}\frac{1}{g_i}+2\frac{\partial g_r}{\partial\rho}\,\frac{\left\langle\sigma^4/(\lambda-\sigma^2g_r)^3\right\rangle_\sigma}{\left\langle\sigma^2/(\lambda-\sigma^2g_r)^2\right\rangle_\sigma}$

Considering the large eigenvalue case λσ2gr (the numerical value of gr is on the order of 1), we perform Taylor expansion and arrive at

(S78) $\left\langle\frac{\sigma^2}{(\lambda-\sigma^2g_r)^2}\right\rangle_\sigma\approx\left\langle\frac{\sigma^2}{\lambda^2}+\frac{2\sigma^4g_r}{\lambda^3}+\frac{3\sigma^6g_r^2}{\lambda^4}\right\rangle_\sigma$
(S79) $\left\langle\frac{\sigma^4}{(\lambda-\sigma^2g_r)^3}\right\rangle_\sigma\approx\left\langle\frac{\sigma^4}{\lambda^3}+\frac{3\sigma^6g_r}{\lambda^4}\right\rangle_\sigma$

Note that $\langle\sigma^2\rangle_\sigma\equiv E(\sigma^2)$ is normalized to 1.

(S80) $\frac{\partial\log p}{\partial\rho}\approx\frac{\partial g_i}{\partial\rho}\frac{1}{g_i}+2\frac{\partial g_r}{\partial\rho}\,\frac{\left\langle\sigma^4/(\lambda-\sigma^2g_r)^3\right\rangle_\sigma}{\left\langle\sigma^2/(\lambda-\sigma^2g_r)^2\right\rangle_\sigma}\approx\frac{\partial g_i}{\partial\rho}\frac{1}{g_i}+2\frac{\partial g_r}{\partial\rho}\,\frac{\langle\sigma^4\rangle_\sigma+\frac{3g_r}{\lambda}\langle\sigma^6\rangle_\sigma}{\lambda+2g_r\langle\sigma^4\rangle_\sigma+\frac{3g_r^2}{\lambda}\langle\sigma^6\rangle_\sigma}$

By examining Equations S71 and S73, we find that when $\lambda\gg g_r$ (so that $a\gg b$ and $\theta\approx\pi$), $g_r$ decays weakly with $\rho$ while $g_i$ increases weakly with $\rho$ (also confirmed by numerical calculation, Appendix 2—figure 1B, C):

$\frac{\partial g_r}{\partial\rho}<0,\qquad\frac{\partial g_i}{\partial\rho}>0.$

It is therefore straightforward to see from Equation S80 that higher-order moments of the activity variance (e.g., $E(\sigma^4)$) contribute to reducing the $\rho$-dependence of the eigendensity function.

The relationship between CI and eigendensity

In this section, we show how the CI introduced in Methods is related to Equation S80, namely how the eigendensity changes with the neuronal density in the functional space. Recall the definition of CI in Equation S13:

$\mathrm{CI}:=\frac{1}{\log(q_0/q_1)}\int_{\log q_1}^{\log q_0}\left|\frac{\partial\log\lambda(q)}{\partial\log\rho}\right|\,d\log q,$

where

$q(\lambda)=\int_{\lambda}^{\infty}p(\lambda')\,d\lambda'.$

We use implicit differentiation to compute $\frac{\partial\log\lambda(q)}{\partial\log\rho}$. For clarity, we write the function $q(\lambda,\rho)$, explicitly involving $\lambda$ and $\rho$, as $Q(\lambda,\rho)$ in Equations S81–S83.

(S81) $F(\lambda(q,\rho),q,\rho)=Q(\lambda(q,\rho),\rho)-q\equiv 0$
(S82) $\frac{dF(\lambda(q,\rho),q,\rho)}{d\rho}=\frac{\partial F(\lambda(q,\rho),q,\rho)}{\partial\rho}+\frac{\partial F(\lambda(q,\rho),q,\rho)}{\partial\lambda}\frac{\partial\lambda(q,\rho)}{\partial\rho}=0$
(S83) $\frac{\partial\lambda(q,\rho)}{\partial\rho}=-\frac{\partial F(\lambda(q,\rho),q,\rho)/\partial\rho}{\partial F(\lambda(q,\rho),q,\rho)/\partial\lambda}=-\frac{\partial Q(\lambda(q,\rho),\rho)/\partial\rho}{\partial Q(\lambda(q,\rho),\rho)/\partial\lambda}$

Now we can write CI as

(S84) $\frac{\partial\log\lambda(q,\rho)}{\partial\log\rho}=\frac{\rho}{\lambda(q,\rho)}\frac{\partial\lambda(q,\rho)}{\partial\rho}=-\frac{\rho}{\lambda(q,\rho)}\frac{\partial q(\rho,\lambda)/\partial\rho}{\partial q(\rho,\lambda)/\partial\lambda},$

from which we arrive at Equation S15 in Methods:

(S85) $\mathrm{CI}=\frac{1}{\log(q_0/q_1)}\int_{\log q_1}^{\log q_0}\left|\frac{\partial\log\lambda(q,\rho)}{\partial\log\rho}\right|d\log q=\frac{1}{\log(q_0/q_1)}\int_{q_1}^{q_0}\left|\frac{\rho}{q\lambda}\frac{\partial q/\partial\rho}{\partial q/\partial\lambda}\right|dq=\frac{1}{\log(q_0/q_1)}\int_{\lambda(q_1)}^{\lambda(q_0)}\left|\frac{\rho}{q\lambda}\frac{\partial q/\partial\rho}{\partial q/\partial\lambda}\right|\frac{\partial q}{\partial\lambda}\,d\lambda=\frac{1}{\log(q_0/q_1)}\int_{\lambda(q_0)}^{\lambda(q_1)}\left|\frac{\rho}{q\lambda}\frac{\partial q}{\partial\rho}\right|d\lambda$

Finally, we can rewrite CI as a function of $\frac{\partial p}{\partial\rho}$ using a double integral:

(S86) $\mathrm{CI}=\frac{1}{\log(q_0/q_1)}\int_{\lambda(q_0)}^{\lambda(q_1)}\left|\frac{\rho}{q\lambda}\frac{\partial q}{\partial\rho}\right|d\lambda=\frac{1}{\log(q_0/q_1)}\int_{\lambda(q_0)}^{\lambda(q_1)}d\lambda_1\left|\frac{\rho}{q\lambda_1}\int_{\lambda_1}^{\infty}d\lambda_2\,\frac{\partial p(\lambda_2)}{\partial\rho}\right|=\frac{1}{\log(q_0/q_1)}\int_{\lambda(q_0)}^{\lambda(q_1)}\frac{d\lambda_1}{\lambda_1}\left|\frac{\int_{\lambda_1}^{\infty}d\lambda_2\,p(\lambda_2)\frac{\partial\ln p(\lambda_2)}{\partial\ln\rho}}{\int_{\lambda_1}^{\infty}d\lambda_2\,p(\lambda_2)}\right|$
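The following sketch estimates the CI directly from two empirical spectra at different sampling densities via a finite-difference version of the definition above; the interpolation scheme and the choice of $q_0$, $q_1$ are illustrative assumptions and may differ from the estimator used in Methods:

```python
import numpy as np

def collapse_index(eig_low, eig_high, rho_low, rho_high, q0=0.5, q1=0.01, n_q=100):
    """Finite-difference estimate of the CI from spectra at two densities.

    eig_low, eig_high : covariance eigenvalues at densities rho_low < rho_high
    q0, q1            : integration limits in normalized rank (q1 < q0)
    """
    log_q = np.linspace(np.log(q1), np.log(q0), n_q)
    q = np.exp(log_q)

    def lam_of_q(eigs):
        eigs = np.sort(eigs)[::-1]
        ranks = np.arange(1, len(eigs) + 1) / len(eigs)   # normalized rank
        return np.interp(q, ranks, eigs)

    dlog_lam = np.log(lam_of_q(eig_high)) - np.log(lam_of_q(eig_low))
    dlog_rho = np.log(rho_high) - np.log(rho_low)
    integrand = np.abs(dlog_lam / dlog_rho)
    return np.trapz(integrand, log_q) / np.log(q0 / q1)
```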

Compare high-density theory and Gaussian variational method

This section aims to determine the conditions under which the high-density approximation aligns with the simulation results. To this end, we begin by comparing the kernel operator $\tilde{G}_h(k)$ in the high-density quadratic action and $\tilde{G}_v(k)$ in the variational approximation. We identify the condition under which the high-density theory agrees with the variational method as well as the numerical simulation, namely $z\gg\int\frac{d^dk}{(2\pi)^d}\tilde{G}_v(k)$. Second, we give a precise re-derivation of the high-density result by incorporating this condition into the variational approximation. Finally, we substitute $\int\frac{d^dk}{(2\pi)^d}\tilde{G}_v(k)$ with $\int\frac{d^dk}{(2\pi)^d}\tilde{G}_h(k)$ and estimate the parameter regime where the high-density theory agrees with the numerical simulation. This analysis yields a deeper understanding of the relationship between the high-density theory and the variational method, and of how both relate to simulation results.

A simple comparison of the two methods

For the sake of simplicity, we consider the correlation matrix with p(σ)=δ(σ1) in this section. Returning to the explicit result Equation S29–S31 ,

(S87) Ξn(z)=(detf)n/2(z)Nn2+D[ψ^α]eS1

In the high-density approximation (Equation S35)

(S88) Sh=LLddxVN2zα=1nψ^α(x)212α=1n+dxdxf1(xx)ψ^α(x)ψ^α(x)=12α=1n+dxdxGh1(xx)ψ^α(x)ψ^α(x)

Here, we introduce Gh as the kernel operator in the high-density quadratic action.

(S89) $G_h^{-1}(\mathbf{x}-\mathbf{y})=f^{-1}(\mathbf{x}-\mathbf{y})-\frac{N}{Vz}\delta(\mathbf{x}-\mathbf{y})$

Fourier transform of Gh leads to

(S90) $\tilde{G}_h(k)=\frac{\tilde{f}(k)}{1-\frac{\rho}{z}\tilde{f}(k)}$

In the variational method Equation S63, we have

(S91) $\tilde{G}_v(k)=\frac{\tilde{f}(k)}{1-C\tilde{f}(k)},\qquad C=\frac{\rho}{z-\int\frac{d^dk}{(2\pi)^d}\tilde{G}_v(k)},$

where we introduce $G_v$ as the kernel operator in the variational quadratic action. Clearly, the condition under which $\tilde{G}_v(k)$ approaches $\tilde{G}_h(k)$ is

$C\to\frac{\rho}{z},\qquad\text{that is,}\qquad z\gg\int\frac{d^dk}{(2\pi)^d}\tilde{G}_v(k).$

The function $\mathrm{ratio}_v(z)$ is defined as:

$\mathrm{ratio}_v(z)=\frac{1}{z}\int\frac{d^dk}{(2\pi)^d}\tilde{G}_v(k)$

As $\mathrm{ratio}_v(z)$ approaches 0, $\tilde{G}_v(k)$ becomes identical to $\tilde{G}_h(k)$. Note that $\tilde{G}_v(k)$ is difficult to compute; instead, we can compute and analyze $\int\frac{d^dk}{(2\pi)^d}\tilde{G}_h(k)$ (Appendix 2):

(S92) $\mathrm{ratio}_h(z)=\frac{1}{z}\int\frac{d^dk}{(2\pi)^d}\tilde{G}_h(k)$

A re-derivation of the high-density result using the grand canonical ensemble

In this section, we re-derive the high-density result from the grand canonical ensemble and the variational method. The derivation also allows us to reproduce the approximation condition discussed in the previous section.

Let us recall the calculation of the free energy $F_v$ (Equation S60) in the variational approximation with $p(\sigma)=\delta(\sigma-1)$:

(S93) Fv=V2Trnddk(2π)dG~(k)f~(k)N(z)n2exp(12Trnlog(11zddk(2π)dG~(k)))V2ddk(2π)dα,βlog(G~αβ(k))=Vn2ddk(2π)dG~(k)f~(k)N(z)n2[11zddk(2π)dG~(k)]n2Vn2ddk(2π)dlogG~(k)
(S94) limn0Fv=Vn2ddk(2π)dG~(k)f~(k)+Nn2log[zddk(2π)dG~(k)]Vn2ddk(2π)dlogG~(k)+N

Following Equations S61 and S63:

(S95) δFvδG~=0
(S96) 1f~(k)ρzddk(2π)dG~(k)1G~(k)=0
(S97) g(z)=limn02nNddzF1limn02nNddzFv=limn02nN(zFv+ddk(2π)dG~(k)zG~(k)Fv)=limn02nNzFv=1zddk(2π)dG~(k)

We can perform the same calculation in the high-density theory by considering the limit $\mathrm{ratio}_v(z)=\frac{1}{z}\int\frac{d^dk}{(2\pi)^d}\tilde{G}_v(k)\to 0$:

(S98) limn0limratiov(z)0Fv=Vn2ddk(2π)dG~(k)f~(k)+Nn2log[zddk(2π)dG~(k)]Vn2ddk(2π)dlogG~(k)+N=Vn2ddk(2π)dG~(k)f~(k)+Nn2log(z)Nn21zddk(2π)dG~(k)Vn2ddk(2π)dlogG~(k)+N

Therefore, we can define the free energy Fh in the high-density theory as

(S99) Fh=Vn2ddk(2π)dG~(k)f~(k)+Nn2log(z)Nn21zddk(2π)dG~(k)Vn2ddk(2π)dlogG~(k)+N

then

(S100) δFhδG~=0
(S101) 1f~(k)ρz1G~(k)=0

This is precisely Equation S90 derived in the previous section.

(S102) g(z)limn02nNzFh=1z+1z2ddk(2π)dG~(k)=1z+1z2ddk(2π)df~(k)1ρzf~(k)=1z[1+ddk(2π)df~(k)zρf~(k)]=1z[1ρddk(2π)dzρf~(k)zρf~(k)+ddk(2π)df~(k)zρf~(k)]=1ρddk(2π)d1zρf~(k)

which is the resolvent of high-density approximation (Equation S37).

Explicit expression for the integral

In this section, we provide an explicit expression for the integral $\int\frac{d^dk}{(2\pi)^d}\tilde{G}_h(k)$, in place of $\int\frac{d^dk}{(2\pi)^d}\tilde{G}_v(k)$, which is implicit and cannot be calculated analytically. As before, we take the lower and upper limits of integration for $\int\frac{d^dk}{(2\pi)^d}\tilde{G}_h(k)$ to be $[0,\pi/\epsilon]$. We then approximate the Fourier transform $\tilde{f}(k)$ as a power-law function. To ensure that the singularity $\tilde{f}(k_s)=\frac{z}{\rho}$ of $\tilde{G}_h(k)$ falls within the integration range $[0,\pi/\epsilon]$, we introduce a simple correction $x_\epsilon=C\left(\frac{\pi}{\epsilon}\right)^{\mu-d}$ to $\tilde{f}(k)$:

(S103) $\tilde{f}(k)=Ck^{\mu-d}-x_\epsilon,$

where $C=C_0\epsilon^{\mu}$ and $C_0=2^{d-\mu}\pi^{\frac{d}{2}}\frac{\Gamma\left(\frac{d-\mu}{2}\right)}{\Gamma\left(\frac{\mu}{2}\right)}$ are constants depending on the parameters $d$, $\mu$, and $\epsilon$. We then compute the contour integral (Appendix 2—figure 1A) by Taylor expansion. As a result, we have

(S104) ddk(2π)dG~h(k)=0πϵddk(2π)df~(k)1ρzf~(k)=12π(μd)CPzρ(j=0xϵ1P+j(zρ+xϵ)1j1P+jπcot(π(1P))(zρ+xϵ)P)12π(μd)CPzρ(j=0xϵ1P+j(zρ+xϵ)1jP+jπcot(π(P))(zρ+xϵ)P1xϵ)=12π(dμ)CPzρ(j=0xϵ1P+j(zρ+xϵ)1j(P1j)(Pj)πcot(πP)(zρ+xϵ)Pzz+ρxϵ)=πd1z2(dμ)ρϵd(j=0(zρxϵ+1)1j(P1j)(Pj)πcot(πP)(zρxϵ+1)Pzz+ρxϵ)

where $P=\frac{d}{d-\mu}>1$.

Now let us take a closer look at the behavior of the function $\mathrm{ratio}_h(z)$ (Equation S92), plotted in Appendix 2—figure 2A, B. For small $z$, this function is negative. It then crosses zero and has a peak. As $z\to\infty$, $\mathrm{ratio}_h$ approaches zero. This is because Equation S104 approaches a positive constant, which is given by

limzddk(2π)dG~h(k)=πd1C22(dμ)(P1)P,

where $C_2=C\left(\frac{\pi}{\epsilon}\right)^{\mu}$. For $z\gg 1$, we find that the leading-order expansion at $j=1$ already gives an accurate approximation (Appendix 2—figure 2A, B).

(S105) ddk(2π)dG~h(k)12π(dμ)CPzρ[xϵ1P(zρ+xϵ)1(P1)(P)+xϵ2P(zρ+xϵ)2(P2)(P1)πcot(πP)(zρ+xϵ)ddμzz+ρxϵ]

Estimate the parameter condition when the high-density theory best agrees with numerical simulation

By analyzing the properties of the function $\int\frac{d^dk}{(2\pi)^d}\tilde{G}_h(k)$, we think the high-density theory provides an accurate approximation when the zero-crossing of $\int\frac{d^dk}{(2\pi)^d}\tilde{G}_h(k)$ is near $z=1$ (the peak of the low-density result; Mézard et al., 1999). The root $z_0$ of the integral $\int\frac{d^dk}{(2\pi)^d}\tilde{G}_h(k)$ is given by

(S106) $\frac{\rho x_\epsilon}{z_0}=g_1(d,\mu)$

It is easy to see from Equation S104 that $g_1(d,\mu)$ is a function of $P$ (or, equivalently, of the ratio $\mu/d$). We can therefore rewrite Equation S106 as

(S107) $\frac{\rho x_\epsilon}{z_0}=g_2\!\left(\frac{\mu}{d}\right)$

Here, we can also see that $z_0$ can be expressed as:

$z_0=\frac{C_0\,\pi^{\mu-d}\,\rho\epsilon^{d}}{g_1(d,\mu)}$

Using this expression for $z_0$ and setting $z_0=1$, we can derive the following equation for $\rho\epsilon^{d}$, a dimensionless parameter that determines when the high-density theory is an accurate approximation of our ERM model:

(S108) $\rho\epsilon^{d}=\frac{z_0\,g_2\!\left(\frac{\mu}{d}\right)\Gamma\!\left(\frac{\mu}{2}\right)}{2^{d-\mu}\,\pi^{\mu-\frac{d}{2}}\,\Gamma\!\left(\frac{d-\mu}{2}\right)}$

Appendix 2—figure 2C shows how $\rho\epsilon^{d}$ changes as a function of $d$ for a small, fixed $\mu/d$. For example, when $d=2$, $\mu=0.5$, $\epsilon=0.03125$, we find

$\rho\epsilon^{d}=0.83,\qquad\text{or}\qquad\rho\approx 850.$

This estimate is also consistent with our numerical simulation (Figure 4—figure supplement 1).
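As a quick arithmetic check of these quoted numbers:

```python
eps, d = 0.03125, 2
print(0.83 / eps**d)   # ~ 850
```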

Wick rotation

To ensure mathematical rigor in Appendix 2, we should make sure that the action $S_1$ in Equation S46 is a real number. Here, we use a Wick rotation to transform Equations S28–S30. The Gaussian integral in Equation S29 can be divergent when $G^{-1}(\mathbf{x}-\mathbf{y})$ is not positive definite. To address this issue, we can always write the partition function $\langle\Xi^{n}(z)\rangle$ as a convergent Gaussian integral by choosing the appropriate integration axes with a Wick rotation.

(S109) Ξn(z)=(2πi)Nn2(detf)n/2+D[ψ^α]eS1
S1=NlnAi2α=1n+dxdxf1(xx)ψ^α(x)ψ^α(x)

We can now change the integration variables by transforming $\hat{\psi}_\alpha$ to $\tilde{\psi}_\alpha$ via $\tilde{\psi}_\alpha=Q\hat{\psi}_\alpha$, where $Q$ is the Fourier basis,

(S110) Ξn(z)=(2πi)Nn2(detf)n/2+D[ψ~α]eS1
S1=NlnA~i2α=1n+ddkf~1(k)ψ~α(k)2
(S111) A~=ddkV(z)n2exp[i2zα=1nψ~α(k)2]

by letting $L\to\infty$. Note that $e^{S_1}$ is analytic. Thus, if

$\lim_{\tilde{\psi}_\alpha\to(1-i)\infty}e^{S_1}=0$

and the convergence rate is faster than $1/\tilde{\psi}^2$, we can apply the Wick rotation $\hat{\psi}_\alpha(\mathbf{x})\to\hat{\psi}_\alpha(\mathbf{x})e^{-i\pi/4}$: instead of computing the integral on the real axis $C_1$, we rotate the integration contour 45 degrees clockwise to $C_3$ in the complex plane:

(S112) $\int_{C_1}D[\hat{\psi}_\alpha]\,e^{S_1}=\int_{C_3}D[\hat{\psi}_\alpha]\,e^{S_1}$

On the other hand, if

$\lim_{\tilde{\psi}_\alpha\to(1+i)\infty}e^{S_1}=0$

and the convergence rate is faster than $1/\tilde{\psi}^2$, we can apply the Wick rotation $\hat{\psi}_\alpha(\mathbf{x})\to\hat{\psi}_\alpha(\mathbf{x})e^{i\pi/4}$, namely rotating the integration contour 45 degrees counterclockwise to $C_2$:

(S113) $\int_{C_1}D[\hat{\psi}_\alpha]\,e^{S_1}=\int_{C_2}D[\hat{\psi}_\alpha]\,e^{S_1}$

As a simple example, consider a one-dimensional Gaussian integral

$\int dx\,e^{\frac{i}{2}kx^2}.$

When $k>0$, we can use the Wick rotation $x\to xe^{i\pi/4}$:

$\int dx\,e^{\frac{i}{2}kx^2}=e^{i\pi/4}\int dx\,e^{-\frac{1}{2}kx^2}=e^{i\pi/4}\sqrt{\frac{2\pi}{k}}=\sqrt{\frac{2\pi i}{k}}.$

When $k<0$, we can use the Wick rotation $x\to xe^{-i\pi/4}$:

$\int dx\,e^{\frac{i}{2}kx^2}=e^{-i\pi/4}\int dx\,e^{\frac{1}{2}kx^2}=e^{-i\pi/4}\sqrt{\frac{2\pi}{-k}}=\sqrt{\frac{2\pi i}{k}}.$

Without loss of generality, we adopt one of these two rotations of the $\hat{\psi}_\alpha(\mathbf{x})$ fields in Appendix 2 for the subsequent calculations.

Grand canonical ensemble

When using the Gaussian variational Approximation, we consider a critical extension from the canonical ensemble to the grand canonical ensemble when computing the partition function (Equation S9). We would like to justify this approximation in this section. Recall that the resolvent is given by

$g(z)=-\frac{2}{N}\,\partial_z\langle\ln\Xi(z)\rangle,$

where $\Xi(z)$ can be viewed as the canonical partition function, and $\langle\cdots\rangle$ is the average over all random matrices $C$ for a given $N$. Let us now generalize Equation S9 to the grand canonical ensemble, namely

(S114) g(z)=2NzlnΞ(z)N

where $\langle\cdots\rangle_N$ indicates that we average over all possible random matrices and across all possible $N$, with the probability of a matrix size $N$ given by the Poisson distribution $P(N)=e^{-a}\frac{a^{N}}{N!}$, where $a=\langle N\rangle$. When $\langle N\rangle$ is large, $P(N)$ has a very sharp peak at $\langle N\rangle$, and Equation S114 can be approximated as

(S115) g(z)2NzlnΞ(z)N

Using the replica trick, we recall Equation S12

$g(z)=-\lim_{n\to 0}\frac{2}{Nn}\,\partial_z\ln\langle\Xi^{n}(z)\rangle.$

Let us now define the grand canonical partition function as

(S116) $Z=\sum_{N=0}^{\infty}\langle\Xi_N^{n}(z)\rangle\frac{a^{N}}{N!}.$

Likewise, the resolvent in Equation S12 is generalized to

(S117) $g(z)=-\lim_{n\to 0}\frac{2}{Nn}\,\partial_z\ln Z.$

To see whether this definition makes sense, we write

(S118) $g(z)=-\lim_{n\to 0}\frac{2}{Nn}\frac{\sum_{N=0}^{\infty}\partial_z\langle\Xi_N^{n}(z)\rangle\,a^{N}/N!}{Z}=-\lim_{n\to 0}\frac{2}{Nn}\frac{\sum_{N=0}^{\infty}\partial_z\left[1+n\langle\ln\Xi_N(z)\rangle\right]a^{N}/N!}{\sum_{N=0}^{\infty}\langle\Xi_N^{n}(z)\rangle\frac{a^{N}}{N!}}=-\frac{2}{N}\frac{\sum_{N=0}^{\infty}\partial_z\langle\ln\Xi_N(z)\rangle\,a^{N}/N!}{\sum_{N=0}^{\infty}\frac{a^{N}}{N!}}=-\frac{2}{N}\,\partial_z\langle\ln\Xi(z)\rangle_N,$

where the second equality uses the identity

$\ln\Xi=\lim_{n\to 0}\frac{\Xi^{n}-1}{n},$

and the last equality is indeed Equation S115 discussed earlier. Returning back to the explicit form of the grand canonical partition function in our ERM model (Equations S29–S31), we have

(S119) $Z=\int_{-\infty}^{+\infty}D[\hat{\psi}_\alpha]\,e^{S_0+aA}=\int_{-\infty}^{+\infty}D[\hat{\psi}_\alpha]\,e^{S_0+NA}.$

Here, $\hat{\psi}$ are the auxiliary fields (Equation S15), $S_0=-\frac{1}{2}\sum_{\alpha=1}^{n}\int_{-\infty}^{+\infty}d\mathbf{x}\,d\mathbf{x}'\,f^{-1}(\mathbf{x}-\mathbf{x}')\hat{\psi}_\alpha(\mathbf{x})\hat{\psi}_\alpha(\mathbf{x}')$, and $A$ is the term defined in Equations S29–S31. Equation S119 is the form used in Appendix 2 to compute the free energy.

E-I balanced asynchronous model summary

In this section, we discuss the E-I balanced asynchronous model (Renart et al., 2010), which predicts a different scaling of the dimensionality, $D\propto N$, under random sampling, since the mean squared pairwise covariance $E_{i\neq j}(c_{ij}^2)$ scales as $1/N$ and diminishes in the large-$N$ limit.

Model

The simulation of binary networks involves updating neuron states within a network architecture identical to analytical studies. The update rule is probabilistic, with neuron activities set based on synaptic currents and a firing threshold. The dynamics resolution improves with network size, with neuron time constants effectively representing changes in firing activity. Parameters for simulations include connection probabilities, mean rates, thresholds, and synaptic strengths, scaled appropriately for network size.

Update rule: $\sigma_i^{\alpha}(t+1)=\Theta\!\left(\sum_{j}J_{ij}^{\alpha\beta}\sigma_j^{\beta}(t)-\theta_i^{\alpha}\right)$

Dynamics resolution: $dt=\frac{\tau}{3N}$

In the simulation of binary networks, the model's dynamics are governed by a set of parameters, each with a specific role (a minimal simulation sketch follows this list):

  • $\sigma_i^{\alpha}(t+1)$: the state of neuron $i$ in population $\alpha$ at the next time step $t+1$. The state is binary: 1 indicates the neuron is active (firing) and 0 indicates it is inactive.

  • $\Theta(\cdot)$: the Heaviside step function used in the update rule. It determines the neuron's next state by comparing the net input to the neuron against its firing threshold. If the net input exceeds the threshold, the neuron's state is set to active; otherwise, it remains or becomes inactive.

  • $\sum_j J_{ij}^{\alpha\beta}\sigma_j^{\beta}(t)$: the total synaptic input to neuron $i$ from all neurons $j$ in population $\beta$ at time $t$. $J_{ij}^{\alpha\beta}$ is the synaptic weight from neuron $j$ in population $\beta$ to neuron $i$ in population $\alpha$, and $\sigma_j^{\beta}(t)$ is the state of neuron $j$ at time $t$.

  • $\theta_i^{\alpha}$: the firing threshold of neuron $i$ in population $\alpha$. The net synaptic input is compared against this value to determine whether neuron $i$ fires (transitions to state 1) or not (remains in state 0).

  • $\alpha\in\{E,I\}$, $\beta\in\{E,I,X\}$: population labels. E: excitatory neurons; I: inhibitory neurons; X: an external population that provides input to the network but is not influenced by the network's internal dynamics.
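As an illustration of the update rule above, here is a minimal sketch of an asynchronously updated binary E-I network; the population size, connection probability, thresholds, synaptic strengths, and external rate are placeholder choices, not the parameters of Renart et al., 2010:

```python
import numpy as np

rng = np.random.default_rng(3)

N = 1000                          # neurons per population (E, I, X); placeholder
p_conn = 0.2                      # connection probability (placeholder)
theta = 1.0                       # firing threshold (placeholder)
rate_X = 0.1                      # activity level of the external population

# Synaptic strengths, scaled as 1/sqrt(N * p_conn) (balanced-network scaling)
scale = 1.0 / np.sqrt(N * p_conn)
J = {('E', 'E'):  1.0 * scale, ('E', 'I'): -2.0 * scale, ('E', 'X'): 1.0 * scale,
     ('I', 'E'):  1.0 * scale, ('I', 'I'): -1.8 * scale, ('I', 'X'): 0.8 * scale}
W = {key: (rng.random((N, N)) < p_conn) * val for key, val in J.items()}

state = {'E': (rng.random(N) < 0.1).astype(float),
         'I': (rng.random(N) < 0.1).astype(float),
         'X': (rng.random(N) < rate_X).astype(float)}

def update_one(alpha):
    """Asynchronous update: sigma_i^alpha <- Theta(sum_j J_ij^{alpha,beta} sigma_j^beta - theta)."""
    i = rng.integers(N)
    net_input = sum(W[(alpha, beta)][i] @ state[beta] for beta in ('E', 'I', 'X'))
    state[alpha][i] = 1.0 if net_input > theta else 0.0

for _ in range(50_000):                         # each E/I neuron is updated ~25 times here
    update_one('E' if rng.random() < 0.5 else 'I')
    j = rng.integers(N)                         # the external population is refreshed
    state['X'][j] = 1.0 if rng.random() < rate_X else 0.0
```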

Firing Rate Correlation r

The mean firing rate correlation $E(r)$ scales inversely with the network size $N$; specifically, $E(r)\sim 1/N$. The standard deviation $\sigma_r$ of $r$ decays only as $1/\sqrt{N}$ (Renart et al., 2010).

Given that the variance of $r$ is $\mathrm{Var}(r)=\frac{b}{N}$ and the expected value of $r$ is $E(r)=\frac{a}{N}$, where $N$ is the sample size and $a$ and $b$ are constants, we aim to calculate $E(r^2)$, the expected value of the square of the correlation coefficient $r$.

The term $E_{i\neq j}(c_{ij}^2)$ in the PR dimension can be obtained from:

(S120) $\mathrm{Var}(r)=E(r^2)-[E(r)]^2$

Substituting $\mathrm{Var}(r)=\frac{b}{N}$ and $E(r)=\frac{a}{N}$ into the equation, we get:

(S121) $E_{i\neq j}(c_{ij}^2)=E(r^2)=\frac{b}{N}+\left(\frac{a}{N}\right)^2\sim\frac{1}{N}$

Thus, in the PR dimension $D_{PR}(C)=\frac{N^2\left(E[\sigma^2]\right)^2}{NE[\sigma^4]+N(N-1)E_{i\neq j}[c_{ij}^2]}$, the terms $NE[\sigma^4]$ and $N(N-1)E_{i\neq j}[c_{ij}^2]$ are of the same order, and the PR dimension will not reach the upper bound $\frac{\left(E[\sigma^2]\right)^2}{E_{i\neq j}[c_{ij}^2]}$.
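The participation-ratio dimension referenced here can be computed directly from any covariance matrix; a generic sketch (not specific to the binary-network simulation):

```python
import numpy as np

def participation_ratio(C):
    """D_PR = (sum_i lambda_i)^2 / sum_i lambda_i^2 = (tr C)^2 / tr(C C)."""
    tr = np.trace(C)
    tr2 = np.sum(C * C)          # equals tr(C @ C) for a symmetric C
    return tr**2 / tr2

# With E[sigma^2] = 1 and pairwise covariances c_ij of order 1/N (as above),
# the denominator N*E[sigma^4] + N*(N-1)*E[c_ij^2] stays of order N,
# so D_PR grows linearly with N instead of saturating at the upper bound.
```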

Appendix 2—figure 2
Relationship between ratioh and z.

(A) $\rho=1024$; (B) $\rho=256$. Blue line: $\mathrm{ratio}_h$ calculated numerically. Red line: 100th-order expansion of Equation S104, which perfectly overlaps with the blue line. Green line: expansion to first order. Other parameters: $\mu=0.5$, $d=2$, $\epsilon=0.03125$. (C) Relationship between $\rho\epsilon^{d}$ and the dimension $d$ with fixed $\mu/d$ (Equation S108).

Appendix 2—figure 3
Wick rotation in complex plane.
Appendix 2—table 1
Table of notations.
Notation: Description
$g(z)$: Resolvent, Equation S5
$\langle\cdots\rangle$: The average across realizations of $C$ (i.e., random $\mathbf{x}_i$'s and $\sigma_i^2$'s), Equation S4
$\Xi(z)$: Canonical partition function, the Gaussian-integral representation of the determinant $[\det(z-C)]^{-1/2}$, Equation S8
$\phi$: Intermediate variable in the Gaussian-integral representation of $\Xi(z)$, Equation S8
$\psi$: Density field of $\phi$
$\hat{\psi}$: The corresponding Lagrange multiplier field of $\psi$
$S_1$: The action in $\Xi(z)$ (by analogy with the path-integral formulation of quantum mechanics)
$S_h$: The action in the high-density approximation of $\Xi(z)$
$S_v$: The action in the variational approximation of $\Xi(z)$
$A$: Term in $S_1$
$f^{-1}$: The operator inverse of $f$, Equation S26
$G$: Quadratic kernel in the Gaussian-integral approximation of $\Xi(z)$
$G^{-1}$: The operator inverse of $G$, defined analogously to $f^{-1}$
$\tilde{G}$: The Fourier transform of $G$

Data availability

The source code in this work can be found at https://github.com/wzz1999/ERM-scale (copy archived at Mak and Wang, 2025). The fish data collected and analyzed in this work can be found at https://doi.org/10.6084/m9.figshare.28721477.

The following data sets were generated
The following previously published data sets were used
    1. Steinmetz N
    2. Pachitariu M
    3. Stringer C
    4. Carandini M
    5. Harris K
    (2019) figshare
    Eight-probe Neuropixels recordings during spontaneous behaviors.
    https://doi.org/10.25378/janelia.7739750
    1. Stringer C
    2. Pachitariu M
    3. Reddy C
    4. Carandini M
    5. Harris KD
    (2018) figshare
    Recordings of ten thousand neurons in visual cortex during spontaneous behaviors.
    https://doi.org/10.25378/janelia.6163622


Article and author information

Author details

  1. Zezhen Wang

    School of Data Science, University of Science and Technology of China, Hefei, China
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Contributed equally with
    Weihao Mai and Yuming Chai
    Competing interests
    No competing interests declared
ORCID iD: 0009-0006-1401-9871
  2. Weihao Mai

    Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing
    Contributed equally with
    Zezhen Wang and Yuming Chai
    Competing interests
    No competing interests declared
ORCID iD: 0000-0002-2320-2276
  3. Yuming Chai

    1. Hefei National Laboratory for Physical Sciences at the Microscale, Center for Integrative Imaging, University of Science and Technology of China, Hefei, China
    2. Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
    Contribution
    Resources, Data curation, Formal analysis, Investigation, Methodology, Writing – original draft
    Contributed equally with
    Zezhen Wang and Weihao Mai
    Competing interests
    No competing interests declared
ORCID iD: 0000-0003-0184-1824
  4. Kexin Qi

    1. Hefei National Laboratory for Physical Sciences at the Microscale, Center for Integrative Imaging, University of Science and Technology of China, Hefei, China
    2. Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
    Contribution
    Data curation, Formal analysis, Investigation, Methodology
    Competing interests
    No competing interests declared
  5. Hongtai Ren

    Department of Precision Machinery and Precision Instrumentation, University of Science and Technology of China, Hefei, China
    Contribution
    Methodology
    Competing interests
    No competing interests declared
  6. Chen Shen

    1. Hefei National Laboratory for Physical Sciences at the Microscale, Center for Integrative Imaging, University of Science and Technology of China, Hefei, China
    2. Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
    Contribution
    Methodology
    Competing interests
    No competing interests declared
  7. Shiwu Zhang

    Department of Precision Machinery and Precision Instrumentation, University of Science and Technology of China, Hefei, China
    Contribution
    Project administration
    Competing interests
    No competing interests declared
  8. Guodong Tan

    Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
    Contribution
    Methodology
    Competing interests
    No competing interests declared
  9. Yu Hu

    1. Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China
    2. Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong, China
    Contribution
    Conceptualization, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Methodology, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    mahy@ust.hk
    Competing interests
    No competing interests declared
ORCID iD: 0000-0003-0790-1605
  10. Quan Wen

    1. School of Data Science, University of Science and Technology of China, Hefei, China
    2. Hefei National Laboratory for Physical Sciences at the Microscale, Center for Integrative Imaging, University of Science and Technology of China, Hefei, China
    3. Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
    Contribution
    Conceptualization, Resources, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing
    For correspondence
    qwen@ustc.edu.cn
    Competing interests
    No competing interests declared
ORCID iD: 0000-0003-0268-8403

Funding

Ministry of Science and Technology of the People's Republic of China (2022ZD0211900)

  • Zezhen Wang
  • Yuming Chai
  • Kexin Qi
  • Guodong Tan
  • Quan Wen

Research Grants Council, University Grants Committee (ECS-26303921)

  • Weihao Mai
  • Yu Hu

National Science Foundation of China (NSFC-32071008)

  • Zezhen Wang
  • Yuming Chai
  • Kexin Qi
  • Guodong Tan

The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.

Acknowledgements

We are grateful to Liqun Luo and Changsong Zhou for their helpful suggestions on our manuscript. QW thanks Hideaki Shimazaki for the suggestion that the functional space could be the feature space for sensory coding. QW thanks Jia Lou for improving the illustration in Figures 1 and 3. QW was supported by the STI2030-Major Projects 2022ZD0211900 and the NSFC-32071008 from the National Science Foundation of China. YH was supported by ECS-26303921 from the Research Grants Council of Hong Kong.

Version history

  1. Sent for peer review:
  2. Preprint posted:
  3. Reviewed Preprint version 1:
  4. Reviewed Preprint version 2:
  5. Version of Record published:

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.100666. This DOI represents all versions, and will always resolve to the latest one.

Copyright

© 2025, Wang, Mai, Chai et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.



Cite this article

Wang Z, Mai W, Chai Y, Qi K, Ren H, Shen C, Zhang S, Tan G, Hu Y, Wen Q (2025) The geometry and dimensionality of brain-wide activity. eLife 14:RP100666. https://doi.org/10.7554/eLife.100666.3
