Granger causality analysis for calcium transients in neuronal networks, challenges and improvements
Abstract
One challenge in neuroscience is to understand how information flows between neurons in vivo to trigger specific behaviors. Granger causality (GC) has been proposed as a simple and effective measure for identifying dynamical interactions. At singlecell resolution however, GC analysis is rarely used compared to directionless correlation analysis. Here, we study the applicability of GC analysis for calcium imaging data in diverse contexts. We first show that despite underlying linearity assumptions, GC analysis successfully retrieves nonlinear interactions in a synthetic network simulating intracellular calcium fluctuations of spiking neurons. We highlight the potential pitfalls of applying GC analysis on real in vivo calcium signals, and offer solutions regarding the choice of GC analysis parameters. We took advantage of calcium imaging datasets from motoneurons in embryonic zebrafish to show how the improved GC can retrieve true underlying information flow. Applied to the network of brainstem neurons of larval zebrafish, our pipeline reveals strong driver neurons in the locus of the mesencephalic locomotor region (MLR), driving target neurons matching expectations from anatomical and physiological studies. Altogether, this practical toolbox can be applied on in vivo population calcium signals to increase the selectivity of GC to infer flow of information across neurons.
Editor's evaluation
This important paper provides an indepth analysis of the advantages and potential pitfalls of the application of Granger Causality (GC) to calcium imaging data, especially regarding various types of preprocessing. The authors' approach uses compelling rigor in arguing their points, and it is very clear how one would go about replicating their work. These results should be of interest to any researcher attempting to analyze calcium imaging data.
https://doi.org/10.7554/eLife.81279.sa0Introduction
The prompt integration of sensory inputs by motor command neurons in the nervous systems can become a matter of life and death, often requiring fast and precise movements in response to the stimulus. Understanding how large neuronal networks coordinate to generate cognitive activities requires knowing how information from the stimuli flows between neurons to result in the observed behavior. Recent developments in optogenetics (Deisseroth, 2011; Fenno et al., 2011), and wholebrain imaging with singlecell resolution (Ahrens et al., 2013; Nguyen et al., 2016; Venkatachalam et al., 2016; Cong et al., 2017) allow us to observe collective firing patterns of many neurons and give the basis for reconstructing a causality network that describes how neurons influence each other. However, despite the existing data, assessing the direction of neuronal communication or causality in order to understand how information flows in the brain during integrative reflexive behaviors remains a major challenge in neuroscience (Goulding, 2009).
Apart from perturbing individual neurons and observing the activity of downstream neurons, noninvasive methods for obtaining functional connectivity have been developed (Bastos and Schoffelen, 2016). The widelyused correlation analysis (Cohen and Kohn, 2011; Haesemeyer et al., 2018; Stringer et al., 2019) identifies pairs or groups of neurons that are active at the same time, suggesting either that they drive each other, or are driven by a common input. By contrast, Granger causality (GC) identifies directed interactions (Granger, 1969; Geweke, 1982; Geweke, 1984). Assuming that the two timeseries are well described by Gaussian autoregressive processes, GC quantifies the ability to predict the future values of the time series of a given neuron using prior values of the time series of other neurons. In the field of neuroscience, GC has been used to analyze data from diverse sources (Seth et al., 2015), including electroencephalography (EEG), magnetoencephalography (MEG) (Gow et al., 2008), functional magnetic resonance imaging (fMRI) (Roebroeck et al., 2005; Deshpande et al., 2009; Wen et al., 2013), and local field potentials (LFP) (Brovelli et al., 2004). Despite criticisms such as sensitivity to data preprocessing (Florin et al., 2010), internal dynamics of the neuron (Stokes and Purdon, 2017), additive noise that can be correlated among regions of interest or not (Nalatore et al., 2007; Vinck et al., 2015), and sampling frequency (Zhou et al., 2014; Barnett and Seth, 2017), GC and its extensions (Shojaie and Fox, 2021; Barnett et al., 2018) have been useful to construct functional connectomes, to identify important neurons acting as information hubs, or to classify different brain states (Nicolaou et al., 2012).
While most GC analyses in neuroscience focus on large scale ensemblelevel neuronal signals, it is unclear how well it can be adapted to address singlecell level descriptions, such as spiking (Kim et al., 2011; Gerhard et al., 2013; Sheikhattar et al., 2018) or population calcium imaging data (De Vico Fallani et al., 2014; Severi et al., 2018; Oldfield et al., 2020). Without the law of large numbers, these singlecell time series are no longer approximately Gaussian, the dynamics of signal propagation is increasingly nonlinear, and the sampling rate is typically much slower than the rate of information propagation. All of these points cast doubt on whether GC can correctly identify information flow. Recently however, De Vico Fallani et al., 2014 demonstrated in zebrafish embryos that, despite a low sampling frequency of acquisition for calcium imaging (4 Hz), GC can detect information flow consistent with the known rostrocaudal propagation of activity among motoneurons in the spinal cord. Oldfield et al., 2020 subsequently applied GC to grouped averages of calcium signals over large brain regions without single cell resolution, and detected the information flow correctly based on anatomy from the pretectum to the tectum in the visual system leading to subsequent activity of brainstem and motor circuits during prey capture in larval zebrafish. New variations of GC accounting for sparsity in the connection have also been developed and applied on singlecell level to both spiking and calcium signals in the primary auditory cortex of mice (Sheikhattar et al., 2018; Francis et al., 2018; Francis et al., 2022). Nonetheless, at the singlecell resolution, it is unclear to which extent the application of GC analysis in population calcium imaging is generalizable, and how GC analysis should be improved to accommodate intrinsic features of calcium signals.
In this manuscript, we identify a set of issues inherent to GC analysis when applied to population calcium imaging and propose solutions to address them. We first use synthetic neural signals and then exploit as examples population recordings of in vivo calcium signals in zebrafish with either correlated and noncorrelated noise. We build an improved analysis pipeline customized for calcium signals to address critical issues intrinsic to in vivo calcium imaging, such as the slow exponential decays of calcium transients, correlated noise, and nonlinear and nonGaussian statistics. We develop our method both on a synthetic network of neurons for which we know the interaction structure, as well as on real data from motoneurons in the embryonic zebrafish spinal cord that are recruited along a chain organized from rostral to caudal on either side of the spinal cord (De Vico Fallani et al., 2014). We first improve our analysis pipeline to optimally detect the information flowing between embryonic motoneurons in De Vico Fallani et al., 2014. Second, we apply our improved GC analysis to infer information flow from large neuronal populations in the brainstem of larval zebrafish performing optomotor response (Severi et al., 2018). In the context of large populations of neurons, we show that our method is able to reveal that motorcorrelated neurons within the recently identified mesencephalic locomotor region in CarboTano et al., 2022 drive the activity of downstream neurons in the reticular formation.
Results
Granger causality
Granger causality quantifies the ability to predict the future values of the discrete time series of neuron X of length $T$, $\{{x}_{t}\}$, for $t=1,2,\mathrm{\dots},T$, using prior values of the time series of neuron Y, $\{{y}_{t}\}$, for $t=1,2,\mathrm{\dots},T1$, assuming that the two timeseries are well described by Gaussian autoregressive processes (Granger, 1969; Seth et al., 2015). The method asks whether including information from Y to predict X has significant predictive power, by computing a pvalue assuming as a null model the probability that the dynamics of X can be explained without Y. The prior values of neuron X and Y itself used in that prediction are taken for a finite number of past time points, which defines a characteristic timescale.
With only the information of X itself, the dynamics of X, x_{t}, may be modeled as an autoregressive process, denoted as the ‘reduced model’,
where ${\stackrel{~}{a}}_{q}$ are the regression coefficients, the maximum lag $L$ is a hyperparameter subject to tuning, and ${\stackrel{~}{\u03f5}}_{t}$ is a residual term, assumed to be Gaussian.
Alternatively, one can also write down a linear regression model for x_{t} given the history of both neuron X and neuron Y. This “full model” is
where a_{q} and b_{q} are the regression coefficients for the contribution of neuron X and Y’s history to neuron X’s current state and ${\u03f5}_{t}$ is a residual term, also assumed to be Gaussian.
The key idea of Granger causality is to compare the residuals between the reduced and the full models. Intuitively, if neuron Y can influence X, including the dynamics of Y will improve the prediction of x_{t}, and hence reduce the residual term. As the reduced model is nested within the full model, to test whether the distributions of the residuals, $\{{\epsilon}_{t}\}$ and $\{{\stackrel{~}{\epsilon}}_{t}\}$, are significantly different requires a oneway analysis of variance (ANOVA) test carried out by computing the Ftest statistic:
where ${T}_{\text{regr}}=TL$ is the number of time points used in the regression, and ${M}_{\text{f}}=2L+1$ and ${M}_{\text{r}}=L+1$ are respectively the number of parameters in the full and reduced models. Under the null hypothesis that including y_{t} does not improve the prediction of x_{t}, and that the noise can be described by Gaussian i.i.d., the Fstatistics will follow an Fdistribution with defined degrees of freedom, $\mathcal{F}(F;{M}_{\text{f}}{M}_{\text{r}},{T}_{\text{regr}}{M}_{\text{f}})$. Hence, testing the computed Fstatistic ${F}_{Y\to X}$ against the Fdistribution will distinguish whether Y significantly Grangercauses X.
To measure the amount of reduction in the residual, the bivariate Granger causality value (BVGC) is defined as
where the sample variance requires a correction using the number of samples in the regression (${T}_{\text{regr}}=TL$) minus the number of regression parameters (${M}_{\text{r}}$ or ${M}_{\text{f}}$). Mathematically, the GC value is related to the $F$test statistic through a monotonic transformation:
A significantonly GC value, $G{C}_{\text{Y}\to \text{X}}^{\text{sig}}=G{C}_{\text{Y}\to \text{X}}$ is called if the null test of the Fstatistics is rejected, and zero otherwise.
In multicomponent systems with more than two variables, one needs to consider the influence of other neurons or of a common stimulus, which we collectively denote by $\mathbf{\text{Z}}=\{{\text{Z}}_{1},\dots ,{\text{Z}}_{N}\}$. For example, to analyze a time series of $N+2$ neurons $\{{x}_{t},{y}_{t},{\mathbf{\text{z}}}_{t}\}$, one needs to distinguish direct information flow from Y to X, from indirect information flow from Y to Z and from Z to X, or from a common but delayed drive from Z to Y and from Z to X. Geweke, 1984 extended Granger causality to a multivariate version (MVGC) to measure the conditional dependence between variables. In this case, we explicitly include the variables describing all possible other neurons and external stimuli, $\mathbf{\text{z}}}_{t}=\{{z}_{1,t},\dots ,{z}_{N,t}\$, as regressed variables in both the reduced model and the full model. The current state of neuron X, x_{t}, now also depends on the past of Z, such that the reduced and the full models become
and
where ${\stackrel{~}{c}}_{j,q}$ and ${c}_{j,q}$ are the regression coefficients from the conditioned signal z_{j}. The Ftest statistics $F}_{\text{Y}\to \text{X}\mathbf{\text{Z}}$ and the GC values $G{C}_{\text{Y}\to \text{X}\mathbf{\text{Z}}},$ have identical expression as in Equation 3 and Equation 4, with the number of parameters adjusted accordingly in the reduced and the full models. The significance test is adjusted using multiple comparisons by applying a Bonferroni correction to the pvalue threshold.
It is important to point out that, although GC is based on linear regression of time series, its use is not limited to data generated by linear dynamics. In fact, as long as the data can be approximated by a vector autoregressive process, GC can be useful in revealing the information flow (Seth et al., 2015). For analysis of calcium signals, the intrinsic nonlinearity of calcium decay means it is not obvious a priori whether GC can be useful in revealing the information structure. As we will show in the following, the success of information retrieval depends on the details of the biological network, such as the interaction strength between pairs of neurons and the connection density, which leads to a complicated interplay among different timescales, such as the firing rate, the calcium decay time constant, and the time scale of information transfer.
GC analysis of a small synthetic neuronal net
To gain some intuition about how GC identifies links, we test it on synthetic data generated using small networks of 10 neurons, where we predefine the true connectivity (Figure 1A, B). One of the simplest types of dynamics is given by the same model that underlies the GC analysis, the vector autoregressive model (VAR), where the neural activities ${f}_{i,t}$ evolve according to
where ${\mathrm{\Gamma}}_{ij,q}$ is a memory kernel capturing the effect from neuron $j$ at time $tq$ on neuron $i$ at time $t$, and ${\xi}_{i,t}$ a Gaussian noise with zero mean and variance ${\sigma}^{2}$. For simplicity, we assume the memory kernel acts only within the preceding two time steps, ${L}_{\text{true}}=2$, and set ${\mathrm{\Gamma}}_{ij,q}={A}_{ij}{s}_{j}c$ for $q\u2a7d{L}_{\text{true}}$, and zero, otherwise. ${A}_{ij}$ is the adjacency matrix: ${A}_{ij}=1$ if neuron $j$ is directly presynaptic to neuron $i$, and ${A}_{ij}=0$ otherwise. The sign of interaction s_{j} shows whether neuron $j$ is excitatory (${s}_{j}=1$) or inhibitory (${s}_{j}=1$). The parameter $c$ is the connection strength, which we set to be positive. To ensure that the synthetic time series is stationary, we set the number of excitatory cells and inhibitory cells to be equal. The simulation time step is set as the unit time, ${\mathrm{\Delta}}_{\text{sim}}=1$.
We apply the GC analysis to synthetic data generated using the VAR model (Figure 1C) to see if we can recover the true network connectivity. For each neuron pair $(i,j)$, we compute the BVGC values following Equation 4, identifying ${x}_{t}={f}_{i,t}$ and ${y}_{t}={f}_{j,t}$, and computing the regression coefficients a_{q} and b_{q} using the standard least squares method. Analogously, we also compute the MVGC values, using all the remaining neuron signals ${{f}_{k,t}}_{k\ne i,j}$ as the conditional variables $\mathit{z}}_{t$.
As shown in Figure 1D, MVGC successfully identifies all the links. However, the bivariate GC assigns spurious links, as it cannot distinguish direct links from indirect ones. In general, MVGC has a lower false positive rate (identified nonexisting links) and false negative rate (missing true links) compared to BVGC when we vary the connection strength $c$ (Figure 1I). It is interesting to point out that, at small connection strengths, both MVGC and BVGC will fail to identify true links because the signaltonoise ratio is too small to detect them. In comparison to both GC analyses, the equaltime correlation completely misses the structure (see Figure 1D and Materials and Methods). While crosscorrelation with the lag fixed to the model maximum lag ${L}_{\text{true}}$ performs better, it still fails to reproduce the true connectivity as faithfully as MVGC (see Figure 1—figure supplement 4).
To gain intuition if GC can work for more realistic singleneuron dynamics, which are typically very different from the linear VAR model, we perform the same analysis with dynamics generated with generalized linear models (GLM). We assume the spike count of neuron $i$ at time t, ${\sigma}_{i,t}\sim \text{Poiss}({\lambda}_{i,t})$ is governed by a Poisson process with spiking rate ${\lambda}_{i,t}$.
where ${\mu}_{i}$ is the base rate for spiking, and the memory kernel ${\mathrm{\Gamma}}_{ij,q}$ is identical to the VAR model. We can further convolve the spikes with an exponential decay function to simulate calcium signals (GLMcalcium; see Materials and Methods for details).
Although the GLM and GLMcalcium dynamics are highly nonlinear and nonGaussian (Figure 1E and G), MVGC successfully identifies true links (see Figure 1FGJ). In general, for large correlation strengths, the false negative rate (missing true links) is less than 0.1. Interestingly, for the GLMcalcium model, the false positive rate (identified nonexistant links) at large connection strength is smaller than in the VAR model, as is the difference of errors between the MVGC and BVGC. A possible explanation is that for neurons with nonlinear interactions, the propagation of information in indirect links is less well described by linear models in direct links. In comparison, due to large autocorrelation caused by decay of the calcium signal, the correlation and crosscorrelation performs much worse for the nonlinear dynamics than for the VAR model (see Figure 1—figure supplement 5 and Figure 1—figure supplement 6).
This analysis of synthetic data shows that GC can reveal underlying connectivity structures from calcium signals, and that the performance difference between BVGC and MVGC is smaller in nonlinear models.
In the brain, a population of neurons can encode redundant signals. If the redundancy is in the structure of the network, for example, when two neurons share the identical input and output, the stochasticity will lead to different neuronal dynamics, and MVGC is able to identify the true connection. However, if two neurons have exactly the same activity, MVGC will underestimate the causality, while BVGC still reveal these connections (see Figure 1—figure supplement 7).
In GC analysis, it is essential to choose the maximum time lag $L$. If the maximum time lag $L$ is much less than the true interaction timescale ${L}_{\text{true}}$, then the reconstructed information flow will be incomplete. On the other hand, if $L$ is too large, we will run into the risk of overfitting. Here, we briefly discuss the effect of the maximum lag $L$ on the resulting GC network, using data generated by the synthetic network given by Figure 1A, simulated with VAR dynamics with true maximum lag ${L}_{\text{true}}=3$.
As we vary $L$, the maximum time lag used in the GC regression models, we observe that increasing the maximum time lag does not always improve our ability to identify the network connectivity (Figure 2A, B). Going beyond the true maximum time lag leads to errors, increases both the false positive rate (Figure 2) and the false negative rate (Figure 2B). Compared to MVGC, BVGC is more prone to detect spurious links, even at the correct maximum time lag.
Interestingly, we observe that the average GC value as a function of the number of maximum lags can be used to identify the true maximum time lag ${L}_{\text{true}}$. As shown by Figure 2, both the average BVGC and MVGC values increase steeply as $L$ approaches the true maximum lag, and reach a plateau for a few maximum lag values before overfitting significantly increases the average GC values. Based on this observation, we select the maximum time lag to be the knee of the average GC value as a function of the maximum lag, as this maximum time lag captures enough information flow while the number of fitted parameters is kept as small as possible.
GC analysis of motoneuron data
The data
In order to test the performance of Granger causality on population calcium imaging data, we focus our analyses on previously published calcium transients in motoneurons of embryonic zebrafish (Tg(s1020t:Gal4; UAS:GCaMP3), data published in De Vico Fallani et al., 2014), where the underlying structure of information flow is known. The calcium transients were recorded using fluorescent tags in vivo for 250 s at 4 Hz and with singlecell resolution, in ventral spinal cord motoneurons while the zebrafish was showing spontaneous coiling activity of its tail (Figure 2E–G). During these fictive coiling events, the activity of motoneurons follows a wave from the rostral to the caudal side of the spinal cord (Buss and Drapeau, 2001; Masino and Fetcho, 2005; Warp et al., 2012). Upon activation along the rostrocaudal axis, spinal motoneurons that project to the axial skeletal muscle fibers elicit a rostrocaudal wave of muscle contractions that alternates between the left and right side of the body and propels fish forward. Accordingly, the calcium signals show steady oscillations alternating between the left and right sides of the spinal cord, leading to leftright muscle activation (Figure 2G). We use this data set in order to see if we recover the expected sequential information flow, from the rostral spinal cord to the caudal spinal cord (arrows in Figure 2EF, networks in Figure 2H).
To encode the underlying information structure of the neuronal network, we label the recorded $N$ neurons with index $i$, and the side of the neurons by ${s}_{i}\in \{\text{L},\text{R}\}$. Assume that there are ${N}_{l}$ neurons in the left chain and ${N}_{r}$ neurons in the right chain, we order the indices such that the odd neurons are found on the left side, with ${s}_{i}=\text{L}$ for $i=1,3,\mathrm{\dots},2{N}_{l}1$, and the neurons with $i=2,4,\mathrm{\dots},2{N}_{r}$ are found in the right chain with ${s}_{i}=\text{R}$. The indices are then ordered by the neurons position along each of the two chains, such that if $s}_{i}={s}_{j$, neuron $i$ is more rostral than neuron $j$ if $i<j$. The fluorescence signal for the $i$th cell at time $t$ is denoted by ${f}_{i}(t)$, or equivalently by $D{F}_{i}/{F}_{i}(t)$ to represent the normalized fluorescence activity.
We will compute the GC network for multiple trials of calcium transients recorded in five fishes. Following the nomenclature of the dataset published in De Vico Fallani et al., 2014, we refer to the dataset recorded for fish $A$ and trial $B$ as $fAtB$; for example, $f3t2$ denotes trial number 2 recorded for fish number 3. From the dataset, we exclude recordings for fish number 2 because the recorded time is 2 min, much less than that of the other recordings of 4 min.
Naive GC
We apply the GC algorithm as described in Granger Causality to the calcium transients observed in embryonic motoneurons, and reproduce the results obtained in De Vico Fallani et al., 2014 for the dataset f3t2. We choose the maximum time lag at the knee of average GC value, $L=3$, corresponding to a duration of 750 ms, for both BVGC and MVGC analysis (Figure 2D). This knee corresponds to the maximum lag with the best balance between accuracy and complexity: further increasing the maximum lag yields very similar results but the GC analysis becomes much more computationally expensive (Figure 2—figure supplement 1). In order to verify that the length of the motoneuron dataset is sufficient to estimate the functional connectivity using Granger causality (see Figure 1—figure supplement 2 and Figure 1—figure supplement 3 for illustrations using synthetic data), we compared the two networks obtained by applying GC analysis to each half of the calcium imaging time series. The Pearson correlation coefficient between the GC values from the two halves is 0.87 for BVGC, and 0.75 for MVGC (see Figure 2—figure supplement 2), indicating the trial length is long enough for a consistent estimation of the GC values.
The GC network is then represented by directed graphs (Figure 2I). The nodes of the graph represent the motoneurons and the circle size is proportional to the nodal delta centrality as defined by De Vico Fallani et al., 2014: the larger the circle, the more central is the neuron in terms of its tendency to act as a transmitter hub (red color, positive value) or receiver hub (blue color, negative value) of information flow. Arrows display the functional connectivity links, that is the flow of information from one neuron to another. In the expected network assuming perfect information flow (Figure 2H), all the links are ipsilateral and represent rostrocaudal information flow.
Compared to De Vico Fallani et al., 2014, we identify the same functional connectivity matrices yielding the same connectivity networks. While the previous study focuses on recovering ipsilateral links and ignores contralateral links (depicted in dim gray in the network figures, see De Vico Fallani et al., 2014), we investigate all contralateral links and ipsilateral links (Figure 2I) as we aim to study why spurious contralateral connections occur. We represented the statistically significant direct GC links by arrows, proportional in size and color intensity to the GC value. The networks resulting from the GC analyses both for BVGC and MVGC show many other links than just rostrocaudal ones on either side.
In the MVGC analysis however, one would have expected only direct links to be significant, since conditioning on the other neuronal signals should prevent indirect links from being significant. This expectation is very different from the MVGC network in Figure 2I, where spurious contralateral links prevails.
Nonetheless, it is well known that MVGC is sensitive to additive noise, lack of data points, etc. To assess the impact of these artifacts on the resulting GC networks for motoneuron data, we define measures of information flows as following.
Defining directional biases
The sequential and biaxial information flow in the motoneuron network provides an excellent testing ground for GC. Specifically for the motoneuron datasets, to quantify the success of reconstructing the information flow in the motoneuron network, and to compare the information flow across different motoneuron datasets, we define the following measures of directional biases. First, we define the ratio of normalized total weights of ipsilateral links compared to all links
where ${G}_{ij}$ is the Granger causality values between neurons $i$ and $j$. If all connections are equal, then ${W}_{\text{IC}}=0.5$. We also define the ratio of ipsilateral rostraltocaudal links compared to all ipsilateral links as
This ratio gives a measure of the directionality of information flow. A bias in rostrocaudal links results in a departure from ${W}_{\text{RC}}\sim 0.5$. Notice that based on the anatomy, the ground truth of information flow can be smaller than ${W}_{\text{IC}}=1$, because motoneuron activity has been shown to rely on command neurons in the brainstem that activate motoneurons sequentially in the spinal cord, not on motor neurons synapsing onto each other. With the two measures of directional biases of information flow, now we proceed to develop an improved pipeline for GC analysis. Using different datasets of the embryonic motoneurons as examples, we will show how to improve GC to analyze calcium transients of neuronal populations. Our goal is to recover a more biologically reasonable information flow compared to the naive GC network.
These measures are only constructed for the motoneuron networks. For more general networks with known structures, one can use the distance between the true and the GClearned connectivity matrix to test the efficacy of the GC method. For networks with unknown structure, one can compare the GClearned information flow by using different partitions of the data.
Correcting for motion artifacts
In calcium imaging experiments, small movements of the animal body or the experimental setup can result in artificial fluorescence changes in the recording. In the calcium transients of data f3t1, we observe a big drop in fluorescence at a single time point out of 1000 (Figure 3B top panel) in the calcium traces of all neurons in the recording. As seen in the top panels Figure 3C, this is sufficient to corrupt the results of the GC analyses: neuron 10 (Figure 3A) appears to drive almost all other neurons, including contralateral neurons. We correct the motion artifact by setting the outlying time point’s fluorescence value to the average of the previous and next values (Figure 3B bottom panel). The strongly dominant drive by neuron 10 disappears (Figure 3C bottom panels) and ${W}_{\text{IC}}$ increases from 0.39 to 0.66 for BVGC and from 0.43 to 0.68 for MVGC (Table 1). This shows that GC analysis is sensitive to artifacts at single time points.
The motion artifact is a global noise on the observed neurons, a source that is wellknown to create spurious GC links (Vinck et al., 2015; Nalatore et al., 2007). After careful examination of the calcium traces, we notice that the dominant neuron before motionartifact correction, neuron 10, is characterized by its small dynamic range. Because the rest of the activity is low for neuron 10 and that a single time point seems to have a strong effect on other calcium traces, GC links driven by this neuron are high. We looked at other recordings containing a motion artifact and found that removing it did not always eliminate the spurious dominant neurons. The larger the artifact, the better we eliminate the spurious links: in Figure 3, the artifact represents a deviation of about 80% of the $DF/F$ and its impact on GC links is large. When the deviation is lower, the strong spurious links are not necessarily due to the motion artifact, therefore removing the artifact does not improve the expected network results.
Removing strange neurons
In order to improve the GC analysis results, we want to keep only clean calcium signals and remove the uncharacteristically behaving whose activities do not follow the stereotypical profiles of rise and falls. This is especially important for the MVGC analysis, in which traces of all other neurons are considered in the calculation of a GC value between two given neurons. Although BVGC values are not affected by neurons outside of the observed pair, having extra neurons might also influence the network structure through the GC values of the links.
In general, cells whose calcium transients showed time decay of tens of seconds were identified as non neuronal cells and excluded from GC analysis (see overlays of calcium transients in Figure 3—figure supplement 1). In the embryonic zebrafish traces f5t2, shown in Figure 3E, we remove neuron 13 (Figure 3D) that displays only one long calcium transient lasting for tens of seconds, and neuron 21, which does not exhibit an oscillating pattern.
GC networks are drawn for the neuronal populations before (Figure 3F top panels) and after removal of these outlier neurons (Figure 3F bottom panels). Before the removal, neuron 21 appears to drive many other neurons, likely due to its lower signaltonoise ratio (SNR) and the correlated noise between calcium traces. Once removed, ${W}_{\text{RC}}$ increases, meaning that a mostly rostrocaudal propagation is observed, as information in motoneurons flows from the rostral to the caudal spinal cord (Table 2). We also observe a small decrease in ${W}_{\text{IC}}$ upon removal of neuron 21, due to the fact that the ipsilateral links going out from neuron 21 are not present anymore, but spurious contralateral links remain.
Smoothing
One known limitation of traditional Granger causality analysis is its sensitivity to noise, especially correlated noise across variables (Vinck et al., 2015; Nalatore et al., 2007). In empirical calcium imaging data, correlated noise is common due to systemwise measurement noise and spurious correlations coming from the slow sampling rate.
To illustrate the problem of the noise, we return to the synthetic models in Granger Causality, a simulated system of 10neurons with the VAR and the GLMcalcium dynamics, where the true underlying signal is ${f}_{i}(t)$, for each neuron $i$. We add a systemwide noise, such that the noisecorrupted signal is
and $\zeta (t)$ is Gaussian white noise, with $\u27e8\zeta (t)\u27e9=0$, and $\u27e8\zeta (0)\zeta (t)\u27e9={\sigma}_{\zeta}^{2}\delta (t)$. As shown in Figure 4—figure supplement 2 and Figure 4—figure supplement 3, large correlated noise significantly decreases the ability of the Granger causality analysis to identify true connections.
We exploit the stereotypical shape of calcium signals – a fast onset and a slow decay – to parametrize the structure of the noise for empirical calcium signals. We assume that any correlation in the decay phase is the correlation of the noise. Additionally, we apply total variation differentiation regularization to denoise the fluorescence time series for each neuron (Chen et al., 2019) (see Materials and Methods and Chartrand, 2011 for details). Briefly, for each neuronal signal $f(t)$, total variation differentiation regularization assumes the noise is Gaussian and white, and that the temporal variations in the derivative are exponentially distributed and only weekly correlated in time, which is wellsuited to describe calcium data with episodes of continuous firing and continuous silence. Compared to usual total variation denoising that regulates the temporal variations in the signal, regulating the variation of the derivative is better at detecting sudden change of the signal strength, and is better at keeping the causality when applied to multivariate signals. As an example, Figure 4 shows the original and the smoothed fluorescence signal for two neurons in the same fish (dataset f3t2 from De Vico Fallani et al., 2014), as well as the residual noise. The noise in the motoneuron is correlated with correlation coefficients close to 1 for some pairs of neurons (Figure 4B). The large correlated noise is likely the cause of our problematic results in the MVGC analysis (see Figure 2E and Figure 4C). Denoising the calcium time series before the GC analysis eliminates links, especially in the MVGC approach (Figure 4C). The signal as quantified in terms of the relative weight of ipsilateral links ${W}_{\text{IC}}$ is also closer to the biological expectation (${W}_{\text{IC}}\sim 1$) than before smoothing, with MVGC giving better results (Figure 4D).
We note that total variation regularization is a global denoising method that acts on the entire time series. Different from a lowpass filter, total variation regularization is able to keep the fine structure of the fast rise in the calcium signal, while smoothing out the noise. Nonetheless, the assumption is that the noise is Gaussian, which should be verified before smoothing the data by examining the highfrequency end of the signal’s power spectrum. To address the problem of highfrequency correlated noise, one may propose to use spectral Granger causality (Geweke, 1982; Geweke, 1984), and only focus on the resulting GC values at low frequencies. However, we show that spectral GC applied to the unsmoothed calcium signal is unable to recover the information flow for any of the frequencies (Figure 4figure supplement 4). Finally, in cases where correlated noise cannot be removed, there are alternative methods to reject false links, for example using Granger causality computed on the timereversed data as the null hypothesis (Vinck et al., 2015).
Slow calcium timescale
One potential pitfall of performing Granger causality analysis on singlecell calcium signals is that the timescale of the calcium indicator and the sampling are typically much slower than that of the information transfer. The onset of calcium signals can occur in few milliseconds, comparable with the timescale of information transfer, that is the time difference between the firing of a presynaptic neuron and a postsynaptic neuron downstream. Nonetheless, based on the intrinsic buffer capability of neurons and the fluorescent properties of the calcium indicator, calcium signals typically decay on the timescale of hundreds of milliseconds to seconds (Tian et al., 2012). In motoneurons, neurons can fire in bursts of spikes, which may further hinder the resolution of information propagation. In this section, we investigate how the interplay of these slow and fast timescales impacts the accuracy of Granger causality analysis to identify information flow.
In the embryonic spinal cord, motoneurons are recruited along the rostrocaudal axis sequentially. The muscle contractions alternate between the left and right side of the body, and propel fish forward. Correspondingly, the left and right sides of the motoneurons are active as a steady oscillation from the head to the tail, leading to an expected information flow from rostral to caudal (Warp et al., 2012). The spikes are followed by a plateau that lasts on the order of seconds (SaintAmant and Drapeau, 2001). Using a combined statistical and dynamical approach, we simulate artificial “motoneuron” data (see example traces in Figure 5A), as two chains of five neurons, driven externally by two binary stimuli, ${I}_{\text{L}}(t)$ and ${I}_{\text{R}}(t)$, that randomly flip between an on ($I=1$) and an off ($I=0$) state, following the same statistics as recorded in the data. The stimulus of the left side, ${I}_{\text{L}}(t)$ is generated by randomly choosing time intervals when the stimulus state is on, ${\tau}_{\text{on}}$, and when it is off, ${\tau}_{\text{off}}$, uniformly from the sets $[\mathrm{min}({\tau}_{\text{on}}^{\text{data}}),\mathrm{max}({\tau}_{\text{on}}^{\text{data}})]$ and $[\mathrm{min}({\tau}_{\text{off}}^{\text{data}}),\mathrm{max}({\tau}_{\text{off}}^{\text{data}})]$, respectively. Here, $\{{\tau}_{\text{on}}^{\text{data}}\}$ is the set of duration of all “on”states in the experiment (De Vico Fallani et al., 2014) that we find by the positive finite difference method in the fluorescent signal smoothed by totalvariational regularization (Chartrand, 2011; Chen et al., 2019). These sets of longlasting time blocks, where the neurons are always active, simulates the plateau activity of the embryonic motoneurons. After ${I}_{\text{L}}(t)$ is sampled, the stimulus for the rightside neurons, ${I}_{\text{R}}(t)$, is generated such that the time delay between the “on”state of ${I}_{\text{L}}(t)$ and ${I}_{\text{R}}(t)$ is sampled from $[\mathrm{min}({\tau}_{\text{delay}}^{\text{data}}),\mathrm{max}({\tau}_{\text{delay}}^{\text{data}})]$, where the delay time ${\tau}_{\text{delay}}$ matches the empirical stimulus distribution. We model the sequential activation of the motoneurons by imposing the spike rate as a function of ${I}_{\text{L}}(t)$ and ${I}_{\text{R}}(t)$ (see Materials and Methods) and we use a standard Poisson process with these rates to generate the spike train for each neuron, $\{{\sigma}_{i,t}\}$. The spikes are then convolved with an exponential decay with time scale ${\tau}_{\text{ca}}$ to simulate the slow decay time of calcium signals.
The timescales in this model include the sampling time ${\tau}_{\text{sampling}}$, the calcium decay time constant ${\tau}_{\text{ca}}$, and the timescale of information propagation ${\tau}_{\text{info}}$. The sampling time is given by the experimental setup for the motoneuron data we analyze (De Vico Fallani et al., 2014), with ${\tau}_{\text{sampling}}=0.25\text{s}$. As shown in Figure 3—figure supplement 1, the empirical calcium decay time constant can be found by overlaying all epochs of decays of the calcium signals from different bursts and different neurons: fitting the decay of calcium signals to exponential functions gives ${\tau}_{\text{ca}}=2.5\text{s}$. Notice this decay timescale is limited by both the buffering capability of the sensor, here GCaMP3, and the intrinsic buffering capability of the cell (Tian et al., 2012), that is having a faster sensor does not always decreases this decay time scale. The information propagation timescale is chosen such that $\tau}_{\text{info}}<{\tau}_{\text{sampling}$. The baseline spike rate is set to ${\lambda}_{0}=32{\text{s}}^{1}$. Finally, the time resolution for the simulation is chosen to be ${\mathrm{\Delta}}_{\text{sim}}=0.0125s$, such that $\lambda {\mathrm{\Delta}}_{\text{sim}}<1$ to add stochasticity in the simulation.
Applying bivariate and multivariate GC analysis to the simulated data with the above parameters, we see that when the information propagation time scale is similar to the sampling time scale, both GC methods retrieve the correct information flow (Figure 5B). In general, multivariate GC performs better in identifying information flow compared to bivariate GC. Even when the information propagation time scale is much smaller than that of the sampling rate (${\tau}_{\text{info}}\le {\tau}_{\text{sampling}}$), the information flow can still be correctly retrieved, and the multivariate GC gives the correct input ${W}_{\text{ipsi}}$ value 1, and ${W}_{\text{RC}}$ significantly larger than the null value of 0.5.
The success of retrieval of information flow also depends on the decay constant, although the effect is not strong. As shown in Figure 5C, the performance is almost perfect for short calcium decay times, with ${W}_{\text{IC}}=1$ and ${W}_{\text{RC}}=1$, and longer decay results in slightly worse performance. For the empirical calcium decay rate in the GCaMP3expressing motoneuron data (indicated by the red line), ${W}_{\text{RC}}^{\text{MVGC}}=0.81$. Naturally, using fast calcium indicators in the experiments facilitates the analysis, but the improvement may not be significant.
Adaptive threshold for Fstatistics
We now compute the Granger causality strength (Equation 4) on the smoothed calcium time series. Naively, the Ftest is used to determine whether the Granger causality is significant (Equation 3). This test is exact if the residual, or the prediction error, is Gaussiandistributed and independent. For calcium signals, these assumptions do not hold: the noise is not necessarily Gaussian, and is not independent. While subsequent versions adapting GC to nonlinear and nongaussian dynamics with assumptions of the underlying dynamics have been shown to have wellknown asymptotic null distributions (Kim et al., 2011; Sheikhattar et al., 2018), it did not account for additional irregular stimuli. To develop a general method that can be applied to datasets with diverse stimuli, we use a datadriven method to construct the null distribution.
To test if we can still use the Ftest as a test of significance, we generate data consistent with a null hypothesis by cyclically shuffling the time series of the driving neuron (see Figure 6A). Specifically, for the bivariate GC value from neuron j to another neuron i, $G{C}_{j\to i}$, and the multivariate conditioned on the activity of the rest of neurons $\left\{{f}_{k}\right\},k\ne i,j$, we compute the Fstatistics with the time series ${f}_{j}(t)$ replaced by signal shuffled by a random time constant $\mathrm{\Delta}{t}^{rand}$:
If the significance test remains valid for the calcium time series, we expect almost no significant links and the distribution of the Fstatistics of the shuffled data should be the Fdistribution. However, as Figure 6BC shows using the example dataset f3t2 from De Vico Fallani et al., 2014, for both bivariate and multivariate GC, the distribution of the Fstatistics for the shuffled data, ${\mathcal{F}}^{\text{shuffle}}$, has a shifted support that is larger than that of the Fdistribution. This means that if we use the original Ftest based on the Fdistribution as a null model, even if there are no Gcausal links between two neurons, we will classify the link as significant.
To improve specificity, we instead define the null expectation as the distribution for the Fstatistics of the cyclically shuffled data to test for the significance of GC links. A GC link from the data is significant, if it has an Fstatistics that is above the $1p$ percentile of the null Fstatistics distribution, generated from the cyclically shuffled data. The null Fstatistics distribution can be approximated nonparameterically by Monte Carlo reshuffles. However, since the significance threshold depends on the tail of the Fstatistics distribution, to get an accurate estimation requires a large number of reshuffles and is computationally undesirable.
Instead, to accelerate the computation, we notice that the bulk of the ${\mathcal{F}}^{\text{shuffle}}$ distribution is well approximated by an Fdistribution, with outliers explained by the pseudoperiodicity of the data. We parameterize the null empirical distribution by $\mathcal{F}(F;\alpha ,\beta )$, and find the parameters by maximizing the likelihood (see Materials and methods for details). Repeating the significance test with the parametrized Fdistribution, we find a resulting causality network with fewer links (Figure 6D).
Pipeline summary and GC analysis of motoneuron data
Putting together all the observations of the previous sections, we developed an improved Granger causality analysis pipeline (Figure 7A) to overcome the limitations of naive GC for the investigation of directional information flow in neuron calcium data. These improvements consist of both preprocessing the calcium traces (see GC analysis of motoneuron data) and determining a custom threshold for the significance testing of the presence of a GC link, to avoid the many false positive links that the naive GC analysis finds (see Adaptive threshold for Fstatistics). Preprocessing the data aims to remove undesirable effects of noisy fluorescence signals and of motion artifacts on the GC results. The first step consists of selecting only neurons displaying calcium traces that match their expectation (Removing strange neurons): calcium traces with a low signaltonoise ratio or inconsistent behavior should be disregarded. This step is particularly important for MVGC analysis, in which the traces of all neurons are used to determine the presence of a link between any pair of neurons. Second, we correct the instantaneous change in fluorescence caused by motion artifacts (Correcting for motion artifacts). In the final preprocessing step, the calcium signals are smoothed to remove remaining noise (Smoothing), especially noise that is correlated across neurons and that leads to spurious links in the BVGC approach, and missing links in the MVGC analysis. In the BVGC analysis, correlation in the noise might be perceived as a link between two neurons. In the MVGC case, the correlation in the noise of the receiving neuron and in the noise of neurons on which the GC is conditioned can hide the actual influence of the signal of the driving neuron on the signal of the receiving neuron.
After these preprocessing steps, one can apply the GC algorithm to these calcium traces. In order to avoid calculating GC for many maximum time lags, the choice of this parameter can be chosen to match biological delay of information flow – here 750ms. For each pair of neurons, we timeshift one of the calcium traces in the GC analysis of a pair of neurons to remove causality between the two signals and use this new null model to determine the significance threshold, as motivated in Adaptive threshold for Fstatistics. Notice that for some datasets, the resulting MVGC network has more connections than the corresponding BVGC network. This is because the adaptive thresholds for the significance test are chosen separately for BVGC and MVGC analysis.
We take advantage of anatomical knowledge of the expected network for the motoneuron data set to compare the naive GC results to our customized GC results. The improved GC pipeline significantly increases the ${W}_{\text{IC}}$ ratio from $0.62\pm 0.15$ to $1.00\pm 0.00$ for both BVGC and MVGC ($p<0.001$). ${W}_{\text{IC}}=1$ indicates that there are no contralateral flow, corresponding to what is known from anatomy. On the other hand, the ${W}_{\text{RC}}$ ratio does not significantly increase ($0.59\pm 0.09$ to $0.59\pm 0.15$ for BVGC and $0.52\pm 0.08$ to $0.53\pm 0.10$ for MVGC, $p>0.05$) (Figure 7B). ${W}_{\text{RC}}$ close to 0.5 shows that GC analysis is unable to capture the rostrocaudal information flow pattern.
The improvement of the ipsilateral ratio can be explained by the disappearance of most contralateral links that were spuriously detected by the naive GC analysis. The rostrocaudal ratio ${W}_{\text{RC}}$ is however not significantly improved. The temporal resolution of the recording (4 Hz leading to one frame every 250ms) may not be sufficient to correctly determine the directionality of the information flow as is much slower than the actual propagation speed (a few milliseconds, see Masino and Fetcho, 2005). In a similar manner, the decay timeconstant of GCaMP3 is large compared to the refractory period between two spikes. Results could be further improved by recording at higher frequency and using novel fluorescent calcium sensors such as GCaMP6f, which has faster kinetics.
GC analysis of hindbrain data
In order to investigate the information flow between neurons in a more complex dataset, we analyzed population recordings from neurons in the hindbrain of larval zebrafish performing optomotor response (OMR; data from Severi et al., 2018, also see Figure 8A and B), in which the true information flow is unknown. On each horizontal plane along the dorsoventral axis across the whole hindbrain, neurons were recorded in vivo using twophoton laser scanning calcium imaging at 5.81 Hz while Tg(elavl3:GCaMP5G) transgenic larval zebrafish (previously known as Tg(HUC:GCaMP5G)) responded to a moving grading on their right side ($n=139$ neurons in the reference plane, Figure 8A). Upon motion of the grading, zebrafish larvae responded by swimming forward, often with a left bias (Figure 8C, right). We classified hindbrain neurons in three categories: motorcorrelated neurons in the medial V2a stripe (red in Figure 8A and B, Kinkhabwala et al., 2011; Kimura et al., 2013; Pujala and Koyama, 2019), other motorcorrelated neurons (blue in Figure 8A and B) and non motorcorrelated neurons (gray in Figure 8A and B). Because V2a neurons localized in the medial hindbrain are wellknown command neurons driving locomotion (Kimura et al., 2013; Pujala and Koyama, 2019), we first focused our analyses on medial neurons whose activity was correlated to swimming (red neurons in the rectangle, ${n}_{\text{med}}=20$, in Figure 8A and B). The corresponding calcium traces (Figure 8B) were exempt of motion artifacts and smoothing was not necessary as the scanning of the infrared laser leads to noise that is not correlated between neurons. When applying the naive bivariate Granger causality using the original threshold, we found almost all links to be significant (Figure 8D), suggesting numerous spurious links due to the high overall correlation in neuron activity. Motorcorrelated neurons exhibited a higher drive (Figure 8E), defined as the sum of all its outgoing GC links:
where $j$ spans all other neurons. Critically, the neuronal drive was not correlated to the neuron’s position along the scanning direction (correlation of $r=0.15$ across all 139 neurons and $r=0.03$ across medial swimcorrelated neurons, $p>0.05$, Figure 8—figure supplement 1A), implying that a correction for the scanning delay of the laser scanning microscope was not necessary. In contrast, the drive for a given neuron was correlated to the signaltonoise ratio (SNR) of its calcium trace, computed following details in Materials and Methods, with a Pearson’s correlation coefficient of $r=0.45$ across all neurons and $r=0.69$ across medial neurons ($p<0.001$, Figure 8E). This effect could be corrected using an adaptive threshold tailored to each neuron pair to normalize the GC values. Interestingly, we noticed that the drive was higher for motorcorrelated neurons, especially the ones located in the medial stripe and assumed to likely correspond to V2a reticulospinal neurons (Kinkhabwala et al., 2011; Kimura et al., 2013; Pujala and Koyama, 2019), further justifying focusing our attention on this subset of neurons.
To remove the spurious connections, we computed, as introduced before, the custom threshold for the significance testing of GC links. The difference from motoneuron dataset is that the activities of neurons in the hindbrain are driven by a periodic stimulus. Therefore, to generate the null statistics, we randomize the neural activities of the driver neurons by shuffling the neural signals, while keeping the stimulus fixed. More specifically, we divide the entire recording into 19 epochs of consecutive onset (10 s) and offset (5 s) of the stimulus (see Figure 8B), and generate shuffled calcium traces by randomizing the order of these 19 epochs. For each neuron pair $(i,j)$, we apply $m$ such random shuffles to the calcium trace of the driver neuron $i$, and compute the Fstatistics $\{{F}_{i\to j}^{\text{shuffled}}\}=\{{F}_{i\to j}^{{s}_{1}},\mathrm{\dots},{F}_{i\to j}^{{s}_{m}}\}$. For BVGC, $m=1000$, and for MVGC, $m=100$, we denote the empirical distribution of $\{{F}_{i\to j}^{\text{shuffled}}\}$ by ${\mathcal{F}}^{\text{shuffled}}$.
As in the previous examples, we need to parameterize ${\mathcal{F}}_{i\to j}^{\text{shuffled}}$. As before, for each neuron pair $(i,j)$, we noticed that the distribution ${\mathcal{F}}_{i\to j}^{\text{shuffled}}$ was welldescribed by a constantrescaled Fdistribution (Figure 8—figure supplement 2). Applying an adaptive threshold on ${F}_{i\to j}$ reduces to applying the original threshold on the normalized ${\stackrel{~}{F}}_{i\to j}$, defined by dividing the naive $F$statistics with the expectation value of the $F$statistics generated by the shuffled data. Mathematically,
Finally, by analogy, we use Equation 5 to compute the normalized version of the GC values, directly from the normalized Fstatistics. The final BVGC matrix is shown in Figure 8F. Normalizing the BVGC values allowed the reduction of the correlation between the drive and the SNR, from 0.69 to now 0.49 for medial motorcorrelated neurons (Figure 8G). With this new threshold for significance, most spurious links disappeared. The remaining significant BVGC links between medial motorcorrelated neurons revealed strong information flow between these key command neurons. The MVGC resulted in a much smaller number of significant links (Figure 8H), suggesting a strong global correlation between the neurons, with only the strongest links remaining.
In order to quantify the direction of global information flow, we computed the distribution of the angles of the significant GC links. We normalized the raw distribution by the distribution of all possible GC links in the network, to suppress the influence of neuron positions on the shape of the histogram (see Materials and methods). The normalized distribution (Figure 8I) revealed a biased righttoleft information flow. Because the visual grading was presented on the right side of the fish, moving forward from tail to head, one expects the fish to turn more to the left. As expected, we observe in the plane of interest a biased distribution of the tail angles towards the left (Figure 8C, right). Such righttoleft flow of information was systematic across all motorcorrelated neurons in the same plane (Figure 8, blue), as well as across the 11 planes in the same reference fish, and for all planes of the 10 fish (Figure 8—figure supplement 4, blue, purple, gray respectively).
The observed flow of information from right to left in the hindbrain could reflect the recruitment of excitatory command neurons in order to recruit spinal motor neurons on the left, and start swimming with a left bend. Upstream of command neurons lies a critical high motor center that has been identified by electrical stimulations across multiple vertebrate species and is referred to as the mesencephalic locomotor region, or MLR (Shik et al., 1966). We recently identified, functionally and anatomically, the locus of the MLR in larval zebrafish (CarboTano et al., 2022). We therefore investigated the GC links across three ventral and horizontal planes where numerous command neurons are located in the caudal hindbrain (Kimura et al., 2013; Pujala and Koyama, 2019): the reference plane analyzed in Figure 8, and the adjacent planes respectively 10 µm more dorsal and more ventral than the reference plane (Figure 9A–C, left). In the behavioral recordings acquired for all three planes, we observed a biased distribution of the tail angles towards the left (Figure 9A–C, top right). Accordingly, a righttoleft flow of information was found across motorcorrelated neurons (Figure 9A–C, bottom right). Remarkably, we also detected strong driver neurons (Figure 9A–C, left) in the locus of the right MLR centered in rhombomere 1 (CarboTano et al., 2022, Figure 9B, left). As expected from anatomy (CarboTano et al., 2022) and physiology (Brocard et al., 2010, Cabelguen et al., 2003, Ryczko and Dubuc, 2013, Ryczko et al., 2016), neurons in the right MLR locus drove the activity of neurons in the left MLR locus as well as more caudally located neurons in the ipsilateral and contralateral hindbrain.
Discussion
Granger causality is a simple and effective method for information flow retrieval in large brain areas (Shojaie and Fox, 2021; Barnett et al., 2018; Nicolaou et al., 2012). However, to apply GC analysis on calcium signals from population of neurons at singlecell level requires a careful reexamination of the method, because of the nonlinearity, nonGaussianity, and the possible predominance of correlated noise in the data depending on the imaging method used. We have developed an improved analysis pipeline and demonstrated its usefulness in retrieving information flow in zebrafish either within a dozen of motoneurons (De Vico Fallani et al., 2014) and tens of hindbrain neurons (Oldfield et al., 2020). Since the properties of calcium signals in neuronal populations are comparable across animal models, the tailored analysis pipeline we are proposing is not limited to zebrafish, and can be applied to calcium imaging data across animal species.
The first concern of applying GC analysis to calcium signals is how well GC, a measure of linear dependence among signals, can retrieve information flow in these nonlinear and nonGaussian imaging data. Using synthetic data on small networks with different types of dynamics mimicking calcium signals, from linear autoregressive models to spiking dynamics with exponential decay, we have demonstrated that GC can successfully reconstruct the underlying true dynamical interactions among neurons for a wide range of dynamical variables. In the case of motoneurons that fire repeatedly in bursts, we modeled network activity with two chains of neurons using empirically identified dynamical parameters. In this simulation, we showed that GC can retrieve the expected ipsilateral flow of information with rostraltocaudal directionality, even when the information propagation occurs at a faster scale than the experimental sampling time and the time decay of the calcium transient. Altogether our results on simulations show that GC is a useful analysis method for population recordings of singlecell level calcium signals.
Throughout the examples used, we compared results from bivariate GC and multivariate GC analysis. While BVGC overestimates the true number of causal links, we find that MVGC is prone to a winnertakeall phenomenon that may represent just one of many plausible systemlevel models that can account for the observed data. This is especially problematic when the signals are strongly correlated, for example, in the presence of redundant signals. In addition, MVGC is more prone to the issue of overfitting. We therefore suggest using BVGC on datasets with large numbers (hundreds) of neurons, and MVGC on datasets with few tens of neurons. In general, starting with a small number of neurons of special interest seems wise.
After improving the confidence on the GC method, we applied it to real calcium signals. Because real data is noisy and noise can be correlated or not depending on the imaging method used, GC analysis requires careful examination of multiple pitfalls originating from these calcium signals that we addressed in our analysis pipeline. Our pipeline illustrates the preprocessing of the signals and the postprocessing of the GC results. This effort is complementary to previous studies where new variations of GC were introduced to take into account the nonlinear and nongaussian dynamics in interacting neurons, for example by modifying the underlying VAR dynamics with GLM or VAR with sparsity constraints, and subsequently using adaptive thresholds for significance tests (Kim et al., 2011; Sheikhattar et al., 2018; Francis et al., 2018; Francis et al., 2022). Future work of interest should combine these variations of GC with the pipeline we built here.
Preprocessing of calcium imaging data. As reported previously in Vinck et al., 2015; Nalatore et al., 2007, we find that GC analysis is sensitive to correlated noise, which can often occur in calcium signals, due to, for example, motion artifacts or fluctuations of the imaging laser for nonscanning imaging. We highlight that a motion artifact occurring at a single time point can create spurious GC links and should therefore be removed before proceeding to GC analysis. Similarly, correlated noise across neurons throughout the measurement will also results in spurious links, and will be especially problematic for multivariate GC analysis. The pitfalls associated with correlated noise can be resolved by identifying statistical features of the noise during the decay of calcium signals, and subsequently removing them using total variation differentiation regularization as we used here, or other nonlinear filtering methods. Future versions of the smoothing should be applied to all neurons simultaneously, assuming that the noise is correlated, to best preserve the timeordering of the signal.
Adaptive threshold for significance test in GC analysis. The naive GC analysis uses a significance test that compares the Fstatistics with the null test, a Fdistribution. While this null model is exact for Gaussian processes, for real data with nonlinear dynamics, nonGaussian noise or data with periodic stimuli, the null distribution often deviates and needs to be learned through randomization of the data. In order to develop a more stringent test of significance for GC links, we generated null models of the Fstatistics using random cyclic shuffles of the real data. By using a pairspecific threshold for the Fstatistics, we were able to correct for the effect of the signaltonoise ratio on the GC value.
When real calcium traces were acquired with a spinning disk microscope, that is using a camera chip for detection of all pixels at once illuminated by a laser, the noise was correlated because all pixels over the field of view were recorded simultaneously. In this case, smoothing the fluorescence traces was consequently necessary. In contrast, this step is not advised with twophoton laser scanning microscopy or high speed volumetric lightsheet imaging in which respectively different pixels and planes are simultaneously scanned across the sample. Furthermore, we found that the scanning delay in laser scanning microscopy acquired at 5.81 Hz did not influence the GC results: neurons recorded at the beginning of the scanning cycle did not appear as driver neurons of neurons recorded at the end of the cycle. We found however that the drive of a given neuron computed from the naive GC was correlated with the signaltonoise ratio (SNR) of the calcium trace. Using an adaptive threshold for the significance test, the correlation between the drive and SNR decreased.
Applied to motoneuron data, the new pipeline led to an improvement in removing the spurious contralateral links, but did not significantly improve the directionality of the information flow from rostral to caudal — likely due to the small temporal resolution of the data (4 Hz) while activity propagates in few tens of milliseconds between segments.
In population recordings of brainstem neurons with unknown underlying connectivity, we first focused on neurons whose activity was correlated with the motor output and were located in the area of important command neurons, referred to as the V2a medial stripe (Kinkhabwala et al., 2011) that was previously shown to be critical for generating motor output (Kimura et al., 2013). We found that V2a neurons whose activity was correlated to the motor output were more likely to drive the activity of other neurons than neurons whose activity did not encode the motor output. In the condition where the stimulus is presented on one side of the fish, resulting in the fish often performing contralateral turns, the adapted multivariate Granger causality analysis revealed that the information flows mostly from the ipsilateral to the contralateral side in the brainstem. In order to further understand the origin of the drive underlying this effect, we extended the analysis to all neurons that had correlated activity with the motor output, including neurons outside the V2a medial stripe. Remarkably, we found a cluster of neurons that contained strong driver neurons on the side where the visual stimulus was presented and in the locus of the MLR, a motor center that recruits neurons in the V2a medial stripe to generate motor output (CarboTano et al., 2022). Our observations suggest that MLR cells on the ipsilateral side mainly drive the activity of contralateral MLR cells and motorcorrelated neurons to elicit swimming. Throughout vertebrate species, the MLR has been defined functionally based on electrical stimulations effective at eliciting movement, but had not been yet identified from neuronal recordings during locomotion due to the difficulty to access this deep region in the brainstem during active locomotion. To our knowledge, our result is therefore the first evidence in vertebrates of recruitment of cells in the MLR locus during active locomotion.
We have focused in this manuscript on issues that are specific to calcium imaging. We only briefly discuss problems that are also common to other neural data types such as fMRI. Firstly, we emphasize that with current experimental techniques, only a small portion of the entire nervous system is recorded. While newer versions of GC have been adapted to take into consideration latent variables which contribute equally to all neurons (Guo et al., 2008), or through subsequent removal of causal links (Verny et al., 2017), GC does not in general allow identification of the true underlying connection in the presence of unobserved neurons. However, improved GC is able to identify information flow and an effective functional causal network for the observed neurons. Another issue specific to multivariate GC is the problem of overfitting (Seth et al., 2015), especially when the number of neurons is large compare to the number of independent samples in the data. In this situation, one would need to consider group the activities of neurons (Meshulam et al., 2019), either according to their correlation, or by adjacency, and apply GC to this coarsegrained network. One can also apply GC to subnetworks of observed neurons, and construct a distribution of information flows. Note that other approaches to identify causal links than GC have been developed and applied in genomic data (Verny et al., 2017), gene regulatory networks (Meinshausen et al., 2016) and largescale neural networks (Ke et al., 2019). Further work is now needed to explore whether and how these approaches could be adapted with our improved pipeline to calcium recordings.
Overall, comparing to common practices of using correlation to extract a functional network from calcium neural signals, our GC pipeline provides more information by informing on the direction of information flow. Additionally, compared to existing methods to overcome the pitfalls of calcium data, such as addressing the global additive noise using an inversetime GC (Vinck et al., 2015), our pipeline preprocesses calcium signals taking advantage of their slow decay, and keeps the GC approach as a simple method. By writing down GC value as a function of the Fstatistics, normalized by null models directly constructed using shuffled data, we were able to correct the GC network by adjusting the values of individual links, and thereby to reduce the influence of signaltonoise ratio in a simple yet effective way. The improved pipeline makes it possible to apply Granger causality analysis to data from experiments that simultaneously record from high numbers of neurons in larger networks.
Materials and methods
Synthetic dynamics on small networks
Request a detailed protocolWe simulate artificial neural networks with three different dynamics, evolving at a time resolution of ${\mathrm{\Delta}}_{\text{sim}}$. For simplicity, we set the information flow to occur at the same time delay. The memory kernel, or the influence of neuron $j$ on neuron $i$ separated by time lag of $q$ (in the unit of ${\mathrm{\Delta}}_{\text{sim}}$), is denoted as ${\mathrm{\Gamma}}_{ij,q}$. The three dynamics are:
Vector autoregression models (VAR)
${f}_{i,t}=\sum _{q=1}^{{L}_{\text{true}}}\sum _{j}{\mathrm{\Gamma}}_{ij,q}{f}_{j,tq}+{\xi}_{i,t},$where ${\xi}_{i,t}$ is a white noise. The vector autoregression model is identical to the underlying model for Granger causality analysis, and allows GC to best reproduce the network structure.
Spike dynamics for generalized linear model (GLM)
We simulate spiking neurons with nonlinear interaction is through the generalized linear model (GLM), where the spiking rates of the neurons are given by
${\lambda}_{i,t}=\mathrm{exp}\left({\mu}_{i}+\sum _{q=1}^{{L}_{\text{true}}}\sum _{j}{\mathrm{\Gamma}}_{ij,q}{\sigma}_{j,tq}\right)$and the spike counts
${\sigma}_{i,t}\sim \text{Poiss}({\lambda}_{i,t}).$Here, ${\mu}_{i}$ gives a base rate for spiking, ${\mathrm{\Gamma}}_{ij,q}$ is how the spike rates depend on the neurons’ past history. The spike counts ${\sigma}_{i,t}$ are what we consider as synthetic data, and what we will use for Granger causality analysis.
Spike dynamics regressed by exponential decay (GLMCalcium)
To simulate Calcium neuronal data, which has a fast onset and a slow decay, we add an exponential regression to the spike simulation with the GLM. The resulting trace is
${f}_{i,t}=\sum _{q=0}^{\mathrm{\infty}}{\sigma}_{i,tq}{e}^{q/{\tau}_{\text{ca}}}{\mathrm{\Delta}}_{\text{sim}}.$
We simulate the above three dynamics with the interaction network, ${\mathrm{\Gamma}}_{ij,q}$. The network has equal numbers of excitatory and inhibitory neurons to ensure that the dynamics is stationary. For convenience, the strength of the connection is set to be identical, $\mathrm{\Gamma}}_{ij,q}{}_{q<{L}_{\text{true}}}=c{A}_{ij$, where $c$ is the connection strength, and ${A}_{ij}$ is the signed adjacency matrix.
Crosscorrelation
Request a detailed protocolThe simplest approach of finding links in networks is through crosscorrelation,
The (equaltime) correlation matrix given by ${C}_{ij}(\mathrm{\Delta}t=0)$ is commonly used in calcium signal analysis as a functional network. Nonetheless, it is symmetric, and does not reveal directional information.
Smooth calcium trace using total variation regularization
Request a detailed protocolWe follow the same procedure as in Chen et al., 2019 to smooth calcium traces using total variation regularization. Existing MATLAB packages and custom code are used to reconstruct the smooth trace Chartrand, 2011. Here, we outline this smoothing method briefly.
Consider a raw calcium fluorescence signal $f(t)$ from neurons that are known to generate episodic spikes and silences. These episodic activities can be characterized by the underlying derivative signal $u(t)$ which governs the onset and offset of the episodes, which we assume have an exponentially distributed temporal variations, and is only weakly correlated in time. If we further assume that the noise in $f$ is Gaussian and white, the maximum likelihood reconstruction of the signal $f(t)$ is equivalent to minimizing.
where $A$ is the antiderivative operator, the combination ${\sigma}_{n}^{2}{\tau}_{n}$ is the spectral density of noise floor that we can retrieve from the power spectrum of $f$ at high frequencies, ${\sigma}_{f}$ is the total standard deviation of the signal, and ${\tau}_{f}$ is the typical time scale of these variations. We determine the one unknown parameter ${\tau}_{f}$ by asking that, after smoothing, the cumulative power spectrum of the residue $Auf$ has the least root mean square difference from the cumulative power spectrum of the extrapolated white noise.
Compute signaltonoise ratio
Request a detailed protocolWe compute the signaltonoise ratio (SNR) for the fluorescence trace of a neuron, $f(t)$, using the smoothing procedure above. Namely, for each trace $f(t)$, we first apply the smoothing procedure. Then, the power of the noise, ${P}_{\text{noise}}$, is given by the extrapolated white noise, and equivalently by the sum of the power spectrum of the residue $Auf$. The sum of the power spectrum of $f$ gives the sum of the power of the noise and the power of the signal, ${P}_{\text{signal}}+{P}_{\text{noise}}$. Finally, the SNR is computed as the ratio of ${P}_{\text{signal}}/{P}_{\text{noise}}$.
Synthetic statistical data generation
Request a detailed protocolTo study the effect of slow calcium decays on the accuracy of GC analysis, we generate synthetic data statistically. Specifically, we consider two chains of five neurons, driven externally by block stimuli with progressing timedelay across each chain, and alternates between the left and the right side. The stimulus of the leftside is generated by choosing intervals, ${\tau}_{\text{on}}$, for consecutive onstate (and ${\tau}_{\text{off}}$ for consecutive offstates) uniformly from the set $[\mathrm{min}({\tau}_{\text{on}}^{\text{data}}),\mathrm{max}({\tau}_{\text{on}}^{\text{data}})]$, where the $\{{\tau}_{\text{on}}^{\text{data}}\}$ is the set of duration of all ‘on’states in a real zebrafish experiment, founded by positive finite difference in the fluorescent signal smoothened by totalvariational regularization. The stimulus of the rightside follows with the delay time ${\tau}_{\text{delay}}$ also matches the uniform distribution on the support of the empirical distribution.
We model the sequential activation of the motoneurons by imposing the spike rate as
where $k$ is the ordered index of neurons on each side, ${\tau}_{\text{info}}$ is the time scale of information propagation, and ${\lambda}_{0}$ is a large basal spike rate to simulate the plateau in the neural activity. We use a standard Poisson process with these rates to generate the spike train for each neuron $\{{\sigma}_{i},t\}$. Finally, the observed calcium signal is computed by convoluting the spikes with exponential decays for each neuron $i$,
and downsampling at a sampling frequency ${\tau}_{\text{sampling}}^{1}$.
Neuron identification in the hindbrain dataset
Request a detailed protocolWe analyzed the raw fluorescence movies from Severi et al., 2018 using the suite2p software (Pachitariu et al., 2017), a pipeline for processing twophoton calcium imaging data. The pipeline first performs 2D registration of the plane recording to remove motion artifacts. Regions of interest (ROIs) are detected automatically, classified into cell and noncell, and their calcium traces are extracted by the software. Following the automatic neuron classification, we performed a manual verification to remove spurious cells and add missing cells. These were selected based on the shape of the detected ROI and on the shape of the extracted fluorescence trace. The V2a neurons are identified by their positions and motorcorrelated calcium activitites. The MLR neurons are identified based on their location within the MLR locus as determined in CarboTano et al., 2022, and specifically not using any information about their shape nor the fact that their activity is motorcorrelated.
Normalized polar histogram of Granger causality links
Request a detailed protocolWe computed the distribution of the angles of the significant GC links to quantify the direction of global information flow. To suppress the influence of neuron positions on the direction distribution, we normalized the distribution by dividing it over the distribution of the angles of all possible connections given the position of the neurons.
Data availability
The current manuscript is a computational study, so no data have been generated for this manuscript. All data used in our manuscript have been previously published. Modelling code is available on the github: https://github.com/statbiophys/zebrafishGC, (copy archived at Statistical Biophysics Consortium, 2023). Data files are available at: https://doi.org/10.5281/zenodo.6774389.

ZenodoData access for figures of Chen, Ginoux, Wyart, Mora & Walczak.https://doi.org/10.5281/zenodo.6774389
References

The MVGC multivariate granger causality toolbox: a new approach to grangercausal inferenceJournal of Neuroscience Methods 223:50–68.https://doi.org/10.1016/j.jneumeth.2013.10.018

Detectability of granger causality for subsampled continuoustime neurophysiological processesJournal of Neuroscience Methods 275:93–121.https://doi.org/10.1016/j.jneumeth.2016.10.016

A tutorial review of functional connectivity analysis methods and their interpretational pitfallsFrontiers in Systems Neuroscience 9:175.https://doi.org/10.3389/fnsys.2015.00175

Synaptic drive to motoneurons during fictive swimming in the developing zebrafishJournal of Neurophysiology 86:197–210.https://doi.org/10.1152/jn.2001.86.1.197

Numerical differentiation of noisy, nonsmooth dataISRN Applied Mathematics 2011:1–11.https://doi.org/10.5402/2011/164564

Searching for collective behavior in a small brainPhysical Review. E 99:052418.https://doi.org/10.1103/PhysRevE.99.052418

Measuring and interpreting neuronal correlationsNature Neuroscience 14:811–819.https://doi.org/10.1038/nn.2842

Hierarchy of neural organization in the embryonic spinal cord: grangercausality graph analysis of in vivo calcium imaging dataIEEE Transactions on Neural Systems and Rehabilitation Engineering 23:333–341.https://doi.org/10.1109/TNSRE.2014.2341632

Multivariate granger causality analysis of fMRI dataHuman Brain Mapping 30:1361–1373.https://doi.org/10.1002/hbm.20606

The development and application of optogeneticsAnnual Review of Neuroscience 34:389–412.https://doi.org/10.1146/annurevneuro061010113817

Successful reconstruction of a physiological circuit with known connectivity from spiking activity alonePLOS Computational Biology 9:e1003138.https://doi.org/10.1371/journal.pcbi.1003138

Measurement of linear dependence and feedback between multiple time seriesJournal of the American Statistical Association 77:304–313.https://doi.org/10.1080/01621459.1982.10477803

Measures of conditional linear dependence and feedback between time seriesJournal of the American Statistical Association 79:907–915.https://doi.org/10.1080/01621459.1984.10477110

Circuits controlling vertebrate locomotion: moving in a new directionNature Reviews. Neuroscience 10:507–518.https://doi.org/10.1038/nrn2608

Partial granger causality  eliminating exogenous inputs and latent variablesJournal of Neuroscience Methods 172:79–93.https://doi.org/10.1016/j.jneumeth.2008.04.011

A granger causality measure for point process models of ensemble neural spiking activityPLOS Computational Biology 7:e1001110.https://doi.org/10.1371/journal.pcbi.1001110

Fictive swimming motor patterns in wild type and mutant larval zebrafishJournal of Neurophysiology 93:3177–3188.https://doi.org/10.1152/jn.01248.2004

Coarse graining, fixed points, and scaling in a large population of neuronsPhysical Review Letters 123:178103.https://doi.org/10.1103/PhysRevLett.123.178103

Mitigating the effects of measurement noise on granger causalityPhysical Review. E, Statistical, Nonlinear, and Soft Matter Physics 75:031123.https://doi.org/10.1103/PhysRevE.75.031123

The multifunctional mesencephalic locomotor regionCurrent Pharmaceutical Design 19:4448–4470.https://doi.org/10.2174/1381612811319240011

The mesencephalic locomotor region sends a bilateral glutamatergic drive to hindbrain reticulospinal neurons in a tetrapodThe Journal of Comparative Neurology 524:1361–1383.https://doi.org/10.1002/cne.23911

Granger causality analysis in neuroscience and neuroimagingThe Journal of Neuroscience 35:3293–3297.https://doi.org/10.1523/JNEUROSCI.439914.2015

Upravlenie khod’boĭ i begom posredstvom elektricheskoĭ stimulatsii srednego mozgaBiofizika 11:659–666.

Granger causality: a review and recent advancesAnnual Review of Statistics and Its Application 9:289–319.https://doi.org/10.1146/annurevstatistics040120010930

Imaging neuronal activity with genetically encoded calcium indicatorsCold Spring Harbor Protocols 2012:647–656.https://doi.org/10.1101/pdb.top069609

Learning causal networks with latent variables from multivariate information in genomic dataPLOS Computational Biology 13:e1005662.https://doi.org/10.1371/journal.pcbi.1005662

Analysis of sampling artifacts on the granger causality analysis for topology extraction of neuronal dynamicsFrontiers in Computational Neuroscience 8:75.https://doi.org/10.3389/fncom.2014.00075
Decision letter

Gordon J BermanReviewing Editor; Emory University, United States

Timothy E BehrensSenior Editor; University of Oxford, United Kingdom
Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.
Decision letter after peer review:
Thank you for submitting your article "Granger causality analysis for calcium transients in neuronal networks: challenges and improvements" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Ronald Calabrese as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Essential revisions:
1) It was not clear to the reviewers what lessons are specific to the system they are studying and which ones are to be taken as more general lessons. Certainly, dealing with slow calcium dynamics, motion artifacts, and smoothing, are general problems in calcium imaging, but the reviewers were puzzled a bit about how to decide which neurons are "strange" without a lot of systemspecific knowledge. This seems to be a rather important effect, and having a bit more guidance as to this point in the discussion would be useful.
2) It was not entirely sure what results one should take home from the hindbrain analysis. It is clear that there is a moreorless global signal modulating all neural activity, but this is a common occurrence in population recordings (often, one subtracts this off via PCA or another means before proceeding). Is the general lack of causal links (via the MVGC at least) a generic phenomenon in recurrent networks, or is there something more systemspecific here? Accordingly, it might be interesting to run a recurrent neural network simulation with similar properties to the hindbrain (and perhaps with correlated driving) to see what GC/MVGC would predict. Is there any hope of these methods finding information flow in recurrent networks, or should we restrict the method to networks where we expect the primary mode of information transmission to be feedforward? Either performing simulations or limiting the stated results to feedforward networks would be important.
3) The reviewers pointed out several studies that adapt the classical GC for both electrophysiology data and calcium imaging data (e.g., [1] A. Sheikhattar et al., "Extracting Neuronal Functional Network Dynamics via Adaptive Granger Causality Analysis", PNAS, Vol. 115, No. 17, E3869E3878, 2018. [2] N. A. Francis et al., "Small Networks Encode DecisionMaking in Primary Auditory Cortex", Neuron, Vol. 97, No. 4, 2018. [3] N. A. Francis et al., "Sequential Transmission of TaskRelevant Information in Cortical Neuronal Networks", Cell Reports, Vol. 39, No. 9, 110878, 2022.). In reference [1], a variation of GC based on GLM loglikelihoods is proposed that addresses the issues of nonlinearity, nonstationarity, and nonGaussianity of electrophysiology data. In [2] and [3], a variation of GC using sparse multivariate models is introduced with application to calcium imaging data. In particular, all three references use the sparse estimation of the MVAR parameters in order to mitigate overfitting and also use corrections for multiple comparisons that also reduce the number of spurious links. We suggest discussing these relevant references in the introduction (paragraphs 2 and 3) and discussion.
4) A major issue of GC applied to calcium imaging data is that the trials are typically limited in duration, which results in overfitting of the MVAR parameters when using least squares (See references [2] and [3] above, for example). The authors mention on page 4 that they use least squares to estimate the parameters. However, for the networks of ~10 neurons considered in this work, stationary trials of a long enough duration are required to estimate the parameters correctly. We suggest that the authors discuss this point and explicitly mention the trial durations and test whether the trial durations suffice for stable estimation of the MVAR parameters (this can be done by repeating some of the results on the synthetic data and using different trial lengths and then assessing the consistency of the detected GC links).
5) The definition of the "knee" of the average GC values as a function of the lag L needs to be a bit more formalized. In Figure 2H using the synthetic data, the "knee" effect is more clear, but in the real data shown in Figure 2I, the knee is not obvious, given that the confidence intervals are quite wide. Is there a way to quantify the "knee" by comparing the average GC values as well as their confidence bounds along the lag axis?
6) Another key element of existing GC methods applied to largescale networks is dealing with the issue of multiple comparisons: for instance, in Figures 2, 3, 4, 6, 7, and 8, it seems like all arrows corresponding to all possible links are shown, where the colormap indicates the GC value. However, when performing multiple statistical tests, many of these links can be removed by a correction such as the BenjaminiHochberg procedure. It seems that the authors did not consider any correction of multiple comparisons; we suggest doing so and adding this to your pipeline.
7) In Figure 5C, the values of W_IC for the MV cases seem to be more than 1, whereas by definition they should be less than or equal to 1. Please clarify.
8) Is there evidence that the lateralized and rostrocaudal connectivity of the motoneurons occurs at the timescale of ~750 ms? Given that this time scale is long enough for multiple synapses, it could be the case that some contralateral and nonrostrocaudal connections could be "real", as they reflect multihop synaptic connections. Please clarify.
9) Redundant signals: throughout the brain, it's expected that a population of neurons can encode the same information. It's unclear how GC (both the original and the modified versions) can handle this redundancy. Given how pervasive redundant signals are in the brain, this should be addressed in both simulation and experimental data. For example, in one of the manuscript's simulated networks, replace one neuron with 10 copies of it, each with identical inputs and outputs but with the weights scaled by 1/10. Such a network is functionally equivalent to the original but may pose some challenges for the various versions of GC. This issue may also account for the MVGC results in the hindbrain dataset. It might be more appropriate to apply GC to groups of neurons (as indeed the authors cited), instead of applying it at the singlecell level with redundant signals.
10) Both BVGC and MVGC appear to be extremely sensitive to any outlier signals. The most worrying aspect is that the authors developed their corrections and pipelines with the benefit of knowing the structure of the underlying system, whereas in the case where GC would be most useful, the user would be unable to rely on prior knowledge of the underlying structure. For instance, the motion artifact in Figure 3ac was a helpful example of a vulnerability of naive GC, but one could easily imagine scenarios involving an unmeasured disturbance (e.g. the table is bumped) causing a similar artifact, but if the experimenter is unaware of such unmeasured disturbances then they will not be included in Z, and hence can result in the detection of widespread spurious links. There is a circularity here that's concerning. It seems that one already needs to have the answer (e.g. circuit connectivity) in order to clean up the data sufficiently for BVGC or MVGC to work effectively. Perhaps the authors would be interested in incorporating ideas from the systems identification literature, which can include the estimation of unmeasured disturbances, perhaps in conjunction with L1 regularization on the GC links. This is certainly out of scope for the present work, but it would be worth acknowledging the difficulties of unmeasured disturbances and deferring a general solution to future work. Similar considerations apply to a common unmeasured neuronal input (e.g. from a brain region not included in the field of view of the imaging).
11) Interpretation – would it be correct to state that BVGC identifies plausible causal links, while MVGC identifies a plausible systemlevel model? We think these interpretations, carefully stated, might provide a helpful way of thinking about the two GC approaches. Taking the results of the paper together, neither BVGC nor MVGC is definitive – BVGC may overestimate the true number of causal links but MVGC is prone to a winnertakeall phenomenon that may represent just one of many plausible systemlevel models that can account for the observed data. This should be more clearly stated in the manuscript.
Reviewer #3 (Recommendations for the authors):
There are numerous typos in the manuscript and some odd choices in the citations in the introduction. Given that there are still some major conceptual issues to address, I'll defer these to the future.
https://doi.org/10.7554/eLife.81279.sa1Author response
Essential revisions:
1) It was not clear to the reviewers what lessons are specific to the system they are studying and which ones are to be taken as more general lessons. Certainly, dealing with slow calcium dynamics, motion artifacts, and smoothing, are general problems in calcium imaging, but the reviewers were puzzled a bit about how to decide which neurons are "strange" without a lot of systemspecific knowledge. This seems to be a rather important effect, and having a bit more guidance as to this point in the discussion would be useful.
We thank the reviewer for pointing out the lack of quantitative definition of the “strange” neurons. In this revision, we introduce a quantitative criteria to identify atypical calcium traces to exclude from the analysis, by recognizing that calcium transients of neurons decay typically with the same profile, i.e. similar decaying time constant when fitted to an exponential function.
If a calcium trace has a decay constant much longer (> tens of s) than an average trace (~ 0.15 s), we infer that it’s not the trace of a neuron but likely a glial cell.
This method can be applied generally to systems with neurons expected to have the same characteristics in terms of excitability and calcium buffer. This is the method we already used to identify the time constant of calcium decay (τ = 2.5s) in Figure 3—figure supplement 1 (Figure S7 from first submission) and Results: GC analysis of motoneuron data: slow calcium timescale. In this figure, we now include the overlay of calcium transients from typical neurons as well as from atypical traces (the “strange” neurons). We have also changed the text in Results: GC analysis of motoneuron data: Removing strange neurons as follows:
“In general, cells whose calcium transients showed time decay of tens of seconds were identified as non neuronal cells and excluded from GC analysis (see overlays of calcium transients in Figure 3—Figure Supplement 1)”.
We also included in the discussion that this method can be generally applied to any other systems relying on calcium imaging to infer neuronal activity.
2) It was not entirely sure what results one should take home from the hindbrain analysis. It is clear that there is a moreorless global signal modulating all neural activity, but this is a common occurrence in population recordings (often, one subtracts this off via PCA or another means before proceeding). Is the general lack of causal links (via the MVGC at least) a generic phenomenon in recurrent networks, or is there something more systemspecific here? Accordingly, it might be interesting to run a recurrent neural network simulation with similar properties to the hindbrain (and perhaps with correlated driving) to see what GC/MVGC would predict. Is there any hope of these methods finding information flow in recurrent networks, or should we restrict the method to networks where we expect the primary mode of information transmission to be feedforward? Either performing simulations or limiting the stated results to feedforward networks would be important.
The reviewers are right. We have now replaced the network in Figure 1 with a more realistic network with feedback. We have also added a figure in the Supplementary Information (Figure 1—figure supplement 1), that tests the performance of GC on VAR dynamics run for randomly connected neurons (N = 10). Each underlying true interaction structure is sampled based on, P_{connect}, the probability that one neuron has a synaptic connection with the other, and also the connection strength c. As shown in Figure 1— figure supplement 1, MVGC performs well in identifying the true and only the true links when the interaction strength is large.
3) The reviewers pointed out several studies that adapt the classical GC for both electrophysiology data and calcium imaging data (e.g., [1] A. Sheikhattar et al., "Extracting Neuronal Functional Network Dynamics via Adaptive Granger Causality Analysis", PNAS, Vol. 115, No. 17, E3869E3878, 2018. [2] N. A. Francis et al., "Small Networks Encode DecisionMaking in Primary Auditory Cortex", Neuron, Vol. 97, No. 4, 2018. [3] N. A. Francis et al., "Sequential Transmission of TaskRelevant Information in Cortical Neuronal Networks", Cell Reports, Vol. 39, No. 9, 110878, 2022.). In reference [1], a variation of GC based on GLM loglikelihoods is proposed that addresses the issues of nonlinearity, nonstationarity, and nonGaussianity of electrophysiology data. In [2] and [3], a variation of GC using sparse multivariate models is introduced with application to calcium imaging data. In particular, all three references use the sparse estimation of the MVAR parameters in order to mitigate overfitting and also use corrections for multiple comparisons that also reduce the number of spurious links. We suggest discussing these relevant references in the introduction (paragraphs 2 and 3) and discussion.
We thank the reviewers for pointing out these references. While our GC pipeline focuses on preprocessing the calcium signal by denoising, and postprocessing the resulting GC matrix by using an adaptive threshold, these previous studies mentioned by the reviewers emphasize more on modifying the underlying dynamics for GC in order to address the nonlinear and nongaussian dynamics in interacting neurons. In future work, it is of our interest to combine these two approaches. To bring this understanding to the readers, we have incorporated the following change of text. We have included in the introduction (paragraph 3) the following:
“New variations of GC accounting for sparsity in the connection have also been developed and applied on singlecell level to both spiking and calcium signals in the primary auditory cortex of mice (Sheikhattar et al. 2018; Francis et al., 2018, 2022).”
We also modified the Discussion section to include the above reference and added in the text, after we review the two important steps of our GC pipeline, the paragraph connecting our approach better to the body of literature:
“Our pipeline focuses on the preprocessing of the signals and the postprocessing of the GC results. This effort is complementary to previous studies where new variations of GC were introduced to take into account the nonlinear and nongaussian dynamics in interacting neurons, for example by modifying the underlying VAR dynamics with GLM or VAR with sparsity constraints, and subsequently using adaptive thresholds for significance tests (Kim et al., 2011; Sheikhattar et al., 2018; Francis et al., 2018, 2022). Future work of interest includes combining these variations of GC with our pipeline.”
4) A major issue of GC applied to calcium imaging data is that the trials are typically limited in duration, which results in overfitting of the MVAR parameters when using least squares (See references [2] and [3] above, for example). The authors mention on page 4 that they use least squares to estimate the parameters. However, for the networks of ~10 neurons considered in this work, stationary trials of a long enough duration are required to estimate the parameters correctly. We suggest that the authors discuss this point and explicitly mention the trial durations and test whether the trial durations suffice for stable estimation of the MVAR parameters (this can be done by repeating some of the results on the synthetic data and using different trial lengths and then assessing the consistency of the detected GC links).
We thank the reviewers for pointing out the importance of the length of the time series in the GC calculation and the information missing to describe our dataset lengths. We address this comment with three additional pieces of supporting evidence.
First, we perform the simulation suggested by the reviewer. With the same underlying structure, we simulate two trials with length T, measure the GC values using each trial, and compute the Pearson’s correlation coefficient between the two sets of GC values as a measure of consistency of detecting GC links. The results are included in the Figure 1—figure supplement 3, showing that the stronger the coupling is, the smaller the sample size is required to reconstruct a consistent map of information flow.
Second, to compare with the ground truth, we use synthetic data to check the error rate of link detection as a function of trial lengths, the result of which is included in Figure 1—figure supplement 2. Specifically, we simulate GLMCalcium dynamics on randomly connected networks, with a fixed probability of connection and a range of connection strength, for different trial durations, and we compute the accuracy of link detection. For MVGC, the accuracy depends on the connection strength. If the connection strength is large, then the true links are mostly detected. If the connection strength is small, then more true links are missed.
These numerical results prompt us to test how consistent the inferred GC matrix is for the zebrafish data, if we use only half of the data. If half of the existing time points are already sufficient to give consistent GC structures, then using all data (and more data) suffices for stable estimation. This is the case for the motoneuron data, which we show in the scatter plot of GC values inferred using the first halves vs. the second halves of each experimental trial. The GC values from the two halves have a Pearson’s correlation coefficient of 0.75 for MVGC, and 0.87 for BVGC. The results for motor neurons in the embryonic spinal cord are shown in an additional supplementary figure, Figure 2—figure Supplement 2.
For hindbrain data, the consistency is not as good. The Pearson’s correlation coefficient is 0.41 for MVGC and 0.76 for BVGC. We note that the similarity is higher for the motoneuron dataset due to the cyclic activity pattern leading to two similar halves.
We added in Results: GC analysis of motoneuron data: Naive GC the above results, specifically addressing the point of trial duration:
“… In order to verify that the length of the motoneuron dataset is sufficient to estimate the functional connectivity using Granger causality (see Figure 1—figure Supplement 2 and Figure 1—figure supplement 3 for illustrations using synthetic data), we compared the two networks obtained by applying GC analysis to each half of the calcium imaging time series. The Pearson's correlation coefficient between the GC values from the two halves is 0.87 for BVGC, and 0.75 for MVGC (see Figure 2—figure Supplement 2), indicating the trial length is long enough for a consistent estimation of the GC values.”
We also added a supplementary figure to discuss the overfitting issue (Figure 8—figure Supplement 3). Specifically, we compare the resulting GC links for the hindbrain data, using VAR dynamics, and VAR plus regularization. We show that in MVGC, not observing many links is not purely a problem of overfitting. And that if including regularization, one still needs to use the adaptive threshold to increase the connection.
5) The definition of the "knee" of the average GC values as a function of the lag L needs to be a bit more formalized. In Figure 2H using the synthetic data, the "knee" effect is more clear, but in the real data shown in Figure 2I, the knee is not obvious, given that the confidence intervals are quite wide. Is there a way to quantify the "knee" by comparing the average GC values as well as their confidence bounds along the lag axis?
We thank the reviewers for asking for clarification. We can compare the average GC values as suggested, or the individual pairwise GC values when adding an extra lag. Using 3 lags or more yields very similar results, so we took the decision to keep the simplest model with fewest lags. To clarify, we added in Results: GC analysis of motoneuron data: Naive GC:
“This knee corresponds to the maximum lag with the best balance between accuracy and complexity: further increasing the maximum lag yields very similar results but the GC analysis becomes much more computationally expensive (Figure 2—figure supplement 1).”
The additional Figure 2—figure supplement 1 shows the “Comparison of Granger causality analysis results at different maximum lags. Correlation between the GC values obtained for each pair of neurons using max lag L and max lag L + 1. All fish (n=9) of the motoneuron data set are used and represented by color (same as Figure 7). The similarity in Granger causality values, measured by the Pearson correlation coefficient r, increases as L increases and then stabilizes after maximum lag L = 3 (r = 0.97 both for BVGC and MVGC between L = 3 and L = 4, as well as between L = 4 and L = 5), justifying our decision to use L = 3. Using a larger maximum lag does not bring more information but brings more complexity, and increases the computation time (especially for MVGC).”
6) Another key element of existing GC methods applied to largescale networks is dealing with the issue of multiple comparisons: for instance, in Figures 2, 3, 4, 6, 7, and 8, it seems like all arrows corresponding to all possible links are shown, where the colormap indicates the GC value. However, when performing multiple statistical tests, many of these links can be removed by a correction such as the BenjaminiHochberg procedure. It seems that the authors did not consider any correction of multiple comparisons; we suggest doing so and adding this to your pipeline.
In our original analysis, we had performed multiple comparisons by applying a Bonferroni correction to the pvalue threshold. We now clarify this in the main text by adding in Results: Granger Causality, after Equation 7,
“…with the number of parameters adjusted accordingly in the reduced and the full model. The significant test is adjusted using multiple comparisons by applying a Bonferroni correction to the pvalue threshold.”
7) In Figure 5C, the values of W_IC for the MV cases seem to be more than 1, whereas by definition they should be less than or equal to 1. Please clarify.
We thank the reviewers for carefully reading our submission. In our first submission, W_{IC} can be more than 1 when the averaged GC value for contralateral links is negative (Equation 10). This negative GC value can happen when the residuals of fitting from the full and from the reduced model are similar, and after dividing by the degrees of freedom, (T_{regr –} M_{r}) and (T_{regr –} M_{f}), respectively, leads to $var(\stackrel{~}{\epsilon})<var(\epsilon )$. We did not impose that GC value has to be nonnegative in our definition.
To improve clarity, we have introduced in the revisions the nonnegativeness in the definition of GC in Equation 4 and 5 such that GC is defined as the maximum between 0 and the log of ratio of residuals. This way, the W_{IC} and W_{RC} will always be less than or equal to 1. We have changed Figure 5C accordingly. The problem of negative GC value does not occur in the results of GC applied to real motoneuron and hindbrain data.
8) Is there evidence that the lateralized and rostrocaudal connectivity of the motoneurons occurs at the timescale of ~750 ms? Given that this time scale is long enough for multiple synapses, it could be the case that some contralateral and nonrostrocaudal connections could be "real", as they reflect multihop synaptic connections. Please clarify.
In the spinal cord, motor neurons form synapses onto muscle fibers and are not synaptically connected among themselves. At the embryonic stage, motor neurons are part of an electronicallycoupled network relying on GAP junctions. During twitches, motor neurons are recruited by excitatory neurons located in the hindbrain and enable their sequential activation from head to tail. We therefore use this dataset to infer flow of information, not synaptic connectivity. This important point is now added to the result section.
Physiologically, the two chains of motor neurons are not directly connected to each other. However, all motor neurons are controlled on each side by common driver neurons, in order to be sequentially activated (St Amant and Drapeau, Neuron 2001). Overall, each chain of neurons on one side can be recruited at a frequency up to 2hz, so with a typical 250ms time difference (Warp et al., Current Biology 2012; Wan et al., Cell 2019 https://pubmed.ncbi.nlm.nih.gov/31564455/). Depending on the position of a neuron in the focal plane of the spinning disk microscope, we have good reasons to think that the kinetics of the calcium transient could appear fast (in the plane) versus slow (out of the focal plane). This artifact may lead to variations in the apparent pattern of recruitment of motor neurons in the rostral caudal axis. This could explain why in the data, we sometimes observe differences in the recruitment pattern leading to a caudal motor neuron being activated a few hundreds milliseconds before a rostral one.
This physiological structure is explained in Results: GC analysis for motoneuron data: The data, when the motoneuron dataset is introduced. To further clarify that the ground truth is not strictly W_{RC} = 1, we added in Results: GC analysis for motoneuron data: Defining directional biases, after the definition of
W_{RC}:
“Notice that based on the anatomy, the ground truth of information flow can be smaller than W_{RC} = 1, because motoneuron activity has been shown to rely on command neurons in the brainstem that activates motoneurons sequentially in the spinal cord, not that the motor neurons synapse onto each other.”
9) Redundant signals: throughout the brain, it's expected that a population of neurons can encode the same information. It's unclear how GC (both the original and the modified versions) can handle this redundancy. Given how pervasive redundant signals are in the brain, this should be addressed in both simulation and experimental data. For example, in one of the manuscript's simulated networks, replace one neuron with 10 copies of it, each with identical inputs and outputs but with the weights scaled by 1/10. Such a network is functionally equivalent to the original but may pose some challenges for the various versions of GC. This issue may also account for the MVGC results in the hindbrain dataset. It might be more appropriate to apply GC to groups of neurons (as indeed the authors cited), instead of applying it at the singlecell level with redundant signals.
We agree with the reviewer that the problem of redundant signals is subtle. Redundant signals are not equivalent to repeated signals. A repeated signal is an exact copy of another signal, which experimentally could come from a mispartition of a single neuron. In this case, GC is unable to detect the correct links, especially the connection between the repeated neuron and the others. In contrast, redundant encoding is more in the structure of the network. Because of stochasticity, the dynamics of two neurons – with identical input and output weights to other neurons – can be different, as we test it on the network in Figure 1 with VAR and with GLMCalcium dynamics. Such a difference in the dynamics allows MVGC to detect the true structure despite redundant encoding. We have added a paragraph in Results: GC analysis of a small synthetic neuronal net, and an additional supplementary figure (Figure 1 —figure supplement 7) to clarify this issue. The added text is:
“Redundant signals. In the brain, a population of neurons can encode redundant signals. If the redundancy is in the structure of the network, for example, when two neurons share the identical input and output, the stochasticity will lead to different neuronal dynamics, and MVGC is able to identify the true connection. However, if two neurons have exactly the same activity, MVGC will underestimate the causality, while BVGC can still reveal these connections (Figure 1— figure Supplement 7).”
The additional Figures 1 —figure supplement 7 shows:
Presence of redundant signals harms the performance of MVGC. GC analysis performed on synthetic data generated using VAR dynamics on the network structure in the main figure, for T = 5000 data points. GC reveals all links for networks with redundant structures, where an “artificial” neuron 11 is created with identical input and output strength as neuron 1 (redundant structure, middle column). However, MVGC fails to identify causal links when the signal from neuron 1 is copied to create another “artificial” neuron 11 (redundant signal, right most column). BVGC is able to identify the underlying true connectivity.
For the hindbrain data in the last figure, we previously tried grouping the neurons. However, we find that the most correlated neurons are spatially located far from each other, as one would expect given that pairs of pre and postsynaptic neurons may be connected through a very long axon. This makes grouping difficult: grouping neurons by correlation will disable us finding the spatial information flow, while grouping neurons by spatial proximity washes away meaningful signals.
10) Both BVGC and MVGC appear to be extremely sensitive to any outlier signals. The most worrying aspect is that the authors developed their corrections and pipelines with the benefit of knowing the structure of the underlying system, whereas in the case where GC would be most useful, the user would be unable to rely on prior knowledge of the underlying structure. For instance, the motion artifact in Figure 3ac was a helpful example of a vulnerability of naive GC, but one could easily imagine scenarios involving an unmeasured disturbance (e.g. the table is bumped) causing a similar artifact, but if the experimenter is unaware of such unmeasured disturbances then they will not be included in Z, and hence can result in the detection of widespread spurious links. There is a circularity here that's concerning. It seems that one already needs to have the answer (e.g. circuit connectivity) in order to clean up the data sufficiently for BVGC or MVGC to work effectively. Perhaps the authors would be interested in incorporating ideas from the systems identification literature, which can include the estimation of unmeasured disturbances, perhaps in conjunction with L1 regularization on the GC links. This is certainly out of scope for the present work, but it would be worth acknowledging the difficulties of unmeasured disturbances and deferring a general solution to future work. Similar considerations apply to a common unmeasured neuronal input (e.g. from a brain region not included in the field of view of the imaging).
We corrected for unmeasured inputs by preprocessing the data using characteristic features for calcium transients, and not assuming any underlying network structures. As we expect, all calcium transients follow a fast rise and a slow decay that can be fitted by exponentials, disturbances such as the table being bumped can be identified and corrected. We also emphasize that knowing the underlying structure of the motoneuron network only prompts us to clean up the correlated noise, and that this method developed is applicable to general datasets without a priori known structures, such as the hindbrain dataset.
We thank the reviewers for pointing out literature in system identification theory. To provide more detailed context, we added in the Discussion section sentences on recent development of GC and other causality measures that considers hidden input, and discussed their advantages and disadvantages:
“Firstly, we emphasize that with current experimental techniques, only a small portion of the entire nervous system is recorded. While newer versions of GC have been adapted to take into consideration latent variables which contribute equally to all neurons [49], or through subsequent removal of causal links [50], GC does not in general allow identification of the true underlying connection in the presence of unobserved neurons. However, it is able to identify information flow and an effective functional causal network for the observed neurons.”
11) Interpretation – would it be correct to state that BVGC identifies plausible causal links, while MVGC identifies a plausible systemlevel model? We think these interpretations, carefully stated, might provide a helpful way of thinking about the two GC approaches. Taking the results of the paper together, neither BVGC nor MVGC is definitive – BVGC may overestimate the true number of causal links but MVGC is prone to a winnertakeall phenomenon that may represent just one of many plausible systemlevel models that can account for the observed data. This should be more clearly stated in the manuscript.
We agree with the referees, and have added the following sentences to Discussion:
“We also compared results from bivariate GC and multivariate GC analysis. While BVGC overestimate the true number of causal links, MVGC is prone to a winnertakeall phenomenon that may represent just one of many plausible systemlevel models that can account for the observed data. This is especially problematic when the signals are strongly correlated, for example, in the presence of redundant signals. In addition, MVGC is more prone to the issue of overfitting: we suggest using BVGC on datasets with large numbers (hundreds) of neurons, and MVGC on datasets with few tens of neurons. In general, starting with a small number of neurons of interest seems wise.”
https://doi.org/10.7554/eLife.81279.sa2Article and author information
Author details
Funding
European Research Council (COG 724208)
 Aleksandra M Walczak
European Research Council (COG 101002870)
 Claire Wyart
New York Stem Cell Foundation (NYSCFRNI39)
 Claire Wyart
Human Frontier Science Program (RGP0063/2018)
 Claire Wyart
Fondation pour la Recherche Médicale (FRM EQU202003010612)
 Claire Wyart
Fondation BettencourtSchueller (FBSdon0031)
 Claire Wyart
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank Urs Böhm and Kristen Severi for providing feedback on the hindbrain data, Jenna Sternberg for the data on motoneurons, and Fabrizio De Vico Fallani, Mario Chavez (Paris Brain Institute) and Moritz GrosseWentrup (University of Wien) for insightful feedback. This work was supported by the European Research Council Consolidator Grants n. 724208 (AMW) and n. 101002870 (CW), the New York Stem Cell Foundation (NYSCF) Robertson Award 2016 #NYSCFRNI39 (CW), the Human Frontier Science Program (HFSP) Research Grant #RGP0063/2018 (CW), the Fondation pour la Recherche Médicale Team grant (FRM EQU202003010612) and the Fondation BettencourtSchueller #FBSdon0031 (CW).
Senior Editor
 Timothy E Behrens, University of Oxford, United Kingdom
Reviewing Editor
 Gordon J Berman, Emory University, United States
Version history
 Received: June 22, 2022
 Preprint posted: June 29, 2022 (view preprint)
 Accepted: February 6, 2023
 Accepted Manuscript published: February 7, 2023 (version 1)
 Version of Record published: March 15, 2023 (version 2)
Copyright
© 2023, Chen, Ginoux et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 1,235
 Page views

 178
 Downloads

 1
 Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading

 Neuroscience
Ultrasonic vocalizations (USVs) fulfill an important role in communication and navigation in many species. Because of their social and affective significance, rodent USVs are increasingly used as a behavioral measure in neurodevelopmental and neurolinguistic research. Reliably attributing USVs to their emitter during close interactions has emerged as a difficult, key challenge. If addressed, all subsequent analyses gain substantial confidence. We present a hybrid ultrasonic tracking system, Hybrid Vocalization Localizer (HyVL), that synergistically integrates a highresolution acoustic camera with highquality ultrasonic microphones. HyVL is the first to achieve millimeter precision (~3.4–4.8 mm, 91% assigned) in localizing USVs, ~3× better than other systems, approaching the physical limits (mouse snout ~10 mm). We analyze mouse courtship interactions and demonstrate that males and females vocalize in starkly different relative spatial positions, and that the fraction of female vocalizations has likely been overestimated previously due to imprecise localization. Further, we find that when two male mice interact with one female, one of the males takes a dominant role in the interaction both in terms of the vocalization rate and the location relative to the female. HyVL substantially improves the precision with which social communication between rodents can be studied. It is also affordable, opensource, easy to set up, can be integrated with existing setups, and reduces the required number of experiments and animals.

 Neuroscience
How does the human brain combine information across the eyes? It has been known for many years that cortical normalization mechanisms implement ‘ocularity invariance’: equalizing neural responses to spatial patterns presented either monocularly or binocularly. Here, we used a novel combination of electrophysiology, psychophysics, pupillometry, and computational modeling to ask whether this invariance also holds for flickering luminance stimuli with no spatial contrast. We find dramatic violations of ocularity invariance for these stimuli, both in the cortex and also in the subcortical pathways that govern pupil diameter. Specifically, we find substantial binocular facilitation in both pathways with the effect being strongest in the cortex. Nearlinear binocular additivity (instead of ocularity invariance) was also found using a perceptual luminance matching task. Ocularity invariance is, therefore, not a ubiquitous feature of visual processing, and the brain appears to repurpose a generic normalization algorithm for different visual functions by adjusting the amount of interocular suppression.