Spectrally and temporally resolved estimation of neural signal diversity

  1. Department of Computing, Imperial College London, UK
  2. Department of Psychology, University of Cambridge, UK
  3. Department of Informatics, University of Sussex, UK
  4. Centre for Psychedelic Research, Department of Brain Sciences, Imperial College London, UK
  5. Centre for Complexity Science, Imperial College London, UK
  6. Centre for Eudaimonia and Human Flourishing, University of Oxford, UK
  7. Division of Anaesthesia, School of Clinical Medicine, University of Cambridge, UK
  8. Montreal Neurological Institute, McGill University, Canada
  9. Department of Psychology, Queen Mary University of London, UK
  10. Sussex Centre for Consciousness Science, Department of Informatics, University of Sussex, UK
  11. CIFAR Program on Brain, Mind, and Consciousness, Canada
  12. Psychedelics Division, Neuroscape, University of California San Francisco, San Francisco, USA

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.


Editors

  • Reviewing Editor
    Valentin Wyart
    Inserm, Paris, France
  • Senior Editor
    Andre Marquand
    Radboud University Nijmegen, Nijmegen, Netherlands

Reviewer #1 (Public Review):

In this paper, the authors attempt to overcome the "fundamental limitations" of Lempel-Ziv complexity by developing and testing a complexity estimator based on state-space modelling (CSER), which they argue allows higher temporal resolution and direct spectral decomposition of entropy. They test the performance of this approach using MEG, EEG, and ECoG data from monkeys and humans. Although in principle these developments might be useful for those already using LZ complexity in their analyses, they ignore much of the non-LZ entropy community, which has already developed related solutions to the same issues. It is thus currently not clear whether this approach is necessary or unique per se:

• As the authors intimate, LZ is a relatively crude but efficient estimator; it leverages a simple binarization of time points above and below the time-series mean to look at patterns (in turn disregarding the magnitude of the signal itself). The unique benefit of LZ in and of itself is not at all clear to this reviewer. It is nearly guaranteed that LZ will be extremely highly correlated with various other common measures of "discrete" entropy, especially permutation entropy, which ranks all time-series points prior to computing motifs/patterns rather than anchoring anything to the mean (as LZ does), but nevertheless ignores the value range of the signal (the two discretization schemes are sketched below). The general appeal of the authors' intended developments to further improve LZ specifically would be dramatically boosted should they be able to make a case that LZ is somehow special to begin with.
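
A minimal Python sketch of the two discretization schemes this comment contrasts: LZ's mean-split binarization (magnitude discarded) versus permutation entropy's ordinal patterns (ranks only). Function names, the LZ76-style parsing, and parameters are illustrative, not taken from the paper under review.

```python
import numpy as np

def lz76_complexity(binary_seq):
    """Count phrases in a Lempel-Ziv (1976)-style parsing of a binary sequence."""
    s = ''.join(map(str, binary_seq))
    i, n_phrases, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Extend the current phrase while it already occurs earlier in s.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        n_phrases += 1
        i += length
    return n_phrases

def permutation_entropy(x, order=3):
    """Shannon entropy (bits) of ordinal (rank) patterns of consecutive samples."""
    motifs = [tuple(np.argsort(x[i:i + order])) for i in range(len(x) - order + 1)]
    _, counts = np.unique(motifs, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
binary = (x > x.mean()).astype(int)   # LZ's mean-split discards amplitude info
print(lz76_complexity(binary), permutation_entropy(x))
```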

• Beyond this, we can now turn to the authors' rationale for the proposed LZ developments. Despite the authors' statement in the abstract that LZ complexity is "the most widely used approach [to the] complexity of neural dynamics," to my knowledge, sample entropy (and its multiscale variant, MSE) is much more commonly used in cognitive neuroscience. Such measures of entropy already enjoy several benefits over LZ. First, the continuous magnitude of the signal is relevant in sample entropy (i.e., it is not discrete in the same way as LZ, because the values of each data point matter prior to the estimation of patterns; see the sketch below). This is important for people in that community because electrophysiologists/neuroimagers often assume the values of the signal to matter (e.g., for ERPs, the magnitude of power, etc.). Ignoring the magnitude of signal values altogether, as in LZ, is a somewhat dramatic choice, especially if the authors then argue that the spectral decomposition of entropy itself is valuable (after signal value ranges have been ignored!). In any case, as far as I know, LZ has never been shown to be more sensitive than, e.g., sample entropy/MSE in relation to any outcome variable, but perhaps the authors can provide evidence for this and argue what LZ practically does that is unique. Second, MSE more easily allows (although not without its challenges) direct comparison of spectral power and single/multiscale entropy straight away, which has already been done in some depth without the need for a state-space model of any kind (e.g., Kosciessa et al., 2020, PLOS CB). Instead of using a standard spectral power approach and comparing it to entropy, the authors propose to spectrally decompose the CSER entropy time series directly. Why? What should this do over standard multi-scale entropy approaches (like MSE, which estimates "fast" and "slow" complexity dynamics) that do not require a Fourier transform? And if they already believe that the spectrum cannot capture entropy (hence rationalizing the use of LZ-type measures in general), why invoke spectral estimation assumptions in the estimation of entropy when they could simply compare the standard spectrum to entropy to begin with, without any complex modelling in between? I just don't see the need for a lot of what is proposed here; the authors provide solutions to problems that (at least for several in this community) may not exist at all.
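
A minimal sample-entropy sketch illustrating the point that signal magnitude enters the estimate, via a tolerance r expressed in units of the signal's standard deviation, unlike LZ's mean-split binarization. The parameter values are the conventional defaults (m = 2, r = 0.2) and the implementation is illustrative.

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """SampEn: -log of the ratio of (m+1)- to m-length template matches."""
    x = np.asarray(x, dtype=float)
    tol = r * x.std()  # tolerance scales with signal amplitude

    def count_matches(length):
        # Embed the series into overlapping templates of the given length.
        templates = np.array([x[i:i + length] for i in range(len(x) - length)])
        # Chebyshev distance between all template pairs, excluding self-matches.
        d = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=-1)
        return (np.sum(d <= tol) - len(templates)) / 2

    return -np.log(count_matches(m + 1) / count_matches(m))

rng = np.random.default_rng(1)
print(sample_entropy(rng.standard_normal(500)))         # irregular: high SampEn
print(sample_entropy(np.sin(np.linspace(0, 20, 500))))  # regular: low SampEn
```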

• Figure 2: the authors show results descriptively comparing LZ and CSER, but without comparing the two measures directly. The patterns overall look extremely similar; why not correlate the values from the two measures in each dataset (see the sketch below) to make a case for what CSER is adding here? By eye, it appears they will be extremely highly correlated, which leaves the reader wondering what CSER (with all of its model complexities and assumptions) has added.
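
The suggested comparison is straightforward to run; a sketch with placeholder per-subject values (all numbers made up for illustration):

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(6)
lz = rng.uniform(0.3, 0.7, 20)              # per-subject LZ values (placeholder)
cser = 2.0 * lz + rng.normal(0, 0.05, 20)   # per-subject CSER values (placeholder)
r, p = pearsonr(lz, cser)                   # one correlation per dataset
print(f"r = {r:.2f}, p = {p:.3g}")
```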

• On the logic of and evidence for the use of CSER: the use of a state-space model to allow estimation of "prediction errors" appears to be akin to a latent autocorrelation model with a lag/step size of one time point, trained only on prestimulus baseline data. When a successive time point is "deviant" with respect to that autocorrelative function, the authors argue that this provides a measure of instantaneous entropy. This seems simple at first glance, but it is very difficult for this reviewer to wrap their head around. The approach anchors stimulus-related entropy estimation to prestimulus entropy for every subject, disallowing the direct comparison of values across subjects during the stimulus phase itself. It does not directly provide a measure of instantaneous task-related entropy, but a mixture of pre- and post-stimulus sources based on a state-space model. Does it need to be this complicated? Why does a simple window-based function (see the sketch below) not suffice to generate temporal dynamics of entropy without coupling the task-based signal to the prestimulus period? Many such approaches already exist in the field.
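
A sketch of the simple window-based alternative alluded to here: estimate a local entropy in short sliding windows of the task period itself, with no state-space model and no coupling to the prestimulus baseline. The window length and the Gaussian entropy estimator are illustrative choices.

```python
import numpy as np

def sliding_window_entropy(x, win=50, step=10):
    """Differential entropy (nats) of a Gaussian fit in each sliding window."""
    starts = range(0, len(x) - win + 1, step)
    return np.array([0.5 * np.log(2 * np.pi * np.e * x[s:s + win].var())
                     for s in starts])

rng = np.random.default_rng(4)
trial = np.concatenate([rng.standard_normal(300),         # mock "baseline"
                        2.0 * rng.standard_normal(300)])  # mock "post-stimulus"
print(sliding_window_entropy(trial)[:5])
```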

• Figure 3: the authors show that gamma-band CSER is the most sensitive. Isn't it true that this is the exact inverse of the dominance of typical spectral effects under such conditions (across the literature on psychedelics, sleep, and anaesthesia, there are dominant shifts in low-frequency spectral power)? Although low-frequency power is expected to be a dominant determinant of entropy in the entire signal (see Kosciessa et al., 2020, PLOS CB), something else appears to be happening here. At face value, because gamma is the spectral band with the lowest power in every imaging modality we know of, there is inherently less repeatability/autocorrelation in that signal, which should necessarily produce more "prediction error"/instantaneous entropy in any condition. When the authors then take the "mean difference" of gamma-based entropy values between the two conditions in each sample, any condition-based shift in entropy should inherently be easier to detect. In any case, why not simply show these CSER spectral results next to a standard spectrum over the same conditions and directly compare the unique utility of, e.g., gamma power to CSER gamma? And if you compute something like the percent change between conditions for each spectral band (see the sketch below), do you maintain gamma dominance?
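
A sketch of the suggested normalisation: per-band percent change between conditions, plus a per-Hz variant. The band definitions and CSER values are made up purely for illustration.

```python
bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 12),
         "beta": (12, 25), "gamma": (25, 100)}
cser_a = {"delta": 0.20, "theta": 0.15, "alpha": 0.12,
          "beta": 0.30, "gamma": 0.90}   # condition A (made-up numbers)
cser_b = {"delta": 0.18, "theta": 0.14, "alpha": 0.11,
          "beta": 0.26, "gamma": 0.70}   # condition B (made-up numbers)

for name, (lo, hi) in bands.items():
    pct = 100 * (cser_a[name] - cser_b[name]) / cser_b[name]   # percent change
    per_hz = (cser_a[name] - cser_b[name]) / (hi - lo)         # change per Hz
    print(f"{name}: {pct:+.1f}% change, {per_hz:+.4f} per Hz")
```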

Reviewer #2 (Public Review):

This paper presents a novel measure of complexity that can be applied to recorded neurophysiological time series. The paper first introduces an existing measure, Lempel-Ziv complexity, reviewing its computation, application, and potential issues. The authors then present their new metric, CSER. They show that CSER values change similarly to LZ under psychedelics, sleep, and general anaesthesia. A key advantage of CSER is that it can be decomposed in both time and frequency, and they give example applications of each. They show that the differences in CSER in the previous examples are mostly located in the gamma band. As a temporal example, they consider monkey ECoG in an oddball task and show CSER changes between standard and deviant stimuli.

Major comments
Most of the technical details are rightly in the methods, but it would be nice, as a reader, for the main text to give a more concrete idea of the type of state-space model used, the assumptions underlying it, and the typical model orders used, perhaps with a schematic diagram (a minimal schematic of this model class is sketched below). I appreciate they have written the paper to appeal to a broad general audience, but this seems like an important part of the method that anyone using it should understand in more detail.
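
A minimal schematic of the model class in question, assuming a generic linear-Gaussian state-space model (our illustration, not necessarily the authors' exact specification): a latent AR(1) process observed with noise, whose entropy rate follows from the variance of the Kalman-filter innovations.

```python
import numpy as np

A, C, Q, R = 0.9, 1.0, 0.1, 0.05   # latent dynamics, observation map, noise variances

# Generative form:  z_t = A z_{t-1} + w_t,   x_t = C z_t + v_t
rng = np.random.default_rng(3)
n, z = 5000, 0.0
x = np.empty(n)
for t in range(n):
    z = A * z + rng.normal(scale=np.sqrt(Q))
    x[t] = C * z + rng.normal(scale=np.sqrt(R))

# Kalman filter: collect one-step prediction errors ("innovations").
zhat, P, innovations = 0.0, Q, []
for xt in x:
    S = C * P * C + R                  # innovation variance
    K = P * C / S                      # Kalman gain
    innovations.append(xt - C * zhat)  # one-step prediction error
    zhat = A * (zhat + K * (xt - C * zhat))
    P = A * (P - K * C * P) * A + Q

# Gaussian entropy rate (nats/sample) from the innovation variance.
print(0.5 * np.log(2 * np.pi * np.e * np.var(innovations)))
```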

It might be nice to cover some other measures of signal variability, e.g. as reviewed in Waschke et al., Neuron 2021, and how CSER fits into the broader taxonomy of measures of neural variability (even if restricted to information-theoretic ones, e.g. multiscale entropy and permutation entropy, which have also been linked to prediction in the brain; Waschke et al., eLife 2019).

While the examples are clear and well-motivated, the novel parts could be more developed in terms of interpretation, or linked to existing measures. For example, the frequency results show that the complexity changes in "gamma," which is defined as >25 Hz. From a biological point of view, it would be nice to understand this better, perhaps splitting low gamma (including 40 Hz oscillations) from high gamma (i.e., MUA). How is the frequency measure affected by the width of the frequency band considered? I understand the sum of the shown terms equals the broadband result, but, e.g., in Figure 3, if the values were normalised by the bandwidth of each band, gamma might not stand out so much (it is by far the widest band: 75 Hz vs 3 Hz for delta). So if gamma is not contributing more per unit of frequency, the interpretation might be different. What is it about the gamma-band activity that is changing between the conditions: the autocorrelation of power, more variability in phase progression? What would this measure give for simulated systems with known changes (for example, changes in oscillatory power, or changes in 1/f slope; see the simulation sketch below)? What sort of system would give the profiles in Figure 3?
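
A sketch of the suggested simulation probe: signals with a known 1/f-slope change, fed to a band-resolved Gaussian entropy-rate decomposition (integrated log-PSD per band, up to additive constants) as a stand-in for CSER's spectral decomposition. This is our illustration, not the authors' implementation.

```python
import numpy as np
from scipy.signal import welch
from scipy.integrate import trapezoid

def one_over_f_noise(n, slope, fs=250, rng=None):
    """White noise shaped in the frequency domain to a 1/f^slope spectrum."""
    rng = rng or np.random.default_rng()
    spec = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n, 1 / fs)
    spec[1:] /= f[1:] ** (slope / 2)   # amplitude ~ f^(-slope/2)
    return np.fft.irfft(spec, n)

def band_entropy_rate(x, fs=250,
                      bands=((1, 4), (4, 8), (8, 12), (12, 25), (25, 100))):
    """Per-band contribution to the Gaussian entropy rate (integrated log-PSD)."""
    f, psd = welch(x, fs=fs, nperseg=1024)
    return [trapezoid(0.5 * np.log(psd[(f >= lo) & (f < hi)]),
                      f[(f >= lo) & (f < hi)]) for lo, hi in bands]

rng = np.random.default_rng(5)
for slope in (0.5, 1.5):   # shallow vs steep 1/f slope
    print(slope, np.round(band_entropy_rate(one_over_f_noise(50_000, slope, rng=rng)), 1))
```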

For the temporal example, the result is a nice proof of concept. It looks quite reminiscent of the "novel mutual information" time course (e.g., compare the absolute value of the CSER difference to Figure 13 of Ince et al., HBM 2017, which also showed two peaks of novel information at the times where the gradient of the ERP starts to change, 20-30 ms prior to the ERP peak, but in a task with no predictive component). It might be nice to explicitly compare the statistical power to this existing method (conditional mutual information between signal+gradient and experimental condition, conditioning out the selection of previous time points with peak conditional MI; see the sketch below). Deviant stimuli initially seem to decrease entropy; by eye, it is surprising this isn't significant (it stands out a lot from baseline). Was a two-sided or one-sided (matching the prior hypothesis) test performed here? Could it be that the change in entropy rate is a property of any ERP signal (i.e., the change in CSER appears to reflect the subsequent difference in peak ERP: for the first negative peak, the deviant amplitude is lower; for the second positive peak, the deviant amplitude is higher), inviting a lower-level signal interpretation (i.e., the amplitude of the CSER difference is related to the difference in ERP amplitude, rather than directly reflecting neural mechanisms of prediction)?
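
A sketch of the suggested baseline method, in the spirit of Ince et al. (2017): Gaussian-copula mutual information between the (signal, temporal gradient) pair and the binary condition label. This is an illustrative re-implementation under Gaussian-copula assumptions, not the exact published estimator, and all simulated values are placeholders.

```python
import numpy as np
from scipy.stats import norm, rankdata

def copnorm(x):
    """Rank-transform each column to a standard normal (copula normalization)."""
    return norm.ppf(rankdata(x, axis=0) / (x.shape[0] + 1))

def mi_model_discrete(x, y):
    """Gaussian MI (bits) between continuous features x (n, d) and labels y."""
    x = copnorm(x)
    # Entropy up to additive constants, which cancel in the MI difference.
    h = lambda c: 0.5 * np.log(np.linalg.det(np.cov(c, rowvar=False)))
    h_cond = sum((y == c).mean() * h(x[y == c]) for c in np.unique(y))
    return (h(x) - h_cond) / np.log(2)

# Hypothetical usage: signal and temporal gradient at one latency across
# trials, with y indicating standard (0) vs deviant (1) stimuli.
rng = np.random.default_rng(2)
n_trials = 400
y = rng.integers(0, 2, n_trials)
t = np.linspace(0, 0.3, 60)                                # mock ERP time axis
erp = np.sin(2 * np.pi * 5 * t) * (1 + 0.4 * y[:, None])   # deviants larger
trials = erp + 0.5 * rng.standard_normal((n_trials, 60))
sig = trials[:, 30]                                        # signal at one latency
grad = np.gradient(trials, axis=1)[:, 30]                  # temporal gradient there
print(mi_model_discrete(np.column_stack([sig, grad]), y))
```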
