Efficient coding explains neural response homeostasis and stimulus-specific adaptation

  1. Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.


Editors

  • Reviewing Editor
    Tatyana Sharpee
    Salk Institute for Biological Studies, La Jolla, United States of America
  • Senior Editor
    Joshua Gold
    University of Pennsylvania, Philadelphia, United States of America

Reviewer #1 (Public review):

This work derives a general theory of optimal gain modulation in neural populations. It demonstrates that population homeostasis is a consequence of optimal modulation for information maximization with noisy neurons. The developed theory is then applied to the distributed distributional code (DDC) model of the primary visual cortex to demonstrate that homeostatic DDCs can account for stimulus-specific adaptation.

What I consider to be the most important contribution of this work is the unification of efficient information transmission in neural populations with population homeostasis. The former is an established theoretical framework, and the latter is a well-known empirical phenomenon - the relationship between them has never been fully clarified. I consider this work to be an interesting and relevant step in that direction.

The theory proposed in the paper is rigorous and the analysis is thorough. The manuscript begins with a general mathematical setting to identify normative solutions to the problem of information maximization. It then gradually builds towards questions about approximate solutions, neural implementation and plausibility of these solutions, applications of the theory to specific models of neural computation (DDC), and finally comparisons to experimental data in V1. Such a connection of different levels of abstraction is an obvious strength of this work.

Overall, I find this contribution interesting and assess it positively. At the same time, I have three major points of criticism, which I believe the authors should address. I list them below, followed by more specific comments.

Major comments:

(1) Interpretation of key results and relationship between different parts of the manuscript. The manuscript begins with an information-transmission ansatz, which is described as "independent of the computational goal" (e.g. p. 17). While information theory indeed is not concerned with what quantity is being encoded (e.g. whether it is the sensory periphery or the hippocampus), the goal of the studied system is to *transmit* the largest number of bits about the input in the presence of noise. In my view, this does not make the proposed framework "independent of the computational goal". Furthermore, the derived theory is then applied to a DDC model, which proposes a very specific solution to inference problems. The relationship between information transmission and inference is deep and nuanced. Because the writing is very dense, it is quite hard to understand how the information transmission framework developed in the first part applies to the inference problem. How does the neural coding diagram in Figure 3 map onto the inference diagram in Figure 10? How does the problem of information transmission under constraints from the first part of the manuscript become an inference problem with DDCs? I am certain that the authors have good answers to these questions - but they should be explained much better.

(2) Clarity of writing for an interdisciplinary audience. I do not believe that in its current form, the manuscript is accessible to a broader, interdisciplinary audience such as eLife readers. The writing is very dense and technical, which I believe unnecessarily obscures the key results of this study.

(3) Positioning within the context of the field and relationship to prior work. While the proposed theory is interesting and timely, the manuscript omits multiple closely related results which, in my view, should be discussed in relation to the current work. In particular:

A number of recent studies propose normative criteria for gain modulation in populations:

- Duong, L., Simoncelli, E., Chklovskii, D. and Lipshutz, D., 2024. Adaptive whitening with fast gain modulation and slow synaptic plasticity. Advances in Neural Information Processing Systems
- Tring, E., Dipoppa, M. and Ringach, D.L., 2023. A power law describes the magnitude of adaptation in neural populations of primary visual cortex. Nature Communications, 14(1), p.8366.
- Młynarski, W. and Tkačik, G., 2022. Efficient coding theory of dynamic attentional modulation. PLoS Biology
- Haimerl, C., Ruff, D.A., Cohen, M.R., Savin, C. and Simoncelli, E.P., 2023. Targeted V1 co-modulation supports task-adaptive sensory decisions. Nature Communications
The Ganguli and Simoncelli framework has been extended to a multivariate case and analyzed for a generalized class of error measures:

- Yerxa, T.E., Kee, E., DeWeese, M.R. and Cooper, E.A., 2020. Efficient sensory coding of multidimensional stimuli. PLoS Computational Biology
- Wang, Z., Stocker, A.A. and Lee, D.D., 2016. Efficient neural codes that minimize Lp reconstruction error. Neural Computation, 28(12)

More detailed comments and feedback:

(1) I believe that this work offers the possibility to address an important question about novelty responses in the cortex (e.g. Homann et al., 2021, PNAS). Are they encoding novelty per se, or are they inefficient responses of a not-yet-adapted population? Perhaps it is worth speculating about this.

(2) Clustering in populations - typically, in efficient coding studies, tuning curve distributions are a consequence of input statistics, constraints, and optimality criteria. Here the authors introduce randomly perturbed curves for each cluster - how should that be interpreted in light of efficient coding theory? This links to a more general aspect of this work: it does not specify how to find optimal tuning curves, only how to modulate them (already addressed in the discussion).

(3) Figure 8 - where do Hz come from as physical units? As I understand it, there are no physical units in the simulations.

(4) Inference with DDCs in changing environments. To perform efficient inference in a dynamically changing environment (as considered here), an ideal observer needs some form of posterior-prior updating. Where does that enter here?

(5) Page 6 - "We did this in such a way that, for all ν, the correlation matrices, ρ(ν), were derived from covariance matrices with a 1/n power-law eigenspectrum (i.e., the ranked eigenvalues of the covariance matrix fall off inversely with their rank), in line with the findings of Stringer et al. (2019) in the primary visual cortex." This is a very specific assumption, taken from a study of a specific brain region - how does it relate to the generality of the approach?
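For concreteness, the construction quoted above (correlation matrices derived from covariances with a 1/n power-law eigenspectrum) can be sketched as follows. This is a minimal illustration of one standard recipe, not necessarily the authors' exact procedure: draw a random orthonormal eigenbasis, impose eigenvalues falling off inversely with rank, then normalize the covariance into a correlation matrix (the normalization perturbs the spectrum slightly, so the correlation matrix itself only approximately inherits the power law).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # population size (illustrative choice)

# Random orthonormal eigenbasis via QR decomposition of a Gaussian matrix
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))

# Covariance with a 1/n power-law eigenspectrum:
# the k-th ranked eigenvalue is proportional to 1/k
eigvals = 1.0 / np.arange(1, n + 1)
cov = Q @ np.diag(eigvals) @ Q.T

# Normalize by the marginal standard deviations to obtain
# the correlation matrix rho
d = np.sqrt(np.diag(cov))
rho = cov / np.outer(d, d)
```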

Reviewer #2 (Public review):

Summary:

Using the theory of efficient coding, the authors study how neural gains may be adjusted to optimize coding by noisy neural populations while minimizing metabolic costs. The manuscript first presents mathematical results for the general case where the computational goals of the neural population are not specified (the computation is implicit in the assumed tuning curves) and then develops the theory for a specific probabilistic coding scheme. The general theory provides an explanation for firing rate homeostasis at the level of neural clusters with firing rate heterogeneity within clusters, and the specific application further captures stimulus-specific and neuron-specific adaptation in the visual cortex.

The mathematical derivations, simulations, and application to visual cortex data are solid as far as I can tell.

In the current format, the significance is difficult to assess fully: the manuscript is a bit sprawling. In the first half, the general theory is lengthy and technical; in the second half, a few phenomena are addressed without a clear relation between them (rate homeostasis, rate heterogeneity, synaptic homeostasis, V1 adaptation, divisive normalization), requiring several ad hoc choices and assumptions.

Strengths:

The problem of efficient coding is a long-standing and important one. This manuscript contributes to that field by proposing a theory of efficient coding through gain adjustments, independent of the computational goals of the system. The main result is a normative explanation for firing rate homeostasis at the level of neural clusters (groups of neurons that perform a similar computation) with firing rate heterogeneity within each cluster. Both phenomena are widely observed, and reconciling them under one theory is important.

The mathematical derivations are thorough as far as I can tell. Although the model of neural activity is artificial, the authors make sure to include many aspects of cortical physiology, while also keeping the models quite general.

Section 2.5 derives the conditions under which homeostasis would be near-optimal in the cortex, which appear to be consistent with many empirical observations in V1. This indicates that homeostasis in V1 might indeed be close to the optimal solution for coding efficiently in the face of noise.

The application to the data of Benucci et al. (2013) is the first to offer a normative explanation of stimulus-specific and neuron-specific adaptation in V1.

Weaknesses:

The novelty and significance of the work are not presented clearly. The relation to other theoretical work, particularly Ganguli and Simoncelli and other efficient coding theories, is explained in the Discussion but perhaps would be better placed in the Introduction, to motivate some of the many choices of the mathematical models used here.

The manuscript is very hard to read as is; it almost feels like this could be two different papers. The first half seems like a standalone document, detailing the general theory with interesting results on homeostasis and optimal coding. The second half, from Section 2.7 on, presents a series of specific applications that appear somewhat disconnected, are neither clearly motivated nor pursued in depth, and require ad hoc assumptions.

For instance, it is unclear whether the main significant finding is the role of homeostasis in the general theory or the demonstration that homeostatic DDCs with Bayes Ratio coding capture V1 adaptation phenomena. It would be helpful to clarify whether this is being proposed as a new or better computational model of V1 compared to other existing models.

Early on in the manuscript (Section 2.1), the theory is presented as general in terms of the stimulus dimensionality and brain area, but then it is only demonstrated for orientation coding in V1.

The manuscript relies on a specific response noise model with arbitrary tuning curves. Using a population model with arbitrary tuning curves and noise covariance matrix as the basis for a study of coding optimality is problematic, because not all combinations of tuning curves and covariances are achievable by neural circuits (e.g., https://pubmed.ncbi.nlm.nih.gov/27145916/).

The paper Benucci et al. (2013) shows that homeostasis holds for some stimulus distributions but not others, i.e., when the 'adapter' is present too often. This manuscript, like the Benucci paper, discards those datasets. But from a theoretical standpoint, it seems important to consider why that would be the case, and whether it can be predicted by the theory proposed here.

  1. Howard Hughes Medical Institute
  2. Wellcome Trust
  3. Max-Planck-Gesellschaft
  4. Knut and Alice Wallenberg Foundation