Endogenous Precision of the Number Sense

  1. Department of Economics, Columbia University, New York, USA

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.


Editors

  • Reviewing Editor
    Hang Zhang
    Peking University, Beijing, China
  • Senior Editor
    Michael Frank
    Brown University, Providence, United States of America

Reviewer #1 (Public review):

Summary:

The "number sense" refers to an imprecise and noisy representation of number. Many researchers propose that the number sense confers a fixed (exogenous) subjective representation of number that adheres to scalar variability, whereby the variance of the representation of number is linear in the number.

This manuscript investigates whether the representation of number is fixed, as usually assumed in the literature, or whether it is endogenous. The two dimensions along which the authors investigate this endogeneity are the subject's prior beliefs about stimulus values and the task objective. Using two experimental tasks, the authors collect data that are shown to violate scalar variability and are instead consistent with a model of optimal encoding and decoding, where the encoding phase depends endogenously on the prior and the task objective.

I believe the paper asks a critically important question. The literature in cognitive science, psychology, and increasingly in economics has provided growing empirical evidence of decision-making consistent with efficient coding. However, the precise model mechanics can differ substantially across studies. This point was made forcefully by Ma and Woodford (2020, Behavioral & Brain Sciences), who argue that different researchers make different assumptions about the objective function and resource constraints across efficient coding models, leading to a proliferation of models with ad hoc assumptions. Thus, the possibility that optimal coding depends endogenously on the prior and the objective of the task opens the door to a more parsimonious framework in which the model's assumptions can be constrained by environmental features. Along these lines, one of the authors' conclusions is that the degree of variability in subjective responses increases sublinearly in the width of the prior. Importantly, the degree of this sublinearity differs across the two tasks, in a manner consistent with a unified efficient coding model.

Comments:

(1) Modeling and implementation of estimation task

The biggest concern I have with the paper is about the experimental implementation and theoretical account of the estimation task. The salient features of the experimental data (Figure 1C) are that the standard deviations of subjects' estimated quantities are hump-shaped in the true stimulus x and that the standard deviation, conditional on the true stimulus x, is increasing in prior width. The authors attribute these features to a Bayesian encoding and decoding model in which the internal representation of the quantity is noisy, and the degree of noise depends on the prior - as in models of efficient coding (Wei and Stocker 2015 Nature Neuro; Bhui and Gershman 2018 Psych Review; Hahn and Wei 2024 Nature Neuro).

The concern I have is about the final "step" in the model, where the authors assume there is an additional layer of motor noise in selecting the response. The authors posit that the subject's selection of the response is drawn from a Gaussian with a mean set to the optimally decoded estimate x*(r), and variance set to a free parameter sigma_0^2. However, the authors also assume that the Gaussian distribution is "truncated to the prior range." This truncation is a nontrivial assumption, and I believe that on its own, it can explain many features of the data.

To see this, assume that there is no noise in the internal representation of x; there is only motor noise. This corresponds to a special case of the authors' model in which υ is set to 0. The model then reduces to a simple account in which responses are drawn from a Gaussian distribution centered at the true value of x, but with asymmetric noise due to the truncation. I simulated such a model with sigma_0=7. The resulting standard deviations of responses for each value of x (based on 1000 draws per value of x), across the three different priors, reproduce the salient patterns of the standard deviation in Figure 1C: (i) within each condition, the standard deviation is hump-shaped and peaks at x=60, and (ii) conditional on x, the standard deviation increases in prior width. The takeaway is that this simple model with only truncated motor noise, and without any noisy or efficient coding of internal representations, provides an alternative channel through which the prior affects behavior.
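
A minimal sketch of this truncation-only simulation follows. The three prior ranges, all centered at 60, are illustrative stand-ins; the paper's exact ranges may differ.

```python
import numpy as np

# Truncation-only model: responses drawn from N(x, sigma0^2) restricted to the
# prior range, with NO internal (coding) noise, i.e., the special case υ = 0.
rng = np.random.default_rng(0)
sigma0 = 7.0  # motor-noise SD used in the review

def truncated_draws(x, lo, hi, n=1000):
    """Draw n responses from N(x, sigma0^2) restricted to [lo, hi] by rejection."""
    out = np.empty(0)
    while out.size < n:
        d = rng.normal(x, sigma0, 4 * n)
        out = np.concatenate([out, d[(d >= lo) & (d <= hi)]])
    return out[:n]

for lo, hi in [(45, 75), (30, 90), (15, 105)]:  # hypothetical narrow/medium/wide priors
    xs = np.arange(lo, hi + 1)
    sds = np.array([truncated_draws(x, lo, hi).std() for x in xs])
    # Both reported patterns emerge: SD is hump-shaped in x (peaking near 60)
    # and, at any fixed x, larger under wider priors.
    print(f"range [{lo:>3},{hi:>3}]: peak SD {sds.max():.2f} at x = {xs[sds.argmax()]}")
```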

Of course, this does not imply that subjects' coding is not described by the efficient encoding and decoding model posited by the authors. However, it does suggest an important alternative mechanism for the authors' theoretical results in the estimation task. Moreover, some of the quantitative conclusions about the differences in behavior with the discrimination task would be greatly affected by the assumption of truncated motor noise.

Turning to the experiment, a basic question is whether such a truncation was actually implemented in the design. That is, was the range of the slider bar set to the range of the prior? (The methods section states that the on-screen size of the slider was proportional to the prior width, but it is unclear whether the bounds of the slider bar changed with the prior.) If the slider bar range did depend on the prior, then it becomes difficult to interpret the data. If not, then perhaps one can perform analyses to understand how much of the dependence of the standard deviation on both x and the prior width is attributable to motor noise. Indeed, the authors emphasize that their model is best fit at α=0.48, which would seem to imply that the best-fitting value of υ is strictly positive. However, it would be important to clarify whether the estimation procedure allowed for υ=0, or whether this noise parameter was constrained to be positive (i.e., to clarify whether the estimation assumed noisy and efficient coding of internal representations).

(2) Differences across tasks

A main takeaway from the paper is that optimal coding depends on the expected reward function in each task. This is the explanation for why the degree of sublinearity between standard deviation and prior width changes across the estimation and discrimination task. But besides the two different reward functions, there are also other differences across the two tasks. For example, the estimation task involves a single array of dots, whereas the discrimination task involves a pair of sequences of Arabic numerals. Related to the discussion above, in the estimation task the response scale is continuous whereas in the discrimination task, responses are binary. Is it possible that these other differences in the task could contribute to the observed different degrees of sublinearity? It is likely beyond the scope of the paper to incorporate these differences into the model, but such differences across the two tasks should be discussed as potential drivers of differences in observed behavior.

If it becomes too difficult to interpret the data from the estimation task due to the slider bar varying with the prior range, then which of the paper's conclusions would still follow when restricting the analysis to the discrimination task?

(3) Placement in the literature

An experiment closely related to the discrimination task in the current paper can be found in Frydman and Jin (2022, Quarterly Journal of Economics). Those authors also experimentally vary the width of a uniform prior in a discrimination task using Arabic numerals, in order to test principles of efficient coding. Consistent with the current findings, Frydman and Jin find that subjects exhibit greater precision when making judgments about numbers drawn from a narrower distribution. The current manuscript goes beyond Frydman and Jin by modeling and experimentally varying task objectives to understand and test their effects on optimal coding. This contribution should be highlighted and contrasted against the earlier experimental work of Frydman and Jin to better articulate the novelty of the current manuscript.

Reviewer #2 (Public review):

Summary:

This paper provides an ingenious experimental test of an efficient coding objective based on optimizing task success. The key idea is that different tasks (estimation vs. discrimination) will, under the proposed model, lead to a different scaling between encoding precision and the width of the prior distribution. Empirical evidence from two tasks involving number perception supports this idea.

Strengths:

- The paper provides an elegant test of a prediction made by a certain class of efficient coding models previously investigated theoretically by the authors.

- The experimental and modeling results suggest that competing efficient coding models that optimize mutual information alone may be incomplete, as they miss the role of the task.

Weaknesses:

- The claims would be more strongly validated if data were available at more than two prior widths in the discrimination experiment.

- A very strong prediction of the model -- which determines encoding entirely from prior and task -- is that Fisher Information is uniform throughout the range, strongly at odds with the traditional assumption of imprecision increasing with the numerosity (Weber/Fechner law). This prediction should be checked against the data collected. It may not be trivial to determine this in the Estimation experiment, but should be feasible in the Discrimination experiment in the Wide condition: Is there really no difference in discriminability at numbers close to 10 vs numbers close to 90? Figure 2 collapses over those, so it's not evident whether such a difference holds or not. I'd have loved to look into this in reviewing, but the authors have not yet made their data publicly available - I strongly encourage them to do so.
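
One way to run this check, sketched under assumptions since the trial-level data are not yet public: bin trials by the pair's mean magnitude at matched numeric differences and compare accuracies. The column names (num1, num2, correct) and the Weber-like synthetic stand-in below are hypothetical.

```python
import numpy as np
import pandas as pd

def accuracy_by_magnitude(df: pd.DataFrame, lo_cut=30, hi_cut=70, diffs=(2, 10)):
    """Compare accuracy for low- vs high-magnitude pairs at matched |num1 - num2|."""
    d = df.assign(mean_mag=(df.num1 + df.num2) / 2,
                  abs_diff=(df.num1 - df.num2).abs())
    d = d[d.abs_diff.between(*diffs)]  # hold difficulty roughly constant
    return (d[d.mean_mag < lo_cut].correct.mean(),
            d[d.mean_mag > hi_cut].correct.mean())

# Synthetic stand-in: simulate Weber-law discrimination so the check has data to bite on
rng = np.random.default_rng(2)
n1 = rng.integers(10, 91, 20000)
n2 = np.clip(n1 + rng.integers(-10, 11, 20000), 10, 90)
p_correct = 1 / (1 + np.exp(-np.abs(n1 - n2) / (0.1 * (n1 + n2) / 2)))  # Weber scaling
sim = pd.DataFrame({"num1": n1, "num2": n2,
                    "correct": rng.random(20000) < p_correct})
low, high = accuracy_by_magnitude(sim)
# Uniform Fisher information predicts low ≈ high; Weber scaling predicts low > high
print(f"accuracy near 10: {low:.3f}, near 90: {high:.3f}")
```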

Importantly, the inverse-U-shaped pattern in Figure 1 is itself compatible with a Weber's-law-based encoding, as shown by simulation in Figure 5d of Hahn & Wei [1]. This suggests a potential competing variant account, in apparent qualitative agreement with the findings reported: the encoding is compatible with Weber's law, and only a single scalar, the magnitude of sensory noise, is optimized for the task under the loss function (3). As this account would be substantially more in line with traditional accounts of numerosity perception, while still exhibiting the task-dependence of encoding proposed by the authors, it would be worth investigating whether it can be ruled out based on the data gathered for this paper.
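
In symbols (our notation, not the review's), the variant pins the shape of the Fisher information to the Weber form and lets the task tune only the noise scale σ, whereas the authors' model lets the task shape the entire profile:

```latex
% Weber-law variant: shape fixed, only \sigma task-optimized
J_{\mathrm{Weber}}(x) \;\propto\; \frac{1}{\sigma^{2}\,x^{2}}
\qquad\text{vs.}\qquad
% Authors' model: the profile itself is task-optimized (roughly uniform over the prior)
J_{\mathrm{opt}}(x) \;\approx\; \text{const for } x \in [a, b].
```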

References:

[1] Hahn & Wei, A unifying theory explains seemingly contradictory biases in perceptual estimation, Nature Neuroscience 2024

Reviewer #3 (Public review):

Summary:

This work demonstrates that people's imprecision in numeric perception varies with the stimulus context and task goal. By measuring imprecision across different widths of uniform prior distributions in estimation and discrimination tasks, the authors find that imprecision changes sublinearly with prior width, challenging previous range normalization models. They further show that these changes align with the efficient encoding model, where decision-makers balance expected rewards and encoding costs optimally.

Strengths:

The experimental design is straightforward, controlling the mean of the number distribution while varying the prior width. By assessing estimation errors and discrimination accuracy, the authors effectively highlight how imprecision adjusts across conditions.

The model's predictions align well with the data, with the predicted exponents (1/2 and 3/4) governing how imprecision scales with prior width matching the empirical results impressively.
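
In symbols (the review does not specify which exponent belongs to which task, so the assignment is left open here):

```latex
% Sublinear scaling of imprecision with prior width w
\mathrm{SD}(w) \;\propto\; w^{\alpha},
\qquad \alpha \in \left\{\tfrac{1}{2},\, \tfrac{3}{4}\right\}, \quad \alpha < 1 .
```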

Weaknesses:

Some details in the model section are unclear. Specifically, I'm puzzled by the Wiener process assumption where r | x ~ N(m(x)T, s^2 T). Does this imply that both the representation of number x and the noise are nearly zero at the beginning, increasing as observation time progresses? This seems counterintuitive, and a clearer explanation would be helpful.
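
Writing out the implied moments may help frame the question (this is just the algebra of the stated assumption, not the authors' explanation):

```latex
\mathbb{E}[\,r \mid x\,] = m(x)\,T, \qquad
\mathrm{SD}[\,r \mid x\,] = s\sqrt{T}, \qquad
\frac{\mathbb{E}[\,r \mid x\,]}{\mathrm{SD}[\,r \mid x\,]} = \frac{m(x)}{s}\,\sqrt{T}.
```

The mean grows linearly in T while the dispersion grows only as the square root of T, so the signal-to-noise ratio improves with observation time; whether the near-zero mean at small T should be read as the percept itself starting at zero, or merely as evidence accumulating over time, is exactly what needs clarification.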

The authors explore range normalization models with a Gaussian representation, but another common approach is the logarithmic representation (Barretto-García et al., 2023; Khaw et al., 2021). Could the logarithmic representation similarly produce the sublinear relation between noise and distribution width?
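
One way to probe this numerically, under assumptions: fix a logarithmic code r | x ~ N(ln x, sigma^2) with a non-adaptive sigma, decode with the posterior mean under each uniform prior, and see how the response SD scales with prior width. All parameter values below are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.15                      # fixed (non-adaptive) log-coding noise
grid = np.linspace(1, 200, 2000)  # support for the numeric posterior

for half_width in (15, 30, 45):   # illustrative half-widths of priors centered at 60
    lo, hi = 60 - half_width, 60 + half_width
    prior = ((grid >= lo) & (grid <= hi)).astype(float)
    sds = []
    for x in np.linspace(lo, hi, 31):
        r = rng.normal(np.log(x), sigma, size=2000)  # noisy log encodings of x
        # likelihood of each grid value given each sample r (Gaussian in log space)
        like = np.exp(-(r[:, None] - np.log(grid)[None, :]) ** 2 / (2 * sigma ** 2))
        post = like * prior[None, :]
        post /= post.sum(axis=1, keepdims=True)
        est = post @ grid                            # posterior-mean decoder
        sds.append(est.std())
    print(f"half-width {half_width:>2}: mean response SD {np.mean(sds):.2f}")
```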

Additionally, Heng et al. (2020) found that subjects did not alter their encoding strategy across different task goals, which seems inconsistent with the fully adaptive representation proposed here. I did not find an analysis of the temporal dynamics of participants' adaptation. The behavioral results in the manuscript seem to imply that subjects adopted different coding schemes within a very short period of time. Yet in previous studies of adaptation, experimental results seem more supportive of partially adaptive behavior (Bujold et al., 2021; Heng et al., 2020), which might balance experimental and real-world prior distributions. Analyzing the temporal dynamics might provide more insight. Given that the authors informed subjects about the shape of the prior distribution before the experiment, do the results in this manuscript suggest a rapid, top-down modulation of number representation?

Barretto-García, M., De Hollander, G., Grueschow, M., Polanía, R., Woodford, M., & Ruff, C. C. (2023). Individual risk attitudes arise from noise in neurocognitive magnitude representations. Nature Human Behaviour, 7(9), 1551-1567. https://doi.org/10.1038/s41562-023-01643-4

Bujold, P. M., Ferrari-Toniolo, S., & Schultz, W. (2021). Adaptation of utility functions to reward distribution in rhesus monkeys. Cognition, 214, 104764. https://doi.org/10.1016/j.cognition.2021.104764

Heng, J. A., Woodford, M., & Polania, R. (2020). Efficient sampling and noisy decisions. eLife, 9, e54962. https://doi.org/10.7554/eLife.54962

Khaw, M. W., Li, Z., & Woodford, M. (2021). Cognitive Imprecision and Small-Stakes Risk Aversion. The Review of Economic Studies, 88(4), 1979-2013. https://doi.org/10.1093/restud/rdaa044
