Efficient coding of natural scene statistics predicts discrimination thresholds for grayscale textures

  1. Tiberiu Tesileanu (corresponding author)
  2. Mary M Conte
  3. John J Briguglio
  4. Ann M Hermundstad
  5. Jonathan D Victor
  6. Vijay Balasubramanian
  1. Flatiron Institute, United States
  2. Weill Cornell Medical College, United States
  3. Howard Hughes Medical Institute, United States
  4. University of Pennsylvania, United States

Abstract

Previously (Hermundstad et al., 2014), we showed that when sampling is limiting, the efficient coding principle leads to a 'variance is salience' hypothesis, and that this hypothesis accounts for visual sensitivity to binary image statistics. Here, using extensive new psychophysical data and image analysis, we show that this hypothesis accounts for visual sensitivity to a large set of grayscale image statistics at a striking level of detail, and we also identify the limits of the prediction. We define a 66-dimensional space of local grayscale light-intensity correlations, and measure the relevance of each direction to natural scenes. The 'variance is salience' hypothesis predicts that two-point correlations are most salient, and predicts their relative salience. We tested these predictions in a texture-segregation task using unnatural, synthetic textures. As predicted, correlations beyond second order are not salient, and predicted thresholds for over 300 second-order correlations match psychophysical thresholds closely (median fractional error < 0.13).
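
As a rough illustration of the 'variance is salience' idea, the sketch below turns the variability of a texture statistic across natural-image patches into a predicted discrimination threshold. It is a minimal toy example and not the authors' analysis code: the inverse relationship between threshold and standard deviation, the `scale` parameter, the function names, and the two-dimensional synthetic data are all assumptions made for illustration. The actual analysis is in the repository listed under Data availability.

```python
import math
import random


def directional_std(samples, direction):
    """Sample standard deviation of the data projected onto a direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    projections = [sum(x * u for x, u in zip(row, unit)) for row in samples]
    mean = sum(projections) / len(projections)
    var = sum((p - mean) ** 2 for p in projections) / (len(projections) - 1)
    return math.sqrt(var)


def predicted_threshold(samples, direction, scale=1.0):
    """Assumed rule: threshold is inversely proportional to variability.

    `scale` is a hypothetical free parameter that sets the overall units.
    """
    return scale / directional_std(samples, direction)


# Synthetic stand-in for texture statistics measured on natural-image patches:
# the first coordinate varies a lot across patches, the second only a little.
rng = random.Random(0)
samples = [[rng.gauss(0.0, 2.0), rng.gauss(0.0, 0.5)] for _ in range(5000)]

threshold_variable = predicted_threshold(samples, [1.0, 0.0])
threshold_stable = predicted_threshold(samples, [0.0, 1.0])

# The more variable direction gets the lower predicted threshold,
# meaning it is predicted to be more salient.
print(threshold_variable < threshold_stable)
```

Under this assumed rule, directions along which natural scenes vary most receive the lowest predicted thresholds, which is the qualitative pattern the abstract describes for second-order correlations.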

Data availability

All the code and data necessary to reproduce the results from the manuscript are available at https://github.com/ttesileanu/TextureAnalysis.

Article and author information

Author details

  1. Tiberiu Tesileanu

    Center for Computational Biology, Flatiron Institute, New York, United States
    For correspondence
    ttesileanu@gmail.com
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0003-3107-3088
  2. Mary M Conte

    Brain and Mind Institute, Weill Cornell Medical College, New York, United States
    Competing interests
    The authors declare that no competing interests exist.
  3. John J Briguglio

    Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, United States
    Competing interests
    The authors declare that no competing interests exist.
  4. Ann M Hermundstad

    Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-0377-0516
  5. Jonathan D Victor

    Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medical College, New York, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-9293-0111
  6. Vijay Balasubramanian

    Department of Physics, University of Pennsylvania, Philadelphia, United States
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-6497-3819

Funding

US-Israel Binational Science Foundation (2011058)

  • Vijay Balasubramanian

National Eye Institute (EY07977)

  • Mary M Conte
  • Jonathan D Victor
  • Vijay Balasubramanian

Swartz Foundation

  • Tiberiu Tesileanu

Howard Hughes Medical Institute

  • John J Briguglio
  • Ann M Hermundstad

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Ethics

Human subjects: This work was carried out with the subjects' informed consent, and in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki) and the approval of the Institutional Review Board of Weill Cornell. The IRB protocol number is 0904010359.

Copyright

© 2020, Tesileanu et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Tiberiu Tesileanu, Mary M Conte, John J Briguglio, Ann M Hermundstad, Jonathan D Victor, Vijay Balasubramanian (2020)
Efficient coding of natural scene statistics predicts discrimination thresholds for grayscale textures
eLife 9:e54347.
https://doi.org/10.7554/eLife.54347
